Patents Examined by Nandini Subramani
  • Patent number: 11721334
    Abstract: A method and apparatus for controlling a device, according to an embodiment of the present disclosure, use a speech feature of the user that reflects the Lombard effect to operate a device located far from the user among a plurality of electronic devices. Even when the user calls such a distant device without any separate context information, speech recognition neural networks and weight calculation neural networks may be selected and used to operate it, and the user's speech signal may be received in an Internet of Things (IoT) environment over a 5G network.
    Type: Grant
    Filed: March 5, 2020
    Date of Patent: August 8, 2023
    Assignee: LG ELECTRONICS INC.
    Inventors: Jong Hoon Chae, Minook Kim, Yongchul Park, Sungmin Han, Siyoung Yang, Sangki Kim, Juyeong Jang
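
The mechanism US 11721334 describes, speech produced with more vocal effort when addressing a distant device (the Lombard effect), lends itself to a rough software illustration: estimate how "effortful" a frame of speech sounds and use that to choose among candidate devices at different distances. Everything below (the two features, the effort formula, the distance calibration, and `select_device`) is a hypothetical sketch, not the patented neural-network selection.

```python
import numpy as np

def lombard_features(frame, sr=16000):
    """Crude per-frame cues that tend to rise under the Lombard effect:
    RMS energy and spectral centroid (a rough proxy for flattened tilt)."""
    spectrum = np.abs(np.fft.rfft(frame * np.hamming(len(frame))))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    energy = np.sqrt(np.mean(frame ** 2))
    centroid = np.sum(freqs * spectrum) / (np.sum(spectrum) + 1e-12)
    return energy, centroid

def select_device(frame, devices, sr=16000):
    """Hypothetical selection rule: more vocal effort implies a more distant
    intended device. `devices` maps a device name to its distance in meters."""
    energy, centroid = lombard_features(frame, sr)
    effort = 10.0 * energy + centroid / 4000.0       # made-up effort score
    # Treat the effort score as a crude estimate of the intended device's
    # distance, then pick the candidate whose distance matches it best.
    estimated_distance = 2.0 * effort                # made-up calibration
    return min(devices, key=lambda name: abs(devices[name] - estimated_distance))

rng = np.random.default_rng(0)
loud_frame = 0.5 * rng.standard_normal(400)          # stand-in for raised-effort speech
print(select_device(loud_frame, {"tv_nearby": 1.0, "speaker_far": 6.0}))
```
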
  • Patent number: 11715456
    Abstract: Disclosed is a serial FFT-based, low-power MFCC speech feature extraction circuit, belonging to the technical field of calculating, reckoning, or counting. Oriented toward intelligent applications, the design adapts the MFCC algorithm to hardware and makes full use of a serial FFT algorithm and approximate multiplication, greatly reducing circuit area and power. The circuit comprises a preprocessing module, a framing and windowing module, an FFT module, a Mel filtering module, and a logarithm and DCT module. The improved FFT algorithm processes data in a serial pipelined manner, making effective use of each audio frame's duration and thereby reducing the circuit's storage area and operating frequency while still meeting the output requirement.
    Type: Grant
    Filed: December 4, 2020
    Date of Patent: August 1, 2023
    Assignee: SOUTHEAST UNIVERSITY
    Inventors: Weiwei Shan, Lixuan Zhu
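
The module chain named in the abstract of US 11715456 (preprocessing, framing and windowing, FFT, Mel filtering, logarithm and DCT) is the standard MFCC computation. A minimal NumPy sketch of that pipeline follows; the frame length, FFT size, and filter count are illustrative choices, and the serial-FFT pipelining and approximate multipliers that make the patented circuit small are not modeled.

```python
import numpy as np

def mfcc(signal, sr=16000, frame_len=400, hop=160, n_mels=26, n_ceps=13):
    """Textbook MFCC pipeline: pre-emphasis, framing+windowing, FFT,
    Mel filter bank, log, DCT-II. Parameters are illustrative."""
    # 1. Pre-emphasis (preprocessing module)
    sig = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    # 2. Framing and windowing
    n_frames = 1 + (len(sig) - frame_len) // hop
    frames = np.stack([sig[i * hop:i * hop + frame_len] for i in range(n_frames)])
    frames *= np.hamming(frame_len)
    # 3. FFT -> power spectrum
    power = np.abs(np.fft.rfft(frames, n=512)) ** 2 / 512
    # 4. Mel filter bank (triangular filters on the Mel scale)
    def hz_to_mel(f): return 2595 * np.log10(1 + f / 700)
    def mel_to_hz(m): return 700 * (10 ** (m / 2595) - 1)
    mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((512 + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, 257))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        fbank[m - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[m - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    mel_energies = np.maximum(power @ fbank.T, 1e-10)
    # 5. Log and DCT-II, keeping the first n_ceps coefficients
    log_mel = np.log(mel_energies)
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * n + 1) / (2 * n_mels))
    return log_mel @ dct.T

audio = np.random.default_rng(0).standard_normal(16000)   # 1 s of noise as a stand-in
print(mfcc(audio).shape)                                   # (n_frames, 13)
```
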
  • Patent number: 11677705
    Abstract: An approach is provided that receives a message and applies a deep analytic analysis to it. The analysis yields a set of enriched message embedding (EME) data that is passed to a trained neural network. Based on the scores returned by the trained neural network, the conversation to which the received message belongs is identified from a number of available conversations, and the message is then associated with that conversation.
    Type: Grant
    Filed: April 23, 2019
    Date of Patent: June 13, 2023
    Assignee: International Business Machines Corporation
    Inventors: Devin A. Conley, Priscilla S. Moraes, Lakshminarayanan Krishnamurthy, Oren Sar-Shalom
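
As a rough illustration of the flow in US 11677705, the sketch below embeds an incoming message, obtains one score per candidate conversation, and associates the message with the highest-scoring conversation. `embed_message` and `score_conversations` are invented placeholders, not the patent's enriched message embedding (EME) analysis or its trained neural network.

```python
import numpy as np

def embed_message(text, dim=16):
    """Placeholder for the deep analytic / EME step: hash words into a vector."""
    vec = np.zeros(dim)
    for word in text.lower().split():
        vec[hash(word) % dim] += 1.0
    return vec / (np.linalg.norm(vec) + 1e-9)

def score_conversations(msg_vec, conversations):
    """Placeholder for the trained network: score = similarity between the
    message vector and the mean embedding of each conversation's messages."""
    return {cid: float(np.dot(msg_vec, np.mean([embed_message(m) for m in msgs], axis=0)))
            for cid, msgs in conversations.items()}

conversations = {
    "build-failure": ["the CI build is failing again", "failing tests on main"],
    "lunch-plans": ["anyone up for lunch", "lunch at noon works for me"],
}
incoming = "another test is failing in the build"
scores = score_conversations(embed_message(incoming), conversations)
best = max(scores, key=scores.get)
conversations[best].append(incoming)    # associate the message with that conversation
print(best, scores)
```
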
  • Patent number: 11670292
    Abstract: An electronic device comprising circuitry configured to perform transcript-based voice enhancement, using a transcript to obtain an enhanced audio signal.
    Type: Grant
    Filed: February 6, 2020
    Date of Patent: June 6, 2023
    Assignee: SONY CORPORATION
    Inventors: Fabien Cardinaux, Marc Ferras Font
  • Patent number: 11664013
    Abstract: Disclosed is a speech feature reuse-based storage and computation compression method for a keyword-spotting CNN, belonging to the technical field of calculating, reckoning, or counting. When the number of updated rows of input data equals the convolution stride, each time new input data arrive, the input layer of the neural network replaces the oldest portion of the input data with the new data and adjusts the addressing sequence of the input data accordingly, so that the input data are convolved with the corresponding kernels in their order of arrival; the operation result is stored in the network's intermediate data memory to update the corresponding data.
    Type: Grant
    Filed: December 4, 2020
    Date of Patent: May 30, 2023
    Assignee: SOUTHEAST UNIVERSITY
    Inventor: Weiwei Shan
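
The storage trick described in US 11664013, overwriting the oldest input rows in place and adjusting only the addressing order when the number of newly arrived rows equals the convolution stride, maps naturally to a circular buffer in software. The sketch below is an assumed software analogue rather than the patented hardware design; the buffer sizes and the `logical_view` helper are illustrative.

```python
import numpy as np

class CircularFeatureBuffer:
    """Keeps the last `rows` feature frames for a sliding CNN input window.
    New frames overwrite the oldest storage location; only the addressing
    order changes, so no data are physically shifted."""

    def __init__(self, rows, cols):
        self.buf = np.zeros((rows, cols))
        self.head = 0           # storage row currently holding the oldest frame
        self.rows = rows

    def push(self, frame):
        # Overwrite the oldest row in place (the "replace the earliest part" step).
        self.buf[self.head] = frame
        self.head = (self.head + 1) % self.rows

    def logical_view(self):
        # Re-address rows oldest-to-newest instead of shifting the whole window.
        order = (np.arange(self.rows) + self.head) % self.rows
        return self.buf[order]

# Example with a convolution stride of one row: every new feature frame
# triggers exactly one in-place update.
buf = CircularFeatureBuffer(rows=8, cols=13)
for t in range(20):
    buf.push(np.full(13, t))        # stand-in for one MFCC frame
window = buf.logical_view()         # ready to convolve with the CNN kernels
print(window[:, 0])                 # frames 12..19, oldest first
```
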
  • Patent number: 11631394
    Abstract: A method of detecting occupancy in an area includes obtaining, with a processor, an audio sample from an audio sensor and determining, with the processor, feature functional values of a set of selected feature functionals from the audio sample. The determining of the feature functional values includes extracting features in the set of selected feature functionals from the audio sample, and determining the feature functional values of the set of selected features from the extracted features. The method further includes determining, with the processor, occupancy in the area using a classifier based on the determined feature functional values.
    Type: Grant
    Filed: December 14, 2018
    Date of Patent: April 18, 2023
    Assignee: Robert Bosch GmbH
    Inventors: Zhe Feng, Attila Reiss, Shabnam Ghaffarzadegan, Mirko Ruhs, Robert Duerichen
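
A minimal sketch of the flow described in US 11631394: extract low-level features from the audio sample, summarize each feature with feature functionals (here, mean and standard deviation over the sample), and pass the functional values to a classifier. The two features, the chosen functionals, and the fixed linear classifier are illustrative stand-ins, not the patented feature set or model.

```python
import numpy as np

def frame_features(audio, frame=400, hop=160):
    """Low-level features per frame: RMS energy and zero-crossing rate."""
    feats = []
    for start in range(0, len(audio) - frame + 1, hop):
        x = audio[start:start + frame]
        rms = np.sqrt(np.mean(x ** 2))
        zcr = np.mean(np.abs(np.diff(np.sign(x)))) / 2
        feats.append((rms, zcr))
    return np.array(feats)

def feature_functionals(feats):
    """Feature functionals: statistics of each feature over the whole sample."""
    return np.concatenate([feats.mean(axis=0), feats.std(axis=0)])

def classify_occupancy(functionals, weights, bias):
    """Stand-in classifier: a fixed linear model with a sigmoid output."""
    score = 1.0 / (1.0 + np.exp(-(functionals @ weights + bias)))
    return score > 0.5, score

sample = 0.1 * np.random.default_rng(1).standard_normal(16000)  # stand-in audio sample
f = feature_functionals(frame_features(sample))
occupied, p = classify_occupancy(f, weights=np.array([5.0, 1.0, 2.0, 1.0]), bias=-1.0)
print(occupied, round(float(p), 3))
```
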
  • Patent number: 11594212
    Abstract: A method includes receiving a training example for a listen-attend-spell (LAS) decoder of a two-pass streaming neural network model and determining whether the training example corresponds to a supervised audio-text pair or an unpaired text sequence. When the training example corresponds to an unpaired text sequence, the method also includes determining a cross entropy loss based on a log probability associated with a context vector of the training example. The method also includes updating the LAS decoder and the context vector based on the determined cross entropy loss.
    Type: Grant
    Filed: January 21, 2021
    Date of Patent: February 28, 2023
    Assignee: Google LLC
    Inventors: Tara N. Sainath, Ruoming Pang, Ron Weiss, Yanzhang He, Chung-Cheng Chiu, Trevor Strohman
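
The branching that US 11594212 describes, treating a supervised audio-text pair differently from an unpaired text sequence and deriving a cross-entropy loss from log probabilities tied to a context vector, can be sketched as below. The stand-in decoder, the audio "context" computed as a simple mean, and the toy example are assumptions for illustration only, not Google's two-pass streaming model.

```python
import numpy as np

def cross_entropy(log_probs, targets):
    """Cross entropy = mean negative log probability of the target tokens.
    `log_probs` has shape (seq_len, vocab); `targets` holds token ids."""
    return -np.mean(log_probs[np.arange(len(targets)), targets])

def las_training_loss(example, decode_log_probs, learned_context):
    """Illustrative branching: a supervised audio-text pair uses an
    audio-derived attention context, while an unpaired text sequence
    substitutes a learned context vector; the resulting cross-entropy loss
    would then drive updates to the decoder (and, for unpaired text, to the
    context vector as well)."""
    if example.get("audio") is not None:          # supervised audio-text pair
        context = example["audio"].mean(axis=0)   # stand-in for the attention context
    else:                                         # unpaired text sequence
        context = learned_context
    log_probs = decode_log_probs(context, example["text"])
    return cross_entropy(log_probs, example["text"])

# Toy usage with a fake "decoder" that ignores its inputs.
rng = np.random.default_rng(0)
fake_decoder = lambda ctx, text: np.log(
    rng.dirichlet(np.ones(10), size=len(text)))   # (seq_len, vocab=10) log-probs
unpaired = {"audio": None, "text": np.array([1, 4, 2])}
print(las_training_loss(unpaired, fake_decoder, learned_context=np.zeros(8)))
```
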
  • Patent number: 11530930
    Abstract: A transportation vehicle having a navigation system and an operating system connected to the navigation system for data transmission via a bus system. The vehicle has a microphone and includes a phoneme generation module that generates phonemes from an acoustic voice signal or from the microphone's output signal, the phonemes being drawn from a predefined selection of exclusively monosyllabic phonemes, and a phoneme-to-grapheme module that generates inputs for operating the vehicle from the monosyllabic phonemes produced by the phoneme generation module.
    Type: Grant
    Filed: September 12, 2018
    Date of Patent: December 20, 2022
    Assignee: VOLKSWAGEN AKTIENGESELLSCHAFT
    Inventors: Okko Buss, Mark Pleschka
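
The two-stage idea in US 11530930, recognize only a predefined set of monosyllabic phonemes and then map them to inputs that operate the vehicle, can be illustrated with a simple filter-and-lookup. The phoneme set and the command mapping below are invented for illustration; the patent does not specify them.

```python
# Hypothetical, illustration-only vocabulary: a predefined selection of
# exclusively monosyllabic phonemes and the vehicle inputs they map to.
MONOSYLLABIC_PHONEMES = {"nav", "home", "loud", "soft", "heat", "cool"}

PHONEME_TO_INPUT = {
    "nav": "OPEN_NAVIGATION",
    "home": "ROUTE_TO_HOME",
    "loud": "VOLUME_UP",
    "soft": "VOLUME_DOWN",
    "heat": "TEMPERATURE_UP",
    "cool": "TEMPERATURE_DOWN",
}

def phoneme_generation(recognized_tokens):
    """Stand-in for the phoneme generation module: keep only tokens that
    belong to the predefined monosyllabic selection."""
    return [t for t in recognized_tokens if t in MONOSYLLABIC_PHONEMES]

def phoneme_to_grapheme(phonemes):
    """Stand-in for the phoneme-to-grapheme module: turn monosyllabic
    phonemes into inputs for operating the vehicle over the bus system."""
    return [PHONEME_TO_INPUT[p] for p in phonemes]

print(phoneme_to_grapheme(phoneme_generation(["uh", "nav", "home"])))
# ['OPEN_NAVIGATION', 'ROUTE_TO_HOME']
```
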
  • Patent number: 11521595
    Abstract: A method for training a speech recognition model with a loss function includes receiving an audio signal including a first segment corresponding to audio spoken by a first speaker, a second segment corresponding to audio spoken by a second speaker, and an overlapping region where the first segment overlaps the second segment. The overlapping region includes a known start time and a known end time. The method also includes generating a respective masked audio embedding for each of the first and second speakers. The method also includes applying a masking loss after the known end time to the respective masked audio embedding for the first speaker when the first speaker was speaking prior to the known start time, or applying the masking loss prior to the known start time when the first speaker was speaking after the known end time.
    Type: Grant
    Filed: May 1, 2020
    Date of Patent: December 6, 2022
    Assignee: Google LLC
    Inventors: Anshuman Tripathi, Han Lu, Hasim Sak
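
The timing rule in the abstract of US 11521595 can be paraphrased in code: if the first speaker was talking before the overlap began, their masked embedding should carry no energy after the overlap ends, so that region is penalized; if they only spoke after the overlap, the region before it is penalized instead. The array shapes and the squared-energy penalty below are assumptions, not the patented loss.

```python
import numpy as np

def masking_loss(masked_embeddings, frame_times, overlap_start, overlap_end,
                 speaker_active_before_overlap):
    """Illustrative masking loss for one speaker's masked audio embedding.
    `masked_embeddings` has one row per frame; `frame_times` holds each
    frame's time in seconds; the overlap region has known start/end times."""
    if speaker_active_before_overlap:
        region = frame_times > overlap_end      # penalize energy after the overlap
    else:
        region = frame_times < overlap_start    # penalize energy before the overlap
    penalized = masked_embeddings[region]
    return float(np.mean(penalized ** 2)) if penalized.size else 0.0

# Toy usage: 100 frames at 10 ms, overlap between 0.40 s and 0.60 s.
times = np.arange(100) * 0.01
emb = np.random.default_rng(0).standard_normal((100, 32))
print(masking_loss(emb, times, 0.40, 0.60, speaker_active_before_overlap=True))
```
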
  • Patent number: 11508385
    Abstract: Disclosed are a method of processing a residual signal for audio coding and an audio coding apparatus. The method learns a feature map of a reference signal through a residual signal learning engine comprising a convolutional layer and a neural network, and performs learning based on a mapping between nodes of the neural network's output layer and quantization level indices of the residual signal.
    Type: Grant
    Filed: November 18, 2019
    Date of Patent: November 22, 2022
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Seung Kwon Beack, Jongmo Sung, Mi Suk Lee, Tae Jin Lee, Hui Yong Kim
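
One way to read the mapping in US 11508385, output-layer nodes paired with quantization level indices of the residual signal, is as a classification over quantization levels. The sketch below assumes exactly that: the reference signal's feature map is scored by a stand-in network whose k-th output node corresponds to quantization index k, and a cross-entropy loss compares the scores to the residual's quantized indices. The network, the loss, and the level grid are illustrative assumptions.

```python
import numpy as np

def quantize_residual(residual, levels):
    """Map each residual sample to the index of its nearest quantization level."""
    return np.argmin(np.abs(residual[:, None] - levels[None, :]), axis=1)

def output_layer_scores(reference_feature_map, weights):
    """Stand-in for the engine's neural network: one score per output node,
    where output node k corresponds to quantization index k."""
    return reference_feature_map @ weights           # (frames, n_levels)

def training_loss(scores, target_indices):
    """Assumed training objective: cross entropy between output-node scores
    and the residual's quantization indices."""
    log_probs = scores - np.log(np.sum(np.exp(scores), axis=1, keepdims=True))
    return -np.mean(log_probs[np.arange(len(target_indices)), target_indices])

rng = np.random.default_rng(0)
levels = np.linspace(-1, 1, 8)                       # 8 quantization levels -> 8 output nodes
features = rng.standard_normal((50, 16))             # stand-in feature map of the reference
residual = rng.uniform(-1, 1, 50)                    # stand-in residual signal
loss = training_loss(output_layer_scores(features, rng.standard_normal((16, 8))),
                     quantize_residual(residual, levels))
print(round(float(loss), 3))
```
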
  • Patent number: 11462228
    Abstract: A speech intelligibility calculating method, executed by a speech intelligibility calculating apparatus, includes: a speech intelligibility calculating step of calculating a speech intelligibility, an objective assessment index of speech quality, from a difference component between features obtained by analyzing an input clean speech and an input enhanced speech using one or more filter banks; and a step of outputting the speech intelligibility calculated at the calculating step. The method can calculate a speech intelligibility without any dependency on the speech enhancement method.
    Type: Grant
    Filed: August 3, 2018
    Date of Patent: October 4, 2022
    Assignees: NIPPON TELEGRAPH AND TELEPHONE CORPORATION, WAKAYAMA UNIVERSITY
    Inventors: Shoko Araki, Tomohiro Nakatani, Keisuke Kinoshita, Toshio Irino, Toshie Matsui, Katsuhiko Yamamoto
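
The recipe in US 11462228, analyze the clean and the enhanced speech with filter banks, take the difference component of the resulting features, and turn it into an objective intelligibility index, can be sketched as follows. The equal-width band energies and the final mapping to a (0, 1] score are simplifications chosen for illustration, not the patented measure.

```python
import numpy as np

def band_energies(signal, n_bands=8, frame=400, hop=160):
    """Very simple filter-bank analysis: per-frame energy in equal-width
    frequency bands (a stand-in for the filter banks named in the abstract)."""
    frames = [signal[i:i + frame] * np.hanning(frame)
              for i in range(0, len(signal) - frame + 1, hop)]
    spec = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    bands = np.array_split(np.arange(spec.shape[1]), n_bands)
    return np.stack([spec[:, b].sum(axis=1) for b in bands], axis=1)   # (frames, bands)

def intelligibility_score(clean, enhanced):
    """Illustrative objective index: the smaller the per-band log-energy
    difference between clean and enhanced speech, the higher the score."""
    c, e = band_energies(clean), band_energies(enhanced)
    diff = np.log10(c + 1e-10) - np.log10(e + 1e-10)     # difference component
    return float(1.0 / (1.0 + np.mean(diff ** 2)))       # map to (0, 1]

rng = np.random.default_rng(0)
clean = rng.standard_normal(16000)
enhanced = clean + 0.1 * rng.standard_normal(16000)      # mildly distorted version
print(round(intelligibility_score(clean, enhanced), 3))
```
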
  • Patent number: 11350885
    Abstract: A method includes identifying, by an electronic device, one or more segments within a first audio recording that includes one or more non-speech segments and one or more speech segments. The method also includes generating, by the electronic device, one or more synthetic speech segments that include natural speech audio characteristics and that preserve one or more non-private features of the one or more speech segments. The method also includes generating, by the electronic device, an obfuscated audio recording by replacing the one or more speech segments with the one or more synthetic speech segments while maintaining the one or more non-speech segments, wherein the one or more synthetic speech segments prevent recognition of some content of the obfuscated audio recording.
    Type: Grant
    Filed: February 6, 2020
    Date of Patent: June 7, 2022
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Korosh Vatanparvar, Viswam Nathan, Ebrahim Nematihosseinabadi, Md Mahbubur Rahman, Jilong Kuang
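
A minimal sketch of the replacement step in US 11350885: keep the non-speech segments of a recording, and substitute each speech segment with synthetic audio that preserves non-private characteristics (here, duration and the energy envelope) while discarding the spoken content. The envelope-shaped-noise "synthesizer" and the segment format are illustrative assumptions, not the patented synthesis.

```python
import numpy as np

def synthesize_like(segment):
    """Stand-in synthesizer: noise shaped by the original segment's energy
    envelope, so duration and loudness (non-private features) are preserved
    while the spoken content is not."""
    envelope = np.abs(segment)
    smooth = np.convolve(envelope, np.ones(200) / 200, mode="same")
    return smooth * np.random.default_rng(0).standard_normal(len(segment))

def obfuscate(recording, segments):
    """Replace speech segments with synthetic audio, keep non-speech as-is.
    `segments` is a list of (start_sample, end_sample, is_speech) tuples."""
    out = recording.copy()
    for start, end, is_speech in segments:
        if is_speech:
            out[start:end] = synthesize_like(recording[start:end])
    return out

rng = np.random.default_rng(1)
rec = 0.05 * rng.standard_normal(48000)                  # 3 s stand-in recording
segments = [(0, 16000, False), (16000, 32000, True), (32000, 48000, False)]
print(obfuscate(rec, segments).shape)
```
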
  • Patent number: 11335351
    Abstract: Aspects of the disclosure relate to cognitive automation-based engine processing on audio files and streams received from meetings and/or telephone calls. A noise mask can be applied to enhance the audio. Real-time speech analytics separate speech for different speakers into time-stamped streams, which are transcribed and merged into a combined output. The output is parsed by analyzing the combined output for correct syntax, normalized by breaking the parsed data into record groups for efficient processing, validated to ensure that the data satisfies defined formats and input criteria, and enriched to correct for any errors and to augment the audio information. Notifications based on the enriched data may be provided to call or meeting participants.
    Type: Grant
    Filed: March 13, 2020
    Date of Patent: May 17, 2022
    Assignee: Bank of America Corporation
    Inventors: Christine D. Black, Jinna Kim, Todd M. Goodyear, William August Stahlhut, Shola L. Oni, Mardochee Macxis
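
The stage sequence in US 11335351 (merge per-speaker time-stamped streams, then parse, normalize, validate, and enrich the combined output) can be sketched as a chain of small functions. Each function below is a trivial stand-in for the corresponding step; the record format and the enrichment are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Utterance:
    speaker: str
    start: float            # seconds
    text: str

def merge_streams(streams):
    """Merge per-speaker time-stamped streams into one combined transcript."""
    return sorted((u for s in streams for u in s), key=lambda u: u.start)

def parse(utterances):
    """'Parse' step: keep only well-formed records (non-empty text here)."""
    return [u for u in utterances if u.text.strip()]

def normalize(utterances, group_size=2):
    """'Normalize' step: break the parsed data into record groups."""
    return [utterances[i:i + group_size] for i in range(0, len(utterances), group_size)]

def validate(groups):
    """'Validate' step: ensure each record satisfies the expected format."""
    return [[u for u in g if isinstance(u.speaker, str) and u.start >= 0] for g in groups]

def enrich(groups):
    """'Enrich' step: augment each record (here, with a word count)."""
    return [[(u, {"words": len(u.text.split())}) for u in g] for g in groups]

streams = [
    [Utterance("agent", 0.0, "thanks for calling"), Utterance("agent", 7.2, "let me check")],
    [Utterance("caller", 3.1, "i have a question about my account")],
]
for group in enrich(validate(normalize(parse(merge_streams(streams))))):
    print(group)
```
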