Patents Examined by Abul K. Azad
  • Patent number: 10381024
    Abstract: A voice activity detection system (100) filters audio input frames (102), on a frame-by-frame basis, through a gammatone filterbank (104) to generate filtered gammatone output signals (106). A signal energy calculator (108) takes the filtered gammatone output signals and generates a plurality of energy envelopes. Weighting factors are constructed (112) and applied to each of the energy envelopes, thereby producing normalized weighted signals (116), in which voice regions are emphasized and noise regions are minimized. An entropy measurement (118) is taken to extract information from the normalized weighted signals (116) and generate an entropy signal (120). The entropy signal (120) is averaged and compared to an adaptive entropy threshold (122), indicative of a noise floor. Decision logic (124) is used to identify speech and noise from the comparison of the averaged entropy signal to the adaptive entropy threshold.
    Type: Grant
    Filed: April 27, 2017
    Date of Patent: August 13, 2019
    Assignee: MOTOROLA SOLUTIONS, INC.
    Inventors: Cheah Heng Tan, Thean Hai Ooi, Wei Qing Ong, Alan Wee Chiat Tan
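The decision stage described above can be sketched in a few lines. This is a minimal illustration only: the gammatone filterbank is omitted, and the band energies, weights, and margin rule are assumptions, not taken from the patent.

```python
import numpy as np

def frame_entropy(band_energies, weights):
    """Weight the per-band energy envelopes, normalize them into a
    probability distribution, and return its Shannon entropy."""
    w = np.asarray(band_energies, float) * np.asarray(weights, float)
    p = w / (w.sum() + 1e-12)          # normalized weighted signal
    p = p[p > 0]                       # avoid log2(0)
    return float(-(p * np.log2(p)).sum())

def classify_frame(avg_entropy, noise_floor_entropy, margin=0.5):
    """Decision logic: flag speech when the averaged entropy deviates
    from the adaptive noise-floor entropy by more than a margin."""
    return "speech" if abs(avg_entropy - noise_floor_entropy) > margin else "noise"
```

For four equal-energy bands with uniform weights the entropy is log2(4) = 2 bits; frames whose averaged entropy stays near that noise-floor value are labeled noise.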
  • Patent number: 10381003
    Abstract: A voice acquisition system includes a plurality of mobile objects each of which includes one or more microphones and is movable around a sound source; a sound source number estimating unit which estimates the number of sound sources in a vicinity of any of the mobile objects; and a control unit which controls positions of the mobile objects, wherein the control unit controls, based on the number of sound sources in a vicinity of a first mobile object, a position of a second mobile object which differs from the first mobile object, and the voice acquisition system acquires voice using both a microphone included in the first mobile object and a microphone included in the second mobile object.
    Type: Grant
    Filed: September 14, 2017
    Date of Patent: August 13, 2019
    Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA
    Inventors: Youhei Wakisaka, Hitoshi Yamada, Tomoya Takatani, Narimasa Watanabe
  • Patent number: 10381014
    Abstract: A comfort noise controller for generating CN (Comfort Noise) control parameters is described. A buffer of a predetermined size is configured to store CN parameters for SID (Silence Insertion Descriptor) frames and active hangover frames. A subset selector is configured to determine a CN parameter subset relevant for SID frames based on the age of the stored CN parameters and on residual energies. A comfort noise control parameter extractor (50B) is configured to use the determined CN parameter subset to determine the CN control parameters for a first SID frame following an active signal frame.
    Type: Grant
    Filed: August 22, 2017
    Date of Patent: August 13, 2019
    Assignee: TELEFONAKTIEBOLAGET LM ERICSSON (PUBL)
    Inventor: Tomas Jansson Toftgård
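The subset selection in this abstract keys on the age of the stored CN parameters and on residual energies. A toy version of that filter, with an assumed age limit and an assumed median-energy margin (neither is from the patent), might look like:

```python
import statistics

def select_cn_subset(buffer, max_age=5, energy_margin_db=6.0):
    """Keep buffered CN parameter sets that are recent enough (age) and
    whose residual energy lies near the buffer's median energy.
    Each buffer entry: (age_in_frames, residual_energy_db, params)."""
    median_e = statistics.median(e for _, e, _ in buffer)
    return [params for age, e, params in buffer
            if age <= max_age and abs(e - median_e) <= energy_margin_db]
```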
  • Patent number: 10380247
    Abstract: The present disclosure provides language-based mechanisms for generating acronyms from text input. The language of the text input may be provided or automatically detected. The target acronym length may indicate a maximum length and may vary depending on the input language. The text input may be separated into tokens and organized as a token tree list. Based on the tokens, an acronym may be generated from the available capital words. If there are not enough capital words, all words (e.g., both capitalized and lowercase words) may be used to generate the acronym. If there are not enough words, then all words and segments may be used to generate the acronym. Finally, a background color may be generated based on characteristics relating to the text input or the generated acronym. The acronym and background color may be used to create a graphic, such as an icon or thumbnail, for a graphic user interface.
    Type: Grant
    Filed: October 28, 2016
    Date of Patent: August 13, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Nicholas Anthony Buelich, II, Dmitriy Meyerzon, Vidya Srinivasan
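The fallback strategy (capital words first, then all words) can be sketched as follows; the tokenization and selection rule are simplified assumptions, and the token-tree and segment-level fallback are omitted.

```python
import re

def make_acronym(text, max_len=3):
    """Prefer initials of capitalized words; if there are not enough
    capital words, fall back to initials of all words."""
    tokens = re.findall(r"\w+", text)
    caps = [t for t in tokens if t[0].isupper()]
    source = caps if len(caps) >= max_len else tokens
    return "".join(t[0].upper() for t in source[:max_len])
```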
  • Patent number: 10373630
    Abstract: Methods, apparatus, systems and articles of manufacture are disclosed for distributed automatic speech recognition. An example apparatus includes a detector to process an input audio signal and identify a portion of the input audio signal including a sound to be evaluated, the sound to be evaluated organized into a plurality of audio features representing the sound. The example apparatus includes a quantizer to process the audio features using a quantization process to reduce the audio features to generate a reduced set of audio features for transmission. The example apparatus includes a transmitter to transmit the reduced set of audio features over a low-energy communication channel for processing.
    Type: Grant
    Filed: March 31, 2017
    Date of Patent: August 6, 2019
    Assignee: Intel Corporation
    Inventors: Binuraj K. Ravindran, Francis M. Tharappel, Prabhakar R. Datta, Tobias Bocklet, Maciej Muchlinski, Tomasz Dorau, Josef G. Bauer, Saurin Shah, Georg Stemmer
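The abstract's quantization step reduces the feature payload before transmission. A generic uniform quantizer, shown here purely as a stand-in (the patent does not specify this scheme, and the 4-bit depth is an assumption), illustrates the size reduction:

```python
import numpy as np

def quantize_features(features, n_bits=4):
    """Uniformly quantize audio features to n_bits per value, shrinking
    the payload sent over a low-energy channel."""
    f = np.asarray(features, float)
    lo, hi = f.min(), f.max()
    levels = 2 ** n_bits - 1
    codes = np.round((f - lo) / (hi - lo + 1e-12) * levels).astype(np.uint8)
    return codes, (lo, hi)

def dequantize(codes, lo_hi, n_bits=4):
    """Reconstruct approximate feature values at the receiver."""
    lo, hi = lo_hi
    levels = 2 ** n_bits - 1
    return codes.astype(float) / levels * (hi - lo) + lo
```

At 4 bits per value instead of 32-bit floats, the transmitted feature set is an eighth of the original size, at the cost of a small reconstruction error.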
  • Patent number: 10373606
    Abstract: A transliteration support device according to an embodiment includes an acquisition unit, an extraction unit, a generation unit, and a reproduction unit. The acquisition unit acquires a text to be transliterated. The addition unit adds a transliteration tag indicating a transliteration setting of the text to the text. The extraction unit extracts a transliteration pattern in which a frequent appearance transliteration setting frequently appearing in the transliteration settings indicated by the transliteration tags and an applicable condition when the frequent appearance transliteration setting is applied to the text are in association with each other. The generation unit produces a synthesized voice using the transliteration pattern. The reproduction unit reproduces the produced synthesized voice.
    Type: Grant
    Filed: January 27, 2017
    Date of Patent: August 6, 2019
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Taira Ashikawa, Kosei Fume, Yuka Kuroda, Yoshiaki Mizuoka
  • Patent number: 10366697
    Abstract: A method and a device for encoding a high frequency signal, and a method and a device for decoding a high frequency signal are provided, which relate to encoding and decoding technology. The method for encoding a high frequency signal includes: determining a signal type of a high frequency signal of a current frame; smoothing and scaling time envelopes of the high frequency signal of the current frame and obtaining time envelopes of the high frequency signal of the current frame that are to be encoded, if the high frequency signal of the current frame is a non-transient signal and a high frequency signal of the previous frame is a transient signal; and quantizing and encoding the time envelopes of the high frequency signal of the current frame that are to be encoded, and frequency information and signal type information of the high frequency signal of the current frame.
    Type: Grant
    Filed: July 17, 2017
    Date of Patent: July 30, 2019
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Zexin Liu, Lei Miao, Anisse Taleb
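The transient-to-non-transient transition handling above can be sketched roughly. The three-tap smoothing kernel and the energy-preserving scaling are illustrative assumptions; the patent does not specify these particular operations.

```python
import numpy as np

def smooth_envelopes(envelopes, prev_was_transient, current_is_transient):
    """Smooth and scale the time envelopes of a non-transient frame that
    follows a transient frame; other frames pass through unchanged."""
    env = np.asarray(envelopes, float)
    if prev_was_transient and not current_is_transient:
        smoothed = np.convolve(env, [0.25, 0.5, 0.25], mode="same")
        # rescale so the mean envelope level is preserved
        return smoothed * (env.mean() / (smoothed.mean() + 1e-12))
    return env
```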
  • Patent number: 10362165
    Abstract: Disclosed are systems, methods, and computer readable media for tracking a person of interest. The method embodiment comprises identifying a person of interest, capturing a voiceprint of the person of interest, comparing a received voiceprint of a caller with the voiceprint of the person of interest, and tracking the caller if the voiceprint of the caller is a substantial match to the voiceprint of the person of interest.
    Type: Grant
    Filed: June 20, 2016
    Date of Patent: July 23, 2019
    Assignee: AT&T INTELLECTUAL PROPERTY II, L.P.
    Inventors: Gustavo De Los Reyes, Sanjay Macwan
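The "substantial match" comparison above is commonly realized as a similarity score against a threshold. The cosine-similarity form, the embedding representation, and the 0.8 threshold below are assumptions for illustration, not details from the patent.

```python
import numpy as np

def voiceprint_match(caller_vec, target_vec, threshold=0.8):
    """Compare a caller's voiceprint embedding to the person of
    interest's via cosine similarity; return (is_match, score)."""
    a = np.asarray(caller_vec, float)
    b = np.asarray(target_vec, float)
    sim = a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    return sim >= threshold, float(sim)
```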
  • Patent number: 10347256
    Abstract: A system for generating channel-compensated features of a speech signal includes a channel noise simulator that degrades the speech signal, a feed forward convolutional neural network (CNN) that generates channel-compensated features of the degraded speech signal, and a loss function that computes a difference between the channel-compensated features and handcrafted features for the same raw speech signal. Each loss result may be used to update connection weights of the CNN until a predetermined threshold loss is satisfied, and the CNN may be used as a front-end for a deep neural network (DNN) for speaker recognition/verification. The DNN may include convolutional layers, a bottleneck features layer, multiple fully-connected layers and an output layer. The bottleneck features may be used to update connection weights of the convolutional layers, and dropout may be applied to the convolutional layers.
    Type: Grant
    Filed: September 19, 2017
    Date of Patent: July 9, 2019
    Assignee: Pindrop Security, Inc.
    Inventors: Elie Khoury, Matthew Garland
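The loss function described above compares CNN outputs against handcrafted features of the same raw speech. A minimal numpy sketch of that comparison (the mean-squared form and the threshold value are assumptions; the abstract only says a difference is computed and a threshold loss must be satisfied):

```python
import numpy as np

def compensation_loss(cnn_features, handcrafted_features):
    """Mean-squared difference between the CNN's channel-compensated
    features and the handcrafted features of the same raw speech."""
    c = np.asarray(cnn_features, float)
    h = np.asarray(handcrafted_features, float)
    return float(np.mean((c - h) ** 2))

def keep_training(loss, threshold=1e-3):
    """Weight updates continue while the loss exceeds the threshold."""
    return loss > threshold
```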
  • Patent number: 10335954
    Abstract: A computer-implemented method of handling an audio dialog between a robot and a human user comprises: during the audio dialog, receiving audio data and converting the audio data into text data; in response to the text data, determining a dialog topic, the dialog topic comprising a dialog content and a dialog voice skin, wherein the dialog content comprises a plurality of sentences; determining a sentence to be rendered in audio by the robot; and receiving a modification request of the determined dialog sentence. Described developments comprise, for example, different regulation schemes (e.g. open-loop or closed-loop), the use of moderation rules (centralized or distributed), and the use of priority levels and/or parameters depending on the environment perceived by the robot.
    Type: Grant
    Filed: April 17, 2015
    Date of Patent: July 2, 2019
    Assignee: SOFTBANK ROBOTICS EUROPE
    Inventors: Jérôme Monceaux, Gwennaël Gate, Gabriele Barbieri, Taylor Veltrop
  • Patent number: 10341694
    Abstract: Data processing methods, live broadcasting methods, and devices are disclosed. An example data processing method may comprise converting audio and video data into broadcast data in a predetermined format, performing speech recognition on audio data in the audio and video data, and adding the text information obtained from speech recognition into the broadcast data. Text information obtained from speech recognition of the audio data can thus be inserted in real time.
    Type: Grant
    Filed: August 4, 2017
    Date of Patent: July 2, 2019
    Assignee: ALIBABA GROUP HOLDING LIMITED
    Inventor: Gang Xu
  • Patent number: 10331789
    Abstract: A semantic analysis apparatus, method, and non-transitory computer readable storage medium thereof are provided. The semantic analysis apparatus performs phrase analysis on a Chinese character string to obtain several groups and semantically analyzes the groups to obtain at least one first probability distribution, wherein each first probability distribution has several first probability values corresponding to several tags one-to-one. The semantic analysis apparatus divides the Chinese character string into several Chinese characters and semantically analyzes the Chinese characters to obtain at least one second probability distribution, wherein each second probability distribution has several second probability values corresponding to the tags one-to-one.
    Type: Grant
    Filed: July 17, 2017
    Date of Patent: June 25, 2019
    Assignee: Institute For Information Industry
    Inventors: Yun-Kai Hsu, Tsung-Chieh Chen, Chih-Li Huo, Keng-Wei Hsu
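The apparatus produces two probability distributions over the same tag set, one from phrase-level analysis and one from character-level analysis. How the two are combined is not stated in the abstract; a simple linear blend, shown here as an assumed example, conveys the idea:

```python
import numpy as np

def combine_tag_distributions(phrase_probs, char_probs, alpha=0.5):
    """Blend the phrase-level (first) and character-level (second) tag
    probability distributions and renormalize."""
    p = alpha * np.asarray(phrase_probs, float) + (1 - alpha) * np.asarray(char_probs, float)
    return p / p.sum()

def best_tag(tags, probs):
    """Return the tag with the highest combined probability."""
    return tags[int(np.argmax(probs))]
```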
  • Patent number: 10332536
    Abstract: An apparatus for decoding an encoded audio signal including bandwidth extension control data indicating either a first harmonic bandwidth extension mode or a second non-harmonic bandwidth extension mode, includes: an input interface for receiving the encoded audio signal including the bandwidth extension control data indicating either the first harmonic bandwidth extension mode or the second non-harmonic bandwidth extension mode; a processor for decoding the audio signal using the second non-harmonic bandwidth extension mode; and a controller for controlling the processor to decode the audio signal using the second non-harmonic bandwidth extension mode, even when the bandwidth extension control data indicates the first harmonic bandwidth extension mode for the encoded signal.
    Type: Grant
    Filed: June 13, 2017
    Date of Patent: June 25, 2019
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Andreas Niedermeier, Stephan Wilde, Daniel Fischer, Matthias Hildenbrand, Marc Gayer, Max Neuendorf
  • Patent number: 10331784
    Abstract: A system and method are provided for disambiguating natural language processing requests based on smart matching, request confirmations that are used until ambiguities are resolved, and machine learning. Smart matching may match entities (e.g., contact names, place names, etc.) based on user information such as call logs, user preferences, etc. If multiple matches are found and disambiguation has not yet been learned by the system, the system may request that the user identify the intended entity. On the other hand, if disambiguation has been learned by the system, the system may execute the request without confirmations. The system may use a record of confirmations and/or other information to continuously learn a user's inputs in order to reduce ambiguities and no longer prompt for confirmations.
    Type: Grant
    Filed: July 31, 2017
    Date of Patent: June 25, 2019
    Assignee: Voicebox Technologies Corporation
    Inventors: Erik Swart, Emilie Drouin
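The confirm-until-learned flow above can be sketched as follows. The substring matching, the call-log frequency ranking, and the `learned` mapping are all illustrative assumptions about one way such a system could behave.

```python
def match_contact(query, contacts, call_log, learned=None):
    """Return (contact, needs_confirmation). A learned disambiguation
    skips confirmation; otherwise ambiguous matches are ranked by
    call-log frequency and flagged for user confirmation."""
    learned = learned or {}
    if query in learned:
        return learned[query], False          # disambiguation already learned
    candidates = [c for c in contacts if query.lower() in c.lower()]
    if len(candidates) <= 1:
        return (candidates[0] if candidates else None), False
    ranked = sorted(candidates, key=lambda c: -call_log.count(c))
    return ranked[0], True                    # ask the user to confirm
```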
  • Patent number: 10319368
    Abstract: A meaning generation method, in a meaning generation apparatus, includes acquiring meaning training data including text data of a sentence that can be an utterance sentence and meaning information indicating a meaning of the sentence and associated with the text data of the sentence, acquiring restatement training data including the text data of the sentence and text data of a restatement sentence of the sentence, and learning association between the utterance sentence and the meaning information and the restatement sentence. The learning includes learning of a degree of importance of a word included in the utterance sentence, and the learning is performed by applying the meaning training data and the restatement training data to a common model, and storing a result of the learning as learning result information.
    Type: Grant
    Filed: June 9, 2017
    Date of Patent: June 11, 2019
    Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.
    Inventors: Takashi Ushio, Katsuyoshi Yamagami
  • Patent number: 10311873
    Abstract: A voice interaction apparatus includes: a voice recognizer configured to recognize content of a speech of a user; an extractor configured to extract profile information based on a result of the voice recognition, and to specify which user the profile information is associated with; a storage configured to store the extracted profile information in association with the user; an exchanger configured to exchange profile information with another voice interaction apparatus; and a generator configured to generate a speech sentence to speak to the user based on the profile information of the user.
    Type: Grant
    Filed: September 14, 2017
    Date of Patent: June 4, 2019
    Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA
    Inventors: Satoshi Kume, Atsushi Ikeno, Toshihiko Watanabe, Muneaki Shimada, Hayato Sakamoto, Toshifumi Nishijima, Fuminori Kataoka, Hiromi Tonegawa, Norihide Umeyama
  • Patent number: 10298736
    Abstract: A voice signal processing apparatus includes: an input unit which receives a voice signal of a user; a detecting unit which detects an auxiliary signal; and a signal processing unit which transmits the voice signal to an external terminal in a first operation mode and transmits the voice signal and the auxiliary signal to the external terminal using the same or different protocols in a second operation mode.
    Type: Grant
    Filed: July 6, 2016
    Date of Patent: May 21, 2019
    Assignee: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Min Kyu Lee, Sang Hun Kim, Young Ik Kim, Dong Hyun Kim, Mu Yeol Choi
  • Patent number: 10296289
    Abstract: Technology for detecting multimodal commands that enhance the human-computer interaction of a computing device. In an illustrative implementation, a computing device may receive multiple input events from a plurality of input devices. The plurality of input devices may each correspond to a different computer input modality, and the computing device may correlate the input events across different modalities. The computing device may keep the input events in their native form (e.g., input device specific) or may transform the input events into modality independent events. In either example, the computing device may determine the events satisfy a definition for a multimodal command that identifies multiple events from different computer input modalities. Responsive to the determination, the computing device may invoke the multimodal command on the client device to perform one or more computing operations.
    Type: Grant
    Filed: June 1, 2017
    Date of Patent: May 21, 2019
    Assignee: salesforce.com, inc.
    Inventor: Peng-Wen Chen
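One simple way to realize "events satisfy a definition" is a time-window check across modalities. The event tuple format `(modality, name, timestamp)` and the one-second window below are assumptions for illustration:

```python
def detect_multimodal_command(events, definition, window=1.0):
    """True when every required (modality, name) pair in the definition
    occurs among the events within the given time window (seconds)."""
    required = set(definition)
    times = {}
    for modality, name, t in events:
        if (modality, name) in required:
            times[(modality, name)] = t
    if set(times) != required:                # some required event missing
        return False
    ts = times.values()
    return max(ts) - min(ts) <= window        # all events close in time
```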
  • Patent number: 10283113
    Abstract: The disclosure concerns a method for recognizing driving noise in a sound signal that is acquired by a microphone disposed in a vehicle. The sound signal originates from the surface structure of the road. According to the disclosure, a segment of the road lying ahead of the vehicle in the direction of travel is observed with a sensor installed in or on the vehicle. Using the observation data obtained, the start and duration of driving noise originating from the surface structure of the road are predicted.
    Type: Grant
    Filed: December 29, 2016
    Date of Patent: May 7, 2019
    Assignee: FORD GLOBAL TECHNOLOGIES, LLC
    Inventors: Christoph Arndt, Mohsen Lakehal-Ayat
  • Patent number: 10276182
    Abstract: A sound processing device includes a processor configured to generate a first frequency spectrum of a first sound signal corresponding to a first sound received at a first input device and a second frequency spectrum of a second sound signal corresponding to the first sound received at a second input device, calculate a transfer characteristic based on a first difference between an intensity of the first frequency spectrum and an intensity of the second frequency spectrum, generate a third frequency spectrum of a third sound signal transmitted from the first input device and a fourth frequency spectrum of a fourth sound signal transmitted from the second input device, and specify a suppression level of an intensity of the fourth frequency spectrum based on a second difference between an intensity of the third frequency spectrum and an intensity of the fourth frequency spectrum.
    Type: Grant
    Filed: August 2, 2017
    Date of Patent: April 30, 2019
    Assignee: FUJITSU LIMITED
    Inventors: Takeshi Otani, Taro Togawa, Sayuri Nakayama
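The core idea, comparing the observed inter-microphone level difference against a calibrated transfer characteristic and suppressing the second signal where it deviates, can be sketched per frequency band. The dB-domain rule and the fixed 12 dB attenuation are assumptions, not details from the patent:

```python
import numpy as np

def suppression_gain(first_spec_db, second_spec_db, transfer_db, atten_db=12.0):
    """Per band, compare the level difference between the two input
    devices against the transfer characteristic; where the second
    signal is louder than that characteristic predicts, return a
    negative gain (attenuation in dB), else 0 dB."""
    diff = np.asarray(first_spec_db, float) - np.asarray(second_spec_db, float)
    leakage = diff < transfer_db          # second mic louder than expected
    return np.where(leakage, -atten_db, 0.0)
```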