Patents Examined by Seong-Ah A Shin
  • Patent number: 10978061
    Abstract: A method, a computer system, and a computer program product for detecting voice commands. Audio is recorded by the computer system to form recorded audio. The computer system then determines whether a voice command spoken by a first person is present in the recorded audio. If the voice command is present in the recorded audio, the computer system determines whether the voice command is directed by the first person to a second person. If the voice command is not directed to the second person, the computer system processes the voice command, wherein processing of the voice command occurs without a wake word.
    Type: Grant
    Filed: March 9, 2018
    Date of Patent: April 13, 2021
    Assignee: International Business Machines Corporation
    Inventors: Gregory J. Boss, Jeremy R. Fox, Andrew R. Jones, John E. Moore, Jr.
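    Illustrative sketch (not part of the patent): a minimal Python rendering of the decision flow in the abstract above; the function names, the command detector, and the addressee heuristic are assumptions for illustration, not the patented method.
```python
def handle_recorded_audio(recorded_audio, asr, addressee_classifier, execute):
    """Illustrative decision flow: act on a command only when it is not
    directed at another person, without requiring a wake word."""
    # Hypothetical ASR step: find a candidate voice command in the audio.
    command = asr.detect_command(recorded_audio)
    if command is None:
        return  # no voice command present in the recorded audio

    # Hypothetical classifier: is the first person addressing a second person?
    if addressee_classifier.is_directed_at_another_person(recorded_audio, command):
        return  # conversation between people; do not act on it

    # The command is meant for the system, so process it; no wake word needed.
    execute(command)
```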
  • Patent number: 10964311
    Abstract: According to one embodiment, a word detection system acquires speech data including a plurality of frames, generates a speech characteristic amount, calculates a frame score by matching a reference model associated with a target word against the frames in the speech data based on the speech characteristic amount, calculates a first score of the word from the frame score, detects the word from the speech data based on the first score, calculates a second score of the word based on the frame score and time information on the start and the end of the detected word, compares the value of the second score with the second scores of a plurality of words, and determines a word to be output based on the comparison result.
    Type: Grant
    Filed: September 13, 2018
    Date of Patent: March 30, 2021
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventor: Hiroshi Fujimura
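    Illustrative sketch (not part of the patent): a rough Python example of the two-stage scoring idea in the abstract above (per-frame matching, a first word score, then a second score over the detected word's start/end span); the scoring and aggregation rules are simplified placeholders, not the patented formulas.
```python
import numpy as np

def detect_word(frame_features, reference_model, threshold=0.5):
    """Toy two-stage keyword scoring.

    frame_features: (num_frames, dim) acoustic features for the utterance.
    reference_model: (dim,) template vector for the target word (placeholder).
    """
    # Per-frame score by matching the reference model against each frame.
    frame_scores = frame_features @ reference_model  # shape (num_frames,)

    # First score of the word: best frame score (simplified aggregation).
    first_score = float(frame_scores.max())
    if first_score < threshold:
        return None  # word not detected

    # Time information: start/end of the region above the threshold.
    hits = np.flatnonzero(frame_scores >= threshold)
    start, end = int(hits[0]), int(hits[-1]) + 1

    # Second score: average frame score over the detected span; competing
    # words would be compared on this score and the best one output.
    second_score = float(frame_scores[start:end].mean())
    return {"start": start, "end": end, "score": second_score}
```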
  • Patent number: 10930298
    Abstract: Audio signal processing for adaptive de-reverberation uses a least mean squares (LMS) filter that has improved convergence over conventional LMS filters, making embodiments practical for reducing the effects of reverberation for use in many portable and embedded devices, such as smartphones, tablets, laptops, and hearing aids, for applications such as speech recognition and audio communication in general. The LMS filter employs a frequency-dependent adaptive step size to speed up the convergence of the predictive filter process, requiring fewer computational steps compared to a conventional LMS filter applied to the same inputs. The improved convergence is achieved at low memory consumption cost. Controlling the updates of the prediction filter under highly non-stationary acoustic channel conditions improves performance in those conditions. The techniques are suitable for single or multiple channels and are applicable to microphone array processing.
    Type: Grant
    Filed: December 22, 2017
    Date of Patent: February 23, 2021
    Assignee: SYNAPTICS INCORPORATED
    Inventors: Saeed Mosayyebpour Kaskari, Francesco Nesta
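    Illustrative sketch (not part of the patent): a toy per-frequency-bin adaptive prediction filter with a power-normalized step size, in the spirit of the abstract above; it is a generic LMS sketch under assumed parameters, not Synaptics' implementation.
```python
import numpy as np

def dereverb_bin(stft_bin, order=10, base_mu=0.3, eps=1e-8):
    """Toy single-bin linear-prediction de-reverberation with a normalized,
    power-dependent step size (each frequency bin adapts at its own rate)."""
    stft_bin = np.asarray(stft_bin, dtype=complex)
    w = np.zeros(order, dtype=complex)        # prediction filter for this bin
    out = np.zeros(len(stft_bin), dtype=complex)
    power = eps                               # running input-power estimate
    for n in range(len(stft_bin)):
        # Delayed past frames used to predict the reverberant component.
        x = np.zeros(order, dtype=complex)
        past = stft_bin[max(0, n - order):n][::-1]
        x[:len(past)] = past

        power = 0.9 * power + 0.1 * float(np.vdot(x, x).real)
        prediction = np.vdot(w, x)            # estimated reverberant part
        e = stft_bin[n] - prediction          # de-reverberated output frame
        out[n] = e

        mu = base_mu / (power + eps)          # bin/power-dependent step size
        w += mu * np.conj(e) * x              # LMS-style filter update
    return out
```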
  • Patent number: 10931758
    Abstract: Sensor data is received from a physical environment sensor. A component region associated with at least a portion of the sensor data is identified. A physical environment has been defined to include a plurality of component regions. Context information associated with the identified component region is obtained. The context information is utilized in association with a prediction model.
    Type: Grant
    Filed: April 13, 2018
    Date of Patent: February 23, 2021
    Assignee: BrainofT Inc.
    Inventors: Ashutosh Saxena, Jinjing Zhou, Maurice Chu, Deng Deng, Kunal Lad, Ashwini Venkatesh
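    Illustrative sketch (not part of the patent): a minimal Python view of the data flow in the abstract above (sensor reading, component region, region context, prediction-model input); the region map and context store are invented placeholders.
```python
# Hypothetical lookup tables standing in for the defined physical environment.
REGION_BY_SENSOR = {"motion-1": "kitchen", "door-2": "entryway"}
CONTEXT_BY_REGION = {
    "kitchen":  {"occupancy": "high", "time_of_day_profile": "evening"},
    "entryway": {"occupancy": "low", "time_of_day_profile": "morning"},
}

def build_model_input(sensor_id, sensor_value):
    """Map sensor data to its component region, attach that region's context,
    and return features a prediction model could consume."""
    region = REGION_BY_SENSOR.get(sensor_id)        # identify the component region
    context = CONTEXT_BY_REGION.get(region, {})     # obtain the region's context
    return {"region": region, "value": sensor_value, **context}

print(build_model_input("motion-1", 0.8))
```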
  • Patent number: 10930284
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing voice data are provided. One of the methods, implemented by an IoT device, includes: receiving voice data from a server, wherein the voice data is obtained by the server converting text data to voice data; determining a content attribute associated with the voice data; determining a content attribute type of the content attribute associated with the voice data; determining a first play rule matching the content attribute type based on a matching relationship between content attribute types and respective first play rules, wherein the first play rule includes a play starting time and a play mode; and automatically playing the voice data according to the play starting time and the play mode.
    Type: Grant
    Filed: March 18, 2020
    Date of Patent: February 23, 2021
    Assignee: ADVANCED NEW TECHNOLOGIES CO., LTD.
    Inventors: Guolai Ma, Tian Chen, Liang Zhang, Zheng Yuan
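    Illustrative sketch (not part of the patent): a small Python example of the rule-matching step from the abstract above; the content attribute types and rule table are assumptions for illustration.
```python
from dataclasses import dataclass

@dataclass
class PlayRule:
    start_delay_s: float   # play starting time (delay before playback)
    mode: str              # play mode, e.g. "once" or "repeat"

# Hypothetical matching relationship between content attribute types and rules.
PLAY_RULES = {
    "payment_notification": PlayRule(start_delay_s=0.0, mode="once"),
    "promotion":            PlayRule(start_delay_s=5.0, mode="repeat"),
}

def choose_play_rule(content_attribute_type: str) -> PlayRule:
    """Return the first play rule matching the content attribute type."""
    return PLAY_RULES.get(content_attribute_type, PlayRule(0.0, "once"))

print(choose_play_rule("promotion"))
```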
  • Patent number: 10885271
    Abstract: Various embodiments of the present disclosure relate to systems and methods for dynamically modifying images based on the content of articles associated with the images, particularly the emotional content of an article. Among other things, embodiments of the present disclosure allow users to quickly and easily identify the emotional nature of an article based on such an image. Characteristics of an image associated with an article may also be modified in response to comments from viewers regarding the article.
    Type: Grant
    Filed: July 26, 2018
    Date of Patent: January 5, 2021
    Assignee: VERIZON MEDIA INC.
    Inventor: Agnes Liu
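    Illustrative sketch (not part of the patent): a toy Python mapping from an article's emotional classification to an image adjustment; the emotion categories and the brightness/saturation values are assumptions, not the patented behavior.
```python
# Hypothetical mapping from article emotion to simple image adjustments.
EMOTION_TO_ADJUSTMENT = {
    "positive": {"brightness": 1.1, "saturation": 1.2},
    "negative": {"brightness": 0.9, "saturation": 0.8},
    "neutral":  {"brightness": 1.0, "saturation": 1.0},
}

def image_adjustment_for_article(article_emotion, comment_emotions):
    """Pick an adjustment from the article's emotion, then nudge it if the
    viewers' comments lean a different way."""
    adjustment = dict(EMOTION_TO_ADJUSTMENT.get(article_emotion,
                                                EMOTION_TO_ADJUSTMENT["neutral"]))
    if comment_emotions and comment_emotions.count("negative") > len(comment_emotions) / 2:
        adjustment["saturation"] *= 0.9   # mostly negative comments: mute the image
    return adjustment
```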
  • Patent number: 10867620
    Abstract: The present disclosure relates to sibilance detection and mitigation in a voice signal. A method of sibilance detection and mitigation is described. In the method, a predetermined spectrum feature is extracted from a voice signal, the predetermined spectrum feature representing a distribution of signal energy over a voice frequency band. Sibilance is then identified based on the predetermined spectrum feature. Excessive sibilance is further identified from the identified sibilance based on a level of the identified sibilance. Then the voice signal is processed by decreasing a level of the excessive sibilance so as to suppress the excessive sibilance. Corresponding systems and computer program products are described as well.
    Type: Grant
    Filed: June 19, 2017
    Date of Patent: December 15, 2020
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Kai Li, David Gunawan
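    Illustrative sketch (not part of the patent): a compact Python example of the general idea in the abstract above, computing a spectral feature that compares sibilant-band energy to total energy, flagging excessive sibilance, and attenuating it; the band limits, thresholds, and reduction factor are illustrative assumptions.
```python
import numpy as np

def suppress_sibilance(frame, sample_rate=16000, band=(4000, 8000),
                       detect_ratio=0.6, excess_db=6.0, reduction=0.5):
    """Toy per-frame sibilance detection and attenuation."""
    spectrum = np.fft.rfft(frame)
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    sib = (freqs >= band[0]) & (freqs <= band[1])

    # Spectrum feature: share of signal energy in the sibilant band.
    total_energy = np.sum(np.abs(spectrum) ** 2) + 1e-12
    band_energy = np.sum(np.abs(spectrum[sib]) ** 2)
    ratio = band_energy / total_energy

    # Identify sibilance from the feature, then excessive sibilance from its level.
    level_db = 10.0 * np.log10(band_energy + 1e-12)
    if ratio > detect_ratio and level_db > excess_db:
        spectrum[sib] *= reduction            # suppress the excessive sibilance
    return np.fft.irfft(spectrum, n=len(frame))
```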
  • Patent number: 10869105
    Abstract: Various arrangements for voice-based metadata tagging of video content are presented. A request to add a spoken metadata tag to be linked with a video content instance may be received. A voice clip that includes audio spoken by a user may be received. Speech-to-text conversion of the voice clip may be performed to produce a proposed spoken metadata tag. A metadata integration database may be updated to link the spoken metadata tag with the video content instance.
    Type: Grant
    Filed: March 6, 2018
    Date of Patent: December 15, 2020
    Assignee: DISH Network L.L.C.
    Inventor: Jason Henderson
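    Illustrative sketch (not part of the patent): a minimal end-to-end Python example of the workflow in the abstract above (voice clip, speech-to-text, link tag to content); the speech-to-text callable and database schema are placeholders.
```python
import sqlite3

def add_spoken_tag(voice_clip_bytes, content_id, transcribe, db_path="tags.db"):
    """Transcribe a voice clip and link the resulting tag to a video content
    instance in a simple metadata integration table."""
    proposed_tag = transcribe(voice_clip_bytes)   # placeholder speech-to-text call

    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS metadata_tags (content_id TEXT, tag TEXT)")
    conn.execute("INSERT INTO metadata_tags VALUES (?, ?)", (content_id, proposed_tag))
    conn.commit()
    conn.close()
    return proposed_tag

# Example usage: add_spoken_tag(clip_bytes, "episode-42", transcribe=my_stt_function)
```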
  • Patent number: 10860648
    Abstract: Systems, methods, and computer-readable media are disclosed for detecting a mismatch between the spoken language in an audio file and the audio language that is tagged as the spoken language in the audio file metadata. Example methods may include receiving a media file including spoken language metadata. Certain methods include generating an audio sample from the media file. Certain methods include generating a text translation of the audio sample based on the spoken language metadata. Certain methods include determining that the spoken language metadata does not match a spoken language in the audio sample based on the text translation. Certain methods include sending an indication that the spoken language metadata does not match the spoken language.
    Type: Grant
    Filed: September 12, 2018
    Date of Patent: December 8, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Manolya McCormick, Vimal Bhat, Shai Ben Nun
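    Illustrative sketch (not part of the patent): a short Python example of the mismatch check in the abstract above, assuming an external transcription function that reports confidence; the callables, the confidence heuristic, and the threshold are assumptions, not Amazon's implementation.
```python
def spoken_language_mismatch(media_file, tagged_language, extract_audio_sample,
                             transcribe, confidence_threshold=0.5):
    """Transcribe an audio sample using an ASR model for the tagged language;
    a very low-confidence transcript suggests the metadata tag does not match
    the language actually spoken (a simplified stand-in for the patented check)."""
    sample = extract_audio_sample(media_file)                 # e.g. first N seconds
    transcript, confidence = transcribe(sample, language=tagged_language)
    return confidence < confidence_threshold                  # True = mismatch

# Example usage: if spoken_language_mismatch(f, "en-US", clip_fn, asr_fn): flag_title(f)
```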
  • Patent number: 10847157
    Abstract: Implementations of the present disclosure include methods, systems, and computer-readable storage media for utilizing multiple AI service providers by a dialog management system. The dialog management system can include a dispatcher bot, multiple worker bots, and multiple AI adapters that are each associated with a different cloud-based AI service provider. In response to receiving a query, the dispatcher bot selects a particular worker bot to handle the query. The particular worker bot is assigned to a particular AI service provider. An AI adapter associated with the particular AI service provider generates a query message based on the query. The AI adapter sends the query message to the particular AI service provider and receives a response message. The dialog management system sends a representation of the response message to the particular worker bot, receives an answer for the query from the particular worker bot, and provides the answer for output.
    Type: Grant
    Filed: June 26, 2019
    Date of Patent: November 24, 2020
    Assignee: Accenture Global Solutions Limited
    Inventors: Laetitia Cailleteau Eriksson, Christopher Wickes, Marion Danielle Claude Perichaud Duncan, Augusto Gugliotta
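    Illustrative sketch (not part of the patent): a structural Python outline of the dispatcher/worker/adapter arrangement in the abstract above; the class names, provider client interface, and routing-by-domain heuristic are assumptions.
```python
class AIAdapter:
    """Wraps one cloud AI provider behind a common interface (placeholder client)."""
    def __init__(self, provider_client):
        self.client = provider_client
    def ask(self, query: str) -> str:
        message = {"text": query}            # build a provider-specific query message
        return self.client.send(message)     # assumed provider call; returns a response

class WorkerBot:
    """Handles queries for one domain using its assigned AI service provider."""
    def __init__(self, domain: str, adapter: AIAdapter):
        self.domain, self.adapter = domain, adapter
    def answer(self, query: str) -> str:
        return self.adapter.ask(query)

class DispatcherBot:
    """Selects the worker bot that should handle each incoming query."""
    def __init__(self, workers):
        self.workers = workers               # e.g. {"billing": WorkerBot(...), ...}
    def handle(self, query: str, domain: str) -> str:
        return self.workers[domain].answer(query)
```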
  • Patent number: 10811025
    Abstract: A computer-implemented method for automatically moderating a system response to a user's voice command. A voice command from a user is received. The voice command is associated with a system command, the system command including command requirements. A determination is then made as to whether the user is experiencing stress, based on a stress level detected in the received voice command, and the command requirements are dynamically adjusted when the user is determined to be experiencing stress.
    Type: Grant
    Filed: September 9, 2019
    Date of Patent: October 20, 2020
    Assignee: Allscripts Software, LLC
    Inventor: Todd M. Eischeid
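    Illustrative sketch (not part of the patent): a small Python example of the moderation step in the abstract above; the stress estimator and the specific requirements being relaxed (a confirmation prompt and a timeout) are illustrative assumptions.
```python
def moderate_command(voice_command_audio, estimate_stress, stress_threshold=0.7):
    """Relax a system command's requirements when the speaker sounds stressed."""
    requirements = {"needs_confirmation": True, "timeout_s": 10}   # default requirements

    stress = estimate_stress(voice_command_audio)   # placeholder 0..1 stress score
    if stress >= stress_threshold:
        # Dynamically adjust the command requirements for a stressed user.
        requirements["needs_confirmation"] = False
        requirements["timeout_s"] = 30
    return requirements
```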
  • Patent number: 10789949
    Abstract: An audio device with at least one microphone adapted to receive sound from a sound field and create an output, and a processing system that is responsive to the output of the microphone. The processing system is configured to use a signal processing algorithm to detect a wakeup word, and modify the signal processing algorithm that is used to detect the wakeup word if the sound field changes.
    Type: Grant
    Filed: June 20, 2017
    Date of Patent: September 29, 2020
    Assignee: Bose Corporation
    Inventors: Ricardo Carreras, Alaganandan Ganeshkumar
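    Illustrative sketch (not part of the patent): a rough Python example of modifying wake-word detection parameters when the sound field changes, per the abstract above; the energy-based change metric and the threshold adjustment are assumptions, not Bose's algorithm.
```python
import numpy as np

class WakeupWordDetector:
    """Toy detector whose acceptance threshold is re-tuned on sound-field changes."""
    def __init__(self, detect_fn, threshold=0.8):
        self.detect_fn = detect_fn              # placeholder wake-word scorer (0..1)
        self.threshold = threshold
        self.noise_floor = None                 # running estimate of the sound field

    def process(self, mic_frame):
        energy = float(np.mean(mic_frame ** 2))
        if self.noise_floor is None:
            self.noise_floor = energy
        elif abs(energy - self.noise_floor) > 5.0 * self.noise_floor:
            # Sound field changed noticeably: modify the detection algorithm
            # (here, simply raise the threshold in louder conditions).
            self.threshold = 0.9 if energy > self.noise_floor else 0.8
            self.noise_floor = energy
        else:
            self.noise_floor = 0.95 * self.noise_floor + 0.05 * energy

        score = self.detect_fn(mic_frame)       # wake-word likelihood for this frame
        return score >= self.threshold
```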
  • Patent number: 10783883
    Abstract: A method at a first electronic device of a local group of connected electronic devices includes: receiving a first voice command including a request for a first operation; determining a first target device for the first operation from among the local group; establishing a focus session with respect to the first target device; causing the first operation to be performed by the first target device; receiving a second voice command including a request for a second operation; determining that the second voice command does not include an explicit designation of a second target device; determining that the second operation can be performed by the first target device; determining whether the second voice command satisfies one or more focus session maintenance criteria; and if the second voice command satisfies the focus session maintenance criteria, causing the second operation to be performed by the first target device.
    Type: Grant
    Filed: November 1, 2017
    Date of Patent: September 22, 2020
    Assignee: GOOGLE LLC
    Inventors: Kenneth Mixter, Tomer Shekel, Tuan Anh Nguyen
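    Illustrative sketch (not part of the patent): a condensed Python version of the focus-session logic in the abstract above; the maintenance criterion shown (a time window plus a capability check) is one plausible example, not necessarily Google's criteria.
```python
import time

class FocusSessionRouter:
    """Routes follow-up voice commands to the last explicitly targeted device."""
    def __init__(self, session_timeout_s=120):
        self.focus_device = None
        self.focus_started = 0.0
        self.session_timeout_s = session_timeout_s

    def route(self, operation, target_device=None,
              device_can_perform=lambda device, op: True):
        if target_device is not None:
            # Explicit designation: establish a focus session with this device.
            self.focus_device, self.focus_started = target_device, time.time()
            return target_device
        # No explicit target: keep focus only if the maintenance criteria hold.
        in_window = (time.time() - self.focus_started) < self.session_timeout_s
        if self.focus_device and in_window and device_can_perform(self.focus_device, operation):
            return self.focus_device
        return None   # fall back to asking the user or a default device selection
```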
  • Patent number: 10769383
    Abstract: Embodiments of the present application disclose a cluster-based word vector processing method, apparatus, and device. Solutions include the following: in a cluster having a server cluster and a worker computer cluster, each worker computer in the worker computer cluster separately reads some corpuses in parallel, extracts a word and context words of the word from the read corpuses, obtains corresponding word vectors from a server in the server cluster, and trains those word vectors; the server cluster then updates the word vectors of the same words stored before the training according to the training results of the respective worker computers for those word vectors.
    Type: Grant
    Filed: January 15, 2020
    Date of Patent: September 8, 2020
    Assignee: Alibaba Group Holding Limited
    Inventors: Shaosheng Cao, Xinxing Yang, Jun Zhou, Xiaolong Li
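    Illustrative sketch (not part of the patent): a single-process Python outline of the split described in the abstract above, where worker code extracts (word, context) pairs from its corpus shard and produces per-word updates, and server code merges updates for the same word; the toy "move toward the context mean" update and the merge rule are simplifications.
```python
import numpy as np

DIM = 8  # assumed word-vector dimensionality

def worker_pass(corpus_shard, fetch_vectors, window=2):
    """Worker: read its shard, extract words and context words, train locally,
    and return per-word updates."""
    updates = {}
    for sentence in corpus_shard:
        for i, word in enumerate(sentence):
            context = sentence[max(0, i - window):i] + sentence[i + 1:i + 1 + window]
            if not context:
                continue
            vecs = fetch_vectors([word] + context)            # pull vectors from server
            target = np.mean([vecs[c] for c in context], axis=0)
            updates[word] = updates.get(word, 0) + 0.1 * (target - vecs[word])
    return updates

def server_merge(word_vectors, updates_from_workers):
    """Server: apply updates for the same word coming from multiple workers."""
    for updates in updates_from_workers:
        for word, delta in updates.items():
            word_vectors[word] = word_vectors.get(word, np.zeros(DIM)) + delta
    return word_vectors
```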
  • Patent number: 10740555
    Abstract: An operation is performed comprising: determining that a parse of an input string comprising a plurality of tokens is incomplete; generating, based on a machine learning (ML) model, (i) a plurality of candidate addition tokens for adding to the input string and (ii) a plurality of candidate removal tokens for removing from the input string; selecting, from the plurality of candidate addition tokens and the plurality of candidate removal tokens, a first candidate token; and modifying the input string based on the first candidate token to facilitate a complete parse of the modified input string by a parser.
    Type: Grant
    Filed: December 7, 2017
    Date of Patent: August 11, 2020
    Assignee: International Business Machines Corporation
    Inventors: Aysu Ezen Can, Roberto Delima, David Contreras, Corville O. Allen
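    Illustrative sketch (not part of the patent): a minimal Python repair loop matching the flow in the abstract above; the candidate generators are supplied callables standing in for the ML model, and the parser is assumed to raise ValueError on an incomplete parse.
```python
def repair_input(tokens, parse, propose_additions, propose_removals, max_steps=5):
    """Try candidate token additions/removals until the input string parses."""
    for _ in range(max_steps):
        try:
            return parse(tokens)                     # complete parse: done
        except ValueError:
            pass                                     # incomplete parse: modify input

        # Candidates would come from an ML model; here they are supplied callables.
        candidates = [("add", t) for t in propose_additions(tokens)]
        candidates += [("remove", t) for t in propose_removals(tokens)]
        if not candidates:
            break

        action, token = candidates[0]                # select the top-ranked candidate
        if action == "add":
            tokens = tokens + [token]
        elif token in tokens:
            tokens = [t for t in tokens if t != token]
    return None                                      # no complete parse found
```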
  • Patent number: 10732258
    Abstract: A system capable of detecting human presence based on output from a model-free detector and model-based detector(s). For example, the model-free detector may identify acoustic events and the model-based detectors can determine specific types of acoustic events and whether the acoustic events are associated with human activity. Using output from the model-based detectors, a device may confirm that an acoustic event identified by the model-free detector is associated with human activity or may determine that the acoustic event is associated with non-human activity and can be ignored. Thus, the device may detect human presence based on a wide variety of noises while reducing a number of false positives associated with the model-free detector.
    Type: Grant
    Filed: September 26, 2016
    Date of Patent: August 4, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Shiva Kumar Sundaram, Rui Wang
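    Illustrative sketch (not part of the patent): a compact Python example of combining a model-free event trigger with a model-based classifier, as outlined in the abstract above; the energy trigger, the threshold, and the label set are assumptions.
```python
import numpy as np

HUMAN_LABELS = {"speech", "footsteps", "door"}        # assumed human-related classes

def detect_presence(audio_frame, classify_event, energy_threshold=0.01):
    """Model-free stage: flag any acoustic event above an energy threshold.
    Model-based stage: classify the event and keep it only if human-related."""
    if float(np.mean(audio_frame ** 2)) < energy_threshold:
        return False                                   # no acoustic event at all
    label = classify_event(audio_frame)                # placeholder model-based detector
    return label in HUMAN_LABELS                       # ignore non-human events
```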
  • Patent number: 10726855
    Abstract: Certain example embodiments relate to speech privacy systems and/or associated methods. The techniques described herein disrupt the intelligibility of the perceived speech by, for example, superimposing onto an original speech signal a masking replica of the original speech signal in which portions of it are smeared by a time delay and/or amplitude adjustment, with the time delays and/or amplitude adjustments oscillating over time. In certain example embodiments, smearing of the original signal may be generated in frequency ranges corresponding to formants, consonant sounds, phonemes, and/or other related or non-related information-carrying building blocks of speech. Additionally, or in the alternative, annoying reverberations particular to a room or area in low frequency ranges may be “cut out” of the replica signal, without increasing or substantially increasing perceived loudness.
    Type: Grant
    Filed: March 15, 2017
    Date of Patent: July 28, 2020
    Assignee: GUARDIAN GLASS, LLC.
    Inventor: Alexey Krasnov
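    Illustrative sketch (not part of the patent): a toy Python rendering of the masking idea in the abstract above, superimposing a delayed, amplitude-modulated replica of the speech on itself with the delay and gain oscillating over time; the oscillation rates and depths are invented for illustration.
```python
import numpy as np

def mask_speech(signal, sample_rate=16000, max_delay_s=0.05,
                delay_rate_hz=0.3, gain_rate_hz=0.2):
    """Add a smeared replica whose delay and gain oscillate slowly over time."""
    n = np.arange(len(signal))
    t = n / sample_rate

    # Time-varying delay (in samples) and amplitude for the masking replica.
    delay = (0.5 + 0.5 * np.sin(2 * np.pi * delay_rate_hz * t)) * max_delay_s * sample_rate
    gain = 0.6 + 0.3 * np.sin(2 * np.pi * gain_rate_hz * t)

    # Build the replica by reading each output sample from an oscillating past index.
    src = np.clip(n - delay.astype(int), 0, len(signal) - 1)
    replica = gain * signal[src]
    return signal + replica
```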
  • Patent number: 10726848
    Abstract: Disclosed herein are methods of diarizing audio data using first-pass blind diarization and second-pass blind diarization that generate speaker statistical models, wherein the first-pass blind diarization is on a per-frame basis and the second-pass blind diarization is on a per-word basis, and methods of creating acoustic signatures for a common speaker based only on the statistical models of the speakers in each audio session.
    Type: Grant
    Filed: January 22, 2018
    Date of Patent: July 28, 2020
    Assignee: VERINT SYSTEMS LTD.
    Inventors: Alex Gorodetski, Oana Sidi, Ron Wein, Ido Shapira
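    Illustrative sketch (not part of the patent): a schematic Python outline of the two-pass structure in the abstract above (frame-level clustering, then word-level re-scoring against the resulting speaker models); the clustering, model-fitting, and scoring callables are placeholders, not Verint's algorithms.
```python
def diarize_two_pass(frames, words, cluster_frames, fit_speaker_models, score_word):
    """First pass assigns speakers per frame; second pass re-assigns per word
    using statistical speaker models built from the first pass."""
    # Pass 1: blind, per-frame speaker assignment (e.g. clustering of features).
    frame_labels = cluster_frames(frames)

    # Build statistical models for each speaker from the first-pass segments.
    speaker_models = fit_speaker_models(frames, frame_labels)

    # Pass 2: blind, per-word assignment against the speaker models.
    word_labels = []
    for word in words:                     # each word carries its frame span
        scores = {spk: score_word(word, model) for spk, model in speaker_models.items()}
        word_labels.append(max(scores, key=scores.get))
    return word_labels
```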
  • Patent number: 10714080
    Abstract: A weighted finite-state transducer (WFST) decoding system is provided. The WFST decoding system includes a memory that stores WFST data and a WFST decoder including data fetch logic. The WFST data has a structure including states and arcs connecting the states with directivity. The WFST data is compressed in the memory. The WFST data includes body data and header data including state information for each state, the state information being aligned discontinuously. The body data includes arc information of the arcs that is aligned continuously. The state information includes an arc index of the arcs, a number of the arcs, and compression information of the arcs, and the data fetch logic decompresses the WFST data using the compression information and retrieves the WFST data from the memory.
    Type: Grant
    Filed: September 8, 2017
    Date of Patent: July 14, 2020
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Jae Sung Yoon, Jun Seok Park
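    Illustrative sketch (not part of the patent): a minimal Python data-structure mirror of the layout described in the abstract above (per-state header entries pointing into a contiguous arc array); the field names and the compression flag are illustrative only.
```python
from dataclasses import dataclass

@dataclass
class StateHeader:
    arc_index: int          # index of this state's first arc in the body array
    num_arcs: int           # number of outgoing arcs
    compressed: bool        # compression information for those arcs (placeholder)

@dataclass
class Arc:
    next_state: int
    input_label: int
    output_label: int
    weight: float

def fetch_arcs(headers, body, state):
    """Data-fetch step: read a state's header, then slice its arcs out of the
    contiguous body data (decompression would happen here if flagged)."""
    h = headers[state]
    arcs = body[h.arc_index:h.arc_index + h.num_arcs]
    return arcs   # if h.compressed, a real decoder would decompress these first
```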
  • Patent number: 10685650
    Abstract: A mobile terminal including a touch screen; a microphone configured to receive voice information from a user; and a controller configured to analyze the voice information using a voice recognition algorithm, extract a term predicted to be unfamiliar to the user from the analyzed voice information based on a pre-stored knowledge database, search for information on the extracted term based on a context of the analyzed voice information, and display the searched information on the touch screen.
    Type: Grant
    Filed: December 29, 2017
    Date of Patent: June 16, 2020
    Assignee: LG ELECTRONICS INC.
    Inventor: Hyunjoo Jeon
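    Illustrative sketch (not part of the patent): a small Python example of the flow in the abstract above, transcribing speech, picking out terms absent from a known-vocabulary store, and looking them up with nearby words as context; the familiarity test and search callable are placeholders.
```python
def explain_unfamiliar_terms(transcript, known_terms, search):
    """Return search results for words the user is predicted not to know."""
    words = transcript.lower().split()
    results = {}
    for i, word in enumerate(words):
        if word in known_terms:
            continue                                    # predicted to be familiar
        context = " ".join(words[max(0, i - 3):i + 4])  # surrounding words as context
        results[word] = search(word, context)           # placeholder search call
    return results
```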