Patents Examined by Qi Han
  • Patent number: 10614823
    Abstract: Processing of an audio stream on the receiving side is facilitated. Encoding is performed on audio data to generate an audio stream in which audio frames containing audio compression data are continuously arranged. Tag information indicating that the audio compression data of a predetermined sound unit is included is inserted into each audio frame containing that data. A container stream of a predetermined format, including the audio stream into which the tag information has been inserted, is transmitted.
    Type: Grant
    Filed: December 6, 2016
    Date of Patent: April 7, 2020
    Assignee: SONY CORPORATION
    Inventor: Ikuo Tsukagoshi
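The tagging scheme above can be sketched in a few lines: the encoder attaches tag information to frames that carry a flagged sound unit, so the receiver can locate those frames without decoding the audio. The frame structure and field names here are illustrative assumptions, not the patent's actual bitstream format.

```python
def encode_stream(sound_units, tagged_unit_ids):
    """Build an audio stream (list of frames) from sound units,
    inserting tag information into frames whose unit is flagged."""
    stream = []
    for unit in sound_units:
        frame = {"compressed_data": unit["data"], "unit_id": unit["id"]}
        if unit["id"] in tagged_unit_ids:
            # Tag marks this frame as carrying the predetermined sound unit
            frame["tag"] = {"unit_id": unit["id"]}
        stream.append(frame)
    return stream

def find_tagged_frames(stream):
    """Receiver side: locate frames of interest by tag alone,
    without decoding the compressed audio."""
    return [i for i, f in enumerate(stream) if "tag" in f]

units = [{"id": u, "data": b"..."} for u in ("u1", "u2", "u3")]
stream = encode_stream(units, tagged_unit_ids={"u2"})
print(find_tagged_frames(stream))  # → [1]
```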
  • Patent number: 10607611
    Abstract: When transcribing large audio files, such as in the case of legal depositions, there are often many transcribers to choose from. Embodiments described herein enable calculation of expected accuracy of transcriptions by transcribers, which can be used to guide the selection of transcribers for specific tasks. In one embodiment, a computer receives a segment of an audio recording that includes speech of a person, and identifies an accent of the person and a topic of the segment. The computer generates feature values based on data that includes the accent and the topic, and utilizes a model to calculate, based on the feature values, an expected accuracy of a transcription of the segment by a certain transcriber. The model is generated based on training data that includes segments of previous audio recordings and values of accuracies of transcriptions, by the certain transcriber, of the segments.
    Type: Grant
    Filed: October 7, 2019
    Date of Patent: March 31, 2020
    Assignee: Verbit Software Ltd.
    Inventors: Eric Ariel Shellef, Yaakov Kobi Ben Tsvi, Iris Getz, Tom Livne, Elisha Yehuda Rosensweig
  • Patent number: 10607610
    Abstract: An audio firewall system has a microphone that generates audio data. A speech-to-text engine converts the audio data to text data. The text data is parsed for a service wake word and corresponding content data. The service wake word identifies one of a local security system and a remote assistant server. A text-to-speech engine converts the service wake word and the corresponding content data to converted audio data. The converted audio data is provided to the remote assistant server. The content data is provided to the local security system. The audio firewall system receives a response from the remote assistant server or the local security system and outputs an audio signal corresponding to the response.
    Type: Grant
    Filed: May 29, 2018
    Date of Patent: March 31, 2020
    Assignee: Nortek Security & Control LLC
    Inventors: Philip Alan Bunker, Mayank Saxena
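The routing logic of the firewall can be sketched like this: after speech-to-text, the system scans for a service wake word and forwards the remaining content either to the local security system or to the remote assistant; with no wake word, nothing leaves the device. The wake words and two-destination mapping are hypothetical.

```python
WAKE_WORDS = {"guard": "local", "assistant": "remote"}  # illustrative wake words

def route(text):
    """Return (destination, content) for a transcribed utterance,
    or (None, None) if no service wake word is present."""
    words = text.lower().split()
    for i, w in enumerate(words):
        if w in WAKE_WORDS:
            content = " ".join(words[i + 1:])
            return WAKE_WORDS[w], content
    return None, None  # nothing forwarded: audio never leaves the device

print(route("guard arm the back door"))       # → ('local', 'arm the back door')
print(route("assistant what is the weather"))  # → ('remote', 'what is the weather')
print(route("just people talking"))            # → (None, None)
```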
  • Patent number: 10606953
    Abstract: According to some embodiments, a system and method are provided to extract relationships from unstructured text documents. The method comprises receiving a training set of sentences that comprise labeled objects and subjects for creating an initial relationship model. A set of unlabeled sentences may be received. Objects and subjects from the set of unlabeled sentences are determined based on the initial model and the determined objects and subjects from the set of unlabeled sentences are displayed to a user for feedback and approval. An indication of whether the determined objects and subjects from the set of unlabeled sentences are correct is received and the initial relationship model is updated based on the received indication.
    Type: Grant
    Filed: December 8, 2017
    Date of Patent: March 31, 2020
    Assignee: General Electric Company
    Inventors: Varish Vyankatesh Mulwad, Kareem Sherif Aggour
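The human-in-the-loop update can be sketched as a simple loop: an initial model proposes subject/object pairs for unlabeled sentences, a user approves or rejects each, and approved pairs are folded back into the model. The "model" here is just a set of known pairs and the extractor is a toy heuristic; both are assumptions for illustration.

```python
def predict(model, sentence):
    # Toy extraction: first and last word as subject/object candidates
    words = sentence.split()
    return (words[0], words[-1])

def update_with_feedback(model, sentences, approve):
    """approve(sentence, pair) stands in for the user's yes/no review."""
    for s in sentences:
        pair = predict(model, s)
        if approve(s, pair):
            model.add(pair)   # approved extraction reinforces the model
    return model

model = {("GE", "turbines")}
unlabeled = ["Acme manufactures engines", "Sensors report temperature"]
model = update_with_feedback(model, unlabeled,
                             approve=lambda s, p: p[0] != "Sensors")
print(sorted(model))
```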
  • Patent number: 10593333
    Abstract: Embodiments of the present disclosure provide a method and a device for processing a voice message, a terminal, and a storage medium. The method includes: receiving a voice message sent by a user, the voice message being obtained from unordered (free-form) language interaction; determining a corresponding frequency-domain spectral feature based on the voice message, and performing signal processing on the spectral feature to obtain a first, frame-sequence-based acoustic feature corresponding to it; and performing feature extraction on the first acoustic feature to obtain a second acoustic feature based on an i-vector algorithm and a deep convolutional neural network algorithm with residual processing, converting the second acoustic feature into a voiceprint model corresponding to the user, and storing the voiceprint model in a voiceprint model database.
    Type: Grant
    Filed: December 29, 2017
    Date of Patent: March 17, 2020
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventor: Cong Gao
  • Patent number: 10586527
    Abstract: Creating and deploying a text-to-speech voice for a new language derived from the original phoneset of a known language, so that audio of the new language is output using a single TTS synthesizer. An end product message is determined in an original language n to be output as audio n by a text-to-speech engine, where the original language n includes an existing phoneset n comprising one or more phonemes n. Words and phrases of a new language n+1 are recorded, forming audio file n+1. This new audio file is labeled into unique units, defining one or more phonemes n+1. The new phonemes of the new language are added to the phoneset, forming new phoneset n+1, with the result that the end product message is output as audio in language n+1, different from the original language n.
    Type: Grant
    Filed: October 25, 2017
    Date of Patent: March 10, 2020
    Assignee: Third Pillar, LLC
    Inventors: Patrick Dexter, Kevin Jeffries
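The phoneset extension step can be sketched as a merge: phonemes labeled from newly recorded audio of language n+1 are added to the existing phoneset n, so a single synthesizer can voice both languages. The dictionary representation of phonemes and audio units is an illustrative assumption.

```python
def extend_phoneset(phoneset, new_phonemes):
    """Return phoneset n+1: the original phonemes plus the newly
    labeled ones, leaving existing entries untouched."""
    extended = dict(phoneset)
    for label, unit in new_phonemes.items():
        if label not in extended:   # keep existing phonemes as-is
            extended[label] = unit
    return extended

phoneset_n = {"AA": "audio_unit_aa", "K": "audio_unit_k"}   # known language n
phonemes_n1 = {"GH": "audio_unit_gh", "K": "audio_unit_k_alt"}  # labeled from audio file n+1
phoneset_n1 = extend_phoneset(phoneset_n, phonemes_n1)
print(sorted(phoneset_n1))  # → ['AA', 'GH', 'K']
```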
  • Patent number: 10559308
    Abstract: A system determines user intent from text. A conversation element is received. An intent is determined by matching a domain independent relationship and a domain dependent term determined from the received conversation element to an intent included in an intent database that stores a plurality of intents and by inputting the matched intent into a trained classifier that computes a likelihood that the matched intent is the intent of the received conversation element. An action is determined based on the determined intent. A response to the received conversation element is generated based on the determined action and output.
    Type: Grant
    Filed: June 7, 2019
    Date of Patent: February 11, 2020
    Assignee: SAS Institute Inc.
    Inventors: Jared Michael Dean Smythe, David Blake Styles, Richard Welland Crowell
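The two-stage pipeline in the abstract can be sketched as follows: a conversation element is matched against an intent database on (relationship, term) keys, a classifier then scores the matched intent, and an action is chosen. The intents, terms, and fixed scores standing in for a trained classifier are illustrative assumptions.

```python
INTENTS = {
    ("request", "balance"):  "check_balance",
    ("request", "transfer"): "make_transfer",
}

SCORES = {"check_balance": 0.92, "make_transfer": 0.88}  # stand-in classifier

def determine_intent(relationship, term, threshold=0.5):
    intent = INTENTS.get((relationship, term))
    if intent is None:
        return None
    # A trained classifier would compute this likelihood from features
    return intent if SCORES.get(intent, 0.0) >= threshold else None

def respond(relationship, term):
    intent = determine_intent(relationship, term)
    actions = {"check_balance": "Your balance is ...",
               "make_transfer": "Starting a transfer ..."}
    return actions.get(intent, "Sorry, I didn't understand that.")

print(respond("request", "balance"))  # → Your balance is ...
print(respond("request", "weather"))  # → Sorry, I didn't understand that.
```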
  • Patent number: 10553219
    Abstract: A voice recognition apparatus, a voice recognition method, and a non-transitory computer readable recording medium are provided. The voice recognition apparatus includes a storage configured to store a preset threshold value for voice recognition; a voice receiver configured to receive a voice signal of an uttered voice; and a voice recognition processor configured to recognize a voice recognition starting word from the received voice signal, perform the voice recognition on the voice signal in response to a similarity score, which represents a recognition result of the recognized voice recognition starting word, being greater than or equal to the stored preset threshold value, and change the preset threshold value based on the recognition result of the voice recognition starting word.
    Type: Grant
    Filed: July 19, 2016
    Date of Patent: February 4, 2020
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventor: Chi-sang Jung
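The adaptive threshold described above can be sketched with a small class: the starting word is accepted when its similarity score meets the stored threshold, and the threshold is then adjusted based on the recognition result. The specific update rule (exponential smoothing toward recently accepted scores) is an assumption, not the patent's formula.

```python
class WakeWordGate:
    def __init__(self, threshold=0.7, alpha=0.1):
        self.threshold = threshold  # preset threshold from storage
        self.alpha = alpha          # smoothing rate (assumed)

    def check(self, similarity_score):
        accepted = similarity_score >= self.threshold
        if accepted:
            # Drift the threshold toward recently accepted scores
            self.threshold += self.alpha * (similarity_score - self.threshold)
        return accepted

gate = WakeWordGate(threshold=0.7)
print(gate.check(0.9))           # → True (threshold rises toward 0.9)
print(gate.check(0.6))           # → False
print(round(gate.threshold, 2))  # → 0.72
```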
  • Patent number: 10546592
    Abstract: An audio signal encoding method includes: dividing a frequency band of an audio signal into a plurality of sub-bands and quantizing a sub-band normalization factor of each sub-band; determining the signal bandwidth of bit allocation according to the quantized sub-band normalization factor, or according to the quantized sub-band normalization factor and bit rate information; allocating bits for the sub-bands within the determined signal bandwidth; and coding a spectrum coefficient of the audio signal according to the bits allocated for each sub-band. According to embodiments of the present disclosure, during coding and decoding, the signal bandwidth of bit allocation is determined according to the quantized sub-band normalization factor and bit rate information. In this manner, the determined signal bandwidth is effectively coded and decoded by concentrating the bits, and audio quality is improved.
    Type: Grant
    Filed: May 16, 2018
    Date of Patent: January 28, 2020
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Fengyan Qi, Zexin Liu, Lei Miao
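The bit-allocation step can be sketched numerically: sub-band normalization factors are quantized, the bandwidth to code is chosen (here, sub-bands whose factor clears a floor), and the bit budget is split across that bandwidth in proportion to the factors. The quantizer, floor, and proportional split are illustrative assumptions, not the codec's actual rules.

```python
def allocate_bits(norm_factors, total_bits, floor=1):
    # Quantize factors to integers (coarse stand-in for the codec's quantizer)
    q = [round(f) for f in norm_factors]
    # Signal bandwidth of bit allocation: sub-bands with enough energy
    active = [i for i, v in enumerate(q) if v >= floor]
    weight = sum(q[i] for i in active)
    bits = [0] * len(q)
    for i in active:
        bits[i] = total_bits * q[i] // weight  # proportional split
    return bits

# Two of four sub-bands clear the floor; 100 bits concentrate on them
print(allocate_bits([3.2, 0.1, 1.9, 0.4], total_bits=100))  # → [60, 0, 40, 0]
```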
  • Patent number: 10535347
    Abstract: An approach is provided in which an information handling system sends a request in audio format to a user over a voice channel requesting a user data set. The information handling system receives utterances from the user over the voice channel and determines that the utterances do not provide enough information to complete the requested user data set. In turn, the information handling system establishes a messaging channel with the user and sends a request in digital format to the user over the messaging channel to provide additional data to complete the user data set.
    Type: Grant
    Filed: December 18, 2017
    Date of Patent: January 14, 2020
    Assignee: International Business Machines Corporation
    Inventors: Scott W. Graham, Lior Luker, Nitzan Nissim, Brian L. Pulito
  • Patent number: 10535352
    Abstract: A computer-implemented method includes associating, using a processor, one or more words in an electronic agenda template to at least one agenda item indicative of a point for discussion. The processor captures a real-time interaction comprising speech from one or more participants of a plurality of discussion participants into a digital representation. The processor isolates a portion of the real-time interaction from the digital representation. The portion is associated with a single speaker of the plurality of discussion participants. The processor makes at least one match between an isolated portion of the real-time interaction and the at least one agenda item. The processor determines an intent of the single speaker from the isolated portion, matches the determined intent to the at least one agenda item on the electronic agenda template, and generates discussion minutes output based on the matched intent and agenda item.
    Type: Grant
    Filed: November 16, 2017
    Date of Patent: January 14, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Sharathchandra Pankanti, Stefan Ravizza, Erik Rueger
  • Patent number: 10530719
    Abstract: A computing device includes an interface configured to interface and communicate with a communication system, a memory that stores operational instructions, and processing circuitry operably coupled to the interface and to the memory that is configured to execute the operational instructions to perform various operations. The computing device processes a message that is provided from a sender and is intended for a recipient associated with another computing device in accordance with topic, emotive content, and/or social content to generate a classification model for the message that includes classification parameter value(s). When appropriate to perform message transformation, the computing device selects a tonal transformation based on the classification parameter value(s) and processes the message in accordance with the tonal transformation to generate a normalized message.
    Type: Grant
    Filed: November 16, 2017
    Date of Patent: January 7, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Kelley Anders, Jeremy R. Fox, Liam S. Harpur, Jonathan Dunne
  • Patent number: 10528659
    Abstract: [Object] To present a response to a natural sentence in a more suitable manner, even in circumstances in which an ambiguous natural sentence can be input. [Solution] An information processing device including: an acquisition unit configured to acquire an extraction result of candidates for a response to an input, based on first information indicating a result of natural language analysis of a natural sentence acquired as the input and second information indicating a state or situation involved in the use of a predetermined device; and a control unit configured to cause a predetermined output unit to present information indicating the candidates for the response in a manner corresponding to the extraction result of the candidates.
    Type: Grant
    Filed: November 26, 2015
    Date of Patent: January 7, 2020
    Assignee: SONY CORPORATION
    Inventor: Yasuharu Asano
  • Patent number: 10521505
    Abstract: In an approach to generating blockchain smart contracts, one or more computer processors receive a request for a service from a user. The one or more computer processors extract one or more features from the request. The one or more computer processors determine one or more smart contract templates associated with the request based, at least in part, on the extracted one or more features. The one or more computer processors receive one or more responses to the request from one or more service providers. The one or more computer processors generate a draft smart contract based, at least in part on the determined one or more smart contract templates and the one or more received responses.
    Type: Grant
    Filed: December 17, 2017
    Date of Patent: December 31, 2019
    Assignee: International Business Machines Corporation
    Inventors: Ryan Anderson, Joseph Kozhaya, Christopher M. Madison, John Wolpert
  • Patent number: 10504512
    Abstract: Techniques for limiting natural language processing performed on input data are described. A system receives input data from a device. The input data corresponds to a command to be executed by the system. The system determines applications likely configured to execute the command. The system performs named entity recognition and intent classification with respect to only the applications likely configured to execute the command.
    Type: Grant
    Filed: September 22, 2017
    Date of Patent: December 10, 2019
    Assignee: AMAZON TECHNOLOGIES, INC.
    Inventors: Ruhi Sarikaya, Rohit Prasad, Kerry Hammil, Spyridon Matsoukas, Nikko Strom, Frédéric Johan Georges Deramat, Stephen Frederick Potter, Young-Bum Kim
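The shortlisting idea can be sketched simply: before running named entity recognition and intent classification, the system narrows to the applications likely able to handle the utterance, and invokes NLU only for those. The keyword-overlap heuristic and app names here are made-up stand-ins for whatever ranking the system actually uses.

```python
APP_KEYWORDS = {
    "music":   {"play", "song", "album"},
    "weather": {"weather", "forecast", "rain"},
    "timer":   {"timer", "alarm", "remind"},
}

def shortlist_apps(utterance):
    """Applications likely configured to execute the command."""
    tokens = set(utterance.lower().split())
    return [app for app, kws in APP_KEYWORDS.items() if tokens & kws]

def process(utterance):
    apps = shortlist_apps(utterance)
    # NER + intent classification would run only for these apps,
    # saving the cost of scoring every registered application.
    return apps

print(process("play the new album"))     # → ['music']
print(process("will it rain tomorrow"))  # → ['weather']
```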
  • Patent number: 10489517
    Abstract: In an embodiment, a method of providing an on demand translation service is provided. A subscriber may be charged a reduced fee or no fee for use of the on demand translation service in exchange for displaying commercial messages to the subscriber, the commercial messages being selected based on subscriber information. A multimedia signal including information in a source language may be received. The information may be obtained as text in the source language from the multimedia signal. The text may be translated from the source language to a target language. Translated information, based on the translated text, may be transmitted to a processing device for presentation to the subscriber. The received multimedia signal may be sent to a multimedia device for viewing.
    Type: Grant
    Filed: October 30, 2017
    Date of Patent: November 26, 2019
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Srinivas Bangalore, David Crawford Gibbon, Mazin Gilbert, Patrick Guy Haffner, Zhu Liu, Behzad Shahraray
  • Patent number: 10482903
    Abstract: A method for selectively interacting with multiple devices is provided. The method includes the following steps: receiving identical voice information transmitted by a plurality of terminal devices respectively; performing voice recognition on the received voice information; calculating the energy of a wake-up word in the respective voice information; and comparing the energies of the wake-up words with one another, and transmitting feedback information to the terminal devices according to the energy comparison result and the voice recognition result. By calculating the energy of the wake-up word in the voice information transmitted by each device, the distances between the respective devices and the user can be distinguished. A unique response is ensured by determining that the device closest to the user responds to the user's request, thus preserving the user experience.
    Type: Grant
    Filed: December 26, 2017
    Date of Patent: November 19, 2019
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Sha Tao, Yonghui Zuo, Peng Wang, Guoguo Chen, Ji Zhou, Kaihua Zhu
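The arbitration above can be sketched as follows: each device reports the energy of the wake word in its captured audio, and the device with the highest energy (assumed closest to the speaker) is the only one told to respond. The energy measure used here (mean of squared samples) is a common choice, not necessarily the patent's.

```python
def wake_word_energy(samples):
    # Mean squared amplitude over the wake-word segment
    return sum(s * s for s in samples) / len(samples)

def pick_responder(device_samples):
    """device_samples: {device_id: list of wake-word audio samples}.
    Returns the device with the strongest wake-word energy."""
    return max(device_samples,
               key=lambda d: wake_word_energy(device_samples[d]))

reports = {
    "kitchen": [0.1, 0.2, 0.1],  # far from the speaker
    "living":  [0.6, 0.8, 0.7],  # closest: strongest wake word
    "bedroom": [0.3, 0.2, 0.4],
}
print(pick_responder(reports))  # → living
```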
  • Patent number: 10482886
    Abstract: An interactive robot includes an image capturing device, an audio capturing device, an output device, and a processor. The processor is configured to obtain audio information captured by the audio capturing device and image information captured by the image capturing device, recognize a target from the audio information and the image information, confirm basic information and event information of the target and link the basic information with the event information, obtain key information from the event information of the target, implement a neural network analysis algorithm on the key information to confirm an emotion type of the target, search a preset public knowledge database according to the key information to obtain a relevant result, apply a deep learning algorithm on the relevant result and the emotion type of the target to determine a response, and execute the response through the output device.
    Type: Grant
    Filed: December 25, 2017
    Date of Patent: November 19, 2019
    Assignees: Fu Tai Hua Industry (Shenzhen) Co., Ltd., HON HAI PRECISION INDUSTRY CO., LTD.
    Inventor: Xue-Qin Zhang
  • Patent number: 10482181
    Abstract: A computing device for expert case-based natural language learning includes a blackboard database, a top level mapper, a bottom level case-based inference engine, and a bottom level translator. The blackboard database is configured to store context information corresponding to case semantics associated with natural language sentential forms. The case semantics include situation semantics and action semantics. The top level mapper is configured to query the blackboard database for the context information, map the situation semantics to the action semantics using the context information to form new case semantics, and store the new case semantics in a bottom level case database. The bottom level case-based inference engine is configured to match an input natural language sentential form to a matching case semantic stored in the bottom level case database. The bottom level translator is configured to translate the matching case semantic into natural language sentential form.
    Type: Grant
    Filed: August 1, 2018
    Date of Patent: November 19, 2019
    Assignee: United States of America as represented by the Secretary of the Navy
    Inventor: Stuart Harvey Rubin
  • Patent number: 10475439
    Abstract: There is provided an information processing system enabling a user to easily provide an instruction on whether to continue speech recognition processing on sound information, the information processing system including: a recognition control portion configured to control a speech recognition portion so that the speech recognition portion performs speech recognition processing on sound information input from a sound collection portion. The recognition control portion controls whether to continue the speech recognition processing on the basis of a gesture of the user detected at predetermined timing.
    Type: Grant
    Filed: December 7, 2015
    Date of Patent: November 12, 2019
    Assignee: SONY CORPORATION
    Inventors: Shinichi Kawano, Yuhei Taki