Similarity Patents (Class 704/239)
  • Patent number: 11929076
    Abstract: Disclosed speech recognition techniques improve user-perceived latency while maintaining accuracy by: receiving an audio stream, in parallel, by a primary (e.g., accurate) speech recognition engine (SRE) and a secondary (e.g., fast) SRE; generating, with the primary SRE, a primary result; generating, with the secondary SRE, a secondary result; appending the secondary result to a word list; and merging the primary result into the secondary result in the word list. Combining output from the primary and secondary SREs into a single decoder as described herein improves user-perceived latency while maintaining or improving accuracy, among other advantages.
    Type: Grant
    Filed: December 1, 2022
    Date of Patent: March 12, 2024
    Assignee: Microsoft Technology Licensing, LLC.
    Inventors: Hosam Adel Khalil, Emilian Stoimenov, Christopher Hakan Basoglu, Kshitiz Kumar, Jian Wu
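    Illustrative sketch: a minimal Python merge of fast secondary-engine output with later primary-engine output, in the spirit of the abstract above; the (start_ms, end_ms, text) word format and function names are assumptions, not taken from the patent.

      # Hypothetical word-list merge: secondary (fast) results are appended first,
      # then superseded once the primary (accurate) result arrives.

      def append_secondary(word_list, secondary_words):
          """Append (start_ms, end_ms, text) tuples from the fast engine."""
          word_list.extend(secondary_words)

      def merge_primary(word_list, primary_words):
          """Replace secondary words covered by the primary result's time span,
          keeping words the primary engine has not reached yet."""
          if not primary_words:
              return word_list
          primary_end = max(end for _, end, _ in primary_words)
          kept_tail = [w for w in word_list if w[0] >= primary_end]
          return primary_words + kept_tail

      words = []
      append_secondary(words, [(0, 400, "hello"), (400, 900, "word")])
      words = merge_primary(words, [(0, 400, "hello"), (400, 900, "world")])
      print(words)   # secondary "word" corrected to primary "world"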
  • Patent number: 11929062
    Abstract: A method and system of training a spoken language understanding (SLU) model includes receiving natural language training data comprising (i) one or more speech recordings, and (ii) a set of semantic entities and/or intents for each corresponding speech recording. For each speech recording, one or more entity labels and corresponding values, and one or more intent labels are extracted from the corresponding semantic entities and/or overall intent. A spoken language understanding (SLU) model is trained based upon the one or more entity labels and corresponding values, and one or more intent labels of the corresponding speech recordings without a need for a transcript of the corresponding speech recording.
    Type: Grant
    Filed: September 15, 2020
    Date of Patent: March 12, 2024
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Hong-Kwang Jeff Kuo, Zoltan Tueske, Samuel Thomas, Yinghui Huang, Brian E. D. Kingsbury, Kartik Audhkhasi
  • Patent number: 11894011
    Abstract: Methods, apparatus, systems and articles of manufacture are disclosed to reduce noise from harmonic noise sources. An example apparatus includes at least one memory; at least one processor to execute the computer readable instructions to at least: determine a first amplitude value of a frequency component in a frequency spectrum of an audio sample; determine a set of points in the frequency spectrum having at least one of (a) amplitude values within an amplitude threshold of the first amplitude value, (b) frequency values within a frequency threshold of the first amplitude value, or (c) phase values within a phase threshold of the first amplitude value; increment a counter when a distance between (1) a second amplitude value in the set of points and (2) the first amplitude value satisfies a distance threshold; and when the counter satisfies a counter threshold, generate a contour trace based on the set of points.
    Type: Grant
    Filed: January 9, 2023
    Date of Patent: February 6, 2024
    Assignee: The Nielsen Company (US), LLC
    Inventor: Matthew McCallum
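    Illustrative sketch: a toy Python version of grouping spectral peaks around a reference peak and emitting a contour trace only when enough near neighbours are counted; the peak representation and all thresholds are invented for illustration.

      # Hypothetical grouping of spectral peaks around a reference peak and
      # counting near neighbours before emitting a crude "contour trace".

      def contour_from_peaks(peaks, ref, amp_thresh=3.0, freq_thresh=50.0,
                             dist_thresh=1.5, count_thresh=3):
          ref_freq, ref_amp = ref
          # points "close" to the reference peak in amplitude or frequency
          points = [(f, a) for f, a in peaks
                    if abs(a - ref_amp) <= amp_thresh or abs(f - ref_freq) <= freq_thresh]
          counter = sum(1 for _, a in points if abs(a - ref_amp) <= dist_thresh)
          if counter >= count_thresh:
              return sorted(points)          # contour trace from the grouped points
          return None

      peaks = [(100, -20.0), (200, -21.0), (300, -20.5), (410, -35.0)]
      print(contour_from_peaks(peaks, ref=(200, -21.0)))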
  • Patent number: 11875791
    Abstract: Systems and methods for processing audio signals are disclosed. In one implementation, a system may include at least one microphone configured to capture sounds from an environment of a user; and at least one processor. The processor may be programmed to receive at least one audio signal representative of at least part of the sounds captured by the microphone; identify at least one word in the at least one audio signal; and in response to identifying the at least one word, cause feedback to be provided to the user.
    Type: Grant
    Filed: May 20, 2021
    Date of Patent: January 16, 2024
    Assignee: ORCAM TECHNOLOGIES LTD.
    Inventors: Yonatan Wexler, Amnon Shashua, Nir Sancho, Roi Nathan, Tal Rosenwein, Oren Tadmor
  • Patent number: 11870937
    Abstract: The embodiments of the present disclosure provide a method for smart gas call center feedback management and an Internet of Things (IoT) system thereof. The method is implemented based on a smart gas management platform. The method includes: receiving a call message of a target customer through a call center, the content of the call message being related to a gas business; determining a feedback mode by analyzing the call message through the call center; in response to the feedback mode being manual feedback, determining a target operator through the call center to feed back a call of the target customer; and in response to the feedback mode being automatic feedback, determining a feedback content through the call center and sending the feedback content to the target customer.
    Type: Grant
    Filed: April 16, 2023
    Date of Patent: January 9, 2024
    Assignee: CHENGDU QINCHUAN IOT TECHNOLOGY CO., LTD.
    Inventors: Zehua Shao, Haitang Xiang, Junyan Zhou
  • Patent number: 11817089
    Abstract: A collection of digital video files may contain a large amount of unstructured information in the form of spoken words encoded within audio tracks. The audio tracks are transcribed into digital text. Attributes are extracted from the digital text and mapped to a particular subject matter aspect. Attribute to aspect mappings provide a useful organization for the unstructured information. Furthermore, sentiment scores and trends for one or more aspects may be determined and displayed.
    Type: Grant
    Filed: April 5, 2021
    Date of Patent: November 14, 2023
    Assignee: Pyxis.AI
    Inventors: Eric Owhadi, Bharat Naga Sumanth Banda, Narendra Goyal, Hong Ding
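    Illustrative sketch: a minimal Python mapping of extracted attributes to aspects with per-aspect sentiment averages; the mapping table, scores, and function names are invented, and a real pipeline would start from transcribed audio tracks.

      # Hypothetical attribute->aspect mapping with per-aspect sentiment averages.
      from collections import defaultdict

      ASPECT_MAP = {"battery": "hardware", "screen": "hardware",
                    "shipping": "logistics", "support": "service"}

      def aspect_sentiment(extracted):
          """extracted: list of (attribute, sentiment_score) pairs from transcripts."""
          totals, counts = defaultdict(float), defaultdict(int)
          for attribute, score in extracted:
              aspect = ASPECT_MAP.get(attribute, "other")
              totals[aspect] += score
              counts[aspect] += 1
          return {aspect: totals[aspect] / counts[aspect] for aspect in totals}

      print(aspect_sentiment([("battery", -0.4), ("screen", 0.2), ("support", 0.9)]))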
  • Patent number: 11790912
    Abstract: A wake-up word for a digital assistant may be specified by a user to trigger the digital assistant to respond to the wake-up word, with the user providing one or more initial pronunciations of the wake-up word. The wake-up word may be unique, or at least not determined beforehand by a device manufacturer or developer of the digital assistant. The initial pronunciation(s) of the keyword may then be augmented with other potential pronunciations of the wake-up word that might be provided in the future, and those other potential pronunciations may then be pruned down to a threshold number of other potential pronunciations. One or more recordings of the initial pronunciation(s) of the wake-up word may then be used to train a phoneme recognizer model to better recognize future instances of the wake-up word being spoken by the user or another person using the initial pronunciation or other potential pronunciations.
    Type: Grant
    Filed: January 3, 2022
    Date of Patent: October 17, 2023
    Assignee: Sony Interactive Entertainment Inc.
    Inventors: Lakshmish Kaushik, Zhenhao Ge, Xiaoyu Liu
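    Illustrative sketch: a toy Python augmentation of an initial phoneme sequence with substitution variants, pruned to a threshold count; the substitution table and the edit-ratio scoring stand in for a real pronunciation model.

      # Hypothetical pronunciation augmentation: generate phoneme-substitution
      # variants, score them against the user's pronunciation, and keep the top N.
      import difflib

      SUBSTITUTIONS = {"AH": ["AX"], "T": ["D"], "IY": ["IH"]}

      def augment(pronunciation, limit=5):
          variants = {tuple(pronunciation)}
          for i, phoneme in enumerate(pronunciation):
              for alt in SUBSTITUTIONS.get(phoneme, []):
                  variants.add(tuple(pronunciation[:i] + [alt] + pronunciation[i + 1:]))
          scored = sorted(variants,
                          key=lambda v: -difflib.SequenceMatcher(
                              None, v, tuple(pronunciation)).ratio())
          return [list(v) for v in scored[:limit]]   # pruned candidate set

      print(augment(["HH", "AH", "L", "OW", "T", "IY"]))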
  • Patent number: 11782976
    Abstract: A method for querying information includes: acquiring an information query instruction; obtaining a query intention by performing intention identification on the information query instruction; determining a type of the query intention in a plurality of intention types; obtaining an entity detection result by detecting whether the query intention contains an author entity and a painting entity; acquiring a query result based on the query intention and reference information, wherein the reference information includes the type of the query intention and the entity detection result, and the query result includes at least one information of an image, an audio and a text; and displaying the query result.
    Type: Grant
    Filed: September 27, 2019
    Date of Patent: October 10, 2023
    Assignee: BOE TECHNOLOGY GROUP CO., LTD.
    Inventor: Chu Xu
  • Patent number: 11615323
    Abstract: A method for verifying a material data chain (MDC) that is maintained by a creator is disclosed. The method includes receiving an unverified portion of the MDC from the creator including a set of consecutive material data blocks (MDBs). Each respective MDB includes respective material data, respective metadata, and a creator verification value. The method includes modifying a genomic differentiation object assigned to the verification cohort based on first genomic regulation instructions (GRI) that were used by the creator to generate the creator verification value. For each MDB in the unverified portion, the method includes determining a verifier verification value based on the MDB, a preceding MDB in the MDC, and a genomic engagement factor (GEF) determined with respect to the MDB. The GEF corresponding to an MDB is determined by extracting a sequence from the metadata of a MDB and mapping the sequence into the modified genomic differentiation object.
    Type: Grant
    Filed: February 10, 2022
    Date of Patent: March 28, 2023
    Assignee: Quantum Digital Solutions Corporation
    Inventors: William C. Johnson, Karen Ispiryan, Gurgen Khachatryan
  • Patent number: 11557217
    Abstract: A communications training system is provided having a user interface, a computer-based simulator and a performance measurement database. The user interface is configured to receive a speech communication input from the user based on a training content and the computer-based simulator is configured to transform the speech communication to a text data whereby the text data can be aligned to performance measurement database values to determine a performance measure of the speech communication. The format of the text data and the performance measurement database values enable the speech communication to be aligned with predefined performance measurement database values representing expected speech communications for that training content.
    Type: Grant
    Filed: October 26, 2020
    Date of Patent: January 17, 2023
    Assignee: Aptima, Inc.
    Inventors: Kevin Sullivan, Matthew Roberts, Michael Knapp, Brian Riordan
  • Patent number: 11502863
    Abstract: Electronic conferences can often be the source of frustration and wasted resources as participants may be forced to contend with extraneous sounds, such as conversations not intended for the conference, provided by an endpoint that should be muted. Similarly, participants may speak with the intention of providing their speech to the conference but speak while their associated endpoint is muted. As a result, the conference may be awkward and lack a productive flow while erroneously muted or non-muted endpoints are addressed. By detecting erroneous audio settings, endpoints can be prompted or automatically corrected to have the appropriate audio state.
    Type: Grant
    Filed: May 18, 2020
    Date of Patent: November 15, 2022
    Assignee: Avaya Management L.P.
    Inventors: Pushkar Yashavant Deole, Sandesh Chopdekar, Navin Daga
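    Illustrative sketch: a minimal Python check of an endpoint's mute state against detected speech; the directedness flag is assumed to come from an upstream classifier, and the returned action names are invented.

      # Hypothetical mute-state check: compare detected voice activity against
      # the endpoint's mute setting and suggest (or apply) a correction.

      def check_audio_state(is_muted, speech_detected, speech_is_conference_directed):
          if is_muted and speech_detected and speech_is_conference_directed:
              return "prompt_unmute"      # user is talking to the meeting while muted
          if not is_muted and speech_detected and not speech_is_conference_directed:
              return "prompt_mute"        # side conversation leaking into the meeting
          return "ok"

      print(check_audio_state(is_muted=True, speech_detected=True,
                              speech_is_conference_directed=True))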
  • Patent number: 11487837
    Abstract: The present invention generally relates to a method for summarizing content related to a keyword, the content being retrieved from online websites, the method comprising the steps of: searching (1100) the keyword with at least one search engine, thereby obtaining ranked result webpages, deriving (1200, 2200) sentences from a predetermined number of highest ranking webpages, among the ranked result webpages, combining (1300) the sentences, thereby obtaining a combined content, ranking (1400, 3400) the combined content, thereby obtaining a ranked content, outputting (1500) a predetermined number of highest ranking sentences from the ranked content as summary of the keyword.
    Type: Grant
    Filed: September 24, 2019
    Date of Patent: November 1, 2022
    Assignee: Searchmetrics GmbH
    Inventors: Fang Xu, Marcus Tober
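    Illustrative sketch: a compact Python version of deriving, combining, ranking, and outputting sentences for a keyword; the keyword-overlap scoring is a deliberately simple stand-in for the ranking steps (1200-1500) named above.

      # Hypothetical keyword-centred extractive summary: score sentences from the
      # top-ranked pages by keyword overlap and return the best few.
      import re
      from collections import Counter

      def summarize(keyword, pages, top_sentences=3):
          keyword_terms = set(keyword.lower().split())
          sentences = []
          for page_text in pages:                      # already the highest-ranked pages
              sentences += re.split(r"(?<=[.!?])\s+", page_text)
          def score(sentence):
              words = Counter(re.findall(r"\w+", sentence.lower()))
              return sum(words[t] for t in keyword_terms)
          ranked = sorted((s for s in sentences if s.strip()), key=score, reverse=True)
          return ranked[:top_sentences]

      print(summarize("speech recognition",
                      ["Speech recognition converts audio to text. It is widely used.",
                       "Modern speech recognition uses neural networks."]))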
  • Patent number: 11488607
    Abstract: Disclosed is an electronic apparatus which identifies utterer characteristics of an uttered voice input received; identifies one utterer group among a plurality of utterer groups based on the identified utterer characteristics; outputs a recognition result among a plurality of recognition results of the uttered voice input based on a voice recognition model corresponding to the identified utterer group among a plurality of voice recognition models provided corresponding to the plurality of utterer groups, the plurality of recognition results being different in recognition accuracy from one another; identifies recognition success or failure in the uttered voice input with respect to the output recognition result; and changes a recognition accuracy of the output recognition result in the voice recognition model corresponding to the recognition success, based on the identified recognition success in the uttered voice input.
    Type: Grant
    Filed: September 2, 2020
    Date of Patent: November 1, 2022
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventor: Heekyoung Seo
  • Patent number: 11403060
    Abstract: An information processing device is provided with a processor configured to receive, as an utterance, an instruction for executing a service, and to cause the service to be executed according to a setting determined using a state of the utterance.
    Type: Grant
    Filed: June 30, 2020
    Date of Patent: August 2, 2022
    Assignee: FUJIFILM Business Innovation Corp.
    Inventor: Takafumi Haruta
  • Patent number: 11361763
    Abstract: A speech-processing system capable of receiving and processing audio data to determine if the audio data includes speech that was intended for the system. Non-system directed speech may be filtered out while system-directed speech may be selected for further processing. A system-directed speech detector may use a trained machine learning model (such as a deep neural network or the like) to process a feature vector representing a variety of characteristics of the incoming audio data, including the results of automatic speech recognition and/or other data. Using the feature vector the model may output an indicator as to whether the speech is system-directed. The system may also incorporate other filters such as voice activity detection prior to speech recognition, or the like.
    Type: Grant
    Filed: September 1, 2017
    Date of Patent: June 14, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Roland Maximilian Rolf Maas, Sri Harish Reddy Mallidi, Spyridon Matsoukas, Bjorn Hoffmeister
  • Patent number: 11328722
    Abstract: An electronic device associated with a media-providing service receives a first set of audio streams corresponding to a plurality of microphones. The electronic device generates a second set of audio streams from the first set of audio streams. The second set of audio streams corresponds to a plurality of independent voices and in some cases, ambient noise. The electronic device detects a beginning of a voice command to play media content from the media-providing service in a first audio stream. The electronic device also detects an end of the voice command in the first audio stream. The end of the voice command overlaps with speech in a second audio stream in the second set of audio streams. In response to detecting the voice command, the electronic device plays the media content from the media-providing service.
    Type: Grant
    Filed: February 11, 2020
    Date of Patent: May 10, 2022
    Assignee: Spotify AB
    Inventor: Daniel Bromand
  • Patent number: 11315553
    Abstract: Methods for providing and obtaining data for training and electronic devices thereof are provided. The method for providing data for training includes obtaining first voice data for a voice uttered by a user at a specific time through a microphone of the electronic device and transmitting the voice recognition result to a second electronic device which obtained second voice data for the voice uttered by the user at the specific time, for use as data for training a voice recognition model. In this case, the voice recognition model may be trained using the data for training and an artificial intelligence algorithm such as deep learning.
    Type: Grant
    Filed: September 20, 2019
    Date of Patent: April 26, 2022
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Sangha Kim, Sungchan Kim, Yongchan Lee
  • Patent number: 11308962
    Abstract: A device, such as a Network Microphone Device or a playback device, detects an event associated with the device or a system comprising the device. In response, an input detection window is opened for a given time period. During the given time period the device is arranged to receive an input sound data stream representing sound detected by a microphone. The input sound data stream is analyzed for a plurality of keywords and/or a wake-word for a Voice Assistant Service (VAS) and, based on the analysis, it is determined that the input sound data stream includes voice input data comprising a keyword or a wake-word for a VAS. In response, the device takes appropriate action such as causing the media playback system to perform a command corresponding to the keyword or sending at least part of the input sound data stream to the VAS.
    Type: Grant
    Filed: May 20, 2020
    Date of Patent: April 19, 2022
    Assignee: Sonos, Inc.
    Inventors: Connor Kristopher Smith, Matthew David Anderson
  • Patent number: 11281726
    Abstract: Techniques allow a computer to responsively search for graph shapes similar to a user-selected graph shape much faster. Data can be pre-processed and stored as vectors, along with an index. The index can be used to find similar vectors that represent graph shapes similar to a user-selected shape in a computationally efficient manner. Vectors of multiple resolutions can be used to anticipate different sizes of a graph that a user can select, and comparisons can be repeated and refined. When a satisfactorily small number of candidate vectors are determined, more computationally intensive distance calculations can be performed on data reconstructed from the vectors.
    Type: Grant
    Filed: June 4, 2018
    Date of Patent: March 22, 2022
    Assignee: Palantir Technologies Inc.
    Inventors: Christopher Martin, Abdulaziz Alghunaim, Sri Krishna Vempati
  • Patent number: 11217270
    Abstract: Disclosed is a method for generating training data for training a filled pause detecting model and a device therefor, which execute mounted artificial intelligence (AI) and/or machine learning algorithms in a 5G communication environment. The method includes acquiring acoustic data including first speech data including a filled pause, second speech data not including a filled pause, and noise, generating a plurality of noise data based on the acoustic data, and generating first training data including a plurality of filled pauses and second training data not including a plurality of filled pauses by synthesizing the plurality of noise data with the first speech data and the second speech data. According to the present disclosure, training data for training a filled pause detecting model in a simulation noise environment can be generated, and filled pause detection performance for speech data generated in an actual noise environment can be enhanced.
    Type: Grant
    Filed: March 4, 2020
    Date of Patent: January 4, 2022
    Assignee: LG ELECTRONICS INC.
    Inventors: Yun Jin Lee, Jaehun Choi
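    Illustrative sketch: mixing clean speech with scaled noise at a few signal-to-noise ratios to multiply training examples, in Python; the waveforms, sample rate, and SNR values are arbitrary placeholders.

      # Hypothetical noise synthesis: mix clean speech with scaled noise at a few
      # signal-to-noise ratios to generate additional training data.
      import math, random

      def mix_at_snr(speech, noise, snr_db):
          """speech, noise: equal-length lists of float samples in [-1, 1]."""
          p_speech = sum(x * x for x in speech) / len(speech)
          p_noise = sum(x * x for x in noise) / len(noise) or 1e-12
          scale = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
          return [s + scale * n for s, n in zip(speech, noise)]

      speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
      noise = [random.uniform(-0.1, 0.1) for _ in range(16000)]
      training_set = [mix_at_snr(speech, noise, snr) for snr in (0, 5, 10)]
      print(len(training_set), "noisy variants generated")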
  • Patent number: 11200898
    Abstract: Implementations set forth herein relate to an automated assistant that uses circumstantial condition data, generated based on circumstantial conditions of an input, to determine whether the input should affect an action that has been initialized by a particular user. The automated assistant can allow each user to manipulate their respective ongoing action without necessitating interruptions for soliciting explicit user authentication. For example, when an individual in a group of persons interacts with the automated assistant to initialize or affect a particular ongoing action, the automated assistant can generate data that correlates that individual to the particular ongoing action. The data can be generated using a variety of different input modalities, which can be dynamically selected based on changing circumstances of the individual. Therefore, different sets of input modalities can be processed each time a user provides an input for modifying an ongoing action and/or initializing another action.
    Type: Grant
    Filed: May 31, 2019
    Date of Patent: December 14, 2021
    Assignee: GOOGLE LLC
    Inventors: Andrew Gallagher, Caroline Pantofaru, Vinay Bettadapura, Utsav Prabhu
  • Patent number: 11195532
    Abstract: The present disclosure relates to chatbot systems, and more particularly, to techniques for detecting that there are multiple intents represented in an utterance and then matching each detected intent to an intent associated with a chatbot in a chatbot system. In certain embodiments, a chatbot system receives an utterance from a user. A language of the utterance is determined and a set of rules identified for the language of the utterance. The utterance is parsed to extract information relating to the sentence structure of the utterance. The set of one or more rules is used to (1) determine whether the utterance is formed of two or more parts that each correspond to a separate intent of a user, and (2) split the utterance into the two or more parts for separate processing including matching of each user intent to an intent configured for a chatbot.
    Type: Grant
    Filed: April 24, 2020
    Date of Patent: December 7, 2021
    Assignee: Oracle International Corporation
    Inventors: Saba Amsalu Teserra, Vishal Vishnoi
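    Illustrative sketch: a toy rule-based split of an utterance into per-intent parts using a per-language conjunction list; the rules and the splitting strategy are invented stand-ins for the parser-driven rule set described above.

      # Hypothetical rule-based split of an utterance into per-intent parts,
      # using simple conjunction/punctuation rules in place of a real parser.
      import re

      SPLIT_RULES = {
          "en": re.compile(r"\s*(?:\band then\b|\band also\b|;|\bthen\b)\s*", re.I),
      }

      def split_intents(utterance, language="en"):
          rule = SPLIT_RULES.get(language)
          if rule is None:
              return [utterance]
          parts = [p.strip() for p in rule.split(utterance) if p.strip()]
          return parts if len(parts) > 1 else [utterance]

      print(split_intents("Check my balance and then transfer 50 dollars to savings"))
      # ['Check my balance', 'transfer 50 dollars to savings']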
  • Patent number: 11194865
    Abstract: Systems, apparatuses, and methods are provided for identifying a corresponding string stored in memory based on an incomplete input string. A system can analyze and produce phonetic and distance metrics for a plurality of strings stored in memory by comparing the plurality of strings to an incomplete input string. These similarity metrics can be used as the input to a machine learning model, which can quickly and accurately provide a classification. This classification can be used to identify a string stored in memory that corresponds to the incomplete input string.
    Type: Grant
    Filed: April 21, 2017
    Date of Patent: December 7, 2021
    Assignee: Visa International Service Association
    Inventors: Pranjal Singh, Soumyajyoti Banerjee
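    Illustrative sketch: computing phonetic and distance features for a partial input string against a stored candidate in Python; the simplified Soundex-style coder and the feature names are assumptions, and a trained classifier would consume the resulting features.

      # Hypothetical similarity features for an incomplete input string: a crude
      # Soundex-style phonetic code plus edit-ratio and prefix features.
      import difflib

      def soundex(word):
          codes = {"bfpv": "1", "cgjkqsxz": "2", "dt": "3",
                   "l": "4", "mn": "5", "r": "6"}
          word = word.upper()
          out, last = word[0], ""
          for ch in word[1:]:
              digit = next((d for letters, d in codes.items() if ch.lower() in letters), "")
              if digit and digit != last:
                  out += digit
              last = digit
          return (out + "000")[:4]

      def features(partial, candidate):
          return {
              "phonetic_match": soundex(partial) == soundex(candidate),
              "edit_ratio": difflib.SequenceMatcher(None, partial.lower(),
                                                    candidate.lower()).ratio(),
              "prefix_match": candidate.lower().startswith(partial.lower()),
          }

      print(features("jonath", "Jonathan"))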
  • Patent number: 11182563
    Abstract: An apparatus and method for generating a dialogue state tracking model. The apparatus retrieves a field feature corresponding to a queried field from a database according to the queried field corresponding to a queried message. The apparatus retrieves a candidate-term feature corresponding to each of at least one candidate-term corresponding to the queried field from the database, and integrates them into an integrated feature. The apparatus also generates at least one relation sub-sentence of a reply message corresponding to the queried message and generates a sentence relation feature according to the at least one relation sub-sentence. The apparatus further generates a queried field related feature according to the field feature, the integrated feature and the sentence relation feature and trains the dialogue state tracking model according to the queried field related feature.
    Type: Grant
    Filed: November 21, 2019
    Date of Patent: November 23, 2021
    Assignee: INSTITUTE FOR INFORMATION INDUSTRY
    Inventors: Wei-Jen Yang, Guann-Long Chiou, Yu-Shian Chiu
  • Patent number: 11138978
    Abstract: A method and system of automatically identifying topics of a conversation are provided. An electronic data package comprising a sequence of utterances between conversation entities is received by a computing device. Each utterance is classified to a corresponding social action. One or more utterances in the sequence are grouped into a segment based on a deep learning model. A similarity of topics between adjacent segments is determined. Upon determining that the similarity is above a predetermined threshold, the adjacent segments are grouped together. A transcript of the conversation including the grouping of the adjacent segments is stored in a memory.
    Type: Grant
    Filed: July 24, 2019
    Date of Patent: October 5, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Margaret Helen Szymanski, Lei Huang, Robert John Moore, Raphael Arar, Shun Jiang, Guangjie Ren, Eric Liu, Pawan Chowdhary, Chung-hao Tan, Sunhwan Lee
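    Illustrative sketch: merging adjacent conversation segments whose topic-word overlap exceeds a threshold, in Python; Jaccard similarity over keyword sets stands in for the deep-learning segmentation and topic similarity described above.

      # Hypothetical merge of adjacent segments with sufficiently similar topics.

      def jaccard(a, b):
          a, b = set(a), set(b)
          return len(a & b) / len(a | b) if a | b else 0.0

      def merge_segments(segments, threshold=0.4):
          """segments: list of lists of topic words, one list per segment."""
          if not segments:
              return []
          merged = [list(segments[0])]
          for seg in segments[1:]:
              if jaccard(merged[-1], seg) >= threshold:
                  merged[-1] = merged[-1] + [w for w in seg if w not in merged[-1]]
              else:
                  merged.append(list(seg))
          return merged

      print(merge_segments([["price", "plan", "upgrade"],
                            ["upgrade", "plan", "cost"],
                            ["shipping", "address"]]))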
  • Patent number: 11043214
    Abstract: Described herein is a system for rescoring automatic speech recognition hypotheses for conversational devices that have multi-turn dialogs with a user. The system leverages dialog context by incorporating data related to past user utterances and data related to the system generated response corresponding to the past user utterance. Incorporation of this data improves recognition of a particular user utterance within the dialog.
    Type: Grant
    Filed: November 29, 2018
    Date of Patent: June 22, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Behnam Hedayatnia, Anirudh Raju, Ankur Gandhe, Chandra Prakash Khatri, Ariya Rastrow, Anushree Venkatesh, Arindam Mandal, Raefer Christopher Gabriel, Ahmad Shikib Mehri
  • Patent number: 11024297
    Abstract: A method for using speech disfluencies detected in speech input to assist in interpreting the input is provided. The method includes providing access to a set of content items with metadata describing the content items, and receiving a speech input intended to identify a desired content item. The method further includes detecting a speech disfluency in the speech input and determining a measure of confidence of a user in a portion of the speech input following the speech disfluency. If the confidence measure is lower than a threshold value, the method includes determining an alternative query input based on replacing the portion of the speech input following the speech disfluency with another word or phrase. The method further includes selecting content items based on comparing the speech input, the alternative query input (when the confidence measure is low), and the metadata associated with the content items.
    Type: Grant
    Filed: October 25, 2018
    Date of Patent: June 1, 2021
    Assignee: Veveo, Inc.
    Inventors: Murali Aravamudan, Daren Gill, Sashikumar Venkataraman, Vineet Agarwal, Ganesh Ramamoorthy
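    Illustrative sketch: building an alternative query when word confidence drops after a disfluency; the disfluency list, word-level confidences, and candidate replacements are assumed inputs from the recognizer, not details from the patent.

      # Hypothetical handling of a disfluency ("um", "uh") in a media query:
      # if confidence in the word after the disfluency is low, build alternative
      # queries by swapping in candidate replacements before metadata matching.

      DISFLUENCIES = {"um", "uh", "er"}

      def alternative_queries(words, confidences, candidates, threshold=0.6):
          queries = [" ".join(w for w in words if w not in DISFLUENCIES)]
          for i, w in enumerate(words):
              if w in DISFLUENCIES and i + 1 < len(words) and confidences[i + 1] < threshold:
                  uncertain = words[i + 1]
                  for alt in candidates.get(uncertain, []):
                      rest = [alt if x == uncertain else x for x in words]
                      queries.append(" ".join(x for x in rest if x not in DISFLUENCIES))
          return queries

      print(alternative_queries(
          words=["play", "um", "interstellar"],
          confidences=[0.95, 0.2, 0.4],
          candidates={"interstellar": ["inception"]}))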
  • Patent number: 10991365
    Abstract: A method of enhancing an automated speech recognition confidence classifier includes receiving a set of baseline confidence features from one or more decoded words, deriving word embedding confidence features from the baseline confidence features, joining the baseline confidence features with word embedding confidence features to create a feature vector, and executing the confidence classifier to generate a confidence score, wherein the confidence classifier is trained with a set of training examples having labeled features corresponding to the feature vector.
    Type: Grant
    Filed: April 8, 2019
    Date of Patent: April 27, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Kshitiz Kumar, Anastasios Anastasakos, Yifan Gong
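    Illustrative sketch: joining baseline confidence features with a word embedding and scoring the result with a logistic function; the toy embeddings and weights are placeholders for the trained confidence classifier.

      # Hypothetical feature join: baseline confidence features + word embedding
      # form the classifier input; a logistic scorer stands in for the classifier.
      import math

      EMBEDDINGS = {"play": [0.1, -0.3], "music": [0.4, 0.2]}   # toy 2-d embeddings

      def feature_vector(word, baseline_features):
          return baseline_features + EMBEDDINGS.get(word, [0.0, 0.0])

      def confidence_score(features, weights, bias=0.0):
          z = sum(f * w for f, w in zip(features, weights)) + bias
          return 1.0 / (1.0 + math.exp(-z))            # logistic confidence

      vec = feature_vector("play", baseline_features=[0.8, 0.65, 0.9])
      print(confidence_score(vec, weights=[1.2, 0.7, 0.9, 0.5, -0.4]))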
  • Patent number: 10978058
    Abstract: Disclosed are an electronic apparatus, a control method thereof, and a computer program product for the same, the electronic apparatus including: a receiver comprising receiving circuitry configured to receive a sound; and a processor configured to: identify with a given sensitivity whether a characteristic of a received sound corresponds to a voice command of a user in response to the sound being received through the receiver, identify the voice command based on identifying that the characteristic of the received sound corresponds to the voice command, and perform an operation corresponding to the identified voice command, and change the sensitivity based on identifying that the characteristic of the received sound does not correspond to the voice command. Thus, the electronic apparatus performs optimal and/or improved audio processing by properly controlling the sensitivity based on the circumstances.
    Type: Grant
    Filed: December 3, 2018
    Date of Patent: April 13, 2021
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Jonguk Yoo, Kihoon Shin
  • Patent number: 10964316
    Abstract: One non-limiting embodiment provides a method, including: receiving, from a user, user input comprising a trigger event; identifying, using at least one processor, active media content; and performing, based upon the trigger event, an action with respect to the active media content. This embodiment is intended to be non-limiting and other embodiments are contemplated, disclosed, and discussed.
    Type: Grant
    Filed: August 9, 2017
    Date of Patent: March 30, 2021
    Assignee: Lenovo (Singapore) Pte. Ltd.
    Inventors: Roderick Echols, Ryan Charles Knudson, Timothy Winthrop Kingsbury, Jonathan Gaither Knox
  • Patent number: 10943580
    Abstract: Methods and systems for phonological clustering are disclosed. A method includes: segmenting, by a computing device, a sentence into a plurality of tokens; determining, by the computing device, a plurality of phoneme variants corresponding to the plurality of tokens; clustering, by the computing device, the plurality of phoneme variants; creating, by the computing device, an initial vectorization of the plurality of phoneme variants based on the clustering; embedding, by the computing device, the initial vectorization of the plurality of phoneme variants into a deep learning model; and determining, by the computing device, a radial set of phoneme variants using the deep learning model.
    Type: Grant
    Filed: May 11, 2018
    Date of Patent: March 9, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Craig M. Trim, John M. Ganci, Jr., James E. Bostick, Carlos A. Fonseca
  • Patent number: 10929448
    Abstract: A computer-implemented method determines a category of a request provided by a user by means of a user device. The user device includes connection means and means for receiving a request description relating to said request from said user. The method includes receiving, from the user, the request description, by means of the device, and uploading the request description to a server. The server has access to a database which includes a number of previously categorized requests each including a category and a vocabulary, which includes a number of word vector representations. The method further includes identifying, by the server, a number of component words belonging to a natural language text string included in the request description; obtaining, for at least one of the component words, an associated word vector representation from the vocabulary, and determining a request vector, based on at least one obtained word vector representation.
    Type: Grant
    Filed: August 10, 2018
    Date of Patent: February 23, 2021
    Assignee: KBC GROEP NV
    Inventors: Hans Verstraete, Pieter Van Hertum, Rahul Maheshwari, Jeroen D'Haen, Michaël Mariën, Barak Chizi, Frank Fripon, Sven Evens
  • Patent number: 10923111
    Abstract: A system configured to recognize text represented by speech may determine that a first portion of audio data corresponds to speech from a first speaker and that a second portion of audio data corresponds to speech from the first speaker and a second speaker. Features of the first portion are compared to features of the second portion to determine a similarity therebetween. Based on this similarity, speech from the first speaker is distinguished from speech from the second speaker and text corresponding to speech from the first speaker is determined.
    Type: Grant
    Filed: March 28, 2019
    Date of Patent: February 16, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Xing Fan, I-Fan Chen, Yuzong Liu, Bjorn Hoffmeister, Yiming Wang, Tongfei Chen
  • Patent number: 10915435
    Abstract: Methods and systems for a deep learning based problem advisor are disclosed. A method includes: obtaining, by a computing device, a log file including events generated during execution of a software application; determining, by the computing device, at least one possible cause for a problem in the software application using the obtained log file and a knowledge base including calling paths for each of a plurality of methods in source code of the software application; for each of the at least one possible cause for the problem, the computing device simulating user actions in the software application; and determining, by the computing device, a root cause based on the simulating user actions in the software application.
    Type: Grant
    Filed: November 28, 2018
    Date of Patent: February 9, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jian Zhang, Yi Bin Wang, Wu Weilin, Mu Dan Cao, Dan Tan
  • Patent number: 10896671
    Abstract: A command-processing server provides natural language services to applications. More specifically, the command-processing server receives natural language inputs from users for use in applications such as virtual assistants. Some user inputs create user-defined rules that consist of trigger conditions and of corresponding actions that are executed when the triggers fire. The command-processing server stores the rules received from a user in association with the specific user. The command-processing server also identifies rules that can be generalized across users and promoted into generic rules applicable to many or all users. The generic rules may or may not have an associated context constraining their application.
    Type: Grant
    Filed: November 30, 2018
    Date of Patent: January 19, 2021
    Assignee: SoundHound, Inc.
    Inventors: Keyvan Mohajer, Christopher S. Wilson, Bernard Mont-Reynaud, Robert MacRae
  • Patent number: 10818193
    Abstract: A communications training system is provided having a user interface, a computer-based simulator and a performance measurement database. The user interface is configured to receive a speech communication input from the user based on a training content and the computer-based simulator is configured to transform the speech communication to a text data whereby the text data can be aligned to performance measurement database values to determine a performance measure of the speech communication. The format of the text data and the performance measurement database values enable the speech communication to be aligned with predefined performance measurement database values representing expected speech communications for that training content.
    Type: Grant
    Filed: February 20, 2017
    Date of Patent: October 27, 2020
    Assignee: Aptima, Inc.
    Inventors: Kevin Sullivan, Matthew Roberts, Michael Knapp, Brian Riordan
  • Patent number: 10657327
    Abstract: Mechanisms are provided for clarifying homophone usage in natural language content. The mechanisms analyze natural language content to identify a homophone instance in the natural language content, the homophone instance being a first term having a first definition and a first pronunciation for which there is a second term having the first pronunciation and a second definition different from the first definition. The mechanisms, in response to identifying the homophone instance, analyze the natural language content to identify a third term that is a synonym for the second term. The third term has a third definition that is nearly the same as the second definition. The mechanisms, in response to the natural language content comprising the third term, perform a clarifying operation to modify the natural language content to clarify the homophone instance and generate a modified natural language content.
    Type: Grant
    Filed: August 1, 2017
    Date of Patent: May 19, 2020
    Assignee: International Business Machines Corporation
    Inventors: Kelley L. Anders, Paul R. Bastide, Stacy M. Cannon, Trudy L. Hewitt
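    Illustrative sketch: flagging a homophone when the text also contains a synonym of its sound-alike; the homophone and synonym tables and the clarifying annotation format are invented for illustration.

      # Hypothetical homophone clarification: if a homophone appears near a synonym
      # of its sound-alike, annotate it with its intended sense.

      HOMOPHONES = {"pair": "pear", "pear": "pair"}           # sound-alike pairs
      SYNONYMS = {"pear": {"fruit", "apple", "orchard"},      # synonyms of each term
                  "pair": {"couple", "two", "duo"}}

      def clarify(text):
          words = text.lower().split()
          out = []
          for w in words:
              other = HOMOPHONES.get(w)
              if other and SYNONYMS.get(other, set()) & set(words):
                  out.append(f"{w} (i.e., not '{other}')")    # clarifying modification
              else:
                  out.append(w)
          return " ".join(out)

      print(clarify("She bought a pair from the orchard full of fruit"))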
  • Patent number: 10628567
    Abstract: Methods, computing systems and computer program products implement embodiments of the present invention that include defining a verification string including a sequence of verification characters and a delimiter character between each sequential pair of the verification characters, the delimiter character being different from the verification characters. The verification string is presented to a user, and upon receiving, from the user, a series of verification vocal inputs in response to presenting the verification string, a set of verification features from each of the verification vocal inputs is computed so as to generate sets of verification features. A one-to-one correspondence is established between each of the verification vocal inputs and each of the verification characters, and the user is authenticated based on the verification vocal inputs and their corresponding sets of verification features.
    Type: Grant
    Filed: September 5, 2016
    Date of Patent: April 21, 2020
    Assignee: International Business Machines Corporation
    Inventor: Hagai Aronowitz
  • Patent number: 10559305
    Abstract: [Object] To provide an information processing system and an information processing method capable of auditing the utterance data of an agent more flexibly. [Solution] An information processing system including: a storage section that stores utterance data of an agent; a communication section that receives request information transmitted from a client terminal and requesting utterance data of a specific agent from a user; and a control section that, when the request information is received through the communication section, replies to the client terminal with corresponding utterance data, and in accordance with feedback from the user with respect to the utterance data, updates an utterance probability level expressing a probability that the specific agent will utter utterance content indicated by the utterance data, and records the updated utterance probability level in association with the specific agent and the utterance content in the storage section.
    Type: Grant
    Filed: February 2, 2017
    Date of Patent: February 11, 2020
    Assignee: SONY CORPORATION
    Inventor: Akihiro Komori
  • Patent number: 10540963
    Abstract: A computer-implemented method for generating an input for a classifier. The method includes obtaining n-best hypotheses which is an output of an automatic speech recognition (ASR) for an utterance, combining the n-best hypotheses horizontally in a predetermined order with a separator between each pair of hypotheses, and outputting the combined n-best hypotheses as a single text input to a classifier.
    Type: Grant
    Filed: February 2, 2017
    Date of Patent: January 21, 2020
    Assignee: International Business Machines Corporation
    Inventors: Nobuyasu Itoh, Gakuto Kurata, Ryuki Tachibana
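    Illustrative sketch: combining n-best hypotheses horizontally, highest score first, with a separator token between each pair; the <SEP> token and the (text, score) input format are assumptions, not details from the patent.

      # Minimal sketch of combining n-best ASR hypotheses horizontally with a
      # separator token so they can be fed to a classifier as one text input.

      def combine_nbest(hypotheses, separator=" <SEP> "):
          """hypotheses: list of (text, score); combined in descending score order."""
          ordered = sorted(hypotheses, key=lambda h: h[1], reverse=True)
          return separator.join(text for text, _ in ordered)

      nbest = [("book a flight to boston", 0.91),
               ("book a flight to austin", 0.87),
               ("book a fight to boston", 0.55)]
      print(combine_nbest(nbest))
      # book a flight to boston <SEP> book a flight to austin <SEP> book a fight to boston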
  • Patent number: 10387805
    Abstract: The present invention provides a method for ranking an incoming news feed comprising a header. The method comprises the steps of: receiving the incoming news feed with headers, extracting the incoming news feed's header, performing part-of-speech-tagging of the extracted header's words and associating to each of the header's words a code characterizing its grammatical function, generating the list of the incoming header's word codes, associating the generated list to the incoming news feed as its pattern, and computing a score for the incoming news feed according to predefined rules defining the score based on its pattern.
    Type: Grant
    Filed: July 16, 2014
    Date of Patent: August 20, 2019
    Assignee: DEEP IT LTD
    Inventors: Eliezer Katz, Ofer Weintraub
  • Patent number: 10212181
    Abstract: A method comprises creating a word vector from a message, wherein the word vector comprises an entry for each word of a plurality of words, and wherein each word of the plurality of words is assigned a weight. The method further comprises calculating a value for the word vector based on each entry of the word vector and the weights assigned to the plurality of words, and identifying that the message belongs to a first group by comparing the value for the word vector to a threshold.
    Type: Grant
    Filed: November 18, 2016
    Date of Patent: February 19, 2019
    Assignee: Bank of America Corporation
    Inventors: Pinak Chakraborty, Vidhu Beohar, Chetan Phanse
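    Illustrative sketch: a weighted word-vector score compared against a threshold to assign a message to a group; the word weights and the threshold are invented example values.

      # Hypothetical weighted word-vector score: each tracked word has a weight,
      # the message's vector counts occurrences, and the weighted sum is compared
      # to a threshold to decide group membership.
      import re

      WEIGHTS = {"urgent": 2.0, "wire": 1.5, "transfer": 1.5, "invoice": 1.0}

      def classify(message, threshold=3.0):
          tokens = re.findall(r"\w+", message.lower())
          vector = {word: tokens.count(word) for word in WEIGHTS}       # word vector
          value = sum(count * WEIGHTS[word] for word, count in vector.items())
          return ("first_group" if value >= threshold else "other"), value

      print(classify("URGENT: please wire the transfer today"))   # ('first_group', 5.0)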
  • Patent number: 10102732
    Abstract: A danger monitoring system is disclosed. A danger monitoring device comprises a microphone configured to continuously digitize environmental sound, a first memory, a first processor configured to determine whether a stored interval meets a threshold criteria for a dangerous event, and a first network interface configured to send a danger observation data to a server. The danger monitoring server comprises a second memory, a second processor configured to verify the dangerous event digitized by the danger monitoring device and determine an event location of the verified dangerous event, and a second network interface configured to send a danger alert. A danger mitigation device comprises a third network interface configured to receive the danger alert, a GPS receiver, a screen, a third memory comprising map data, and a third processor configured to render a map indicating at least a current location of the danger mitigation device, and the event location.
    Type: Grant
    Filed: October 27, 2017
    Date of Patent: October 16, 2018
    Assignee: INFINITE DESIGNS, LLC
    Inventor: Adam Gersten
  • Patent number: 10089061
    Abstract: According to one embodiment, an electronic device includes a memory and a hardware processor. The hardware processor is in communication with the memory. The hardware processor is configured to obtain a sound file including sound data and attached data, determine a type of meeting of the sound file classified based on an utterance state of the sound data, and display the sound file based on at least one of the sound data and the attached data such that the type of meeting is visually recognizable.
    Type: Grant
    Filed: February 16, 2016
    Date of Patent: October 2, 2018
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventor: Yusaku Kikugawa
  • Patent number: 10062385
    Abstract: A system and method for selecting a speech-to-text engine are disclosed. The method includes selecting, by an engine selection component, at least two speech-to-text engines to decode a portion of computer-readable speech data. The portion of speech data can be decoded simultaneously by the selected speech-to-text engines for a designated length of time. In some embodiments, portions of the speech data can be simultaneously decoded with selected speech-to-text engines at periodic intervals. An accuracy of decoding can be determined for each selected speech-to-text engine by an accuracy testing component. Additionally, the relative accuracies and speeds of the selected speech-to-text engines can be compared by an output comparison component. The engine selection component can then select the most accurate speech-to-text engine to decode a next portion of speech data. Further, the engine selection module may select a speech-to-text engine that meets or exceeds a speed and/or accuracy threshold.
    Type: Grant
    Filed: September 30, 2016
    Date of Patent: August 28, 2018
    Assignee: International Business Machines Corporation
    Inventors: Alexander Cook, Manuel Orozco, Christopher R. Sabotta, John M. Santosuosso
  • Patent number: 10049672
    Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal and initiating speech recognition tasks by a plurality of speech recognition systems (SRS's). Each SRS is configured to generate a recognition result specifying possible speech included in the audio signal and a confidence value indicating a confidence in a correctness of the speech result. The method also includes completing a portion of the speech recognition tasks including generating one or more recognition results and one or more confidence values for the one or more recognition results, determining whether the one or more confidence values meets a confidence threshold, aborting a remaining portion of the speech recognition tasks for SRS's that have not generated a recognition result, and outputting a final recognition result based on at least one of the generated one or more speech results.
    Type: Grant
    Filed: June 2, 2016
    Date of Patent: August 14, 2018
    Assignee: Google LLC
    Inventors: Brian Patrick Strope, Francoise Beaufays, Olivier Siohan
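    Illustrative sketch: running several recognizers concurrently and returning as soon as one result meets the confidence threshold, cancelling tasks that have not started; the fake recognizers, delays, and confidence values are placeholders.

      # Hypothetical parallel recognition with early abort using a thread pool.
      from concurrent.futures import ThreadPoolExecutor, as_completed
      import time, random

      def fake_recognizer(name, delay):
          time.sleep(delay)                               # stands in for real decoding
          return name, "turn on the lights", random.uniform(0.5, 0.99)

      def recognize(audio, threshold=0.8):
          engines = [("fast", 0.05), ("medium", 0.2), ("accurate", 0.5)]
          with ThreadPoolExecutor(max_workers=len(engines)) as pool:
              futures = [pool.submit(fake_recognizer, n, d) for n, d in engines]
              for future in as_completed(futures):
                  name, text, confidence = future.result()
                  if confidence >= threshold:
                      for f in futures:
                          f.cancel()                      # abort engines not yet started
                      return name, text, confidence
          # no result met the threshold: fall back to the most confident one
          return max((f.result() for f in futures if not f.cancelled()),
                     key=lambda r: r[2])

      print(recognize(audio=None))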
  • Patent number: 9984689
    Abstract: Disclosed is an apparatus and method for correcting pronunciation by contextual recognition. The apparatus may include an interface configured to receive, from a speech recognition server, first text data obtained by converting speech data to a text, and a processor configured to extract a keyword from the received first text data, calculate a suitability of a word in the first text data in association with the extracted keyword, and update the first text data to second text data by replacing, with an alternative word, a word in the first text data having a suitability less than a preset reference value.
    Type: Grant
    Filed: November 10, 2016
    Date of Patent: May 29, 2018
    Inventor: Sung Hyuk Kim
  • Patent number: 9953640
    Abstract: Methods and systems are provided for interpreting speech data. A method and system for recognizing speech involve a filter module to generate a set of processed audio data based on raw audio data; a translation module to provide a set of translation results for the raw audio data; and a decision module to select the text data that represents the raw audio data. A method for minimizing noise in audio signals received by a microphone array is also described. A method and system of automatic entry of data into one or more data fields involve receiving processed audio data; and operating a processing module to: search in a trigger dictionary for a field identifier that corresponds to the trigger identifier; identify a data field associated with a data field identifier corresponding to the field identifier; and providing content data associated with the trigger identifier to the identified data field.
    Type: Grant
    Filed: June 5, 2015
    Date of Patent: April 24, 2018
    Assignee: INTERDEV TECHNOLOGIES INC.
    Inventors: Janet M. Rice, Peng Liang, Terence W. Kuehn
  • Patent number: 9940318
    Abstract: Methods, apparatus, systems, and computer-readable media are provided for generating and applying outgoing communication templates. In various implementations a corpus of outgoing communications sent by a user may be grouped into a plurality of clusters based on one or more attributes of a context of the user. One or more segments of each outgoing communication of a particular cluster may be classified as fixed in response to a determination that a count of occurrences of the one or more segments across the particular cluster satisfies a criterion. One or more remaining segments of each communication of the particular cluster may or may not be classified as transient. Based on sequences of classified segments associated with each communication of the particular cluster, an outgoing communication template may be generated to automatically populate at least a portion of a draft outgoing communication being prepared by the user.
    Type: Grant
    Filed: January 1, 2016
    Date of Patent: April 10, 2018
    Assignee: GOOGLE LLC
    Inventors: Balint Miklos, Julia Proskurnia, Luis Garcia Pueyo, Marc-Allen Cartright, Tobias Kaufmann, Ivo Krka
  • Patent number: 9928851
    Abstract: A voice verifying system, which comprises: a microphone, which is always turned on to output at least one input audio signal; a speech determining device, for determining if the input audio signal is valid or not according to a reference value, wherein the speech determining device passes the input audio signal if the input audio signal is valid; and a verifying module, for verifying a speech signal generated from the input audio signal and for outputting a device activating signal to activate a target device if the speech signal matches a predetermined rule; and a reference value generating device, for generating the reference value according to speech signal information from the verifying module.
    Type: Grant
    Filed: September 12, 2013
    Date of Patent: March 27, 2018
    Assignee: MEDIATEK INC.
    Inventors: Liang-Che Sun, Yiou-Wen Cheng, Ting-Yuan Chiu