Patents Examined by Jialong He
  • Patent number: 11816445
    Abstract: Some implementations disclosed herein provide techniques and arrangements to enable translating language characters in media content. For example, some implementations receive a user selection of a first portion of media content. Some implementations disclosed herein may, based on the first portion, identify a second portion of the media content. The second portion of the media content may include one or more first characters of a first language. Some implementations disclosed herein may create an image that includes the second portion of the media content and may send the image to a server. Some implementations disclosed herein may receive one or more second characters of a second language corresponding to a translation of the one or more first characters of the first language from the server.
    Type: Grant
    Filed: May 17, 2021
    Date of Patent: November 14, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jun Du, Lei Sun, Jian Sun, Qiang Huo
  • Patent number: 11810572
    Abstract: A system, method, and computer-program product includes distributing a plurality of audio data files of a speech data corpus to a plurality of computing nodes that each implement a plurality of audio processing threads, executing the plurality of audio processing threads associated with each of the plurality of computing nodes to detect a plurality of tentative speakers participating in each of the plurality of audio data files, generating, via a clustering algorithm, a plurality of clusters of embedding signatures based on a plurality of embedding signatures associated with the plurality of tentative speakers in each of the plurality of audio data files, and detecting a plurality of global speakers associated with the speech data corpus based on the plurality of clusters of embedding signatures.
    Type: Grant
    Filed: June 8, 2023
    Date of Patent: November 7, 2023
    Assignee: SAS INSTITUTE INC.
    Inventors: Xiaozhuo Cheng, Xiaolong Li, Xu Yang
  • Patent number: 11797772
    Abstract: Speech processing techniques are disclosed that enable determining a text representation of named entities in captured audio data. Various implementations include determining the location of a carrier phrase in a word lattice representation of the captured audio data, where the carrier phrase provides an indication of a named entity. Additional or alternative implementations include matching a candidate named entity with the portion of the word lattice, and augmenting the word lattice with the matched candidate named entity.
    Type: Grant
    Filed: January 31, 2022
    Date of Patent: October 24, 2023
    Assignee: GOOGLE LLC
    Inventors: Leonid Velikovich, Petar Aleksic, Pedro Moreno
  • Patent number: 11798529
    Abstract: A language module is jointly trained with a knowledge module for natural language understanding by aligning a first knowledge graph with a second knowledge graph. The knowledge module is trained on the aligned knowledge graphs. Then, the knowledge module is integrated with the language module to generate an integrated knowledge-language module.
    Type: Grant
    Filed: May 18, 2021
    Date of Patent: October 24, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Chenguang Zhu, Nanshan Zeng
  • Patent number: 11790910
    Abstract: A device obtains information concerning a plurality of customer service representatives to identify a customer service representative and a user device associated with the customer service representative. The device selects a test issue of a plurality of test issues to be presented to the customer service representative, and, based on the test issue, a virtual assistant to converse with the customer service representative. The device initiates, based on an availability of the user device, a communication session with the user device, and causes the virtual assistant to converse with the customer service representative regarding the test issue. The device obtains data concerning a performance of the customer service representative during the communication session, processes the data using a machine learning model to determine a performance score for the customer service representative, and causes, based on the performance score for the customer service representative, at least one action to be performed.
    Type: Grant
    Filed: May 10, 2021
    Date of Patent: October 17, 2023
    Assignee: Capital One Services, LLC
    Inventors: Abdelkadar M'Hamed Benkreira, Joshua Edwards, Michael Mossoba, Alexandra Colevas
  • Patent number: 11790411
    Abstract: An example computing system for classifying messages is described. For example, a computing system includes a memory and one or more processors in communication with the memory. The one or more processors are configured to receive data indicative of a message from a user device, wherein the data indicative of the message comprises a string of characters, identify, based on the string of characters, a set of token vectors from a plurality of token vectors generated based on a set of training data, and determine, using a machine learning model and based on the set of token vectors, a probability that a risk level associated with the message is a first risk level indicating that the message is to be elevated, wherein the first risk level is greater than a second risk level.
    Type: Grant
    Filed: February 4, 2020
    Date of Patent: October 17, 2023
    Assignee: Wells Fargo Bank, N.A.
    Inventors: Thomas Mann, Michael W. Soistman, Nathan R Parrish, Noel P. Volin, Raja Ranganathan, Dane Arnesen, Shawn Bocketti, Kevin Portis, Josh Engebretson
  • Patent number: 11789966
    Abstract: A system and method for the automated monitoring of inmate telephone calls as well as multi-modal search, retrieval and playback capabilities for said calls. A general term for such capabilities is multi-modal audio mining. The invention is designed to provide an efficient means for organizations such as correctional facilities to identify and monitor the contents of telephone conversations and to provide evidence of possible inappropriate conduct and/or criminal activity of inmates by analyzing monitored telephone conversations for events, including, but not limited to, the addition of third parties, the discussion of particular topics, and the mention of certain entities.
    Type: Grant
    Filed: November 30, 2020
    Date of Patent: October 17, 2023
    Assignee: Global Tel*Link Corporation
    Inventors: Andreas M. Olligschlaeger, James P. Rokosky
  • Patent number: 11790913
    Abstract: A computer-implemented information providing method includes: storing, at an information providing apparatus, a plurality of pieces of related information in association with a same identification information in a storage device, wherein each of the plurality of the pieces of the related information is representative of a content of a corresponding one of a series of a plurality of speech voices uttered to a first device; and in response to a request from a second device that receives the identification information, causing, at the information providing apparatus, a communication device to transmit to the second device one or more of the plurality of the pieces of the related information associated with the same identification information.
    Type: Grant
    Filed: March 15, 2022
    Date of Patent: October 17, 2023
    Assignee: YAMAHA CORPORATION
    Inventors: Takahiro Iwata, Yuki Seto
  • Patent number: 11790886
    Abstract: A system for synthesizing real IVR scenarios, automatically reproducing test scenarios, and providing accurate reports on those scenarios. An IVR tester uses a telephony device (website, mobile app, VOIP, etc.) and an automated IVR test case generator to place test calls to an IVR system. The automated IVR test case generator intercepts and collects the phone call's audio stream, dual-tone multi-frequency signals, and available metadata (e.g., response times, post-speech silence timeout, etc.). Once the call is complete, the automated IVR test case generator pre-populates a webpage or other user-interface technology and a database with the audio stream, dual-tone multi-frequency signals, and available metadata from the test scenario phone call. The IVR tester may then save that test scenario and continue to the next scenario. The saved scenario may now be used automatically by the IVR test case generator for regression testing or duplicated and edited for use in other systems.
    Type: Grant
    Filed: November 6, 2020
    Date of Patent: October 17, 2023
    Assignee: CYARA SOLUTIONS PTY LTD
    Inventors: Alok Kulkarni, Geoff Willshire, Thomas Fejes
  • Patent number: 11783137
    Abstract: A translation method includes steps of providing a translation device, inputting a first conversational voice corresponding to a first language, converting the first conversational voice into a general language content, converting the general language content into a second conversational voice corresponding to a second language, and outputting the second conversational voice. As a result, any language may be translated into a general language, and then translated into any target language, so that the advantages of implementing multi-lingual translations and conversations with simple setting are achieved.
    Type: Grant
    Filed: August 2, 2021
    Date of Patent: October 10, 2023
    Assignee: COMPAL ELECTRONICS, INC.
    Inventor: Dao-Wen Tang
  • Patent number: 11783849
    Abstract: This specification describes computer-implemented methods and systems. One method includes receiving, by a neural network of a speech recognition system, first data representing a first raw audio signal and second data representing a second raw audio signal. The first raw audio signal and the second raw audio signal describe audio occurring at a same period of time. The method further includes generating, by a spatial filtering layer of the neural network, a spatial filtered output using the first data and the second data, and generating, by a spectral filtering layer of the neural network, a spectral filtered output using the spatial filtered output. Generating the spectral filtered output comprises processing frequency-domain data representing the spatial filtered output. The method still further includes processing, by one or more additional layers of the neural network, the spectral filtered output to predict sub-word units encoded in both the first raw audio signal and the second raw audio signal.
    Type: Grant
    Filed: June 8, 2021
    Date of Patent: October 10, 2023
    Assignee: Google LLC
    Inventors: Ehsan Variani, Kevin William Wilson, Ron J. Weiss, Tara N. Sainath, Arun Narayanan
  • Patent number: 11776545
    Abstract: An apparatus includes a processor to: receive a request to perform speech-to-text conversion of a speech data set; perform pause detection to identify a set of likely sentence pauses and/or speaker diarization technique to identify a set of likely speaker changes; based on the set of likely sentence pauses and/or the set of likely speaker changes, divide the speech data set into data segments representing speech segments; use an acoustic model with the data segments to derive sets of probabilities of speech sounds uttered; store the sets of probabilities in temporal order within a buffer queue; distribute the sets of probabilities from the buffer queue in temporal order among threads of a thread pool; and within each thread, and based on set(s) of probabilities, derive one candidate word and select either the candidate word or an alternate candidate word derived from a language model as the next word most likely spoken.
    Type: Grant
    Filed: November 28, 2022
    Date of Patent: October 3, 2023
    Assignee: SAS Institute Inc.
    Inventors: Xiaolong Li, Xiaozhuo Cheng, Samuel Norris Henderson, Xu Yang
  • Patent number: 11769012
    Abstract: A system and method for updating computerized language models is provided that automatically adds or deletes terms from the language model to capture trending events or products, while maximizing computer efficiencies by deleting terms that are no longer trending and use of knowledge bases, machine learning model training and evaluation corpora, analysis tools and databases.
    Type: Grant
    Filed: March 25, 2020
    Date of Patent: September 26, 2023
    Assignee: Verint Americas Inc.
    Inventors: Ian Roy Beaver, Christopher James Jeffs
  • Patent number: 11755832
    Abstract: A voice recognition part performs voice recognition on a voice data and generates a first text which is a text indicating an utterance content. A text acquisition part acquires a second text which is a text indicating an utterance content according to an operation. A display processing part moves a position of a display text displayed on a display part, displays a text of at least one of the first text and the second text as a display text in a free region generated by the movement, and when fixing of a display position of the second text is instructed according to an operation, fixes the second text as a fixed text at a predetermined display position and displays the second text on the display part.
    Type: Grant
    Filed: March 29, 2021
    Date of Patent: September 12, 2023
    Assignee: Honda Motor Co., Ltd.
    Inventors: Naoaki Sumida, Masaki Nakatsuka, Kazuhiro Nakadai, Yuichi Yoshida, Takashi Yamauchi, Kazuya Maura, Kyosuke Hineno, Syozo Yokoo
  • Patent number: 11756576
    Abstract: An audio signal classification method includes determining, according to voice activity of a current audio frame, whether to obtain a frequency spectrum fluctuation of the current audio frame and store the frequency spectrum fluctuation in a frequency spectrum fluctuation memory, and updating, according to whether the audio frame is percussive music or activity of a historical audio frame, frequency spectrum fluctuations stored in the frequency spectrum fluctuation memory, and classifying the current audio frame as a speech frame or a music frame according to statistics of a part or all of effective data of the frequency spectrum fluctuations stored in the frequency spectrum fluctuation memory.
    Type: Grant
    Filed: March 11, 2022
    Date of Patent: September 12, 2023
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventor: Zhe Wang
  • Patent number: 11710488
    Abstract: A method may include obtaining audio data originating at a first device during a communication session between the first device and a second device and providing the audio data to a first speech recognition system to generate a first transcript based on the audio data and directing the first transcript to the second device. The method may also include in response to obtaining a quality indication regarding a quality of the first transcript, multiplexing the audio data to provide the audio data to a second speech recognition system to generate a second transcript based on the audio data while continuing to provide the audio data to the first speech recognition system and direct the first transcript to the second device, and in response to obtaining a transfer indication that occurs after multiplexing of the audio data, directing the second transcript to the second device instead of the first transcript.
    Type: Grant
    Filed: December 19, 2018
    Date of Patent: July 25, 2023
    Assignee: Sorenson IP Holdings, LLC
    Inventors: Kenneth Boehme, Michael Holm, Shane Roylance
  • Patent number: 11710490
    Abstract: An audio data processing method is provided. The method includes: obtaining multi-path audio data in an environmental space, obtaining a speech data set based on the multi-path audio data, and separately generating, in a plurality of enhancement directions, enhanced speech information corresponding to the speech data set; matching a speech hidden feature in the enhanced speech information with a target matching word, and determining an enhancement direction corresponding to the enhanced speech information having a highest degree of matching with the target matching word as a target audio direction; obtaining speech spectrum features in the enhanced speech information, and obtaining, from the speech spectrum features, a speech spectrum feature in the target audio direction; and performing speech authentication on the speech hidden feature and the speech spectrum feature that are in the target audio direction based on the target matching word, to obtain a target authentication result.
    Type: Grant
    Filed: February 17, 2021
    Date of Patent: July 25, 2023
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventor: Yi Gao
  • Patent number: 11710474
    Abstract: A text-to-speech engine creates audio output that includes synthesized speech and one or more media content item snippets. The input text is obtained and partitioned into text sets. A track having lyrics that match a part of one of the text sets is identified. The location of the track's audio that contains the lyric is extracted based on forced alignment data. The extracted audio is combined with synthesized speech corresponding to the remainder of the input text to form audio output.
    Type: Grant
    Filed: January 12, 2021
    Date of Patent: July 25, 2023
    Assignee: Spotify AB
    Inventors: Rohit Kumar, Henrik Lindström, Henriette Cramer, Sarah Mennicken, Sravana Reddy, Jennifer Thom-Santelli
  • Patent number: 11699455
    Abstract: Systems and methods for viseme data generation are disclosed. Uncompressed audio data is generated and/or utilized to determine the beats per minute of the audio data. Visemes are associated with the audio data utilizing a Viterbi algorithm and the beats per minute. A time-stamped list of viseme data is generated that associates the visemes with the portions of the audio data that they correspond to. An animatronic toy and/or an animation is caused to lip sync using the viseme data while audio corresponding to the audio data is output.
    Type: Grant
    Filed: September 4, 2020
    Date of Patent: July 11, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Zoe Adams, Pete Klein, Derick Deller, Bradley Michael Richards, Anirudh Ranganath
  • Patent number: 11694030
    Abstract: A system and method for automated theming of thought objects is disclosed. In a preferred embodiment, an object theming computer creates aggregated text by aggregating text associated with a first thought object, the aggregated text comprising descriptive and important text associated with the first thought object. The object theming computer then tokenizes the aggregated text into tokens. Further, a theme is associated to thought objects using one or more previously themed thought objects. If one or more thought objects have no associated theme, then a theme is generated and associated to the un-themed thought objects using a current plurality of the one or more thought objects.
    Type: Grant
    Filed: April 6, 2022
    Date of Patent: July 4, 2023
    Assignee: Fulcrum Management Solutions Ltd.
    Inventors: Farhad Imani, Thomas John Procter
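
Illustrative sketch for patent 11694030 (automated theming of thought objects). The abstract describes tokenizing aggregated text, associating a theme using previously themed thought objects, and generating a theme for any objects left un-themed. The Python below is only a rough sketch of that flow under stated assumptions and is not the patented method: the abstract does not say how similarity is measured or how a new theme is generated, so Jaccard token overlap, the 0.2 threshold, the stopword list, and the "most frequent token" naming rule are all hypothetical choices, as are the function names.

```python
from collections import Counter

# Hypothetical stopword list; the abstract does not specify one.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "is", "in", "for", "we", "our"}


def tokenize(text):
    """Lowercase, split on whitespace, strip punctuation, drop stopwords."""
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    return {w for w in words if w and w not in STOPWORDS}


def jaccard(a, b):
    """Jaccard overlap of two token sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def assign_themes(themed, unthemed, threshold=0.2):
    """Map each un-themed text to a theme.

    themed:   dict of theme name -> list of previously themed texts
    unthemed: list of texts with no theme yet
    Texts that match no existing theme share one generated theme, named
    after their most frequent token (an assumption, not from the abstract).
    """
    theme_tokens = {
        theme: set().union(*(tokenize(t) for t in texts))
        for theme, texts in themed.items()
    }
    result, leftovers = {}, []
    for text in unthemed:
        tokens = tokenize(text)
        best_theme, best_score = None, 0.0
        for theme, known in theme_tokens.items():
            score = jaccard(tokens, known)
            if score > best_score:
                best_theme, best_score = theme, score
        if best_theme is not None and best_score >= threshold:
            result[text] = best_theme
        else:
            leftovers.append(text)
    if leftovers:
        # Sort tokens so ties in the counter resolve deterministically.
        counts = Counter(tok for t in leftovers for tok in sorted(tokenize(t)))
        new_theme = counts.most_common(1)[0][0] if counts else "unthemed"
        for text in leftovers:
            result[text] = new_theme
    return result


if __name__ == "__main__":
    themed = {"pricing": ["The price is too high", "Lower the subscription price"]}
    unthemed = [
        "Price of the subscription",
        "Support replies slowly",
        "Support chat is offline",
    ]
    for text, theme in assign_themes(themed, unthemed).items():
        print(f"{theme}: {text}")
```

In this toy input the first text overlaps the existing "pricing" theme, while the two support-related texts match nothing and are grouped under a newly generated theme. A real system would likely use a richer similarity model than raw token overlap.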