Patents Examined by Jialong He

Translating language characters in media content

Patent number: 11816445

Abstract: Some implementations disclosed herein provide techniques and arrangements to enable translating language characters in media content. For example, some implementations receive a user selection of a first portion of media content. Some implementations disclosed herein may, based on the first portion, identify a second portion of the media content. The second portion of the media content may include one or more first characters of a first language. Some implementations disclosed herein may create an image that includes the second portion of the media content and may send the image to a server. Some implementations disclosed herein may receive one or more second characters of a second language corresponding to a translation of the one or more first characters of the first language from the server.

Type: Grant

Filed: May 17, 2021

Date of Patent: November 14, 2023

Assignee: Microsoft Technology Licensing, LLC

Inventors: Jun Du, Lei Sun, Jian Sun, Qiang Huo
Multi-threaded speaker identification

Patent number: 11810572

Abstract: A system, method, and computer-program product includes distributing a plurality of audio data files of a speech data corpus to a plurality of computing nodes that each implement a plurality of audio processing threads, executing the plurality of audio processing threads associated with each of the plurality of computing nodes to detect a plurality of tentative speakers participating in each of the plurality of audio data files, generating, via a clustering algorithm, a plurality of clusters of embedding signatures based on a plurality of embedding signatures associated with the plurality of tentative speakers in each of the plurality of audio data files, and detecting a plurality of global speakers associated with the speech data corpus based on the plurality of clusters of embedding signatures.

Type: Grant

Filed: June 8, 2023

Date of Patent: November 7, 2023

Assignee: SAS INSTITUTE INC.

Inventors: Xiaozhuo Cheng, Xiaolong Li, Xu Yang
Word lattice augmentation for automatic speech recognition

Patent number: 11797772

Abstract: Speech processing techniques are disclosed that enable determining a text representation of named entities in captured audio data. Various implementations include determining the location of a carrier phrase in a word lattice representation of the captured audio data, where the carrier phrase provides an indication of a named entity. Additional or alternative implementations include matching a candidate named entity with the portion of the word lattice, and augmenting the word lattice with the matched candidate named entity.

Type: Grant

Filed: January 31, 2022

Date of Patent: October 24, 2023

Assignee: GOOGLE LLC

Inventors: Leonid Velikovich, Petar Aleksic, Pedro Moreno
Generation of optimized knowledge-based language model through knowledge graph multi-alignment

Patent number: 11798529

Abstract: A language module is jointly trained with a knowledge module for natural language understanding by aligning a first knowledge graph with a second knowledge graph. The knowledge module is trained on the aligned knowledge graphs. Then, the knowledge module is integrated with the language module to generate an integrated knowledge-language module.

Type: Grant

Filed: May 18, 2021

Date of Patent: October 24, 2023

Assignee: Microsoft Technology Licensing, LLC

Inventors: Chenguang Zhu, Nanshan Zeng
Interacting with a user device to provide automated testing of a customer service representative

Patent number: 11790910

Abstract: A device obtains information concerning a plurality of customer service representatives to identify a customer service representative and a user device associated with the customer service representative. The device selects a test issue of a plurality of test issues to be presented to the customer service representative, and, based on the test issue, a virtual assistant to converse with the customer service representative. The device initiates, based on an availability of the user device, a communication session with the user device, and causes the virtual assistant to converse with the customer service representative regarding the test issue. The device obtains data concerning a performance of the customer service representative during the communication session, processes the data using a machine learning model to determine a performance score for the customer service representative, and causes, based on the performance score for the customer service representative, at least one action to be performed.

Type: Grant

Filed: May 10, 2021

Date of Patent: October 17, 2023

Assignee: Capital One Services, LLC

Inventors: Abdelkadar M'Hamed Benkreira, Joshua Edwards, Michael Mossoba, Alexandra Colevas
Complaint classification in customer communications using machine learning models

Patent number: 11790411

Abstract: An example computing system for classifying messages is described. For example, a computing system includes a memory and one or more processors in communication with the memory. The one or more processors are configured to receive data indicative of a message from a user device, wherein the data indicative of the message comprises a string of characters, identify, based on the string of characters, a set of token vectors from a plurality of token vectors generated based on a set of training data, and determine, using a machine learning model and based on the set of token vectors, a probability that a risk level associated with the message is a first risk level indicating that the message is to be elevated, wherein the first risk level is greater than a second risk level.

Type: Grant

Filed: February 4, 2020

Date of Patent: October 17, 2023

Assignee: Wells Fargo Bank, N.A.

Inventors: Thomas Mann, Michael W. Soistman, Nathan R Parrish, Noel P. Volin, Raja Ranganathan, Dane Arnesen, Shawn Bocketti, Kevin Portis, Josh Engebretson
System and method for multi-modal audio mining of telephone conversations

Patent number: 11789966

Abstract: A system and method for the automated monitoring of inmate telephone calls as well as multi-modal search, retrieval and playback capabilities for said calls. A general term for such capabilities is multi-modal audio mining. The invention is designed to provide an efficient means for organizations such as correctional facilities to identify and monitor the contents of telephone conversations and to provide evidence of possible inappropriate conduct and/or criminal activity of inmates by analyzing monitored telephone conversations for events, including, but not limited to, the addition of third parties, the discussion of particular topics, and the mention of certain entities.

Type: Grant

Filed: November 30, 2020

Date of Patent: October 17, 2023

Assignee: Global Tel*Link Corporation

Inventors: Andreas M. Olligschlaeger, James P. Rokosky
Information providing method, apparatus, and storage medium, that transmit related information to a remote terminal based on identification information received from the remote terminal

Patent number: 11790913

Abstract: A computer-implemented information providing method includes: storing, at an information providing apparatus, a plurality of pieces of related information in association with a same identification information in a storage device, wherein each of the plurality of the pieces of the related information is representative of a content of a corresponding one of a series of a plurality of speech voices uttered to a first device; and in response to a request from a second device that receives the identification information, causing, at the information providing apparatus, a communication device to transmit to the second device one or more of the plurality of the pieces of the related information associated with the same identification information.

Type: Grant

Filed: March 15, 2022

Date of Patent: October 17, 2023

Assignee: YAMAHA CORPORATION

Inventors: Takahiro Iwata, Yuki Seto
System and method for synthesizing automated test cases from natural interactions

Patent number: 11790886

Abstract: A system for synthesizing real IVR scenarios, automatically reproducing test scenarios, and providing accurate reports on those scenarios. An IVR tester uses a telephony device (website, mobile app, VOIP, etc.) and an automated IVR test case generator to place test calls to an IVR system. The automated IVR test case generator intercepts and collects the phone call's audio stream, dual-tone multi-frequency signals, and available metadata (e.g., response times, post-speech silence timeout, etc.). Once the call is complete, the automated IVR test case generator pre-populates a webpage or other user-interface technology and a database with the audio stream, dual-tone multi-frequency signals, and available metadata from the test scenario phone call. The IVR tester may then save that test scenario and continue to the next scenario. The saved scenario may now be used automatically by the IVR test case generator for regression testing or duplicated and edited for use in other systems.

Type: Grant

Filed: November 6, 2020

Date of Patent: October 17, 2023

Assignee: CYARA SOLUTIONS PTY LTD

Inventors: Alok Kulkarni, Geoff Willshire, Thomas Fejes
Translation method and translation device

Patent number: 11783137

Abstract: A translation method includes steps of providing a translation device, inputting a first conversational voice corresponding to a first language, converting the first conversational voice into a general language content, converting the general language content into a second conversational voice corresponding to a second language, and outputting the second conversational voice. As a result, any language may be translated into a general language, and then translated into any target language, so that the advantages of implementing multi-lingual translations and conversations with simple settings are achieved.

Type: Grant

Filed: August 2, 2021

Date of Patent: October 10, 2023

Assignee: COMPAL ELECTRONICS, INC.

Inventor: Dao-Wen Tang
Enhanced multi-channel acoustic models

Patent number: 11783849

Abstract: This specification describes computer-implemented methods and systems. One method includes receiving, by a neural network of a speech recognition system, first data representing a first raw audio signal and second data representing a second raw audio signal. The first raw audio signal and the second raw audio signal describe audio occurring at a same period of time. The method further includes generating, by a spatial filtering layer of the neural network, a spatial filtered output using the first data and the second data, and generating, by a spectral filtering layer of the neural network, a spectral filtered output using the spatial filtered output. Generating the spectral filtered output comprises processing frequency-domain data representing the spatial filtered output. The method still further includes processing, by one or more additional layers of the neural network, the spectral filtered output to predict sub-word units encoded in both the first raw audio signal and the second raw audio signal.

Type: Grant

Filed: June 8, 2021

Date of Patent: October 10, 2023

Assignee: Google LLC

Inventors: Ehsan Variani, Kevin William Wilson, Ron J. Weiss, Tara N. Sainath, Arun Narayanan
Multithreaded speech-to-text processing

Patent number: 11776545

Abstract: An apparatus includes a processor to: receive a request to perform speech-to-text conversion of a speech data set; perform pause detection to identify a set of likely sentence pauses and/or a speaker diarization technique to identify a set of likely speaker changes; based on the set of likely sentence pauses and/or the set of likely speaker changes, divide the speech data set into data segments representing speech segments; use an acoustic model with the data segments to derive sets of probabilities of speech sounds uttered; store the sets of probabilities in temporal order within a buffer queue; distribute the sets of probabilities from the buffer queue in temporal order among threads of a thread pool; and within each thread, and based on set(s) of probabilities, derive one candidate word and select either the candidate word or an alternate candidate word derived from a language model as the next word most likely spoken.

Type: Grant

Filed: November 28, 2022

Date of Patent: October 3, 2023

Assignee: SAS Institute Inc.

Inventors: Xiaolong Li, Xiaozhuo Cheng, Samuel Norris Henderson, Xu Yang
Automated system and method to prioritize language model and ontology expansion and pruning

Patent number: 11769012

Abstract: A system and method for updating computerized language models is provided that automatically adds or deletes terms from the language model to capture trending events or products, while maximizing computer efficiencies by deleting terms that are no longer trending and use of knowledge bases, machine learning model training and evaluation corpora, analysis tools and databases.

Type: Grant

Filed: March 25, 2020

Date of Patent: September 26, 2023

Assignee: Verint Americas Inc.

Inventors: Ian Roy Beaver, Christopher James Jeffs
Conversation support device, conversation support system, conversation support method, and storage medium

Patent number: 11755832

Abstract: A voice recognition part performs voice recognition on voice data and generates a first text, which is a text indicating an utterance content. A text acquisition part acquires a second text, which is a text indicating an utterance content according to an operation. A display processing part moves a position of a display text displayed on a display part, displays at least one of the first text and the second text as a display text in a free region generated by the movement, and, when fixing of a display position of the second text is instructed according to an operation, fixes the second text as a fixed text at a predetermined display position and displays the second text on the display part.

Type: Grant

Filed: March 29, 2021

Date of Patent: September 12, 2023

Assignee: Honda Motor Co., Ltd.

Inventors: Naoaki Sumida, Masaki Nakatsuka, Kazuhiro Nakadai, Yuichi Yoshida, Takashi Yamauchi, Kazuya Maura, Kyosuke Hineno, Syozo Yokoo
Classification of audio signal as speech or music based on energy fluctuation of frequency spectrum

Patent number: 11756576

Abstract: An audio signal classification method includes determining, according to voice activity of a current audio frame, whether to obtain a frequency spectrum fluctuation of the current audio frame and store the frequency spectrum fluctuation in a frequency spectrum fluctuation memory, and updating, according to whether the audio frame is percussive music or activity of a historical audio frame, frequency spectrum fluctuations stored in the frequency spectrum fluctuation memory, and classifying the current audio frame as a speech frame or a music frame according to statistics of a part or all of effective data of the frequency spectrum fluctuations stored in the frequency spectrum fluctuation memory.

Type: Grant

Filed: March 11, 2022

Date of Patent: September 12, 2023

Assignee: HUAWEI TECHNOLOGIES CO., LTD.

Inventor: Zhe Wang
Transcription of communications using multiple speech recognition systems

Patent number: 11710488

Abstract: A method may include obtaining audio data originating at a first device during a communication session between the first device and a second device and providing the audio data to a first speech recognition system to generate a first transcript based on the audio data and directing the first transcript to the second device. The method may also include in response to obtaining a quality indication regarding a quality of the first transcript, multiplexing the audio data to provide the audio data to a second speech recognition system to generate a second transcript based on the audio data while continuing to provide the audio data to the first speech recognition system and direct the first transcript to the second device, and in response to obtaining a transfer indication that occurs after multiplexing of the audio data, directing the second transcript to the second device instead of the first transcript.

Type: Grant

Filed: December 19, 2018

Date of Patent: July 25, 2023

Assignee: Sorenson IP Holdings, LLC

Inventors: Kenneth Boehme, Michael Holm, Shane Roylance
Audio data processing method, apparatus and storage medium for detecting wake-up words based on multi-path audio from microphone array

Patent number: 11710490

Abstract: An audio data processing method is provided. The method includes: obtaining multi-path audio data in an environmental space, obtaining a speech data set based on the multi-path audio data, and separately generating, in a plurality of enhancement directions, enhanced speech information corresponding to the speech data set; matching a speech hidden feature in the enhanced speech information with a target matching word, and determining an enhancement direction corresponding to the enhanced speech information having a highest degree of matching with the target matching word as a target audio direction; obtaining speech spectrum features in the enhanced speech information, and obtaining, from the speech spectrum features, a speech spectrum feature in the target audio direction; and performing speech authentication on the speech hidden feature and the speech spectrum feature that are in the target audio direction based on the target matching word, to obtain a target authentication result.

Type: Grant

Filed: February 17, 2021

Date of Patent: July 25, 2023

Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED

Inventor: Yi Gao
Text-to-speech from media content item snippets

Patent number: 11710474

Abstract: A text-to-speech engine creates audio output that includes synthesized speech and one or more media content item snippets. The input text is obtained and partitioned into text sets. A track having lyrics that match a part of one of the text sets is identified. The location of the track's audio that contains the lyric is extracted based on forced alignment data. The extracted audio is combined with synthesized speech corresponding to the remainder of the input text to form audio output.

Type: Grant

Filed: January 12, 2021

Date of Patent: July 25, 2023

Assignee: Spotify AB

Inventors: Rohit Kumar, Henrik Lindström, Henriette Cramer, Sarah Mennicken, Sravana Reddy, Jennifer Thom-Santelli
Viseme data generation for presentation while content is output

Patent number: 11699455

Abstract: Systems and methods for viseme data generation are disclosed. Uncompressed audio data is generated and/or utilized to determine the beats per minute of the audio data. Visemes are associated with the audio data utilizing a Viterbi algorithm and the beats per minute. A time-stamped list of viseme data is generated that associates the visemes with the portions of the audio data that they correspond to. An animatronic toy and/or an animation is caused to lip sync using the viseme data while audio corresponding to the audio data is output.

Type: Grant

Filed: September 4, 2020

Date of Patent: July 11, 2023

Assignee: Amazon Technologies, Inc.

Inventors: Zoe Adams, Pete Klein, Derick Deller, Bradley Michael Richards, Anirudh Ranganath
System and method for automatic theming of a plurality of thought objects

Patent number: 11694030

Abstract: A system and method for automated theming of thought objects is disclosed. In a preferred embodiment, an object theming computer creates aggregated text by aggregating text associated with a first thought object, the aggregated text comprising descriptive and important text associated with the first thought object. The object theming computer then tokenizes the aggregated text into tokens. Further, a theme is associated with thought objects using one or more previously themed thought objects. If one or more thought objects have no associated theme, then a theme is generated and associated with the un-themed thought objects using a current plurality of the one or more thought objects.

Type: Grant

Filed: April 6, 2022

Date of Patent: July 4, 2023

Assignee: Fulcrum Management Solutions Ltd.

Inventors: Farhad Imani, Thomas John Procter
