Speech To Image Patents (Class 704/235)
-
Patent number: 11432045
Abstract: Disclosed is a display device. According to an embodiment, a display device may include a voice signal receiver, a display, at least one memory storing an application supporting a contents providing service and storing instructions, a communication circuit communicating with at least one external server supporting the contents providing service, and at least one processor. The contents providing service may provide contents files of a first type and contents files of a second type.
Type: Grant
Filed: February 19, 2019
Date of Patent: August 30, 2022
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Jibum Moon, Gyungchan Seol, Kyerim Lee
-
Patent number: 11429789
Abstract: Embodiments relate to an intelligent computer platform to identify and evaluate candidate passage response data in natural language form. Natural language processing is applied to analyze a passage against one or more input tokens to identify matching content. A structure representing the analyzed passage is populated with matching input and passage tokens. A first count of matching token entries and a second count of evaluated token entries are determined and qualified by closeness criteria. An alignment of the passage to a candidate question is calculated, including assessing a ratio of the first and second counts as a confidence value. Matching passage data is returned from the passage with the confidence value.
Type: Grant
Filed: June 12, 2019
Date of Patent: August 30, 2022
Assignee: International Business Machines Corporation
Inventors: Stephen A. Boxwell, Keith G. Frost, Kyle M. Brake, Stanley J. Vernier
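The ratio-of-counts confidence described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the token-length filter standing in for the "closeness criteria" and the case-insensitive matching are assumptions.

```python
def passage_confidence(question_tokens, passage_tokens):
    """Confidence as the ratio of matching token entries to evaluated
    token entries (a simplified sketch of the alignment described above)."""
    # Stand-in for the "closeness criteria": skip trivial short tokens.
    evaluated = [t for t in question_tokens if len(t) > 2]
    if not evaluated:
        return 0.0
    passage_set = {t.lower() for t in passage_tokens}
    matched = sum(1 for t in evaluated if t.lower() in passage_set)
    return matched / len(evaluated)
```

For the question tokens `["What", "color", "sky"]` against a passage containing "sky" and "color", two of the three evaluated tokens match, giving a confidence of 2/3.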
-
Patent number: 11430428
Abstract: The present disclosure describes a method, apparatus, and storage medium for performing speech recognition. The method includes acquiring, by an apparatus, first to-be-processed speech information. The apparatus includes a memory storing instructions and a processor in communication with the memory. The method includes acquiring, by the apparatus, a first pause duration according to the first to-be-processed speech information; and in response to the first pause duration being greater than or equal to a first threshold, performing, by the apparatus, speech recognition on the first to-be-processed speech information to obtain a first result of sentence segmentation of speech, the first result of sentence segmentation of speech being text information, the first threshold being determined according to speech information corresponding to a previous moment.
Type: Grant
Filed: September 10, 2020
Date of Patent: August 30, 2022
Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Inventors: Lianwu Chen, Jingliang Bai, Min Luo
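The pause-threshold segmentation idea can be sketched as below. This is an illustration only, not the patented method: the fixed threshold and the `(text, pause)` chunk format are assumptions (the patent determines the threshold adaptively from preceding speech).

```python
def segment_by_pause(chunks, threshold=0.5):
    """Group recognized speech chunks into sentence segments: a pause of at
    least `threshold` seconds after a chunk ends the current segment.

    `chunks` is a list of (text, pause_after_seconds) tuples."""
    segments, current = [], []
    for text, pause in chunks:
        current.append(text)
        if pause >= threshold:
            segments.append(" ".join(current))
            current = []
    if current:  # flush any trailing, unterminated segment
        segments.append(" ".join(current))
    return segments
```

For example, chunks `[("hello", 0.1), ("world", 0.6), ("bye", 0.2)]` yield the segments `["hello world", "bye"]`.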
-
Patent number: 11432090
Abstract: An audio system can be configured to generate an audio heatmap for the audio emission potential profiles of one or more speakers, in specific or arbitrary locations. The audio heatmap may be based on speaker location and orientation, speaker acoustic properties, and optionally environmental properties. The audio heatmap often shows areas of low sound density when there are few speakers, and areas of high sound density when there are many speakers. An audio system may be configured to normalize audio signals for a set of speakers that cooperatively emit sound to render an audio object in a defined audio object location. The audio signals for each speaker can be normalized to ensure accurate rendering of the audio object without volume spikes or dropout.
Type: Grant
Filed: January 12, 2021
Date of Patent: August 30, 2022
Assignee: SPATIALX INC.
Inventors: Xavier Prospero, Aric Marshall, Michael Plitkins, Calin Pacurariu
-
Patent number: 11423897
Abstract: Systems and methods are described herein for generating an adaptive response to a user request. Input indicative of a user request may be received and utilized to identify an item in an electronic catalog. Title segments may be identified from the item's title. Significant segments of the user request may be determined. In response to the user request, a shortened title may be generated from the identified title segments and provided as output at the user device (e.g., via audible output provided at a speaker of the user device, via textual output, or the like). At least one of the title segments provided in the shortened title may correlate to the significant segment identified from the user request. In some embodiments, the length and content of the shortened title may vary based at least in part on the contextual intent of the user's request.
Type: Grant
Filed: January 30, 2020
Date of Patent: August 23, 2022
Assignee: Amazon Technologies, Inc.
Inventors: Ran Levy, Ori Rozen, Leon Portman, Knaan Ratosh, Ido Arad, Hadar Neumann
-
Patent number: 11423236
Abstract: A method for identifying phrases in a text document having a similar discourse to a candidate phrase includes separating text in a document file into a plurality of phrases and generating a plurality of embedding vectors in a textual embedding space by inputting the plurality of phrases into an embedding engine. A mapping of each embedding vector in the textual embedding space is generated with each corresponding phrase and a document location of each corresponding phrase in the document file. A candidate phrase is received from a user and a candidate embedding vector is generated using the embedding engine. Similarity scores are computed based on the plurality of embedding space distances between the candidate phrase embedding vector location and each respective location of each embedding vector in the textual embedding space. A listing of the phrases with the highest similarity scores is outputted with respective document locations in the text.
Type: Grant
Filed: June 12, 2020
Date of Patent: August 23, 2022
Assignee: Capital One Services, LLC
Inventors: Austin Walters, Vincent Pham, Ernest Kwak, Galen Rafferty, Reza Farivar, Jeremy Goodsitt, Anh Truong
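The core retrieval step in this abstract — scoring every stored phrase embedding against a candidate embedding and returning the top matches with their document locations — can be sketched as follows. The cosine similarity metric and the `phrase -> (embedding, location)` index layout are assumptions for illustration; the patent does not specify a particular distance function.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def rank_phrases(candidate_vec, phrase_index, top_k=3):
    """Return the top_k (score, phrase, document_location) triples.

    `phrase_index` maps each phrase to (embedding_vector, document_location)."""
    scored = [(cosine(candidate_vec, vec), phrase, loc)
              for phrase, (vec, loc) in phrase_index.items()]
    scored.sort(key=lambda t: t[0], reverse=True)
    return scored[:top_k]
```

With a two-dimensional toy index, a candidate vector `[1, 0]` ranks an identically oriented phrase first and a 45-degree neighbor second.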
-
Patent number: 11425315
Abstract: A video communications method is provided, including: respectively displaying video images of at least two terminals in at least two display subareas of a video communication interface in a video chat session of the at least two terminals; obtaining a first special effect display instruction; and adding a first special effect to the at least two display subareas based on the first special effect display instruction. The method also includes transmitting the first special effect display instruction to a second terminal of the at least two terminals, the second terminal being an action recipient of the first special effect; and selecting, among multiple end special effects, a target end special effect to be added to the video images of the at least two terminals according to a body action that occurred in the video image of the second terminal.
Type: Grant
Filed: February 3, 2020
Date of Patent: August 23, 2022
Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Inventors: Ying Zhu, Chao Wang, Yinglei Liang, Haoqi Kuang, Lin Shi, Jinjie Wang, Weisong Zhu
-
Patent number: 11425487
Abstract: Disclosed is a translation system using a sound vibration microphone capable of fundamentally blocking noise by detecting vibrations generated from a user's voice using a piezo element. The translation system using a sound vibration microphone includes a first earset including a sound vibration sensor and a speaker, a first wearable acoustic device including a microphone and a speaker and performing wireless communication with the first earset, and a first electronic communication device performing wireless communication with the first wearable acoustic device and performing communication with a translation server, wherein the first wearable acoustic device and the first electronic communication device are in a communicable state.
Type: Grant
Filed: November 23, 2020
Date of Patent: August 23, 2022
Assignee: EM-TECH CO., LTD.
Inventors: Cheon Myeong Kim, Byung Min Yu, Jae Hwa Lim, Jeong Ae Ha, Ji Young Lee
-
Patent number: 11423114
Abstract: Systems for automatically suggesting content items to an author of a digital page so that they can be added to the page. One embodiment comprises a system having a page editor that launches a component browser to enable viewing of content items stored in a repository. A web content management server extracts textual information from a digital page that is open for editing and analyzes the information to identify words indicative of the page's subject matter. The system generates a query using the identified words and a search engine searches the repository for items associated with the identified words. The content items identified by the search are displayed by the component browser and are selectable to add particular suggested content items to the page.
Type: Grant
Filed: November 7, 2019
Date of Patent: August 23, 2022
Assignee: OPEN TEXT HOLDINGS, INC.
Inventors: Paul Kieron Carl Narth, Gregory Aaron Parmiter, Jason Jackson, Partha Saradhi Govind
-
Patent number: 11423073
Abstract: An audio-visual management system and process for facilitating comprehension of electronic documents. As an example, a user may access contents of an electronic document via a user interface and request a speech-synthesized recitation of any text in the electronic document. In addition, the user interface can include a navigation tool configured to visually represent significant events in the electronic document via selectable indicators, including but not limited to comments, slides, images, graphs, and/or other internal contextual markers. These indicators can serve as navigation options that can provide guidance to users as they listen to a reading of the electronic document. User selection of an indicator can initiate an audio presentation of textual content that is directly associated with the specified event. In addition, users can view availability of others currently listening to the document and connect with them to create a joint listening experience.
Type: Grant
Filed: November 16, 2018
Date of Patent: August 23, 2022
Assignee: Microsoft Technology Licensing, LLC
Inventors: Erez Kikin-Gil, Benjamin David Smith, Colleen Griffiths Estrada
-
Patent number: 11417340
Abstract: Electronic conferences are a common method of conducting a meeting when the participants are not at the same location. When a conference drops a portion of the audio, such as audio carrying speech from a speaker, participants may be excluded from the full content of the conference. By having a device associated with a speaking participant convey both audio and generated text from the speech provided by the speaking participant, a server may be able to determine that speech is missing from the audio portion and automatically insert text content. As a result, poor audio quality may be mitigated with inserted text, and unwanted text may be omitted when the audio quality is sufficient.
Type: Grant
Filed: December 16, 2019
Date of Patent: August 16, 2022
Assignee: Avaya Inc.
Inventors: Darren Holmes, Jibin George, Salil Dhawan, Sandeep Goynar, Harsimran Jeet Singh
-
Patent number: 11418644
Abstract: Systems and methods for making a multitrack recording of a telephony communication, such as a conference call, record the contributions of each participant in its own respective, separate recording track. In some instances, the contribution(s) of one or more participants is recorded in separate recording tracks, and the contributions of multiple other participants are mixed and recorded in a single recording track. An organizer or administrator of a telephony communication, such as a conference call, can instruct a multitrack recording system as to how to format a multitrack recording of the telephony communication via commands submitted through an application programming interface (API).
Type: Grant
Filed: June 28, 2019
Date of Patent: August 16, 2022
Assignee: Vonage Business Inc.
Inventors: Jonathan Farrow, Dragos Oancea
-
Patent number: 11410644
Abstract: Systems and methods for generating training data for a supervised topic modeling system from outputs of a topic discovery model are described herein. In an embodiment, a system receives a plurality of digitally stored call transcripts and, using a topic model, generates an output which identifies a plurality of topics represented in the plurality of digitally stored call transcripts. Using the output of the topic model, the system generates an input dataset for a supervised learning model by identifying a first subset of the plurality of digitally stored call transcripts that include a particular topic, storing a positive value for the first subset, identifying a second subset that does not include the particular topic, and storing a negative value for the second subset. The input training dataset is then used to train a supervised learning model.
Type: Grant
Filed: August 18, 2020
Date of Patent: August 9, 2022
Assignee: INVOCA, INC.
Inventors: Michael McCourt, Anoop Praturu
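The labeling step described here — turning unsupervised topic-model output into positive/negative training examples for a classifier — can be sketched as below. This is an illustration under assumptions: the dictionary shapes of `transcripts` and `topic_assignments` are invented for the example, not taken from the patent.

```python
def build_training_set(transcripts, topic_assignments, target_topic):
    """Label each transcript 1 if the topic model assigned `target_topic`
    to it, else 0, yielding (text, label) pairs for a supervised model.

    `transcripts` maps transcript_id -> text; `topic_assignments` maps
    transcript_id -> list of topics the topic model found in it."""
    dataset = []
    for transcript_id, text in transcripts.items():
        label = 1 if target_topic in topic_assignments.get(transcript_id, []) else 0
        dataset.append((text, label))
    return dataset
```

The resulting pairs can be fed directly to any off-the-shelf text classifier as its training set.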
-
Patent number: 11410426
Abstract: In non-limiting examples of the present disclosure, systems, methods and devices for generating summary content are presented. Voice audio data and video data for an electronic meeting may be received. A language processing model may be applied to a transcript of the audio data and textual importance scores may be calculated. A video/image model may be applied to the video data and visual importance scores may be calculated. A combined importance score may be calculated for sections of the electronic meeting based on the textual importance scores and the visual importance scores. A meeting summary that includes summary content from sections for which combined importance scores exceed a threshold value may be generated.
Type: Grant
Filed: June 4, 2020
Date of Patent: August 9, 2022
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Erez Kikin-Gil, Daniel Yancy Parish
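The thresholding over combined scores can be sketched as follows. The equal weighting and the threshold value are assumptions for illustration; the abstract specifies only that textual and visual scores are combined and compared against a threshold.

```python
def summarize_sections(sections, w_text=0.5, w_visual=0.5, threshold=0.6):
    """Keep the content of each meeting section whose weighted combination
    of textual and visual importance scores exceeds the threshold.

    Each section is a (content, text_score, visual_score) tuple."""
    summary = []
    for content, text_score, visual_score in sections:
        combined = w_text * text_score + w_visual * visual_score
        if combined > threshold:
            summary.append(content)
    return summary
```

A section scoring 0.9 textually and 0.8 visually (combined 0.85) survives the default 0.6 threshold, while one scoring 0.2 and 0.1 does not.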
-
Patent number: 11409804
Abstract: The present disclosure relates to a data analysis method and a data analysis system thereof. The data analysis method includes steps of: receiving first learning content data, and adding multiple first segmentation marks to the first learning content data to divide it into multiple first learning sections. The first learning sections are arranged according to a time axis. Searching a first keyword string corresponding to each first learning section from the first learning sections. Receiving an analysis command, and analyzing the analysis command against the first keyword string of each of the first learning sections to obtain multiple first similarities corresponding to the analysis command and each of the first learning sections. Finally, searching for the first learning section with the highest similarity.
Type: Grant
Filed: August 2, 2019
Date of Patent: August 9, 2022
Assignee: DELTA ELECTRONICS, INC.
Inventors: Shih-Gong Li, Shih-Han Chan, Chao-Hsuan Ko, Guo-Cheng Lan
-
Patent number: 11411961
Abstract: An electronic device is provided. The electronic device includes a user interface, a location sensor configured to sense a location of the electronic device, a processor electrically connected with the user interface and the location sensor, and a memory electrically connected with the processor and configured to store a first application program and a second application program. The memory is further configured to store instructions that, when executed, enable the processor to receive first location data with a first degree of accuracy regarding the location of the electronic device from the location sensor, process at least part of the first location data to generate second location data with a second degree of accuracy lower than the first degree of accuracy regarding the location of the electronic device, provide the at least part of the first location data to execute the first application program, and provide at least part of the second location data to execute the second application program.
Type: Grant
Filed: August 6, 2021
Date of Patent: August 9, 2022
Assignee: Samsung Electronics Co., Ltd.
Inventors: Jin-Ho Moon, Jong-Eun Yang, Jae-Yung Yeo
-
Patent number: 11404052
Abstract: In a service data processing method performed by a server, user speech information collected by a first terminal is received. A target service operation code according to the user speech information is obtained. The target service operation code is used for identifying target service operation information. The target service operation code is transmitted from the server to the first terminal, so that the first terminal plays the target service operation code by using a speech. The target service operation code obtained by a second terminal is received. A target execution page corresponding to the target service operation code is searched for. The target execution page is transmitted to the second terminal, so that the second terminal executes a service operation corresponding to the target.
Type: Grant
Filed: September 17, 2020
Date of Patent: August 2, 2022
Assignee: Tencent Technology (Shenzhen) Company Limited
Inventors: Jinglin Ma, Xuewei Fang
-
Patent number: 11404041
Abstract: A message management unit receives and accumulates a message, wherein the message is distributed for every update and is message data representing a latest situation of a competition. An explanation generation unit generates an explanatory text for conveying unconveyed information detected from the message, based on conveyed information. A speech synthesis unit outputs a speech converted from the explanatory text. The explanation generation unit stores the unconveyed information for the explanatory text as the conveyed information, stands by until completion of the speech, and initiates a procedure for generating a new explanatory text based on updated unconveyed information.
Type: Grant
Filed: May 23, 2018
Date of Patent: August 2, 2022
Assignees: NIPPON HOSO KYOKAI, NHK Engineering System, Inc.
Inventors: Tadashi Kumano, Ichiro Yamada, Atsushi Imai, Hideki Sumiyoshi, Yuko Yamanouchi, Toshihiro Shimizu, Nobumasa Seiyama, Shoei Sato, Reiko Saito, Taro Miyazaki, Kiyoshi Kurihara, Manon Ichiki, Tohru Takagi
-
Patent number: 11404058
Abstract: Systems and methods for e-commerce systems using natural language understanding are described. A computing device is configured to receive a user utterance including at least one identified semantic component and at least one missing semantic component and generate a context stack including a set of context entries. Each of the context entries includes a root intent element, an entity list element, and a dialogue stack, and each context entry in the set of context entries is associated with one of a user utterance or a system utterance. The computing device is further configured to review at least one context entry in the set of context entries to locate the at least one missing semantic element within the dialogue stack and generate an intent flow execution request including the at least one semantic element from the first speech data and the missing semantic element.
Type: Grant
Filed: October 31, 2018
Date of Patent: August 2, 2022
Assignee: Walmart Apollo, LLC
Inventors: Snehasish Mukherjee, Shankara Bhargava Subramanya
-
Patent number: 11398218
Abstract: Techniques are described for providing dynamically configured speech output, through which text data from a message is presented as speech output through a text-to-speech (TTS) engine that employs a voice profile to provide a machine-generated voice that approximates that of the sender of the message. The sender can also indicate the type of voice they would prefer the TTS engine use to render their text to a recipient, and the voice to be used can be specified in a sender's user profile, as a preference or attribute of the sending user. In some examples, the voice profile to be used can be indicated as metadata included in the message. A voice profile can specify voice attributes such as the tone, pitch, register, timbre, pacing, gender, accent, and so forth. A voice profile can be generated through a machine learning (ML) process.
Type: Grant
Filed: April 25, 2019
Date of Patent: July 26, 2022
Assignee: United Services Automobile Association (USAA)
Inventors: Justin Dax Haslam, Robert Wilson Barner
-
Patent number: 11386902
Abstract: Embodiments of the present invention provide systems and methods for generation and maintenance of verified data records. The system may receive a data submission from a user device over one or more communication channels and convert the data submission into a normalized text format for processing and analysis. The data submission may then be analyzed using one or more trained machine learning models in order to identify factual statements and modifiers within the data submission, and generate a confidence score of verified factual information based on corroboration with one or more additional data sources. Additionally, identified modifiers may be analyzed to determine positive or negative sentiment.
Type: Grant
Filed: April 28, 2020
Date of Patent: July 12, 2022
Assignee: BANK OF AMERICA CORPORATION
Inventors: Kelly Renee-Drop Keiter, Christopher Daniel Birch, Susan R. Hart, Lisa Matthews, Cody Dean Searl
-
Patent number: 11373672
Abstract: Disclosed are devices, systems, apparatus, methods, products, and other implementations, including a method comprising obtaining, by a device, a combined sound signal for signals combined from multiple sound sources in an area in which a person is located, and applying, by the device, speech-separation processing (e.g., deep attractor network (DAN) processing, online DAN processing, LSTM-TasNet processing, Conv-TasNet processing) to the combined sound signal from the multiple sound sources to derive a plurality of separated signals that each contains signals corresponding to different groups of the multiple sound sources. The method further includes obtaining, by the device, neural signals for the person, the neural signals being indicative of one or more of the multiple sound sources the person is attentive to, and selecting one of the plurality of separated signals based on the obtained neural signals. The selected signal may then be processed (amplified, attenuated).
Type: Grant
Filed: October 24, 2018
Date of Patent: June 28, 2022
Assignee: The Trustees of Columbia University in the City of New York
Inventors: Nima Mesgarani, Yi Luo, James O'Sullivan, Zhuo Chen
-
Patent number: 11373634
Abstract: An electronic device secures diversity of a user utterance with respect to a content name when a user searches for content through a display device by voice. A method by an electronic device includes steps of receiving input of a user voice, acquiring a keyword related to a content included in the user voice, acquiring at least one modified keyword based on the keyword, acquiring a plurality of search results corresponding to the keyword and the at least one modified keyword, comparing the keyword and the modified keyword with the plurality of search results and acquiring a content name corresponding to the keyword, and updating a database of content names based on the keyword, the modified keyword, and the final content name.
Type: Grant
Filed: October 30, 2019
Date of Patent: June 28, 2022
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Jiwon Yoo, Jihun Park
-
Patent number: 11372608
Abstract: A machine includes a processor and a memory connected to the processor. The memory stores instructions executed by the processor to receive a message and a message parameter indicative of a characteristic of the message, where the message includes a photograph or a video. A determination is made that the message parameter corresponds to a selected gallery, where the selected gallery includes a sequence of photographs or videos. The message is posted to the selected gallery in response to the determination. The selected gallery is supplied in response to a request.
Type: Grant
Filed: October 24, 2019
Date of Patent: June 28, 2022
Assignee: Snap Inc.
Inventor: Timothy Sehn
-
Patent number: 11368581
Abstract: A method to transcribe communications includes: obtaining an audio message from a first device during a voice communication session with a second device including a display screen; providing the message to a first speech recognition system to generate a first message transcript; providing the transcript to the second device for presentation on the screen; obtaining an indication that a transcript quality is below a threshold; in response, providing the message to a second system to generate a second transcript while still providing it to the first system to generate the first transcript, and providing the first transcript to the second device for presentation on the display screen; and, in response to an event occurring that indicates the second transcript is to be provided to the second device instead of the first transcript, providing the second transcript to the second device for presentation on the screen instead of the first transcript.
Type: Grant
Filed: September 11, 2020
Date of Patent: June 21, 2022
Assignee: Ultratec, Inc.
Inventors: Robert M. Engelke, Kevin R. Colwell, Christopher Engelke
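The fallback pattern in this abstract — switch to a second recognizer when the first transcript's quality drops below a threshold — can be sketched as below. The function signatures, the quality-estimation callback, and the single-shot (rather than streaming) flow are all assumptions made to keep the illustration small.

```python
def choose_transcript(audio, primary_asr, secondary_asr, quality_fn, threshold=0.8):
    """Run the primary recognizer; if its estimated transcript quality
    falls below `threshold`, run the secondary recognizer and prefer it.

    Returns (transcript, source) where source names the system used."""
    first = primary_asr(audio)
    if quality_fn(first) >= threshold:
        return first, "primary"
    second = secondary_asr(audio)
    return second, "secondary"
```

In a real captioning system both recognizers would run concurrently on streaming audio, with the switch driven by an ongoing quality signal rather than a single score.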
-
Patent number: 11366569
Abstract: The disclosure relates to an interactive interface display method, apparatus, and storage medium. The method includes displaying an information display interface including a call entry of an intelligent interactive application; calling the intelligent interactive application when a trigger operation on the call entry is detected; displaying a first dynamic effect in which the call entry moves in the information display interface; and displaying an interactive interface of the intelligent interactive application after displaying the first dynamic effect in which the call entry moves in the information display interface.
Type: Grant
Filed: March 12, 2020
Date of Patent: June 21, 2022
Assignee: BEIJING XIAOMI MOBILE SOFTWARE CO., LTD.
Inventors: Yuhang Xia, Zekui Li
-
Patent number: 11361491
Abstract: The present invention relates to a method of generating a facial expression of a user for a virtual environment. The method comprises obtaining a video and an associated speech of the user. Further, extracting in real-time at least one of one or more voice features and one or more text features based on the speech. Furthermore, identifying one or more phonemes in the speech. Thereafter, determining one or more facial features relating to the speech of the user using a pre-trained second learning model based on the one or more voice features, the one or more phonemes, the video, and one or more previously generated facial features of the user. Finally, generating the facial expression of the user corresponding to the speech for an avatar representing the user in the virtual environment.
Type: Grant
Filed: September 3, 2020
Date of Patent: June 14, 2022
Assignee: Wipro Limited
Inventors: Vivek Kumar Varma Nadimpalli, Gopichand Agnihotram
-
Patent number: 11356492
Abstract: Embodiments of the present invention provide methods, computer program products, and systems. Embodiments of the present invention detect an audio stream comprising one or more voice packets from a first computing system. Embodiments of the present invention can, in response to detecting an audio stream, dynamically prevent audio drop out on a second computing system using circular buffers based on network consistency.
Type: Grant
Filed: September 16, 2020
Date of Patent: June 7, 2022
Assignee: Kyndryl, Inc.
Inventors: Tiberiu Suto, Nadiya Kochura, Vinod A. Valecha
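A circular (ring) buffer of the kind this abstract relies on can be sketched as follows. This is a generic textbook structure, not the patented system: the overwrite-oldest policy on overflow and the `None`-on-underrun behavior are design assumptions for the example.

```python
class CircularAudioBuffer:
    """Fixed-capacity ring buffer for audio packets: when the network
    stalls, playback keeps draining buffered packets instead of dropping out."""

    def __init__(self, capacity):
        self.buf = [None] * capacity
        self.capacity = capacity
        self.read = self.write = self.count = 0

    def push(self, packet):
        self.buf[self.write] = packet
        self.write = (self.write + 1) % self.capacity
        if self.count == self.capacity:
            # Buffer full: advance the read pointer, overwriting the oldest packet.
            self.read = (self.read + 1) % self.capacity
        else:
            self.count += 1

    def pop(self):
        if self.count == 0:
            return None  # underrun: the caller must conceal the gap
        packet = self.buf[self.read]
        self.read = (self.read + 1) % self.capacity
        self.count -= 1
        return packet
```

Pushing three packets into a two-slot buffer silently evicts the oldest, so a subsequent drain yields only the two most recent packets.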
-
Patent number: 11354516
Abstract: An information processor includes a generation section that generates a specified character string on the basis of at least one of voice information corresponding to a content of speech detected by a voice detection section and vehicle information acquired from a vehicle. With this configuration, a user can input the specified character string, which is a hashtag, without an operation. Thus, compared to the related art in which the hashtag is generated on the basis of the operation (manual input) by the user, a burden on the user can significantly be reduced, and an input error can be prevented.
Type: Grant
Filed: November 22, 2019
Date of Patent: June 7, 2022
Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA
Inventors: Ryotaro Fujiwara, Keiko Suzuki, Makoto Honda, Chikage Kubo, Ryota Okubi, Takeshi Fujiki
-
Patent number: 11350148
Abstract: Aspects of the subject disclosure may include, for example, modifying a user profile associated with a user associated with a content service to generate an updated user profile according to consumption of media content by the user and user feedback information associated with the consumption of the media content, determining a user context according to information associated with a user device, where the user context includes current activity of the user, modifying a set of media content according to the user context that is determined to generate an updated set of media content, where a type of media content is eliminated from the set of media content in the updated set of media content according to the user context, and presenting the updated set of content at a presentation device of the user via a personal media channel of the user associated with the content service. Other embodiments are disclosed.
Type: Grant
Filed: October 29, 2020
Date of Patent: May 31, 2022
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Zhu Liu, Eric Zavesky, Bernard S. Renger, Behzad Shahraray, David Crawford Gibbon, Tan Xu, Lee Begeja, Raghuraman Gopalan
-
Patent number: 11341707
Abstract: A method and system for transforming simple user input into customizable animated images for use in text-messaging applications.
Type: Grant
Filed: March 23, 2021
Date of Patent: May 24, 2022
Assignee: EMONSTER INC
Inventor: Enrique Bonansea
-
Patent number: 11341331
Abstract: An intelligent speech assistant receives information collected while a user is speaking. The information can comprise speech data, vision data, or both, where the speech data is from the user speaking and the vision data is of the user while speaking. The assistant evaluates the speech data against a script which can contain information that the user should speak, information that the user should not speak, or both. The assistant collects instances where the user utters phrases that match the script or instances where the user utters phrases that do not match the script, depending on whether phrases should or should not be spoken. The assistant evaluates vision data to identify gestures, facial expressions, and/or emotions of the user. Instances where the gestures, facial expressions, and/or emotions are not appropriate to the context are flagged. Real-time prompts and/or a summary is presented to the user as feedback.
Type: Grant
Filed: October 4, 2019
Date of Patent: May 24, 2022
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Huakai Liao, Priyanka Vikram Sinha, Kevin Dara Khieu, Derek Martin Johnson, Siliang Kang, Huey-Ru Tsai, Amit Srivastava
-
Patent number: 11341957
Abstract: A method for detecting a keyword, applied to a terminal, includes: extracting a speech eigenvector of a speech signal; obtaining, according to the speech eigenvector, a posterior probability of each target character being a key character in any keyword in an acquisition time period of the speech signal; obtaining confidences of at least two target character combinations according to the posterior probability of each target character; and determining that the speech signal includes the keyword upon determining that all the confidences of the at least two target character combinations meet a preset condition. The target character is a character in the speech signal whose pronunciation matches a pronunciation of the key character. Each target character combination includes at least one target character, and a confidence of a target character combination represents a probability of the target character combination being the keyword or a part of the keyword.
Type: Grant
Filed: July 20, 2020
Date of Patent: May 24, 2022
Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Inventors: Yi Gao, Meng Yu, Dan Su, Jie Chen, Min Luo
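The confidence-per-character-combination check can be sketched as below. The geometric-mean combination rule and the threshold condition are assumptions chosen for illustration; the abstract says only that confidences are derived from per-character posteriors and all must meet a preset condition.

```python
import math

def keyword_confidence(posteriors, combo):
    """Confidence of one character combination, here taken as the
    geometric mean of its per-character posterior probabilities."""
    probs = [posteriors[ch] for ch in combo]
    return math.exp(sum(math.log(p) for p in probs) / len(probs))

def detect_keyword(posteriors, combos, threshold=0.5):
    """Fire only if every character combination clears the threshold."""
    return all(keyword_confidence(posteriors, c) > threshold for c in combos)
```

Requiring all combinations (e.g. the full keyword and its prefixes) to pass, rather than just the full keyword, makes the detector less prone to firing on a single spuriously confident character.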
-
Patent number: 11334315
Abstract: Systems, methods, and devices for human-machine interfaces for utterance-based playlist selection are disclosed. In one method, a list of playlists is traversed and a portion of each is audibly output until a playlist command is received. Based on the playlist command, the traversing is stopped and a playlist is selected for playback. In examples, the list of playlists is modified based on a modification input.
Type: Grant
Filed: July 8, 2019
Date of Patent: May 17, 2022
Assignee: Spotify AB
Inventors: Daniel Bromand, Richard Mitic, Horia-Dragos Jurcut, Henriette Susanne Martine Cramer, Ruth Brillman
-
Patent number: 11335360
Abstract: In one aspect, a device includes at least one processor and storage accessible to the at least one processor. The storage includes instructions executable by the at least one processor to analyze the decibel levels of audio of a user's speech. The instructions are executable to, based on the analysis, enhance a transcript of the user's speech with indications of particular words from the user's speech as being associated with one or more emotions of the user.
Type: Grant
Filed: September 21, 2019
Date of Patent: May 17, 2022
Assignee: Lenovo (Singapore) Pte. Ltd.
Inventors: Johnathan Co Lee, Jonathan Jen-Wei Yu
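A minimal sketch of the loudness analysis this abstract describes, under stated assumptions: per-word loudness is approximated as the dBFS of the RMS amplitude, and words louder than the utterance average by a margin are marked as emphasized. The function names, the asterisk markup, and the 6 dB margin are all illustrative, not from the patent.

```python
import math

def word_db(samples):
    """Approximate loudness of one word as dBFS of its RMS amplitude."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(max(rms, 1e-12))  # floor avoids log10(0)

def enhance_transcript(words, margin_db=6.0):
    """words: list of (text, samples) pairs. Words whose level exceeds
    the utterance average by margin_db are marked with *asterisks*."""
    levels = [word_db(s) for _, s in words]
    avg = sum(levels) / len(levels)
    return " ".join(f"*{text}*" if level > avg + margin_db else text
                    for (text, _), level in zip(words, levels))

words = [("i", [0.1] * 4), ("said", [0.1] * 4), ("NO", [0.9] * 4)]
print(enhance_transcript(words))  # "i said *NO*"
```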
-
Patent number: 11322231
Abstract: A method, computer program product, and computing system for automating an intake process is executed on a computing device and includes prompting a patient to provide encounter information via a virtual assistant during a pre-visit portion of a patient encounter. Encounter information is obtained from the patient in response to the prompting by the virtual assistant.
Type: Grant
Filed: August 9, 2018
Date of Patent: May 3, 2022
Assignee: NUANCE COMMUNICATIONS, INC.
Inventors: Donald E. Owen, Garret N. Erskine, Mehmet Mert Öz
-
Patent number: 11314799
Abstract: Described herein are technologies that facilitate effective use (e.g., indexing and searching) of non-text machine data (e.g., audio/visual data) with text-based indexes of an event-based machine-data intake and query system.
Type: Grant
Filed: July 29, 2016
Date of Patent: April 26, 2022
Assignee: Splunk Inc.
Inventor: Adam Oliner
-
Patent number: 11315570
Abstract: The technology disclosed relates to a machine learning based speech-to-text transcription intermediary which, from among multiple speech recognition engines, selects a speech recognition engine for accurately transcribing an audio channel based on sound and speech characteristics of the audio channel.
Type: Grant
Filed: April 2, 2019
Date of Patent: April 26, 2022
Assignee: Facebook Technologies, LLC
Inventor: Shamir Allibhai
-
Patent number: 11315569
Abstract: Disclosed is a system for generating a transcript of a meeting using individual audio recordings of speakers in the meeting. The system obtains an audio recording file from each speaker in the meeting, generates a speaker-specific transcript for each speaker using the audio recording of the corresponding speaker, and merges the speaker-specific transcripts to generate a meeting transcript that includes text of the speech from all speakers in the meeting. Because the system generates speaker-specific transcripts from speaker-specific (high quality) audio recordings, the need for "diarization" is removed and the audio quality of each speaker's recording is maximized, leading to virtually lossless recordings and resulting in improved transcription quality and analysis.
Type: Grant
Filed: February 7, 2020
Date of Patent: April 26, 2022
Assignee: Memoria, Inc.
Inventors: Homayoun Talieh, Rémi Berson, Eric Pellish
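The merge step above can be sketched as interleaving time-stamped per-speaker segments; speaker attribution is implicit because each segment comes from that speaker's own recording. This is a sketch under assumed data shapes (speaker name mapped to `(start_seconds, text)` segments), not the patented implementation.

```python
def merge_transcripts(speaker_transcripts):
    """Merge per-speaker transcripts into one meeting transcript.

    speaker_transcripts maps a speaker name to a list of
    (start_seconds, text) segments from that speaker's own recording,
    so no diarization step is needed."""
    merged = []
    for speaker, segments in speaker_transcripts.items():
        merged.extend((start, speaker, text) for start, text in segments)
    merged.sort(key=lambda seg: seg[0])  # interleave by start time
    return [f"{speaker}: {text}" for _, speaker, text in merged]

transcript = merge_transcripts({
    "Ana": [(0.0, "Shall we start?"), (9.2, "Agreed.")],
    "Ben": [(3.5, "Yes, let's review the agenda.")],
})
print(transcript)
```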
-
Patent number: 11308951
Abstract: There is provided an information processing apparatus, an information processing method, and a program capable of providing a more convenient speech recognition service. A desired word in a sentence presented to the user as a speech recognition result is recognized as an edited portion, speech information repeatedly uttered to edit the word of the edited portion is acquired, and speech information other than the repeated utterance is connected to it to generate speech information for speech recognition for editing. Speech recognition is then performed on the generated speech information.
Type: Grant
Filed: January 4, 2018
Date of Patent: April 19, 2022
Assignee: SONY CORPORATION
Inventors: Shinichi Kawano, Yuhei Taki
-
Patent number: 11308938
Abstract: To train a speech recognizer, such as for recognizing variables in a neural speech-to-meaning system, compute, within an embedding space, a range of vectors of features of natural speech. Generate parameter sets for speech synthesis and synthesize speech according to those parameters. Analyze the synthesized speech to compute vectors in the embedding space. Using a cost function that favors an even spread (minimal clustering), generate a multiplicity of speech synthesis parameter sets. Using the multiplicity of parameter sets, generate a multiplicity of speech samples of known words that can be used as training data for speech recognition.
Type: Grant
Filed: December 5, 2019
Date of Patent: April 19, 2022
Assignee: SoundHound, Inc.
Inventors: Maisy Wieman, Jonah Probell, Sudharsan Krishnaswamy
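One way to realize a cost that favors an even spread is to penalize the smallest pairwise distance between embedding vectors and select parameter sets greedily (farthest-point selection). This is an illustrative sketch, not the patent's cost function; `spread_cost` and `select_even_spread` are hypothetical names.

```python
def spread_cost(vectors):
    """Cost that penalizes clustering: the negative of the minimum pairwise
    distance, so lower cost means a more even spread."""
    dists = [sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5
             for i, u in enumerate(vectors) for v in vectors[i + 1:]]
    return -min(dists)

def select_even_spread(candidates, k):
    """Greedily pick k embedding vectors, each time taking the candidate
    farthest from everything already chosen."""
    chosen = [candidates[0]]
    while len(chosen) < k:
        best = max(candidates, key=lambda c: min(
            sum((a - b) ** 2 for a, b in zip(c, s)) ** 0.5 for s in chosen))
        chosen.append(best)
    return chosen

picked = select_even_spread([(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (0.0, 1.0)], 3)
print(picked)  # the near-duplicate (0.1, 0.0) is skipped
```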
-
Patent number: 11308945
Abstract: A hypernym of a word in utterance data may be probabilistically determined. The utterance data may correspond to a spoken query or command. A redacted utterance may be derived by replacing the word with the hypernym. The hypernym may be determined by applying noise to a position in a hierarchical embedding that corresponds to the word. The word may be identified as being potentially sensitive. The hierarchical embedding may be a Hyperbolic embedding that may indicate hierarchical relationships between individual words of a corpus of words, such as "red" is a "color" or "Austin" is in "Texas." Noise may be applied by obtaining a first value in Euclidean space based on a second value in Hyperbolic space, and obtaining a third value in Hyperbolic space based on the first value in Euclidean space. The second value in Hyperbolic space may correspond to the word.
Type: Grant
Filed: September 4, 2019
Date of Patent: April 19, 2022
Assignee: Amazon Technologies, Inc.
Inventors: Thomas Drake, Oluwaseyi Feyisetan, Thomas Diethe
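The Euclidean detour described above can be illustrated with the standard logarithmic and exponential maps at the origin of the Poincaré ball, a common model of Hyperbolic space. The patent does not specify these particular maps or Gaussian noise; this is a sketch under those assumptions, and all names are illustrative.

```python
import math
import random

def log_map0(x):
    """Poincare-ball logarithmic map at the origin: Hyperbolic -> Euclidean."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm == 0:
        return list(x)
    scale = math.atanh(norm) / norm
    return [scale * v for v in x]

def exp_map0(v):
    """Poincare-ball exponential map at the origin: Euclidean -> Hyperbolic."""
    norm = math.sqrt(sum(u * u for u in v))
    if norm == 0:
        return list(v)
    scale = math.tanh(norm) / norm
    return [scale * u for u in v]

def perturb(x, sigma=0.05, rng=random):
    """Apply noise to a Hyperbolic point by detouring through Euclidean
    space: map out (second -> first value), add noise, map back (third)."""
    tangent = log_map0(x)
    noisy = [u + rng.gauss(0.0, sigma) for u in tangent]
    return exp_map0(noisy)

print(perturb([0.3, 0.4]))  # a nearby point, still inside the unit ball
```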
-
Patent number: 11308943
Abstract: An electronic device receives audio data for a media item. The electronic device generates, from the audio data, a plurality of samples, each sample having a predefined maximum length. The electronic device, using a neural network trained to predict character probabilities, generates a probability matrix of characters for a first portion of a first sample of the plurality of samples. The probability matrix includes character information, timing information, and respective probabilities of respective characters at respective times. The electronic device identifies, for the first portion of the first sample, a first sequence of characters based on the generated probability matrix.
Type: Grant
Filed: September 12, 2019
Date of Patent: April 19, 2022
Assignee: Spotify AB
Inventors: Daniel Stoller, Simon René Georges Durand, Sebastian Ewert
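Identifying a character sequence from such a probability matrix is commonly done with CTC-style greedy decoding: take the most probable character per time step, collapse repeats, and drop a blank symbol. The abstract does not name the decoding rule, so this sketch assumes it; the matrix values are made up for illustration.

```python
def greedy_decode(prob_matrix, alphabet, blank="-"):
    """Pick the most probable character per time step, then collapse
    repeats and drop the blank symbol (CTC-style greedy decoding)."""
    best = [max(range(len(alphabet)), key=lambda i: row[i]) for row in prob_matrix]
    out, prev = [], None
    for idx in best:
        ch = alphabet[idx]
        if ch != blank and idx != prev:
            out.append(ch)
        prev = idx
    return "".join(out)

alphabet = ["-", "h", "i"]
matrix = [  # rows: time steps; columns: P(blank), P('h'), P('i')
    [0.1, 0.8, 0.1],
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.2, 0.1, 0.7],
]
print(greedy_decode(matrix, alphabet))  # "hi"
```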
-
Patent number: 11310223
Abstract: An identity authentication method, includes: at an electronic device having one or more processors and memory, the electronic device coupled with a display and one or more input devices: receiving an identity authentication request; in response to receiving the identity authentication request, performing an interactive authentication information exchange between the electronic device and a user, including: displaying, on the display, first visual information in a first manner; displaying, on the display, the first visual information in a second manner that is distinct from the first manner, wherein the first visual information displayed in the second manner includes a timing characteristic that is absent from the first visual information displayed in the first manner; receiving user input entered in accordance with the first visual information displayed in the second manner; and verifying that the user input conforms to the timing characteristic in the first visual information displayed in the second manner.
Type: Grant
Filed: May 26, 2020
Date of Patent: April 19, 2022
Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Inventors: Lu Zheng, Shuai Zhang, Tingting Shang, Rui Rao, Yan Chen, Yaode Huang, Zhenhua Wang
-
Patent number: 11302290
Abstract: Described are various embodiments related to a vehicle device and an electronic device, wherein the vehicle device according to one embodiment can include: a display; a memory; one or more sensors; communication circuitry configured to communicate with an external electronic device; and a processor configured to display first display information according to execution of a first application on a first area on the display, perform control to transfer vehicle-related context information to the electronic device based on information obtained by the one or more sensors and, if information related to a second application corresponding to the vehicle-related context information is received from the electronic device, display second display information associated with the second application on a second area on the display using the received information.
Type: Grant
Filed: January 11, 2018
Date of Patent: April 12, 2022
Assignee: Samsung Electronics Co., Ltd.
Inventors: Yong-Jun Lim, Tae-Young Ha
-
Patent number: 11302305
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for natural language processing. One of the methods includes receiving a voice input from a user device; generating a recognition output; receiving a user selection of one or more terms in the recognition output; receiving a user input of one or more letters replacing the user-selected one or more terms; determining suggested correction candidates based in part on the user input and the voice input; and providing one or more suggested correction candidates to the user device as suggested corrected recognition outputs.
Type: Grant
Filed: May 14, 2020
Date of Patent: April 12, 2022
Assignee: Google LLC
Inventors: Evgeny A. Cherepanov, Jakob Nicolaus Foerster, Vikram Sridar, Ishai Rabinovitz, Omer Tabach
-
Patent number: 11302313
Abstract: Systems and methods for speech recognition are provided. The method may include obtaining a plurality of candidate recognition results of speech information uttered by a user and a plurality of preliminary scores corresponding to the plurality of candidate recognition results, respectively. The method may further include, for each of the plurality of candidate recognition results, extracting one or more keywords from the candidate recognition result and determining at least one parameter associated with the one or more extracted keywords. The method may further include, for each of the plurality of candidate recognition results, generating an updating coefficient based on the at least one parameter and updating the preliminary score based on the updating coefficient to generate an updated score. The method may further include determining, from the plurality of candidate recognition results, a target recognition result based on the plurality of updated scores.
Type: Grant
Filed: December 14, 2019
Date of Patent: April 12, 2022
Assignee: BEIJING DIDI INFINITY TECHNOLOGY AND DEVELOPMENT CO., LTD.
Inventor: Xiulin Li
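The rescoring flow above can be sketched in a few lines. The abstract does not say how the updating coefficient is applied, so a multiplicative update is assumed here; the candidate texts, scores, and coefficients are invented for illustration.

```python
def pick_target(candidates):
    """candidates: list of (text, preliminary_score, updating_coefficient).
    Update each preliminary score with its coefficient (multiplicatively,
    by assumption) and return the candidate with the highest updated score."""
    updated = [(text, score * coeff) for text, score, coeff in candidates]
    return max(updated, key=lambda c: c[1])

best = pick_target([
    ("navigate to the airport", 0.60, 1.3),   # keyword-derived boost
    ("navigate to the air port", 0.65, 1.0),  # no keyword match
])
print(best[0])  # the boosted candidate overtakes the higher preliminary score
```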
-
Patent number: 11301506
Abstract: Automated digital asset tagging techniques and systems are described that support use of multiple vocabulary sets. In one example, a plurality of digital assets are obtained having first-vocabulary tags taken from a first-vocabulary set. Second-vocabulary tags taken from a second-vocabulary set are assigned to the plurality of digital assets through machine learning. A determination is made that at least one first-vocabulary tag includes a plurality of visual classes based on the assignment of at least one second-vocabulary tag. Digital assets are collected from the plurality of digital assets that correspond to one visual class of the plurality of visual classes. A model is then generated using machine learning based on the collected digital assets.
Type: Grant
Filed: June 29, 2017
Date of Patent: April 12, 2022
Assignee: Adobe Inc.
Inventors: Mayur Hemani, Balaji Krishnamurthy
-
Patent number: 11295839
Abstract: A method, computer program product, and computing system for automating a follow-up process is executed on a computing device and includes prompting a patient to provide encounter information via a virtual assistant during a post-visit portion of a patient encounter. Encounter information is obtained from the patient in response to the prompting by the virtual assistant.
Type: Grant
Filed: August 9, 2018
Date of Patent: April 5, 2022
Assignee: NUANCE COMMUNICATIONS, INC.
Inventors: Donald E. Owen, Garret N. Erskine, Mehmet Mert Öz
-
Patent number: 11294474
Abstract: A virtual collaboration system receives input video data including a participant. The system analyzes the input video data to identify a gesture or a movement made by the participant. The system selects an overlay image as a function of the gesture or the movement made by the participant, incorporates the overlay image into the input video data, thereby generating output video data that includes the overlay image, and transmits the output video data to one or more participant devices.
Type: Grant
Filed: February 5, 2021
Date of Patent: April 5, 2022
Assignee: Lenovo (Singapore) Pte. Ltd.
Inventors: Aaron Michael Stewart, Alden Rose, Ellis Anderson