Speech To Image Patents (Class 704/235)
  • Patent number: 11432045
    Abstract: Disclosed is a display device. According to an embodiment, a display device may include a voice signal receiver, a display, at least one memory storing an application supporting a contents providing service and storing instructions, a communication circuit communicating with at least one external server supporting the contents providing service, and at least one processor. The contents providing service may provide contents files of a first type and contents files of a second type.
    Type: Grant
    Filed: February 19, 2019
    Date of Patent: August 30, 2022
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Jibum Moon, Gyungchan Seol, Kyerim Lee
  • Patent number: 11429789
    Abstract: Embodiments relate to an intelligent computer platform to identify and evaluate candidate passage response data in natural language form. Natural language processing is applied to analyze a passage against one or more input tokens to identify matching content. A structure representing the analyzed passage is populated with matching input and passage tokens. A first count of matching token entries and a second count of evaluated token entries are determined and qualified by closeness criteria. An alignment of the passage to a candidate question is calculated, including assessing a ratio of the first and second counts as a confidence value. Matching passage data is returned from the passage with the confidence value.
    Type: Grant
    Filed: June 12, 2019
    Date of Patent: August 30, 2022
    Assignee: International Business Machines Corporation
    Inventors: Stephen A. Boxwell, Keith G. Frost, Kyle M. Brake, Stanley J. Vernier
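The count-and-ratio confidence described in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name and tokenization are hypothetical, and the patent additionally qualifies both counts by closeness criteria, which are omitted here.

```python
def alignment_confidence(passage_tokens, question_tokens):
    """Confidence that a passage aligns with a candidate question,
    modeled as the ratio of matching token entries (first count) to
    evaluated token entries (second count)."""
    question_set = {q.lower() for q in question_tokens}
    evaluated = [t for t in passage_tokens if t.isalnum()]  # tokens qualified for evaluation
    matching = [t for t in evaluated if t.lower() in question_set]
    return len(matching) / len(evaluated) if evaluated else 0.0
```

The returned ratio would accompany the matching passage data as its confidence value.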
  • Patent number: 11430428
    Abstract: The present disclosure describes a method, apparatus, and storage medium for performing speech recognition. The method includes acquiring, by an apparatus, first to-be-processed speech information. The apparatus includes a memory storing instructions and a processor in communication with the memory. The method includes acquiring, by the apparatus, a first pause duration according to the first to-be-processed speech information; and in response to the first pause duration being greater than or equal to a first threshold, performing, by the apparatus, speech recognition on the first to-be-processed speech information to obtain a first result of sentence segmentation of speech, the first result of sentence segmentation of speech being text information, the first threshold being determined according to speech information corresponding to a previous moment.
    Type: Grant
    Filed: September 10, 2020
    Date of Patent: August 30, 2022
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Lianwu Chen, Jingliang Bai, Min Luo
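The pause-triggered segmentation in this abstract reduces to a threshold test, with the threshold itself adapted from the preceding speech. A minimal sketch follows; the adaptation rule in `next_threshold` is a hypothetical illustration, since the abstract only states that the threshold is determined from speech information of a previous moment.

```python
def segment_on_pause(pause_ms, threshold_ms):
    """Run speech recognition to emit a sentence segment only when the
    pause duration reaches the first threshold."""
    return pause_ms >= threshold_ms

def next_threshold(prev_speech_ms, base_ms=500.0, scale=0.1):
    # Hypothetical adaptation: longer recent speech lowers the pause
    # needed before segmenting, floored at 100 ms.
    return max(base_ms - scale * prev_speech_ms, 100.0)
```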
  • Patent number: 11432090
    Abstract: An audio system can be configured to generate an audio heatmap for the audio emission potential profiles for one or more speakers, in specific or arbitrary locations. The audio heatmap may be based on speaker location and orientation, speaker acoustic properties, and optionally environmental properties. The audio heatmap often shows areas of low sound density when there are few speakers, and areas of high sound density when there are many speakers. An audio system may be configured to normalize audio signals for a set of speakers that cooperatively emit sound to render an audio object in a defined audio object location. The audio signals for each speaker can be normalized to ensure accurate rendering of the audio object without volume spikes or dropout.
    Type: Grant
    Filed: January 12, 2021
    Date of Patent: August 30, 2022
    Assignee: SPATIALX INC.
    Inventors: Xavier Prospero, Aric Marshall, Michael Plitkins, Calin Pacurariu
  • Patent number: 11423897
    Abstract: Systems and methods are described herein for generating an adaptive response to a user request. Input indicative of a user request may be received and utilized to identify an item in an electronic catalog. Title segments may be identified from the item's title. Significant segments of the user request may be determined. In response to the user request, a shortened title may be generated from the identified title segments and provided as output at the user device (e.g., via audible output provided at a speaker of the user device, via textual output, or the like). At least one of the title segments provided in the shortened title may correlate to the significant segment identified from the user request. In some embodiments, the length and content of the shortened title may vary based at least in part on the contextual intent of the user's request.
    Type: Grant
    Filed: January 30, 2020
    Date of Patent: August 23, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Ran Levy, Ori Rozen, Leon Portman, Knaan Ratosh, Ido Arad, Hadar Neumann
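The title-shortening idea, keeping segments that correlate with the significant parts of the request, can be sketched as below. Everything here is a simplified assumption: the segment-matching rule, the padding with leading segments, and the fixed segment budget are illustrative, not the patented logic.

```python
def shorten_title(title_segments, significant_segments, max_segments=3):
    """Keep title segments that match a significant segment of the user
    request, padding with remaining segments up to max_segments, and
    preserve the original title order."""
    shortened = [s for s in title_segments
                 if any(q.lower() in s.lower() for q in significant_segments)]
    for seg in title_segments:
        if len(shortened) >= max_segments:
            break
        if seg not in shortened:
            shortened.append(seg)
    shortened.sort(key=title_segments.index)  # restore title order
    return " ".join(shortened[:max_segments])
```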
  • Patent number: 11423236
    Abstract: A method for identifying phrases in a text document having a similar discourse to a candidate phrase includes separating text in a document file into a plurality of phrases and generating a plurality of embedding vectors in a textual embedding space by inputting the plurality of phrases into an embedding engine. A mapping is generated of each embedding vector in the textual embedding space to its corresponding phrase and the document location of that phrase in the document file. A candidate phrase is received from a user and a candidate embedding vector is generated using the embedding engine. Similarity scores are computed based on the embedding space distances between the candidate embedding vector and each embedding vector in the textual embedding space. A listing of the phrases with the highest similarity scores is outputted with their respective document locations in the text.
    Type: Grant
    Filed: June 12, 2020
    Date of Patent: August 23, 2022
    Assignee: Capital One Services, LLC
    Inventors: Austin Walters, Vincent Pham, Ernest Kwak, Galen Rafferty, Reza Farivar, Jeremy Goodsitt, Anh Truong
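The embedding lookup this abstract describes can be sketched in a few lines. The abstract scores by embedding-space distance; cosine similarity is used here as a stand-in metric, and the index shape (phrase mapped to vector and document location) is an assumption.

```python
from math import sqrt

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def most_similar_phrases(candidate_vec, index, top_k=3):
    """index: phrase -> (embedding_vector, document_location).
    Returns the top_k (score, phrase, location) triples."""
    scored = sorted(((cosine_similarity(candidate_vec, vec), phrase, loc)
                     for phrase, (vec, loc) in index.items()), reverse=True)
    return scored[:top_k]
```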
  • Patent number: 11425315
    Abstract: A video communications method is provided, including: respectively displaying video images of at least two terminals in at least two display subareas of a video communication interface in a video chat session of the at least two terminals; obtaining a first special effect display instruction; and adding a first special effect to the at least two display subareas based on the first special effect display instruction. The method also includes transmitting the first special effect display instruction to a second terminal of the at least two terminals, the second terminal being an action recipient of the first special effect; and selecting, among multiple end special effects, a target end special effect to be added to the video images of the at least two terminals according to a body action occurring in the video image of the second terminal.
    Type: Grant
    Filed: February 3, 2020
    Date of Patent: August 23, 2022
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Ying Zhu, Chao Wang, Yinglei Liang, Haoqi Kuang, Lin Shi, Jinjie Wang, Weisong Zhu
  • Patent number: 11425487
    Abstract: Disclosed is a translation system using a sound vibration microphone capable of fundamentally blocking noise by detecting vibrations generated from a user's voice using a piezo element. The translation system using a sound vibration microphone includes a first earset including a sound vibration sensor and a speaker, a first wearable acoustic device including a microphone and a speaker and performing wireless communication with the first earset, and a first electronic communication device performing wireless communication with the first wearable acoustic device and performing communication with a translation server, wherein the first wearable acoustic device and the first electronic communication device are in a communicable state.
    Type: Grant
    Filed: November 23, 2020
    Date of Patent: August 23, 2022
    Assignee: EM-TECH CO., LTD.
    Inventors: Cheon Myeong Kim, Byung Min Yu, Jae Hwa Lim, Jeong Ae Ha, Ji Young Lee
  • Patent number: 11423114
    Abstract: Systems for automatically suggesting content items to an author of a digital page so that they can be added to the page. One embodiment comprises a system having a page editor that launches a component browser to enable viewing of content items stored in a repository. A web content management server extracts textual information from a digital page that is open for editing and analyzes the information to identify words indicative of the page's subject matter. The system generates a query using the identified words and a search engine searches the repository for items associated with the identified words. The content items identified by the search are displayed by the component browser and are selectable to add particular suggested content items to the page.
    Type: Grant
    Filed: November 7, 2019
    Date of Patent: August 23, 2022
    Assignee: OPEN TEXT HOLDINGS, INC.
    Inventors: Paul Kieron Carl Narth, Gregory Aaron Parmiter, Jason Jackson, Partha Saradhi Govind
  • Patent number: 11423073
    Abstract: An audio-visual management system and process for facilitating comprehension of electronic documents. As an example, a user may access contents of an electronic document via a user interface and request a speech-synthesized recitation of any text in the electronic document. In addition, the user interface can include a navigation tool configured to visually represent significant events in the electronic document via selectable indicators, including but not limited to comments, slides, images, graphs, and/or other internal contextual markers. These indicators can serve as navigation options that can provide guidance to users as they listen to a reading of the electronic document. User selection of an indicator can initiate an audio presentation of textual content that is directly associated with the specified event. In addition, users can view availability of others currently listening to the document and connect with them to create a joint listening experience.
    Type: Grant
    Filed: November 16, 2018
    Date of Patent: August 23, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Erez Kikin-Gil, Benjamin David Smith, Colleen Griffiths Estrada
  • Patent number: 11417340
    Abstract: Electronic conferences are a common method of conducting a meeting when the participants are not at the same location. When a conference drops a portion of the audio, such as a portion carrying speech from a speaker, participants may be excluded from the full content of the conference. By having a device associated with a speaking participant convey both audio and generated text from the speech provided by the speaking participant, a server may be able to determine that speech is missing from the audio portion and automatically insert text content. As a result, poor audio quality may be mitigated with text, while unwanted text is omitted when the audio quality is sufficient.
    Type: Grant
    Filed: December 16, 2019
    Date of Patent: August 16, 2022
    Assignee: Avaya Inc.
    Inventors: Darren Holmes, Jibin George, Salil Dhawan, Sandeep Goynar, Harsimran Jeet Singh
  • Patent number: 11418644
    Abstract: Systems and methods for making a multitrack recording of a telephony communication, such as a conference call, record the contributions of each participant in its own respective, separate recording track. In some instances, the contributions of one or more participants are recorded in separate recording tracks, while the contributions of multiple other participants are mixed and recorded in a single recording track. An organizer or administrator of a telephony communication, such as a conference call, can instruct a multitrack recording system as to how to format a multitrack recording of the telephony communication via commands submitted through an application programming interface (API).
    Type: Grant
    Filed: June 28, 2019
    Date of Patent: August 16, 2022
    Assignee: Vonage Business Inc.
    Inventors: Jonathan Farrow, Dragos Oancea
  • Patent number: 11410644
    Abstract: Systems and methods for generating training data for a supervised topic modeling system from outputs of a topic discovery model are described herein. In an embodiment, a system receives a plurality of digitally stored call transcripts and, using a topic model, generates an output which identifies a plurality of topics represented in the plurality of digitally stored call transcripts. Using the output of the topic model, the system generates an input dataset for a supervised learning model by identifying a first subset of the plurality of digitally stored call transcripts that include a particular topic, storing a positive value for the first subset, identifying a second subset that does not include the particular topic, and storing a negative value for the second subset. The input dataset is then used to train a supervised learning model.
    Type: Grant
    Filed: August 18, 2020
    Date of Patent: August 9, 2022
    Assignee: INVOCA, INC.
    Inventors: Michael McCourt, Anoop Praturu
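The positive/negative labeling step this abstract describes is straightforward to sketch. The data shapes below (transcript ids mapped to text, and to the topic sets an unsupervised model discovered) are assumptions for illustration, not the patented system's representation.

```python
def build_supervised_dataset(transcripts, topic_output, target_topic):
    """transcripts: id -> transcript text; topic_output: id -> set of
    topics the discovery model found in that transcript. Returns
    (text, label) pairs: 1 (positive) if the transcript contains the
    target topic, 0 (negative) otherwise."""
    return [(text, 1 if target_topic in topic_output.get(tid, set()) else 0)
            for tid, text in transcripts.items()]
```

The resulting pairs would then feed a standard supervised classifier.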
  • Patent number: 11410426
    Abstract: In non-limiting examples of the present disclosure, systems, methods and devices for generating summary content are presented. Voice audio data and video data for an electronic meeting may be received. A language processing model may be applied to a transcript of the audio data and textual importance scores may be calculated. A video/image model may be applied to the video data and visual importance scores may be calculated. A combined importance score may be calculated for sections of the electronic meeting based on the textual importance scores and the visual importance scores. A meeting summary that includes summary content from sections for which combined importance scores exceed a threshold value may be generated.
    Type: Grant
    Filed: June 4, 2020
    Date of Patent: August 9, 2022
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Erez Kikin-Gil, Daniel Yancy Parish
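The score combination and thresholding in this abstract can be sketched as below. The weighted-sum combination and the weight values are assumptions; the abstract says only that a combined score is calculated from the textual and visual importance scores and compared to a threshold.

```python
def summarize_meeting(sections, w_text=0.5, w_visual=0.5, threshold=0.6):
    """sections: (label, textual_importance, visual_importance) triples.
    Keep sections whose combined importance score exceeds the threshold."""
    return [label for label, text_score, visual_score in sections
            if w_text * text_score + w_visual * visual_score > threshold]
```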
  • Patent number: 11409804
    Abstract: The present disclosure relates to a data analysis method and a data analysis system thereof. The data analysis method includes steps of: receiving first learning content data, and adding multiple first segmentation marks to the first learning content data to divide it into multiple first learning sections, the first learning sections being arranged according to a time axis; searching a first keyword string corresponding to each first learning section from the first learning sections; receiving an analysis command, and analyzing the analysis command against the first keyword string of each first learning section to obtain multiple first similarities corresponding to the analysis command and each of the first learning sections; and finally, searching for the first learning section with the highest similarity.
    Type: Grant
    Filed: August 2, 2019
    Date of Patent: August 9, 2022
    Assignee: DELTA ELECTRONICS, INC.
    Inventors: Shih-Gong Li, Shih-Han Chan, Chao-Hsuan Ko, Guo-Cheng Lan
  • Patent number: 11411961
    Abstract: An electronic device is provided. The electronic device includes a user interface, a location sensor configured to sense a location of the electronic device, a processor electrically connected with the user interface and the location sensor, and a memory electrically connected with the processor and configured to store a first application program and a second application program. The memory is further configured to store instructions that, when executed, enable the processor to receive first location data with a first degree of accuracy regarding the location of the electronic device from the location sensor, process at least part of the first location data to generate second location data with a second degree of accuracy lower than the first degree of accuracy regarding the location of the electronic device, provide the at least part of the first location data to execute the first application program, and provide at least part of the second location data to execute the second application program.
    Type: Grant
    Filed: August 6, 2021
    Date of Patent: August 9, 2022
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Jin-Ho Moon, Jong-Eun Yang, Jae-Yung Yeo
  • Patent number: 11404052
    Abstract: In a service data processing method performed by a server, user speech information collected by a first terminal is received. A target service operation code according to the user speech information is obtained. The target service operation code is used for identifying target service operation information. The target service operation code is transmitted from the server to the first terminal, so that the first terminal plays the target service operation code by using a speech. The target service operation code obtained by a second terminal is received. A target execution page corresponding to the target service operation code is searched for. The target execution page is transmitted to the second terminal, so that the second terminal executes a service operation corresponding to the target service operation information.
    Type: Grant
    Filed: September 17, 2020
    Date of Patent: August 2, 2022
    Assignee: Tencent Technology (Shenzhen) Company Limited
    Inventors: Jinglin Ma, Xuewei Fang
  • Patent number: 11404041
    Abstract: A message management unit receives and accumulates a message, wherein the message is distributed for every update and is message data representing a latest situation of a competition. An explanation generation unit generates an explanatory text for conveying unconveyed information detected from the message, based on conveyed information, and a speech synthesis unit outputs a speech converted from the explanatory text. The explanation generation unit stores the unconveyed information for the explanatory text as the conveyed information, stands by until completion of the speech, and initiates a procedure for generating a new explanatory text based on updated unconveyed information.
    Type: Grant
    Filed: May 23, 2018
    Date of Patent: August 2, 2022
    Assignees: NIPPON HOSO KYOKAI, NHK Engineering System, Inc.
    Inventors: Tadashi Kumano, Ichiro Yamada, Atsushi Imai, Hideki Sumiyoshi, Yuko Yamanouchi, Toshihiro Shimizu, Nobumasa Seiyama, Shoei Sato, Reiko Saito, Taro Miyazaki, Kiyoshi Kurihara, Manon Ichiki, Tohru Takagi
  • Patent number: 11404058
    Abstract: Systems and methods for e-commerce systems using natural language understanding are described. A computing device is configured to receive a user utterance including at least one identified semantic component and at least one missing semantic component and generate a context stack including a set of context entries. Each of the context entries includes a root intent element, an entity list element, and a dialogue stack, and each context entry in the set of context entries is associated with one of a user utterance or a system utterance. The computing device is further configured to review at least one context entry in the set of context entries to locate the at least one missing semantic component within the dialogue stack and generate an intent flow execution request including the at least one identified semantic component from the first speech data and the missing semantic component.
    Type: Grant
    Filed: October 31, 2018
    Date of Patent: August 2, 2022
    Assignee: Walmart Apollo, LLC
    Inventors: Snehasish Mukherjee, Shankara Bhargava Subramanya
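The context-stack lookup this abstract describes, finding a missing semantic component in earlier dialogue turns, can be sketched as follows. The entry shape (a dict with an `entity_list` of slot/value pairs) and the most-recent-first search order are assumptions based on the abstract's root intent / entity list / dialogue stack description.

```python
def resolve_missing(context_stack, missing_slot):
    """Walk the context stack from the most recent entry and return the
    first entity value that fills the missing semantic slot."""
    for entry in reversed(context_stack):
        for entity in entry.get("entity_list", []):
            if entity["slot"] == missing_slot:
                return entity["value"]
    return None  # slot not resolvable from context
```

For example, "add it to my cart" could recover the product entity from the previous "show me red sneakers" turn.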
  • Patent number: 11398218
    Abstract: Techniques are described for providing dynamically configured speech output, through which text data from a message is presented as speech output through a text-to-speech (TTS) engine that employs a voice profile to provide a machine-generated voice that approximates that of the sender of the message. The sender can also indicate the type of voice they would prefer the TTS engine use to render their text to a recipient, and the voice to be used can be specified in a sender's user profile, as a preference or attribute of the sending user. In some examples, the voice profile to be used can be indicated as metadata included in the message. A voice profile can specify voice attributes such as the tone, pitch, register, timbre, pacing, gender, accent, and so forth. A voice profile can be generated through a machine learning (ML) process.
    Type: Grant
    Filed: April 25, 2019
    Date of Patent: July 26, 2022
    Assignee: United Services Automobile Association (USAA)
    Inventors: Justin Dax Haslam, Robert Wilson Barner
  • Patent number: 11386902
    Abstract: Embodiments of the present invention provide systems and methods for generation and maintenance of verified data records. The system may receive a data submission from a user device over one or more communication channels and convert the data submission into a normalized text format for processing and analysis. The data submission may then be analyzed using one or more trained machine learning models in order to identify factual statements and modifiers within the data submission, and generate a confidence score of verified factual information based on corroboration with one or more additional data sources. Additionally, identified modifiers may be analyzed to determine positive or negative sentiment.
    Type: Grant
    Filed: April 28, 2020
    Date of Patent: July 12, 2022
    Assignee: BANK OF AMERICA CORPORATION
    Inventors: Kelly Renee-Drop Keiter, Christopher Daniel Birch, Susan R. Hart, Lisa Matthews, Cody Dean Searl
  • Patent number: 11373672
    Abstract: Disclosed are devices, systems, apparatus, methods, products, and other implementations, including a method comprising obtaining, by a device, a combined sound signal for signals combined from multiple sound sources in an area in which a person is located, and applying, by the device, speech-separation processing (e.g., deep attractor network (DAN) processing, online DAN processing, LSTM-TasNet processing, Conv-TasNet processing), to the combined sound signal from the multiple sound sources to derive a plurality of separated signals that each contains signals corresponding to different groups of the multiple sound sources. The method further includes obtaining, by the device, neural signals for the person, the neural signals being indicative of one or more of the multiple sound sources the person is attentive to, and selecting one of the plurality of separated signals based on the obtained neural signals. The selected signal may then be processed (amplified, attenuated).
    Type: Grant
    Filed: October 24, 2018
    Date of Patent: June 28, 2022
    Assignee: The Trustees of Columbia University in the City of New York
    Inventors: Nima Mesgarani, Yi Luo, James O'Sullivan, Zhuo Chen
  • Patent number: 11373634
    Abstract: An electronic device secures diversity of user utterances with respect to a content name when a user searches for content through a display device by voice. A method by an electronic device includes steps of receiving input of a user voice, acquiring a keyword related to content included in the user voice, acquiring at least one modified keyword based on the keyword, acquiring a plurality of search results corresponding to the keyword and the at least one modified keyword, comparing the keyword and the modified keyword with the plurality of search results to acquire a content name corresponding to the keyword, and updating a database of content names based on the keyword, the modified keyword, and the acquired content name.
    Type: Grant
    Filed: October 30, 2019
    Date of Patent: June 28, 2022
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Jiwon Yoo, Jihun Park
  • Patent number: 11372608
    Abstract: A machine includes a processor and a memory connected to the processor. The memory stores instructions executed by the processor to receive a message and a message parameter indicative of a characteristic of the message, where the message includes a photograph or a video. A determination is made that the message parameter corresponds to a selected gallery, where the selected gallery includes a sequence of photographs or videos. The message is posted to the selected gallery in response to the determination. The selected gallery is supplied in response to a request.
    Type: Grant
    Filed: October 24, 2019
    Date of Patent: June 28, 2022
    Assignee: Snap Inc.
    Inventor: Timothy Sehn
  • Patent number: 11368581
    Abstract: A method to transcribe communications includes obtaining an audio message from a first device during a voice communication session with a second device that includes a display screen; providing the message to a first speech recognition system to generate a first message transcript; and providing the transcript to the second device for presentation on the screen. Upon obtaining an indication that a transcript quality is below a threshold, the method includes providing the message to a second system to generate a second transcript while still providing it to the first system to generate the first transcript, and providing the first transcript to the second device for presentation on the display screen. In response to an event occurring that indicates the second transcript is to be provided to the second device instead of the first transcript, the second transcript is provided to the second device for presentation on the screen instead of the first transcript.
    Type: Grant
    Filed: September 11, 2020
    Date of Patent: June 21, 2022
    Assignee: Ultratec, Inc.
    Inventors: Robert M. Engelke, Kevin R. Colwell, Christopher Engelke
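The fallback logic this abstract describes, running a second recognizer in parallel once quality drops but switching only on a later event, reduces to a small selection rule. This is a simplified sketch with hypothetical names; the patent's triggering event is left abstract here.

```python
def select_transcript(first, second, quality_below_threshold, switch_event_occurred):
    """Keep presenting the first system's transcript even after quality
    drops (both systems run in parallel); switch to the second system's
    transcript only when the switch event occurs."""
    if quality_below_threshold and switch_event_occurred:
        return second
    return first
```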
  • Patent number: 11366569
    Abstract: The disclosure relates to an interactive interface display method, apparatus, and storage medium. The method includes displaying an information display interface including a call entry of an intelligent interactive application; calling the intelligent interactive application when a trigger operation on the call entry is detected; displaying a first dynamic effect in which the call entry moves in the information display interface; and displaying an interactive interface of the intelligent interactive application after displaying the first dynamic effect in which the call entry moves in the information display interface.
    Type: Grant
    Filed: March 12, 2020
    Date of Patent: June 21, 2022
    Assignee: BEIJING XIAOMI MOBILE SOFTWARE CO., LTD.
    Inventors: Yuhang Xia, Zekui Li
  • Patent number: 11361491
    Abstract: The present invention relates to a method of generating a facial expression of a user for a virtual environment. The method comprises obtaining a video and an associated speech of the user. Further, extracting in real-time at least one of one or more voice features and one or more text features based on the speech. Furthermore, identifying one or more phonemes in the speech. Thereafter, determining one or more facial features relating to the speech of the user using a pre-trained second learning model based on the one or more voice features, the one or more phonemes, the video and one or more previously generated facial features of the user. Finally, generating the facial expression of the user corresponding to the speech for an avatar representing the user in the virtual environment.
    Type: Grant
    Filed: September 3, 2020
    Date of Patent: June 14, 2022
    Assignee: Wipro Limited
    Inventors: Vivek Kumar Varma Nadimpalli, Gopichand Agnihotram
  • Patent number: 11356492
    Abstract: Embodiments of the present invention provide methods, computer program products, and systems. Embodiments of the present invention detect an audio stream comprising one or more voice packets from a first computing system. Embodiments of the present invention can, in response to detecting an audio stream, dynamically prevent audio drop out on a second computing system using circular buffers based on network consistency.
    Type: Grant
    Filed: September 16, 2020
    Date of Patent: June 7, 2022
    Assignee: Kyndryl, Inc.
    Inventors: Tiberiu Suto, Nadiya Kochura, Vinod A. Valecha
  • Patent number: 11354516
    Abstract: An information processor includes a generation section that generates a specified character string on the basis of at least one of voice information corresponding to a content of speech detected by a voice detection section and vehicle information acquired from a vehicle. With this configuration, a user can input the specified character string, which is a hashtag, without an operation. Thus, compared to the related art in which the hashtag is generated on the basis of the operation (manual input) by the user, a burden on the user can significantly be reduced, and an input error can be prevented.
    Type: Grant
    Filed: November 22, 2019
    Date of Patent: June 7, 2022
    Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA
    Inventors: Ryotaro Fujiwara, Keiko Suzuki, Makoto Honda, Chikage Kubo, Ryota Okubi, Takeshi Fujiki
  • Patent number: 11350148
    Abstract: Aspects of the subject disclosure may include, for example, modifying a user profile associated with a user associated with a content service to generate an updated user profile according to consumption of media content by the user and user feedback information associated with the consumption of the media content, determining a user context according to information associated with a user device, where the user context includes current activity of the user, modifying a set of media content according to the user context that is determined to generate an updated set of media content, where a type of media content is eliminated from the set of media content in the updated set of media content according to the user context, and presenting the updated set of media content at a presentation device of the user via a personal media channel of the user associated with the content service. Other embodiments are disclosed.
    Type: Grant
    Filed: October 29, 2020
    Date of Patent: May 31, 2022
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Zhu Liu, Eric Zavesky, Bernard S. Renger, Behzad Shahraray, David Crawford Gibbon, Tan Xu, Lee Begeja, Raghuraman Gopalan
  • Patent number: 11341707
    Abstract: A method and system for transforming simple user input into customizable animated images for use in text-messaging applications.
    Type: Grant
    Filed: March 23, 2021
    Date of Patent: May 24, 2022
    Assignee: EMONSTER INC
    Inventor: Enrique Bonansea
  • Patent number: 11341331
    Abstract: An intelligent speech assistant receives information collected while a user is speaking. The information can comprise speech data, vision data, or both, where the speech data is from the user speaking and the vision data is of the user while speaking. The assistant evaluates the speech data against a script which can contain information that the user should speak, information that the user should not speak, or both. The assistant collects instances where the user utters phrases that match the script or instances where the user utters phrases that do not match the script, depending on whether phrases should or should not be spoken. The assistant evaluates vision data to identify gestures, facial expressions, and/or emotions of the user. Instances where the gestures, facial expressions, and/or emotions are not appropriate to the context are flagged. Real-time prompts and/or a summary is presented to the user as feedback.
    Type: Grant
    Filed: October 4, 2019
    Date of Patent: May 24, 2022
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Huakai Liao, Priyanka Vikram Sinha, Kevin Dara Khieu, Derek Martin Johnson, Siliang Kang, Huey-Ru Tsai, Amit Srivastava
  • Patent number: 11341957
    Abstract: A method for detecting a keyword, applied to a terminal, includes: extracting a speech eigenvector of a speech signal; obtaining, according to the speech eigenvector, a posterior probability of each target character being a key character in any keyword in an acquisition time period of the speech signal; obtaining confidences of at least two target character combinations according to the posterior probability of each target character; and determining that the speech signal includes the keyword upon determining that all the confidences of the at least two target character combinations meet a preset condition. The target character is a character in the speech signal whose pronunciation matches a pronunciation of the key character. Each target character combination includes at least one target character, and a confidence of a target character combination represents a probability of the target character combination being the keyword or a part of the keyword.
    Type: Grant
    Filed: July 20, 2020
    Date of Patent: May 24, 2022
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Yi Gao, Meng Yu, Dan Su, Jie Chen, Min Luo
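The confidence scoring in the Tencent keyword-detection abstract above can be sketched in a few lines. This is a minimal illustration, not the patented method: it assumes per-character posteriors have already been obtained from the speech eigenvector, and uses a geometric mean over character combinations as a stand-in for the patent's unspecified confidence formula.

```python
# Sketch of combination-confidence keyword detection (hypothetical
# scoring: geometric mean of member posteriors per combination).
from itertools import combinations
from math import prod

def keyword_detected(posteriors, threshold=0.5):
    """posteriors: dict mapping target character -> max posterior
    probability observed during the acquisition time period."""
    chars = list(posteriors)
    # Enumerate all non-empty combinations of target characters.
    combos = [c for r in range(1, len(chars) + 1)
              for c in combinations(chars, r)]
    # Confidence of a combination: geometric mean of its posteriors.
    confidences = [prod(posteriors[ch] for ch in combo) ** (1 / len(combo))
                   for combo in combos]
    # The keyword is detected only if every combination clears the bar.
    return all(conf >= threshold for conf in confidences)

print(keyword_detected({"n": 0.9, "i": 0.8}))  # True
print(keyword_detected({"n": 0.9, "i": 0.1}))  # False
```

Requiring all combinations to pass (rather than any single character) is what makes the scheme robust against one spuriously confident character.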
  • Patent number: 11334315
    Abstract: Systems, methods, and devices for human-machine interfaces for utterance-based playlist selection are disclosed. In one method, a list of playlists is traversed and a portion of each is audibly output until a playlist command is received. Based on the playlist command, the traversing is stopped and a playlist is selected for playback. In examples, the list of playlists is modified based on a modification input.
    Type: Grant
    Filed: July 8, 2019
    Date of Patent: May 17, 2022
    Assignee: Spotify AB
    Inventors: Daniel Bromand, Richard Mitic, Horia-Dragos Jurcut, Henriette Susanne Martine Cramer, Ruth Brillman
  • Patent number: 11335360
    Abstract: In one aspect, a device includes at least one processor and storage accessible to the at least one processor. The storage includes instructions executable by the at least one processor to analyze the decibel levels of audio of a user's speech. The instructions are executable to, based on the analysis, enhance a transcript of the user's speech with indications of particular words from the user's speech as being associated with one or more emotions of the user.
    Type: Grant
    Filed: September 21, 2019
    Date of Patent: May 17, 2022
    Assignee: Lenovo (Singapore) Pte. Ltd.
    Inventors: Johnathan Co Lee, Jonathan Jen-Wei Yu
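The decibel-level analysis described in the Lenovo abstract above can be approximated with a short sketch. The word-to-audio alignment and the emphasis rule (flag words noticeably louder than the utterance average) are assumptions for illustration; the patent does not specify them.

```python
# Sketch: flag words whose RMS level (in dB) stands out from the
# utterance average, as a proxy for emotion/emphasis markup.
import math

def emphasized_words(word_samples, margin_db=6.0):
    """word_samples: list of (word, samples), samples being floats in
    [-1, 1] already aligned to each word (hypothetical input format)."""
    def rms_db(samples):
        rms = math.sqrt(sum(s * s for s in samples) / len(samples))
        return 20 * math.log10(max(rms, 1e-9))  # floor avoids log(0)
    levels = [(w, rms_db(s)) for w, s in word_samples]
    mean = sum(db for _, db in levels) / len(levels)
    # Words well above the mean level get flagged for enhancement,
    # e.g. rendering the transcript as "I *really* mean it".
    return [w for w, db in levels if db >= mean + margin_db]

loud = emphasized_words([("really", [0.9, -0.9]), ("it", [0.05, -0.05])])
print(loud)  # ['really']
```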
  • Patent number: 11322231
    Abstract: A method, computer program product, and computing system for automating an intake process is executed on a computing device and includes prompting a patient to provide encounter information via a virtual assistant during a pre-visit portion of a patient encounter. Encounter information is obtained from the patient in response to the prompting by the virtual assistant.
    Type: Grant
    Filed: August 9, 2018
    Date of Patent: May 3, 2022
    Assignee: NUANCE COMMUNICATIONS, INC.
    Inventors: Donald E. Owen, Garret N. Erskine, Mehmet Mert Öz
  • Patent number: 11314799
    Abstract: Described herein are technologies that facilitate effective use (e.g., indexing and searching) of non-text machine data (e.g., audio/visual data) with text-based indexes of an event-based machine-data intake and query system.
    Type: Grant
    Filed: July 29, 2016
    Date of Patent: April 26, 2022
    Assignee: Splunk Inc.
    Inventor: Adam Oliner
  • Patent number: 11315570
    Abstract: The technology disclosed relates to a machine learning based speech-to-text transcription intermediary which, from among multiple speech recognition engines, selects a speech recognition engine for accurately transcribing an audio channel based on sound and speech characteristics of the audio channel.
    Type: Grant
    Filed: April 2, 2019
    Date of Patent: April 26, 2022
    Assignee: Facebook Technologies, LLC
    Inventor: Shamir Allibhai
  • Patent number: 11315569
    Abstract: Disclosed is a system for generating a transcript of a meeting using individual audio recordings of speakers in the meeting. The system obtains an audio recording file from each speaker in the meeting, generates a speaker-specific transcript for each speaker using the audio recording of the corresponding speaker, and merges the speaker-specific transcripts to generate a meeting transcript that includes text of the speech from all speakers in the meeting. As the system generates speaker-specific transcripts using speaker-specific (high quality) audio recordings, the need for "diarization" is removed, the audio quality of the recording of each speaker is maximized, leading to virtually lossless recordings and resulting in improved transcription quality and analysis.
    Type: Grant
    Filed: February 7, 2020
    Date of Patent: April 26, 2022
    Assignee: Memoria, Inc.
    Inventors: Homayoun Talieh, Rémi Berson, Eric Pellish
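The merge step in the Memoria abstract above is essentially a k-way merge of time-stamped segments, which removes the need for diarization because each segment already carries its speaker. A minimal sketch, assuming a hypothetical segment format of (start_seconds, text) per speaker:

```python
# Sketch: merge speaker-specific transcripts into one chronological
# meeting transcript via a k-way merge on segment start times.
import heapq

def merge_transcripts(per_speaker):
    """per_speaker: dict speaker -> list of (start_seconds, text),
    each list already in chronological order."""
    streams = [[(t, spk, txt) for t, txt in segs]
               for spk, segs in per_speaker.items()]
    # heapq.merge interleaves the sorted streams by start time.
    return [f"{spk}: {txt}" for _, spk, txt in heapq.merge(*streams)]

merged = merge_transcripts({
    "alice": [(0.0, "Hi all."), (4.2, "Agreed.")],
    "bob":   [(1.5, "Morning.")],
})
print(merged)  # ['alice: Hi all.', 'bob: Morning.', 'alice: Agreed.']
```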
  • Patent number: 11308951
    Abstract: There is provided an information processing apparatus, an information processing method, and a program capable of providing a more convenient speech recognition service. A desired word in a sentence presented to a user as a speech recognition result is recognized as an edited portion; speech information repeatedly uttered to edit the word of the edited portion is acquired; and speech information other than the repeated utterance is connected to that speech information, generating speech information for speech recognition for editing. Speech recognition is then performed on the generated speech information.
    Type: Grant
    Filed: January 4, 2018
    Date of Patent: April 19, 2022
    Assignee: SONY CORPORATION
    Inventors: Shinichi Kawano, Yuhei Taki
  • Patent number: 11308938
    Abstract: To train a speech recognizer, such as for recognizing variables in a neural speech-to-meaning system, compute, within an embedding space, a range of vectors of features of natural speech. Generate parameter sets for speech synthesis and synthesize speech according to the parameters. Analyze the synthesized speech to compute vectors in the embedding space. Using a cost function that favors an even spread (minimal clustering), generate a multiplicity of speech synthesis parameter sets. Using the multiplicity of parameter sets, generate a multiplicity of speech samples of known words that can be used as training data for speech recognition.
    Type: Grant
    Filed: December 5, 2019
    Date of Patent: April 19, 2022
    Assignee: SoundHound, Inc.
    Inventors: Maisy Wieman, Jonah Probell, Sudharsan Krishnaswamy
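One way to realize the "even spread, minimal clustering" objective in the SoundHound abstract above is farthest-point selection in the embedding space. This is an assumed stand-in for the patent's unspecified cost function, shown only to make the idea concrete:

```python
# Sketch: greedily pick synthesis parameter sets whose embeddings are
# maximally spread out (farthest-point selection), as a hypothetical
# proxy for a cost function that penalizes clustering.

def spread_select(candidates, embed, k):
    """candidates: parameter sets; embed(p) -> tuple of floats;
    k: number of parameter sets to keep."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    chosen = [candidates[0]]  # arbitrary deterministic seed
    while len(chosen) < k:
        # Take the candidate farthest from everything chosen so far.
        best = max((c for c in candidates if c not in chosen),
                   key=lambda c: min(dist(embed(c), embed(p))
                                     for p in chosen))
        chosen.append(best)
    return chosen

picked = spread_select([(0,), (1,), (10,)], lambda p: p, 2)
print(picked)  # [(0,), (10,)]
```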
  • Patent number: 11308945
    Abstract: A hypernym of a word in utterance data may be probabilistically determined. The utterance data may correspond to a spoken query or command. A redacted utterance may be derived by replacing the word with the hypernym. The hypernym may be determined by applying noise to a position in a hierarchical embedding that corresponds to the word. The word may be identified as being potentially sensitive. The hierarchical embedding may be a Hyperbolic embedding that may indicate hierarchical relationships between individual words of a corpus of words, such as “red” is a “color” or “Austin” is in “Texas.” Noise may be applied by obtaining a first value in Euclidean space based on a second value in Hyperbolic space, and obtaining a third value in Hyperbolic space based on the first value in Euclidean space. The second value in Hyperbolic space may correspond to the word.
    Type: Grant
    Filed: September 4, 2019
    Date of Patent: April 19, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Thomas Drake, Oluwaseyi Feyisetan, Thomas Diethe
  • Patent number: 11308943
    Abstract: An electronic device receives audio data for a media item. The electronic device generates, from the audio data, a plurality of samples, each sample having a predefined maximum length. The electronic device, using a neural network trained to predict character probabilities, generates a probability matrix of characters for a first portion of a first sample of the plurality of samples. The probability matrix includes character information, timing information, and respective probabilities of respective characters at respective times. The electronic device identifies, for the first portion of the first sample, a first sequence of characters based on the generated probability matrix.
    Type: Grant
    Filed: September 12, 2019
    Date of Patent: April 19, 2022
    Assignee: Spotify AB
    Inventors: Daniel Stoller, Simon René Georges Durand, Sebastian Ewert
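Identifying a character sequence from the probability matrix in the Spotify abstract above can be done with a greedy decode. The blank symbol and repeat-collapsing below are CTC-style assumptions; the patent only specifies per-time character probabilities.

```python
# Sketch: greedy decode of a (time x vocab) character probability
# matrix, collapsing repeats and dropping a CTC-style blank symbol.

def greedy_decode(prob_matrix, vocab, blank="-"):
    """prob_matrix: one row per time step; row[i] is the probability
    of vocab[i] at that time."""
    # Most probable character at each time step.
    best = [vocab[max(range(len(row)), key=row.__getitem__)]
            for row in prob_matrix]
    out, prev = [], None
    for ch in best:
        if ch != prev and ch != blank:  # collapse repeats, skip blanks
            out.append(ch)
        prev = ch
    return "".join(out)

vocab = ["-", "h", "i"]
probs = [[0.1, 0.8, 0.1],   # h
         [0.1, 0.7, 0.2],   # h (repeat, collapsed)
         [0.8, 0.1, 0.1],   # blank
         [0.1, 0.1, 0.8]]   # i
print(greedy_decode(probs, vocab))  # hi
```

A beam search over the same matrix would trade this sketch's speed for better sequence-level probability.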
  • Patent number: 11310223
    Abstract: An identity authentication method, includes: at an electronic device having one or more processors and memory, the electronic device coupled with a display and one or more input devices: receiving an identity authentication request; in response to receiving the identity authentication request, performing an interactive authentication information exchange between the electronic device and a user, including: displaying, on the display, first visual information in a first manner; displaying, on the display, the first visual information in a second manner that is distinct from the first manner, wherein the first visual information displayed in the second manner includes a timing characteristic that is absent from the first visual information displayed in the first manner; receiving user input entered in accordance with the first visual information displayed in the second manner; and verifying that the user input conforms to the timing characteristic in the first visual information displayed in the second manner.
    Type: Grant
    Filed: May 26, 2020
    Date of Patent: April 19, 2022
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Lu Zheng, Shuai Zhang, Tingting Shang, Rui Rao, Yan Chen, Yaode Huang, Zhenhua Wang
  • Patent number: 11302290
    Abstract: Described are various embodiments related to a vehicle device and an electronic device, wherein the vehicle device according to one embodiment can include: a display; a memory; at least one or more sensors; communication circuitry configured to communicate with an external electronic device; and a processor configured to display first display information according to execution of a first application on a first area on the display, perform control to transfer vehicle-related context information to the electronic device based on information obtained by the at least one or more sensors and, if information related to a second application corresponding to the vehicle-related context information is received from the electronic device, display second display information associated with the second application on a second area on the display using the received information.
    Type: Grant
    Filed: January 11, 2018
    Date of Patent: April 12, 2022
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Yong-Jun Lim, Tae-Young Ha
  • Patent number: 11302305
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for natural language processing. One of the methods includes receiving a voice input from a user device; generating a recognition output; receiving a user selection of one or more terms in the recognition output; receiving a user input of one or more letters replacing the user-selected one or more terms; determining suggested correction candidates based in part on the user input and the voice input; and providing one or more suggested correction candidates to the user device as suggested corrected recognition outputs.
    Type: Grant
    Filed: May 14, 2020
    Date of Patent: April 12, 2022
    Assignee: Google LLC
    Inventors: Evgeny A. Cherepanov, Jakob Nicolaus Foerster, Vikram Sridar, Ishai Rabinovitz, Omer Tabach
  • Patent number: 11302313
    Abstract: Systems and methods for speech recognition are provided. The method may include obtaining a plurality of candidate recognition results of speech information uttered by a user and a plurality of preliminary scores corresponding to the plurality of candidate recognition results, respectively. The method may further include, for each of the plurality of candidate recognition results, extracting one or more keywords from the candidate recognition result and determining at least one parameter associated with the one or more extracted keywords. The method may further include, for each of the plurality of candidate recognition results, generating an updating coefficient based on the at least one parameter and updating the preliminary score based on the updating coefficient to generate an updated score. The method may further include determining, from the plurality of candidate recognition results, a target recognition result based on the plurality of updated scores.
    Type: Grant
    Filed: December 14, 2019
    Date of Patent: April 12, 2022
    Assignee: BEIJING DIDI INFINITY TECHNOLOGY AND DEVELOPMENT CO., LTD.
    Inventor: Xiulin Li
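The rescoring loop in the DiDi abstract above maps naturally to code. The keyword-boost table below is a hypothetical choice of "parameter associated with the extracted keywords"; the patent leaves the parameter and coefficient formula open.

```python
# Sketch: rescore candidate recognition results with a per-candidate
# updating coefficient derived from keywords they contain.

def rescore(candidates, keyword_boost):
    """candidates: list of (text, preliminary_score);
    keyword_boost: dict keyword -> multiplicative boost (illustrative)."""
    rescored = []
    for text, score in candidates:
        coeff = 1.0
        # Updating coefficient: product of boosts for keywords present.
        for kw, boost in keyword_boost.items():
            if kw in text:
                coeff *= boost
        rescored.append((text, score * coeff))
    # Target recognition result: candidate with the highest updated score.
    return max(rescored, key=lambda pair: pair[1])

best = rescore(
    [("call mom", 0.6), ("tall mom", 0.7)],
    {"call": 2.0},
)
print(best)  # ('call mom', 1.2)
```

Here the domain keyword "call" lifts an acoustically weaker candidate past a homophone, which is the point of updating preliminary scores.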
  • Patent number: 11301506
    Abstract: Automated digital asset tagging techniques and systems are described that support use of multiple vocabulary sets. In one example, a plurality of digital assets are obtained having first-vocabulary tags taken from a first-vocabulary set. Second-vocabulary tags taken from a second-vocabulary set are assigned to the plurality of digital assets through machine learning. A determination is made that at least one first-vocabulary tag includes a plurality of visual classes based on the assignment of at least one second-vocabulary tag. Digital assets are collected from the plurality of digital assets that correspond to one visual class of the plurality of visual classes. A model is then generated using machine learning based on the collected digital assets.
    Type: Grant
    Filed: June 29, 2017
    Date of Patent: April 12, 2022
    Assignee: Adobe Inc.
    Inventors: Mayur Hemani, Balaji Krishnamurthy
  • Patent number: 11295839
    Abstract: A method, computer program product, and computing system for automating a follow-up process is executed on a computing device and includes prompting a patient to provide encounter information via a virtual assistant during a post-visit portion of a patient encounter. Encounter information is obtained from the patient in response to the prompting by the virtual assistant.
    Type: Grant
    Filed: August 9, 2018
    Date of Patent: April 5, 2022
    Assignee: NUANCE COMMUNICATIONS, INC.
    Inventors: Donald E. Owen, Garret N. Erskine, Mehmet Mert Öz
  • Patent number: 11294474
    Abstract: A virtual collaboration system receives input video data including a participant. The system analyzes the input video data to identify a gesture or a movement made by the participant. The system selects an overlay image as a function of the gesture or the movement made by the participant, incorporates the overlay image into the input video data, thereby generating output video data that includes the overlay image, and transmits the output video data to one or more participant devices.
    Type: Grant
    Filed: February 5, 2021
    Date of Patent: April 5, 2022
    Assignee: Lenovo (Singapore) Pte. Ltd.
    Inventors: Aaron Michael Stewart, Alden Rose, Ellis Anderson