Speech To Image Patents (Class 704/235)
  • Patent number: 11373672
    Abstract: Disclosed are devices, systems, apparatus, methods, products, and other implementations, including a method comprising obtaining, by a device, a combined sound signal for signals combined from multiple sound sources in an area in which a person is located, and applying, by the device, speech-separation processing (e.g., deep attractor network (DAN) processing, online DAN processing, LSTM-TasNet processing, Conv-TasNet processing), to the combined sound signal from the multiple sound sources to derive a plurality of separated signals that each contains signals corresponding to different groups of the multiple sound sources. The method further includes obtaining, by the device, neural signals for the person, the neural signals being indicative of one or more of the multiple sound sources the person is attentive to, and selecting one of the plurality of separated signals based on the obtained neural signals. The selected signal may then be processed (amplified, attenuated).
    Type: Grant
    Filed: October 24, 2018
    Date of Patent: June 28, 2022
    Assignee: The Trustees of Columbia University in the City of New York
    Inventors: Nima Mesgarani, Yi Luo, James O'Sullivan, Zhuo Chen
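The final step this abstract describes, selecting one separated stream based on neural signals indicating the attended source, can be sketched in highly simplified form. The envelope-correlation rule and all data below are illustrative assumptions, not the patented DAN/TasNet pipeline:

```python
# Simplified sketch of attention-driven source selection: pick the separated
# signal whose envelope correlates best with an envelope estimate decoded
# from the listener's neural signals. All names and data are illustrative.

def correlation(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a) ** 0.5
    vb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (va * vb) if va and vb else 0.0

def select_attended(separated_signals, neural_envelope):
    """Return the separated signal most correlated with the neural envelope."""
    return max(separated_signals,
               key=lambda s: correlation(s, neural_envelope))

speaker_a = [0.1, 0.9, 0.2, 0.8, 0.1, 0.9]
speaker_b = [0.5, 0.4, 0.6, 0.3, 0.5, 0.4]
neural = [0.2, 1.0, 0.1, 0.9, 0.0, 1.0]   # tracks speaker_a's envelope
attended = select_attended([speaker_a, speaker_b], neural)
```

The selected stream would then be amplified (and the others attenuated), as the abstract notes.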
  • Patent number: 11373634
    Abstract: An electronic device accommodates variation in user utterances of a content name when a user searches for content through a display device by voice. A method by an electronic device includes receiving input of a user's voice, acquiring a keyword related to content included in the user's voice, acquiring at least one modified keyword based on the keyword, acquiring a plurality of search results corresponding to the keyword and the at least one modified keyword, comparing the keyword and the modified keyword with the plurality of search results to acquire a content name corresponding to the keyword, and updating a database of content names based on the keyword, the modified keyword, and the acquired content name.
    Type: Grant
    Filed: October 30, 2019
    Date of Patent: June 28, 2022
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Jiwon Yoo, Jihun Park
  • Patent number: 11372608
    Abstract: A machine includes a processor and a memory connected to the processor. The memory stores instructions executed by the processor to receive a message and a message parameter indicative of a characteristic of the message, where the message includes a photograph or a video. A determination is made that the message parameter corresponds to a selected gallery, where the selected gallery includes a sequence of photographs or videos. The message is posted to the selected gallery in response to the determination. The selected gallery is supplied in response to a request.
    Type: Grant
    Filed: October 24, 2019
    Date of Patent: June 28, 2022
    Assignee: Snap Inc.
    Inventor: Timothy Sehn
  • Patent number: 11368581
    Abstract: A method to transcribe communications includes obtaining an audio message from a first device during a voice communication session with a second device that includes a display screen; providing the message to a first speech recognition system to generate a first message transcript; and providing the first transcript to the second device for presentation on the display screen. Upon obtaining an indication that the transcript quality is below a threshold, the message is also provided to a second speech recognition system to generate a second transcript, while the first system continues to generate the first transcript for presentation on the display screen. In response to an event indicating that the second transcript is to be provided to the second device instead of the first, the second transcript is provided to the second device for presentation on the display screen in place of the first transcript.
    Type: Grant
    Filed: September 11, 2020
    Date of Patent: June 21, 2022
    Assignee: Ultratec, Inc.
    Inventors: Robert M. Engelke, Kevin R. Colwell, Christopher Engelke
  • Patent number: 11366569
    Abstract: The disclosure relates to an interactive interface display method, apparatus, and storage medium. The method includes displaying an information display interface including a call entry of an intelligent interactive application; calling the intelligent interactive application when a trigger operation on the call entry is detected; displaying a first dynamic effect in which the call entry moves in the information display interface; and displaying an interactive interface of the intelligent interactive application after displaying the first dynamic effect in which the call entry moves in the information display interface.
    Type: Grant
    Filed: March 12, 2020
    Date of Patent: June 21, 2022
    Assignee: BEIJING XIAOMI MOBILE SOFTWARE CO., LTD.
    Inventors: Yuhang Xia, Zekui Li
  • Patent number: 11361491
    Abstract: The present invention relates to a method of generating a facial expression of a user for a virtual environment. The method comprises obtaining a video and an associated speech of the user. Further, extracting in real-time at least one of one or more voice features and one or more text features based on the speech. Furthermore, identifying one or more phonemes in the speech. Thereafter, determining one or more facial features relating to the speech of the user using a pre-trained second learning model based on the one or more voice features, the one or more phonemes, the video and one or more previously generated facial features of the user. Finally, generating the facial expression of the user corresponding to the speech for an avatar representing the user in the virtual environment.
    Type: Grant
    Filed: September 3, 2020
    Date of Patent: June 14, 2022
    Assignee: Wipro Limited
    Inventors: Vivek Kumar Varma Nadimpalli, Gopichand Agnihotram
  • Patent number: 11354516
    Abstract: An information processor includes a generation section that generates a specified character string on the basis of at least one of voice information corresponding to a content of speech detected by a voice detection section and vehicle information acquired from a vehicle. With this configuration, a user can input the specified character string, which is a hashtag, without a manual operation. Thus, compared to the related art in which the hashtag is generated on the basis of an operation (manual input) by the user, the burden on the user can be significantly reduced and input errors can be prevented.
    Type: Grant
    Filed: November 22, 2019
    Date of Patent: June 7, 2022
    Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA
    Inventors: Ryotaro Fujiwara, Keiko Suzuki, Makoto Honda, Chikage Kubo, Ryota Okubi, Takeshi Fujiki
  • Patent number: 11356492
    Abstract: Embodiments of the present invention provide methods, computer program products, and systems. Embodiments of the present invention detect an audio stream comprising one or more voice packets from a first computing system. Embodiments of the present invention can, in response to detecting an audio stream, dynamically prevent audio drop out on a second computing system using circular buffers based on network consistency.
    Type: Grant
    Filed: September 16, 2020
    Date of Patent: June 7, 2022
    Assignee: Kyndryl, Inc.
    Inventors: Tiberiu Suto, Nadiya Kochura, Vinod A. Valecha
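The circular-buffer idea this abstract names, preventing audio drop-out when network delivery is inconsistent, can be sketched as a small jitter buffer. The repeat-last-packet policy and all names below are assumptions for illustration, not the claimed implementation:

```python
from collections import deque

# Illustrative jitter buffer: incoming voice packets are queued in a
# fixed-capacity circular buffer; playback pulls at a steady rate and
# repeats the last packet on underrun instead of dropping audio.

class JitterBuffer:
    def __init__(self, capacity=8):
        self.buf = deque(maxlen=capacity)  # oldest packets overwritten when full
        self.last = b"\x00"                # silence until the first packet arrives

    def push(self, packet):
        """Called as packets arrive from the network (possibly in bursts)."""
        self.buf.append(packet)

    def pull(self):
        """Called at the steady playback rate; repeats on underrun."""
        if self.buf:
            self.last = self.buf.popleft()
        return self.last
```

For example, after pushing two packets, three pulls yield the first packet, the second packet, and then the second packet again rather than silence.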
  • Patent number: 11350148
    Abstract: Aspects of the subject disclosure may include, for example, modifying a user profile associated with a user of a content service to generate an updated user profile according to consumption of media content by the user and user feedback information associated with that consumption, determining a user context according to information associated with a user device, where the user context includes current activity of the user, modifying a set of media content according to the determined user context to generate an updated set of media content, where a type of media content is eliminated from the set of media content in the updated set according to the user context, and presenting the updated set of media content at a presentation device of the user via a personal media channel of the user associated with the content service. Other embodiments are disclosed.
    Type: Grant
    Filed: October 29, 2020
    Date of Patent: May 31, 2022
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Zhu Liu, Eric Zavesky, Bernard S. Renger, Behzad Shahraray, David Crawford Gibbon, Tan Xu, Lee Begeja, Raghuraman Gopalan
  • Patent number: 11341707
    Abstract: A method and system for transforming simple user input into customizable animated images for use in text-messaging applications.
    Type: Grant
    Filed: March 23, 2021
    Date of Patent: May 24, 2022
    Assignee: EMONSTER INC
    Inventor: Enrique Bonansea
  • Patent number: 11341957
    Abstract: A method for detecting a keyword, applied to a terminal, includes: extracting a speech eigenvector of a speech signal; obtaining, according to the speech eigenvector, a posterior probability of each target character being a key character in any keyword in an acquisition time period of the speech signal; obtaining confidences of at least two target character combinations according to the posterior probability of each target character; and determining that the speech signal includes the keyword upon determining that all the confidences of the at least two target character combinations meet a preset condition. The target character is a character in the speech signal whose pronunciation matches a pronunciation of the key character. Each target character combination includes at least one target character, and a confidence of a target character combination represents a probability of the target character combination being the keyword or a part of the keyword.
    Type: Grant
    Filed: July 20, 2020
    Date of Patent: May 24, 2022
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Yi Gao, Meng Yu, Dan Su, Jie Chen, Min Luo
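The confidence test this abstract describes, combining per-character posterior probabilities into combination confidences and requiring all combinations to meet a condition, can be sketched in simplified form. The geometric-mean combination rule, the threshold, and the example posteriors are all invented for illustration:

```python
import math

# Illustrative sketch: per-character posteriors over the acquisition window
# are combined per character-combination, and the keyword is detected only
# if every combination clears the threshold.

def combination_confidence(posteriors):
    """Geometric mean of the character posteriors in one combination."""
    return math.exp(sum(math.log(p) for p in posteriors) / len(posteriors))

def keyword_detected(combinations, threshold=0.5):
    """The preset condition: all combination confidences meet the threshold."""
    return all(combination_confidence(c) >= threshold for c in combinations)

# A two-character keyword split into the combinations ("ni",), ("hao",),
# and ("ni", "hao"), with made-up posteriors for each target character.
posts = {"ni": 0.9, "hao": 0.8}
combos = [[posts["ni"]], [posts["hao"]], [posts["ni"], posts["hao"]]]
```

With these numbers the keyword is detected at a threshold of 0.5 but rejected at 0.85, since the single-character combination for "hao" falls short.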
  • Patent number: 11341331
    Abstract: An intelligent speech assistant receives information collected while a user is speaking. The information can comprise speech data, vision data, or both, where the speech data is from the user speaking and the vision data is of the user while speaking. The assistant evaluates the speech data against a script which can contain information that the user should speak, information that the user should not speak, or both. The assistant collects instances where the user utters phrases that match the script or instances where the user utters phrases that do not match the script, depending on whether the phrases should or should not be spoken. The assistant evaluates vision data to identify gestures, facial expressions, and/or emotions of the user. Instances where the gestures, facial expressions, and/or emotions are not appropriate to the context are flagged. Real-time prompts and/or a summary are presented to the user as feedback.
    Type: Grant
    Filed: October 4, 2019
    Date of Patent: May 24, 2022
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Huakai Liao, Priyanka Vikram Sinha, Kevin Dara Khieu, Derek Martin Johnson, Siliang Kang, Huey-Ru Tsai, Amit Srivastava
  • Patent number: 11334315
    Abstract: Systems, methods, and devices for human-machine interfaces for utterance-based playlist selection are disclosed. In one method, a list of playlists is traversed and a portion of each is audibly output until a playlist command is received. Based on the playlist command, the traversing is stopped and a playlist is selected for playback. In examples, the list of playlists is modified based on a modification input.
    Type: Grant
    Filed: July 8, 2019
    Date of Patent: May 17, 2022
    Assignee: Spotify AB
    Inventors: Daniel Bromand, Richard Mitic, Horia-Dragos Jurcut, Henriette Susanne Martine Cramer, Ruth Brillman
  • Patent number: 11335360
    Abstract: In one aspect, a device includes at least one processor and storage accessible to the at least one processor. The storage includes instructions executable by the at least one processor to analyze the decibel levels of audio of a user's speech. The instructions are executable to, based on the analysis, enhance a transcript of the user's speech with indications of particular words from the user's speech as being associated with one or more emotions of the user.
    Type: Grant
    Filed: September 21, 2019
    Date of Patent: May 17, 2022
    Assignee: Lenovo (Singapore) Pte. Ltd.
    Inventors: Johnathan Co Lee, Jonathan Jen-Wei Yu
  • Patent number: 11322231
    Abstract: A method, computer program product, and computing system for automating an intake process is executed on a computing device and includes prompting a patient to provide encounter information via a virtual assistant during a pre-visit portion of a patient encounter. Encounter information is obtained from the patient in response to the prompting by the virtual assistant.
    Type: Grant
    Filed: August 9, 2018
    Date of Patent: May 3, 2022
    Assignee: NUANCE COMMUNICATIONS, INC.
    Inventors: Donald E. Owen, Garret N. Erskine, Mehmet Mert Öz
  • Patent number: 11314799
    Abstract: Described herein are technologies that facilitate effective use (e.g., indexing and searching) of non-text machine data (e.g., audio/visual data) with text-based indexes of an event-based machine-data intake and query system.
    Type: Grant
    Filed: July 29, 2016
    Date of Patent: April 26, 2022
    Assignee: Splunk Inc.
    Inventor: Adam Oliner
  • Patent number: 11315570
    Abstract: The technology disclosed relates to a machine learning based speech-to-text transcription intermediary which, from among multiple speech recognition engines, selects a speech recognition engine for accurately transcribing an audio channel based on sound and speech characteristics of the audio channel.
    Type: Grant
    Filed: April 2, 2019
    Date of Patent: April 26, 2022
    Assignee: Facebook Technologies, LLC
    Inventor: Shamir Allibhai
  • Patent number: 11315569
    Abstract: Disclosed is a system for generating a transcript of a meeting using individual audio recordings of speakers in the meeting. The system obtains an audio recording file from each speaker in the meeting, generates a speaker-specific transcript for each speaker using the audio recording of the corresponding speaker, and merges the speaker-specific transcripts to generate a meeting transcript that includes text of the speech from all speakers in the meeting. As the system generates speaker-specific transcripts using speaker-specific (high-quality) audio recordings, the need for "diarization" is removed and the audio quality of each speaker's recording is maximized, leading to virtually lossless recordings and resulting in improved transcription quality and analysis.
    Type: Grant
    Filed: February 7, 2020
    Date of Patent: April 26, 2022
    Assignee: Memoria, Inc.
    Inventors: Homayoun Talieh, Rémi Berson, Eric Pellish
  • Patent number: 11308951
    Abstract: There is provided an information processing apparatus, an information processing method, and a program capable of providing a more convenient speech recognition service. A desired word in a sentence presented to a user as a speech recognition result is recognized as an edited portion; speech information repeatedly uttered to edit the word of the edited portion is acquired; and speech information other than the repeated utterance is connected to it, generating speech information for speech recognition for editing. Speech recognition is then performed on the generated speech information.
    Type: Grant
    Filed: January 4, 2018
    Date of Patent: April 19, 2022
    Assignee: SONY CORPORATION
    Inventors: Shinichi Kawano, Yuhei Taki
  • Patent number: 11308943
    Abstract: An electronic device receives audio data for a media item. The electronic device generates, from the audio data, a plurality of samples, each sample having a predefined maximum length. The electronic device, using a neural network trained to predict character probabilities, generates a probability matrix of characters for a first portion of a first sample of the plurality of samples. The probability matrix includes character information, timing information, and respective probabilities of respective characters at respective times. The electronic device identifies, for the first portion of the first sample, a first sequence of characters based on the generated probability matrix.
    Type: Grant
    Filed: September 12, 2019
    Date of Patent: April 19, 2022
    Assignee: Spotify AB
    Inventors: Daniel Stoller, Simon René Georges Durand, Sebastian Ewert
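Identifying a character sequence from a probability matrix like the one this abstract describes is commonly done with CTC-style decoding; a greedy sketch (argmax per frame, collapse repeats, drop the blank) is shown below. The blank symbol, alphabet, and matrix are illustrative assumptions, and a real system would likely use beam search rather than this greedy rule:

```python
# Illustrative greedy decoding over a character-probability matrix:
# take the most probable character per frame, collapse adjacent repeats,
# and drop the blank symbol.

BLANK = "-"

def greedy_decode(prob_matrix, alphabet):
    """prob_matrix[t][c] = probability of character alphabet[c] at frame t."""
    best = [alphabet[max(range(len(row)), key=row.__getitem__)]
            for row in prob_matrix]
    out, prev = [], None
    for ch in best:
        if ch != prev and ch != BLANK:
            out.append(ch)
        prev = ch
    return "".join(out)

alphabet = ["-", "h", "i"]
matrix = [
    [0.1, 0.8, 0.1],   # "h"
    [0.1, 0.7, 0.2],   # "h" again (collapsed as a repeat)
    [0.8, 0.1, 0.1],   # blank (separates characters)
    [0.1, 0.1, 0.8],   # "i"
]
```

Decoding this matrix yields the string "hi".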
  • Patent number: 11308938
    Abstract: To train a speech recognizer, such as for recognizing variables in a neural speech-to-meaning system, compute, within an embedding space, a range of vectors of features of natural speech. Generate parameter sets for speech synthesis, and synthesize speech according to those parameters. Analyze the synthesized speech to compute vectors in the embedding space. Using a cost function that favors an even spread (minimal clustering), generate a multiplicity of speech synthesis parameter sets. Using the multiplicity of parameter sets, generate a multiplicity of speech of known words that can be used as training data for speech recognition.
    Type: Grant
    Filed: December 5, 2019
    Date of Patent: April 19, 2022
    Assignee: SoundHound, Inc.
    Inventors: Maisy Wieman, Jonah Probell, Sudharsan Krishnaswamy
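One common way to realize an "even spread, minimal clustering" selection like the one this abstract describes is greedy farthest-point sampling over the embedding vectors; the sketch below uses that stand-in, and the distance metric and data are invented for illustration:

```python
# Illustrative spread-favoring selection: from many candidate synthesis
# parameter sets (represented as points in an embedding space), greedily
# keep the candidate farthest from everything already kept.

def dist(p, q):
    """Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(p, q)) ** 0.5

def spread_select(candidates, k):
    """Greedy farthest-point sampling of k candidates."""
    chosen = [candidates[0]]
    while len(chosen) < k:
        nxt = max(candidates,
                  key=lambda c: min(dist(c, s) for s in chosen))
        chosen.append(nxt)
    return chosen

# Two near-duplicate candidates and two distant ones: the near-duplicate
# (0.1, 0.0) is skipped in favor of points that spread the selection out.
candidates = [(0.0, 0.0), (0.1, 0.0), (10.0, 10.0), (5.0, 5.0)]
picked = spread_select(candidates, 3)
```

Synthesizing speech from the picked parameter sets would then cover the embedding space more evenly than sampling at random.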
  • Patent number: 11308945
    Abstract: A hypernym of a word in utterance data may be probabilistically determined. The utterance data may correspond to a spoken query or command. A redacted utterance may be derived by replacing the word with the hypernym. The hypernym may be determined by applying noise to a position in a hierarchical embedding that corresponds to the word. The word may be identified as being potentially sensitive. The hierarchical embedding may be a Hyperbolic embedding that may indicate hierarchical relationships between individual words of a corpus of words, such as “red” is a “color” or “Austin” is in “Texas.” Noise may be applied by obtaining a first value in Euclidean space based on a second value in Hyperbolic space, and obtaining a third value in Hyperbolic space based on the first value in Euclidean space. The second value in Hyperbolic space may correspond to the word.
    Type: Grant
    Filed: September 4, 2019
    Date of Patent: April 19, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Thomas Drake, Oluwaseyi Feyisetan, Thomas Diethe
  • Patent number: 11310223
    Abstract: An identity authentication method, includes: at an electronic device having one or more processors and memory, the electronic device coupled with a display and one or more input devices: receiving an identity authentication request; in response to receiving the identity authentication request, performing an interactive authentication information exchange between the electronic device and a user, including: displaying, on the display, first visual information in a first manner; displaying, on the display, the first visual information in a second manner that is distinct from the first manner, wherein the first visual information displayed in the second manner includes a timing characteristic that is absent from the first visual information displayed in the first manner; receiving user input entered in accordance with the first visual information displayed in the second manner; and verifying that the user input conforms to the timing characteristic in the first visual information displayed in the second manner.
    Type: Grant
    Filed: May 26, 2020
    Date of Patent: April 19, 2022
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Lu Zheng, Shuai Zhang, Tingting Shang, Rui Rao, Yan Chen, Yaode Huang, Zhenhua Wang
  • Patent number: 11301506
    Abstract: Automated digital asset tagging techniques and systems are described that support use of multiple vocabulary sets. In one example, a plurality of digital assets are obtained having first-vocabulary tags taken from a first-vocabulary set. Second-vocabulary tags taken from a second-vocabulary set are assigned to the plurality of digital assets through machine learning. A determination is made that at least one first-vocabulary tag includes a plurality of visual classes based on the assignment of at least one second-vocabulary tag. Digital assets are collected from the plurality of digital assets that correspond to one visual class of the plurality of visual classes. A model is then generated using machine learning based on the collected digital assets.
    Type: Grant
    Filed: June 29, 2017
    Date of Patent: April 12, 2022
    Assignee: Adobe Inc.
    Inventors: Mayur Hemani, Balaji Krishnamurthy
  • Patent number: 11302290
    Abstract: Described are various embodiments related to a vehicle device and an electronic device, wherein the vehicle device according to one embodiment can include: a display; a memory; at least one or more sensors; communication circuitry configured to communicate with an external electronic device; and a processor configured to display first display information according to execution of a first application on a first area on the display, perform control to transfer vehicle-related context information to the electronic device based on information obtained by the at least one or more sensors and, if information related to a second application corresponding to the vehicle-related context information is received from the electronic device, display second display information associated with the second application on a second area on the display using the received information.
    Type: Grant
    Filed: January 11, 2018
    Date of Patent: April 12, 2022
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Yong-Jun Lim, Tae-Young Ha
  • Patent number: 11302313
    Abstract: Systems and methods for speech recognition are provided. The method may include obtaining a plurality of candidate recognition results of speech information uttered by a user and a plurality of preliminary scores corresponding to the plurality of candidate recognition results, respectively. The method may further include, for each of the plurality of candidate recognition results, extracting one or more keywords from the candidate recognition result and determining at least one parameter associated with the one or more extracted keywords. The method may further include, for each of the plurality of candidate recognition results, generating an updating coefficient based on the at least one parameter and updating the preliminary score based on the updating coefficient to generate an updated score. The method may further include determining, from the plurality of candidate recognition results, a target recognition result based on the plurality of updated scores.
    Type: Grant
    Filed: December 14, 2019
    Date of Patent: April 12, 2022
    Assignee: BEIJING DIDI INFINITY TECHNOLOGY AND DEVELOPMENT CO., LTD.
    Inventor: Xiulin Li
  • Patent number: 11302305
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for natural language processing. One of the methods includes receiving a voice input from a user device; generating a recognition output; receiving a user selection of one or more terms in the recognition output; receiving a user input of one or more letters replacing the user-selected one or more terms; determining suggested correction candidates based in part on the user input and the voice input; and providing one or more suggested correction candidates to the user device as suggested corrected recognition outputs.
    Type: Grant
    Filed: May 14, 2020
    Date of Patent: April 12, 2022
    Assignee: Google LLC
    Inventors: Evgeny A. Cherepanov, Jakob Nicolaus Foerster, Vikram Sridar, Ishai Rabinovitz, Omer Tabach
  • Patent number: 11294474
    Abstract: A virtual collaboration system receives input video data including a participant. The system analyzes the input video data to identify a gesture or a movement made by the participant. The system selects an overlay image as a function of the gesture or the movement made by the participant, incorporates the overlay image into the input video data, thereby generating output video data that includes the overlay image, and transmits the output video data to one or more participant devices.
    Type: Grant
    Filed: February 5, 2021
    Date of Patent: April 5, 2022
    Assignee: Lenovo (Singapore) Pte. Ltd.
    Inventors: Aaron Michael Stewart, Alden Rose, Ellis Anderson
  • Patent number: 11295272
    Abstract: A method, computer program product, and computing system for obtaining encounter information of a patient encounter, wherein the encounter information includes machine vision encounter information; and processing the encounter information to generate an encounter transcript.
    Type: Grant
    Filed: February 8, 2019
    Date of Patent: April 5, 2022
    Assignee: NUANCE COMMUNICATIONS, INC.
    Inventors: Daniel Paulino Almendro Barreda, Dushyant Sharma, Joel Praveen Pinto, Uwe Helmut Jost, Patrick A. Naylor
  • Patent number: 11294542
    Abstract: Different types of media experiences can be developed based on characteristics of the consumer. “Linear” experiences may require execution of a pre-built script, although the script could be dynamically modified by a media production platform. Linear experiences can include guided audio tours that are modified or updated based on the location of the consumer. “Enhanced” experiences include conventional media content that is supplemented with intelligent media content. For example, turn-by-turn directions could be supplemented with audio descriptions about the surrounding area. “Freeform” experiences, meanwhile, are those that can continually morph based on information gleaned from a consumer. For example, a radio station may modify what content is being presented based on the geographical metadata uploaded by a computing device associated with the consumer.
    Type: Grant
    Filed: January 7, 2020
    Date of Patent: April 5, 2022
    Assignee: Descript, Inc.
    Inventors: Ryan Terrill Holmes, Steven Surmacz Rubin, Ulf Schwekendiek, David John Williams
  • Patent number: 11295839
    Abstract: A method, computer program product, and computing system for automating a follow-up process is executed on a computing device and includes prompting a patient to provide encounter information via a virtual assistant during a post-visit portion of a patient encounter. Encounter information is obtained from the patient in response to the prompting by the virtual assistant.
    Type: Grant
    Filed: August 9, 2018
    Date of Patent: April 5, 2022
    Assignee: NUANCE COMMUNICATIONS, INC.
    Inventors: Donald E. Owen, Garret N. Erskine, Mehmet Mert Öz
  • Patent number: 11282513
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for context-based adjustment of speech recognition confidence scores. In one aspect, a method includes obtaining a candidate transcription that an automated speech recognizer generates for an utterance, determining a particular context associated with the utterance, determining that a particular n-gram that is included in the candidate transcription is included among a set of undesirable n-grams that is associated with the context, adjusting a speech recognition confidence score associated with the transcription based on that determination, and determining whether to provide the candidate transcription for output based at least on the adjusted speech recognition confidence score.
    Type: Grant
    Filed: June 15, 2020
    Date of Patent: March 22, 2022
    Assignee: Google LLC
    Inventors: Pedro J. Moreno Mengibar, Petar Aleksic
  • Patent number: 11282501
    Abstract: A speech recognition method and apparatus, including implementation and/or training, are disclosed. The speech recognition method includes obtaining a speech signal, and performing a recognition of the speech signal, including generating a dialect parameter, for the speech signal, from input dialect data using a parameter generation model, applying the dialect parameter to a trained speech recognition model to generate a dialect speech recognition model, and generating a speech recognition result from the speech signal by implementing, with respect to the speech signal, the dialect speech recognition model. The speech recognition method and apparatus may perform speech recognition and/or training of the speech recognition model and the parameter generation model.
    Type: Grant
    Filed: October 18, 2019
    Date of Patent: March 22, 2022
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Sanghyun Yoo, Yoshua Bengio, Inchul Song
  • Patent number: 11282508
    Abstract: A computer implemented method and system for processing an audio signal. The method includes the steps of extracting prosodic features from the audio signal, aligning the extracted prosodic features with a script derived from or associated with the audio signal, and segmenting the script with the aligned extracted prosodic features into structural blocks of a first type. The method may further include determining a distance measure between a structural block of a first type derived from the script with another structural block of the first type using, for example, the Damerau-Levenshtein distance.
    Type: Grant
    Filed: December 9, 2019
    Date of Patent: March 22, 2022
    Assignee: Blue Planet Training, Inc.
    Inventors: Huamin Qu, Yuanzhe Chen, Siwei Fu, Linping Yuan, Aoyu Wu
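The distance measure this abstract names can be made concrete: below is the common "optimal string alignment" form of the Damerau-Levenshtein distance, which allows insertion, deletion, substitution, and transposition of adjacent elements. It operates on any sequences, so the compared "structural blocks" could be sequences of tokens rather than characters; how the patent applies it beyond naming the distance is not specified here.

```python
# Damerau-Levenshtein distance (optimal string alignment variant):
# minimum number of insertions, deletions, substitutions, and adjacent
# transpositions needed to turn sequence a into sequence b.

def damerau_levenshtein(a, b):
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i                         # delete all of a's prefix
    for j in range(len(b) + 1):
        d[0][j] = j                         # insert all of b's prefix
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution / match
            if (i > 1 and j > 1 and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j],
                              d[i - 2][j - 2] + 1)  # adjacent transposition
    return d[len(a)][len(b)]
```

For example, "ca" and "ac" are one transposition apart, whereas plain Levenshtein distance would count two edits.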
  • Patent number: 11282518
    Abstract: An information processing apparatus as an image forming apparatus includes an utterance period detecting section, a simple response/statement determining section, and an HDD. The utterance period detecting section detects utterance periods of utterances of each person from voice data. The simple response/statement determining section converts the voice data to a text, determines, when the utterance in the detected utterance period falling within a first period contains any predetermined keyword, that the utterance is a simple response, determines the utterance made for a second period longer than the first period to be a statement, and extracts, for each person, a frequent keyword appearing a predetermined number of times or more in the utterances. The HDD stores determination results of the simple response/statement determining section, the utterance periods for the simple responses, and the utterance periods for the statements, together with the frequent keyword.
    Type: Grant
    Filed: December 18, 2018
    Date of Patent: March 22, 2022
    Assignee: KYOCERA Document Solutions Inc.
    Inventors: Yuki Kobayashi, Nami Nishimura, Tomoko Mano
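The determination rule this abstract describes, classifying a short keyword-bearing utterance as a simple response and a longer one as a statement, can be sketched as below. The period thresholds and keyword set are invented for illustration; the patent does not specify their values:

```python
# Hypothetical sketch of the simple response/statement determination:
# an utterance within the first (short) period containing a predetermined
# keyword is a "simple response"; an utterance longer than the second
# period is a "statement". Thresholds and keywords are assumptions.

SHORT_S, LONG_S = 2.0, 5.0                     # first and second periods (seconds)
RESPONSE_KEYWORDS = {"yes", "no", "okay", "right"}

def classify_utterance(duration_s, text):
    """Classify one detected utterance period from its duration and text."""
    words = set(text.lower().split())
    if duration_s <= SHORT_S and words & RESPONSE_KEYWORDS:
        return "simple response"
    if duration_s > LONG_S:
        return "statement"
    return "other"
```

The stored determination results, together with per-person frequent keywords, could then summarize who mainly responded and who mainly made statements in a meeting.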
  • Patent number: 11277277
    Abstract: A method, computer system, and a computer program product for environment personalization is provided. The present invention may include initializing a profile of a user. The present invention may include defining a baseline within the profile of the user. The present invention may include tracking a plurality of user data. The present invention may include storing the tracked plurality of user data in a tracked user database. The present invention may lastly include optimizing an environmental condition based on the tracked plurality of user data.
    Type: Grant
    Filed: June 3, 2019
    Date of Patent: March 15, 2022
    Assignee: International Business Machines Corporation
    Inventors: Craig M. Trim, Shikhar Kwatra, Adam Lee Griffin, Jeremy R. Fox
  • Patent number: 11270694
    Abstract: An artificial intelligence apparatus for recognizing speech by correcting a misrecognized word includes a microphone and a processor. The processor is configured to obtain, via the microphone, speech data including speech of a user, convert the speech data into text by using an acoustic model and a language model, determine whether an uncertain recognition exists in an acoustic recognition result according to the acoustic model, determine whether the converted text is a normal sentence by using a natural language processing model if an uncertain recognition exists in the acoustic recognition result, determine a sentence most similar to the converted text among sentences pre-learned by using the language model if the converted text is not a normal sentence, replace the converted text with the determined most similar sentence, and generate a speech recognition result corresponding to the speech data by using the converted text.
    Type: Grant
    Filed: November 22, 2019
    Date of Patent: March 8, 2022
    Assignee: LG ELECTRONICS INC.
    Inventors: Jaehong Kim, Heeyeon Choi
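    The "most similar pre-learned sentence" step can be approximated with a generic string-similarity search. The sketch below uses Python's `difflib` as a stand-in for the patent's language-model comparison; the candidate sentences and the cutoff value are hypothetical:

    ```python
    import difflib

    # Hypothetical pre-learned sentences standing in for the language model.
    PRELEARNED = [
        "turn on the living room light",
        "turn off the living room light",
        "play some relaxing music",
    ]

    def correct_text(converted, candidates=PRELEARNED, cutoff=0.6):
        """Replace a suspect converted text with the closest pre-learned
        sentence, if any candidate scores above the similarity cutoff;
        otherwise keep the converted text unchanged."""
        matches = difflib.get_close_matches(converted, candidates, n=1, cutoff=cutoff)
        return matches[0] if matches else converted
    ```

    For example, a misrecognition such as "turn on the living groom light" snaps back to the nearest valid sentence, while text that resembles no candidate passes through untouched.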
  • Patent number: 11272260
    Abstract: Systems and methods for an ephemeral digital story channel may include (1) determining that a particular time period coincides with a life event of a user of a social networking platform, (2) during the time period, maintaining an ephemeral celebratory story channel designated for digital story compositions relating to the life event, and (3) after the time period expires, discontinuing the ephemeral celebratory story channel. Various other methods, systems, and computer-readable media are also disclosed.
    Type: Grant
    Filed: January 24, 2020
    Date of Patent: March 8, 2022
    Assignee: Meta Platforms, Inc.
    Inventor: Debashish Paul
  • Patent number: 11270693
    Abstract: A speech information processing method includes: determining text information corresponding to collected speech information according to a speech recognition technology, wherein the text information comprises a word; with the word in the text information as a target word, determining one or more fuzzy words corresponding to the target word according to a phoneme sequence corresponding to the target word and a preset phonetic dictionary, wherein the phonetic dictionary comprises a plurality of words and phoneme sequences corresponding to the plurality of words; and outputting the target word and the one or more fuzzy words corresponding to the target word.
    Type: Grant
    Filed: December 15, 2019
    Date of Patent: March 8, 2022
    Assignee: BEIJING XIAOMI MOBILE SOFTWARE CO., LTD.
    Inventors: Yi Niu, Hongyu Wang, Xuefang Wu
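    The fuzzy-word lookup can be illustrated by inverting a phonetic dictionary so that words sharing a phoneme sequence (homophones) map to one another. The tiny ARPAbet-style dictionary below is hypothetical, not the preset dictionary referenced in the patent:

    ```python
    from collections import defaultdict

    # Hypothetical phonetic dictionary: word -> phoneme sequence.
    PHONETIC_DICT = {
        "two":  ("T", "UW"),
        "too":  ("T", "UW"),
        "to":   ("T", "UW"),
        "tool": ("T", "UW", "L"),
        "blue": ("B", "L", "UW"),
    }

    # Invert the dictionary: phoneme sequence -> set of words pronounced that way.
    _BY_PHONEMES = defaultdict(set)
    for _word, _phonemes in PHONETIC_DICT.items():
        _BY_PHONEMES[_phonemes].add(_word)

    def fuzzy_words(target):
        """Return words sharing the target word's phoneme sequence, excluding
        the target itself; empty set if the word is unknown or has no homophones."""
        phonemes = PHONETIC_DICT.get(target)
        if phonemes is None:
            return set()
        return _BY_PHONEMES[phonemes] - {target}
    ```

    A fuller implementation might also return near-matches (phoneme sequences within a small edit distance), which this exact-match sketch omits.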
  • Patent number: 11270708
    Abstract: Various embodiments of the technology described herein alleviate the need to specifically request enrollment information from a user to enroll the user in a voice biometric authentication program. For example, the system can receive a verbal request or a verbal command and non-voice biometric authentication information from a user. The user can be authenticated via a first authentication method using the non-voice biometric authentication information. After the user is authenticated using the first authentication method, the system enrolls the user into a voice biometric authentication program for at least one portion of the verbal request or the verbal command without requesting enrollment information.
    Type: Grant
    Filed: May 7, 2020
    Date of Patent: March 8, 2022
    Assignee: UNITED SERVICES AUTOMOBILE ASSOCIATION (USAA)
    Inventors: Zakery Layne Johnson, Maland Keith Mortensen, Gabriel Carlos Fernandez, Debra Randall Casillas, Sudarshan Rangarajan, Thomas Bret Buckingham
  • Patent number: 11264027
    Abstract: An audio processing method includes: acquiring first audio data associated with a first audio signal after waking up a target application; when second audio data associated with a second audio signal is detected in the process of acquiring the first audio data, acquiring the second audio data; and obtaining target audio data according to the first audio data and the second audio data. With the method, a conversation flow can be simplified without waking up the target application again, the first audio data and the second audio data are combined to obtain target audio data, and an audio response is made to the target audio data, which can more accurately capture the real needs of a user, reduce the rate of isolated responding errors, and improve the accuracy of the audio response.
    Type: Grant
    Filed: November 26, 2019
    Date of Patent: March 1, 2022
    Assignee: BEIJING XIAOMI MOBILE SOFTWARE CO., LTD.
    Inventors: Kanghong Lu, Rui Yang, Xiaochuan Feng, Shiqi Cui, Wei Han, Bin Qin, Gang Wang, Dan Li
  • Patent number: 11265603
    Abstract: Provided are an information processing apparatus and method, a display control apparatus and method, a reproducing apparatus and method, and an information processing system that transmit responses of viewers, acquired in a natural way, to the place where content is captured, enabling presentation in an easy-to-see way. The information processing apparatus receives motion information indicating motions of users watching video content and information indicating attributes of the users, and generates an excitement image by arranging information visually indicating a degree of excitement of each user, determined on the basis of the motion information transmitted from a plurality of reproducing apparatuses, at a position according to an attribute of each user.
    Type: Grant
    Filed: December 13, 2019
    Date of Patent: March 1, 2022
    Assignee: SONY CORPORATION
    Inventors: Yukio Oobuchi, Yuichi Hasegawa, Yuki Yamamoto
  • Patent number: 11250215
    Abstract: A method for form-based conversation system design is provided. The embodiment may include ingesting, by a processor, a plurality of forms from a given domain. The embodiment may also include extracting indicators of required input fields from the ingested plurality of forms. The embodiment may further include generating a required input list based on the extracted indicators of the required input fields to update a size of the required input list. The embodiment may also include determining transactional intents based on the required input list. The embodiment may further include generating a dialog flow that satisfies the determined transactional intents.
    Type: Grant
    Filed: July 12, 2019
    Date of Patent: February 15, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Andrew R. Freed, Corville O. Allen, Joseph Kozhaya, Shikhar Kwatra
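    The "extracting indicators of required input fields" step can be illustrated on HTML forms, one plausible ingestion format. The sketch below is a toy stand-in using a regular expression, not the patented pipeline; the attribute names are assumptions:

    ```python
    import re

    def extract_required_fields(form_html):
        """Collect the `name` attributes of <input> tags marked `required` --
        a toy indicator-extraction step over a single ingested form."""
        fields = []
        for tag in re.findall(r'<input[^>]*\brequired\b[^>]*>', form_html):
            name = re.search(r'name="([^"]+)"', tag)
            if name:
                fields.append(name.group(1))
        return fields
    ```

    The resulting list per form would then be merged across the ingested forms to build the required input list from which transactional intents are determined.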
  • Patent number: 11250387
    Abstract: In non-limiting examples of the present disclosure, systems, methods and devices for assisting with scheduling a meeting are presented. A message comprising a plurality of sentences may be received. A hierarchical attention model may be utilized to identify a subset of sentences of the plurality of sentences that are relevant to a scheduling of the meeting. A subset of words in the subset of sentences that are potentially relevant to scheduling of the meeting may be identified based on relating to at least one meeting parameter. The subset of words may be split into a first group comprising words from the subset of words that are above a meeting relevance threshold value, and a second group comprising words from the subset of words that are below the meeting relevance threshold value. An automated action associated with scheduling the meeting may be caused to be performed.
    Type: Grant
    Filed: November 30, 2018
    Date of Patent: February 15, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Charles Yin-Che Lee, Pamela Bhattacharya, Barun Patra
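    The threshold split described above is straightforward to sketch. The relevance scores are assumed to come from the hierarchical attention model; here the scores, the score range, and the threshold value are hypothetical inputs:

    ```python
    def split_by_relevance(word_scores, threshold=0.5):
        """Split (word, score) pairs into a high-relevance group (at or above
        the meeting-relevance threshold) and a low-relevance group (below it).
        Scores are assumed normalized to [0, 1]."""
        high = [word for word, score in word_scores if score >= threshold]
        low = [word for word, score in word_scores if score < threshold]
        return high, low
    ```

    The high-relevance group would then drive the automated scheduling action, while the low-relevance group could be kept as fallback candidates.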
  • Patent number: 11250850
    Abstract: An electronic apparatus includes a communicator configured to communicate with a plurality of external apparatuses. A storage is configured to store situation information. A processor is configured to, based on a first utterance of a user, control a first operation corresponding to the first utterance to be carried out from among a plurality of operations related to the plurality of external apparatuses. Situation information corresponding to each of a plurality of situations where the first operation is carried out based on the first utterance is stored in the storage. Based on a second utterance of the user, a second operation is identified corresponding to the second utterance from among the plurality of operations based on the stored situation information, and the identified second operation is carried out.
    Type: Grant
    Filed: November 1, 2018
    Date of Patent: February 15, 2022
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Eun Heui Jo, Jae Hyun Bae
  • Patent number: 11250213
    Abstract: A method, computer system, and computer program product for form-based conversation system design are provided. The embodiment may include ingesting, by a processor, a plurality of forms from a given domain. The embodiment may also include extracting indicators of required input fields from the ingested plurality of forms. The embodiment may further include generating a required input list based on the extracted indicators of the required input fields to update a size of the required input list. The embodiment may also include determining transactional intents based on the required input list. The embodiment may further include generating a dialog flow that satisfies the determined transactional intents.
    Type: Grant
    Filed: April 16, 2019
    Date of Patent: February 15, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Andrew R. Freed, Corville O. Allen, Joseph Kozhaya, Shikhar Kwatra
  • Patent number: 11244679
    Abstract: An electronic device according to various embodiments of the present invention comprises: a display; a communication circuit; a processor electrically connected with the display and the communication circuit; and a memory electrically connected with the processor. When instructions included in the memory are executed, the processor acquires message data received through the communication circuit and confirms attribute information included in the message data. Based on the confirmation, the display displays first text data for a first time if the text data included in the message data is first text data inputted from a touch screen of an external electronic device, and displays second text data for a second time different from the first time if the text data included in the message data is second text data converted from voice data.
    Type: Grant
    Filed: February 12, 2018
    Date of Patent: February 8, 2022
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Nojoon Park, Junhyung Park, Hyojung Lee, Taehee Lee, Geonsoo Kim, Hanjib Kim, Yongjoon Jeon
  • Patent number: 11240376
    Abstract: A method to transcribe communications is provided. The method may include obtaining first communication data during a communication session between a first communication device and a second communication device and transmitting the first communication data to the second communication device by way of a mobile device that is locally coupled with the first communication device. The method may also include receiving, at the first communication device, second communication data from the second communication device through the mobile device and transmitting the second communication data to a remote transcription system. The method may further include receiving, at the first communication device, transcription data from the remote transcription system, the transcription data corresponding to a transcription of the second communication data, the transcription generated by the remote transcription system and presenting, by the first communication device, the transcription of the second communication data.
    Type: Grant
    Filed: February 25, 2020
    Date of Patent: February 1, 2022
    Assignee: Sorenson IP Holdings, LLC
    Inventor: Jasper Cheekeong Pan
  • Patent number: 11227599
    Abstract: The present disclosure generally relates to voice-control for electronic devices. In some embodiments, the method includes, in response to detecting a plurality of utterances, associating a plurality of operations corresponding to the utterances with a first stored operation set; detecting a second set of one or more inputs corresponding to a request to perform the operations associated with the first stored operation set; and performing the plurality of operations associated with the first stored operation set, in the respective order.
    Type: Grant
    Filed: May 29, 2020
    Date of Patent: January 18, 2022
    Assignee: Apple Inc.
    Inventors: Jigar Vasant Gada, Mosab Hatem Elagha
  • Patent number: 11222044
    Abstract: Natural language image search is described, for example, whereby natural language queries may be used to retrieve images from a store of images automatically tagged with image tags being concepts of an ontology (which may comprise a hierarchy of concepts). In various examples, a natural language query is mapped to one or more of a plurality of image tags, and the mapped query is used for retrieval. In various examples, the query is mapped by computing one or more distance measures between the query and the image tags, the distance measures being computed with respect to the ontology and/or with respect to a semantic space of words computed from a natural language corpus. In examples, the image tags may be associated with bounding boxes of objects depicted in the images, and a user may navigate the store of images by selecting a bounding box and/or an image.
    Type: Grant
    Filed: May 16, 2014
    Date of Patent: January 11, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Motaz Ahmad El-Saban, Ahmed Yassin Tawfik, Achraf Abdel Moneim Tawfik Chalabi, Sayed Hassan Sayed
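    The query-to-tag mapping can be illustrated with a simple string-distance stand-in. The sketch below substitutes `difflib` similarity for the patent's ontology and semantic-space distance measures; the tag set and cutoff are hypothetical:

    ```python
    import difflib

    # Hypothetical image-tag vocabulary standing in for the ontology's concepts.
    IMAGE_TAGS = ["dog", "cat", "car", "beach", "mountain"]

    def map_query_to_tags(query, tags=IMAGE_TAGS, cutoff=0.6):
        """Map each query word to its closest image tag by string similarity;
        words with no sufficiently close tag are dropped."""
        mapped = []
        for word in query.lower().split():
            match = difflib.get_close_matches(word, tags, n=1, cutoff=cutoff)
            if match:
                mapped.append(match[0])
        return mapped
    ```

    A real system would instead measure distance through the concept hierarchy or a word-embedding space, so that, say, "puppy" could map to the `dog` tag even though the strings share few characters.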