Translation Patents (Class 704/277)
  • Patent number: 12087296
    Abstract: A display device according to an embodiment of the present disclosure includes an output unit, a communication unit configured to perform communication with an artificial intelligence server, and a control unit configured to receive a voice command, convert the received voice command into text data, determine whether the converted text data is composed of a plurality of languages, when the text data is composed of the plurality of languages, determine a language for a voice recognition service among the plurality of languages based on the text data, and output an intent analysis result of the voice command in the determined language.
    Type: Grant
    Filed: September 19, 2019
    Date of Patent: September 10, 2024
    Assignee: LG ELECTRONICS INC.
    Inventors: Changmin Kwak, Jaekyung Lee
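The language-determination step described in the abstract above can be sketched with a simple script-counting heuristic. This is only a toy illustration; the patent does not specify how the plurality of languages is detected, and all function names and script groupings here are assumptions:

```python
import unicodedata

def detect_scripts(text):
    """Group alphabetic characters by rough script family using Unicode names."""
    scripts = {}
    for ch in text:
        if not ch.isalpha():
            continue
        name = unicodedata.name(ch, "")
        if name.startswith("HANGUL"):
            script = "Korean"
        elif name.startswith(("CJK", "HIRAGANA", "KATAKANA")):
            script = "CJK"
        else:
            script = "Latin"
        scripts[script] = scripts.get(script, 0) + 1
    return scripts

def choose_service_language(text):
    """If the text mixes scripts, pick the dominant one as the language
    for the voice recognition service; otherwise report monolingual."""
    scripts = detect_scripts(text)
    if len(scripts) <= 1:
        return next(iter(scripts), None), False  # not multilingual
    dominant = max(scripts, key=scripts.get)
    return dominant, True
```

A real implementation would use a trained language-identification model rather than script counts, but the control flow (detect plurality, then choose one language for intent output) is the same.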
  • Patent number: 12079540
    Abstract: An electronic device to perform an operation corresponding to a first user voice when the first user voice is received through the microphone; store information about an operation corresponding to the first user voice and user reaction information including the user command when a user command is received through the input unit within a first threshold time from when the first user voice is received, or from when the operation corresponding to the first user voice is performed; perform an operation corresponding to a second user voice when the second user voice is received through the microphone; and provide guide information corresponding to the user command on the basis of the user reaction information stored in the memory when the type of operation corresponding to the first user voice is the same as the type of operation corresponding to the second user voice.
    Type: Grant
    Filed: October 12, 2021
    Date of Patent: September 3, 2024
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventor: Youngsoo Yun
  • Patent number: 12080271
    Abstract: Computer generated speech can be generated for cross-lingual natural language textual data streams by utilizing a universal phoneme set. In a variety of implementations, the natural language textual data stream includes a primary language portion in a primary language and a secondary language portion that is not in the primary language. Phonemes corresponding to the secondary language portion can be determined from a set of phonemes in a universal data set. These phonemes can be mapped back to a set of phonemes for the primary language. Audio data can be generated for these phonemes to pronounce the secondary language portion of the natural language textual data stream utilizing phonemes associated with the primary language.
    Type: Grant
    Filed: August 26, 2022
    Date of Patent: September 3, 2024
    Assignee: GOOGLE LLC
    Inventors: Ami Patel, Siamak Tazari
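The phoneme routing in the abstract above (secondary language → universal set → primary-language phonemes) can be sketched with lookup tables. The inventories below are made-up toy values, not the patent's actual phoneme sets:

```python
# Hypothetical inventories: Spanish-only phonemes mapped into a universal
# set, then from the universal set to the closest English phoneme.
SPANISH_TO_UNIVERSAL = {"rr": "r_trill", "j": "x_fric"}
UNIVERSAL_TO_ENGLISH = {"r_trill": "r", "x_fric": "h"}

def map_to_primary(secondary_phonemes):
    """Route each secondary-language phoneme through the universal set and
    back into the primary (English) inventory; pass shared phonemes through."""
    result = []
    for p in secondary_phonemes:
        u = SPANISH_TO_UNIVERSAL.get(p, p)
        result.append(UNIVERSAL_TO_ENGLISH.get(u, u))
    return result
```

Audio generation would then synthesize the mapped sequence entirely with primary-language phoneme units, which is what lets a monolingual voice pronounce the foreign span.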
  • Patent number: 12080296
    Abstract: Apparatuses, methods, and program products are disclosed for performing a transcription action. One apparatus includes at least one processor and a memory that stores code executable by the at least one processor. The code is executable by the processor to monitor, by use of the at least one processor, a quality of audio information. The code is executable by the processor to determine whether the quality of the audio information is below a predetermined threshold. The code is executable by the processor to, in response to determining that the quality of the audio information is below the predetermined threshold, perform a transcription action corresponding to the audio information.
    Type: Grant
    Filed: March 16, 2021
    Date of Patent: September 3, 2024
    Assignee: Lenovo (Singapore) Pte. Ltd.
    Inventors: John C. Mese, Arnold S. Weksler, Mark Patrick Delaney, Nathan J. Peterson, Russell Speight VanBlon
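The monitor-threshold-act loop in the abstract above reduces to a few lines. The RMS quality measure, threshold value, and action name below are assumptions for illustration; the patent leaves the quality metric and the transcription action open:

```python
import math

QUALITY_THRESHOLD = 0.05  # RMS level below which audio is considered poor (assumed)

def rms(samples):
    """Root-mean-square level of a window of audio samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def monitor_audio(samples, threshold=QUALITY_THRESHOLD):
    """Return a transcription action when audio quality falls below threshold,
    otherwise take no action."""
    if rms(samples) < threshold:
        return "switch_to_text_transcription"  # hypothetical action name
    return None
```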
  • Patent number: 12061873
    Abstract: Techniques performed by a data processing system for analyzing the lexical difficulty of words of textual content include analyzing a plurality of textual content sources to determine a first frequency at which each of a plurality of first words appears, analyzing search data to determine a second frequency at which each of the plurality of first words appears in searches for a definition, generating a lexical difficulty model based on the first frequency and the second frequency, where the model is configured to receive a word as an input and to output a prediction for how difficult the word is likely to be for a user, receiving a request to analyze first textual content from a client device, analyzing the first textual content using the lexical difficulty model to generate lexical difficulty information, and sending a response to the client device that includes the requested information.
    Type: Grant
    Filed: October 30, 2020
    Date of Patent: August 13, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Priyanka Subhash Kulkarni, Robert Rounthwaite
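The two signals in the abstract above (corpus frequency and definition-search frequency) can be combined into a toy difficulty score. The patented model is learned from data; this fixed formula is only meant to show why the two frequencies are complementary:

```python
import math

def lexical_difficulty(corpus_freq, lookup_freq):
    """Toy difficulty score: words that are rare in text sources but
    frequently looked up in definition searches score as harder.
    corpus_freq is a relative frequency (0..1); lookup_freq is a count,
    e.g. definition searches per million queries (units assumed)."""
    rarity = -math.log(corpus_freq)   # rarer in corpora -> larger
    demand = math.log1p(lookup_freq)  # more definition lookups -> larger
    return rarity + demand
```

A rare, often-looked-up word should outscore a common, rarely-looked-up one under any reasonable combination of the two signals.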
  • Patent number: 12057110
    Abstract: An information processing method applied to a computation circuit is disclosed. The computation circuit includes a communication circuit and an operation circuit. The method includes controlling, by the computation circuit, the communication circuit to obtain a voice to be identified input by a user; controlling, by the computation circuit, the operation circuit to obtain and call an operation instruction to perform voice identification processing on the voice to be identified to obtain target text information corresponding to the voice to be identified. The operation instruction is a preset instruction for voice identification.
    Type: Grant
    Filed: December 11, 2020
    Date of Patent: August 6, 2024
    Assignee: SHANGHAI CAMBRICON INFORMATION TECHNOLOGY CO., LTD.
    Inventors: Tianshi Chen, Shaoli Liu, Zai Wang, Shuai Hu
  • Patent number: 11990117
    Abstract: A method for training a speech recognition model includes obtaining a multilingual text-to-speech (TTS) model. The method also includes generating a native synthesized speech representation for an input text sequence in a first language that is conditioned on speaker characteristics of a native speaker of the first language. The method also includes generating a cross-lingual synthesized speech representation for the input text sequence in the first language that is conditioned on speaker characteristics of a native speaker of a different second language. The method also includes generating a first speech recognition result for the native synthesized speech representation and a second speech recognition result for the cross-lingual synthesized speech representation. The method also includes determining a consistent loss term based on the first speech recognition result and the second speech recognition result and updating parameters of the speech recognition model based on the consistent loss term.
    Type: Grant
    Filed: October 20, 2021
    Date of Patent: May 21, 2024
    Assignee: Google LLC
    Inventors: Zhehuai Chen, Bhuvana Ramabhadran, Andrew Rosenberg, Yu Zhang, Pedro J. Moreno Mengibar
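One hedged reading of the consistent loss term in the abstract above is a symmetric divergence between the recognition output distributions for the native and cross-lingual synthesized speech; the patent requires only some consistency term, so the symmetric KL below is an illustrative choice:

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def consistency_loss(native_probs, cross_probs):
    """Symmetric KL between per-token distributions from the two
    speech recognition results; zero when they agree exactly."""
    return 0.5 * (kl(native_probs, cross_probs) + kl(cross_probs, native_probs))
```

Training then backpropagates this term so the recognizer becomes invariant to whether the synthesized speaker was native or cross-lingual.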
  • Patent number: 11984126
    Abstract: A device for recognizing a speech input and an operating method thereof are provided. The device may be configured to: obtain one or more text candidates comprising a character string into which the speech input is predicted to be converted, by recognizing the speech input using an automatic speech recognition (ASR) model; extract text history information corresponding to the speech input from a database by comparing the speech input with a plurality of speech signals previously stored in the database; and perform training to adjust a weight of each of the one or more text candidates using the extracted text history information. Also, a method in which the device recognizes a speech input using an AI model may be performed.
    Type: Grant
    Filed: August 10, 2021
    Date of Patent: May 14, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hyungtak Choi, Jongyoub Ryu
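The candidate-reweighting step in the abstract above can be sketched as a rerank over ASR hypotheses using stored history. The data schema and the additive boost are assumptions, not the patent's trained weight adjustment:

```python
def rerank_candidates(candidates, history):
    """Boost ASR text candidates that the user has previously confirmed
    for similar speech signals.  `candidates` maps text -> acoustic score;
    `history` maps text -> count of past confirmations (schema assumed)."""
    boosted = {}
    for text, score in candidates.items():
        boost = 0.1 * history.get(text, 0)  # illustrative linear boost
        boosted[text] = score + boost
    return max(boosted, key=boosted.get)
```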
  • Patent number: 11978541
    Abstract: Medical information is communicated between different entities. Personalized models of peoples' understanding of a medical field are created. Role information is used to assign user-appropriate ontologies. More information than mere role may be used for assigning ontologies, such as information on past medical history. The concepts and relationships in different ontologies may be linked, providing a translation from one personalized model to another. The terminology with similar or the same concepts and/or relationships is output for a given user based on their model.
    Type: Grant
    Filed: August 9, 2019
    Date of Patent: May 7, 2024
    Assignee: CERNER INNOVATION, INC.
    Inventor: John D. Haley
  • Patent number: 11977852
    Abstract: A device configured to receive a sentence that includes a plurality of words. The device is further configured to input the words into a machine learning model that is configured to output a first feature vector based on the words. The device is further configured to identify a keyword within the sentence and to determine that the keyword is an implicit reference to an item. The device is further configured to identify a second feature vector in a reference list that most closely matches a numeric value of the first feature vector and to identify an explicit reference in the reference list that is associated with the second feature vector. The device is further configured to replace the keyword with the explicit reference in the sentence and to output the sentence that includes the explicit reference.
    Type: Grant
    Filed: January 12, 2022
    Date of Patent: May 7, 2024
    Assignee: Bank of America Corporation
    Inventors: Aaron Michael Hosford, Donatus E. Asumu, Emad Noorizadeh, Ramakrishna Reddy Yannam
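The nearest-vector lookup in the abstract above is a standard similarity search; a minimal sketch with cosine similarity follows (the embeddings and reference-list entries are made-up toy values):

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def resolve_implicit(keyword_vec, reference_list):
    """Return the explicit reference whose stored feature vector is
    closest to the keyword's feature vector."""
    best = max(reference_list, key=lambda r: cosine(keyword_vec, r["vector"]))
    return best["explicit"]
```

The sentence rewrite is then a plain string replacement of the implicit keyword with the resolved explicit reference.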
  • Patent number: 11972227
    Abstract: A speech translation system and methods for cross-lingual communication that enable users to easily improve and customize the content and usage of the system. The methods include, in response to receiving an utterance including a first term associated with a field, translating the utterance into a second language. In response to receiving an indication to add the first term associated with the field to a first recognition lexicon, adding the first term and the determined translation to a first machine translation module and to a shared database for a community associated with the field of the first term, wherein the first term added to the shared database is accessible by the community.
    Type: Grant
    Filed: December 7, 2021
    Date of Patent: April 30, 2024
    Assignee: Meta Platforms, Inc.
    Inventors: Alexander Waibel, Ian R. Lane
  • Patent number: 11967248
    Abstract: A method for foreign language learning between a learner and a terminal, based on video or audio containing a foreign language; in particular, a conversation-based foreign language learning method using a speech recognition function and a TTS function of a terminal. The learner learns the foreign language in a way that: the terminal reads a current learning target sentence to the learner to allow the learner to speak the current learning target sentence after the terminal, when speech input by the learner in a speech waiting state of the terminal is the same as the current learning target sentence or belongs to the same category as the current learning target sentence; and the terminal and the learner alternately speak sentences one by one when the speech input by the learner is the same as the next sentence of the current learning target sentence or belongs to the same category as the next sentence.
    Type: Grant
    Filed: December 12, 2019
    Date of Patent: April 23, 2024
    Inventor: Jangho Lee
  • Patent number: 11922932
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer-readable storage media, for speech recognition using attention-based sequence-to-sequence models. In some implementations, audio data indicating acoustic characteristics of an utterance is received. A sequence of feature vectors indicative of the acoustic characteristics of the utterance is generated. The sequence of feature vectors is processed using a speech recognition model that has been trained using a loss function that uses a set of speech recognition hypothesis samples, the speech recognition model including an encoder, an attention module, and a decoder. The encoder and decoder each include one or more recurrent neural network layers. A sequence of output vectors representing distributions over a predetermined set of linguistic units is obtained. A transcription for the utterance is obtained based on the sequence of output vectors. Data indicating the transcription of the utterance is provided.
    Type: Grant
    Filed: March 31, 2023
    Date of Patent: March 5, 2024
    Assignee: Google LLC
    Inventors: Rohit Prakash Prabhavalkar, Tara N. Sainath, Yonghui Wu, Patrick An Phu Nguyen, Zhifeng Chen, Chung-Cheng Chiu, Anjuli Patricia Kannan
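The attention module in the abstract above computes, at each decoder step, a weighted summary of encoder states. A minimal dot-product version (a stand-in for the patented module, which sits between recurrent encoder and decoder layers) looks like this:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention_context(encoder_states, decoder_state):
    """Dot-product attention for one decoder step: score each encoder
    state against the decoder state, normalize, and return the weighted
    sum (context vector)."""
    scores = [sum(h * s for h, s in zip(enc, decoder_state))
              for enc in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    return [sum(w * enc[i] for w, enc in zip(weights, encoder_states))
            for i in range(dim)]
```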
  • Patent number: 11908451
    Abstract: A text-based virtual object animation generation method includes acquiring text information, where the text information includes an original text of a virtual object animation to be generated; analyzing an emotional feature of the text information; performing speech synthesis according to the emotional feature, a rhyme boundary, and the text information to obtain audio information, where the audio information includes emotional speech obtained by conversion based on the original text; and generating a corresponding virtual object animation based on the text information and the audio information, where the virtual object animation is synchronized in time with the audio information.
    Type: Grant
    Filed: August 9, 2021
    Date of Patent: February 20, 2024
    Assignees: Mofa (Shanghai) Information Technology Co., Ltd., Shanghai Movu Technology Co., Ltd.
    Inventors: Congyi Wang, Yu Chen, Jinxiang Chai
  • Patent number: 11893359
    Abstract: This application discloses an audio processing method and a terminal. The method may include: collecting, by a first terminal, an original speech of a first user, translating the original speech of the first user into a translated speech of the first user, receiving an original speech of a second user that is sent by a second terminal, and translating the original speech of the second user into a translated speech of the second user; sending at least one of the original speech of the first user, the translated speech of the first user, and the translated speech of the second user to the second terminal based on a first setting; and playing at least one of the original speech of the second user, the translated speech of the second user, and the translated speech of the first user based on a second setting.
    Type: Grant
    Filed: April 14, 2021
    Date of Patent: February 6, 2024
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Xin Zhang, Gan Zhao
  • Patent number: 11868732
    Abstract: This disclosure describes techniques and architectures for evaluating conversations. In some instances, conversations with users, virtual assistants, and others may be analyzed to identify potential risks within a language model that is employed by the virtual assistants and other entities. The potential risks may be evaluated by administrators, users, systems, and others to identify potential issues with the language model that need to be addressed. This may allow the language model to be improved and enhance user experience with the virtual assistants and others that employ the language model.
    Type: Grant
    Filed: August 8, 2022
    Date of Patent: January 9, 2024
    Assignee: Verint Americas Inc.
    Inventors: Cynthia Freeman, Ian Beaver
  • Patent number: 11853707
    Abstract: Technologies are provided for determining deficiencies in narrative textual data that may impact decision-making in a decisional context. A candidate text document and a reference corpus of text may be utilized to generate one or more topic models and document-term matrices, and then to determine a corresponding statistical perplexity and probabilistic coherence. Statistical determinations of the degree to which the candidate deviates from the reference normative corpus are made, in terms of the statistical perplexity and probabilistic coherence of the candidate as compared to the reference. If the difference is statistically significant, a message may be reported to a user, such as the author or an auditor of the candidate text document, so that the user has the opportunity to amend the candidate document so as to improve its adequacy for the decisional purposes in the context at hand.
    Type: Grant
    Filed: October 13, 2021
    Date of Patent: December 26, 2023
    Assignee: Cerner Innovation, Inc.
    Inventor: Douglas S. McNair
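The perplexity comparison in the abstract above can be sketched with a unigram stand-in for the topic model. The fixed tolerance factor below replaces the patent's statistical-significance test, and all probabilities are toy values:

```python
import math

def perplexity(tokens, model, floor=1e-6):
    """Per-token perplexity of `tokens` under a unigram probability model;
    unseen tokens get a small floor probability."""
    log_sum = sum(math.log(model.get(t, floor)) for t in tokens)
    return math.exp(-log_sum / len(tokens))

def flag_deviant(candidate_tokens, model, reference_ppl, tolerance=2.0):
    """Report the candidate document when its perplexity exceeds the
    reference corpus norm by more than `tolerance` (threshold assumed)."""
    return perplexity(candidate_tokens, model) > tolerance * reference_ppl
```

A candidate whose vocabulary matches the reference corpus stays below the threshold; off-topic or deficient narrative text drives perplexity up and triggers the report to the author or auditor.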
  • Patent number: 11847424
    Abstract: Devices and techniques are generally described for data-to-text generation. In various examples, a first machine learned model may receive first data including a structured representation of linguistic data. In various examples, the first machine learned model may generate first output data comprising a first natural language representation of the first data. In at least some examples, a second machine learning model may determine second data indicating that the first natural language representation is a semantically accurate representation of the first data. In some examples, the first output data may be selected for output based at least in part on the second data.
    Type: Grant
    Filed: March 20, 2020
    Date of Patent: December 19, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Hamza Harkous, Isabel Groves, Amir Reza Safari Azar Alamdari
  • Patent number: 11847857
    Abstract: A vehicle device setting method including: capturing, by an image sensing unit, a first image frame; recognizing a user ID according to the first image frame; showing ID information of the recognized user ID on a screen or by a speaker; capturing a second image frame; generating a confirm signal when a first user expression is recognized, by calculating an expression feature in the second image frame and comparing the recognized expression feature with stored expression data associated with a predetermined user expression, to confirm whether the recognized user ID is correct according to the second image frame captured after the ID information is shown; controlling an electronic device according to the confirm signal; and entering a data update mode instructed by the user and updating setting information of the electronic device with the current electronic device setting according to a saving signal generated by confirming a second user expression in a third image frame captured after the user ID is confirmed.
    Type: Grant
    Filed: November 24, 2021
    Date of Patent: December 19, 2023
    Assignee: PIXART IMAGING INC.
    Inventors: Liang-Chi Chiu, Yu-Han Chen, Ming-Tsan Kao
  • Patent number: 11843716
    Abstract: A translation method includes: a first electronic device establishes a call connection to a second electronic device and then displays a call interface; after receiving a first operation of a first user, the first electronic device switches from displaying the call interface to displaying a translation interface; when receiving a first speech of the first user in a first language, the translation interface sequentially displays at least a first text and a second text, where the first text is obtained by recognizing the first speech, and the second text is obtained by translating the first speech into a target language; and the first electronic device sends a machine speech in the target language to the second electronic device.
    Type: Grant
    Filed: June 28, 2022
    Date of Patent: December 12, 2023
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Hejin Gu, Long Wang
  • Patent number: 11830494
    Abstract: An eyewear device with camera-based compensation that improves the user experience for users having partial or complete blindness. The camera-based compensation determines features, such as objects, and then converts the determined objects to audio that is indicative of the objects and that is perceptible to the eyewear user. The camera-based compensation may use a region-based convolutional neural network (RCNN) to generate a feature map including text that is indicative of objects in images captured by a camera. The feature map is then processed through a speech to audio algorithm featuring a natural language processor to generate audio indicative of the objects in the processed images.
    Type: Grant
    Filed: December 20, 2022
    Date of Patent: November 28, 2023
    Assignee: Snap Inc.
    Inventor: Stephen Pomes
  • Patent number: 11818406
    Abstract: A network-attached storage device (NAS) includes a non-volatile memory module storing a media stream, a network interface, and control circuitry coupled to the non-volatile memory module and to the network interface and configured to connect to a client over a network connection using the network interface, receive a request for the media stream from the client, determine subtitle preferences associated with the request for the media stream, access an audio stream associated with the media stream, generate subtitles based on the audio stream, and send a transport stream to the client over the network connection, the transport stream including the media stream and the subtitles.
    Type: Grant
    Filed: July 23, 2020
    Date of Patent: November 14, 2023
    Assignee: Western Digital Technologies, Inc.
    Inventor: Ramanathan Muthiah
  • Patent number: 11798538
    Abstract: This disclosure relates to answer prediction in a speech processing system. The system may disambiguate entities spoken or implied in a request to initiate an action with respect to a target user. To initiate the action, the system may determine one or more parameters; for example, the target (e.g., a contact/recipient), a source (e.g., a caller/requesting user), and a network (voice over internet protocol (VOIP), cellular, video chat, etc.). Due to the privacy implications of initiating actions involving data transfers between parties, the system may apply a high threshold for a confidence associated with each parameter. Rather than ask multiple follow-up questions, which may frustrate the requesting user, the system may attempt to disambiguate or determine a parameter, and skip a question regarding the parameter if it can predict an answer with high confidence. The system can improve the customer experience while maintaining security for actions involving, for example, communications.
    Type: Grant
    Filed: September 21, 2020
    Date of Patent: October 24, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Christopher Geiger Parker, Piyush Bhargava, Aparna Nandyal, Rajagopalan Ranganathan, Mugunthan Govindaraju, Vidya Narasimhan
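The skip-the-question logic in the abstract above is a per-parameter confidence gate; a minimal sketch follows (the threshold value and data shapes are assumptions):

```python
CONFIDENCE_THRESHOLD = 0.9  # high bar, per the privacy rationale (value assumed)

def plan_followups(parameters):
    """Given {param: (predicted_value, confidence)}, return the parameters
    that still need a follow-up question; high-confidence predictions are
    accepted silently so the user is not asked redundant questions."""
    questions = []
    for name, (value, conf) in parameters.items():
        if conf < CONFIDENCE_THRESHOLD:
            questions.append(name)
    return questions
```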
  • Patent number: 11792338
    Abstract: An image forming system is configured to receive an input of natural language speech. Regardless of whether the natural language speech includes a combination of first words or second words, the image forming system can recognize the natural language speech as an instruction to select a specific print setting displayed on a screen.
    Type: Grant
    Filed: August 27, 2021
    Date of Patent: October 17, 2023
    Assignee: Canon Kabushiki Kaisha
    Inventors: Toru Takahashi, Yuji Naya, Takeshi Matsumura
  • Patent number: 11734509
    Abstract: Methods, systems and computer program products for multi-style text transformation are provided herein. A computer-implemented method includes selecting at least one set of style specifications for transforming at least a portion of input text. The at least one set of style specifications include one or more target writing style domains selected from a plurality of writing style domains, weights for each of the target writing style domains representing relative impact of the target writing style domains for transformation of at least a portion of the input text, and weights for each of a set of linguistic aspects for transformation of at least a portion of the input text. The computer-implemented method also includes generating one or more style-transformed output texts based at least in part on the at least one set of style specifications utilizing at least one unsupervised neural network.
    Type: Grant
    Filed: December 29, 2020
    Date of Patent: August 22, 2023
    Assignee: International Business Machines Corporation
    Inventors: Abhijit Mishra, Parag Jain, Amar P. Azad, Karthik Sankaranarayanan
  • Patent number: 11721333
    Abstract: The disclosure relates to an artificial intelligence (AI) system using a learned AI model according to at least one of machine learning, neural network, or a deep learning algorithm and applications thereof. In the disclosure, a control method of an electronic apparatus is provided. The control method comprises the steps of: displaying an image including at least one object; receiving a voice; inputting the voice to an AI model learned by an AI algorithm to identify an object related to the voice among the at least one object included in the image and acquire tag information about the identified object; and providing the acquired tag information.
    Type: Grant
    Filed: January 11, 2019
    Date of Patent: August 8, 2023
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Younghwa Lee, Jinhe Jung, Meejeong Park, Inchul Hwang
  • Patent number: 11721329
    Abstract: In the present invention, a method for searching multilingual keywords in a mixlingual speech corpus is proposed. This method is capable of searching audio as well as text keywords. The capability of audio search enables it to search out-of-vocabulary (OOV) words. The capability of searching text keywords enables it to perform semantic search. An advanced application of searching keyword translations in a mixlingual speech corpus is also possible within the posteriorgram framework with this system. Also, a technique for combining information from text and audio keywords is given, which further enhances the search performance. This system is based on multiple posteriorgrams based on articulatory classes trained with multiple languages.
    Type: Grant
    Filed: September 10, 2018
    Date of Patent: August 8, 2023
    Assignees: Indian Institute of Technology, Delhi, Centre for Development of Telematics
    Inventors: Arun Kumar, Abhimanyu Popli
  • Patent number: 11705122
    Abstract: According to one embodiment, the interface-providing apparatus comprises an identifying unit and a generating unit. The identifying unit identifies a keyword from dialogue data including a question text to request information, and a response text in reply thereto. The generating unit generates display information to display a user interface for receiving feedback input relating to a degree of usefulness of a keyword when searching for the requested information.
    Type: Grant
    Filed: August 31, 2020
    Date of Patent: July 18, 2023
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventors: Kenji Iwata, Hiroshi Fujimura, Takami Yoshida
  • Patent number: 11694028
    Abstract: According to one embodiment, the data generation apparatus includes a speech synthesis unit, a speech recognition unit, a matching processing unit, and a dataset generation unit. The speech synthesis unit generates speech data from an original text. The speech recognition unit generates a recognition text by speech recognition from the speech data. The matching processing unit performs matching between the original text and the recognition text. The dataset generation unit generates a dataset in such a manner that the speech data, from which the recognition text satisfying a certain condition for a matching degree relative to the original text is generated, is associated with the original text, based on a matching result.
    Type: Grant
    Filed: August 31, 2020
    Date of Patent: July 4, 2023
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventors: Hiroshi Fujimura, Kenji Iwata, Hui Di, Pengfei Chen
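The TTS-to-ASR round-trip filter in the abstract above can be sketched with a string-similarity matching degree. `difflib.SequenceMatcher` stands in for the patent's matching processing unit, and the threshold is an assumed value:

```python
from difflib import SequenceMatcher

def build_dataset(texts, synthesize, recognize, min_match=0.9):
    """Keep (speech, original text) pairs only when recognizing the
    synthesized speech reproduces the original text closely enough.
    `synthesize` and `recognize` are stand-ins for the TTS and ASR units."""
    dataset = []
    for text in texts:
        speech = synthesize(text)
        recognized = recognize(speech)
        match = SequenceMatcher(None, text, recognized).ratio()
        if match >= min_match:
            dataset.append((speech, text))
    return dataset
```

Pairs that fail the round trip are discarded, so the resulting dataset contains only speech the recognizer can actually map back to its source text.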
  • Patent number: 11694682
    Abstract: In various embodiments, a voice command is associated with a plurality of processing steps to be performed. The plurality of processing steps may include analysis of audio data using automatic speech recognition, generating and selecting a search query from the utterance text, and conducting a search of a database of items using a search query. The plurality of processing steps may include additional or different steps, depending on the type of the request. In performing one or more of these processing steps, an error or ambiguity may be detected. An error or ambiguity may either halt the processing step or create more than one path of actions. A model may be used to determine if and how to request additional user input to attempt to resolve the error or ambiguity. The voice-enabled device or a second client device is then caused to output a request for the additional user input.
    Type: Grant
    Filed: December 11, 2019
    Date of Patent: July 4, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Julia Reinspach, Oleg Rokhlenko, Ramakanthachary Gottumukkala, Giovanni Clemente, Ankit Agrawal, Swayam Bhardwaj, Guy Michaeli, Vaidyanathan Puthucode Krishnamoorthy, Costantino Vlachos, Nalledath P. Vinodkrishnan, Shaun M. Vickers, Sethuraman Ramachandran, Charles C. Moore
  • Patent number: 11687719
    Abstract: A method for identifying errors associated with named entity recognition includes recognizing a candidate named entity within a text and extracting a chunk from the text containing the candidate named entity. The method further includes creating a feature vector associated with the chunk and analyzing the feature vector for an indication of an error associated with the candidate named entity. The method also includes correcting the error associated with the candidate named entity.
    Type: Grant
    Filed: March 1, 2021
    Date of Patent: June 27, 2023
    Assignee: LEVERTON HOLDING LLC
    Inventors: Christian Schäfer, Michael Kieweg, Florian Kuhlmann
  • Patent number: 11682388
    Abstract: An AI apparatus includes a microphone to acquire speech data including multiple languages, and a processor to acquire text data corresponding to the speech data, determine a main language from languages included in the text data, acquire translated text data obtained by translating a text data portion in a language other than the main language into the main language, acquire a morpheme analysis result for the translated text data, extract a keyword for intention analysis from the morpheme analysis result, acquire an intention pattern matched to the keyword, and perform an operation corresponding to the intention pattern.
    Type: Grant
    Filed: June 2, 2022
    Date of Patent: June 20, 2023
    Assignee: LG ELECTRONICS INC.
    Inventors: Yejin Kim, Hyun Yu, Jonghoon Chae
  • Patent number: 11669107
    Abstract: A guidance system S includes a plurality of autonomous mobile robots (1) which guide a user to a destination, and a reception apparatus (2) which is provided separately from the robots (1) and recognizes the destination. Availability of each of the plurality of robots (1) is managed based on a state of the robot and the destination.
    Type: Grant
    Filed: December 25, 2018
    Date of Patent: June 6, 2023
    Assignee: HONDA MOTOR CO., LTD.
    Inventor: Kenichiro Sugiyama
  • Patent number: 11669695
    Abstract: A translation method, implemented by a computer, includes: converting a text written in a first language into a replacement text in which a named entity in the text is replaced with a predetermined character string; translating the replacement text into a second language by using a text translation model which is a neural network; and translating a named entity corresponding to the predetermined character string in the replacement text into the second language by using a named entity translation model which is a neural network.
    Type: Grant
    Filed: March 17, 2020
    Date of Patent: June 6, 2023
    Assignee: FUJITSU LIMITED
    Inventors: Akiba Miura, Tomoya Iwakura
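The two-model pipeline this abstract describes — placeholder substitution, separate translation of text and entity, then recombination — can be sketched as follows. The `translate_text` and `translate_entity` callables are hypothetical stand-ins for the patent's neural text-translation and named-entity-translation models.

```python
PLACEHOLDER = "<NE>"

def translate_with_entity_replacement(text, entity, translate_text, translate_entity):
    """Replace the named entity with a placeholder, translate the surrounding
    text and the entity separately, then substitute the translated entity
    back into the translated text."""
    replaced = text.replace(entity, PLACEHOLDER)        # build the replacement text
    translated_text = translate_text(replaced)          # text translation model
    translated_entity = translate_entity(entity)        # named-entity translation model
    return translated_text.replace(PLACEHOLDER, translated_entity)

# Toy stand-ins for the two neural models:
fake_text_model = lambda s: s.replace("works at", "travaille chez")
fake_entity_model = lambda e: e  # many proper names pass through unchanged

result = translate_with_entity_replacement(
    "Alice works at Fujitsu", "Fujitsu", fake_text_model, fake_entity_model)
print(result)  # Alice travaille chez Fujitsu
```

Keeping the entity out of the text model's input sidesteps the common failure mode where a rare proper noun is mistranslated or dropped by the general-purpose model.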
  • Patent number: 11646034
    Abstract: An information processing system includes: a first device configured to acquire a user's uttered voice, transfer the user's uttered voice to at least one of a second device and a third device, each implementing a voice interaction agent, and, when a control command is acquired, convert a control signal based on the acquired control command to a control signal that matches the second device and transmit the converted control signal to the second device; a second device configured to recognize the uttered voice transferred from the first device and output, to the first device, a control command regarding a recognition result obtained by recognizing the uttered voice and response data based on the control signal; and a third device configured to recognize the uttered voice transferred from the first device and output, to the first device, a control command regarding a recognition result obtained by recognizing the uttered voice.
    Type: Grant
    Filed: January 8, 2021
    Date of Patent: May 9, 2023
    Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA
    Inventor: Satoshi Aihara
  • Patent number: 11646019
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer-readable storage media, for speech recognition using attention-based sequence-to-sequence models. In some implementations, audio data indicating acoustic characteristics of an utterance is received. A sequence of feature vectors indicative of the acoustic characteristics of the utterance is generated. The sequence of feature vectors is processed using a speech recognition model that has been trained using a loss function that uses N-best lists of decoded hypotheses, the speech recognition model including an encoder, an attention module, and a decoder. The encoder and decoder each include one or more recurrent neural network layers. A sequence of output vectors representing distributions over a predetermined set of linguistic units is obtained. A transcription for the utterance is obtained based on the sequence of output vectors. Data indicating the transcription of the utterance is provided.
    Type: Grant
    Filed: July 27, 2021
    Date of Patent: May 9, 2023
    Assignee: Google LLC
    Inventors: Rohit Prakash Prabhavalkar, Tara N. Sainath, Yonghui Wu, Patrick An Phu Nguyen, Zhifeng Chen, Chung-Cheng Chiu, Anjuli Patricia Kannan
  • Patent number: 11636673
    Abstract: A system enhances existing audio-visual content with audio describing the setting of the visual content. A scene annotation module classifies scene elements from an image frame received from a host system and generates a caption describing the scene elements.
    Type: Grant
    Filed: October 31, 2018
    Date of Patent: April 25, 2023
    Assignee: SONY INTERACTIVE ENTERTAINMENT INC.
    Inventors: Sudha Krishnamurthy, Justice Adams, Arindam Jati, Masanori Omote, Jian Zheng
  • Patent number: 11630961
    Abstract: A device includes a memory adapted to store a list, in a file or database, comprising a plurality of vocabulary words in a first language and, for each vocabulary word, a corresponding word in a second language; a display device; and a processor. The processor is adapted to receive a plurality of words in the first language, select one or more words among the plurality of words based on one or more predetermined criteria, translate, match or equate the one or more selected words from the first language to words of the second language, and cause the display device to display the plurality of words, wherein one or more first words that are in the plurality of words but not among the one or more selected words are displayed in the first language, and one or more second words that are in the plurality of words and are among the one or more selected words are displayed in the second language.
    Type: Grant
    Filed: September 14, 2018
    Date of Patent: April 18, 2023
    Inventor: Robert F. Deming, Jr.
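The mixed-language display logic this abstract describes reduces to substituting stored second-language equivalents for the selected words. A minimal sketch, assuming the selection criterion is simply membership in the stored vocabulary list (the patent's actual criteria are unspecified here):

```python
# Hypothetical stored vocabulary list: first-language word -> second-language word.
vocabulary = {"house": "casa", "dog": "perro"}

def render_mixed(words, vocabulary):
    """Display selected vocabulary words in the second language and all
    other words in the first language."""
    return [vocabulary.get(w, w) for w in words]

print(render_mixed(["the", "dog", "ran", "home"], vocabulary))
# ['the', 'perro', 'ran', 'home']
```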
  • Patent number: 11626101
    Abstract: Systems and methods are described for processing and interpreting audible commands spoken in one or more languages. Speech recognition systems disclosed herein may be used as a stand-alone speech recognition system or comprise a portion of another content consumption system. A requesting user may provide audio input (e.g., command data) to the speech recognition system via a computing device to request an entertainment system to perform one or more operational commands. The speech recognition system may analyze the audio input across a variety of linguistic models, and may parse the audio input to identify a plurality of phrases and corresponding action classifiers. In some embodiments, the speech recognition system may utilize the action classifiers and other information to determine the one or more identified phrases that appropriately match the desired intent and operational command associated with the user's spoken command.
    Type: Grant
    Filed: October 28, 2021
    Date of Patent: April 11, 2023
    Assignee: Comcast Cable Communications, LLC
    Inventors: George Thomas Des Jardins, Vikrant Sagar
  • Patent number: 11625542
    Abstract: A co-user list may be configured based on user interaction in a virtual world environment. A first user may be enabled to navigate the virtual world environment using an instant messenger application that includes the co-user list. A second user that is located proximate to the first user in the virtual world environment may be detected. An attribute associated with the second user may be determined. The co-user list may be configured based on the attribute associated with the second user.
    Type: Grant
    Filed: February 24, 2021
    Date of Patent: April 11, 2023
    Assignee: Verizon Patent and Licensing Inc.
    Inventor: David S. Bill
  • Patent number: 11610588
    Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods for generating contextually relevant transcripts of voice recordings based on social networking data. For instance, the disclosed systems receive a voice recording from a user corresponding to a message thread including the user and one or more co-users. The disclosed systems analyze acoustic features of the voice recording to generate transcription-text probabilities. The disclosed systems generate term weights for terms corresponding to objects associated with the user within a social networking system by analyzing user social networking data. Using the contextually aware term weights, the disclosed systems adjust the transcription-text probabilities. Based on the adjusted transcription-text probabilities, the disclosed systems generate a transcript of the voice recording for display within the message thread.
    Type: Grant
    Filed: October 28, 2019
    Date of Patent: March 21, 2023
    Assignee: Meta Platforms, Inc.
    Inventors: James Matthew Grichnik, Chetan Parag Gupta, Fuchun Peng, Yinan Zhang, Si Chen
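The core adjustment this abstract describes — scaling transcription-text probabilities by contextually aware term weights — can be sketched as follows. The candidate probabilities and term weights are illustrative; the patent derives them from acoustic features and social-networking data respectively.

```python
def adjust_probabilities(text_probs, term_weights):
    """Scale each candidate transcription's probability by the weight of
    its term, then renormalize so the results form a distribution."""
    scaled = {t: p * term_weights.get(t, 1.0) for t, p in text_probs.items()}
    total = sum(scaled.values())
    return {t: p / total for t, p in scaled.items()}

# Acoustically, "Maui" and "Mali" may be near-indistinguishable; a user whose
# social data mentions Hawaii gets a higher weight for "Maui".
probs = adjust_probabilities({"Maui": 0.5, "Mali": 0.5}, {"Maui": 3.0})
print(probs)  # {'Maui': 0.75, 'Mali': 0.25}
```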
  • Patent number: 11594226
    Abstract: An embodiment includes converting an original audio signal to an original text string, the original audio signal being from a recording of the original text string spoken by a specific person in a source language. The embodiment generates a translated text string by translating the original text string from the source language to a target language, including translation of a word from the source language to a target language. The embodiment assembles a standard phoneme sequence from a set of standard phonemes, where the standard phoneme sequence includes a standard pronunciation of the translated word. The embodiment also associates a custom phoneme with a standard phoneme of the standard phoneme sequence, where the custom phoneme includes the specific person's pronunciation of a sound in the translated word. The embodiment synthesizes the translated text string to a translated audio signal including the translated word pronounced using the custom phoneme.
    Type: Grant
    Filed: December 22, 2020
    Date of Patent: February 28, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Su Liu, Yang Liang, Debbie Anglin, Fan Yang
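The substitution step this abstract describes — swapping a speaker's custom phoneme in for the standard one before synthesis — can be sketched as follows. The phoneme symbols and the custom mapping are illustrative assumptions.

```python
def apply_custom_phonemes(standard_sequence, custom_map):
    """Replace each standard phoneme that has a recorded speaker-specific
    pronunciation with the associated custom phoneme."""
    return [custom_map.get(p, p) for p in standard_sequence]

# Standard pronunciation of a translated word, with the speaker's own
# realization of "R" substituted in before audio synthesis:
sequence = apply_custom_phonemes(["B", "OH", "N", "ZH", "UH", "R"], {"R": "R_custom"})
print(sequence)  # ['B', 'OH', 'N', 'ZH', 'UH', 'R_custom']
```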
  • Patent number: 11586831
    Abstract: Provided are a speech translation method and apparatus, an electronic device and a storage medium. The method includes: acquiring a source speech corresponding to a to-be-translated language; acquiring a specified target language; inputting the source speech and indication information matched with the target language into a pre-trained speech translation model, where the speech translation model is configured to translate a language in a first language set into a language in a second language set, the first language set includes a plurality of languages, the first language set includes the to-be-translated language, the second language set includes a plurality of languages, and the second language set includes the target language; and acquiring a translated speech corresponding to the target language and output by the speech translation model; where the to-be-translated language is different from the target language.
    Type: Grant
    Filed: February 26, 2021
    Date of Patent: February 21, 2023
    Assignee: BEIJING BYTEDANCE NETWORK TECHNOLOGY CO., LTD.
    Inventors: Mingxuan Wang, Qianqian Dong, Lei Li
  • Patent number: 11574633
    Abstract: Enhanced graphical user interfaces for transcription of audio and video messages is disclosed. Audio data may be transcribed, and the transcription may include emphasized words and/or punctuation corresponding to emphasis of user speech. Additionally, the transcription may be translated into a second language. A message spoken by a user depicted in one or more images of video data may also be transcribed and provided to one or more devices.
    Type: Grant
    Filed: September 27, 2019
    Date of Patent: February 7, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Sandra Lemon, Nancy Yi Liang
  • Patent number: 11568858
    Abstract: A computer-implemented method of building a multilingual acoustic model for automatic speech recognition in a low-resource setting includes training a multilingual network on a set of training languages with original transcribed training data to create a baseline multilingual acoustic model. Transliteration of transcribed training data is performed by processing a plurality of multilingual data types from the set of languages through the multilingual network and outputting a pool of transliterated data. A filtering metric is applied to the pool of transliterated data to select one or more portions of the transliterated data for retraining of the acoustic model. Data augmentation is performed by adding one or more selected portions of the transliterated data back to the original transcribed training data to update the training data. A new multilingual acoustic model is then trained through the multilingual network using the updated training data.
    Type: Grant
    Filed: October 17, 2020
    Date of Patent: January 31, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Samuel Thomas, Kartik Audhkhasi, Brian E. D. Kingsbury
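The filter-and-augment loop this abstract describes can be sketched as follows. The per-utterance confidence score standing in for the patent's filtering metric, and the threshold value, are illustrative assumptions.

```python
def augment_training_data(original_data, transliterated_pool, threshold):
    """Keep transliterated utterances whose score passes the filtering
    metric and add them back to the original transcribed training data."""
    selected = [text for text, score in transliterated_pool
                if score >= threshold]  # apply the filtering metric
    return original_data + selected     # updated training data for retraining

pool = [("utt_a", 0.92), ("utt_b", 0.41), ("utt_c", 0.87)]
updated = augment_training_data(["orig_1", "orig_2"], pool, threshold=0.8)
print(updated)  # ['orig_1', 'orig_2', 'utt_a', 'utt_c']
```

The retrained model can regenerate and rescore the pool, so the select/augment/retrain cycle can be iterated until the metric stops admitting new data.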
  • Patent number: 11551692
    Abstract: In one aspect, a server receives, from a client terminal via a network, a request to initiate a verbal conversation using natural language in a spoken or textual format, extracts information during the verbal conversation, determines a context of the verbal conversation, receives an inquiry during the verbal conversation, processes the inquiry, acquires response information based on a determined appropriate response, and transmits the response information to the client terminal.
    Type: Grant
    Filed: July 19, 2021
    Date of Patent: January 10, 2023
    Assignee: FIRST ADVANTAGE CORPORATION
    Inventors: Arun N. Kumar, Stefano Malnati
  • Patent number: 11532306
    Abstract: Systems and processes for operating an intelligent automated assistant are provided. In accordance with one example, a method includes, at an electronic device with one or more processors, memory, and a plurality of microphones, sampling, at each of the plurality of microphones of the electronic device, an audio signal to obtain a plurality of audio signals; processing the plurality of audio signals to obtain a plurality of audio streams; and determining, based on the plurality of audio streams, whether any of the plurality of audio signals corresponds to a spoken trigger. The method further includes, in accordance with a determination that the plurality of audio signals corresponds to the spoken trigger, initiating a session of the digital assistant; and in accordance with a determination that the plurality of audio signals does not correspond to the spoken trigger, foregoing initiating a session of the digital assistant.
    Type: Grant
    Filed: December 3, 2020
    Date of Patent: December 20, 2022
    Assignee: Apple Inc.
    Inventors: Yoon Kim, John Bridle, Joshua D. Atkins, Feipeng Li, Mehrez Souden
  • Patent number: 11514899
    Abstract: A method and apparatus for correcting a wrongly-translated word in a device employing speech recognition is provided herein. During operation, a device will use a second language to correct a wrongly-translated word that was wrongly translated using a first language. More particularly, after speech recognition is performed using the first language, when a user selects text to be corrected, the user will utter the speech again using the second language that differs from the first language. Both the first and the second language can be used by the device to determine a best translation of the speech.
    Type: Grant
    Filed: January 21, 2020
    Date of Patent: November 29, 2022
    Assignee: MOTOROLA SOLUTIONS, INC.
    Inventors: Yen Hsiang Chew, Chin Chuan Yap, Syed Mohamed Mooulana Jamaluddin, Lee Sun Ooi
  • Patent number: 11513371
    Abstract: In one embodiment, an eyeglass frame includes a lens holder, a first temple with a first end close to the lens holder and a second end, a second temple, an electrical connector and a printed circuit board. The printed circuit board with at least one electrical component can be in the first temple. The connector can be close to the first end of the first temple, facing downward, and electrically connected to the at least one electrical component. In another embodiment, an eyeglass frame includes a first printed circuit board, with at least one electrical component. The first printed circuit board can be connected to an electrical component at the frame via a second printed circuit board.
    Type: Grant
    Filed: June 3, 2019
    Date of Patent: November 29, 2022
    Assignee: IngenioSpec, LLC
    Inventors: Thomas A. Howell, David Chao, C. Douglass Thomas, Peter P. Tong
  • Patent number: 11508364
    Abstract: An artificial intelligence (AI) system is provided. The AI system simulates functions of the human brain, such as recognition and judgment, by utilizing a machine learning algorithm such as deep learning; an application of the AI system is also provided. A method, performed by an electronic device, of outputting a response to a speech input by using an application includes receiving the speech input, obtaining text corresponding to the speech input by performing speech recognition on the speech input, obtaining metadata for the speech input based on the obtained text, selecting at least one application from among a plurality of applications for outputting the response to the speech input based on the metadata, and outputting the response to the speech input by using the selected at least one application.
    Type: Grant
    Filed: May 21, 2019
    Date of Patent: November 22, 2022
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Cheenepalli Srirama Krishna Bhargava, Ankush Gupta