Translation Patents (Class 704/277)
-
Patent number: 12254868
Abstract: A system and method for providing a custom response to a voice command of a specific user. The method encompasses receiving, at a transceiver unit [102] from a user device, a custom voice response preference setting associated with the specific user. The method thereafter leads to receiving, at the transceiver unit [102] from a first target device, a voice command of the specific user. The method thereafter encompasses generating, by a processing unit [104], a custom response to the voice command of the specific user based at least on the custom voice response preference setting. Further, the method encompasses identifying, by an identification unit [106], a second target device from one or more devices present in the vicinity of the specific user. Thereafter, the method comprises providing, by the processing unit [104], the generated custom response to the voice command of the specific user via the second target device.
Type: Grant
Filed: June 1, 2021
Date of Patent: March 18, 2025
Assignee: JIO PLATFORMS LIMITED
Inventors: Vishal Shashikant Patil, Gulprit Singh, Rajeev Gupta
-
Patent number: 12230260
Abstract: One embodiment provides a method, including: receiving, at an information handling device, text associated with a user command; storing, in a data store, an encrypted form of the text associated with the user command; determining, using a processor, whether the encrypted form of the text has been detected in other user commands in exceedance of a predetermined threshold; and storing, responsive to determining that the encrypted form of the text has been detected in the other user commands in exceedance of the predetermined threshold, an unencrypted transcript of the text in a data table. Other aspects are described and claimed.
Type: Grant
Filed: March 5, 2021
Date of Patent: February 18, 2025
Assignee: Lenovo (Singapore) Pte. Ltd.
Inventors: John Weldon Nicholson, Igor Stolbikov, David Alexander Schwarz
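The claimed flow could be sketched roughly as follows. The threshold value and the use of SHA-256 as the "encrypted form" are assumptions for illustration only; the patent does not specify either.

```python
import hashlib
from collections import Counter

# Hypothetical sketch: commands are stored only as hashes ("encrypted
# form") until the same hash exceeds a threshold count, after which the
# plaintext transcript is kept in a data table.
THRESHOLD = 3  # assumed value; the abstract leaves it unspecified

hash_counts = Counter()
transcript_table = {}

def encrypt(text):
    # Stand-in for the encrypted form; a keyed scheme would be closer
    # to the claim, but SHA-256 keeps the sketch self-contained.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def record_command(text):
    digest = encrypt(text)
    hash_counts[digest] += 1
    # Only once the encrypted form has been seen in exceedance of the
    # threshold is the unencrypted transcript stored.
    if hash_counts[digest] > THRESHOLD and digest not in transcript_table:
        transcript_table[digest] = text
    return digest
```

A frequent command ("turn on the lights", say) would survive in plaintext only after its fourth occurrence under this threshold.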
-
Patent number: 12198700
Abstract: In one aspect, an example method includes (i) obtaining media, wherein the obtained media includes (a) audio representing speech and (b) video; (ii) using at least the audio representing speech as a basis to generate speech text; (iii) using at least the audio representing speech to determine starting and ending time points of the speech; and (iv) using at least the generated speech text and the determined starting and ending time points of the speech to (a) generate closed-captioning or subtitle data that includes closed-captioning or subtitle text based on the generated speech text and (b) associate the generated closed-captioning or subtitle data with the obtained media, such that the closed-captioning or subtitle text is time-aligned with the video based on the determined starting and ending time points of the speech.
Type: Grant
Filed: June 2, 2023
Date of Patent: January 14, 2025
Assignee: Roku, Inc.
Inventors: Snehal Karia, Greg Garner, Sunil Ramesh
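The time-alignment step could be sketched as follows, pairing recognized text with detected start and end times to emit WebVTT-style cues. The recognizer and the speech-interval detector are assumed to exist upstream; the cue format is an illustrative choice, not named in the abstract.

```python
# Sketch: turn (start, end, text) speech segments into time-aligned
# subtitle cues so the text lines up with the video.
def make_cues(segments):
    """segments: list of (start_sec, end_sec, text) tuples."""
    def ts(t):
        # Format seconds as HH:MM:SS.mmm, WebVTT-style.
        h, rem = divmod(int(t), 3600)
        m, s = divmod(rem, 60)
        return f"{h:02d}:{m:02d}:{s:02d}.{int((t % 1) * 1000):03d}"
    return [f"{ts(a)} --> {ts(b)}\n{text}" for a, b, text in segments]
```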
-
Patent number: 12197904
Abstract: An industrial integrated development environment (IDE) supports collaborative editing of translation tables used to facilitate rendering of the system project text in different defined languages. Rather than merging edits by comparing text representations of the edited translation tables on a line-by-line basis, the system expresses the edited and base versions of the translation tables as information models that represent the translation table versions as hierarchical organizations of nodes representing content of the tables, and compares corresponding nodes of the information models to obtain differential statuses for the nodes. The various versions of the nodes are then merged into a single consistent model based on the differential statuses of the nodes.
Type: Grant
Filed: September 23, 2022
Date of Patent: January 14, 2025
Assignee: ROCKWELL AUTOMATION TECHNOLOGIES, INC.
Inventors: Valerio Guarnieri, Alessandro Menon
-
Patent number: 12197882
Abstract: A translation method, an electronic device and a storage medium, which relate to the field of artificial intelligence technologies, such as machine learning and information processing technologies, are disclosed. An implementation includes: acquiring an intermediate translation result generated by each of multiple pre-trained translation models for a to-be-translated specified sentence in a same iteration of a translation process, so as to obtain multiple intermediate translation results; acquiring a co-occurrence word based on the multiple intermediate translation results; and acquiring a target translation result of the specified sentence based on the co-occurrence word.
Type: Grant
Filed: August 10, 2022
Date of Patent: January 14, 2025
Assignee: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
Inventors: Ruiqing Zhang, Xiyang Wang, Zhongjun He, Zhi Li, Hua Wu
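The co-occurrence idea could be sketched as below: words appearing in every model's intermediate result for the same iteration are treated as reliable anchors. How those anchors then steer the target translation is not detailed in the abstract, so only the extraction step is shown.

```python
# Sketch: find words common to all models' intermediate translation
# results for the same iteration.
def co_occurrence_words(intermediate_results):
    """intermediate_results: list of token lists, one per model."""
    if not intermediate_results:
        return set()
    common = set(intermediate_results[0])
    for tokens in intermediate_results[1:]:
        common &= set(tokens)
    return common
```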
-
Patent number: 12190876
Abstract: A display device according to an embodiment of the present invention may include: a display unit which displays a content image; a microphone which receives a voice command of a user; a network interface unit for communicating with a natural language processing server and a search server; and a control unit which transmits the received voice command to the natural language processing server, receives intent analysis result information that indicates the intent of the user, which corresponds to the voice command, from the natural language processing server, and performs a function of the display device according to the received intent analysis result information.
Type: Grant
Filed: September 27, 2019
Date of Patent: January 7, 2025
Assignee: LG ELECTRONICS INC.
Inventors: Sangseok Lee, Jaekyung Lee
-
Patent number: 12118981
Abstract: Implementations relate to determining multilingual content to render at an interface in response to a user submitted query. Those implementations further relate to determining a first language response and a second language response to a query that is submitted to an automated assistant. Some of those implementations relate to determining multilingual content that includes a response to the query in both the first and second languages. Other implementations relate to determining multilingual content that includes a query suggestion in the first language and a query suggestion in a second language. Some of those implementations relate to pre-fetching results for the query suggestions prior to rendering the multilingual content.
Type: Grant
Filed: September 15, 2021
Date of Patent: October 15, 2024
Assignee: GOOGLE LLC
Inventors: Wangqing Yuan, Bryan Christopher Horling, David Kogan
-
Patent number: 12112134
Abstract: The technology relates to methods for detecting and classifying emotions in textual communication, and using this information to suggest graphical indicia such as emoji, stickers or GIFs to a user. Two main types of models are fully supervised models and few-shot models. In addition to fully supervised and few-shot models, other types of models focusing on the back-end (server) side or client (on-device) side may also be employed. Server-side models are larger-scale models that can enable higher degrees of accuracy, such as for use cases where models can be hosted on cloud servers where computational and storage resources are relatively abundant. On-device models are smaller-scale models, which enable use on resource-constrained devices such as mobile phones, smart watches or other wearables (e.g., head mounted displays), in-home devices, embedded devices, etc.
Type: Grant
Filed: January 24, 2022
Date of Patent: October 8, 2024
Assignee: GOOGLE LLC
Inventors: Dana Movshovitz-Attias, John Patrick McGregor, Jr., Gaurav Nemade, Sujith Ravi, Jeongwoo Ko, Dora Demszky
-
Patent number: 12094459
Abstract: Methods, systems, and computer program products for automated domain-specific constrained decoding from speech inputs to structured resources are provided herein.
Type: Grant
Filed: January 5, 2022
Date of Patent: September 17, 2024
Assignee: International Business Machines Corporation
Inventors: Ashish R Mittal, Samarth Bharadwaj, Shreya Khare, Karthik Sankaranarayanan
-
Patent number: 12087296
Abstract: A display device according to an embodiment of the present disclosure includes an output unit, a communication unit configured to perform communication with an artificial intelligence server, and a control unit configured to receive a voice command, convert the received voice command into text data, determine whether the converted text data is composed of a plurality of languages, when the text data is composed of the plurality of languages, determine a language for a voice recognition service among the plurality of languages based on the text data, and output an intent analysis result of the voice command in the determined language.
Type: Grant
Filed: September 19, 2019
Date of Patent: September 10, 2024
Assignee: LG ELECTRONICS INC.
Inventors: Changmin Kwak, Jaekyung Lee
-
Patent number: 12079540
Abstract: An electronic device configured to: perform an operation corresponding to a first user voice when the first user voice is received through the microphone; store information about an operation corresponding to the first user voice and user reaction information including the user command when a user command is received through the input unit within the first threshold time from when the first user voice is received, or from when the operation corresponding to the first user voice is performed; perform an operation corresponding to a second user voice when the second user voice is received through the microphone; and provide guide information corresponding to the user command on the basis of the user reaction information stored in the memory when the type of operation corresponding to the first user voice is the same as the type of operation corresponding to the second user voice.
Type: Grant
Filed: October 12, 2021
Date of Patent: September 3, 2024
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Youngsoo Yun
-
Patent number: 12080296
Abstract: Apparatuses, methods, and program products are disclosed for performing a transcription action. One apparatus includes at least one processor and a memory that stores code executable by the at least one processor. The code is executable by the processor to monitor, by use of the at least one processor, a quality of audio information. The code is executable by the processor to determine whether the quality of the audio information is below a predetermined threshold. The code is executable by the processor to, in response to determining that the quality of the audio information is below the predetermined threshold, perform a transcription action corresponding to the audio information.
Type: Grant
Filed: March 16, 2021
Date of Patent: September 3, 2024
Assignee: Lenovo (Singapore) Pte. Ltd.
Inventors: John C. Mese, Arnold S. Weksler, Mark Patrick Delaney, Nathan J. Peterson, Russell Speight VanBlon
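The monitor-and-threshold logic could be sketched as below. The quality metric (mean absolute amplitude as a crude proxy) and the threshold value are assumptions; the abstract names neither the metric nor the specific transcription action.

```python
# Sketch: monitor an audio-quality metric and trigger a transcription
# action when it falls below a predetermined threshold.
def monitor_quality(samples, threshold=0.1):
    """samples: list of normalized audio amplitudes in [-1, 1]."""
    # Crude stand-in metric: very quiet audio is treated as low quality.
    quality = sum(abs(s) for s in samples) / len(samples)
    if quality < threshold:
        return "perform_transcription_action"
    return "ok"
```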
-
Patent number: 12080271
Abstract: Computer generated speech can be generated for cross-lingual natural language textual data streams by utilizing a universal phoneme set. In a variety of implementations, the natural language textual data stream includes a primary language portion in a primary language and a secondary language portion that is not in the primary language. Phonemes corresponding to the secondary language portion can be determined from a set of phonemes in a universal data set. These phonemes can be mapped back to a set of phonemes for the primary language. Audio data can be generated for these phonemes to pronounce the secondary language portion of the natural language textual data stream utilizing phonemes associated with the primary language.
Type: Grant
Filed: August 26, 2022
Date of Patent: September 3, 2024
Assignee: GOOGLE LLC
Inventors: Ami Patel, Siamak Tazari
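The mapping step could be sketched as below: secondary-language phonemes drawn from a universal set are mapped back to the primary language's inventory before audio generation. The table entries are toy assumptions (a few French-to-English substitutions); the patent's universal data set would be far larger.

```python
# Sketch: map phonemes from a universal set back to the primary
# language's phoneme inventory, passing shared phonemes through.
UNIVERSAL_TO_PRIMARY = {
    "ʁ": "r",  # French uvular r -> closest English r (assumed mapping)
    "y": "u",  # French /y/ -> English /u/ (assumed mapping)
}

def map_to_primary(secondary_phonemes):
    # Phonemes already in the primary inventory map to themselves.
    return [UNIVERSAL_TO_PRIMARY.get(p, p) for p in secondary_phonemes]
```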
-
Patent number: 12061873
Abstract: Techniques performed by a data processing system for analyzing the lexical difficulty of words of textual content include analyzing a plurality of textual content sources to determine a first frequency at which each of a plurality of first words appears, analyzing search data to determine a second frequency at which each of the plurality of first words appears in searches for a definition, generating a lexical difficulty model based on the first frequency and the second frequency, the model is configured to receive a word as an input and to output a prediction for how difficult the word is likely to be for a user, receiving a request to analyze first textual content from a client device, analyzing the first textual content using the lexical difficulty model to generate lexical difficulty information, and sending a response to the client device that includes requested information.
Type: Grant
Filed: October 30, 2020
Date of Patent: August 13, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventors: Priyanka Subhash Kulkarni, Robert Rounthwaite
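The two signals the abstract combines, corpus frequency and definition-search frequency, could be sketched as a single score. The specific combination below is an assumption for illustration; the patent trains a model on these features rather than using a fixed formula.

```python
import math

# Sketch: rare words and frequently-looked-up words score as harder.
def difficulty_score(word, corpus_counts, definition_search_counts):
    """corpus_counts / definition_search_counts: word -> count dicts."""
    total = sum(corpus_counts.values()) or 1
    freq = corpus_counts.get(word, 0) / total       # first frequency
    searches = definition_search_counts.get(word, 0)  # second frequency
    # Assumed combination: low corpus frequency and many definition
    # searches both push the score up.
    return math.log1p(searches) - math.log(freq + 1e-9)
```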
-
Patent number: 12057110
Abstract: An information processing method applied to a computation circuit is disclosed. The computation circuit includes a communication circuit and an operation circuit. The method includes controlling, by the computation circuit, the communication circuit to obtain a voice to be identified input by a user; controlling, by the computation circuit, the operation circuit to obtain and call an operation instruction to perform voice identification processing on the voice to be identified to obtain target text information corresponding to the voice to be identified. The operation instruction is a preset instruction for voice identification.
Type: Grant
Filed: December 11, 2020
Date of Patent: August 6, 2024
Assignee: SHANGHAI CAMBRICON INFORMATION TECHNOLOGY CO., LTD.
Inventors: Tianshi Chen, Shaoli Liu, Zai Wang, Shuai Hu
-
Patent number: 11990117
Abstract: A method for training a speech recognition model includes obtaining a multilingual text-to-speech (TTS) model. The method also includes generating a native synthesized speech representation for an input text sequence in a first language that is conditioned on speaker characteristics of a native speaker of the first language. The method also includes generating a cross-lingual synthesized speech representation for the input text sequence in the first language that is conditioned on speaker characteristics of a native speaker of a different second language. The method also includes generating a first speech recognition result for the native synthesized speech representation and a second speech recognition result for the cross-lingual synthesized speech representation. The method also includes determining a consistent loss term based on the first speech recognition result and the second speech recognition result and updating parameters of the speech recognition model based on the consistent loss term.
Type: Grant
Filed: October 20, 2021
Date of Patent: May 21, 2024
Assignee: Google LLC
Inventors: Zhehuai Chen, Bhuvana Ramabhadran, Andrew Rosenberg, Yu Zhang, Pedro J. Moreno Mengibar
-
Patent number: 11984126
Abstract: A device for recognizing a speech input and an operating method thereof are provided. The device may be configured to: obtain one or more text candidates comprising a character string in which it is predicted that the speech input is to be converted by recognizing a speech input using an automatic speech recognition (ASR) model; extract text history information corresponding to the speech input from a database by comparing the speech input with a plurality of speech signals previously stored in the database; and perform training to adjust a weight of each of the one or more text candidates using the extracted text history information. Also, a method in which the device recognizes a speech input using an AI model may be performed.
Type: Grant
Filed: August 10, 2021
Date of Patent: May 14, 2024
Assignee: Samsung Electronics Co., Ltd.
Inventors: Hyungtak Choi, Jongyoub Ryu
-
Patent number: 11977852
Abstract: A device configured to receive a sentence that includes a plurality of words. The device is further configured to input the words into a machine learning model that is configured to output a first feature vector based on the words. The device is further configured to identify a keyword within the sentence and to determine that the keyword is an implicit reference to an item. The device is further configured to identify a second feature vector in a reference list that closest matches a numeric value of the first feature vector and to identify an explicit reference in the reference list that is associated with the second feature vector. The device is further configured to replace the keyword with the explicit reference in the sentence and to output the sentence that includes the first explicit reference.
Type: Grant
Filed: January 12, 2022
Date of Patent: May 7, 2024
Assignee: Bank of America Corporation
Inventors: Aaron Michael Hosford, Donatus E. Asumu, Emad Noorizadeh, Ramakrishna Reddy Yannam
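The match-and-replace step could be sketched as below: find the reference-list vector closest to the sentence's feature vector, then swap the implicit keyword for the associated explicit reference. The toy two-dimensional vectors and squared-distance match are assumptions; the patent obtains feature vectors from a trained machine learning model.

```python
# Sketch: resolve an implicit keyword to an explicit reference via
# nearest-vector lookup in a reference list.
def closest_reference(feature_vector, reference_list):
    """reference_list: list of (vector, explicit_reference) pairs."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(reference_list, key=lambda r: dist(r[0], feature_vector))[1]

def resolve_implicit(sentence, keyword, feature_vector, reference_list):
    explicit = closest_reference(feature_vector, reference_list)
    # Replace the implicit keyword with the matched explicit reference.
    return sentence.replace(keyword, explicit)
```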
-
Patent number: 11978541
Abstract: Medical information is communicated between different entities. Personalized models of peoples' understanding of a medical field are created. Role information is used to assign user-appropriate ontologies. More information than mere role may be used for assigning ontologies, such as information on past medical history. The concepts and relationships in different ontologies may be linked, providing a translation from one personalized model to another. The terminology with similar or the same concepts and/or relationships is output for a given user based on their model.
Type: Grant
Filed: August 9, 2019
Date of Patent: May 7, 2024
Assignee: CERNER INNOVATION, INC.
Inventor: John D. Haley
-
Patent number: 11972227
Abstract: A speech translation system and methods for cross-lingual communication that enable users to easily improve and customize content and usage of the system. The methods include, in response to receiving an utterance including a first term associated with a field, translating the utterance into a second language. In response to receiving an indication to add the first term associated with the field to a first recognition lexicon, adding the first term associated with the field and the determined translation to a first machine translation module and to a shared database for a community associated with the field of the first term associated with the field, wherein the first term associated with the field added to the shared database is accessible by the community.
Type: Grant
Filed: December 7, 2021
Date of Patent: April 30, 2024
Assignee: Meta Platforms, Inc.
Inventors: Alexander Waibel, Ian R. Lane
-
Patent number: 11967248
Abstract: A method for foreign language learning between a learner and a terminal, based on video or audio containing a foreign language; in particular, a conversation-based foreign language learning method using a speech recognition function and a TTS function of a terminal. The learner learns a foreign language in the following way: the terminal reads the current learning target sentence to the learner so that the learner can speak it after the terminal, when speech input by the learner in a speech waiting state of the terminal is the same as the current learning target sentence or belongs to the same category as the current learning target sentence; and the terminal and the learner alternately speak sentences one by one when the speech input by the learner is the same as the next sentence after the current learning target sentence or belongs to the same category as that next sentence.
Type: Grant
Filed: December 12, 2019
Date of Patent: April 23, 2024
Inventor: Jangho Lee
-
Patent number: 11922932
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer-readable storage media, for speech recognition using attention-based sequence-to-sequence models. In some implementations, audio data indicating acoustic characteristics of an utterance is received. A sequence of feature vectors indicative of the acoustic characteristics of the utterance is generated. The sequence of feature vectors is processed using a speech recognition model that has been trained using a loss function that uses a set of speech recognition hypothesis samples, the speech recognition model including an encoder, an attention module, and a decoder. The encoder and decoder each include one or more recurrent neural network layers. A sequence of output vectors representing distributions over a predetermined set of linguistic units is obtained. A transcription for the utterance is obtained based on the sequence of output vectors. Data indicating the transcription of the utterance is provided.
Type: Grant
Filed: March 31, 2023
Date of Patent: March 5, 2024
Assignee: Google LLC
Inventors: Rohit Prakash Prabhavalkar, Tara N. Sainath, Yonghui Wu, Patrick An Phu Nguyen, Zhifeng Chen, Chung-Cheng Chiu, Anjuli Patricia Kannan
-
Patent number: 11908451
Abstract: A text-based virtual object animation generation method includes acquiring text information, where the text information includes an original text of a virtual object animation to be generated; analyzing an emotional feature of the text information; performing speech synthesis according to the emotional feature, a rhyme boundary, and the text information to obtain audio information, where the audio information includes emotional speech obtained by conversion based on the original text; and generating a corresponding virtual object animation based on the text information and the audio information, where the virtual object animation is synchronized in time with the audio information.
Type: Grant
Filed: August 9, 2021
Date of Patent: February 20, 2024
Assignees: Mofa (Shanghai) Information Technology Co., Ltd., Shanghai Movu Technology Co., Ltd.
Inventors: Congyi Wang, Yu Chen, Jinxiang Chai
-
Patent number: 11893359
Abstract: This application discloses an audio processing method and a terminal. The method may include: collecting, by a first terminal, an original speech of a first user, translating the original speech of the first user into a translated speech of the first user, receiving an original speech of a second user that is sent by a second terminal, and translating the original speech of the second user into a translated speech of the second user; sending at least one of the original speech of the first user, the translated speech of the first user, and the translated speech of the second user to the second terminal based on a first setting; and playing at least one of the original speech of the second user, the translated speech of the second user, and the translated speech of the first user based on a second setting.
Type: Grant
Filed: April 14, 2021
Date of Patent: February 6, 2024
Assignee: Huawei Technologies Co., Ltd.
Inventors: Xin Zhang, Gan Zhao
-
Patent number: 11868732
Abstract: This disclosure describes techniques and architectures for evaluating conversations. In some instances, conversations with users, virtual assistants, and others may be analyzed to identify potential risks within a language model that is employed by the virtual assistants and other entities. The potential risks may be evaluated by administrators, users, systems, and others to identify potential issues with the language model that need to be addressed. This may allow the language model to be improved and enhance user experience with the virtual assistants and others that employ the language model.
Type: Grant
Filed: August 8, 2022
Date of Patent: January 9, 2024
Assignee: Verint Americas Inc.
Inventors: Cynthia Freeman, Ian Beaver
-
Patent number: 11853707
Abstract: Technologies are provided for determining deficiencies in narrative textual data that may impact decision-making in a decisional context. A candidate text document and a reference corpus of text may be utilized to generate one or more topic models and document-term matrices, and then to determine a corresponding statistical perplexity and probabilistic coherence. Statistical determinations of the degree to which the candidate deviates from the reference normative corpus are made, in terms of the statistical perplexity and probabilistic coherence of the candidate as compared to the reference. If the difference is statistically significant, a message may be reported to a user, such as the author or an auditor of the candidate text document, so that the user has the opportunity to amend the candidate document so as to improve its adequacy for the decisional purposes in the context at hand.
Type: Grant
Filed: October 13, 2021
Date of Patent: December 26, 2023
Assignee: Cerner Innovation, Inc.
Inventor: Douglas S. McNair
-
Patent number: 11847424
Abstract: Devices and techniques are generally described for data-to-text generation. In various examples, a first machine learned model may receive first data including a structured representation of linguistic data. In various examples, the first machine learned model may generate first output data comprising a first natural language representation of the first data. In at least some examples, a second machine learning model may determine second data indicating that the first natural language representation is a semantically accurate representation of the first data. In some examples, the first output data may be selected for output based at least in part on the second data.
Type: Grant
Filed: March 20, 2020
Date of Patent: December 19, 2023
Assignee: Amazon Technologies, Inc.
Inventors: Hamza Harkous, Isabel Groves, Amir Reza Safari Azar Alamdari
-
Patent number: 11847857
Abstract: A vehicle device setting method including: capturing, by an image sensing unit, a first image frame; recognizing a user ID according to the first image frame; showing ID information of the recognized user ID on a screen or by a speaker; capturing a second image frame; generating a confirm signal when a first user expression is recognized by calculating an expression feature in the second image frame and comparing the recognized expression feature with stored expression data associated with a predetermined user expression to confirm whether the recognized user ID is correct or not according to the second image frame captured after the ID information is shown; controlling an electronic device according to the confirm signal; and entering a data update mode instructed by the user and updating setting information of the electronic device by current electronic device setting according to a saving signal generated by confirming a second user expression in a third image frame captured after the user ID is confirmed.
Type: Grant
Filed: November 24, 2021
Date of Patent: December 19, 2023
Assignee: PIXART IMAGING INC.
Inventors: Liang-Chi Chiu, Yu-Han Chen, Ming-Tsan Kao
-
Patent number: 11843716
Abstract: A translation method includes: a first electronic device establishes a call connection to a second electronic device and then displays a call interface; after receiving a first operation of a first user, the first electronic device switches from displaying the call to displaying a translation interface; when receiving a first speech of the first user in a first language, the translation interface sequentially displays at least a first text and a second text, where the first text is obtained by recognizing the first speech, and the second text is obtained by translating the first speech into a target language; and the first electronic device sends a machine speech in the target language to the second electronic device.
Type: Grant
Filed: June 28, 2022
Date of Patent: December 12, 2023
Assignee: HUAWEI TECHNOLOGIES CO., LTD.
Inventors: Hejin Gu, Long Wang
-
Patent number: 11830494
Abstract: An eyewear device with camera-based compensation that improves the user experience for users having partial blindness or complete blindness. The camera-based compensation determines features, such as objects, and then converts the determined objects to audio that is indicative of the objects and that is perceptible to the eyewear user. The camera-based compensation may use a region-based convolutional neural network (RCNN) to generate a feature map including text that is indicative of objects in images captured by a camera. The feature map is then processed through a speech to audio algorithm featuring a natural language processor to generate audio indicative of the objects in the processed images.
Type: Grant
Filed: December 20, 2022
Date of Patent: November 28, 2023
Assignee: Snap Inc.
Inventor: Stephen Pomes
-
Patent number: 11818406
Abstract: A network-attached storage device (NAS) includes a non-volatile memory module storing a media stream, a network interface, and control circuitry coupled to the non-volatile memory module and to the network interface and configured to connect to a client over a network connection using the network interface, receive a request for the media stream from the client, determine subtitle preferences associated with the request for the media stream, access an audio stream associated with the media stream, generate subtitles based on the audio stream, and send a transport stream to the client over the network connection, the transport stream including the media stream and the subtitles.
Type: Grant
Filed: July 23, 2020
Date of Patent: November 14, 2023
Assignee: Western Digital Technologies, Inc.
Inventor: Ramanathan Muthiah
-
Patent number: 11798538
Abstract: This disclosure relates to answer prediction in a speech processing system. The system may disambiguate entities spoken or implied in a request to initiate an action with respect to a target user. To initiate the action, the system may determine one or more parameters; for example, the target (e.g., a contact/recipient), a source (e.g., a caller/requesting user), and a network (voice over internet protocol (VOIP), cellular, video chat, etc.). Due to the privacy implications of initiating actions involving data transfers between parties, the system may apply a high threshold for a confidence associated with each parameter. Rather than ask multiple follow-up questions, which may frustrate the requesting user, the system may attempt to disambiguate or determine a parameter, and skip a question regarding the parameter if it can predict an answer with high confidence. The system can improve the customer experience while maintaining security for actions involving, for example, communications.
Type: Grant
Filed: September 21, 2020
Date of Patent: October 24, 2023
Assignee: Amazon Technologies, Inc.
Inventors: Christopher Geiger Parker, Piyush Bhargava, Aparna Nandyal, Rajagopalan Ranganathan, Mugunthan Govindaraju, Vidya Narasimhan
-
Patent number: 11792338
Abstract: An image forming system is configured to receive an input of natural language speech. Regardless of whether the natural language speech includes a combination of first words or second words, the image forming system can recognize the natural language speech as an instruction to select a specific print setting displayed on a screen.
Type: Grant
Filed: August 27, 2021
Date of Patent: October 17, 2023
Assignee: Canon Kabushiki Kaisha
Inventors: Toru Takahashi, Yuji Naya, Takeshi Matsumura
-
Patent number: 11734509
Abstract: Methods, systems and computer program products for multi-style text transformation are provided herein. A computer-implemented method includes selecting at least one set of style specifications for transforming at least a portion of input text. The at least one set of style specifications include one or more target writing style domains selected from a plurality of writing style domains, weights for each of the target writing style domains representing relative impact of the target writing style domains for transformation of at least a portion of the input text, and weights for each of a set of linguistic aspects for transformation of at least a portion of the input text. The computer-implemented method also includes generating one or more style-transformed output texts based at least in part on the at least one set of style specifications utilizing at least one unsupervised neural network.
Type: Grant
Filed: December 29, 2020
Date of Patent: August 22, 2023
Assignee: International Business Machines Corporation
Inventors: Abhijit Mishra, Parag Jain, Amar P. Azad, Karthik Sankaranarayanan
-
Patent number: 11721333. Abstract: The disclosure relates to an artificial intelligence (AI) system using a learned AI model according to at least one of machine learning, a neural network, or a deep learning algorithm, and applications thereof. In the disclosure, a control method of an electronic apparatus is provided. The control method comprises the steps of: displaying an image including at least one object; receiving a voice; inputting the voice to an AI model learned by an AI algorithm to identify an object related to the voice among the at least one object included in the image and acquire tag information about the identified object; and providing the acquired tag information. Type: Grant. Filed: January 11, 2019. Date of Patent: August 8, 2023. Assignee: SAMSUNG ELECTRONICS CO., LTD. Inventors: Younghwa Lee, Jinhe Jung, Meejeong Park, Inchul Hwang
-
Patent number: 11721329. Abstract: In the present invention, a method for searching multilingual keywords in a mixlingual speech corpus is proposed. This method is capable of searching audio as well as text keywords. The capability of audio search enables it to search out-of-vocabulary (OOV) words. The capability of searching text keywords enables it to perform semantic search. An advanced application of searching keyword translations in a mixlingual speech corpus is also possible within the posteriorgram framework with this system. Also, a technique for combining information from text and audio keywords is given, which further enhances search performance. This system is based on multiple posteriorgrams based on articulatory classes trained with multiple languages. Type: Grant. Filed: September 10, 2018. Date of Patent: August 8, 2023. Assignees: Indian Institute of Technology, Delhi; Centre for Development of Telematics. Inventors: Arun Kumar, Abhimanyu Popli
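Posteriorgram-based keyword search of the kind this abstract describes is commonly implemented with dynamic time warping (DTW) over frame-wise distances between a keyword template and an utterance, both represented as posteriorgrams (rows = frames, columns = class posteriors). The sketch below shows that generic mechanism under stated assumptions; it is not the patent's specific articulatory-class formulation.

```python
import math

def frame_dist(p, q):
    # 1 - cosine similarity between two posterior vectors
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return 1.0 - dot / norm

def dtw_cost(template, utterance):
    """Length-normalised DTW matching cost; lower means a better match."""
    n, m = len(template), len(utterance)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_dist(template[i - 1], utterance[j - 1])
            D[i][j] = d + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m] / (n + m)

tmpl = [[1.0, 0.0], [0.0, 1.0]]          # toy 2-frame, 2-class posteriorgram
cost_same = dtw_cost(tmpl, tmpl)          # identical content -> low cost
cost_diff = dtw_cost(tmpl, [[0.0, 1.0], [1.0, 0.0]])  # mismatched content
```

A keyword is hypothesised wherever the cost against an utterance window falls below a tuned threshold; scores from text-derived and audio-derived templates can then be combined.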
-
Patent number: 11705122. Abstract: According to one embodiment, the interface-providing apparatus comprises an identifying unit and a generating unit. The identifying unit identifies a keyword from dialogue data including a question text to request information and a response text in reply thereto. The generating unit generates display information to display a user interface for receiving feedback input relating to the degree of usefulness of a keyword when searching for the requested information. Type: Grant. Filed: August 31, 2020. Date of Patent: July 18, 2023. Assignee: KABUSHIKI KAISHA TOSHIBA. Inventors: Kenji Iwata, Hiroshi Fujimura, Takami Yoshida
-
Patent number: 11694682. Abstract: In various embodiments, a voice command is associated with a plurality of processing steps to be performed. The plurality of processing steps may include analysis of audio data using automatic speech recognition, generating and selecting a search query from the utterance text, and conducting a search of a database of items using the search query. The plurality of processing steps may include additional or different steps, depending on the type of the request. In performing one or more of these processing steps, an error or ambiguity may be detected. An error or ambiguity may either halt the processing step or create more than one path of actions. A model may be used to determine if and how to request additional user input to attempt to resolve the error or ambiguity. The voice-enabled device or a second client device is then caused to output a request for the additional user input. Type: Grant. Filed: December 11, 2019. Date of Patent: July 4, 2023. Assignee: Amazon Technologies, Inc. Inventors: Julia Reinspach, Oleg Rokhlenko, Ramakanthachary Gottumukkala, Giovanni Clemente, Ankit Agrawal, Swayam Bhardwaj, Guy Michaeli, Vaidyanathan Puthucode Krishnamoorthy, Costantino Vlachos, Nalledath P. Vinodkrishnan, Shaun M. Vickers, Sethuraman Ramachandran, Charles C. Moore
-
Data generation apparatus and data generation method that generate recognition text from speech data
Patent number: 11694028. Abstract: According to one embodiment, the data generation apparatus includes a speech synthesis unit, a speech recognition unit, a matching processing unit, and a dataset generation unit. The speech synthesis unit generates speech data from an original text. The speech recognition unit generates a recognition text by speech recognition from the speech data. The matching processing unit performs matching between the original text and the recognition text. The dataset generation unit generates a dataset in such a manner that the speech data, from which a recognition text satisfying a certain condition for a matching degree relative to the original text is generated, is associated with the original text, based on the matching result. Type: Grant. Filed: August 31, 2020. Date of Patent: July 4, 2023. Assignee: KABUSHIKI KAISHA TOSHIBA. Inventors: Hiroshi Fujimura, Kenji Iwata, Hui Di, Pengfei Chen
-
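The synthesize-recognize-match-filter pipeline in the 11694028 abstract can be sketched as follows. The TTS and ASR calls are stand-ins passed in as functions (assumptions); only the matching-and-filtering logic is shown, with `difflib`'s similarity ratio as an illustrative "matching degree" and 0.8 as an assumed threshold.

```python
from difflib import SequenceMatcher

MATCH_THRESHOLD = 0.8   # assumed "certain condition for a matching degree"

def matching_degree(original, recognized):
    return SequenceMatcher(None, original, recognized).ratio()

def build_dataset(texts, synthesize, recognize):
    """Keep a (speech, text) pair only when recognizing the synthesized
    speech recovers the original text closely enough."""
    dataset = []
    for text in texts:
        speech = synthesize(text)         # hypothetical TTS front end
        recognized = recognize(speech)    # hypothetical ASR back end
        if matching_degree(text, recognized) >= MATCH_THRESHOLD:
            dataset.append((speech, text))
    return dataset

dataset = build_dataset(
    ["hello world", "foo"],
    synthesize=lambda t: t.encode(),  # toy "speech data"
    recognize=lambda s: {"hello world": "hello world", "foo": "bar"}[s.decode()],
)
```

The second text is dropped because its (toy) recognition result diverges from the original.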
Patent number: 11687719. Abstract: A method for identifying errors associated with named entity recognition includes recognizing a candidate named entity within a text and extracting a chunk from the text containing the candidate named entity. The method further includes creating a feature vector associated with the chunk and analyzing the feature vector for an indication of an error associated with the candidate named entity. The method also includes correcting the error associated with the candidate named entity. Type: Grant. Filed: March 1, 2021. Date of Patent: June 27, 2023. Assignee: LEVERTON HOLDING LLC. Inventors: Christian Schäfer, Michael Kieweg, Florian Kuhlmann
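A sketch of the chunk-and-feature-vector step from this abstract: extract the candidate entity plus surrounding context, then derive features a downstream classifier could score for error indications. The specific features below (entity length, capitalisation, a leading article that may signal an over-extended span) are illustrative assumptions.

```python
def extract_chunk(tokens, start, end, window=2):
    """Return the candidate entity span plus `window` context tokens per side."""
    return tokens[max(0, start - window):min(len(tokens), end + window)]

def feature_vector(chunk, entity):
    entity_tokens = entity.split()
    return [
        len(entity_tokens),                              # entity length
        sum(t[:1].isupper() for t in entity_tokens),     # capitalised tokens
        int(entity_tokens[0].lower() in {"the", "a", "an"}),  # leading article
        len(chunk),                                      # context size
    ]

tokens = "contract signed by the Acme Corp on Friday".split()
chunk = extract_chunk(tokens, 3, 6)        # candidate span: "the Acme Corp"
feats = feature_vector(chunk, "the Acme Corp")
```

A leading article with mixed capitalisation, as here, is the kind of pattern an error classifier could learn to flag and correct (e.g., trimming the span to "Acme Corp").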
-
Patent number: 11682388. Abstract: An AI apparatus includes a microphone to acquire speech data including multiple languages, and a processor to acquire text data corresponding to the speech data, determine a main language from the languages included in the text data, acquire translated text data obtained by translating, into the main language, a text data portion in a language other than the main language, acquire a morpheme analysis result for the translated text data, extract a keyword for intention analysis from the morpheme analysis result, acquire an intention pattern matched to the keyword, and perform an operation corresponding to the intention pattern. Type: Grant. Filed: June 2, 2022. Date of Patent: June 20, 2023. Assignee: LG ELECTRONICS INC. Inventors: Yejin Kim, Hyun Yu, Jonghoon Chae
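The main-language step in this abstract can be sketched as: tag each token with a language, take the majority language as "main", and translate only the other-language tokens into it. The language tagger and translator are stand-in functions (assumptions).

```python
from collections import Counter

def main_language(tagged_tokens):
    """tagged_tokens: list of (token, language) pairs; majority language wins."""
    counts = Counter(lang for _, lang in tagged_tokens)
    return counts.most_common(1)[0][0]

def normalise(tagged_tokens, translate):
    """Translate non-main-language tokens into the main language."""
    main = main_language(tagged_tokens)
    return [tok if lang == main else translate(tok, main)
            for tok, lang in tagged_tokens]

tagged = [("play", "en"), ("some", "en"), ("musique", "fr")]
text = normalise(tagged, lambda tok, target: {"musique": "music"}[tok])
```

The normalised, single-language text can then go through morpheme analysis and keyword-based intention matching as the abstract describes.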
-
Patent number: 11669107. Abstract: A guidance system S includes a plurality of autonomous mobile robots (1) which guide a user to a destination, and a reception apparatus (2) which is provided separately from the robots (1) and recognizes the destination. Availability of each of the plurality of robots (1) is managed based on a state of the robot and the destination. Type: Grant. Filed: December 25, 2018. Date of Patent: June 6, 2023. Assignee: HONDA MOTOR CO., LTD. Inventor: Kenichiro Sugiyama
-
Patent number: 11669695. Abstract: A translation method, implemented by a computer, includes: converting a text written in a first language into a replacement text in which a named entity in the text is replaced with a predetermined character string; translating the replacement text into a second language by using a text translation model which is a neural network; and translating a named entity corresponding to the predetermined character string in the replacement text into the second language by using a named entity translation model which is a neural network. Type: Grant. Filed: March 17, 2020. Date of Patent: June 6, 2023. Assignee: FUJITSU LIMITED. Inventors: Akiba Miura, Tomoya Iwakura
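The two-model scheme in this abstract can be sketched as: swap the named entity for a placeholder, translate the template with one model, translate the entity with another, then substitute back. Both "models" below are toy lookup functions (assumptions); the placeholder string is likewise assumed.

```python
PLACEHOLDER = "<NE>"   # assumed "predetermined character string"

def translate_with_placeholder(text, entity, translate_text, translate_entity):
    template = text.replace(entity, PLACEHOLDER)
    translated_template = translate_text(template)   # text translation model
    translated_entity = translate_entity(entity)     # named entity translation model
    return translated_template.replace(PLACEHOLDER, translated_entity)

result = translate_with_placeholder(
    "Taro works in Osaka",
    "Osaka",
    lambda t: {"Taro works in <NE>": "Taro travaille à <NE>"}[t],  # toy model
    lambda e: {"Osaka": "Osaka"}[e],                               # toy model
)
```

Keeping the entity out of the main translation pass avoids the text model mangling rare proper nouns.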
-
Patent number: 11646019Abstract: Methods, systems, and apparatus, including computer programs encoded on computer-readable storage media, for speech recognition using attention-based sequence-to-sequence models. In some implementations, audio data indicating acoustic characteristics of an utterance is received. A sequence of feature vectors indicative of the acoustic characteristics of the utterance is generated. The sequence of feature vectors is processed using a speech recognition model that has been trained using a loss function that uses N-best lists of decoded hypotheses, the speech recognition model including an encoder, an attention module, and a decoder. The encoder and decoder each include one or more recurrent neural network layers. A sequence of output vectors representing distributions over a predetermined set of linguistic units is obtained. A transcription for the utterance is obtained based on the sequence of output vectors. Data indicating the transcription of the utterance is provided.Type: GrantFiled: July 27, 2021Date of Patent: May 9, 2023Assignee: Google LLCInventors: Rohit Prakash Prabhavalkar, Tara N. Sainath, Yonghui Wu, Patrick An Phu Nguyen, Zhifeng Chen, Chung-Cheng Chiu, Anjuli Patricia Kannan
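An N-best-list loss of the kind this abstract mentions is often formulated, in minimum word error rate training, as the expected word errors over the N-best list relative to the list average, with probabilities renormalised over the list. The sketch below shows that generic formulation as an assumption; the patent's exact loss may differ.

```python
import math

def nbest_loss(hypotheses):
    """hypotheses: list of (log_prob, word_errors) for an N-best list.
    Expected relative word errors under the renormalised model distribution;
    negative when mass favours low-error hypotheses."""
    probs = [math.exp(lp) for lp, _ in hypotheses]
    z = sum(probs)
    probs = [p / z for p in probs]
    mean_err = sum(e for _, e in hypotheses) / len(hypotheses)
    return sum(p * (e - mean_err) for p, (_, e) in zip(probs, hypotheses))
```

Minimising this pushes probability mass toward hypotheses with fewer word errors, directly optimising the metric the recogniser is evaluated on.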
-
Patent number: 11646034. Abstract: An information processing system includes: a first device configured to acquire a user's uttered voice, transfer the user's uttered voice to at least one of a second device and a third device, each actualizing a voice interaction agent, and, when a control command is acquired, convert a control signal based on the acquired control command to a control signal that matches the second device and transmit the converted control signal to the second device; a second device configured to recognize the uttered voice transferred from the first device, and output, to the first device, a control command regarding a recognition result obtained by recognizing the uttered voice and response data based on the control signal; and a third device configured to recognize the uttered voice transferred from the first device, and output, to the first device, a control command regarding a recognition result obtained by recognizing the uttered voice. Type: Grant. Filed: January 8, 2021. Date of Patent: May 9, 2023. Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA. Inventor: Satoshi Aihara
-
Patent number: 11636673. Abstract: A system enhances existing audio-visual content with audio describing the setting of the visual content. A scene annotation module classifies scene elements from an image frame received from a host system and generates a caption describing the scene elements. Type: Grant. Filed: October 31, 2018. Date of Patent: April 25, 2023. Assignee: SONY INTERACTIVE ENTERTAINMENT INC. Inventors: Sudha Krishnamurthy, Justice Adams, Arindam Jati, Masanori Omote, Jian Zheng
-
Patent number: 11630961. Abstract: A device includes a memory adapted to store a list in a file or database comprising a plurality of vocabulary words in a first language and, for each vocabulary word, a corresponding word in a second language, a display device, and a processor. The processor is adapted to receive a plurality of words in the first language, select one or more words among the plurality of words based on one or more predetermined criteria, translate, match or equate the one or more selected words from the first language to words of the second language, and cause the display device to display the plurality of words, wherein one or more first words that are in the plurality of words and are not among the one or more selected words are displayed in the first language, and one or more second words that are in the plurality of words and are among the one or more selected words are displayed in the second language. Type: Grant. Filed: September 14, 2018. Date of Patent: April 18, 2023. Inventor: Robert F. Deming, Jr.
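The display rule in this abstract reduces to: words picked by some criterion are shown in the second language, everything else stays in the first. A minimal sketch, in which the selection criterion and word list are illustrative assumptions:

```python
def render(words, vocabulary, select):
    """vocabulary maps first-language words to second-language words;
    select(word) decides which words are shown translated."""
    return [vocabulary[w] if select(w) and w in vocabulary else w
            for w in words]

vocab = {"dog": "perro", "house": "casa"}
# toy criterion: translate only words that appear in the study list
shown = render(["the", "dog", "sleeps"], vocab, lambda w: w in vocab)
```

Mixing the two languages in running text this way exposes a learner to target vocabulary in context.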
-
Patent number: 11625542. Abstract: A co-user list may be configured based on user interaction in a virtual world environment. A first user may be enabled to navigate the virtual world environment using an instant messenger application that includes the co-user list. A second user that is located proximate to the first user in the virtual world environment may be detected. An attribute associated with the second user may be determined. The co-user list may be configured based on the attribute associated with the second user. Type: Grant. Filed: February 24, 2021. Date of Patent: April 11, 2023. Assignee: Verizon Patent and Licensing Inc. Inventor: David S. Bill
-
Patent number: 11626101. Abstract: Systems and methods are described for processing and interpreting audible commands spoken in one or more languages. Speech recognition systems disclosed herein may be used as a stand-alone speech recognition system or comprise a portion of another content consumption system. A requesting user may provide audio input (e.g., command data) to the speech recognition system via a computing device to request an entertainment system to perform one or more operational commands. The speech recognition system may analyze the audio input across a variety of linguistic models, and may parse the audio input to identify a plurality of phrases and corresponding action classifiers. In some embodiments, the speech recognition system may utilize the action classifiers and other information to determine the one or more identified phrases that appropriately match the desired intent and operational command associated with the user's spoken command. Type: Grant. Filed: October 28, 2021. Date of Patent: April 11, 2023. Assignee: Comcast Cable Communications, LLC. Inventors: George Thomas Des Jardins, Vikrant Sagar
-
Patent number: 11610588. Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods for generating contextually relevant transcripts of voice recordings based on social networking data. For instance, the disclosed systems receive a voice recording from a user corresponding to a message thread including the user and one or more co-users. The disclosed systems analyze acoustic features of the voice recording to generate transcription-text probabilities. The disclosed systems generate term weights for terms corresponding to objects associated with the user within a social networking system by analyzing user social networking data. Using the contextually aware term weights, the disclosed systems adjust the transcription-text probabilities. Based on the adjusted transcription-text probabilities, the disclosed systems generate a transcript of the voice recording for display within the message thread. Type: Grant. Filed: October 28, 2019. Date of Patent: March 21, 2023. Assignee: Meta Platforms, Inc. Inventors: James Matthew Grichnik, Chetan Parag Gupta, Fuchun Peng, Yinan Zhang, Si Chen
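The probability-adjustment step in this abstract can be sketched as boosting each transcription candidate by a social-context term weight and renormalising. The multiplicative weighting scheme and the example names below are illustrative assumptions, not the patent's actual formulation.

```python
def adjust(candidates, term_weights):
    """candidates: {term: acoustic probability};
    term_weights: {term: context weight >= 1}. Boost and renormalise."""
    boosted = {t: p * term_weights.get(t, 1.0) for t, p in candidates.items()}
    z = sum(boosted.values())
    return {t: p / z for t, p in boosted.items()}

# acoustically ambiguous pair; the user's network mentions "grichnik" often,
# so social context gives that term a higher weight
adjusted = adjust({"grichnik": 0.4, "grinch": 0.6}, {"grichnik": 3.0})
best = max(adjusted, key=adjusted.get)
```

The context weight flips the ranking that acoustics alone would produce, which is exactly the kind of personalisation the abstract targets.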