Word Recognition Patents (Class 704/251)
-
Patent number: 11295730
Abstract: A method is described that includes processing text and speech from an input utterance using local overrides of default dictionary pronunciations. Applying this method, a word-level grammar used to process the tokens specifies at least one local word phonetic variant that applies within a specific production rule and, within a local context of the specific production rule, the local word phonetic variant overrides one or more default dictionary phonetic versions of the word. This method can be applied to parsing utterances where the pronunciation of some words depends on their syntactic or semantic context.
Type: Grant
Filed: August 1, 2019
Date of Patent: April 5, 2022
Assignee: SoundHound, Inc.
Inventors: Keyvan Mohajer, Christopher Wilson, Bernard Mont-Reynaud
-
Patent number: 11295755
Abstract: A non-transitory computer-readable storage medium storing a program that causes a processor included in a computer mounted on a sound source direction estimation device to execute a process, the process includes calculating a sound pressure difference between a first voice data acquired from a first microphone and a second voice data acquired from a second microphone and estimating a sound source direction of the first voice data and the second voice data based on the sound pressure difference, outputting an instruction to execute a voice recognition on the first voice data or the second voice data in a language corresponding to the estimated sound source direction, and controlling a reference for estimating a sound source direction based on the sound pressure difference, based on a time length of the voice data used for the voice recognition based on the instruction and a voice recognition time length.
Type: Grant
Filed: August 5, 2019
Date of Patent: April 5, 2022
Assignee: FUJITSU LIMITED
Inventors: Nobuyuki Washio, Masanao Suzuki, Chisato Shioda
-
Patent number: 11288710
Abstract: Automatically collecting advertisement bidding order by automatically accessing at least one Internet content site and presenting the Internet content site with at least one virtual user data and an at least one of IP address representing a geographic location of the virtual user. In response, receiving from the Internet content site advertisement content, and bidding data. Presenting the advertisement bidding data to a user, and/or storing the advertisement bidding data.
Type: Grant
Filed: October 18, 2019
Date of Patent: March 29, 2022
Assignee: BI SCIENCE (2009) LTD
Inventors: Assaf Toval, Kfir Moyal
-
Patent number: 11282495
Abstract: A first neural network model of a user device processes audio data to extract audio embeddings that represent vocal characteristics of a user of an utterance represented in the audio data. The audio embeddings may then be hashed to remove characteristics specific to the user while still maintaining a unique set of characteristics. The hashed embeddings may be sent to a remote system, which may use them to identify the user.
Type: Grant
Filed: December 12, 2019
Date of Patent: March 22, 2022
Assignee: Amazon Technologies, Inc.
Inventors: Hongda Mao, George Yu-Chien Lin, Sundararajan Srinivasan, Chu-Cheng Hsieh
-
Patent number: 11282520
Abstract: Embodiments of the present application provide a method, apparatus and device for interaction of intelligent voice devices, and a storage medium. The method includes: receiving wake-up messages sent by respective awakened intelligent voice devices; determining a forwarding device according to the wake-up messages; sending a forwarding instruction to the forwarding device to enable the forwarding device to receive a user voice request according to the forwarding instruction, where the forwarding instruction includes: type skill information of all intelligent voice devices; and sending a non-response message to other awakened intelligent voice device other than the forwarding device, which enables the most appropriate response device to execute the cloud result requested by the forwarding device, and the plurality of awakened intelligent voice devices do not respond at the same time so as to avoid confusion, thus making it easier to meet user needs.
Type: Grant
Filed: July 16, 2019
Date of Patent: March 22, 2022
Assignee: Baidu Online Network Technology (Beijing) Co., Ltd.
Inventors: Gaofei Cheng, Fei Wang, Yan Zhang, Qin Xiong, Leilei Gao
-
Patent number: 11270691
Abstract: A voice interaction system performs a voice interaction with a user. The voice interaction system includes: topic detection means for estimating a topic of the voice interaction and detecting a change in the topic that has been estimated; and ask-again detection means for detecting, when the change in the topic has been detected by the topic detection means, the user's voice as ask-again by the user based on prosodic information on the user's voice.
Type: Grant
Filed: May 29, 2019
Date of Patent: March 8, 2022
Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA
Inventors: Narimasa Watanabe, Sawa Higuchi, Tatsuro Hori
-
Patent number: 11270700
Abstract: An artificial intelligence device includes a microphone configured to acquire speech including a plurality of languages, and a processor configured to generate, from the speech, text data corresponding to the speech, generate a plurality of pieces of separated data acquired by separating the text data for each language, perform natural language understanding processing corresponding to a language of each of the plurality of pieces of separated data to generate a natural language understanding processing result for each of the plurality of pieces of separated data, acquire command information about a command to be instructed by the speech and slot information about an entity subjected to the command, based on the natural language understanding processing result, perform an operation corresponding to the speech based on the command information and the slot information, and generate a response based on a result of performing the operation.
Type: Grant
Filed: February 24, 2020
Date of Patent: March 8, 2022
Assignee: LG ELECTRONICS INC.
Inventors: Hyun Yu, Byeongha Kim, Yejin Kim, Jonghoon Chae
-
Patent number: 11257491
Abstract: This application relates generally to modifying visual data based on audio commands and more specifically, to performing complex operations that modify visual data based on one or more audio commands. In some embodiments, a computer system may receive an audio input and identify an audio command based on the audio input. The audio command may be mapped to one or more operations capable of being performed by a multimedia editing application. The computer system may perform the one or more operations to edit to received multimedia data.
Type: Grant
Filed: November 29, 2018
Date of Patent: February 22, 2022
Assignee: ADOBE INC.
Inventors: Sarah Kong, Yinglan Ma, Hyunghwan Byun, Chih-Yao Hsieh
-
Patent number: 11257492
Abstract: Embodiments of the present disclosure provide a voice interaction method and apparatus for a customer service. The method includes: receiving customer demand information from a customer demand end, the customer demand information including a customer demand end identifier and a voice demand instruction; performing a speech recognition on the voice demand instruction; and if a demanded service type in the voice demand instruction is identified, sending a service-providing request to a service management system based on the demanded service type, the service-providing request including the customer demand end identifier and the demanded service type. The embodiments of the present disclosure realize the interaction between the customer demand end, the service management system and the customer by adopting the voice interaction method, so that the customer's demand can be quickly and intelligently recognized and the corresponding service can be provided.
Type: Grant
Filed: March 15, 2019
Date of Patent: February 22, 2022
Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
Inventor: Xiantang Chang
-
Patent number: 11244681
Abstract: A system and a method for automating drive-thru orders are provided. In particular, a bridge board is provided that can integrate existing drive-thru hardware with computer devices executing machine learning models to detect and analyze speech from a drive-thru. The system and method employ vehicle tracking to account for vehicle behavior during its time at the drive-thru as well as enhanced vehicle analytics. Additionally, the system and method employ tools for assessing customer speed-of-service. Cameras and vehicle image analysis are used to link drive-thru and/or on-line food/beverage orders with vehicles entering the eatery property to accelerate food/beverage delivery to vehicle occupants.
Type: Grant
Filed: May 20, 2021
Date of Patent: February 8, 2022
Assignee: XENIAL, INC.
Inventors: Christopher Siefken, William Wine, Arjun Wadwalkar, Brian Keith Jackson, Andrew Grindstaff
-
Patent number: 11238856
Abstract: Aspects of the present disclosure relate to ignoring trigger words of a buffered media stream. A buffered media stream of media content is accessed in advance of the playing the media stream. One or more trigger words in the media content of the buffered media stream are identified. A time stamp is generated for each of the one or more identified trigger words in relation to a play time of the media content of the buffered media stream. A voice command device is instructed to ignore audio content of the buffered media stream based on the time stamp for each of the one or more identified trigger words while the buffered media stream is played.
Type: Grant
Filed: May 1, 2018
Date of Patent: February 1, 2022
Assignee: International Business Machines Corporation
Inventors: Eunjin Lee, Jack Dunning, John J. Wood, Giacomo G. Chiarella, Daniel T. Cunnington
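Illustration for patent 11238856 (not taken from the patent itself): a minimal sketch of how time stamps for trigger words in a buffered stream might be turned into "ignore" windows. The trigger-word set, the `TimedWord` type, the function names, and the 0.25-second margin are all assumptions made for this example.

```python
from dataclasses import dataclass

# Hypothetical trigger words; a real device would have its own list.
TRIGGER_WORDS = {"alexa", "computer"}


@dataclass(frozen=True)
class TimedWord:
    text: str
    start: float  # seconds from the start of playback
    end: float


def find_trigger_windows(words, triggers=TRIGGER_WORDS, margin=0.25):
    """Return (start, end) play-time windows that contain a trigger word."""
    return [
        (max(0.0, w.start - margin), w.end + margin)
        for w in words
        if w.text.lower() in triggers
    ]


def should_ignore(play_time, windows):
    """True if the voice command device should ignore audio at play_time."""
    return any(start <= play_time <= end for start, end in windows)


# Usage: a transcript of the buffered stream with per-word timings.
buffered = [TimedWord("hello", 0.0, 0.4), TimedWord("Alexa", 1.0, 1.5)]
windows = find_trigger_windows(buffered)
```

The windows are computed once, ahead of playback, so the check at play time is only a range comparison.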
-
Patent number: 11227250
Abstract: A method, computer system, and a computer program product for customer representative ratings is provided. The present invention may include receiving a chat transcript with one or more tagged triplets and one or more multi-dimensional success vectors. The present invention may include aggregating the one or more multi-dimensional success vectors. The present invention may include receiving at least one business priority. The present invention may include applying at least one filter to the one or more multi-dimensional success vectors. The present invention may include normalizing the one or more multi-dimensional success vectors based on the at least one applied filter. The present invention may include obtaining a rating.
Type: Grant
Filed: June 26, 2019
Date of Patent: January 18, 2022
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Steven Ware Jones, Arjun Jauhari, Jennifer A. Mallette, Vivek Salve
-
Patent number: 11222635
Abstract: An electronic device of the present invention comprises: a housing; a touchscreen display; a microphone; at least one speaker; a button disposed on a portion of the housing or set to be displayed on the touchscreen display; a wireless communication circuit; a processor; and a memory. The electronic device is configured to store an application program including a user interface for receiving a text input. When the user interface is not displayed on the touchscreen display, the electronic device enables a user to receive a user input through the button, receives user speech through the microphone, and then provides data on the user speech to an external server including an automatic speech recognition system and an intelligence system. An instruction for performing a task generated by the intelligence system in response to the user speech is received from the server.
Type: Grant
Filed: February 1, 2018
Date of Patent: January 11, 2022
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Sang-Ki Kang, Jang-Seok Seo, Kook-Tae Choi, Hyun-Woo Kang, Jin-Yeol Kim, Chae-Hwan Li, Kyung-Tae Kim, Dong-Ho Jang, Min-Kyung Hwang
-
Patent number: 11222175
Abstract: A method, system and computer program product for recognizing terms in a specified corpus. In one embodiment, the method comprises providing a set of known terms t∈T, each of the known terms t belonging to a set of types Γ(t)={γ1, . . . }, wherein each of the terms is comprised of a list of words, t=w1, w2, . . . , wn, and the union of all the words for all the terms is a word set W. The method further comprises using the set of terms T and the set of types to determine a set of pattern-to-type mappings p→γ; and using the set of pattern-to-type mappings to recognize terms in the specified corpus and, for each of the recognized terms in the specified corpus, to recognize one or more of the types γ for said each recognized term.
Type: Grant
Filed: May 24, 2019
Date of Patent: January 11, 2022
Assignee: International Business Machines Corporation
Inventors: Michael Glass, Alfio M Gliozzo
-
Patent number: 11222623
Abstract: A speech keyword recognition method includes: obtaining first speech segments based on a to-be-recognized speech signal; obtaining first probabilities respectively corresponding to the first speech segments by using a preset first classification model. A first probability of a first speech segment is obtained from probabilities of the first speech segment respectively corresponding to pre-determined word segmentation units of a pre-determined keyword.
Type: Grant
Filed: May 27, 2020
Date of Patent: January 11, 2022
Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Inventors: Jun Wang, Dan Su, Dong Yu
-
Patent number: 11211046
Abstract: A mistranscription generated by a speech recognition system is identified. A received utterance is matched to a first utterance member within a set of known utterance members. The matching operation matches fewer than the first plural number of words in the received utterance and the received utterance varies in a first particular manner as compared to a first word in a first slot in the first utterance member. The received utterance is sent to a mistranscription analyzer component which increments evidence that the received utterance is evidence of a mistranscription. Once the incremented evidence for the mistranscription exceeds a threshold, future received utterances containing the mistranscription are treated as though the first word was recognized.
Type: Grant
Filed: January 13, 2020
Date of Patent: December 28, 2021
Assignee: International Business Machines Corporation
Inventors: Andrew Aaron, Shang Guo, Jonathan Lenchner, Maharaj Mukherjee
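Illustration for patent 11211046 (not taken from the patent itself): a rough sketch of the evidence-counting idea in the abstract above, assuming a near-match has already been found between a heard word and the expected word in a slot. The class name, method names, and default threshold are hypothetical.

```python
from collections import Counter


class MistranscriptionAnalyzer:
    """Counts evidence that `heard` is a mistranscription of `expected`."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.evidence = Counter()

    def observe(self, heard, expected):
        """Record one near-match where `heard` filled the slot of `expected`."""
        self.evidence[(heard, expected)] += 1

    def resolve(self, heard, expected):
        """Return `expected` once the evidence exceeds the threshold, else `heard`."""
        if self.evidence[(heard, expected)] > self.threshold:
            return expected
        return heard


# Usage: "whether" keeps turning up where "weather" is expected.
analyzer = MistranscriptionAnalyzer(threshold=2)
for _ in range(3):
    analyzer.observe("whether", "weather")
```

The strict `>` comparison mirrors the abstract's wording that the evidence must exceed the threshold before the substitution is applied.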
-
Patent number: 11205415
Abstract: An electronic apparatus which includes a memory configured to store first voice recognition information related to a first language and second voice recognition information related to a second language, and a processor to obtain a first text corresponding to a user voice that is received on the basis of first voice recognition information. The processor, based on an entity name being included in the user voice according to the obtained first text, identifies a segment in the user voice in which the entity name is included, and obtains a second text corresponding to the identified segment of the user voice on the basis of the second voice recognition information, and obtains control information corresponding to the user voice on the basis of the first text and the second text.
Type: Grant
Filed: October 25, 2019
Date of Patent: December 21, 2021
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Chansik Bok, Jihun Park
-
Patent number: 11200909
Abstract: A method is disclosed. The proposed method includes: providing an initial speech corpus including plural utterances; based on a condition of maximum a posteriori (MAP), according to respective sequences of syllable duration, syllable duration prosodic state, syllable tone, base-syllable type, and break type of the kth utterance, using a probability of an ISR of the kth utterance $x_k$ to estimate an estimated value $\hat{x}_k$ of the $x_k$; and through the MAP condition, according to respective sequences of syllable duration, syllable duration prosodic state, syllable tone, base-syllable type, and break type of the given lth breath group/prosodic phrase group (BG/PG) of the kth utterance, using a probability of an ISR of the lth BG/PG of the kth utterance $x_{k,l}$ to estimate an estimated value $\hat{x}_{k,l}$ of the $x_{k,l}$ wherein the $\hat{x}_{k,l}$ is the estimated value of local ISR, and a mean of a prior probability model of the $\hat{x}_{k,l}$ is the $\hat{x}_k$.
Type: Grant
Filed: August 30, 2019
Date of Patent: December 14, 2021
Assignee: NATIONAL YANG MING CHIAO TUNG UNIVERSITY
Inventors: Chen-Yu Chiang, Guan-Ting Liou, Yih-Ru Wang, Sin-Horng Chen
-
Patent number: 11200382
Abstract: This application discloses a prosodic pause prediction method, a prosodic pause prediction device and an electronic device. The specific implementation scheme includes: obtaining a first matrix by mapping a to-be-tested text sequence through a trained embedding layer, where the to-be-tested text sequence includes a to-be-tested input text and an identity of a to-be-tested speaker; inputting the first matrix into a trained attention model, and determining a semantic representation matrix by the trained attention model; and, performing prosodic pause prediction based on the semantic representation matrix and outputting a prosodic pause prediction result of each word in the to-be-tested input text.
Type: Grant
Filed: May 8, 2020
Date of Patent: December 14, 2021
Assignee: Baidu Online Network Technology (Beijing) Co., Ltd.
Inventors: Zhipeng Nie, Yanyao Bian, Zhanjie Gao, Changbin Chen
-
Patent number: 11194825
Abstract: A distributed sequential pattern data mining framework mines user data to determine statistically-relevant sequential patterns which are used to correlate the sequential patterns to a particular outcome. The correlation is provided by a statistical model, a binary predictive model and/or a logistic regression model which uses the sequential patterns to learn the behavior of end users during their usage of a software application.
Type: Grant
Filed: September 23, 2018
Date of Patent: December 7, 2021
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC.
Inventors: Shengyu Fu, Sai Tulasi Neppali, Neelakantan Sundaresan, Siyu Yang
-
Patent number: 11189288
Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for processing multimodal input. A system configured to practice the method continuously monitors an audio stream associated with a gesture input stream, and detects a speech event in the audio stream. Then the system identifies a temporal window associated with a time of the speech event, and analyzes data from the gesture input stream within the temporal window to identify a gesture event. The system processes the speech event and the gesture event to produce a multimodal command. The gesture in the gesture input stream can be directed to a display, but is remote from the display. The system can analyze the data from the gesture input stream by calculating an average of gesture coordinates within the temporal window.
Type: Grant
Filed: January 15, 2020
Date of Patent: November 30, 2021
Assignee: Nuance Communications, Inc.
Inventors: Michael Johnston, Derya Ozkan
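Illustration for patent 11189288 (not taken from the patent itself): a small sketch of averaging gesture coordinates inside a temporal window around a speech event, as the last sentence of the abstract above describes. The sample layout `(timestamp, x, y)`, the function name, and the half-second window bounds are assumptions.

```python
def gesture_in_window(samples, speech_time, before=0.5, after=0.5):
    """Average the (x, y) gesture coordinates near a speech event.

    samples: iterable of (timestamp, x, y) tuples from the gesture stream.
    Returns the mean (x, y) inside [speech_time - before, speech_time + after],
    or None if no gesture sample falls in that window.
    """
    lo, hi = speech_time - before, speech_time + after
    points = [(x, y) for t, x, y in samples if lo <= t <= hi]
    if not points:
        return None
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


# Usage: the speech event at t = 1.1 s picks up the two samples around it.
stream = [(0.0, 0, 0), (1.0, 10, 20), (1.2, 20, 40), (3.0, 100, 100)]
target = gesture_in_window(stream, speech_time=1.1)
```

Averaging smooths out jitter in a pointing gesture; a production system would likely weight or filter samples, which this sketch does not attempt.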
-
Patent number: 11188923
Abstract: Aspects of the disclosure relate to real-time knowledge-based widget prioritization and display. A computing platform may detect, via a computing device, a voice-based interaction between an enterprise agent and a customer. Then, the computing platform may cause, via the computing device, the voice-based interaction to be captured as audio data. The computing platform may then transform the audio data to textual data. Subsequently, the computing platform may identify, in the textual data, a customer query. Then, the computing platform may retrieve, in real-time and based on the voice-based interaction and from a repository of widgets, a first widget, where the first widget includes information at least partially responsive to the customer query. Then, the computing platform may display, to the enterprise agent and via a graphical user interface in use by the enterprise agent, the first widget.
Type: Grant
Filed: August 29, 2019
Date of Patent: November 30, 2021
Assignee: Bank of America Corporation
Inventors: Gaurav Bansal, Shekhar Singh Mehra, Vinod Maghnani, Sandeep Kumar Chauhan
-
Patent number: 11189287
Abstract: Provided are an optimization method, apparatus, device for a wake-up model and a storage medium, which allow for: acquiring a training set and a verification set; performing an iterative training on the wake-up model according to the training set and the verification set; during the iterative training, periodically updating the training set and the verification set according to the wake-up model and a preset corpus database, and continuing performing the iterative training on the wake-up model according to the updated training set and verification set; and outputting the wake-up model when a preset termination condition is reached. The embodiments of the present disclosure, by periodically updating the training set and the verification set according to the wake-up model and the preset corpus database during an iteration, may improve optimization efficiency and effects of the wake-up model, thereby improving stability and adaptability of the wake-up model and avoiding overfitting.
Type: Grant
Filed: December 4, 2019
Date of Patent: November 30, 2021
Assignees: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD., SHANGHAI XIAODU TECHNOLOGY CO. LTD.
Inventor: Yongchao Zhang
-
Patent number: 11183187
Abstract: The present invention provides a dialog system comprising a speech receiving step in which the dialog system receives input of a speech of a human, a first speech determination step in which the dialog system determines a first speech which is a speech in response to the speech of the human, a first speech presentation step in which the first speech is presented by a first agent, a reaction acquisition step in which the dialog system acquires a reaction of the human to the first speech, a second speech determination step in which the dialog system determines, when the reaction of the human is a reaction indicating that the first speech is not a speech in response to the speech of the human, a second speech which is different from the first speech, and a second speech presentation step in which the second speech is presented by the second agent.
Type: Grant
Filed: May 19, 2017
Date of Patent: November 23, 2021
Assignees: NIPPON TELEGRAPH AND TELEPHONE CORPORATION, OSAKA UNIVERSITY
Inventors: Hiroaki Sugiyama, Toyomi Meguro, Junji Yamato, Yuichiro Yoshikawa, Hiroshi Ishiguro
-
Patent number: 11182555
Abstract: A sequence processing method and apparatus are provided. The sequence processing method includes determining a word of a first R-node corresponding to a root node based on an input sequence, generating first I-nodes that are connected to the first R-node and include relative position information with respect to the word of the first R-node, determining a word of a second R-node to correspond to each of the first I-nodes, and determining an output sequence corresponding to the input sequence based on the determined words.
Type: Grant
Filed: April 9, 2020
Date of Patent: November 23, 2021
Assignee: Samsung Electronics Co., Ltd.
Inventors: Hwidong Na, Min-Joong Lee
-
Patent number: 11176520
Abstract: A method may include configuring a processor to monitor, in an application, composition of an electronic communication addressed to a second user from a first user, the electronic communication associated with a set of parameters; determine an intent of the electronic communication based on the set of parameters; search an associative data structure to retrieve content associated with the intent, the content previously transmitted to a third user from the first user or content(s) received from a fourth user(s); and present a suggestion in the application to include the retrieved content in the electronic communication.
Type: Grant
Filed: April 18, 2019
Date of Patent: November 16, 2021
Assignee: Microsoft Technology Licensing, LLC
Inventor: Manoj Ramakrishnan
-
Patent number: 11176141
Abstract: An aspect provides a method, including: receiving, at an input component of an information handling device, user input comprising one or more words; identifying, using a processor of the information handling device, an emotion associated with the one or more words; creating, using the processor, an emotion tag including the emotion associated with the one or more words; storing the emotion tag in a memory; analyzing one or more emotion tags; and modifying an operation of an application based on the analyzing. Other embodiments are described and claimed.
Type: Grant
Filed: May 16, 2016
Date of Patent: November 16, 2021
Assignee: Lenovo (Singapore) Pte. Ltd.
Inventors: Suzanne Marion Beaumont, Russell Speight VanBlon, Rod D. Waltermann
-
Patent number: 11176214
Abstract: Methods, apparatuses, and computer program products are described herein that are configured to express a linguistic description of set of points within a spatial area in an output text. In some example embodiments, a method is provided that comprises generating one or more descriptors and/or one or more combinations of descriptors that are configured to linguistically describe at least a portion of a set of points within a spatial area. The method of this embodiment may also include scoring each of the one or more descriptors and/or one or more combinations of the one or more descriptors. The method of this embodiment may also include selecting a descriptor or combination of descriptors that has the highest score when compared to other descriptors or combination of descriptors, providing the descriptor or combination of descriptors satisfies a threshold.
Type: Grant
Filed: May 1, 2015
Date of Patent: November 16, 2021
Assignee: ARRIA DATA2TEXT LIMITED
Inventors: Gowri Somayajulu Sripada, Neil Burnett
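Illustration for patent 11176214 (not taken from the patent itself): a minimal sketch of the score-then-select step in the abstract above. The abstract does not say how descriptors are scored, so the coverage fraction used here is purely an assumption, as are the function names and the example descriptors.

```python
def coverage_score(points, predicate):
    """Assumed scoring rule: fraction of points the descriptor applies to."""
    return sum(1 for p in points if predicate(p)) / len(points)


def select_descriptor(points, descriptors, threshold):
    """Pick the highest-scoring descriptor, provided it meets the threshold.

    descriptors: dict mapping a phrase to a predicate over (x, y) points.
    Returns the chosen phrase, or None if nothing qualifies.
    """
    if not points or not descriptors:
        return None
    scores = {
        phrase: coverage_score(points, predicate)
        for phrase, predicate in descriptors.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None


# Usage: three of the four points lie in the northern half of a 10x10 area.
points = [(1, 8), (2, 9), (3, 7), (8, 1)]
descriptors = {
    "in the north": lambda p: p[1] > 5,
    "in the east": lambda p: p[0] > 5,
}
chosen = select_descriptor(points, descriptors, threshold=0.5)
```

Returning None when the best candidate falls short of the threshold leaves the caller free to fall back to a more generic description.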
-
Patent number: 11164561
Abstract: A method and system for building a speech recognizer, and a speech recognition method and system are proposed. The method for building a speech recognizer includes: reading and parsing each grammar file, and building a network of each grammar; reading an acoustic syllable mapping relationship table, and deploying the network of each grammar as a syllable network; performing a merge minimization operation for each syllable network to form a sound element decoding network; forming the speech recognizer by using the sound element decoding network and a language model. The technical solutions of the present disclosure may be applied to exhibit strong extensibility, support an N-Gram language model, support a class model, present flexible use, and adapt for an embedded recognizer in a vehicle-mounted environment.
Type: Grant
Filed: August 19, 2019
Date of Patent: November 2, 2021
Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
Inventors: Zhijian Wang, Sheng Qian
-
Patent number: 11164562
Abstract: A system for entity-level clarification in conversation services includes a memory having instructions therein. The system also includes at least one processor in communication with the memory. The at least one processor is configured to execute the instructions to receive a conversation services training example set, build an entity usage map using the conversation services training example set, receive a user utterance, and, responsive to a reception of the user utterance, generate a clarification response using the entity usage map. The at least one processor is also configured to execute the instructions to provide the clarification response to a user.
Type: Grant
Filed: January 10, 2019
Date of Patent: November 2, 2021
Assignee: International Business Machines Corporation
Inventors: Carmine M. DiMascio, Donna K. Byron, Benjamin L. Johnson, Florian Pinel
-
Patent number: 11159685
Abstract: A display control device includes a display section, a first receiving section, a second receiving section, and a performing section. The display section displays an object. The first receiving section receives non-voice input specifying a first operation on the object. The second receiving section receives voice input specifying a second operation on the object. The performing section performs, on the object, a complex operation specified by the non-voice input and the voice input.
Type: Grant
Filed: March 27, 2020
Date of Patent: October 26, 2021
Assignee: KYOCERA Document Solutions Inc.
Inventors: Nobuto Fujita, Kenji Kiyose, Sumio Yamada, Takayuki Mashimo, Ryota Seike, Koji Kuroda
-
Patent number: 11151988
Abstract: Techniques for implementing multiple wakeword detectors on a single device are described. A digital signal processor (DSP) of the device may implement a wakeword detection component to detect when captured speech includes a wakeword. A companion application installed on the device may implement a wakeword detection component trained using speech of a user of the device. If the DSP's wakeword detection component detects a wakeword in speech, the companion application's wakeword detection component may be used to determine whether the wakeword was spoken by the user of the device. If the companion application's wakeword detection component determines the user spoke the wakeword, audio data representing the speech may be sent to at least one server(s) for processing.
Type: Grant
Filed: January 31, 2020
Date of Patent: October 19, 2021
Assignee: Amazon Technologies, Inc.
Inventors: Deepak Yavagal, Ajith Prabhakara, John Gray
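Illustration for patent 11151988 (not taken from the patent itself): a rough sketch of the two-stage check in the abstract above, where a general wakeword detector gates a user-specific one. Both detectors are passed in as plain callables because their real interfaces are not described; the function name, parameter names, and the 0.8 score threshold are assumptions.

```python
def should_send_to_server(audio, dsp_detects_wakeword, user_match_score,
                          user_threshold=0.8):
    """Decide whether captured audio is forwarded for server-side processing.

    dsp_detects_wakeword: callable(audio) -> bool, the first-stage detector.
    user_match_score: callable(audio) -> float in [0, 1], the second-stage
        detector trained on the device user's speech.
    """
    # Stage 1: the general detector runs on every captured utterance.
    if not dsp_detects_wakeword(audio):
        return False
    # Stage 2: the user-specific detector runs only after stage 1 fires.
    return user_match_score(audio) >= user_threshold


# Usage with stand-in detectors that read fields of a dict.
utterance = {"has_wakeword": True, "user_score": 0.93}
forward = should_send_to_server(
    utterance,
    dsp_detects_wakeword=lambda a: a["has_wakeword"],
    user_match_score=lambda a: a["user_score"],
)
```

Ordering the checks this way means the second detector is skipped entirely whenever the first one stays silent.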
-
Patent number: 11137978
Abstract: An electronic device includes a processor, and a memory. The memory may store instructions that, cause the processor to display a user interface including items, receive a first user utterance while the user interface is displayed, wherein the first user utterance includes a first request for executing a first task by using at least one item, transmit first data related to the first user utterance to an external server, receive a first response from the external server, wherein the first response includes information on a first sequence of states of the electronic device for executing the first task and further includes numbers and locations of the items in the user interface, and execute the first task including an operation of allowing the application program to select the one or the plurality of items based on the numbers or the locations.
Type: Grant
Filed: April 27, 2018
Date of Patent: October 5, 2021
Assignee: Samsung Electronics Co., Ltd.
Inventors: Kwang Yong Lee, Jung Hoe Kim, Soo Bin Park, Kyoung Gu Woo, Seong Min Je
-
Patent number: 11120802
Abstract: An approach is provided that receives an audio stream and utilizes a voice activation detection (VAD) process to create a digital audio stream of voices from at least two different speakers. An automatic speech recognition (ASR) process is applied to the digital stream with the ASR process resulting in the spoken words to which a speaker turn detection (STD) process is applied to identify a number of speaker segments with each speaker segment ending at a word boundary. A speaker clustering algorithm is then applied to the speaker segments to associate one of the speakers with each of the speaker segments.
Type: Grant
Filed: November 21, 2017
Date of Patent: September 14, 2021
Assignee: International Business Machines Corporation
Inventors: Kenneth W. Church, Dimitrios B. Dimitriadis, Petr Fousek, Miroslav Novak, George A. Saon
-
Patent number: 11113098
Abstract: The present disclosure relates to the field of a multi-chip system, and provides an interrupt processing method, a master chip, a slave chip, and a multi-chip system. An interrupt processing method is applied to a master chip and includes: when an interrupt transport request sent by a slave chip through an interrupt line is detected, obtaining all current interrupt requests (irq_s_1-irq_s_N) of the slave chip, the interrupt request (irq_s_1-irq_s_N) is generated by a first peripheral (4) of the slave chip; obtaining an interrupt subroutine corresponding to each of the interrupt requests (irq_s_1-irq_s_N), and processing the corresponding interrupt request (irq_s_1-irq_s_N) by using the interrupt subroutine. In the embodiments of the present disclosure, all the interrupt requests (irq_s_1-irq_s_N) of the slave chip are mapped to the master chip, so that the interrupt processing flow of the peripheral on the slave chip is simplified.
Type: Grant
Filed: November 26, 2019
Date of Patent: September 7, 2021
Assignee: SHENZHEN GOODIX TECHNOLOGY CO., LTD.
Inventors: Zhibing Liang, Yifan Li, Zekai Chen
-
Patent number: 11114100Abstract: Methods, apparatus, and computer readable media are described related to automated assistants that proactively incorporate, into human-to-computer dialog sessions, unsolicited content of potential interest to a user. In various implementations, based on content of an existing human-to-computer dialog session between a user and an automated assistant, an entity mentioned by the user or automated assistant may be identified. Fact(s) related to the entity or to another entity that is related to the entity may be identified based on entity data contained in database(s). For each of the fact(s), a corresponding measure of potential interest to the user may be determined. Unsolicited natural language content may then be generated that includes one or more of the facts selected based on the corresponding measure(s) of potential interest. The automated assistant may then incorporate the unsolicited content into the existing human-to-computer dialog session or a subsequent human-to-computer dialog session.Type: GrantFiled: August 23, 2019Date of Patent: September 7, 2021Assignee: GOOGLE LLCInventors: Vladimir Vuskovic, Stephan Wenger, Zineb Ait Bahajji, Martin Baeuml, Alexandru Dovlecel, Gleb Skobeltsyn
-
Patent number: 11113672Abstract: A system and method to provide computer support for a meeting of invitees comprises accessing one or more sensory data streams providing digitized sensory data responsive to an activity of one or more of the invitees during the meeting, the one or more sensory data streams including at least one audio stream. The method also comprises subjecting the at least one audio stream to phonetic and situational computer modeling to recognize a sequence of words in the audio stream and to assign each word to an invitee, subjecting the sequence of words to semantic computer modeling to recognize a sequence of directives in the sequence of words, and releasing one or more output data streams based on the sequence of directives, the one or more output data streams including one or more notifications.Type: GrantFiled: March 22, 2018Date of Patent: September 7, 2021Inventors: Robert Alexander Sim, Marcello Mendes Hasegawa, Ryen William White, Mudit Jain, Tomer Hermelin, Adi Gerzi Rosenthal, Sagi Hilleli
-
Patent number: 11109104Abstract: Novel techniques are described for viewer compositing using media playback systems for enhanced media recommendation and consumption. For example, a display device can be in communication with a media recommendation and consumption compositor (MRCC) system. When a group of viewers desires a shared media consumption experience, the MRCC system can detect the group of viewers and can obtain respective viewer profiles, which can be used to generate a composite profile representing a composite of the group of viewers. The MRCC system can determine an available content space indicating the content available for consumption and can compute a content recommendation space as a function of the composite viewer profile and the available content space that defines recommended content options for the composited group of viewers. A recommendation interface can be output to indicate recommended content options for selecting and viewing.Type: GrantFiled: July 10, 2019Date of Patent: August 31, 2021Assignee: DISH Network L.L.C.Inventor: John Rishea
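One way to picture the composite-profile idea in this abstract is the sketch below. It assumes, purely for illustration, that a viewer profile is a mapping from genre to a preference score, that the composite is the per-genre mean, and that content is ranked by its genre's composite score; the abstract does not specify any of these choices, and the names `composite_profile` and `recommend` are hypothetical.

```python
from typing import Dict, List


def composite_profile(profiles: List[Dict[str, float]]) -> Dict[str, float]:
    """Average per-genre preference scores across all detected viewers (assumed scheme)."""
    if not profiles:
        return {}
    genres = {genre for profile in profiles for genre in profile}
    return {
        genre: sum(profile.get(genre, 0.0) for profile in profiles) / len(profiles)
        for genre in genres
    }


def recommend(composite: Dict[str, float], available: Dict[str, str], top_n: int = 2) -> List[str]:
    """Rank available titles (title -> genre) by the composite score of their genre."""
    ranked = sorted(
        available,
        key=lambda title: (-composite.get(available[title], 0.0), title),
    )
    return ranked[:top_n]


# Illustrative data only.
viewers = [{"comedy": 0.9, "drama": 0.2}, {"comedy": 0.5, "drama": 0.8}]
catalog = {"Show A": "drama", "Show B": "comedy", "Show C": "horror"}
picks = recommend(composite_profile(viewers), catalog)
```

Here the available content space is the `catalog` mapping and the recommendation space is the ranked list in `picks`. Averaging is only one possible compositing rule; a minimum or weighted scheme would slot into `composite_profile` without changing `recommend`.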
-
Patent number: 11100296Abstract: Provided is a processor-implemented method of generating a natural language, the method including generating a latent variable from an embedding vector that corresponds to an input utterance, determining attention information related to the input utterance by applying the generated latent variable to a neural network model, and outputting a natural language response that corresponds to the input utterance based on the calculated attention information.Type: GrantFiled: July 16, 2018Date of Patent: August 24, 2021Assignee: Samsung Electronics Co., Ltd.Inventors: Jehun Jeon, Young-Seok Kim, Sang Hyun Yoo, Junhwi Choi
-
Patent number: 11087760Abstract: A system of multi-modal transmission of packetized data in a voice activated data packet based computer network environment is provided. A natural language processor component can parse an input audio signal to identify a request and a trigger keyword. Based on the input audio signal, a direct action application programming interface can generate a first action data structure, and a content selector component can select a content item. An interface management component can identify first and second candidate interfaces, and respective resource utilization values. The interface management component can select, based on the resource utilization values, the first candidate interface to present the content item. The interface management component can provide the first action data structure to the client computing device for rendering as audio output, and can transmit the content item converted for a first modality to deliver the content item for rendering from the selected interface.Type: GrantFiled: November 26, 2019Date of Patent: August 10, 2021Assignee: Google, LLCInventors: Gaurav Bhaya, Robert Stets
-
Patent number: 11086596Abstract: Provided are a display apparatus, a control method thereof, a server, and a control method thereof. The display apparatus includes: a processor which processes a signal; a display which displays an image based on the processed signal; a first command receiver which receives a voice command; a storage which stores a plurality of voice commands said by a user; a second command receiver which receives a user's manipulation command; and a controller which, upon receiving the voice command, displays a list of the stored plurality of voice commands, selects one of the plurality of voice commands of the list according to the received user's manipulation command and controls the processor to process based on the selected voice command.Type: GrantFiled: September 11, 2018Date of Patent: August 10, 2021Assignee: SAMSUNG ELECTRONICS CO., LTD.Inventors: Do-wan Kim, Oh-yun Kwon, Tae-hwan Cha
-
Patent number: 11075876Abstract: Embodiments provide a social networking platform offering various services, such as, facilitating aggregation and management of a user's interaction on one or more social networking platforms, offering enhanced control over the level of privacy associated with the flow of user data, offering tools to customize the user's exposure to advertisement-related content on the social networking platform(s), integrating features to control aspects of how data/content is presented to and visualized by the user, empowering the user to multicast direct messages to other users without the other users having to meet certain constraints, empowering the user to create and/or join a group based on messaging threads, and the like. One or more of these enhanced services/features are associated with a powerful framework of authentication/permission model for access control.Type: GrantFiled: April 15, 2021Date of Patent: July 27, 2021Assignee: SGROUPLES, INC.Inventors: Jonathan Wolfe, Mark Weinstein
-
Patent number: 11068655Abstract: A text recognition method and apparatus, and a storage medium are provided. The method includes: obtaining sample text data, the sample text data comprising a plurality of sample phrases; and generating a recognition model based on the sample phrases by performing training on a plurality of training nodes. Generating the recognition model includes respectively obtaining, by each of the plurality of training nodes, recognition coefficients of the sample phrases distributed to the corresponding training node; and determining, by the plurality of training nodes, model parameters of the recognition model according to the recognition coefficients of the sample phrases. The method also includes obtaining to-be-recognized text data; inputting the text data to the recognition model; and obtaining recognized target text data output by the recognition model and corresponding to the text data.Type: GrantFiled: November 30, 2018Date of Patent: July 20, 2021Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITEDInventor: Zhao Yang
-
Patent number: 11064006Abstract: A listening device identifies, based on receiving a digitized voice stream, a first keyword of a plurality of keywords in the digitized voice stream. For example, the keyword may be a name associated with a service provider (e.g., “Google”). In response to identifying the first keyword of the plurality of keywords in the digitized voice stream, the listening device identifies a first communication address of a first communication server of a first service provider associated with the first keyword of the plurality of keywords in the digitized voice stream. The listening device then routes the digitized voice stream and/or information associated with the digitized voice stream to the first communication server of the first service provider using the first communication address.Type: GrantFiled: November 13, 2019Date of Patent: July 13, 2021Assignee: Flex Ltd.Inventors: Mesut Gorkem Eraslan, Bruno Dias Leite
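The keyword-to-server routing in this abstract can be sketched as a simple lookup. This is an assumption-laden illustration: it operates on a text transcript rather than a digitized voice stream, and the `ROUTES` table, its placeholder addresses, and the function `route_utterance` are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Hypothetical keyword -> communication address table; the addresses are placeholders.
ROUTES: Dict[str, str] = {
    "google": "sip:voice@provider-a.example",
    "acme": "sip:voice@provider-b.example",
}


def route_utterance(transcript: str, routes: Dict[str, str] = ROUTES) -> Optional[Tuple[str, str]]:
    """Return (keyword, address) for the first known keyword in the transcript, else None."""
    for word in transcript.lower().split():
        token = word.strip(".,!?")  # drop trailing punctuation such as "Google,"
        if token in routes:
            return token, routes[token]
    return None
```

For example, `route_utterance("Hey Google, what's the weather?")` returns the `"google"` entry and its address, after which the caller would forward the stream to that address; an utterance with no known keyword returns `None`.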
-
Patent number: 11064339Abstract: An emergency event detection and response system detects an occurrence of an event associated with a user and initiates an emergency response flow. A user may be associated with a wearable device and have in his home a base station and portable or stationary wireless devices containing sensors capable of detecting an emergency event. The emergency event may be detected based on voice or non-voice audio input from the user, data monitoring by the wearable device, base station, and/or portable or stationary wireless device, or by physical button press. Responsive to determining that an emergency event has occurred, the system triggers an emergency response flow by notifying a call center and contacting one or more caregivers associated with the user. Caregivers may access a response system application to receive updates regarding the detected emergency and to contact the user and/or a provider associated with the call center.Type: GrantFiled: April 12, 2020Date of Patent: July 13, 2021Assignee: Aloe Care Health, Inc.Inventors: Lasse Hamre, Raymond Eugene Spoljaric, Evan Samuel Schwartz, Ryan Christopher Haigh, Alexander Neville Sassoon, Sveinung Kval Bakken
-
Patent number: 11056113Abstract: A conversation guidance method of a speech recognition system may include managing a user domain based on speech recognition function information and situation information collected from a system mounted on a vehicle, generating a conversation used for speech recognition based on the user domain, and guiding a user with the generated conversation.Type: GrantFiled: May 16, 2019Date of Patent: July 6, 2021Assignees: Hyundai Motor Company, Kia CorporationInventors: Kyung Chul Lee, Jae Min Joh
-
Patent number: 11049495Abstract: There is provided a system and method for processing and/or recognizing acoustic signals. The method comprises obtaining at least one pre-existing speech recognition model; adapting and/or training the at least one pre-existing speech recognition model incrementally when new, previously unseen, user-specific data is received, the data comprising input acoustic signals and/or user action demonstrations and/or semantic information about a meaning of the acoustic signals, wherein the at least one model is incrementally updated by associating new input acoustic signals with input semantic frames to enable recognition of changed input acoustic signals. The method further comprises adapting to a user's vocabulary over time by learning new words and/or removing words no longer being used by the user, generating a semantic frame from an input acoustic signal according to the at least one model, and mapping the semantic frame to a predetermined action.Type: GrantFiled: March 17, 2017Date of Patent: June 29, 2021Assignee: Fluent.ai Inc.Inventors: Vikrant Tomar, Vincent P. G. Renkens, Hugo R. J. G. Van Hamme
-
Patent number: 11044364Abstract: A system for providing help includes a preprogrammed kit that includes at least one digital assistant and a virtual private network repeater for connecting to a data provider for connecting the digital assistant to a server. A plurality of agent computers is connected to the server by a data network. The digital assistant is preprogrammed with a skill for recognizing a preprogrammed specific utterance and the digital assistant is pre-configured to connect with the virtual private network repeater. After the preprogrammed specific utterance is detected by the digital assistant, the digital assistant initiates a request for help to the server and upon receiving the request for the help, the server assigns one of the agent computers and forwards the request for help to the one of the agent computers.Type: GrantFiled: April 24, 2020Date of Patent: June 22, 2021Assignee: Ways Investments, LLCInventor: Mark Edward Gray
-
Patent number: 11031000Abstract: Provided are an artificial intelligence (AI) system configured to simulate functions of a human brain, such as recognition and determination, by using a machine learning algorithm, such as deep learning, and an application thereof. A method, performed by a device, of transmitting and receiving audio data to and from another device includes obtaining a voice input that is input by a first user of the device, obtaining recognition information indicating a meaning of the obtained voice input, transmitting the obtained voice input to the other device, determining whether an abnormal situation occurs, in which a second user of the other device does not understand the transmitted voice input, and transmitting the obtained recognition information to the other device, based on a result of the determination.Type: GrantFiled: December 18, 2019Date of Patent: June 8, 2021Assignee: Samsung Electronics Co., Ltd.Inventors: Jae-deok Kim, Mee-jeong Park
-
Patent number: 11031013Abstract: Method starts with processing, by a processor, audio signal to generate audio caller utterance and transcribed caller utterance. Processor generates identified task based on transcribed caller utterance. Processor samples audio caller utterance to generate samples of audio caller utterance. Processor generates loudness result based on loudness values of samples using loudness neural network associated with identified task. Processor generates pitch result based on pitch values of samples using pitch neural network associated with identified task. Processor generates tone result for each word in transcribed caller utterance using tone neural network associated with identified task. Using task completion probability neural network associated with identified task, processor generates task completion probability result that is based on at least one of: loudness result, pitch result, or tone result. Other embodiments are disclosed herein.Type: GrantFiled: June 17, 2019Date of Patent: June 8, 2021Assignee: Express Scripts Strategic Development, Inc.Inventors: Christopher M. Myers, Danielle L. Smith