Word Recognition Patents (Class 704/251)

Preliminary matching (Class 704/252)

Endpoint detection (Class 704/253)

Subportions (Class 704/254)

Specialized models (Class 704/255)

Markov (Class 704/256)

Hidden Markov Model (HMM) (EPO) (Class 704/256.1)

Training of HMM (EPO) (Class 704/256.2)

With insufficient amount of training data, e.g., state sharing, tying, deleted interpolation (EPO) (Class 704/256.3)

Continuous density, e.g., Gaussian distribution, Laplace (EPO) (Class 704/256.7)
Discrete density, e.g., Vector Quantization preprocessor, look up tables (EPO) (Class 704/256.8)

Natural language (Class 704/257)

Information processing apparatus and game image distributing method

Patent number: 11338211

Abstract: An application execution unit 110 generates a game image. A message generation unit 112 generates a notification message. An image processing unit 118 generates a distribution image including the game image. A distribution processing unit 126 distributes the distribution image to one or more information processing terminals through a shared server. A setting unit 114 allows a user to set whether or not the notification message is included in the distribution image so as to be visually recognizable, and registers setting contents in a storage apparatus.

Type: Grant

Filed: November 22, 2018

Date of Patent: May 24, 2022

Assignee: SONY INTERACTIVE ENTERTAINMENT INC.

Inventors: Masahiro Fujihara, Kiyobumi Matsunaga
Method and apparatus for recognizing speaker by using a resonator

Patent number: 11341973

Abstract: Provided are a method and device for recognizing a speaker by using a resonator. The method of recognizing the speaker includes receiving a plurality of electrical signals corresponding to a speech of the speaker from a plurality of resonators having different resonance bands; obtaining a difference of magnitudes of the plurality of electrical signals; and recognizing the speaker based on the difference of magnitudes of the plurality of electrical signals.

Type: Grant

Filed: December 19, 2017

Date of Patent: May 24, 2022

Assignee: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Cheheung Kim, Sungchan Kang, Sangha Park, Yongseop Yoon, Choongho Rhee
System, method, and apparatus for virtualizing digital assistants

Patent number: 11337061

Abstract: A system and method for providing anonymous communications from a user to a called party includes obtaining a dedicated phone number and creating a user account for the user and assigning the dedicated phone number to the user account. A provider account is created for a digital assistant using the dedicated phone number and the digital assistant is preprogrammed with the user account. The digital assistant is also preprogrammed with a skill for recognizing a specific utterance (e.g. “Call”). Connectivity is provided between the digital assistant and the Internet, for example, using a wireless access point. The digital assistant listens for the specific utterance and, upon recognizing the specific utterance followed by an identification of the called party, the digital assistant initiates a voice call through the Internet to the called party.

Type: Grant

Filed: November 6, 2020

Date of Patent: May 17, 2022

Assignee: Ways Investments, LLC

Inventor: Mark Edward Gray
Multiple classifications of audio data

Patent number: 11335347

Abstract: Described herein is a system for sentiment detection in audio data. The system is trained using acoustic information and lexical information to determine a sentiment corresponding to an utterance. In some cases when lexical information is not available, the system (trained on acoustic and lexical information) is configured to determine a sentiment using only acoustic information.

Type: Grant

Filed: June 3, 2019

Date of Patent: May 17, 2022

Assignee: Amazon Technologies, Inc.

Inventors: Gustavo Alfonso Aguilar Alas, Viktor Rozgic, Chao Wang
Processing audio data

Patent number: 11328714

Abstract: Processing data for speech recognition by generating hypotheses from input data, assigning each hypothesis a score according to a confidence level value and hypothesis ranking, executing a pass/fail grammar test against each hypothesis, generating replacement hypotheses according to grammar test failures, assigning each replacement hypothesis a score according to a number of hypothesis changes, and providing a set of hypotheses, wherein the set comprises at least one replacement hypothesis.

Type: Grant

Filed: January 2, 2020

Date of Patent: May 10, 2022

Assignee: International Business Machines Corporation

Inventors: Andrew R. Freed, Marco Noel, Victor Povar
Data bundle generation and deployment

Patent number: 11328096

Abstract: The present disclosure provides a method, system, and device for distributing a software release. To illustrate, based on one or more files for distribution as a software release, a release bundle is generated that includes release bundle information, such as, for each file of the one or more files, a checksum, meta data, or both. One or more other aspects of the present disclosure further provide sending the release bundle to a node device. After receiving the release bundle at the node device, the node device receives and stores at least one file at a transaction directory. After verification that each of the one or more files is present/available at the node device, the one or more files may be provided to a memory of a node device and meta data included in the release bundle information may be applied to the one or more files transferred to the memory.

Type: Grant

Filed: June 10, 2020

Date of Patent: May 10, 2022

Assignee: JFROG, LTD.

Inventor: Yoav Landman
Method and apparatus for airborne-sound acoustic monitoring of an exterior and/or an interior of a vehicle, vehicle and computer-readable storage medium

Patent number: 11322130

Abstract: The invention relates to a method for airborne-sound acoustic monitoring of an exterior and/or an interior of a vehicle, in which at least one microphone (1) is used to convert airborne sound into an electrical signal (S) and to route it for evaluation purposes to a device for voice and/or sound recognition (2). According to the invention, the electrical signal (S) is subjected to a pre-evaluation in a device for trigger detection (3), and detection of a trigger results in the device for voice and/or sound recognition (2) being moved from an inactive or partially active state to a fully active state by means of the device for trigger detection (3). Further, the invention relates to an apparatus for airborne-sound acoustic monitoring of an exterior and/or an interior of a vehicle and to a vehicle having such an apparatus. The subject matter of the invention is also a computer-readable storage medium.

Type: Grant

Filed: March 7, 2019

Date of Patent: May 3, 2022

Assignee: Robert Bosch GmbH

Inventors: Thomas Fleischmann, Udo Hermann, Niko Dorsch
Information processing device and information processing method

Patent number: 11322141

Abstract: An information processing device includes a communication controller that performs communication control for receiving transmission data transmitted from a client, transmitting the transmission data to a first service providing server that performs a first service process, receiving a first service process result from the first service providing server, transmitting data according to the first service process result to a second service providing server that performs a second service process that is different from a first service, receiving a second service process result from the second service providing server, and transmitting the second service process result to the client. The first service process result is obtained by performing the first service process on the transmission data. The second service process result is obtained by performing the second service process on the data according to the first service process result.

Type: Grant

Filed: August 3, 2018

Date of Patent: May 3, 2022

Assignee: SONY CORPORATION

Inventors: Takao Okuda, Takashi Shibuya
Method, apparatus, and medium for processing speech signal

Patent number: 11322151

Abstract: According to embodiments of the disclosure, a method and an apparatus for processing a speech signal, and a computer-readable storage medium are provided. The method includes obtaining a set of speech feature representations of a speech signal received. The method also includes generating a set of source text feature representations based on a text recognized from the speech signal, each source text feature representation corresponding to an element in the text. The method also includes generating a set of target text feature representations based on the set of speech feature representations and the set of source text feature representations. The method also includes determining a match degree between the set of target text feature representations and a set of reference text feature representations predefined for the text, the match degree indicating an accuracy of recognizing of the text.

Type: Grant

Filed: June 22, 2020

Date of Patent: May 3, 2022

Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD

Inventors: Chuanlei Zhai, Xu Chen, Jinfeng Bai, Lei Jia
Localized wakeword verification

Patent number: 11308958

Abstract: In one aspect, a networked microphone device is configured to (i) receive sound data, (ii) determine, via the wake-word engine, that a first portion of the sound data is representative of a wake word, (iii) determine that a second networked microphone device was added to a media playback system, (iv) transmit the first portion of the sound data to a second networked microphone device, (v) begin determining a command to be performed by the first networked microphone device, (vi) receive an indication of whether the first portion of the sound data is representative of the wake word, and (vii) output a response indicative of whether the first portion of the sound data is representative of the wake word.

Type: Grant

Filed: February 7, 2020

Date of Patent: April 19, 2022

Assignee: Sonos, Inc.

Inventor: Connor Kristopher Smith
Method and apparatus for identifying key phrase in audio, device and medium

Patent number: 11308937

Abstract: Embodiments of the present disclosure provide a method and an apparatus for identifying a key phrase in audio, a device and a computer readable storage medium. The method for identifying a key phrase in audio includes obtaining audio data to be identified. The method further includes identifying the key phrase in the audio data using a trained key phrase identification model. The key phrase identification model is trained based on first training data for identifying feature information of words in a first training text and second training data for identifying the key phrase in a second training text. In this way, embodiments of the present disclosure can accurately and efficiently identify key information in the audio data.

Type: Grant

Filed: August 2, 2019

Date of Patent: April 19, 2022

Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventors: Zhihua Wang, Tianxing Yang, Zhipeng Wu, Bin Peng, Chengyuan Zhao
Time-based frequency tuning of analog-to-information feature extraction

Patent number: 11302306

Abstract: A sound recognition system including time-dependent analog filtered feature extraction and sequencing. An analog front end (AFE) in the system receives input analog signals, such as signals representing an audio input to a microphone. Features in the input signal are extracted, by measuring such attributes as zero crossing events and total energy in filtered versions of the signal with different frequency characteristics at different times during the audio event. In one embodiment, a tunable analog filter is controlled to change its frequency characteristics at different times during the event. In another embodiment, multiple analog filters with different filter characteristics filter the input signal in parallel, and signal features are extracted from each filtered signal; a multiplexer selects the desired features at different times during the event.

Type: Grant

Filed: June 26, 2019

Date of Patent: April 12, 2022

Assignee: TEXAS INSTRUMENTS INCORPORATED

Inventors: Zhenyong Zhang, Wei Ma
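The two attributes named in the abstract above, zero-crossing events and total energy, can be illustrated with a short digital sketch. This is only a software stand-in under stated assumptions: the patent describes analog filtering with time-varying frequency characteristics, which is omitted here, and the function names and fixed frame length are hypothetical.

```python
def zero_crossings(samples):
    """Count sign changes between consecutive samples (0 is treated as non-negative)."""
    return sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))


def total_energy(samples):
    """Sum of squared sample values."""
    return sum(s * s for s in samples)


def frame_features(samples, frame_len):
    """Split an (assumed already filtered) signal into non-overlapping frames
    and return one (zero_crossings, total_energy) pair per full frame."""
    features = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        features.append((zero_crossings(frame), total_energy(frame)))
    return features
```

A rapidly alternating frame yields a high zero-crossing count, while a constant frame yields none, so the pair gives a coarse picture of frequency content and loudness per frame.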
Systems and methods for searching for a media asset

Patent number: 11301507

Abstract: Systems and methods for searching for a media asset are described. In some aspects, the system includes control circuitry that receives a first search query from a user. The control circuitry identifies media assets related to the first search query from a content database. The control circuitry receives a second search query following the first search query. The control circuitry determines whether a media asset from the media assets is related to the second search query. In response to determining that less than a threshold number of media assets from the media assets are related to the second search query, the control circuitry transmits an instruction requesting the user to repeat the second search query. The control circuitry receives a third search query related to the first search query. The control circuitry determines a media asset from the media assets that is related to the third search query.

Type: Grant

Filed: July 29, 2020

Date of Patent: April 12, 2022

Assignee: Rovi Guides, Inc.

Inventors: Sashikumar Venkataraman, Ahmed Nizam Mohaideen Pathurudeen
Data clustering and user modeling for next-best-action decisions

Patent number: 11301885

Abstract: Embodiments herein provide data clustering and user modeling for next-best-action decisions. Specifically, a modeling tool is configured to: receive indicators within unstructured social data from a plurality of users; analyze the unstructured social data of each of the plurality of users to assign a set of feature vectors to each of the plurality of users, each feature vector corresponding to one or more personality characteristics of each of the plurality of users; and analyze the feature vectors to identify two or more users from the plurality of users sharing a set of similar feature vectors. The modeling tool is further configured to: group the two or more users from the plurality of users sharing the set of similar feature vectors to form a cluster; identify attributes of the cluster; and input the attributes of the cluster into a predictive model to determine an offer corresponding to the cluster.

Type: Grant

Filed: September 16, 2019

Date of Patent: April 12, 2022

Assignee: International Business Machines Corporation

Inventors: Norbert Herman, Daniel T. Lambert
Using phonetic variants in a local context to improve natural language understanding

Patent number: 11295730

Abstract: A method is described that includes processing text and speech from an input utterance using local overrides of default dictionary pronunciations. Applying this method, a word-level grammar used to process the tokens specifies at least one local word phonetic variant that applies within a specific production rule and, within a local context of the specific production rule, the local word phonetic variant overrides one or more default dictionary phonetic versions of the word. This method can be applied to parsing utterances where the pronunciation of some words depends on their syntactic or semantic context.

Type: Grant

Filed: August 1, 2019

Date of Patent: April 5, 2022

Assignee: SoundHound, Inc.

Inventors: Keyvan Mohajer, Christopher Wilson, Bernard Mont-Reynaud
Storage medium, sound source direction estimation method, and sound source direction estimation device

Patent number: 11295755

Abstract: A non-transitory computer-readable storage medium storing a program that causes a processor included in a computer mounted on a sound source direction estimation device to execute a process, the process includes calculating a sound pressure difference between a first voice data acquired from a first microphone and a second voice data acquired from a second microphone and estimating a sound source direction of the first voice data and the second voice data based on the sound pressure difference, outputting an instruction to execute a voice recognition on the first voice data or the second voice data in a language corresponding to the estimated sound source direction, and controlling a reference for estimating a sound source direction based on the sound pressure difference, based on a time length of the voice data used for the voice recognition based on the instruction and a voice recognition time length.

Type: Grant

Filed: August 5, 2019

Date of Patent: April 5, 2022

Assignee: FUJITSU LIMITED

Inventors: Nobuyuki Washio, Masanao Suzuki, Chisato Shioda
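The first two steps of the abstract above (estimating a direction from the sound pressure difference between two microphones, then choosing a recognition language for that direction) might be sketched as follows. This is an illustration, not the claimed method: the RMS level in decibels, the `threshold_db` reference, and the language mapping are all assumptions, and the adjustment of the reference from voice and recognition time lengths is not modeled.

```python
import math

# Hypothetical mapping from estimated side to recognition language.
LANGUAGE_BY_SIDE = {"first": "ja", "second": "en"}


def rms_db(samples):
    """Root-mean-square level of a block of samples, in decibels."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(max(rms, 1e-12))  # floor avoids log10(0) on silence


def estimate_side(first_mic, second_mic, threshold_db=0.0):
    """Return "first" if the level difference (first minus second) reaches
    the reference threshold, otherwise "second"."""
    diff = rms_db(first_mic) - rms_db(second_mic)
    return "first" if diff >= threshold_db else "second"
```

Raising or lowering `threshold_db` shifts the decision boundary between the two sides, which is the quantity the abstract describes as the controllable reference.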
Analyzing the advertisement bidding-chain

Patent number: 11288710

Abstract: Automatically collecting advertisement bidding order by automatically accessing at least one Internet content site and presenting the Internet content site with at least one virtual user data and an at least one of IP address representing a geographic location of the virtual user. In response, receiving from the Internet content site advertisement content, and bidding data. Presenting the advertisement bidding data to a user, and/or storing the advertisement bidding data.

Type: Grant

Filed: October 18, 2019

Date of Patent: March 29, 2022

Assignee: BI SCIENCE (2009) LTD

Inventors: Assaf Toval, Kfir Moyal
Method, apparatus and device for interaction of intelligent voice devices, and storage medium

Patent number: 11282520

Abstract: Embodiments of the present application provide a method, apparatus and device for interaction of intelligent voice devices, and a storage medium. The method includes: receiving wake-up messages sent by respective awakened intelligent voice devices; determining a forwarding device according to the wake-up messages; sending a forwarding instruction to the forwarding device to enable the forwarding device to receive a user voice request according to the forwarding instruction, where the forwarding instruction includes: type skill information of all intelligent voice devices; and sending a non-response message to other awakened intelligent voice device other than the forwarding device, which enables the most appropriate response device to execute the cloud result requested by the forwarding device, and the plurality of awakened intelligent voice devices do not respond at the same time so as to avoid confusion, thus making it easier to meet user needs.

Type: Grant

Filed: July 16, 2019

Date of Patent: March 22, 2022

Assignee: Baidu Online Network Technology (Beijing) Co., Ltd.

Inventors: Gaofei Cheng, Fei Wang, Yan Zhang, Qin Xiong, Leilei Gao
Speech processing using embedding data

Patent number: 11282495

Abstract: A first neural network model of a user device processes audio data to extract audio embeddings that represent vocal characteristics of a user of an utterance represented in the audio data. The audio embeddings may then be hashed to remove characteristics specific to the user while still maintaining a unique set of characteristics. The hashed embeddings may be sent to a remote system, which may use them to identify the user.

Type: Grant

Filed: December 12, 2019

Date of Patent: March 22, 2022

Assignee: Amazon Technologies, Inc.

Inventors: Hongda Mao, George Yu-Chien Lin, Sundararajan Srinivasan, Chu-Cheng Hsieh
Voice interaction system, its processing method, and program therefor

Patent number: 11270691

Abstract: A voice interaction system performs a voice interaction with a user. The voice interaction system includes: topic detection means for estimating a topic of the voice interaction and detecting a change in the topic that has been estimated; and ask-again detection means for detecting, when the change in the topic has been detected by the topic detection means, the user's voice as ask-again by the user based on prosodic information on the user's voice.

Type: Grant

Filed: May 29, 2019

Date of Patent: March 8, 2022

Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA

Inventors: Narimasa Watanabe, Sawa Higuchi, Tatsuro Hori
Artificial intelligence device and method for recognizing speech with multiple languages

Patent number: 11270700

Abstract: An artificial intelligence device includes a microphone configured to acquire speech including a plurality of languages, and a processor configured to generate, from the speech, text data corresponding to the speech, generate a plurality of pieces of separated data acquired by separating the text data for each language, perform natural language understanding processing corresponding to a language of each of the plurality of pieces of separated data to generate a natural language understanding processing result for each of the plurality of pieces of separated data, acquire command information about a command to be instructed by the speech and slot information about an entity subjected to the command, based on the natural language understanding processing result, perform an operation corresponding to the speech based on the command information and the slot information, and generate a response based on a result of performing the operation.

Type: Grant

Filed: February 24, 2020

Date of Patent: March 8, 2022

Assignee: LG ELECTRONICS INC.

Inventors: Hyun Yu, Byeongha Kim, Yejin Kim, Jonghoon Chae
Voice interaction method and apparatus for customer service

Patent number: 11257492

Abstract: Embodiments of the present disclosure provide a voice interaction method and apparatus for a customer service. The method includes: receiving customer demand information from a customer demand end, the customer demand information including a customer demand end identifier and a voice demand instruction; performing a speech recognition on the voice demand instruction; and if a demanded service type in the voice demand instruction is identified, sending a service-providing request to a service management system based on the demanded service type, the service-providing request including the customer demand end identifier and the demanded service type. The embodiments of the present disclosure realize the interaction between the customer demand end, the service management system and the customer by adopting the voice interaction method, so that the customer's demand can be quickly and intelligently recognized and the corresponding service can be provided.

Type: Grant

Filed: March 15, 2019

Date of Patent: February 22, 2022

Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.

Inventor: Xiantang Chang
Voice interaction for image editing

Patent number: 11257491

Abstract: This application relates generally to modifying visual data based on audio commands and more specifically, to performing complex operations that modify visual data based on one or more audio commands. In some embodiments, a computer system may receive an audio input and identify an audio command based on the audio input. The audio command may be mapped to one or more operations capable of being performed by a multimedia editing application. The computer system may perform the one or more operations to edit to received multimedia data.

Type: Grant

Filed: November 29, 2018

Date of Patent: February 22, 2022

Assignee: ADOBE INC.

Inventors: Sarah Kong, Yinglan Ma, Hyunghwan Byun, Chih-Yao Hsieh
System and method for drive through order processing

Patent number: 11244681

Abstract: A system and a method for automating drive-thru orders are provided. In particular, a bridge board is provided that can integrate existing drive-thru hardware with computer devices executing machine learning models to detect and analyze speech from a drive-thru. The system and method employ vehicle tracking to account for vehicle behavior during its time at the drive-thru as well as enhanced vehicle analytics. Additionally, the system and method employ tools for assessing customer speed-of-service. Cameras and vehicle image analysis are used to link drive-thru and/or on-line food/beverage orders with vehicles entering the eatery property to accelerate food/beverage delivery to vehicle occupants.

Type: Grant

Filed: May 20, 2021

Date of Patent: February 8, 2022

Assignee: XENIAL, INC.

Inventors: Christopher Siefken, William Wine, Arjun Wadwalkar, Brian Keith Jackson, Andrew Grindstaff
Ignoring trigger words in streamed media content

Patent number: 11238856

Abstract: Aspects of the present disclosure relate to ignoring trigger words of a buffered media stream. A buffered media stream of media content is accessed in advance of the playing the media stream. One or more trigger words in the media content of the buffered media stream are identified. A time stamp is generated for each of the one or more identified trigger words in relation to a play time of the media content of the buffered media stream. A voice command device is instructed to ignore audio content of the buffered media stream based on the time stamp for each of the one or more identified trigger words while the buffered media stream is played.

Type: Grant

Filed: May 1, 2018

Date of Patent: February 1, 2022

Assignee: International Business Machines Corporation

Inventors: Eunjin Lee, Jack Dunning, John J. Wood, Giacomo G. Chiarella, Daniel T. Cunnington
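The time-stamp mechanism in the abstract above can be pictured as a set of ignore windows on the media play time. The sketch below assumes the trigger-word time stamps have already been extracted from the buffered stream; the `margin` parameter and both function names are hypothetical.

```python
def ignore_windows(trigger_times, margin=0.5):
    """Build merged (start, end) windows, in seconds of play time,
    around each trigger-word time stamp."""
    windows = []
    for t in sorted(trigger_times):
        start, end = max(0.0, t - margin), t + margin
        if windows and start <= windows[-1][1]:
            # Overlaps the previous window: extend it instead of adding a new one.
            windows[-1] = (windows[-1][0], max(windows[-1][1], end))
        else:
            windows.append((start, end))
    return windows


def should_ignore(play_time, windows):
    """True if audio heard at this play time falls inside an ignore window."""
    return any(start <= play_time <= end for start, end in windows)
```

A voice command device following this sketch would check `should_ignore` against the current play position before acting on a detected trigger word.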
Rating customer representatives based on past chat transcripts

Patent number: 11227250

Abstract: A method, computer system, and a computer program product for customer representative ratings is provided. The present invention may include receiving a chat transcript with one or more tagged triplets and one or more multi-dimensional success vectors. The present invention may include aggregating the one or more multi-dimensional success vectors. The present invention may include receiving at least one business priority. The present invention may include applying at least one filter to the one or more multi-dimensional success vectors. The present invention may include normalizing the one or more multi-dimensional success vectors based on the at least one applied filter. The present invention may include obtaining a rating.

Type: Grant

Filed: June 26, 2019

Date of Patent: January 18, 2022

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Steven Ware Jones, Arjun Jauhari, Jennifer A. Mallette, Vivek Salve
Speech keyword recognition method and apparatus, computer-readable storage medium, and computer device

Patent number: 11222623

Abstract: A speech keyword recognition method includes: obtaining first speech segments based on a to-be-recognized speech signal; obtaining first probabilities respectively corresponding to the first speech segments by using a preset first classification model. A first probability of a first speech segment is obtained from probabilities of the first speech segment respectively corresponding to pre-determined word segmentation units of a pre-determined keyword.

Type: Grant

Filed: May 27, 2020

Date of Patent: January 11, 2022

Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED

Inventors: Jun Wang, Dan Su, Dong Yu
Electronic device for processing user speech and operating method therefor

Patent number: 11222635

Abstract: An electronic device of the present invention comprises: a housing; a touchscreen display; a microphone; at least one speaker; a button disposed on a portion of the housing or set to be displayed on the touchscreen display; a wireless communication circuit; a processor; and a memory. The electronic device is configured to store an application program including a user interface for receiving a text input. When the user interface is not displayed on the touchscreen display, the electronic device enables a user to receive a user input through the button, receives user speech through the microphone, and then provides data on the user speech to an external server including an automatic speech recognition system and an intelligence system. An instruction for performing a task generated by the intelligence system in response to the user speech is received from the server.

Type: Grant

Filed: February 1, 2018

Date of Patent: January 11, 2022

Assignee: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Sang-Ki Kang, Jang-Seok Seo, Kook-Tae Choi, Hyun-Woo Kang, Jin-Yeol Kim, Chae-Hwan Li, Kyung-Tae Kim, Dong-Ho Jang, Min-Kyung Hwang
Structured term recognition

Patent number: 11222175

Abstract: A method, system and computer program product for recognizing terms in a specified corpus. In one embodiment, the method comprises providing a set of known terms t∈T, each of the known terms t belonging to a set of types Γ(t)={γ1, . . . }, wherein each of the terms is comprised of a list of words, t=w1, w2, . . . , wn, and the union of all the words for all the terms is a word set W. The method further comprises using the set of terms T and the set of types to determine a set of pattern-to-type mappings p→γ; and using the set of pattern-to-type mappings to recognize terms in the specified corpus and, for each of the recognized terms in the specified corpus, to recognize one or more of the types γ for said each recognized term.

Type: Grant

Filed: May 24, 2019

Date of Patent: January 11, 2022

Assignee: International Business Machines Corporation

Inventors: Michael Glass, Alfio M Gliozzo
Learning transcription errors in speech recognition tasks

Patent number: 11211046

Abstract: A mistranscription generated by a speech recognition system is identified. A received utterance is matched to a first utterance member within a set of known utterance members. The matching operation matches fewer than the first plural number of words in the received utterance and the received utterance varies in a first particular manner as compared to a first word in a first slot in the first utterance member. The received utterance is sent to a mistranscription analyzer component which increments evidence that the received utterance is evidence of a mistranscription. Once the incremented evidence for the mistranscription exceeds a threshold, future received utterances containing the mistranscription are treated as though the first word was recognized.

Type: Grant

Filed: January 13, 2020

Date of Patent: December 28, 2021

Assignee: International Business Machines Corporation

Inventors: Andrew Aaron, Shang Guo, Jonathan Lenchner, Maharaj Mukherjee
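The evidence-accumulation idea in the abstract above can be sketched as a counter per (heard word, expected word) pair. This is a minimal illustration under assumptions: the matching of utterances against known utterance members is not shown, and the class name, method names, and default threshold are hypothetical.

```python
from collections import Counter


class MistranscriptionTracker:
    """Accumulates evidence that `heard` is a mistranscription of `expected`."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.evidence = Counter()

    def observe(self, heard, expected):
        """Record one near-match where `heard` appeared in the slot for `expected`."""
        self.evidence[(heard, expected)] += 1

    def resolve(self, word, expected):
        """Return `expected` once the evidence exceeds the threshold, else `word`."""
        if self.evidence[(word, expected)] > self.threshold:
            return expected
        return word
```

Until the count passes the threshold the heard word is returned unchanged, so a single odd transcription does not alter later recognition.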
Electronic apparatus and controlling method thereof

Patent number: 11205415

Abstract: An electronic apparatus which includes a memory configured to store first voice recognition information related to a first language and second voice recognition information related to a second language, and a processor to obtain a first text corresponding to a user voice that is received on the basis of first voice recognition information. The processor, based on an entity name being included in the user voice according to the obtained first text, identifies a segment in the user voice in which the entity name is included, and obtains a second text corresponding to the identified segment of the user voice on the basis of the second voice recognition information, and obtains control information corresponding to the user voice on the basis of the first text and the second text.

Type: Grant

Filed: October 25, 2019

Date of Patent: December 21, 2021

Assignee: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Chansik Bok, Jihun Park
Prosodic pause prediction method, prosodic pause prediction device and electronic device

Patent number: 11200382

Abstract: This application discloses a prosodic pause prediction method, a prosodic pause prediction device and an electronic device. The specific implementation scheme includes: obtaining a first matrix by mapping a to-be-tested text sequence through a trained embedding layer, where the to-be-tested text sequence includes a to-be-tested input text and an identity of a to-be-tested speaker; inputting the first matrix into a trained attention model, and determining a semantic representation matrix by the trained attention model; and, performing prosodic pause prediction based on the semantic representation matrix and outputting a prosodic pause prediction result of each word in the to-be-tested input text.

Type: Grant

Filed: May 8, 2020

Date of Patent: December 14, 2021

Assignee: Baidu Online Network Technology (Beijing) Co., Ltd.

Inventors: Zhipeng Nie, Yanyao Bian, Zhanjie Gao, Changbin Chen
Method of generating estimated value of local inverse speaking rate (ISR) and device and method of generating predicted value of local ISR accordingly

Patent number: 11200909

Abstract: A method is disclosed. The proposed method includes: providing an initial speech corpus including plural utterances; based on a condition of maximum a posteriori (MAP), according to respective sequences of syllable duration, syllable duration prosodic state, syllable tone, base-syllable type, and break type of the $k$th utterance, using a probability of an ISR of the $k$th utterance $x_k$ to estimate an estimated value $\hat{x}_k$ of the $x_k$; and through the MAP condition, according to respective sequences of syllable duration, syllable duration prosodic state, syllable tone, base-syllable type, and break type of the given $l$th breath group/prosodic phrase group (BG/PG) of the $k$th utterance, using a probability of an ISR of the $l$th BG/PG of the $k$th utterance $x_{k,l}$ to estimate an estimated value $\hat{x}_{k,l}$ of the $x_{k,l}$, wherein the $\hat{x}_{k,l}$ is the estimated value of local ISR, and a mean of a prior probability model of the $\hat{x}_{k,l}$ is the $\hat{x}_k$.

Type: Grant

Filed: August 30, 2019

Date of Patent: December 14, 2021

Assignee: NATIONAL YANG MING CHIAO TUNG UNIVERSITY

Inventors: Chen-Yu Chiang, Guan-Ting Liou, Yih-Ru Wang, Sin-Horng Chen
Distributed sequential pattern data mining framework

Patent number: 11194825

Abstract: A distributed sequential pattern data mining framework mines user data to determine statistically-relevant sequential patterns which are used to correlate the sequential patterns to a particular outcome. The correlation is provided by a statistical model, a binary predictive model and/or a logistic regression model which uses the sequential patterns to learn the behavior of end users during their usage of a software application.

Type: Grant

Filed: September 23, 2018

Date of Patent: December 7, 2021

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC.

Inventors: Shengyu Fu, Sai Tulasi Neppali, Neelakantan Sundaresan, Siyu Yang
Real-time knowledge-based widget prioritization and display

Patent number: 11188923

Abstract: Aspects of the disclosure relate to real-time knowledge-based widget prioritization and display. A computing platform may detect, via a computing device, a voice-based interaction between an enterprise agent and a customer. Then, the computing platform may cause, via the computing device, the voice-based interaction to be captured as audio data. The computing platform may then transform the audio data to textual data. Subsequently, the computing platform may identify, in the textual data, a customer query. Then, the computing platform may retrieve, in real-time and based on the voice-based interaction and from a repository of widgets, a first widget, where the first widget includes information at least partially responsive to the customer query. Then, the computing platform may display, to the enterprise agent and via a graphical user interface in use by the enterprise agent, the first widget.

Type: Grant

Filed: August 29, 2019

Date of Patent: November 30, 2021

Assignee: Bank of America Corporation

Inventors: Gaurav Bansal, Shekhar Singh Mehra, Vinod Maghnani, Sandeep Kumar Chauhan
System and method for continuous multimodal speech and gesture interaction

Patent number: 11189288

Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for processing multimodal input. A system configured to practice the method continuously monitors an audio stream associated with a gesture input stream, and detects a speech event in the audio stream. Then the system identifies a temporal window associated with a time of the speech event, and analyzes data from the gesture input stream within the temporal window to identify a gesture event. The system processes the speech event and the gesture event to produce a multimodal command. The gesture in the gesture input stream can be directed to a display, but is remote from the display. The system can analyze the data from the gesture input stream by calculating an average of gesture coordinates within the temporal window.

Type: Grant

Filed: January 15, 2020

Date of Patent: November 30, 2021

Assignee: Nuance Communications, Inc.

Inventors: Michael Johnston, Derya Ozkan
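
As a rough illustration of the last step in the abstract of patent 11189288 (averaging gesture coordinates inside a temporal window around a speech event), the following Python sketch may help. It is not taken from the patent: the `GestureSample` type, the function name, and the half-second window bounds are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class GestureSample:
    """One hypothetical sample from a gesture input stream."""
    t: float  # timestamp in seconds
    x: float  # horizontal coordinate
    y: float  # vertical coordinate


def average_gesture_in_window(
    samples: List[GestureSample],
    speech_time: float,
    before: float = 0.5,  # assumed window extent before the speech event
    after: float = 0.5,   # assumed window extent after the speech event
) -> Optional[Tuple[float, float]]:
    """Average the gesture coordinates whose timestamps fall inside the
    temporal window around the speech event; None if the window is empty."""
    lo, hi = speech_time - before, speech_time + after
    inside = [s for s in samples if lo <= s.t <= hi]
    if not inside:
        return None
    n = len(inside)
    return (sum(s.x for s in inside) / n, sum(s.y for s in inside) / n)


samples = [
    GestureSample(0.0, 0.0, 0.0),
    GestureSample(1.0, 10.0, 20.0),
    GestureSample(1.2, 20.0, 40.0),
    GestureSample(3.0, 100.0, 100.0),
]
point = average_gesture_in_window(samples, speech_time=1.1)
```

Here only the samples at 1.0 s and 1.2 s fall inside the window, so `point` is their mean. A real system would presumably derive the window from the recognizer's timing rather than from fixed offsets.
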
Optimization method, apparatus, device for wake-up model, and storage medium

Patent number: 11189287

Abstract: Provided are an optimization method, apparatus, device for a wake-up model and a storage medium, which allow for: acquiring a training set and a verification set; performing an iterative training on the wake-up model according to the training set and the verification set; during the iterative training, periodically updating the training set and the verification set according to the wake-up model and a preset corpus database, and continuing performing the iterative training on the wake-up model according to the updated training set and verification set; and outputting the wake-up model when a preset termination condition is reached. The embodiments of the present disclosure, by periodically updating the training set and the verification set according to the wake-up model and the preset corpus database during an iteration, may improve optimization efficiency and effects of the wake-up model, thereby improving stability and adaptability of the wake-up model and avoiding overfitting.

Type: Grant

Filed: December 4, 2019

Date of Patent: November 30, 2021

Assignees: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD., SHANGHAI XIAODU TECHNOLOGY CO. LTD.

Inventor: Yongchao Zhang
Dialog method, dialog system, dialog apparatus and program that gives impression that dialog system understands content of dialog

Patent number: 11183187

Abstract: The present invention provides a dialog system comprising a speech receiving step in which the dialog system receives input of a speech of a human, a first speech determination step in which the dialog system determines a first speech which is a speech in response to the speech of the human, a first speech presentation step in which the first speech is presented by a first agent, a reaction acquisition step in which the dialog system acquires a reaction of the human to the first speech, a second speech determination step in which the dialog system determines, when the reaction of the human is a reaction indicating that the first speech is not a speech in response to the speech of the human, a second speech which is different from the first speech, and a second speech presentation step in which the second speech is presented by the second agent.

Type: Grant

Filed: May 19, 2017

Date of Patent: November 23, 2021

Assignees: NIPPON TELEGRAPH AND TELEPHONE CORPORATION, OSAKA UNIVERSITY

Inventors: Hiroaki Sugiyama, Toyomi Meguro, Junji Yamato, Yuichiro Yoshikawa, Hiroshi Ishiguro
Method and apparatus for processing sequence

Patent number: 11182555

Abstract: A sequence processing method and apparatus are provided. The sequence processing method includes determining a word of a first R-node corresponding to a root node based on an input sequence, generating first I-nodes that are connected to the first R-node and include relative position information with respect to the word of the first R-node, determining a word of a second R-node to correspond to each of the first I-nodes, and determining an output sequence corresponding to the input sequence based on the determined words.

Type: Grant

Filed: April 9, 2020

Date of Patent: November 23, 2021

Assignee: Samsung Electronics Co., Ltd.

Inventors: Hwidong Na, Min-Joong Lee
Email content modification system

Patent number: 11176520

Abstract: A method may include configuring a processor to monitor, in an application, composition of an electronic communication addressed to a second user from a first user, the electronic communication associated with a set of parameters; determine an intent of the electronic communication based on the set of parameters; search an associative data structure to retrieve content associated with the intent, the content previously transmitted to a third user from the first user or content(s) received from a fourth user(s); and present a suggestion in the application to include the retrieved content in the electronic communication.

Type: Grant

Filed: April 18, 2019

Date of Patent: November 16, 2021

Assignee: Microsoft Technology Licensing, LLC

Inventor: Manoj Ramakrishnan
Method and apparatus for spatial descriptions in an output text

Patent number: 11176214

Abstract: Methods, apparatuses, and computer program products are described herein that are configured to express a linguistic description of a set of points within a spatial area in an output text. In some example embodiments, a method is provided that comprises generating one or more descriptors and/or one or more combinations of descriptors that are configured to linguistically describe at least a portion of a set of points within a spatial area. The method of this embodiment may also include scoring each of the one or more descriptors and/or one or more combinations of the one or more descriptors. The method of this embodiment may also include selecting a descriptor or combination of descriptors that has the highest score when compared to other descriptors or combinations of descriptors, provided the descriptor or combination of descriptors satisfies a threshold.

Type: Grant

Filed: May 1, 2015

Date of Patent: November 16, 2021

Assignee: ARRIA DATA2TEXT LIMITED

Inventors: Gowri Somayajulu Sripada, Neil Burnett
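
The selection step described in the abstract of patent 11176214 (score each candidate, keep the highest scorer only if it satisfies a threshold) can be sketched as follows. This is a minimal sketch under stated assumptions, not the patented method: the function name, the example scores, and the reading of "satisfies a threshold" as "score >= threshold" are all hypothetical.

```python
from typing import Dict, Optional


def select_descriptor(scores: Dict[str, float], threshold: float) -> Optional[str]:
    """Return the highest-scoring descriptor (or descriptor combination),
    or None when even the best candidate falls below the threshold.

    Assumption: "satisfies a threshold" is taken to mean score >= threshold.
    """
    if not scores:
        return None
    best = max(scores, key=lambda name: scores[name])
    return best if scores[best] >= threshold else None


# Hypothetical candidate descriptors with made-up scores.
candidates = {
    "in the north": 0.42,
    "along the coast": 0.71,
    "in the north and along the coast": 0.65,
}
chosen = select_descriptor(candidates, threshold=0.5)
rejected = select_descriptor(candidates, threshold=0.9)
```

With these made-up scores, `chosen` is the coastal descriptor, while `rejected` is `None` because no candidate reaches the stricter threshold.
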
Preserving emotion of user input

Patent number: 11176141

Abstract: An aspect provides a method, including: receiving, at an input component of an information handling device, user input comprising one or more words; identifying, using a processor of the information handling device, an emotion associated with the one or more words; creating, using the processor, an emotion tag including the emotion associated with the one or more words; storing the emotion tag in a memory; analyzing one or more emotion tags; and modifying an operation of an application based on the analyzing. Other embodiments are described and claimed.

Type: Grant

Filed: May 16, 2016

Date of Patent: November 16, 2021

Assignee: Lenovo (Singapore) Pte. Ltd.

Inventors: Suzanne Marion Beaumont, Russell Speight VanBlon, Rod D. Waltermann
Method and system for building speech recognizer, and speech recognition method and system

Patent number: 11164561

Abstract: A method and system for building a speech recognizer, and a speech recognition method and system are proposed. The method for building a speech recognizer includes: reading and parsing each grammar file, and building a network of each grammar; reading an acoustic syllable mapping relationship table, and deploying the network of each grammar as a syllable network; performing a merge minimization operation for each syllable network to form a sound element decoding network; forming the speech recognizer by using the sound element decoding network and a language model. The technical solutions of the present disclosure may be applied to exhibit strong extensibility, support an N-Gram language model, support a class model, present flexible use, and adapt for an embedded recognizer in a vehicle-mounted environment.

Type: Grant

Filed: August 19, 2019

Date of Patent: November 2, 2021

Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.

Inventors: Zhijian Wang, Sheng Qian
Entity-level clarification in conversation services

Patent number: 11164562

Abstract: A system for entity-level clarification in conversation services includes a memory having instructions therein. The system also includes at least one processor in communication with the memory. The at least one processor is configured to execute the instructions to receive a conversation services training example set, build an entity usage map using the conversation services training example set, receive a user utterance, and, responsive to a reception of the user utterance, generate a clarification response using the entity usage map. The at least one processor is also configured to execute the instructions to provide the clarification response to a user.

Type: Grant

Filed: January 10, 2019

Date of Patent: November 2, 2021

Assignee: International Business Machines Corporation

Inventors: Carmine M. DiMascio, Donna K. Byron, Benjamin L. Johnson, Florian Pinel
Display control device, display control method, and storage medium

Patent number: 11159685

Abstract: A display control device includes a display section, a first receiving section, a second receiving section, and a performing section. The display section displays an object. The first receiving section receives non-voice input specifying a first operation on the object. The second receiving section receives voice input specifying a second operation on the object. The performing section performs, on the object, a complex operation specified by the non-voice input and the voice input.

Type: Grant

Filed: March 27, 2020

Date of Patent: October 26, 2021

Assignee: KYOCERA Document Solutions Inc.

Inventors: Nobuto Fujita, Kenji Kiyose, Sumio Yamada, Takayuki Mashimo, Ryota Seike, Koji Kuroda
Wakeword detection

Patent number: 11151988

Abstract: Techniques for implementing multiple wakeword detectors on a single device are described. A digital signal processor (DSP) of the device may implement a wakeword detection component to detect when captured speech includes a wakeword. A companion application installed on the device may implement a wakeword detection component trained using speech of a user of the device. If the DSP's wakeword detection component detects a wakeword in speech, the companion application's wakeword detection component may be used to determine whether the wakeword was spoken by the user of the device. If the companion application's wakeword detection component determines the user spoke the wakeword, audio data representing the speech may be sent to at least one server(s) for processing.

Type: Grant

Filed: January 31, 2020

Date of Patent: October 19, 2021

Assignee: Amazon Technologies, Inc.

Inventors: Deepak Yavagal, Ajith Prabhakara, John Gray
Method for operating speech recognition service and electronic device supporting the same

Patent number: 11137978

Abstract: An electronic device includes a processor and a memory. The memory may store instructions that cause the processor to display a user interface including items, receive a first user utterance while the user interface is displayed, wherein the first user utterance includes a first request for executing a first task by using at least one item, transmit first data related to the first user utterance to an external server, receive a first response from the external server, wherein the first response includes information on a first sequence of states of the electronic device for executing the first task and further includes numbers and locations of the items in the user interface, and execute the first task including an operation of allowing the application program to select the one or the plurality of items based on the numbers or the locations.

Type: Grant

Filed: April 27, 2018

Date of Patent: October 5, 2021

Assignee: Samsung Electronics Co., Ltd.

Inventors: Kwang Yong Lee, Jung Hoe Kim, Soo Bin Park, Kyoung Gu Woo, Seong Min Je
Diarization driven by the ASR based segmentation

Patent number: 11120802

Abstract: An approach is provided that receives an audio stream and utilizes a voice activation detection (VAD) process to create a digital audio stream of voices from at least two different speakers. An automatic speech recognition (ASR) process is applied to the digital stream with the ASR process resulting in the spoken words to which a speaker turn detection (STD) process is applied to identify a number of speaker segments with each speaker segment ending at a word boundary. A speaker clustering algorithm is then applied to the speaker segments to associate one of the speakers with each of the speaker segments.

Type: Grant

Filed: November 21, 2017

Date of Patent: September 14, 2021

Assignee: International Business Machines Corporation

Inventors: Kenneth W. Church, Dimitrios B. Dimitriadis, Petr Fousek, Miroslav Novak, George A. Saon
Interrupt processing method, master chip, slave chip, and multi-chip system

Patent number: 11113098

Abstract: The present disclosure relates to the field of a multi-chip system, and provides an interrupt processing method, a master chip, a slave chip, and a multi-chip system. The interrupt processing method is applied to a master chip and includes: when an interrupt transport request sent by a slave chip through an interrupt line is detected, obtaining all current interrupt requests (irq_s_1-irq_s_N) of the slave chip, wherein the interrupt requests (irq_s_1-irq_s_N) are generated by a first peripheral (4) of the slave chip; obtaining an interrupt subroutine corresponding to each of the interrupt requests (irq_s_1-irq_s_N), and processing the corresponding interrupt request (irq_s_1-irq_s_N) by using the interrupt subroutine. In the embodiments of the present disclosure, all the interrupt requests (irq_s_1-irq_s_N) of the slave chip are mapped to the master chip, so that the interrupt processing flow of the peripheral on the slave chip is simplified.

Type: Grant

Filed: November 26, 2019

Date of Patent: September 7, 2021

Assignee: SHENZHEN GOODIX TECHNOLOGY CO., LTD.

Inventors: Zhibing Liang, Yifan Li, Zekai Chen
Computer support for meetings

Patent number: 11113672

Abstract: A system and method to provide computer support for a meeting of invitees comprises accessing one or more sensory data streams providing digitized sensory data responsive to an activity of one or more of the invitees during the meeting, the one or more sensory data streams including at least one audio stream. The method also comprises subjecting the at least one audio stream to phonetic and situational computer modeling to recognize a sequence of words in the audio stream and to assign each word to an invitee, subjecting the sequence of words to semantic computer modeling to recognize a sequence of directives in the sequence of words, and releasing one or more output data streams based on the sequence of directives, the one or more output data streams including one or more notifications.

Type: Grant

Filed: March 22, 2018

Date of Patent: September 7, 2021

Inventors: Robert Alexander Sim, Marcello Mendes Hasegawa, Ryen William White, Mudit Jain, Tomer Hermelin, Adi Gerzi Rosenthal, Sagi Hilleli
