Subportions Patents (Class 704/249)
  • Patent number: 10963522
    Abstract: A method of determining relevancies of objects to a search query includes associating multiple tags with multiple objects, recording bookmarks to the multiple objects, or both, and determining a relevance score for each of the multiple objects and a search query. One embodiment of the method combines full-text relevance algorithms with tag relevance algorithms. Other embodiments include statistical relevance algorithms such as statistical classification or rank regression algorithms. When a user executes a search query, a results list containing the objects is returned, with the objects organized based on the relevance scores. The objects are organized by, for example, listing those with the highest relevance scores first or by marking them with an indication of their relevance.
    Type: Grant
    Filed: June 16, 2017
    Date of Patent: March 30, 2021
    Assignee: Pinterest, Inc.
    Inventors: Yunshan Lu, Michael Tanne
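A minimal sketch of the kind of score combination the abstract above describes: a full-text relevance score and a tag/bookmark relevance score are blended and the results list is ordered by the combined score. The linear weighting and both toy scoring functions are illustrative assumptions, not the patented algorithm.

```python
# Sketch: combine full-text and tag relevance, then order results by score.
from dataclasses import dataclass, field

@dataclass
class Obj:
    doc_id: str
    text: str
    tags: set = field(default_factory=set)
    bookmarks: int = 0

def full_text_score(obj: Obj, query: str) -> float:
    # Toy term-overlap score standing in for a real full-text relevance algorithm.
    terms = query.lower().split()
    words = obj.text.lower().split()
    return sum(words.count(t) for t in terms) / (len(words) or 1)

def tag_score(obj: Obj, query: str) -> float:
    # Fraction of query terms appearing as tags, nudged by recorded bookmarks.
    terms = set(query.lower().split())
    overlap = len(terms & {t.lower() for t in obj.tags}) / (len(terms) or 1)
    return overlap * (1.0 + 0.1 * obj.bookmarks)

def rank(objects, query, w_text=0.6, w_tag=0.4):
    scored = [(w_text * full_text_score(o, query) + w_tag * tag_score(o, query), o)
              for o in objects]
    # Highest relevance scores listed first, as in the abstract's results list.
    return sorted(scored, key=lambda s: s[0], reverse=True)

if __name__ == "__main__":
    docs = [Obj("a", "speech recognition on mobile devices", {"speech", "asr"}, 3),
            Obj("b", "cooking recipes for pasta", {"food"}, 0)]
    for score, obj in rank(docs, "speech recognition"):
        print(f"{obj.doc_id}: {score:.3f}")
```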
  • Patent number: 10848591
    Abstract: Systems and methods for sender profile and/or recipient profile disambiguation and/or confirmation are disclosed. In instances where a sender profile is not indicated by a user sending a communication from a communal device, heuristic data may be utilized to infer the sender profile. Similar heuristic data may also be used when selection of the sender profile is associated with a low confidence level. Heuristic data may also be used to infer the recipient profile when the user does not indicate the recipient profile or when selection of the recipient profile is associated with a low confidence. Various confirmations may result from the sender and recipient profile disambiguation.
    Type: Grant
    Filed: June 7, 2017
    Date of Patent: November 24, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Christo Frank Devaraj, James Alexander Stanton, Sumedha Arvind Kshirsagar, Christopher Geiger Parker, Aaron Takayanagi Barnet, Venkatesh Kancharla, Gregory Michael Hart
  • Patent number: 10832682
    Abstract: The method comprises receiving first audio comprising speech from a user of a computing device, detecting an end of speech in the first audio, generating an ASR result based, at least in part, on a portion of the first audio prior to the detected end of speech, determining whether a valid action can be performed by a speech-enabled application installed on the computing device using the ASR result, and processing second audio when it is determined that a valid action cannot be performed by the speech-enabled application using the ASR result.
    Type: Grant
    Filed: February 6, 2020
    Date of Patent: November 10, 2020
    Assignee: Nuance Communications, Inc.
    Inventor: Mark Fanty
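A minimal control-flow sketch of the decision in the entry above: detect the end of speech, run ASR on the audio captured before that point, and only fall back to processing more audio when the speech-enabled application cannot perform a valid action with the ASR result. All classes here are toy stand-ins, not Nuance components.

```python
# Sketch: end-of-speech detection, ASR on the prior portion, valid-action check.
class ToyVAD:
    def find_end_of_speech(self, audio):
        # Pretend a run of "silence" markers ("___") marks the end of speech.
        return audio.find("___") if "___" in audio else None

class ToyASR:
    def recognize(self, audio):
        return audio.replace("_", " ").strip()

class ToyApp:
    VALID = {"call mom", "play music"}
    def can_perform(self, text):
        return text in self.VALID
    def perform(self, text):
        return f"performed: {text}"

def handle(first_audio, second_audio, app, asr, vad):
    end = vad.find_end_of_speech(first_audio)
    if end is None:
        return "still listening"
    result = asr.recognize(first_audio[:end])   # ASR on audio prior to end of speech
    if app.can_perform(result):                 # valid action available?
        return app.perform(result)
    # No valid action: process the second audio as well.
    return asr.recognize(first_audio[:end] + "_" + second_audio)

if __name__ == "__main__":
    print(handle("call_mom___noise", "anything", ToyApp(), ToyASR(), ToyVAD()))
    print(handle("call_dad___", "please", ToyApp(), ToyASR(), ToyVAD()))
```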
  • Patent number: 10811006
    Abstract: An information processing method for information stored in a storage includes: holding a dialog history of a dialog including a question to a user and a reply from the user to the question, determining whether a manner in which a third reply indicating neither a first reply nor a second reply appears in a reply history of the reply included in the held dialog history satisfies a predetermined condition, the first reply indicating an affirmative in response to the question, the second reply indicating a negative in response to the question; and performing presentation regarding the information stored in the storage if the manner is determined to satisfy the predetermined condition.
    Type: Grant
    Filed: November 22, 2017
    Date of Patent: October 20, 2020
    Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.
    Inventors: Katsuyoshi Yamagami, Mitsuru Endo, Takashi Ushio
  • Patent number: 10804938
    Abstract: Systems and methods are disclosed for decoding data. A first block of data may be obtained from a storage medium or received from a computing device. The first block of data includes a first codeword generated based on an error correction code. A first set of likelihood values is obtained from a neural network. The first set of likelihood values indicates probabilities that the first codeword will be decoded into one of a plurality of decoded values. A second set of likelihood values is obtained from a decoder based on the first block of data. The second set of likelihood values indicates probabilities that the first codeword will be decoded into one of the plurality of decoded values. The first codeword is decoded to obtain a decoded value based on the first set of likelihood values and the second set of likelihood values.
    Type: Grant
    Filed: September 25, 2018
    Date of Patent: October 13, 2020
    Assignee: Western Digital Technologies, Inc.
    Inventor: Minghai Qin
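A sketch of fusing two likelihood estimates over candidate decoded values, as the abstract above describes: one set from a neural network and one from a conventional decoder, combined before picking the decoded value. The equal-weight log-domain product rule is an assumption for illustration, not Western Digital's scheme.

```python
# Sketch: combine NN likelihoods with decoder likelihoods and decode.
import numpy as np

def decode(nn_likelihoods: np.ndarray, decoder_likelihoods: np.ndarray) -> int:
    """Return the index of the most likely decoded value."""
    # Work in log space and weight both sources equally.
    combined = np.log(nn_likelihoods + 1e-12) + np.log(decoder_likelihoods + 1e-12)
    return int(np.argmax(combined))

if __name__ == "__main__":
    nn = np.array([0.10, 0.70, 0.20])    # P(decoded value | codeword) from the network
    dec = np.array([0.30, 0.40, 0.30])   # same probabilities from the ECC decoder
    print("decoded value index:", decode(nn, dec))   # -> 1
```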
  • Patent number: 10726832
    Abstract: Systems and processes for operating an intelligent automated assistant to perform intelligent list reading are provided. In accordance with one example, a method includes, at an electronic device having one or more processors, receiving a natural-language input corresponding to a domain; providing the natural-language input to an external device; receiving, from the external device, a process flow corresponding to the domain; determining, with the process flow corresponding to the domain, a task associated with the natural-language input; performing the task; and providing an output indicating whether the task has been performed.
    Type: Grant
    Filed: March 9, 2018
    Date of Patent: July 28, 2020
    Assignee: Apple Inc.
    Inventors: Brandon J. Newendorp, Joanna S. Peterson
  • Patent number: 10726849
    Abstract: This application describes methods and apparatus for speaker recognition. An apparatus according to an embodiment has an analyzer (202) for analyzing each frame of a sequence of frames of audio data (AIN) which correspond to speech sounds uttered by a user to determine at least one characteristic of the speech sound of that frame. An assessment module (203) determines, for each frame of audio data, a contribution indicator of the extent to which the frame of audio data should be used for speaker recognition processing based on the determined characteristic of the speech sound. In this way frames which correspond to speech sounds that are of most use for speaker discrimination may be emphasized and/or frames which correspond to speech sounds that are of least use for speaker discrimination may be de-emphasized.
    Type: Grant
    Filed: August 1, 2017
    Date of Patent: July 28, 2020
    Assignee: Cirrus Logic, Inc.
    Inventors: John Paul Lesso, John Laurence Melanson
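A sketch of weighting frames by how useful their speech sound is for speaker discrimination and pooling them into one utterance-level vector, in the spirit of the contribution indicator above. The per-class weights and the simple weighted mean are illustrative assumptions, not Cirrus Logic's assessment module.

```python
# Sketch: per-frame contribution weights drive a weighted pooling of features.
import numpy as np

# Assumed contribution indicators: voiced vowels carry more speaker information
# than unvoiced fricatives or silence.
CONTRIBUTION = {"vowel": 1.0, "nasal": 0.8, "fricative": 0.3, "silence": 0.0}

def pool_frames(frame_features: np.ndarray, frame_classes: list[str]) -> np.ndarray:
    """Weighted average of per-frame features using per-class contributions."""
    w = np.array([CONTRIBUTION.get(c, 0.5) for c in frame_classes])
    if w.sum() == 0:
        return np.zeros(frame_features.shape[1])
    return (w[:, None] * frame_features).sum(axis=0) / w.sum()

if __name__ == "__main__":
    feats = np.random.default_rng(0).normal(size=(4, 3))   # 4 frames, 3-dim features
    classes = ["silence", "vowel", "vowel", "fricative"]
    print(pool_frames(feats, classes))
```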
  • Patent number: 10706422
    Abstract: Systems and methods are provided for authenticating mobile payments from a customer account to a merchant. The systems and methods may include a financial service provider receiving a request to authorize an electronic transaction at a point-of-sale. A financial service provider server computer may verify that the customer is present at the point-of-sale using received location data. An image having distorted text such as a captcha may be transmitted to a device at the point-of-sale, and the customer may read the captcha aloud. A voice sample of the customer may be sent to the financial service provider for comparison to stored voice recordings, to verify that the customer's voice sample is authentic if the voice matches a previously generated voice recording for the account. If the voice sample is authentic, the financial service provider may authorize the mobile payment.
    Type: Grant
    Filed: December 19, 2018
    Date of Patent: July 7, 2020
    Assignee: Capital One Services, LLC
    Inventors: Lawrence Douglas, Paul Y. Moreton
  • Patent number: 10650829
    Abstract: Methods, systems and computer program products for operating a voice response system in a multiuser environment are provided. Aspects include receiving a voice command from a first user and determining an identity of the first user based at least in part on a voice recognition of the first user. Aspects also include determining an identity of one or more other users in range of the voice response system and obtaining a command hierarchy. Aspects further include performing an action requested by the voice command based on a determination that the first user is authorized to request the voice command, wherein the determination that the first user is authorized to request the voice command is based at least upon the identity of the first user, the identity of one or more other users in range of the voice response system and the command hierarchy.
    Type: Grant
    Filed: June 6, 2018
    Date of Patent: May 12, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Eric V. Kline, Sarbajit K. Rakshit
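A sketch of the authorization decision described in the entry above: the action is performed only if the identified speaker is not outranked, under a command hierarchy, by any other user detected in range. The rank table and tie-breaking rule are assumptions for illustration.

```python
# Sketch: command-hierarchy check against other users in range of the device.
COMMAND_HIERARCHY = {"parent": 2, "teen": 1, "child": 0}

def authorized(speaker: str, others_in_range: list[str], hierarchy=COMMAND_HIERARCHY) -> bool:
    speaker_rank = hierarchy.get(speaker, -1)
    return all(speaker_rank >= hierarchy.get(o, -1) for o in others_in_range)

def handle_command(speaker: str, others_in_range: list[str], command: str) -> str:
    if authorized(speaker, others_in_range):
        return f"performing '{command}' for {speaker}"
    return f"'{command}' refused: {speaker} is outranked by someone in range"

if __name__ == "__main__":
    print(handle_command("teen", ["child"], "play music"))
    print(handle_command("child", ["parent"], "unlock door"))
```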
  • Patent number: 10600423
    Abstract: Methods and systems for transforming a text-independent enrolment of a customer into a text-dependent enrolment are provided. A request for authentication of a customer that is enrolled in the self-service system with a text-independent voice print is received. A request is transmitted to the customer to repeat a passphrase and the customer's response is received as an audio stream of the passphrase. The customer is authenticated by comparing the audio stream of the passphrase against the text-independent voice print and if the customer is authenticated then a text-dependent voice print is created based on the passphrase. Upon receipt of a subsequent request for authentication of the customer, a request may be transmitted to the customer to repeat the passphrase. Another audio stream of the passphrase may be received. The customer may be authenticated by comparing the another audio stream of the passphrase with the text-dependent voice print.
    Type: Grant
    Filed: January 23, 2019
    Date of Patent: March 24, 2020
    Assignee: Nice Ltd.
    Inventors: Matan Keret, Omer Kochba, Amnon Buzaglo
  • Patent number: 10522123
    Abstract: An electronic apparatus is disclosed, which includes an input interface configured to receive an audio signal, a processor configured to process the received audio signal, and an output interface configured to output the processed audio signal, in which the processor is configured to obtain a scale of a first octave by applying a filter bank to the audio signal based on a sampling frequency of the audio signal; down-sample the audio signal; and obtain a scale of a second octave lower than the first octave by applying the filter bank to the down-sampled signal.
    Type: Grant
    Filed: January 12, 2018
    Date of Patent: December 31, 2019
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventor: Jong-woo Kim
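A multirate sketch of the idea in the abstract above: a fixed filter bank analyzes the octave just below Nyquist, then the signal is down-sampled by two so that the same bank covers the next octave down. The Goertzel-style single-bin DFT "filters" and the semitone spacing are simplifications, not Samsung's filter-bank design.

```python
# Sketch: fixed semitone bank + down-sampling to walk down the octaves.
import numpy as np
from scipy.signal import decimate

def semitone_bank(signal: np.ndarray, fs: float) -> dict[float, float]:
    """Magnitude at 12 semitone frequencies in the octave [fs/4, fs/2)."""
    n = len(signal)
    t = np.arange(n) / fs
    out = {}
    for k in range(12):
        f = (fs / 4.0) * 2 ** (k / 12.0)   # semitone within the top octave
        out[round(f, 1)] = abs(np.sum(signal * np.exp(-2j * np.pi * f * t))) / n
    return out

def octave_scales(signal: np.ndarray, fs: float, n_octaves: int = 2):
    scales = []
    for _ in range(n_octaves):
        scales.append(semitone_bank(signal, fs))   # scale of the current octave
        signal = decimate(signal, 2)               # halve the rate: bank now one octave lower
        fs /= 2.0
    return scales

if __name__ == "__main__":
    fs = 8000.0
    t = np.arange(int(fs)) / fs
    x = np.sin(2 * np.pi * 2500 * t) + np.sin(2 * np.pi * 1250 * t)  # tones one octave apart
    first, second = octave_scales(x, fs)
    print("strongest bin, first octave :", max(first, key=first.get))
    print("strongest bin, second octave:", max(second, key=second.get))
```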
  • Patent number: 10497369
    Abstract: Methods and systems for controlling a portable computing device (“PCD”) are disclosed. In an example method, an always on processor (AoP) of a voice recognition module of the PCD receives a voice command. The AoP determines, without decoding the received voice command, that the received voice command corresponds to a previously determined keyword. The AoP retrieves context data associated with the previously determined keyword. The AoP acts on the voice command using the context data, including in some embodiments automatically triggering a fast dormancy of a communications channel.
    Type: Grant
    Filed: August 23, 2017
    Date of Patent: December 3, 2019
    Assignee: Qualcomm Incorporated
    Inventors: Nishith Chaubey, Anil Rao, James Francis Geekie
  • Patent number: 10423959
    Abstract: Systems and methods are provided for authenticating mobile payments from a customer account to a merchant. The systems and methods may include a financial service provider receiving a request to authorize an electronic transaction at a point-of-sale. A financial service provider server computer may verify that the customer is present at the point-of-sale using received location data. An image having distorted text such as a captcha may be transmitted to a device at the point-of-sale, and the customer may read the captcha aloud. A voice sample of the customer may be sent to the financial service provider for comparison to stored voice recordings, to verify that the customer's voice sample is authentic if the voice matches a previously generated voice recording for the account. If the voice sample is authentic, the financial service provider may authorize the mobile payment.
    Type: Grant
    Filed: December 18, 2017
    Date of Patent: September 24, 2019
    Assignee: Capital One Services, LLC
    Inventors: Lawrence Douglas, Paul Y. Moreton
  • Patent number: 10354656
    Abstract: Improvements in speaker identification and verification are provided via an attention model for speaker recognition and the end-to-end training thereof. A speaker discriminative convolutional neural network (CNN) is used to directly extract frame-level speaker features that are weighted and combined to form an utterance-level speaker recognition vector via the attention model. The CNN and attention model are jointly optimized via an end-to-end training algorithm that imitates the speaker recognition process and uses the most-similar utterances from imposters for each speaker.
    Type: Grant
    Filed: June 23, 2017
    Date of Patent: July 16, 2019
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Yong Zhao, Jinyu Li, Yifan Gong, Shixiong Zhang, Zhuo Chen
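A sketch of the attention pooling described above: frame-level speaker features are scored, the scores are soft-maxed into weights, and the weighted sum forms an utterance-level speaker vector. The single attention vector stands in for learned parameters; the patent's CNN front end and end-to-end training loop are omitted.

```python
# Sketch: attention-weighted pooling of frame-level speaker features.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def utterance_embedding(frame_features: np.ndarray, attention_vector: np.ndarray) -> np.ndarray:
    """frame_features: (T, D) frame-level speaker features."""
    scores = frame_features @ attention_vector   # one score per frame
    weights = softmax(scores)                    # attention weights sum to 1
    return weights @ frame_features              # (D,) utterance-level vector

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    frames = rng.normal(size=(50, 8))            # 50 frames of 8-dim features
    attn = rng.normal(size=8)                    # stands in for learned parameters
    print(utterance_embedding(frames, attn).shape)   # (8,)
```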
  • Patent number: 10313779
    Abstract: A system, method, and wireless earpieces for implementing a virtual assistant. A first virtual assistant for a wireless device is activated in response to receiving a request. A second virtual assistant on the wireless earpieces is executed to retrieve information associated with the request. An action is implemented utilizing the wireless device to fulfill the request utilizing the information.
    Type: Grant
    Filed: August 14, 2017
    Date of Patent: June 4, 2019
    Inventor: Peter Vincent Boesen
  • Patent number: 10289702
    Abstract: A system and method for linking a hash code to a portion of an image. A plurality of lattice points is selected in a multidimensional lattice to form a smallest enclosing region about a feature vector representing the portion of the image, and a lattice point is determined from the selected plurality of lattice points according to a distribution criterion. The determined lattice point is common to the smallest enclosing region and a region of the lattice adjacent to the smallest enclosing region located within a query radius distance of the feature vector. When the feature vector is located within the query radius of a query vector, the feature vector is considered a match. The method assigns the feature vector to the determined lattice point and stores a link between a hash code associated with the determined lattice point and the portion of the image.
    Type: Grant
    Filed: September 25, 2012
    Date of Patent: May 14, 2019
    Assignee: CANON KABUSHIKI KAISHA
    Inventors: Barry James Drake, Alan Valev Tonisson, Scott Alexander Rudkin
  • Patent number: 10255906
    Abstract: An approach is provided that receives, from a neurological sensor worn by a user, words as they are silently read by the user. The words being read by the user correspond to a set of actual words that are included in a passage that is being read by the user. The approach compares the words as read by the user with the actual words included in the passage to identify one or more reading mistakes. The reading mistakes are analyzed, resulting in a set of feedback that is provided to the user.
    Type: Grant
    Filed: December 14, 2016
    Date of Patent: April 9, 2019
    Assignee: International Business Machines Corporation
    Inventors: Marc K. Johlic, Susann M. Keohane, Emi K. Olsson
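A sketch of the comparison step in the entry above: align the words the sensor reports the user silently read against the actual words of the passage and report substitutions, omissions, and insertions as reading mistakes. Using difflib for the alignment is an assumption purely for illustration.

```python
# Sketch: align read words against passage words and list reading mistakes.
import difflib

def reading_mistakes(read_words: list[str], actual_words: list[str]) -> list[str]:
    mistakes = []
    sm = difflib.SequenceMatcher(a=actual_words, b=read_words)
    for op, a1, a2, b1, b2 in sm.get_opcodes():
        if op == "replace":
            mistakes.append(f"misread {actual_words[a1:a2]} as {read_words[b1:b2]}")
        elif op == "delete":
            mistakes.append(f"skipped {actual_words[a1:a2]}")
        elif op == "insert":
            mistakes.append(f"added {read_words[b1:b2]}")
    return mistakes

if __name__ == "__main__":
    passage = "the quick brown fox jumps over the lazy dog".split()
    read = "the quick brown fix jumps the lazy dog".split()
    for m in reading_mistakes(read, passage):
        print(m)   # feedback derived from the identified mistakes
```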
  • Patent number: 10192219
    Abstract: Systems and methods are provided for authenticating mobile payments from a customer account to a merchant. The systems and methods may include a financial service provider receiving a request to authorize an electronic transaction at a point-of-sale. A financial service provider server computer may verify that the customer is present at the point-of-sale using received location data. An image having distorted text such as a captcha may be transmitted to a device at the point-of-sale, and the customer may read the captcha aloud. A voice sample of the customer may be sent to the financial service provider for comparison to stored voice recordings, to verify that the customer's voice sample is authentic if the voice matches a previously generated voice recording for the account. If the voice sample is authentic, the financial service provider may authorize the mobile payment.
    Type: Grant
    Filed: January 8, 2015
    Date of Patent: January 29, 2019
    Assignee: Capital One Services, LLC
    Inventors: Lawrence Douglas, Paul Y. Moreton
  • Patent number: 10056094
    Abstract: In some example embodiments, a system is provided for real-time analysis of audio signals. First digital audio signals are retrieved from memory. First computed streamed signal information corresponding to each of the first digital audio signals is generated by computing first metrics data for the first digital audio signals, the first computed streamed signal information including the first metrics data. The computed first streamed signal information is stored in the memory. The first computed streamed signal information is transmitted to one or more computing devices. Transmitting the first computed streamed signal information to the one or more computing devices causes the first computed streamed signal information to be displayed at the one or more computing devices.
    Type: Grant
    Filed: March 12, 2015
    Date of Patent: August 21, 2018
    Assignee: Cogito Corporation
    Inventors: Joshua Feast, Ali Azarbayejani, Skyler Place
  • Patent number: 10037760
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for verifying an identity of a user. The methods, systems, and apparatus include actions of receiving a request for a verification phrase for verifying an identity of a user. Additional actions include, in response to receiving the request for the verification phrase for verifying the identity of the user, identifying subwords to be included in the verification phrase and in response to identifying the subwords to be included in the verification phrase, obtaining a candidate phrase that includes at least some of the identified subwords as the verification phrase. Further actions include providing the verification phrase as a response to the request for the verification phrase for verifying the identity of the user.
    Type: Grant
    Filed: August 4, 2017
    Date of Patent: July 31, 2018
    Inventors: Dominik Roblek, Matthew Sharifi
  • Patent number: 9912617
    Abstract: Voice communication method and apparatus and method and apparatus for operating jitter buffer are described. Audio blocks are acquired in sequence. Each of the audio blocks includes one or more audio frames. Voice activity detection is performed on the audio blocks. In response to deciding voice onset for a present one of the audio blocks, a subsequence of the sequence of the acquired audio blocks is retrieved. The subsequence precedes the present audio block immediately. The subsequence has a predetermined length and non-voice is decided for each audio block in the subsequence. The present audio block and the audio blocks in the subsequence are transmitted to a receiving party. The audio blocks in the subsequence are identified as reprocessed audio blocks. In response to deciding non-voice for the present audio block, the present audio block is cached.
    Type: Grant
    Filed: January 4, 2017
    Date of Patent: March 6, 2018
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Glenn N. Dickins, Xuejing Sun, Brendon Costa
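A sketch of the onset handling described above: non-voice blocks are cached, and when voice onset is decided, a fixed-length subsequence of the most recently cached blocks is transmitted ahead of the present block and marked as reprocessed. Block contents and the voice-activity decision are toy stand-ins.

```python
# Sketch: cache non-voice blocks; on voice onset, flush them as reprocessed blocks.
from collections import deque

class OnsetSender:
    def __init__(self, history_len: int = 3):
        self.cache = deque(maxlen=history_len)   # immediately preceding non-voice blocks

    def push(self, block, is_voice: bool, send):
        if is_voice:
            for cached in self.cache:            # transmit the preceding subsequence first,
                send(cached, reprocessed=True)   # identified as reprocessed audio blocks
            self.cache.clear()
            send(block, reprocessed=False)       # then the present (voice) block
        else:
            self.cache.append(block)             # non-voice: cache the present block

if __name__ == "__main__":
    sent = []
    sender = OnsetSender(history_len=2)
    def send(block, reprocessed):
        sent.append((block, reprocessed))
    for i, voice in enumerate([False, False, False, True, True]):
        sender.push(f"block{i}", voice, send)
    print(sent)   # block1/block2 sent as reprocessed, then block3 and block4
```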
  • Patent number: 9875474
    Abstract: A method is provided for securing a transaction made by bank card, the transaction involving a remote provision, by a user, of data existing in a bank card in his possession. The method includes: obtaining data existing in the bank card to be used, called textual data; obtaining at least one portion of the textual data in the form of an audio data stream, called a sound sample, resulting from reading the data existing in the bank card to be used; computing a current voice signature from said sound sample; comparing said current voice signature with a reference voice signature pre-recorded and associated with the textual data of the bank card; and, when the reference voice signature differs from the current voice signature by a value greater than a first value defined by a predetermined parameter, rejecting the transaction.
    Type: Grant
    Filed: January 16, 2015
    Date of Patent: January 23, 2018
    Assignee: INGENICO GROUP
    Inventor: Michel Leger
  • Patent number: 9858919
    Abstract: A method includes providing a deep neural network acoustic model, receiving audio data including one or more utterances of a speaker, extracting a plurality of speech recognition features from the one or more utterances of the speaker, creating a speaker identity vector for the speaker based on the extracted speech recognition features, and adapting the deep neural network acoustic model for automatic speech recognition using the extracted speech recognition features and the speaker identity vector.
    Type: Grant
    Filed: September 29, 2014
    Date of Patent: January 2, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: George A. Saon
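A sketch of one common way a speaker identity vector is used to adapt a DNN acoustic model: the same utterance-level identity vector is appended to every frame's acoustic features before they reach the network. This i-vector-style concatenation is an assumption for illustration; the patent covers its own adaptation method, not necessarily this exact wiring.

```python
# Sketch: append a speaker identity vector to every frame of acoustic features.
import numpy as np

def adapt_inputs(frame_features: np.ndarray, speaker_identity: np.ndarray) -> np.ndarray:
    """frame_features: (T, D); speaker_identity: (K,) -> returns (T, D + K)."""
    tiled = np.tile(speaker_identity, (frame_features.shape[0], 1))
    return np.concatenate([frame_features, tiled], axis=1)

if __name__ == "__main__":
    frames = np.random.default_rng(2).normal(size=(100, 40))   # 100 frames, 40-dim features
    ivec = np.random.default_rng(3).normal(size=10)            # 10-dim speaker identity vector
    print(adapt_inputs(frames, ivec).shape)                    # (100, 50)
```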
  • Patent number: 9804822
    Abstract: An electronic apparatus and a controlling method thereof are disclosed. The electronic apparatus includes a voice input unit configured to receive a user voice, a storage unit configured to store a plurality of voice print feature models representing a plurality of user voices and a plurality of utterance environment models representing a plurality of environmental disturbances, and a controller configured to, in response to a user voice being input through the voice input unit, extract utterance environment information of an utterance environment model among the plurality of utterance environment models corresponding to a location where the user voice is input, compare a voice print feature of the input user voice with the plurality of voice print feature models, revise a result of the comparison based on the extracted utterance environment information, and recognize a user corresponding to the input user voice based on the revised result.
    Type: Grant
    Filed: April 28, 2015
    Date of Patent: October 31, 2017
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Chi-sang Jung, Byung-jin Hwang
  • Patent number: 9741348
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for verifying an identity of a user. The methods, systems, and apparatus include actions of receiving a request for a verification phrase for verifying an identity of a user. Additional actions include, in response to receiving the request for the verification phrase for verifying the identity of the user, identifying subwords to be included in the verification phrase and in response to identifying the subwords to be included in the verification phrase, obtaining a candidate phrase that includes at least some of the identified subwords as the verification phrase. Further actions include providing the verification phrase as a response to the request for the verification phrase for verifying the identity of the user.
    Type: Grant
    Filed: June 24, 2016
    Date of Patent: August 22, 2017
    Assignee: Google Inc.
    Inventors: Dominik Roblek, Matthew Sharifi
  • Patent number: 9727603
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining query refinements using search data. In one aspect, a method includes receiving a first query and a second query each comprising one or more n-grams for a user session, determining a first set of query refinements for the first query, determining a second set of query refinements from the first set of query refinements, each query refinement in the second set of query refinements including at least one n-gram that is similar to an n-gram from the first query and at least one n-gram that is similar to an n-gram from the second query, scoring each query refinement in the second set of query refinements, selecting a third query from a group consisting of the second set of query refinements and the second query, and providing the third query as input to a search operation.
    Type: Grant
    Filed: July 30, 2015
    Date of Patent: August 8, 2017
    Assignee: Google Inc.
    Inventors: Matthias Heiler, Behshad Behzadi, Evgeny A. Cherepanov, Nils Grimsmo, Aurelien Boffy, Alessandro Agostini, Karoly Csalogany, Fredrik Bergenlid, Marcin M. Nowak-Przygodzki
  • Patent number: 9728191
    Abstract: Techniques for automatically identifying a speaker in a conversation as a known person based on processing of audio of the speaker's voice to extract characteristics of that voice and on an automated comparison of those characteristics to known characteristics of the known person's voice. A speaker segmentation process may be performed on audio of the conversation to produce, for each speaker in the conversation, a segment that includes the audio of that speaker. Audio of each of the segments may then be processed to extract characteristics of that speaker's voice. The characteristics derived from each segment (and thus for multiple speakers) may then be compared to characteristics of the known person's voice to determine whether the speaker for that segment is the known person. For each segment, a degree of match between the voice characteristics of the speaker and the voice characteristics of the known person may be calculated.
    Type: Grant
    Filed: August 27, 2015
    Date of Patent: August 8, 2017
    Assignee: Nuance Communications, Inc.
    Inventors: Emanuele Dalmasso, Daniele Colibro, Claudio Vair, Kevin R. Farrell
  • Patent number: 9704485
    Abstract: The present invention relates to a multimedia information retrieval method and electronic device, the multimedia information retrieval method comprising the steps of: extracting from a to-be-retrieved multimedia the voice of the to-be-retrieved multimedia; recognizing the voice of the to-be-retrieved multimedia to obtain a recognized text; and retrieving a multimedia database according to the recognized text to obtain the multimedia information of the to-be-retrieved multimedia. The present invention also relates to an electronic device. The multimedia information retrieval method and electronic device of the present invention can automatically, quickly, and comprehensively present to a user the multimedia information the user wants to know, thus greatly improving user retrieval efficiency and retrieval success rate.
    Type: Grant
    Filed: February 4, 2015
    Date of Patent: July 11, 2017
    Assignee: Tencent Technology (Shenzhen) Company Limited
    Inventors: Peng Hu, Teng Zhang
  • Patent number: 9646605
    Abstract: A system and method are presented for using spoken word verification to reduce false alarms by exploiting global and local contexts on a lexical level, a phoneme level, and on an acoustical level. The reduction of false alarms may occur through a process that determines whether a word has been detected or if it is a false alarm. Training examples are used to generate models of internal and external contexts which are compared to test word examples. The word may be accepted or rejected based on comparison results. Comparison may be performed either at the end of the process or at multiple steps of the process to determine whether the word is rejected.
    Type: Grant
    Filed: January 22, 2013
    Date of Patent: May 9, 2017
    Assignee: Interactive Intelligence Group, Inc.
    Inventors: Konstantin Biatov, Aravind Ganapathiraju, Felix Immanuel Wyss
  • Patent number: 9571425
    Abstract: Voice communication method and apparatus and method and apparatus for operating jitter buffer are described. Audio blocks are acquired in sequence. Each of the audio blocks includes one or more audio frames. Voice activity detection is performed on the audio blocks. In response to deciding voice onset for a present one of the audio blocks, a subsequence of the sequence of the acquired audio blocks is retrieved. The subsequence precedes the present audio block immediately. The subsequence has a predetermined length and non-voice is decided for each audio block in the subsequence. The present audio block and the audio blocks in the subsequence are transmitted to a receiving party. The audio blocks in the subsequence are identified as reprocessed audio blocks. In response to deciding non-voice for the present audio block, the present audio block is cached.
    Type: Grant
    Filed: March 21, 2013
    Date of Patent: February 14, 2017
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Glenn N. Dickins, Xuejing Sun, Brendon Costa
  • Patent number: 9508341
    Abstract: Features are disclosed for active learning to identify the words which are likely to improve the guessing and automatic speech recognition (ASR) after manual annotation. When a speech recognition system needs pronunciations for words, a lexicon is typically used. For unknown words, pronunciation-guessing (G2P) may be included to provide pronunciations in an unattended (e.g., automatic) fashion. However, having manually (e.g., by a human) annotated pronunciations provides better ASR than having automatic pronunciations that may, in some instances, be wrong. The included active learning features help to direct these limited annotation resources.
    Type: Grant
    Filed: September 3, 2014
    Date of Patent: November 29, 2016
    Assignee: Amazon Technologies, Inc.
    Inventors: Alok Ulhas Parlikar, Andrew Jake Rosenbaum, Jeffrey Paul Lilly, Jeffrey Penrod Adams
  • Patent number: 9401148
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for inputting speech data that corresponds to a particular utterance to a neural network; determining an evaluation vector based on output at a hidden layer of the neural network; comparing the evaluation vector with a reference vector that corresponds to a past utterance of a particular speaker; and based on comparing the evaluation vector and the reference vector, determining whether the particular utterance was likely spoken by the particular speaker.
    Type: Grant
    Filed: March 28, 2014
    Date of Patent: July 26, 2016
    Assignee: Google Inc.
    Inventors: Xin Lei, Erik McDermott, Ehsan Variani, Ignacio L. Moreno
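A sketch of the comparison described above: activations from a hidden layer of the network serve as an evaluation vector for the utterance, which is compared with a stored reference vector for the claimed speaker, here via cosine similarity against a fixed threshold. The tiny feed-forward layer, the averaging over frames, and the threshold value are illustrative assumptions.

```python
# Sketch: hidden-layer evaluation vector compared with a reference vector.
import numpy as np

def hidden_layer_output(frames: np.ndarray, w1: np.ndarray, b1: np.ndarray) -> np.ndarray:
    """Average the first hidden layer's activations over all frames."""
    hidden = np.maximum(0.0, frames @ w1 + b1)   # ReLU hidden layer
    return hidden.mean(axis=0)                   # evaluation vector

def same_speaker(evaluation: np.ndarray, reference: np.ndarray, threshold: float = 0.7) -> bool:
    cos = evaluation @ reference / (np.linalg.norm(evaluation) * np.linalg.norm(reference) + 1e-12)
    return cos >= threshold

if __name__ == "__main__":
    rng = np.random.default_rng(4)
    w1, b1 = rng.normal(size=(40, 64)), np.zeros(64)
    enroll = hidden_layer_output(rng.normal(size=(80, 40)), w1, b1)   # past utterance
    test = hidden_layer_output(rng.normal(size=(60, 40)), w1, b1)     # particular utterance
    print("likely same speaker:", same_speaker(test, enroll))
```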
  • Patent number: 9390709
    Abstract: A semiconductor integrated circuit device for voice recognition includes: a signal processing unit which generates a feature pattern representing a state of distribution of frequency components of an input voice signal; a voice recognition database storage unit which stores a voice recognition database including a standard pattern representing a state of distribution of frequency components of plural phonemes; a conversion list storage unit which stores a conversion list including plural words or sentences to be conversion candidates; a standard pattern extraction unit which extracts a standard pattern corresponding to character data representing the first syllable of each word or sentence included in the conversion list, from the voice recognition database; and a matching detection unit which compares the feature pattern generated from the first syllable of the voice signal with the extracted standard pattern and thus detects the matching of the syllable.
    Type: Grant
    Filed: September 20, 2013
    Date of Patent: July 12, 2016
    Assignee: SEIKO EPSON CORPORATION
    Inventor: Tsutomu Nonaka
  • Patent number: 9286892
    Abstract: Some implementations include a computer-implemented method. The method can include providing a training set of text samples to a semantic parser that associates text samples with actions. The method can include obtaining, for each of one or more of the text samples of the training set, data that indicates one or more domains that the semantic parser has associated with the text sample. For each of one or more domains, a subset of the text samples of the training set can be generated that the semantic parser has associated with the domain. Using the subset of text samples associated with the domain, a language model can be generated for one or more of the domains. Speech recognition can be performed on an utterance using the one or more language models that are generated for one or more of the domains.
    Type: Grant
    Filed: April 1, 2014
    Date of Patent: March 15, 2016
    Assignee: Google Inc.
    Inventors: Pedro J. Moreno Mengibar, Mark Edward Epstein
  • Patent number: 9043207
    Abstract: The present invention relates to a method for speaker recognition, comprising the steps of obtaining and storing speaker information for at least one target speaker; obtaining a plurality of speech samples from a plurality of telephone calls from at least one unknown speaker; classifying the speech samples according to the at least one unknown speaker thereby providing speaker-dependent classes of speech samples; extracting speaker information for the speech samples of each of the speaker-dependent classes of speech samples; combining the extracted speaker information for each of the speaker-dependent classes of speech samples; comparing the combined extracted speaker information for each of the speaker-dependent classes of speech samples with the stored speaker information for the at least one target speaker to obtain at least one comparison result; and determining whether one of the at least one unknown speakers is identical with the at least one target speaker based on the at least one comparison result.
    Type: Grant
    Filed: November 12, 2009
    Date of Patent: May 26, 2015
    Assignee: Agnitio S.L.
    Inventors: Johan Nikolaas Langehoven Brummer, Luis Buera Rodriguez, Marta Garcia Gomar
  • Publication number: 20150142441
    Abstract: A display apparatus is provided. The display apparatus includes a communicator configured to communicate with a voice recognition apparatus that recognizes an uttered voice of a user, an input unit configured to receive the uttered voice of the user, a display unit configured to receive, from the voice recognition apparatus, voice recognition result information about the uttered voice of the user and display the voice recognition result information, and a processor configured to, when the display apparatus is turned on, perform an access to the voice recognition apparatus by transmitting access request information to the voice recognition apparatus, and when the uttered voice is inputted through the input unit, transmit voice information on the uttered voice to the voice recognition apparatus through the communicator.
    Type: Application
    Filed: November 18, 2014
    Publication date: May 21, 2015
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Myung-jae Kim, Hee-seob Ryu, Kwang-il Hwang
  • Publication number: 20150142440
    Abstract: Feedback mechanisms to the user of a Head Mounted Display (HMD) are provided. It is important to provide feedback to the user, as soon as possible after the user utters a voice command, that speech has been recognized. The HMD displays and/or audibly renders an ASR acknowledgment in a manner that assures the user that the HMD has received and understood the voiced command.
    Type: Application
    Filed: November 13, 2014
    Publication date: May 21, 2015
    Inventors: Christopher Parkinson, James Woodall
  • Patent number: 9026442
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
    Type: Grant
    Filed: August 14, 2014
    Date of Patent: May 5, 2015
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
  • Publication number: 20150112681
    Abstract: A voice retrieval device includes a processor; and a memory which stores a plurality of instructions, which when executed by the processor, cause the processor to execute: setting detection criteria for a retrieval word, based on a characteristic of the retrieval word, such that the higher the detection accuracy of the retrieval word or the lower the pronunciation difficulty of the retrieval word or the lower the appearance probability of the retrieval word, the stricter the detection criteria; performing first voice retrieval processing on voice data according to the detection criteria and detecting a section that possibly includes the retrieval word as a candidate section from the voice data; and performing second voice retrieval processing different from the first voice retrieval processing on each candidate section and determining whether or not the retrieval word is included in each candidate section.
    Type: Application
    Filed: October 16, 2014
    Publication date: April 23, 2015
    Applicant: Fujitsu Limited
    Inventors: Masakiyo Tanaka, Hitoshi Iwamida, Nobuyuki Washio
  • Publication number: 20150112682
    Abstract: The present invention refers to a method for verifying the identity of a speaker based on the speaker's voice comprising the steps of: a) receiving a voice utterance; b) using biometric voice data to verify that the speaker's voice corresponds to the speaker whose identity is to be verified based on the received voice utterance; and c) verifying that the received voice utterance is not falsified, preferably after having verified the speaker's voice; d) accepting the speaker's identity to be verified in case that both verification steps give a positive result and not accepting the speaker's identity to be verified if any of the verification steps gives a negative result. The invention further refers to a corresponding computer readable medium and a computer.
    Type: Application
    Filed: January 5, 2015
    Publication date: April 23, 2015
    Inventors: Luis Buera Rodriguez, Marta Garcia Gomar, Marta Sanchez Asenjo, Alberto Martin de los Santos de las Heras, Alfredo Gutierrez, Carlos Vaquero Aviles-Casco, Alfonso Ortega Gimenez
  • Publication number: 20150095029
    Abstract: Engaging persona candidates are provided with a skills assessment that includes vocal behavior. Each candidate provides both scripted and spontaneous answers to questions in a situational setting that closely matches the daily demands of the customer support industry. Samples of the candidate's speech are evaluated to identify distinct voice cues that qualitatively describe speech characteristics, which are scored based on the candidate's spoken performance. One or more of the voice cues are mapped to phonetic analytics that quantitatively describe vocal behavior. Each voice cue also has an assigned weight. The voice cue scores for each phonetic analytic are multiplied by their assigned weights and added together to form a weighted phonetic analytic, which is then used to form a part of the vocal behavior risk assessments.
    Type: Application
    Filed: October 2, 2013
    Publication date: April 2, 2015
    Applicant: StarTek, Inc.
    Inventors: Ted Nardin, James Keaten
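A sketch of the arithmetic described in the entry above: voice cue scores are multiplied by their assigned weights and summed to form a weighted phonetic analytic. The cue names, weights, and scores here are invented for illustration.

```python
# Sketch: weighted sum of voice cue scores -> weighted phonetic analytic.
CUE_WEIGHTS = {"clarity": 0.4, "pace": 0.25, "warmth": 0.35}

def weighted_phonetic_analytic(cue_scores: dict[str, float],
                               weights: dict[str, float] = CUE_WEIGHTS) -> float:
    return sum(weights[cue] * score for cue, score in cue_scores.items() if cue in weights)

if __name__ == "__main__":
    candidate_scores = {"clarity": 4.0, "pace": 3.0, "warmth": 5.0}   # e.g. on a 1-5 scale
    print(weighted_phonetic_analytic(candidate_scores))               # 0.4*4 + 0.25*3 + 0.35*5 = 4.1
```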
  • Patent number: 8996373
    Abstract: A state detection device includes: a first model generation unit to generate a first specific speaker model obtained by modeling speech features of a specific speaker in an undepressed state; a second model generation unit to generate a second specific speaker model obtained by modeling speech features of the specific speaker in the depressed state; a likelihood calculation unit to calculate a first likelihood as a likelihood of the first specific speaker model with respect to input voice, and a second likelihood as a likelihood of the second specific speaker model with respect to the input voice; and a state determination unit to determine a state of the speaker of the input voice using the first likelihood and the second likelihood.
    Type: Grant
    Filed: October 5, 2011
    Date of Patent: March 31, 2015
    Assignee: Fujitsu Limited
    Inventors: Shoji Hayakawa, Naoshi Matsuo
  • Patent number: 8996387
    Abstract: For clearing transaction data selected for a processing, there is generated in a portable data carrier (1) a transaction acoustic signal (003; 103; 203) (S007; S107; S207) upon whose acoustic reproduction by an end device (10) at least transaction data selected for the processing are reproduced superimposed acoustically with a melody specific to a user of the data carrier (1) (S009; S109; S209). The generated transaction acoustic signal (003; 103; 203) is electronically transferred to an end device (10) (S108; S208), which processes the selected transaction data (S011; S121; S216) only when the user of the data carrier (1) confirms vis-à-vis the end device (10) an at least partial match both of the acoustically reproduced melody with the user-specific melody and of the acoustically reproduced transaction data with the selected transaction data (S010; S110, S116; S210).
    Type: Grant
    Filed: September 8, 2009
    Date of Patent: March 31, 2015
    Assignee: Giesecke & Devrient GmbH
    Inventors: Thomas Stocker, Michael Baldischweiler
  • Publication number: 20150088514
    Abstract: Techniques for providing virtual assistants to assist users during a voice communication between the users. For instance, a first user operating a device may establish a voice communication with respective devices of one or more additional users, such as with a device of a second user. For instance, the first user may utilize her device to place a telephone call to the device of the second user. A virtual assistant may also join the call and, upon invocation by a user on the call, may identify voice commands from the call and may perform corresponding tasks for the users in response.
    Type: Application
    Filed: September 25, 2013
    Publication date: March 26, 2015
    Applicant: Rawles LLC
    Inventor: Marcello Typrin
  • Publication number: 20150081301
    Abstract: A system includes a user speech profile stored on a computer readable storage device, the speech profile containing a plurality of phonemes with user identifying characteristics for the phonemes, and a speech processor coupled to access the speech profile to generate a phrase containing user distinguishing phonemes based on a difference between the user identifying characteristics for such phonemes and average user identifying characteristics, such that the phrase has discriminability from other users. The speech processor may also or alternatively select the phrase as a function of ambient noise.
    Type: Application
    Filed: September 18, 2013
    Publication date: March 19, 2015
    Applicant: Lenovo (Singapore) Pte, Ltd.
    Inventors: John Weldon Nicholson, Steven Richard Perrin
  • Publication number: 20150081302
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for performing speaker verification. A system configured to practice the method receives a request to verify a speaker, generates a text challenge that is unique to the request, and, in response to the request, prompts the speaker to utter the text challenge. Then the system records a dynamic image feature of the speaker as the speaker utters the text challenge, and performs speaker verification based on the dynamic image feature and the text challenge. Recording the dynamic image feature of the speaker can include recording video of the speaker while speaking the text challenge. The dynamic feature can include a movement pattern of head, lips, mouth, eyes, and/or eyebrows of the speaker. The dynamic image feature can relate to phonetic content of the speaker speaking the challenge, speech prosody, and the speaker's facial expression responding to content of the challenge.
    Type: Application
    Filed: November 24, 2014
    Publication date: March 19, 2015
    Inventors: Ann K. Syrdal, Sumit Chopra, Patrick Haffner, Taniya Mishra, Ilija Zeljkovic, Eric Zavesky
  • Patent number: 8977547
    Abstract: A voice recognition system includes: a voice input unit 11 for inputting a voice uttered a plurality of times; a registering voice data storage unit 12 for storing voice data uttered the plurality of times and input into the voice input unit 11; an utterance stability verification unit 13 for determining a similarity between the voice data uttered the plurality of times that are read from the registering voice data storage unit 12, and determining that registration of the voice data is acceptable when the similarity is greater than a threshold T1; and a standard pattern creation unit 14 for creating a standard pattern by using the voice data where the utterance stability verification unit 13 determines that registration is acceptable.
    Type: Grant
    Filed: October 8, 2009
    Date of Patent: March 10, 2015
    Assignee: Mitsubishi Electric Corporation
    Inventors: Michihiro Yamazaki, Jun Ishii, Hiroki Sakashita, Kazuyuki Nogi
  • Publication number: 20150058017
    Abstract: Disclosed in some examples are systems, methods, devices, and machine readable mediums which may produce an audio recording with included verification from the individuals in the recording that the recording is accurate. In some examples, the system may also provide rights management control to those individuals. This may ensure that individuals participating in audio events that are to be recorded are assured that their words are not changed, taken out of context, or otherwise altered and that they retain control over the use of their words even after the physical file has left their control.
    Type: Application
    Filed: August 20, 2013
    Publication date: February 26, 2015
    Inventors: Dave Paul Singh, Dominic Fulginti, Mahendra Tadi Tadikonda, Tobias Kohlenberg
  • Patent number: 8949125
    Abstract: Systems and methods are provided to select a most typical pronunciation of a location name on a map from a plurality of user pronunciations. A server generates a reference speech model based on user pronunciations, compares the user pronunciations with the speech model, and selects a pronunciation based on the comparison. Alternatively, the server compares the distance between each of the user pronunciations and every other user pronunciation and selects a pronunciation based on the comparison. The server then annotates the map with the selected pronunciation and provides the audio output of the location name to a user device upon a user's request.
    Type: Grant
    Filed: June 16, 2010
    Date of Patent: February 3, 2015
    Assignee: Google Inc.
    Inventor: Gal Chechik
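A sketch of the alternative selection rule in the abstract above: compare every user pronunciation with every other one and pick the one with the smallest total distance (a medoid), standing in for "most typical". The feature-vector representation and Euclidean distance are simplifying assumptions; the patent also describes comparison against a reference speech model.

```python
# Sketch: medoid selection over pairwise distances between pronunciations.
import numpy as np

def most_typical(pronunciation_features: np.ndarray) -> int:
    """pronunciation_features: (N, D), one feature vector per user pronunciation."""
    diffs = pronunciation_features[:, None, :] - pronunciation_features[None, :, :]
    distances = np.linalg.norm(diffs, axis=-1)    # (N, N) pairwise distances
    return int(np.argmin(distances.sum(axis=1)))  # index with least total distance

if __name__ == "__main__":
    rng = np.random.default_rng(5)
    cluster = rng.normal(loc=0.0, size=(9, 16))   # nine similar pronunciations
    outlier = rng.normal(loc=5.0, size=(1, 16))   # one very different pronunciation
    prons = np.vstack([cluster, outlier])
    print("selected pronunciation index:", most_typical(prons))   # one of 0..8, not 9
```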
  • Patent number: 8947499
    Abstract: Methods and systems for communicating with rate control. A communication is sent and received from a first device to a second device over a network, wherein the communication comprises at least one audio stream and a second communication stream. A capacity of the network is probed at the first device for the sending and receiving the communication. A presence of a voice in the at least one audio stream is detected at the first device via a voice activity detection of the at least one audio stream. A rate limit is set for the sending and receiving the communication at the first device based on the capacity of the network and the detection of the presence of the at least one audio stream.
    Type: Grant
    Filed: December 6, 2012
    Date of Patent: February 3, 2015
    Assignee: TangoMe, Inc.
    Inventors: Alexander Subbotin, Olivier Furon, Shaowei Su, Yevgeni Litvin, Xu Liu