Patents Examined by Vijay B. Chawan
-
Patent number: 10747960
Abstract: Systems and methods are disclosed herein for training a model to accurately determine whether two phrases are conversationally connected. A media guidance application may detect a first phrase and a second phrase, translate each phrase to a string of word types, append each string to the back of a prior string to create a combined string, determine a degree to which any of the individual strings matches any singleton template, and determine a degree to which the combined string matches any conversational template. Based on the degrees to which the individual and combined strings match the singleton and conversational templates, respectively, the strengths of association are updated accordingly.
Type: Grant
Filed: October 11, 2018
Date of Patent: August 18, 2020
Assignee: ROVI GUIDES, INC.
Inventors: Sashikumar Venkataraman, Ahmed Nizam Mohaideen P, Manik Malhotra
-
Patent number: 10748544
Abstract: A voice processing device includes: a sound source localization unit configured to determine a direction of each sound source on the basis of voice signals of a plurality of channels; a sound source separation unit configured to separate signals for respective sound sources indicating components of respective sound sources from the voice signals of the plurality of channels; a speech section detection unit configured to detect a speech section in which the number of speakers is one from the signals for respective sound sources; and a speaker identification unit configured to identify a speaker on the basis of the signals for respective sound sources in the speech section.
Type: Grant
Filed: March 23, 2018
Date of Patent: August 18, 2020
Assignee: HONDA MOTOR CO., LTD.
Inventors: Kazuhiro Nakadai, Tomoyuki Sahata
-
Patent number: 10741176
Abstract: Mechanisms are provided for customizing responses to future questions based on identified anomalies in user profile information. An automated dialogue system monitors information associated with a plurality of entities, where the information includes quantities for variable values associated with the entities. The automated dialogue system, in response to determining that a quantity of a variable value associated with an entity in the plurality of entities has changed by an amount equal to or exceeding a corresponding threshold value, generates response information associated with a quantity of the variable value and an entity to respond to at least one future question. In addition, the automated dialogue system stores the responsive information in association with the entity for later retrieval in response to initiation of a dialogue session with the automated dialogue system. Time thresholds may be established for determining when to stop using the responsive information for responding to questions.
Type: Grant
Filed: January 31, 2018
Date of Patent: August 11, 2020
Assignee: International Business Machines Corporation
Inventors: Mustafa Canim, Robert G. Farrell, John A. Gunnels, Arun K. Iyengar
-
Patent number: 10733977
Abstract: Methods, systems, and computer-readable media for automated transcription model adaptation include obtaining audio data from a plurality of audio files. The audio data is transcribed to produce at least one audio file transcription, which represents a plurality of transcription alternatives for each audio file. Speech analytics are applied to each audio file transcription. A best transcription is selected from the plurality of transcription alternatives for each audio file. Statistics are calculated from the selected best transcriptions, and an adapted model is created from the calculated statistics.
Type: Grant
Filed: March 29, 2017
Date of Patent: August 4, 2020
Assignee: Verint Systems Ltd.
Inventors: Ran Achituv, Omer Ziv, Roni Romano, Ido Shapira, Daniel Baum
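The selection-and-statistics steps of this adaptation pipeline can be sketched as follows. Treating "best" as the highest-confidence alternative is an illustrative assumption (the patent selects via speech analytics), and both function names are hypothetical:

```python
def best_transcription(alternatives):
    # alternatives: list of (text, confidence) pairs for one audio file.
    # Assumption for illustration: "best" = highest confidence score.
    return max(alternatives, key=lambda alt: alt[1])[0]

def count_word_stats(transcriptions):
    # Word-frequency statistics over the selected best transcriptions,
    # the kind of counts from which an adapted model could be built.
    stats = {}
    for text in transcriptions:
        for word in text.split():
            stats[word] = stats.get(word, 0) + 1
    return stats
```
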
-
Patent number: 10720176
Abstract: A computer-implemented method of multisensory speech detection is disclosed. The method comprises determining an orientation of a mobile device and determining an operating mode of the mobile device based on the orientation of the mobile device. The method further includes identifying speech detection parameters that specify when speech detection begins or ends based on the determined operating mode and detecting speech from a user of the mobile device based on the speech detection parameters.
Type: Grant
Filed: August 22, 2018
Date of Patent: July 21, 2020
Assignee: Google LLC
Inventors: Dave Burke, Michael J. Lebeau, Konrad Gianno, Trausti T. Kristjansson, John Nicholas Jitkoff, Andrew W. Senior
-
Patent number: 10719666
Abstract: A method for identifying phrases in a text document having a similar discourse to a candidate phrase includes separating text in a document file into a plurality of phrases and generating a plurality of embedding vectors in a textual embedding space by inputting the plurality of phrases into an embedding engine. A mapping of each embedding vector in the textual embedding space is generated with each corresponding phrase and a document location of each corresponding phrase in the document file. A candidate phrase is received from a user and a candidate embedding vector is generated using the embedding engine. Similarity scores are computed based on the embedding space distances between the candidate phrase embedding vector location and the location of each embedding vector in the textual embedding space. A listing of the phrases with the highest similarity scores is output with their respective document locations in the text.
Type: Grant
Filed: January 31, 2020
Date of Patent: July 21, 2020
Assignee: Capital One Services, LLC
Inventors: Austin Walters, Vincent Pham, Ernest Kwak, Galen Rafferty, Reza Farivar, Jeremy Goodsitt, Anh Truong
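The similarity-scoring step can be sketched as follows. The `embed` function is a hypothetical stand-in (a deterministic bag-of-words hash into a small vector), since the abstract does not specify the embedding engine; the ranking by cosine similarity over the indexed phrase-to-location mapping is the part being illustrated:

```python
import math

def embed(phrase):
    # Hypothetical stand-in for the embedding engine: a deterministic
    # bag-of-words hash into an 8-dimensional vector.
    vec = [0.0] * 8
    for word in phrase.lower().split():
        vec[sum(ord(c) for c in word) % 8] += 1.0
    return vec

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def rank_phrases(candidate, indexed_phrases, top_k=3):
    # indexed_phrases maps each document phrase to its document location,
    # mirroring the mapping the method builds over the embedding space.
    cand_vec = embed(candidate)
    scored = sorted(
        ((cosine_similarity(cand_vec, embed(p)), p, loc)
         for p, loc in indexed_phrases.items()),
        key=lambda t: t[0],
        reverse=True,
    )
    return scored[:top_k]
```
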
-
Patent number: 10714120
Abstract: A computer-implemented method of multisensory speech detection is disclosed. The method comprises determining an orientation of a mobile device and determining an operating mode of the mobile device based on the orientation of the mobile device. The method further includes identifying speech detection parameters that specify when speech detection begins or ends based on the determined operating mode and detecting speech from a user of the mobile device based on the speech detection parameters.
Type: Grant
Filed: June 25, 2018
Date of Patent: July 14, 2020
Assignee: Google LLC
Inventors: Dave Burke, Michael J. Lebeau, Konrad Gianno, Trausti T. Kristjansson, John Nicholas Jitkoff, Andrew W. Senior
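A minimal sketch of the orientation-to-parameters chain described above; the mode names, pitch thresholds, and parameter values are all illustrative assumptions, not taken from the patent:

```python
def operating_mode(pitch_degrees):
    # Map the device's pitch angle to an operating mode.
    # Thresholds are hypothetical.
    if pitch_degrees > 60:
        return "phone_to_ear"
    if pitch_degrees > 20:
        return "held_in_front"
    return "flat_on_table"

# Per-mode speech-detection parameters: whether detection starts
# automatically, and an energy threshold for endpointing (values are
# illustrative).
SPEECH_PARAMS = {
    "phone_to_ear": {"auto_start": True, "endpoint_db": -40},
    "held_in_front": {"auto_start": True, "endpoint_db": -35},
    "flat_on_table": {"auto_start": False, "endpoint_db": -30},
}

def speech_detection_params(pitch_degrees):
    # Parameters that specify when speech detection begins or ends,
    # chosen from the determined operating mode.
    return SPEECH_PARAMS[operating_mode(pitch_degrees)]
```
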
-
Patent number: 10713010
Abstract: A computer-implemented input-method editor process includes receiving a request from a user for an application-independent input method editor having written and spoken input capabilities, identifying that the user is about to provide spoken input to the application-independent input method editor, and receiving a spoken input from the user. The spoken input corresponds to input to an application and is converted to text that represents the spoken input. The text is provided as input to the application.
Type: Grant
Filed: October 24, 2018
Date of Patent: July 14, 2020
Assignee: Google LLC
Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, William J. Byrne, Gudmundur Hafsteinsson, Michael J. Lebeau
-
Patent number: 10714092
Abstract: A sensor processing unit comprises a sensor processor. The sensor processor is configured to communicatively couple with a microphone. The sensor processor is configured to acquire, from the microphone, an audio sample captured by the microphone from an environment in which the microphone is disposed. The sensor processor is configured to perform music activity detection on the audio sample to detect for music within the audio sample. Responsive to detection of music within the audio sample, the sensor processor is configured to send a music detection signal to an external processor located external to the sensor processing unit, the music detection signal indicating that music has been detected in the environment.
Type: Grant
Filed: October 1, 2018
Date of Patent: July 14, 2020
Assignee: InvenSense, Inc.
Inventors: William Kerry Keal, Ajay Kumar Dhanapalan, Sangnam Choi, Carlo Murgia, Eitan A. Medina, Taro Kimura
-
Patent number: 10706855
Abstract: A multimedia processing circuit is provided. The multimedia processing circuit includes a smart interpreter engine and an audio engine. The smart interpreter engine includes a noise suppression module, a vocal identification module and a speech to text converter. The noise suppression module is utilized for performing a noise suppression process on speech data corresponding to a first language. The vocal identification module is utilized for performing a vocal identification process on the noise-suppressed speech data corresponding to the first language to generate vocal identification data corresponding to the first language. The speech to text converter is utilized for converting the vocal identification data corresponding to the first language into text data corresponding to the first language.
Type: Grant
Filed: November 26, 2018
Date of Patent: July 7, 2020
Assignee: ACER INCORPORATED
Inventors: Gianna Tseng, Shih-Cheng Huang, Shang-Yao Lin, Szu-Ting Chou
-
Patent number: 10699691
Abstract: A head-mounted wearable device (HMWD) may incorporate bone conduction speakers (BCS) to generate audio output that is perceptible to a wearer. During operation, vibrations of the BCS may produce sound in the surrounding air that is perceptible to bystanders. An active noise cancellation module monitors the output and generates cancellation audio that is out of phase with the sound leaked by the BCS. An air conduction speaker emits the cancellation audio, producing destructive acoustic interference to the leaked sound. As a result, the user of the HMWD is able to hear audio clearly while bystanders are not.
Type: Grant
Filed: June 29, 2017
Date of Patent: June 30, 2020
Assignee: AMAZON TECHNOLOGIES, INC.
Inventors: Geng Ye, Zhen Xu, Haoxing Zhang, Bozhao Tan, Xuan Zhong, Jianchun Dong
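The phase-inversion idea behind this cancellation can be illustrated in toy form: the cancellation signal is the leaked signal inverted, so their sum is silence. A real system must also model the acoustic path, latency, and speaker response, all of which this sketch ignores:

```python
def cancellation_signal(leaked_samples):
    # Out-of-phase copy of the leaked sound: each sample is negated.
    return [-s for s in leaked_samples]

def mix(a, b):
    # What a bystander hears: the acoustic sum of the two signals.
    return [x + y for x, y in zip(a, b)]
```

With an idealized acoustic path, mixing the leaked sound with its inverted copy yields destructive interference (all-zero residual).
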
-
Patent number: 10692503
Abstract: A voice data processing method and apparatus are provided. The method includes obtaining an I-Vector for each of a set of voice samples and determining a target seed sample among the voice samples. A first cosine distance is calculated between the I-Vector of the target seed sample and the I-Vector of each target remaining voice sample, where a target remaining voice sample is a voice sample other than the target seed sample in the voice samples. A target voice sample is then filtered from the voice samples or the target remaining voice samples according to the first cosine distance, to obtain a target voice sample whose first cosine distance is greater than a first threshold.
Type: Grant
Filed: March 3, 2017
Date of Patent: June 23, 2020
Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Inventors: Xingming Jin, Wei Li, Fangmai Zheng, Fuzhang Wu, Bilei Zhu, Binghua Qian, Ke Li, Yongjian Wu, Feiyue Huang
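The threshold-filtering step can be sketched as follows; per the abstract, samples are kept when their score against the seed's I-Vector exceeds the threshold (the abstract's "cosine distance" is used here as the cosine score itself, which is an interpretation):

```python
import math

def cosine(a, b):
    # Cosine score between two I-Vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def filter_by_seed(seed_ivector, remaining_ivectors, threshold):
    # Keep only the samples whose score against the seed exceeds
    # the first threshold.
    return [iv for iv in remaining_ivectors
            if cosine(seed_ivector, iv) > threshold]
```
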
-
Patent number: 10692496
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for suppressing hotwords are disclosed. In one aspect, a method includes the actions of receiving audio data corresponding to playback of an utterance. The actions further include providing the audio data as an input to a model (i) that is configured to determine whether a given audio data sample includes an audio watermark and (ii) that was trained using watermarked audio data samples that each include an audio watermark sample and non-watermarked audio data samples that do not each include an audio watermark sample. The actions further include receiving, from the model, data indicating whether the audio data includes the audio watermark. The actions further include, based on the data indicating whether the audio data includes the audio watermark, determining to continue or cease processing of the audio data.
Type: Grant
Filed: May 21, 2019
Date of Patent: June 23, 2020
Assignee: Google LLC
Inventors: Alexander H. Gruenstein, Taral Pradeep Joglekar, Vijayaditya Peddinti, Michiel A. U. Bacchiani
-
Patent number: 10694298
Abstract: A hearing aid configured for detecting and enhancing speech within an audio environment is disclosed. An incoming audio stream is continuously monitored for the presence of speech within the audio stream. A Codebook Excited Linear Prediction ("CELP") encoder analyzes the incoming audio stream and outputs an indication of a presence or absence of human speech within the incoming audio stream. Upon detection of human speech, the hearing aid in real time may: amplify the audio input to make the speech more audible to a wearer; filter non-speech audio through isolation of the speech by passing the output of the CELP encoder directly to a CELP decoder; activate a beam-steering process which makes dominant a microphone closest to a speaker while de-prioritizing input from other microphones of the hearing aid; and/or shape the audio spectrum conveyed by the audio input using a response curve optimized for better clarity of human speech.
Type: Grant
Filed: October 22, 2019
Date of Patent: June 23, 2020
Inventors: Zeev Neumeier, W. Leo Hoarty
-
Patent number: 10685645
Abstract: A method for creating binary classification models and using the binary classification models to select candidate training utterances from a plurality of live utterances is provided. The method may include receiving a plurality of intents and associated training utterances. The method may include creating, from the training utterances, a binary classification model for each intent. The binary classification model may include a vector representation of a line of demarcation between utterances associated with the intent and utterances disassociated from the intent. The method may also include receiving live utterances. An intent may be determined for each live utterance. The method may include creating a vector representation of the live utterance. The method may include selecting candidate training utterances based on a comparison between the vector representation of the live utterance and the vector representation included in the binary classification model of the intent determined for the live utterance.
Type: Grant
Filed: August 9, 2018
Date of Patent: June 16, 2020
Assignee: Bank of America Corporation
Inventors: Maruthi Z. Shanmugam, Luis Gerardo Mojica de la Vega, Donatus Asumu
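One simple way to realize a "vector representation of a line of demarcation" is a linear separator built from class centroids; this construction, and the idea that a small-magnitude margin flags a useful candidate utterance, are illustrative assumptions rather than the patent's prescribed model:

```python
def centroid(vectors):
    # Mean vector of a set of utterance embeddings.
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def binary_model(positive, negative):
    # Per-intent binary model: a hyperplane separating utterances
    # associated with the intent from those disassociated from it.
    pos_c, neg_c = centroid(positive), centroid(negative)
    w = [p - q for p, q in zip(pos_c, neg_c)]         # normal vector
    mid = [(p + q) / 2 for p, q in zip(pos_c, neg_c)]  # midpoint
    b = -sum(wi * mi for wi, mi in zip(w, mid))        # offset
    return w, b

def margin(model, live_vector):
    # Signed score of a live utterance against the intent's model:
    # positive means the intent side; a small magnitude means the
    # utterance sits near the line of demarcation.
    w, b = model
    return sum(wi * xi for wi, xi in zip(w, live_vector)) + b
```
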
-
Patent number: 10682086
Abstract: Disclosed herein is a system and method for delivering a digital therapeutic specific to a user's emotional or mental state (EMS). This entails recognizing at least one EMS selected by the user from a plurality of EMS, the selected EMS indicating at least one of a feeling, sensation, type of discomfort, mood, mental state, emotional condition, or physical status of the user. It further entails pushing at least a primary-level message personalized to the user based on at least one stored message coupled to the selected EMS, where the primary-level messages contain at least one of a text, image, sound, video, art asset, suggested action, or recommended behavior. The actions suggested or behaviors recommended are supported by at least one independent source of peer-reviewed research, as verified by a credentialed EMS expert.
Type: Grant
Filed: April 20, 2018
Date of Patent: June 16, 2020
Assignee: Aebeze labs
Inventor: Michael Phillips Moskowitz
-
Patent number: 10667878
Abstract: A surgical system includes a plurality of voice sensors located in a surgical environment and configured to detect sound and generate a first plurality of signals. The surgical system also includes a position indicator, in proximity to a designated user, configured to indicate a first position of the designated user and generate a second signal representative of the first position. The surgical system further includes a processor configured to receive the first plurality of signals and the second signal and determine, based on the first plurality of signals, a second position. The processor is also configured to compare the detected sound with a registered voice command of the designated user stored in a memory to verify the designated user's credentials, and send a command signal to a surgical instrument to carry out an operation related to the voice command based on at least one of the verification of the designated user's credentials, the first position, and the second position.
Type: Grant
Filed: June 20, 2018
Date of Patent: June 2, 2020
Assignee: Ethicon LLC
Inventors: David J. Cagle, Eric Smith, Jeffrey L. Aldridge, Mary E. Mootoo, Ryan Asher
-
Patent number: 10657955
Abstract: Described herein are systems and methods to identify and address sources of bias in an end-to-end speech model. In one or more embodiments, the end-to-end model may be a recurrent neural network with two 2D-convolutional input layers, followed by multiple bidirectional recurrent layers and one fully connected layer before a softmax layer. In one or more embodiments, the network is trained end-to-end using the CTC loss function to directly predict sequences of characters from log spectrograms of audio. With optimized recurrent layers and training together with alignment information, some unwanted bias induced by using purely forward-only recurrences may be removed in a deployed model.
Type: Grant
Filed: January 30, 2018
Date of Patent: May 19, 2020
Assignee: Baidu USA LLC
Inventors: Eric Battenberg, Rewon Child, Adam Coates, Christopher Fougner, Yashesh Gaur, Jiaji Huang, Heewoo Jun, Ajay Kannan, Markus Kliegl, Atul Kumar, Hairong Liu, Vinay Rao, Sanjeev Satheesh, David Seetapun, Anuroop Sriram, Zhenyao Zhu
-
Patent number: 10650833
Abstract: A method for generating a bitstream indicative of an object based audio program is described. The bitstream comprises a sequence of containers. A first container of the sequence of containers comprises a plurality of substream entities for a plurality of substreams of the object based audio program and a presentation section. The method comprises determining a set of object channels. The method further comprises providing a set of object related metadata for the set of object channels. In addition, the method comprises inserting a first set of object channel frames and a first set of object related metadata frames into a respective set of substream entities of the first container. Furthermore, the method comprises inserting presentation data into the presentation section.
Type: Grant
Filed: October 2, 2018
Date of Patent: May 12, 2020
Assignee: Dolby International AB
Inventors: Christof Fersch, Alexander Stahlmann
-
Patent number: 10650810
Abstract: Systems and methods of determining phonetic relationships are provided. For instance, data indicative of an input text phrase input by a user can be received. An audio output corresponding to a spoken rendering of the input text phrase can be determined. A text transcription of the audio output of the input text phrase can be determined. The text transcription can be a textual representation of the audio output. The text transcription can be compared against a plurality of test phrases to identify a match between the text transcription and at least one test phrase.
Type: Grant
Filed: September 29, 2017
Date of Patent: May 12, 2020
Assignee: GOOGLE LLC
Inventors: Nikhil Chandru Rao, Saisuresh Krishnakumaran
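The round-trip check described above can be sketched with stand-in `tts` and `asr` functions; real speech synthesis and recognition would replace these placeholders, which exist only to make the pipeline shape visible:

```python
def tts(text):
    # Placeholder for speech synthesis: pretend the "waveform" carries
    # the normalized text.
    return ("audio", text.lower())

def asr(audio):
    # Placeholder for speech recognition: recover the text transcription
    # from the pretend waveform.
    return audio[1]

def matching_test_phrases(input_phrase, test_phrases):
    # Synthesize the input phrase, transcribe the audio, then compare
    # the transcription against the test phrases to find matches.
    transcription = asr(tts(input_phrase))
    return [t for t in test_phrases if t.lower() == transcription]
```
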