Patents Examined by Vijay B. Chawan
  • Patent number: 10747960
    Abstract: Systems and methods are disclosed herein for training a model to accurately determine whether two phrases are conversationally connected. A media guidance application may detect a first phrase and a second phrase, translate each phrase to a string of word types, append each string to the back of a prior string to create a combined string, determine a degree to which any of the individual strings matches any singleton template, and determine a degree to which the combined string matches any conversational template. Based on the degrees to which the individual and combined strings match the singleton and conversational templates, respectively, strengths of association are correspondingly updated.
    Type: Grant
    Filed: October 11, 2018
    Date of Patent: August 18, 2020
    Assignee: ROVI GUIDES, INC.
    Inventors: Sashikumar Venkataraman, Ahmed Nizam Mohaideen P, Manik Malhotra
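As a rough illustration of the matching flow this abstract describes, here is a minimal Python sketch. The word-type lexicon, the templates, and the use of difflib's SequenceMatcher as the "degree of match" are all illustrative assumptions, not the patented method.

```python
from difflib import SequenceMatcher

# Illustrative word-type lexicon; a real system would use a tagger/ontology.
WORD_TYPES = {"play": "ACTION", "the": "DET", "movie": "MEDIA",
              "pause": "ACTION", "it": "PRONOUN"}

SINGLETON_TEMPLATES = ["ACTION DET MEDIA", "ACTION PRONOUN"]
CONVERSATIONAL_TEMPLATES = ["ACTION DET MEDIA ACTION PRONOUN"]

def to_type_string(phrase: str) -> str:
    """Translate a phrase into a string of word types."""
    return " ".join(WORD_TYPES.get(w, "UNK") for w in phrase.lower().split())

def match_degree(type_string: str, templates: list[str]) -> float:
    """Degree (0..1) to which a type string matches its closest template."""
    return max(SequenceMatcher(None, type_string, t).ratio() for t in templates)

first = to_type_string("play the movie")
second = to_type_string("pause it")
combined = first + " " + second          # append second string after the first
print(match_degree(first, SINGLETON_TEMPLATES))          # 1.0: exact singleton
print(match_degree(combined, CONVERSATIONAL_TEMPLATES))  # high: connected
```

In the patented approach, the resulting match degrees would then drive updates to the stored strengths of association between templates.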
  • Patent number: 10748544
    Abstract: A voice processing device includes: a sound source localization unit configured to determine a direction of each sound source on the basis of voice signals of a plurality of channels; a sound source separation unit configured to separate signals for respective sound sources indicating components of respective sound sources from the voice signals of the plurality of channels; a speech section detection unit configured to detect a speech section in which the number of speakers is 1 from the signals for respective sound sources; and a speaker identification unit configured to identify a speaker on the basis of the signals for respective sound sources in the speech section.
    Type: Grant
    Filed: March 23, 2018
    Date of Patent: August 18, 2020
    Assignee: HONDA MOTOR CO., LTD.
    Inventors: Kazuhiro Nakadai, Tomoyuki Sahata
  • Patent number: 10741176
    Abstract: Mechanisms are provided for customizing responses to future questions based on identified anomalies in user profile information. An automated dialogue system monitors information associated with a plurality of entities, where the information includes quantities for variable values associated with the entities. The automated dialogue system, in response to determining that a quantity of a variable value associated with an entity in the plurality of entities has changed by an amount equal to or exceeding a corresponding threshold value, generates responsive information associated with a quantity of the variable value and an entity to respond to at least one future question. In addition, the automated dialogue system stores the responsive information in association with the entity for later retrieval in response to initiation of a dialogue session with the automated dialogue system. Time thresholds may be established for determining when to stop using the responsive information for responding to questions.
    Type: Grant
    Filed: January 31, 2018
    Date of Patent: August 11, 2020
    Assignee: International Business Machines Corporation
    Inventors: Mustafa Canim, Robert G. Farrell, John A. Gunnels, Arun K. Iyengar
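A hedged Python sketch of the threshold-triggered flow in this abstract follows. The entity and variable names, the generated response text, and the 24-hour staleness window are assumptions for illustration only.

```python
from dataclasses import dataclass, field
import time

@dataclass
class EntityMonitor:
    thresholds: dict                               # variable name -> change threshold
    responses: dict = field(default_factory=dict)  # (entity, variable) -> (text, timestamp)

    def observe(self, entity: str, variable: str, old: float, new: float):
        """Generate responsive information when a change meets its threshold."""
        if abs(new - old) >= self.thresholds.get(variable, float("inf")):
            text = f"{entity}'s {variable} changed from {old} to {new}."
            self.responses[(entity, variable)] = (text, time.time())

    def answer(self, entity: str, variable: str, max_age_s: float = 86400):
        """Retrieve stored responsive information, unless it has gone stale."""
        stored = self.responses.get((entity, variable))
        if stored and time.time() - stored[1] <= max_age_s:
            return stored[0]
        return None   # stale or absent: fall back to normal dialogue flow

monitor = EntityMonitor(thresholds={"share_price": 5.0})
monitor.observe("ACME", "share_price", 100.0, 92.0)   # change of 8 >= 5 triggers
print(monitor.answer("ACME", "share_price"))
```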
  • Patent number: 10733977
    Abstract: Methods, systems, and computer readable media for automated transcription model adaptation include obtaining audio data from a plurality of audio files. The audio data is transcribed to produce at least one audio file transcription which represents a plurality of transcription alternatives for each audio file. Speech analytics are applied to each audio file transcription. A best transcription is selected from the plurality of transcription alternatives for each audio file. Statistics from the selected best transcription are calculated. An adapted model is created from the calculated statistics.
    Type: Grant
    Filed: March 29, 2017
    Date of Patent: August 4, 2020
    Assignee: Verint Systems Ltd.
    Inventors: Ran Achituv, Omer Ziv, Roni Romano, Ido Shapira, Daniel Baum
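The adaptation loop this abstract outlines can be sketched as below. The analytics-based scoring function is an invented stand-in (here, simply preferring fuller hypotheses), and the "statistics" are plain word counts; both are assumptions, not Verint's method.

```python
from collections import Counter

def analytics_score(transcript: str) -> float:
    """Stand-in for speech-analytics scoring of a hypothesis (an assumption)."""
    return len(transcript.split())

def adapt(alternatives_per_file: list[list[str]]) -> Counter:
    """Pick the best alternative per audio file, then accumulate statistics."""
    stats = Counter()
    for alternatives in alternatives_per_file:
        best = max(alternatives, key=analytics_score)
        stats.update(best.lower().split())   # statistics from best transcription
    return stats                             # would seed the adapted model

files = [["please hold", "please hold the line"],
         ["bill is wrong", "my bill is wrong"]]
print(adapt(files).most_common(3))
```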
  • Patent number: 10720176
    Abstract: A computer-implemented method of multisensory speech detection is disclosed. The method comprises determining an orientation of a mobile device and determining an operating mode of the mobile device based on the orientation of the mobile device. The method further includes identifying speech detection parameters that specify when speech detection begins or ends based on the determined operating mode and detecting speech from a user of the mobile device based on the speech detection parameters.
    Type: Grant
    Filed: August 22, 2018
    Date of Patent: July 21, 2020
    Assignee: Google LLC
    Inventors: Dave Burke, Michael J. Lebeau, Konrad Gianno, Trausti T. Kristjansson, John Nicholas Jitkoff, Andrew W. Senior
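The orientation-to-parameters mapping this abstract describes might look like the following sketch. The mode names, pitch-angle cutoffs, and parameter values are invented for illustration and are not taken from the patent.

```python
import math

def operating_mode(pitch_deg: float) -> str:
    """Classify device orientation into an operating mode."""
    if pitch_deg > 60:          # held upright, e.g., near the ear
        return "phone_pose"
    if pitch_deg > 20:          # held in front of the face
        return "pda_pose"
    return "flat"

# Parameters that specify when speech detection begins or ends, per mode.
SPEECH_PARAMS = {
    "phone_pose": {"start": "on_pose_enter", "end": "on_pose_exit"},
    "pda_pose":   {"start": "on_button_press", "end": "on_silence_750ms"},
    "flat":       {"start": "disabled", "end": "disabled"},
}

def params_for(accel_z: float, accel_mag: float) -> dict:
    """Derive pitch from accelerometer readings and look up detection params."""
    pitch = math.degrees(math.asin(max(-1.0, min(1.0, accel_z / accel_mag))))
    return SPEECH_PARAMS[operating_mode(pitch)]

print(params_for(accel_z=9.0, accel_mag=9.81))  # near-vertical: phone_pose
```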
  • Patent number: 10719666
    Abstract: A method for identifying phrases in a text document having a similar discourse to a candidate phrase includes separating text in a document file into a plurality of phrases and generating a plurality of embedding vectors in a textual embedding space by inputting the plurality of phrases into an embedding engine. A mapping of each embedding vector in the textual embedding space is generated with each corresponding phrase and a document location of each corresponding phrase in the document file. A candidate phrase is received from a user and a candidate embedding vector is generated using the embedding engine. Similarity scores are computed based on the embedding-space distances between the candidate embedding vector and each embedding vector in the textual embedding space. A listing of the phrases with the highest similarity scores is output along with their respective document locations in the text.
    Type: Grant
    Filed: January 31, 2020
    Date of Patent: July 21, 2020
    Assignee: Capital One Services, LLC
    Inventors: Austin Walters, Vincent Pham, Ernest Kwak, Galen Rafferty, Reza Farivar, Jeremy Goodsitt, Anh Truong
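Below is a minimal sketch of the embedding-space similarity search described in this abstract. The hash-like toy embedding stands in for a real text encoder (an assumption); only the index-score-rank flow mirrors the abstract.

```python
import math

def embed(phrase: str, dim: int = 8) -> list[float]:
    """Deterministic toy embedding; a real system would use a trained encoder."""
    vec = [0.0] * dim
    for i, ch in enumerate(phrase.lower()):
        vec[i % dim] += ord(ch)
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of unit vectors; higher means closer in the space."""
    return sum(x * y for x, y in zip(a, b))

document = ["net income rose sharply", "the dog barked", "profits increased"]
# Map each embedding vector to its phrase and document location.
index = [(i, phrase, embed(phrase)) for i, phrase in enumerate(document)]

candidate = embed("earnings went up")
scores = sorted(((similarity(candidate, v), loc, p) for loc, p, v in index),
                reverse=True)
for score, location, phrase in scores[:2]:   # listing of highest-scoring phrases
    print(f"{score:.3f}  line {location}: {phrase}")
```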
  • Patent number: 10714120
    Abstract: A computer-implemented method of multisensory speech detection is disclosed. The method comprises determining an orientation of a mobile device and determining an operating mode of the mobile device based on the orientation of the mobile device. The method further includes identifying speech detection parameters that specify when speech detection begins or ends based on the determined operating mode and detecting speech from a user of the mobile device based on the speech detection parameters.
    Type: Grant
    Filed: June 25, 2018
    Date of Patent: July 14, 2020
    Assignee: Google LLC
    Inventors: Dave Burke, Michael J. Lebeau, Konrad Gianno, Trausti T. Kristjansson, John Nicholas Jitkoff, Andrew W. Senior
  • Patent number: 10713010
    Abstract: A computer-implemented input-method editor process includes receiving a request from a user for an application-independent input method editor having written and spoken input capabilities, identifying that the user is about to provide spoken input to the application-independent input method editor, and receiving a spoken input from the user. The spoken input corresponds to input to an application and is converted to text that represents the spoken input. The text is provided as input to the application.
    Type: Grant
    Filed: October 24, 2018
    Date of Patent: July 14, 2020
    Assignee: Google LLC
    Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, William J. Byrne, Gudmundur Hafsteinsson, Michael J. Lebeau
  • Patent number: 10714092
    Abstract: A sensor processing unit comprises a sensor processor. The sensor processor is configured to communicatively couple with a microphone. The sensor processor is configured to acquire, from the microphone, an audio sample captured from the environment in which the microphone is disposed. The sensor processor is configured to perform music activity detection on the audio sample to detect for music within the audio sample. Responsive to detection of music within the audio sample, the sensor processor is configured to send a music detection signal to an external processor located external to the sensor processing unit, the music detection signal indicating that music has been detected in the environment.
    Type: Grant
    Filed: October 1, 2018
    Date of Patent: July 14, 2020
    Assignee: InvenSense, Inc.
    Inventors: William Kerry Keal, Ajay Kumar Dhanapalan, Sangnam Choi, Carlo Murgia, Eitan A. Medina, Taro Kimura
  • Patent number: 10706855
    Abstract: A multimedia processing circuit is provided. The multimedia processing circuit includes a smart interpreter engine and an audio engine. The smart interpreter engine includes a noise suppression module, a vocal identification module and a speech to text converter. The noise suppression module is utilized for performing a noise suppression process on speech data corresponding to a first language. The vocal identification module is utilized for performing a vocal identification process on the noise-suppressed speech data corresponding to the first language to generate vocal identification data corresponding to the first language. The speech to text converter is utilized for converting the vocal identification data corresponding to the first language into text data corresponding to the first language.
    Type: Grant
    Filed: November 26, 2018
    Date of Patent: July 7, 2020
    Assignee: ACER INCORPORATED
    Inventors: Gianna Tseng, Shih-Cheng Huang, Shang-Yao Lin, Szu-Ting Chou
  • Patent number: 10699691
    Abstract: A head-mounted wearable device (HMWD) may incorporate bone conduction speakers (BCS) to generate audio output that is perceptible to a wearer. During operation, vibrations of the BCS may produce sound in the surrounding air that is perceptible to bystanders. An active noise cancellation module monitors the output and generates cancellation audio that is out of phase with the sound leaked by the BCS. An air conduction speaker emits the cancellation audio, producing destructive acoustic interference to the leaked sound. As a result, the user of the HMWD is able to hear audio clearly while bystanders are not.
    Type: Grant
    Filed: June 29, 2017
    Date of Patent: June 30, 2020
    Assignee: AMAZON TECHNOLOGIES, INC.
    Inventors: Geng Ye, Zhen Xu, Haoxing Zhang, Bozhao Tan, Xuan Zhong, Jianchun Dong
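The cancellation idea in this abstract reduces to emitting an anti-phase copy of the predicted leaked sound so the two destructively interfere in air. The sketch below illustrates this with a fixed leakage gain, which is an assumption; a real system would estimate a frequency-dependent transfer function from the bone-conduction speaker to the surrounding air.

```python
import math

RATE = 16_000
tone = [math.sin(2 * math.pi * 440 * n / RATE) for n in range(RATE)]  # BCS drive

LEAK_GAIN = 0.1                      # assumed fraction of BCS output leaking to air
leaked = [LEAK_GAIN * s for s in tone]
cancel = [-s for s in leaked]        # 180-degree phase inversion for the speaker

residual = [l + c for l, c in zip(leaked, cancel)]
print(max(abs(r) for r in residual))  # 0.0: perfect cancellation in this toy case
```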
  • Patent number: 10692503
    Abstract: A voice data processing method and apparatus are provided. The method includes obtaining an I-Vector vector of each of voice samples, and determining a target seed sample in the voice samples. A first cosine distance is calculated between an I-Vector vector of the target seed sample and an I-Vector vector of a target remaining voice sample, where the target remaining voice sample is a voice sample other than the target seed sample in the voice samples. Target voice samples are filtered from the voice samples or the target remaining voice samples according to the first cosine distance, to obtain target voice samples whose first cosine distances are greater than a first threshold.
    Type: Grant
    Filed: March 3, 2017
    Date of Patent: June 23, 2020
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Xingming Jin, Wei Li, Fangmai Zheng, Fuzhang Wu, Bilei Zhu, Binghua Qian, Ke Li, Yongjian Wu, Feiyue Huang
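A hedged sketch of the cosine-distance filtering step follows. "Cosine distance" is interpreted here as cosine similarity (higher means closer), which is an interpretive assumption, and the i-vectors are shown as short plain lists rather than the hundreds of dimensions used in practice.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two i-vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def filter_samples(seed_ivec, remaining, threshold=0.6):
    """Keep remaining samples whose similarity to the seed exceeds the threshold."""
    return [ivec for ivec in remaining if cosine(seed_ivec, ivec) > threshold]

seed = [0.9, 0.1, 0.2]
others = [[0.88, 0.12, 0.18], [0.1, 0.9, 0.3], [0.85, 0.2, 0.25]]
print(filter_samples(seed, others))  # samples likely from the same speaker
```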
  • Patent number: 10692496
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for suppressing hotwords are disclosed. In one aspect, a method includes the actions of receiving audio data corresponding to playback of an utterance. The actions further include providing the audio data as an input to a model (i) that is configured to determine whether a given audio data sample includes an audio watermark and (ii) that was trained using watermarked audio data samples that each include an audio watermark sample and non-watermarked audio data samples that do not include an audio watermark sample. The actions further include receiving, from the model, data indicating whether the audio data includes the audio watermark. The actions further include, based on the data indicating whether the audio data includes the audio watermark, determining to continue or cease processing of the audio data.
    Type: Grant
    Filed: May 21, 2019
    Date of Patent: June 23, 2020
    Assignee: Google LLC
    Inventors: Alexander H. Gruenstein, Taral Pradeep Joglekar, Vijayaditya Peddinti, Michiel A. U. Bacchiani
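The gating logic in this abstract can be illustrated with a toy detector. The correlation-based check below is an invented stand-in for the trained model the patent describes, and the watermark pattern, strength, and threshold are all assumptions; only the continue-or-cease decision mirrors the abstract.

```python
import random

random.seed(7)
N = 4096
WATERMARK = [random.choice((-1.0, 1.0)) for _ in range(N)]  # assumed known pattern

def embed_watermark(audio, strength=0.05):
    """What media playback audio would carry (assumed additive watermark)."""
    return [a + strength * w for a, w in zip(audio, WATERMARK)]

def has_watermark(audio, threshold=0.03) -> bool:
    """Stand-in detector: correlate the audio against the known pattern."""
    corr = sum(a * w for a, w in zip(audio, WATERMARK)) / N
    return corr > threshold

def on_hotword(audio) -> str:
    """Continue or cease processing based on the watermark decision."""
    return "cease" if has_watermark(audio) else "continue"

live = [random.uniform(-1.0, 1.0) for _ in range(N)]
print(on_hotword(live))                    # continue: no watermark present
print(on_hotword(embed_watermark(live)))   # cease: playback audio detected
```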
  • Patent number: 10694298
    Abstract: A hearing aid configured for detecting and enhancing speech within an audio environment is disclosed. An incoming audio stream is continuously monitored for the presence of speech within the audio stream. A Codebook Excited Linear Prediction (“CELP”) encoder analyzes the incoming audio stream and outputs an indication of a presence or absence of human speech within the incoming audio stream. Upon detection of human speech, the hearing aid in real time may: amplify the audio input to make the speech more audible to a wearer; filter non-speech audio through isolation of the speech by passing the output of the CELP encoder directly to a CELP decoder; activate a beam-steering process which makes dominant a microphone closest to a speaker while de-prioritizing input from other microphones of the hearing aid; and/or shape the audio spectrum conveyed by the audio input using a response curve optimized for better clarity of human speech.
    Type: Grant
    Filed: October 22, 2019
    Date of Patent: June 23, 2020
    Inventors: Zeev Neumeier, W. Leo Hoarty
  • Patent number: 10685645
    Abstract: A method for creating binary classification models and using the binary classification models to select candidate training utterances from a plurality of live utterances is provided. The method may include receiving a plurality of intents and associated training utterances. The method may include creating, from the training utterances, a binary classification model for each intent. The binary classification model may include a vector representation of a line of demarcation between utterances associated with the intent and utterances disassociated from the intent. The method may also include receiving live utterances. An intent may be determined for each live utterance. The method may include creating a vector representation of the live utterance. The method may include selecting candidate training utterances based on a comparison between the vector representation of the live utterance and the vector representation included in the binary classification model of the intent determined for the live utterance.
    Type: Grant
    Filed: August 9, 2018
    Date of Patent: June 16, 2020
    Assignee: Bank of America Corporation
    Inventors: Maruthi Z. Shanmugam, Luis Gerardo Mojica de la Vega, Donatus Asumu
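One way to read this abstract's "line of demarcation" is as a linear decision boundary, with live utterances falling near that boundary flagged as candidate training utterances. The sketch below follows that reading; the toy bag-of-words featurizer, the pre-set weights, and the margin band are all assumptions.

```python
import math

def featurize(utterance: str, vocab: list[str]) -> list[float]:
    """Toy bag-of-words vector representation of an utterance."""
    words = utterance.lower().split()
    return [float(w in words) for w in vocab]

def margin(vec, weights, bias) -> float:
    """Signed distance of an utterance vector from the demarcation line."""
    dot = sum(v * w for v, w in zip(vec, weights)) + bias
    return dot / math.sqrt(sum(w * w for w in weights))

VOCAB = ["balance", "transfer", "card", "lost"]
# Pretend these were learned from the intent's training utterances.
CHECK_BALANCE = {"weights": [2.0, -1.0, 0.0, -1.0], "bias": -0.5}

def is_candidate(utterance: str, model=CHECK_BALANCE, band=0.5) -> bool:
    """Flag utterances near the boundary as candidate training utterances."""
    m = margin(featurize(utterance, VOCAB), model["weights"], model["bias"])
    return abs(m) < band

print(is_candidate("what is my balance"))          # False: confidently in-intent
print(is_candidate("transfer my balance to card")) # True: near the boundary
```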
  • Patent number: 10682086
    Abstract: Disclosed herein is a system and method for delivering a digital therapeutic, specific to a user emotional or mental state (EMS). This entails recognizing at least one EMS selected by the user from a plurality of EMS, the selected EMS indicating at least one of a feeling, sensation, type of discomfort, mood, mental state, emotional condition, or physical status of the user. The method then pushes at least a primary-level message personalized to the user based on at least one stored message coupled to the selected EMS, wherein the primary-level message contains at least one of a text, image, sound, video, art asset, suggested action, or recommended behavior. The actions suggested or behaviors recommended are supported by at least one independent source of peer-reviewed research, as verified by a credentialed EMS expert.
    Type: Grant
    Filed: April 20, 2018
    Date of Patent: June 16, 2020
    Assignee: Aebeze labs
    Inventor: Michael Phillips Moskowitz
  • Patent number: 10667878
    Abstract: A surgical system includes a plurality of voice sensors located in a surgical environment and configured to detect sound and generate a first plurality of signals. The surgical system also includes a position indicator, in proximity to a designated user, configured to indicate a first position of the designated user and generate a second signal representative of the first position. The surgical system further includes a processor configured to receive the first plurality of signals and the second signal and determine, based on the first plurality of signals, a second position. The processor is also configured to compare the detected sound with a registered voice command of the designated user stored in a memory to verify the designated user's credentials, and send a command signal to a surgical instrument to carry out an operation related to the voice command based on at least one of the verification of the designated user's credentials, the first position, and the second position.
    Type: Grant
    Filed: June 20, 2018
    Date of Patent: June 2, 2020
    Assignee: Ethicon LLC
    Inventors: David J. Cagle, Eric Smith, Jeffrey L. Aldridge, Mary E. Mootoo, Ryan Asher
  • Patent number: 10657955
    Abstract: Described herein are systems and methods to identify and address sources of bias in an end-to-end speech model. In one or more embodiments, the end-to-end model may be a recurrent neural network with two 2D-convolutional input layers, followed by multiple bidirectional recurrent layers and one fully connected layer before a softmax layer. In one or more embodiments, the network is trained end-to-end using the CTC loss function to directly predict sequences of characters from log spectrograms of audio. With optimized recurrent layers and training together with alignment information, some unwanted bias induced by using purely forward-only recurrences may be removed in a deployed model.
    Type: Grant
    Filed: January 30, 2018
    Date of Patent: May 19, 2020
    Assignee: Baidu USA LLC
    Inventors: Eric Battenberg, Rewon Child, Adam Coates, Christopher Fougner, Yashesh Gaur, Jiaji Huang, Heewoo Jun, Ajay Kannan, Markus Kliegl, Atul Kumar, Hairong Liu, Vinay Rao, Sanjeev Satheesh, David Seetapun, Anuroop Sriram, Zhenyao Zhu
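The architecture named in this abstract maps naturally onto a short PyTorch sketch: two 2D-convolutional input layers over log spectrograms, bidirectional recurrent layers, one fully connected layer, a softmax, and CTC training. Layer sizes, the GRU choice, and the 29-character alphabet are assumptions, not the exact published model.

```python
import torch
import torch.nn as nn

class SpeechModel(nn.Module):
    def __init__(self, n_mels=80, hidden=256, n_chars=29):
        super().__init__()
        # Two 2D-convolutional input layers over (time, frequency).
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=(11, 11), stride=(2, 2), padding=5),
            nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=(11, 11), stride=(1, 1), padding=5),
            nn.ReLU(),
        )
        freq_out = (n_mels + 1) // 2         # frequency dim after the stride-2 conv
        # Multiple bidirectional recurrent layers (GRU chosen as an assumption).
        self.rnn = nn.GRU(32 * freq_out, hidden, num_layers=3,
                          bidirectional=True, batch_first=True)
        self.fc = nn.Linear(2 * hidden, n_chars)  # one fully connected layer

    def forward(self, spec):                 # spec: (batch, 1, time, n_mels)
        x = self.conv(spec)                   # (batch, 32, time', freq')
        b, c, t, f = x.shape
        x = x.permute(0, 2, 1, 3).reshape(b, t, c * f)
        x, _ = self.rnn(x)
        return self.fc(x).log_softmax(-1)     # CTC expects log-probabilities

model = SpeechModel()
logp = model(torch.randn(2, 1, 200, 80))     # dummy batch of log spectrograms
loss = nn.CTCLoss(blank=0)(
    logp.transpose(0, 1),                    # (time, batch, chars)
    torch.randint(1, 29, (2, 20)),           # dummy character targets
    torch.full((2,), logp.size(1)), torch.tensor([20, 20]))
print(loss.item())
```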
  • Patent number: 10650833
    Abstract: A method for generating a bitstream indicative of an object based audio program is described. The bitstream comprises a sequence of containers. A first container of the sequence of containers comprises a plurality of substream entities for a plurality of substreams of the object based audio program and a presentation section. The method comprises determining a set of object channels. The method further comprises providing a set of object related metadata for the set of object channels. In addition, the method comprises inserting a first set of object channel frames and a first set of object related metadata frames into a respective set of substream entities of the first container. Furthermore, the method comprises inserting presentation data into the presentation section.
    Type: Grant
    Filed: October 2, 2018
    Date of Patent: May 12, 2020
    Assignee: Dolby International AB
    Inventors: Christof Fersch, Alexander Stahlmann
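The container layout this abstract describes, a sequence of containers each holding substream entities (object channel frames plus object-related metadata frames) and a presentation section, can be sketched structurally as below. The field names are illustrative, not the actual bitstream syntax.

```python
from dataclasses import dataclass, field

@dataclass
class SubstreamEntity:
    object_channel_frame: bytes    # coded audio frame for one object channel
    metadata_frame: dict           # object-related metadata, e.g., position/gain

@dataclass
class Container:
    substreams: list[SubstreamEntity] = field(default_factory=list)
    presentation: dict = field(default_factory=dict)   # presentation section

bitstream: list[Container] = []   # the program is a sequence of containers
first = Container(presentation={"name": "main_mix", "substream_ids": [0, 1]})
first.substreams.append(SubstreamEntity(b"\x00\x01", {"pos": (0.5, 0.0, 0.0)}))
first.substreams.append(SubstreamEntity(b"\x02\x03", {"pos": (-0.5, 0.0, 0.0)}))
bitstream.append(first)
print(len(bitstream[0].substreams))   # 2 substream entities in the first container
```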
  • Patent number: 10650810
    Abstract: Systems and methods of determining phonetic relationships are provided. For instance, data indicative of an input text phrase input by a user can be received. An audio output corresponding to a spoken rendering of the input text phrase can be determined. A text transcription of the audio output of the input text phrase can be determined. The text transcription can be a textual representation of the audio output. The text transcription can be compared against a plurality of test phrases to identify a match between the text transcription and at least one test phrase.
    Type: Grant
    Filed: September 29, 2017
    Date of Patent: May 12, 2020
    Assignee: GOOGLE LLC
    Inventors: Nikhil Chandru Rao, Saisuresh Krishnakumaran
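The round-trip check in this abstract, speak the input phrase, transcribe the audio, compare the transcription against test phrases, can be sketched as follows. synthesize() and transcribe() are hypothetical stand-ins for real TTS and ASR services, and the similarity ratio threshold is an assumption.

```python
from difflib import SequenceMatcher

def synthesize(text: str) -> bytes:
    """Stand-in TTS: a real system would return synthesized speech audio."""
    return text.encode()

def transcribe(audio: bytes) -> str:
    """Stand-in ASR: a real recognizer may introduce phonetic confusions."""
    return audio.decode()

def best_match(input_phrase: str, test_phrases: list[str], min_ratio=0.8):
    """Compare the round-trip transcription against each test phrase."""
    transcription = transcribe(synthesize(input_phrase))
    ratio, phrase = max(
        (SequenceMatcher(None, transcription.lower(), t.lower()).ratio(), t)
        for t in test_phrases)
    return phrase if ratio >= min_ratio else None

print(best_match("call mom", ["call mom", "call tom", "hang up"]))  # call mom
```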