Patents Examined by Richemond Dorvil
  • Patent number: 11769004
    Abstract: A computer system may create a language model corpus including multilingual alignment for training a combined language model, and may train (or pre-train) the combined language model. The computer system may create an adverse medication reaction corpus containing adverse medication reaction utterances and, for multiple N-grams, label an N-gram of an utterance as a response to a query. The computer system may generate a code-mixed utterance model that produces code-mixed utterances in a turn-by-turn dialogue by adding an additional output layer, beyond the combined language model's predicted next words, that includes at least a start vector, a language vector, and a query vector containing the labeled N-gram (a sketch of the labeling step follows this entry).
    Type: Grant
    Filed: January 2, 2020
    Date of Patent: September 26, 2023
    Assignee: International Business Machines Corporation
    Inventors: Victor Abayomi Akinwande, Celia Cintas, Aisha Walcott, William Ogallo, Sekou Lionel Remy
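    A minimal Python sketch of the N-gram labeling step described above, assuming a simple keyword-match criterion for whether an N-gram answers a query; the patent does not specify the labeling rule, so `query_terms` and the matching logic here are illustrative:

```python
def label_ngrams(utterances, query_terms, n=3):
    # Build an adverse-medication-reaction corpus: for every n-gram in each
    # utterance, label it as a response to the query if it contains any
    # query term (a simplification of the patent's labeling step).
    corpus = []
    for utt in utterances:
        words = utt.lower().split()
        for i in range(len(words) - n + 1):
            ngram = words[i:i + n]
            corpus.append({"ngram": " ".join(ngram),
                           "response_to_query": any(t in ngram for t in query_terms)})
    return corpus

print(label_ngrams(["the drug caused severe dizziness and nausea"],
                   {"dizziness", "nausea"}))
```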
  • Patent number: 11763827
    Abstract: A method and device for extracting information from acoustic signals receives acoustic signals by a microphone, processes them in an analog front-end circuit, converts the processed signals from the analog front-end circuit to digital signals by sampling at a rate of less than 1 kHz (or, more preferably, less than 500 Hz), and processes the digital signals with a digital back-end classifier circuit. The analog front-end processing decomposes the received signals into frequency components using a bank of analog N-path bandpass filters having different subband center frequencies (a digital stand-in for this filter bank is sketched after this entry).
    Type: Grant
    Filed: October 30, 2020
    Date of Patent: September 19, 2023
    Assignee: The Board of Trustees of the Leland Stanford Junior University
    Inventors: Boris Murmann, Daniel Augusto Villamizar
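    The patent's front end is analog N-path filter hardware, so the sketch below is only a digital stand-in that illustrates the signal flow: a bank of bandpass filters feeding per-subband energies to a back-end classifier. The center frequencies, filter order, and Q factor are illustrative assumptions:

```python
import numpy as np
from scipy.signal import butter, sosfilt

def subband_features(x, fs=16000, centers=(250, 500, 1000, 2000), q=2.0):
    # Split the signal into subbands around each center frequency and
    # return per-band energies as features for the back-end classifier.
    feats = []
    for fc in centers:
        lo, hi = fc / q, min(fc * q, 0.49 * fs)
        sos = butter(2, [lo, hi], btype="bandpass", fs=fs, output="sos")
        feats.append(np.mean(sosfilt(sos, x) ** 2))  # subband energy
    return np.array(feats)

rng = np.random.default_rng(0)
print(subband_features(rng.normal(size=16000)))
```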
  • Patent number: 11763839
    Abstract: According to one embodiment, a voice activity detection apparatus comprises a processing circuit. The processing circuit calculates an acoustic feature based on an acoustic signal; calculates a non-acoustic feature based on a non-acoustic signal; calculates a correlation coefficient based on the acoustic feature and the non-acoustic feature; and detects a voice section and/or a non-voice section based on a comparison of the correlation coefficient with a threshold, the voice section being a time section in which voice is present, the non-voice section being a time section in which voice is absent (a sketch of this detection rule follows this entry).
    Type: Grant
    Filed: August 25, 2021
    Date of Patent: September 19, 2023
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Uihyun Kim
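    A minimal sketch of the detection rule above, assuming frame log-energy as both the acoustic and non-acoustic feature and a vibration sensor as the non-acoustic source; the patent fixes neither choice, and the frame/window sizes are illustrative:

```python
import numpy as np

def detect_voice_sections(acoustic, non_acoustic, frame_len=256,
                          window_frames=20, threshold=0.5):
    # Label each analysis window as voice / non-voice by correlating frame
    # log-energies of the acoustic signal with those of the non-acoustic
    # signal (e.g., a vibration sensor).
    n = min(len(acoustic), len(non_acoustic)) // frame_len
    energy = lambda x, i: np.log(np.sum(x[i*frame_len:(i+1)*frame_len]**2) + 1e-9)
    feat_a = np.array([energy(acoustic, i) for i in range(n)])
    feat_n = np.array([energy(non_acoustic, i) for i in range(n)])
    labels = []
    for i in range(n - window_frames + 1):
        r = np.corrcoef(feat_a[i:i+window_frames], feat_n[i:i+window_frames])[0, 1]
        labels.append(r > threshold)             # True -> voice section
    return np.array(labels)
```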
  • Patent number: 11756555
    Abstract: A system is provided to categorize voice prints during voice authentication. The system includes a processor and a computer-readable medium operably coupled thereto to perform voice authentication operations. The operations include: receiving an enrollment of a user in the biometric authentication system; requesting a first voice print comprising a sample of the user's voice; receiving the first voice print during the enrollment; accessing a plurality of categorizations of the voice prints, wherein each categorization comprises a portion of the voice prints grouped by similarity scores between the distinct voice prints in that portion and a plurality of other voice prints; determining, using a hidden layer of a neural network, one of the categorizations for the first voice print; and encoding the first voice print with that categorization (a clustering-based sketch follows this entry).
    Type: Grant
    Filed: May 6, 2021
    Date of Patent: September 12, 2023
    Assignee: NICE LTD.
    Inventors: Natan Katz, Tal Haguel
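    A sketch of similarity-based categorization, with KMeans over length-normalized embeddings standing in for the patent's neural-network hidden-layer categorization; the embedding dimensionality and category count are illustrative assumptions:

```python
import numpy as np
from sklearn.cluster import KMeans

def categorize_voice_prints(prints, n_categories=4):
    # Group voice-print embeddings by mutual similarity: KMeans on
    # length-normalized vectors (cosine-like) stands in for the patent's
    # neural-network categorization.
    X = prints / np.linalg.norm(prints, axis=1, keepdims=True)
    return KMeans(n_clusters=n_categories, n_init=10, random_state=0).fit(X)

def encode_new_print(km, new_print):
    # Tag an enrolled voice print with its determined categorization.
    v = new_print / np.linalg.norm(new_print)
    return {"embedding": v, "category": int(km.predict(v[None, :])[0])}

rng = np.random.default_rng(0)
km = categorize_voice_prints(rng.normal(size=(100, 32)))   # 100 enrolled prints
print(encode_new_print(km, rng.normal(size=32))["category"])
```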
  • Patent number: 11756572
    Abstract: A method for determining synthetic speech includes receiving audio data characterizing speech obtained by a user device. The method also includes generating, using a trained self-supervised model, a plurality of audio feature vectors, each representative of audio features of a portion of the audio data. The method also includes generating, using a shallow discriminator model, a score indicating a presence of synthetic speech in the audio data based on the corresponding audio features of each audio feature vector. The method also includes determining whether the score satisfies a synthetic speech detection threshold; when the score satisfies the threshold, the method determines that the speech in the audio data obtained by the user device comprises synthetic speech (a sketch of this pipeline follows this entry).
    Type: Grant
    Filed: December 2, 2020
    Date of Patent: September 12, 2023
    Assignee: Google LLC
    Inventors: Joel Shor, Alanna Foster Slocum
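    A sketch of the score-and-threshold pipeline above. An FFT-magnitude embedding stands in for the trained self-supervised model and scikit-learn logistic regression for the shallow discriminator; both are assumptions, since the patent names neither, and the toy training data is random noise:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def embed_frames(audio, frame_len=400):
    # Hypothetical stand-in for a trained self-supervised encoder: one
    # FFT-magnitude feature vector per frame of audio.
    frames = [audio[i:i + frame_len]
              for i in range(0, len(audio) - frame_len, frame_len)]
    return np.stack([np.abs(np.fft.rfft(f))[:64] for f in frames])

def score_synthetic(audio, discriminator, threshold=0.5):
    # Mean-pool per-frame features, score with the shallow discriminator,
    # and compare against the synthetic-speech detection threshold.
    pooled = embed_frames(audio).mean(axis=0, keepdims=True)
    score = discriminator.predict_proba(pooled)[0, 1]
    return score, score >= threshold            # True -> flagged as synthetic

rng = np.random.default_rng(0)
real = [rng.normal(size=16000) for _ in range(8)]             # toy "real" audio
fake = [rng.normal(scale=0.5, size=16000) for _ in range(8)]  # toy "synthetic"
X = np.stack([embed_frames(a).mean(axis=0) for a in real + fake])
clf = LogisticRegression(max_iter=1000).fit(X, [0] * 8 + [1] * 8)
print(score_synthetic(rng.normal(size=16000), clf))
```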
  • Patent number: 11756544
    Abstract: Implementations described herein receive audio data that captures a spoken utterance, generate, based on processing the audio data, a recognition that corresponds to the spoken utterance, and determine, based on processing the recognition, that the spoken utterance is ambiguous (i.e., it is interpretable as requesting performance of a first particular action exclusively and is also interpretable as requesting performance of a second particular action exclusively). In response to determining that the spoken utterance is ambiguous, implementations determine to provide an enhanced clarification prompt that renders output in addition to natural language. The enhanced clarification prompt solicits further user interface input for disambiguating between the first particular action and the second particular action (a sketch follows this entry).
    Type: Grant
    Filed: December 15, 2020
    Date of Patent: September 12, 2023
    Assignee: GOOGLE LLC
    Inventors: Matthew Sharifi, Victor Carbune
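    A sketch of the ambiguity test and enhanced prompt, assuming the recognizer yields per-action confidence scores and modeling the "enhanced" output as UI buttons rendered alongside the natural-language question; the margin value and score format are illustrative assumptions:

```python
def interpret(action_scores, margin=0.1):
    # action_scores: candidate action -> recognizer confidence. If the top
    # two actions are within `margin`, treat the utterance as ambiguous and
    # return an enhanced clarification prompt (language plus extra UI output)
    # instead of executing either action.
    (a1, s1), (a2, s2) = sorted(action_scores.items(), key=lambda kv: -kv[1])[:2]
    if s1 - s2 < margin:
        return {"type": "clarification",
                "text": f"Did you mean '{a1}' or '{a2}'?",
                "ui_buttons": [a1, a2]}          # output beyond natural language
    return {"type": "execute", "action": a1}

print(interpret({"set an alarm": 0.46, "set a timer": 0.44, "play music": 0.10}))
```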
  • Patent number: 11735167
    Abstract: Disclosed is an electronic device that recognizes an uttered voice in units of individual characters. The electronic device includes: a voice receiver; and a processor configured to obtain a recognition character converted from a character section of a user voice received through the voice receiver and, based on the possibility of confusion with the obtained recognition character, recognize the candidate character having the highest acoustic-feature similarity to the character section, from among a plurality of acquired candidate characters, as the uttered character of that section (a sketch of this selection follows this entry).
    Type: Grant
    Filed: November 24, 2020
    Date of Patent: August 22, 2023
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Jihun Park, Dongheon Seok
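    A sketch of confusion-aware character selection, assuming precomputed per-character confusion probabilities and acoustic-similarity scores; the 0.3 cutoff and the score tables are illustrative assumptions, not values from the patent:

```python
def resolve_character(recognized, candidates, confusion_prob, acoustic_sim,
                      risk_threshold=0.3):
    # If the recognized character is easily confused, fall back to the
    # acquired candidate with the highest acoustic-feature similarity to
    # the spoken character section; otherwise keep the recognition as-is.
    if confusion_prob.get(recognized, 0.0) <= risk_threshold:
        return recognized
    return max(candidates, key=lambda c: acoustic_sim.get(c, 0.0))

confusion_prob = {"b": 0.6}                    # illustrative confusion scores
acoustic_sim = {"b": 0.4, "d": 0.7, "p": 0.5}  # similarity to the audio section
print(resolve_character("b", ["b", "d", "p"], confusion_prob, acoustic_sim))
```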
  • Patent number: 11727302
    Abstract: A method and apparatus for building a conversation understanding system based on artificial intelligence, a device, and a computer-readable storage medium are provided. In embodiments of the present disclosure, training feedback information is obtained from conversation service conducted between the user and the basic conversation understanding system; a service state of the basic conversation understanding system is then adjusted according to the training feedback information, to obtain an adjustment state of the basic conversation understanding system. Data merging may be performed according to the training feedback information and the adjustment state, to obtain model training data for building the model conversation understanding system.
    Type: Grant
    Filed: June 12, 2018
    Date of Patent: August 15, 2023
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Ke Sun, Shiqi Zhao, Dianhai Yu, Haifeng Wang
  • Patent number: 11715461
    Abstract: A computer-implemented method and system for automatic speech recognition. A first speech sequence comprising a first set of speech frame feature vectors is processed, using a time-reduction operation of an encoder NN, into a second speech sequence comprising a second set of speech frame feature vectors; each vector in the second set concatenates information from a respective plurality of vectors in the first set, so the second sequence contains fewer speech frame feature vectors than the first (see the sketch after this entry). The second speech sequence is transformed, using a self-attention operation of the encoder NN, into a third speech sequence comprising a third set of speech frame feature vectors. The third speech sequence is processed using a probability operation of the encoder NN to predict a sequence of first labels corresponding to the third set, and using a decoder NN to predict a sequence of second labels corresponding to the third set.
    Type: Grant
    Filed: October 21, 2020
    Date of Patent: August 1, 2023
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Md Akmal Haidar, Chao Xing
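    A sketch of the time-reduction operation only: each group of adjacent frame feature vectors is concatenated into one vector, shortening the sequence. The stride of 2 and the 80-dim features are illustrative choices:

```python
import numpy as np

def time_reduction(frames, stride=2):
    # Concatenate each group of `stride` consecutive speech-frame feature
    # vectors into one vector, shortening the sequence by a factor of
    # `stride` (any ragged tail is dropped).
    T, d = frames.shape
    T_trim = (T // stride) * stride
    return frames[:T_trim].reshape(T_trim // stride, stride * d)

x = np.random.randn(101, 80)                   # 101 frames of 80-dim features
y = time_reduction(x, stride=2)
print(x.shape, "->", y.shape)                  # (101, 80) -> (50, 160)
```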
  • Patent number: 11705109
    Abstract: A method of detecting live speech comprises: receiving a signal containing speech; obtaining a first component of the received signal in a first frequency band, wherein the first frequency band includes audio frequencies; and obtaining a second component of the received signal in a second frequency band higher than the first frequency band. Modulation of the first component is detected; modulation of the second component is detected; and the two modulations are compared. It may then be determined that the speech may not be live speech if the modulation of the first component differs from the modulation of the second component (a sketch of this comparison follows this entry).
    Type: Grant
    Filed: November 6, 2020
    Date of Patent: July 18, 2023
    Assignee: Cirrus Logic, Inc.
    Inventors: John Paul Lesso, Toru Ido
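    A sketch of the two-band modulation comparison, using Hilbert envelopes of Butterworth-filtered bands as the modulation estimate; the band edges, filter order, and divergence threshold are illustrative assumptions, not values from the patent:

```python
import numpy as np
from scipy.signal import butter, sosfilt, hilbert

def band_envelope(x, fs, lo, hi):
    # Amplitude envelope (modulation) of one frequency band of the signal.
    sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
    return np.abs(hilbert(sosfilt(sos, x)))

def liveness_check(x, fs=48000, max_divergence=0.5):
    # Compare the modulation of an audio-frequency band against that of a
    # higher band; a large mismatch suggests replayed rather than live speech.
    env_low = band_envelope(x, fs, 100, 4000)     # audio-frequency component
    env_high = band_envelope(x, fs, 8000, 20000)  # higher-frequency component
    env_low = env_low / (env_low.max() + 1e-9)
    env_high = env_high / (env_high.max() + 1e-9)
    divergence = np.mean(np.abs(env_low - env_high))
    return divergence <= max_divergence   # True -> consistent with live speech
```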
  • Patent number: 11704504
    Abstract: Provided are an interactive machine translation method and apparatus, a device, and a medium. The method includes: acquiring a source statement input by a user; translating the source statement into a first target statement; determining whether the user adjusts a first vocabulary in the first target statement; and, in response to determining that the user adjusts the first vocabulary, acquiring a second vocabulary to replace the first vocabulary and adjusting, based on the second vocabulary, the vocabulary sequence located in front of the first vocabulary and the vocabulary sequence located behind the first vocabulary in the first target statement to generate a second target statement.
    Type: Grant
    Filed: February 16, 2021
    Date of Patent: July 18, 2023
    Assignee: Beijing Bytedance Network Technology Co., Ltd.
    Inventors: Lei Li, Mingxuan Wang, Hao Zhou, Zewei Sun
  • Patent number: 11704498
    Abstract: A method and apparatus for training models in machine translation, an electronic device and a storage medium are disclosed, which relates to the field of natural language processing technologies and the field of deep learning technologies. An implementation includes mining similar target sentences of a group of samples based on a parallel corpus using a machine translation model and a semantic similarity model, and creating a first training sample set; training the machine translation model with the first training sample set; mining a negative sample of each sample in the group of samples based on the parallel corpus using the machine translation model and the semantic similarity model, and creating a second training sample set; and training the semantic similarity model with the second training sample set.
    Type: Grant
    Filed: March 12, 2021
    Date of Patent: July 18, 2023
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Ruiqing Zhang, Chuanqiang Zhang, Zhongjun He, Zhi Li, Hua Wu
  • Patent number: 11694692
    Abstract: A system accesses a first digital audio file that includes a plurality of spoken instructions. The system converts the first digital audio file to a first spectrogram image, applies a filter to determine whether an image quality of the first spectrogram image is below a predetermined image quality, and in response, generates a second spectrogram image from the first spectrogram image using a training model. The system converts the second spectrogram image to a second digital audio file and converts the second digital audio file into multiple vectors that each correspond to a particular spoken instruction. The system identifies related vectors and concatenates the related vectors together in order to create a plurality of concatenated vectors. The system generates, using the plurality of concatenated vectors, a third digital audio file that includes concatenated spoken instructions from the first digital audio file.
    Type: Grant
    Filed: November 11, 2020
    Date of Patent: July 4, 2023
    Assignee: Bank of America Corporation
    Inventors: Madhusudhanan Krishnamoorthy, Ayesha Farha Ameer Hamza, Ramya Gangathara Rao
  • Patent number: 11669694
    Abstract: A method of obtaining, by an electronic device, a sentence corresponding to context information, including: obtaining first output information including at least one word output by decoding the context information based on at least one set of data; based on detecting that a first token is not included in the first output information, determining whether the number of words included in the first output information is greater than or equal to a reference value; based on a result of the determining, replacing the at least one set of data with other data; and obtaining the sentence corresponding to the context information based on at least one piece of output information obtained by decoding the context information based on the other data.
    Type: Grant
    Filed: October 22, 2020
    Date of Patent: June 6, 2023
    Assignees: SAMSUNG ELECTRONICS CO., LTD., NEW YORK UNIVERSITY
    Inventors: Yoonjung Choi, Jaedeok Kim, Ilia Kulikov, Sean Welleck, Yuanzhe Pang, Kyunghyun Cho
  • Patent number: 11669687
    Abstract: Systems, apparatuses, methods, and computer program products are disclosed for determining robustness information for an NLP model. Modification rules, such as replacement rules and/or insertion rules, are used to generate instances of modified test data from instances of test data that comprise words and have a syntax and a semantic meaning. The instances of test data and modified test data are provided to the NLP model, and its output is analyzed to determine output-changing instances of modified test data: instances for which the NLP model's output differs from, or is not similar to, its output for the corresponding instance of test data. Robustness information for the NLP model is determined based at least in part on the output-changing instances of modified test data. White-box and/or black-box attacks may be performed (a sketch of replacement-rule testing follows this entry).
    Type: Grant
    Filed: November 12, 2020
    Date of Patent: June 6, 2023
    Assignee: Wells Fargo Bank, N.A.
    Inventors: Tarun Joshi, Rahul Singh, Vijayan Nair, Agus Sudjianto
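    A sketch of replacement-rule testing, assuming the model is a callable from text to label and reporting robustness as the fraction of modified instances whose output is unchanged; the metric's exact form in the patent may differ, and the toy model and rules are illustrative:

```python
def robustness_report(model, test_data, replacement_rules):
    # model: callable text -> label. replacement_rules: (old_word, new_word)
    # pairs used to build modified test instances. Returns output-changing
    # instances and a simple robustness score.
    changing, total = [], 0
    for text in test_data:
        for old, new in replacement_rules:
            if old not in text.split():
                continue
            modified = " ".join(new if w == old else w for w in text.split())
            total += 1
            if model(modified) != model(text):   # output-changing instance
                changing.append((text, modified))
    score = 1.0 - len(changing) / total if total else 1.0
    return {"robustness": score, "output_changing": changing}

# Toy model: classifies by the presence of the word "good".
toy = lambda t: "pos" if "good" in t else "neg"
print(robustness_report(toy, ["a good movie", "a bad movie"],
                        [("good", "great"), ("bad", "poor")]))
```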
  • Patent number: 11664010
    Abstract: Systems and methods for generating a natural language domain corpus to train a machine learning natural language understanding process. A base utterance expressing an intent, and an intent profile indicating at least one of categories, keywords, concepts, sentiment, entities, or emotion of the intent, are received. Machine translation translates the base utterance into a plurality of foreign language utterances and back into respective utterances in the target natural language to create a normalized utterance set. Analysis of each utterance in the normalized utterance set determines respective meta information for each such utterance. Comparison of the meta information to the intent profile determines a highest-ranking matching utterance within the normalized utterance set. A set of natural language data to train a machine learning natural language understanding process is created based on further natural language translations of the highest-ranking matching utterance (a sketch follows this entry).
    Type: Grant
    Filed: November 3, 2020
    Date of Patent: May 30, 2023
    Assignee: Florida Power & Light Company
    Inventors: Brien H. Muschett, Joshua D. Calhoun
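    A sketch of the round-trip normalization and profile matching. An identity stub stands in for a real machine-translation service (the patent names none), and a bag-of-words overlap stands in for the patent's meta-information comparison; both are assumptions made so the sketch runs end to end:

```python
def translate(text, src, tgt):
    # Identity stub standing in for a real MT service.
    return text

def normalized_set(base_utterance, pivot_langs=("fr", "de", "es")):
    # Round-trip the base utterance through several pivot languages to
    # produce paraphrase candidates (the "normalized utterance set").
    return [translate(translate(base_utterance, "en", lang), lang, "en")
            for lang in pivot_langs]

def best_match(candidates, intent_profile, extract_meta):
    # Rank candidates by overlap between their meta information (keywords,
    # entities, ...) and the intent profile; return the highest-ranking one.
    return max(candidates, key=lambda u: len(extract_meta(u) & intent_profile))

cands = normalized_set("turn off the outside lights")
print(best_match(cands, {"lights", "off"}, extract_meta=lambda u: set(u.split())))
```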
  • Patent number: 11662610
    Abstract: A smart device input method based on facial vibration includes: collecting a facial vibration signal generated when a user performs voice input; extracting Mel-frequency cepstral coefficients from the facial vibration signal; and taking the Mel-frequency cepstral coefficients as an observation sequence to obtain the text input corresponding to the facial vibration signal using a trained hidden Markov model (a sketch follows this entry). The facial vibration signal is collected by a vibration sensor arranged on glasses. The vibration signal is processed by: amplifying the collected facial vibration signal; transmitting the amplified signal to the smart device via a wireless module; and, on the smart device, intercepting a section of the received signal as an effective portion and extracting the Mel-frequency cepstral coefficients from that portion.
    Type: Grant
    Filed: April 8, 2019
    Date of Patent: May 30, 2023
    Assignee: SHENZHEN UNIVERSITY
    Inventors: Kaishun Wu, Maoning Guan
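    A sketch of the MFCC-plus-HMM recognizer using librosa and hmmlearn, assuming one Gaussian HMM per vocabulary word, an 8 kHz sample rate, and 13 coefficients; the patent does not specify these choices:

```python
import numpy as np
import librosa
from hmmlearn import hmm

def mfcc_features(vibration, sr=8000, n_mfcc=13):
    # MFCCs of the (amplified, trimmed) facial-vibration signal; rows are frames.
    return librosa.feature.mfcc(y=np.asarray(vibration, dtype=float),
                                sr=sr, n_mfcc=n_mfcc).T

def train_word_model(examples, sr=8000, n_states=5):
    # Fit one Gaussian HMM on the concatenated MFCC sequences of one word.
    feats = [mfcc_features(e, sr) for e in examples]
    model = hmm.GaussianHMM(n_components=n_states, covariance_type="diag",
                            n_iter=20, random_state=0)
    model.fit(np.vstack(feats), lengths=[f.shape[0] for f in feats])
    return model

def recognize(signal, word_models, sr=8000):
    # Score the observation sequence under each word's model; pick the best.
    feats = mfcc_features(signal, sr)
    return max(word_models, key=lambda w: word_models[w].score(feats))
```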
  • Patent number: 11651166
    Abstract: A learning device of a phrase generation model includes a memory and a processor configured to train the phrase generation model, which comprises an encoder and a decoder, using 3-tuples as training data. Each 3-tuple includes a combination of phrases and at least one of a conjunctive expression representing a relationship between the phrases and a relational label indicating the relationship represented by the conjunctive expression. The encoder converts a phrase into a vector from a 2-tuple that includes a phrase and at least one of the conjunctive expression and the relational label. The decoder generates, from the converted vector and the conjunctive expression or the relational label, a phrase having the relationship represented by the conjunctive expression or the relational label with respect to the input phrase.
    Type: Grant
    Filed: February 22, 2019
    Date of Patent: May 16, 2023
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Itsumi Saito, Kyosuke Nishida, Hisako Asano, Junji Tomita
  • Patent number: 11645476
    Abstract: A computer generates a formal planning domain description. The computer receives a first text-based description, written in natural language, of a domain in an AI environment; the domain includes an action and an associated attribute. From the first text-based description, the computer extracts a first set of domain actions and associated action attributes. The computer also receives audio-visual elements depicting the domain, generates a second text-based description from them, and extracts a second set of domain actions and associated action attributes. The computer constructs finite state machines (FSMs) corresponding to the extracted actions and attributes, then converts the FSMs into a symbolic model, written in a formal planning language, that describes the domain.
    Type: Grant
    Filed: September 29, 2020
    Date of Patent: May 9, 2023
    Assignee: International Business Machines Corporation
    Inventors: Mattia Chiari, Yufang Hou, Hiroshi Kajino, Akihiro Kishimoto, Radu Marinescu
  • Patent number: 11640819
    Abstract: A non-transitory computer-readable recording medium stores an update program that causes a computer to execute a procedure. The procedure includes: calculating a selection rate of each of a plurality of quantization points included in a quantization table, based on quantization data obtained by quantizing features of a plurality of utterance data; and updating the quantization table by updating the plurality of quantization points based on the selection rates (a sketch follows this entry).
    Type: Grant
    Filed: October 30, 2020
    Date of Patent: May 2, 2023
    Assignee: FUJITSU LIMITED
    Inventor: Naoshi Matsuo
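    A sketch of a selection-rate-driven table update: each feature selects its nearest quantization point, points are re-centered on their selections, and rarely selected points are re-seeded from the data. The re-seeding rule and the `min_rate` cutoff are illustrative assumptions, not the patent's exact update:

```python
import numpy as np

def update_quantization_table(table, features, min_rate=0.01):
    # table: (K, d) quantization points; features: (N, d) utterance features.
    dists = np.linalg.norm(features[:, None, :] - table[None, :, :], axis=2)
    nearest = dists.argmin(axis=1)               # quantization of each feature
    rates = np.bincount(nearest, minlength=len(table)) / len(features)
    new_table = table.copy()
    rng = np.random.default_rng(0)
    for k, rate in enumerate(rates):
        if rate < min_rate:                      # dead point: re-seed it
            new_table[k] = features[rng.integers(len(features))]
        else:                                    # re-center on its selections
            new_table[k] = features[nearest == k].mean(axis=0)
    return new_table, rates

rng = np.random.default_rng(1)
new_table, rates = update_quantization_table(rng.normal(size=(8, 4)),
                                             rng.normal(size=(200, 4)))
print(rates)
```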