Patents Examined by Douglas Godbold
  • Patent number: 11935532
    Abstract: Aspects of the disclosure relate to receiving a stateless application programming interface (“API”) request. The API request may store an utterance, previous utterance data, and a sequence of labels, each label in the sequence being associated with a previous utterance expressed by a user during an interaction. The previous utterance data may, in certain embodiments, be limited to a pre-determined number of utterances occurring prior to the utterance. Embodiments process the utterance, using a natural language processor in electronic communication with a first processor, to output an utterance intent, a semantic meaning of the utterance, and an utterance parameter. The utterance parameter may include words in the utterance and be associated with the intent. The natural language processor may append the utterance intent, the semantic meaning, and the utterance parameter to the API request. A signal extractor processor may append a plurality of utterance signals to the API request.
    Type: Grant
    Filed: December 1, 2021
    Date of Patent: March 19, 2024
    Assignee: Bank of America Corporation
    Inventors: Ramakrishna R. Yannam, Emad Noorizadeh, Isaac Persing, Sushil Golani, Hari Gopalkrishnan, Dana Patrice Morrow Branch
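As a rough illustration of the stateless design described above, the dialog state (recent utterances and their labels) can travel with every call so the server keeps no session. The class and field names below are assumptions for the sketch, not the patent's actual schema.

```python
from dataclasses import dataclass

@dataclass
class UtteranceRequest:
    utterance: str
    previous_utterances: list  # limited to the most recent max_history turns
    label_sequence: list       # one label per previous utterance
    max_history: int = 5

    def add_turn(self, text: str, label: str) -> "UtteranceRequest":
        """Build the next request, trimming history to max_history turns."""
        prev = (self.previous_utterances + [self.utterance])[-self.max_history:]
        labels = (self.label_sequence + [label])[-self.max_history:]
        return UtteranceRequest(text, prev, labels, self.max_history)

# Each turn produces a new self-describing request; no server-side state.
req = UtteranceRequest("check my balance", [], [])
req = req.add_turn("transfer money", "balance_intent")
```

Because every request carries its own history, any server instance can process it, which is the usual motivation for a stateless API.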
  • Patent number: 11935546
    Abstract: Audio streaming devices, systems, and methods may employ adaptive differential pulse code modulation (ADPCM) techniques that provide optimum performance while ensuring robustness against transmission errors. One illustrative device includes: a difference element that produces a sequence of prediction error values by subtracting predicted values from audio samples; a scaling element that produces scaled error values by dividing each prediction error by a corresponding envelope estimate; a quantizer that operates on the scaled error values to produce quantized error values; a multiplier that rescales the quantized error values by the corresponding envelope estimates to produce reconstructed error values; a predictor that produces the next predicted sample values based on the reconstructed error values; and an envelope estimator that supplies the envelope estimates.
    Type: Grant
    Filed: May 9, 2022
    Date of Patent: March 19, 2024
    Assignee: SEMICONDUCTOR COMPONENTS INDUSTRIES, LLC
    Inventor: Erkan Onat
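The chain of elements in the abstract can be sketched as a single encode/decode loop. The first-order predictor, the 8-level uniform quantizer, and the envelope update rule below are illustrative assumptions, not the patented design.

```python
def adpcm_encode_decode(samples, alpha=0.9, beta=0.99):
    """Toy envelope-normalized ADPCM loop; returns reconstructed samples."""
    predicted, envelope = 0.0, 1.0
    reconstructed = []
    for x in samples:
        error = x - predicted                  # difference element
        scaled = error / envelope              # scaling element
        q = max(-3, min(3, round(scaled)))     # quantizer (this is what's sent)
        recon_error = q * envelope             # multiplier: rescale at decoder
        x_hat = predicted + recon_error        # reconstructed sample
        reconstructed.append(x_hat)
        predicted = alpha * x_hat              # first-order predictor
        envelope = max(1e-6, beta * envelope + (1 - beta) * abs(recon_error))
    return reconstructed
```

Note that the predictor and envelope estimator run on *reconstructed* values, which both encoder and decoder can compute; that shared state is what lets the decoder resynchronize after transmission errors.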
  • Patent number: 11935557
    Abstract: Various embodiments set forth systems and techniques for explaining domain-specific terms detected in a media content stream. The techniques include detecting a speech portion included in an audio signal; determining that the speech portion comprises a domain-specific term; determining an explanatory phrase associated with the domain-specific term; and integrating the explanatory phrase associated with the domain-specific term into playback of the audio signal.
    Type: Grant
    Filed: February 1, 2021
    Date of Patent: March 19, 2024
    Assignee: Harman International Industries, Incorporated
    Inventors: Stefan Marti, Evgeny Burmistrov, Joseph Verbeke, Priya Seshadri
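The detect-then-explain flow above can be sketched at the transcript level: when a glossary term appears in the recognized speech, an explanatory phrase is spliced into playback. The glossary and exact-match rule are illustrative assumptions.

```python
# Hypothetical domain glossary; a real system would look terms up per domain.
GLOSSARY = {
    "offside": "a rule barring attackers from being past the last defender",
}

def annotate(transcript_words):
    """Insert an explanatory phrase after each detected domain-specific term."""
    out = []
    for word in transcript_words:
        out.append(word)
        key = word.lower().strip(".,!?")
        if key in GLOSSARY:
            out.append(f"({GLOSSARY[key]})")
    return " ".join(out)
```

In the patented system the insertion happens in the audio stream itself (e.g. via synthesized speech), but the lookup-and-integrate logic is the same shape.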
  • Patent number: 11935553
    Abstract: A model that outputs embedding vectors for identifying the set of time-frequency points at which the same sound source is dominant can be learned stably and in a short time. Parameters of a neural network (a CNN) are learned from the spectrogram of a signal formed by a plurality of sound sources, such that the embedding vectors the network outputs for time-frequency points at which the same sound source is dominant are similar to one another.
    Type: Grant
    Filed: February 22, 2019
    Date of Patent: March 19, 2024
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Hirokazu Kameoka, Li Li
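The training criterion described above (embeddings similar exactly when the same source dominates) matches the well-known deep-clustering affinity objective ‖VVᵀ − YYᵀ‖²_F, where V holds the embeddings and Y the one-hot dominant-source assignments; the patent may use a different formulation. A small pure-Python sketch:

```python
def dc_loss(V, Y):
    """Deep-clustering-style loss: compare embedding affinities V V^T
    against ideal source affinities Y Y^T over all time-frequency point pairs."""
    n = len(V)
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    return sum((dot(V[i], V[j]) - dot(Y[i], Y[j])) ** 2
               for i in range(n) for j in range(n))
```

The loss is zero exactly when points sharing a dominant source have identical unit-affinity embeddings, which is the property the abstract asks of the learned vectors.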
  • Patent number: 11935551
    Abstract: The present invention relates to audio coding systems which make use of a harmonic transposition method for high frequency reconstruction (HFR). A system and a method for generating a high frequency component of a signal from a low frequency component of the signal is described. The system comprises an analysis filter bank providing a plurality of analysis subband signals of the low frequency component of the signal. It also comprises a non-linear processing unit to generate a synthesis subband signal with a synthesis frequency by modifying the phase of a first and a second of the plurality of analysis subband signals and by combining the phase-modified analysis subband signals. Finally, it comprises a synthesis filter bank for generating the high frequency component of the signal from the synthesis subband signal.
    Type: Grant
    Filed: May 3, 2023
    Date of Patent: March 19, 2024
    Assignee: DOLBY INTERNATIONAL AB
    Inventors: Lars Villemoes, Per Hedelin
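The non-linear processing unit's operation (modify the phases of two analysis subbands, then combine them) can be sketched per subband sample. The geometric-mean magnitude and the phase weights are assumptions for illustration, not Dolby's exact cross-product scheme.

```python
import cmath

def synthesize_subband(s1: complex, s2: complex, t1: float, t2: float) -> complex:
    """Combine two phase-modified analysis subband samples into one
    synthesis subband sample at a higher frequency."""
    mag = (abs(s1) ** 0.5) * (abs(s2) ** 0.5)            # geometric-mean magnitude
    phase = t1 * cmath.phase(s1) + t2 * cmath.phase(s2)  # weighted, modified phases
    return cmath.rect(mag, phase)
```

Multiplying phases by integer factors is what makes the reconstruction *harmonic*: a sinusoid at frequency f contributes energy at a multiple of f, preserving the harmonic series of voiced or pitched content.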
  • Patent number: 11929077
    Abstract: Embodiments of systems and methods for user enrollment in speaker authentication and speaker identification systems are disclosed. In some embodiments, the enrollment process includes collecting speech samples that are examples of multiple speech types spoken by a user, computing a speech representation for each speech sample, and aggregating the example speech representations to form a robust overall representation or user voiceprint of the user's speech.
    Type: Grant
    Filed: December 22, 2020
    Date of Patent: March 12, 2024
    Assignee: DTS Inc.
    Inventors: Michael M. Goodwin, Teodora Ceanga, Eloy Geenjaar, Gadiel Seroussi, Brandon Smith
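The aggregation step above can be sketched simply: average the per-sample embeddings, then length-normalize to form the voiceprint. Mean pooling is an assumed aggregation choice; the embedding model itself is out of scope here.

```python
def make_voiceprint(embeddings):
    """Aggregate per-sample speech embeddings into one normalized voiceprint."""
    dim = len(embeddings[0])
    mean = [sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)]
    norm = sum(x * x for x in mean) ** 0.5 or 1.0
    return [x / norm for x in mean]
```

Enrolling over multiple speech types (as the abstract describes) makes the averaged representation robust: no single speaking style dominates the voiceprint.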
  • Patent number: 11929078
    Abstract: Certain embodiments of the present disclosure provide techniques for training a user detection model to identify a user of a software application based on voice recognition. The method generally includes receiving a data set including a plurality of voice interactions with users of a software application. For each respective recording in the data set, a spectrogram representation is generated based on the respective recording. A plurality of voice recognition models are trained, each based on the spectrogram representations of the voice recordings in the data set. The plurality of voice recognition models are deployed to an interactive voice response system.
    Type: Grant
    Filed: February 23, 2021
    Date of Patent: March 12, 2024
    Assignee: Intuit, Inc.
    Inventors: Shanshan Tuo, Divya Beeram, Meng Chen, Neo Yuchen, Wan Yu Zhang, Nivethitha Kumar, Kavita Sundar, Tomer Tal
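The spectrogram step above can be sketched as framing the waveform and taking DFT magnitudes per frame. The frame and hop sizes are illustrative; a real pipeline would apply a window function and use an FFT library.

```python
import cmath
import math

def spectrogram(signal, frame=8, hop=4):
    """Return per-frame DFT magnitude spectra (a minimal spectrogram)."""
    frames = [signal[i:i + frame]
              for i in range(0, len(signal) - frame + 1, hop)]
    def dft_mag(x):
        n = len(x)
        return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                        for t in range(n)))
                for k in range(n // 2 + 1)]
    return [dft_mag(f) for f in frames]
```

Each recording becomes a 2-D time-frequency array, which is the input representation the voice recognition models are trained on.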
  • Patent number: 11929069
    Abstract: Methods, apparatus, and computer readable media are described related to automated assistants that proactively incorporate, into human-to-computer dialog sessions, unsolicited content of potential interest to a user. In various implementations, based on content of an existing human-to-computer dialog session between a user and an automated assistant, an entity mentioned by the user or automated assistant may be identified. Fact(s) related to the entity or to another entity that is related to the entity may be identified based on entity data contained in database(s). For each of the fact(s), a corresponding measure of potential interest to the user may be determined. Unsolicited natural language content may then be generated that includes one or more of the facts selected based on the corresponding measure(s) of potential interest. The automated assistant may then incorporate the unsolicited content into the existing human-to-computer dialog session or a subsequent human-to-computer dialog session.
    Type: Grant
    Filed: August 25, 2021
    Date of Patent: March 12, 2024
    Assignee: GOOGLE LLC
    Inventors: Vladimir Vuskovic, Stephan Wenger, Zineb Ait Bahajji, Martin Baeuml, Alexandru Dovlecel, Gleb Skobeltsyn
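The fact-selection step above can be sketched as scoring each candidate fact by its measure of potential interest and surfacing only the top facts that clear a threshold. The interest scores, threshold, and top-k rule are assumptions.

```python
def select_unsolicited_facts(facts, interest, threshold=0.5, k=1):
    """Pick up to k facts whose interest measure clears the threshold,
    highest-interest first."""
    scored = sorted(facts, key=lambda f: interest.get(f, 0.0), reverse=True)
    return [f for f in scored if interest.get(f, 0.0) >= threshold][:k]

facts = ["founded in 1998", "headquartered nearby"]
interest = {"founded in 1998": 0.9, "headquartered nearby": 0.3}
picked = select_unsolicited_facts(facts, interest)
```

Gating on a threshold matters here: proactively injecting low-interest content would make the assistant feel intrusive, so facts below the bar are simply dropped.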
  • Patent number: 11929060
    Abstract: A method for training a speech recognition model includes receiving a set of training utterance pairs each including a non-synthetic speech representation and a synthetic speech representation of a same corresponding utterance. At each of a plurality of output steps for each training utterance pair in the set of training utterance pairs, the method also includes determining a consistent loss term for the corresponding training utterance pair based on a first probability distribution over possible non-synthetic speech recognition hypotheses generated for the corresponding non-synthetic speech representation and a second probability distribution over possible synthetic speech recognition hypotheses generated for the corresponding synthetic speech representation. The first and second probability distributions are generated for output by the speech recognition model.
    Type: Grant
    Filed: February 8, 2021
    Date of Patent: March 12, 2024
    Assignee: Google LLC
    Inventors: Zhehuai Chen, Andrew Rosenberg, Bhuvana Ramabhadran, Pedro Jose Moreno Mengibar
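The consistent loss term above penalizes disagreement between the two probability distributions. A divergence such as KL is one standard choice for this; the sketch below uses it as an illustrative stand-in, not as the patent's exact formula.

```python
import math

def consistency_loss(p_real, p_synth, eps=1e-12):
    """KL divergence between the recognition distributions produced for a
    non-synthetic utterance and its synthetic (TTS) counterpart."""
    return sum(p * math.log((p + eps) / (q + eps))
               for p, q in zip(p_real, p_synth))
```

Minimizing this term pushes the recognizer to treat real and synthesized renderings of the same utterance identically, which is what lets synthetic speech usefully augment training data.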
  • Patent number: 11922934
    Abstract: The present disclosure provides method and apparatus for generating a response in a human-machine conversation. A first sound input may be received in the conversation. A first audio attribute may be extracted from the first sound input, wherein the first audio attribute indicates a first condition of a user. A second sound input may be received in the conversation. A second audio attribute may be extracted from the second sound input, wherein the second audio attribute indicates a second condition of a user. A difference between the second audio attribute and the first audio attribute is determined, wherein the difference indicates a condition change of the user from the first condition to the second condition. A response to the second sound input is generated based at least on the condition change.
    Type: Grant
    Filed: April 19, 2018
    Date of Patent: March 5, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jian Luan, Zhe Xiao, Xingyu Na, Chi Xiu, Jianzhong Ju, Xiang Xu
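The condition-change logic above can be sketched as comparing an extracted audio attribute across the two inputs and choosing a response style from the delta. The attribute scale, thresholds, and style labels are assumptions for illustration.

```python
def response_style(attr_first: float, attr_second: float) -> str:
    """Pick a response style from the change in an audio attribute
    (e.g. a normalized arousal/energy value) between two turns."""
    delta = attr_second - attr_first
    if delta > 0.2:
        return "calming"   # user's condition intensified
    if delta < -0.2:
        return "upbeat"    # user's condition relaxed
    return "neutral"       # no meaningful condition change
```

The key idea from the abstract is that the response depends on the *difference* between turns, not on either turn's attribute in isolation.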
  • Patent number: 11922951
    Abstract: Techniques are disclosed that enable processing of audio data to generate one or more refined versions of audio data, where each of the refined versions of audio data isolate one or more utterances of a single respective human speaker. Various implementations generate a refined version of audio data that isolates utterance(s) of a single human speaker by processing a spectrogram representation of the audio data (generated by processing the audio data with a frequency transformation) using a mask generated by processing the spectrogram of the audio data and a speaker embedding for the single human speaker using a trained voice filter model. Output generated over the trained voice filter model is processed using an inverse of the frequency transformation to generate the refined audio data.
    Type: Grant
    Filed: January 3, 2022
    Date of Patent: March 5, 2024
    Assignee: GOOGLE LLC
    Inventors: Quan Wang, Prashant Sridhar, Ignacio Lopez Moreno, Hannah Muckenhirn
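The isolation step above reduces to a pointwise multiplication: the mask (which the trained voice filter model predicts from the mixture spectrogram plus the speaker embedding) is applied to the spectrogram, and the result is inverted back to audio. The sketch takes the mask as given rather than predicting it.

```python
def apply_mask(spectrogram, mask):
    """Pointwise-multiply a [time][freq] magnitude spectrogram by a mask in
    [0, 1] to keep only the target speaker's energy."""
    return [[m * s for m, s in zip(mask_row, spec_row)]
            for mask_row, spec_row in zip(mask, spectrogram)]
```

Values near 1 in the mask mark time-frequency cells dominated by the target speaker; values near 0 suppress interfering speakers and noise.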
  • Patent number: 11922926
    Abstract: A system may include processor(s), and memory in communication with the processor(s) and storing instructions configured to cause the system to correct ASR errors. The system may receive a transcription comprising transcribed word(s) and may determine whether the transcribed word(s) exceed associated predefined confidence level(s). Responsive to determining a transcribed word does not exceed a predefined confidence level, the system may generate a predicted word. The system may calculate a distance between numerical representations of the transcribed word and the predicted word and may determine whether the distance exceeds a predefined threshold. Responsive to determining the distance exceeds the predefined threshold, the system may determine whether at least one red flag word of a list of red flag words corresponds to a context of the transcription, and, responsive to making that determination, may classify the transcription as associated with a first category.
    Type: Grant
    Filed: September 14, 2021
    Date of Patent: March 5, 2024
    Assignee: CAPITAL ONE SERVICES, LLC
    Inventors: Aysu Ezen Can, Feng Qiu, Guadalupe Bonilla, Meredith Leigh Critzer, Michael Mossoba, Alexander Lin, Tyler Maiman, Mia Rodriguez, Vahid Khanagha, Joshua Edwards
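The correction pipeline above can be sketched with Levenshtein distance standing in for the patent's "distance between numerical representations": flag a transcribed word when its confidence is low *and* the predicted replacement is far from it.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance via a rolling dynamic-programming row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,            # deletion
                           cur[j - 1] + 1,         # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def needs_review(word, confidence, predicted, conf_floor=0.8, dist_cap=2):
    """Flag a transcription for category checks when the word is both
    low-confidence and far from the language model's prediction."""
    return confidence < conf_floor and levenshtein(word, predicted) > dist_cap
```

The threshold values here are assumptions; the red-flag-word context check from the abstract would run only on transcriptions this gate flags.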
  • Patent number: 11922928
    Abstract: Apparatus and methods for leveraging machine learning and artificial intelligence to assess a sentiment of an utterance expressed by a user during an interaction between an interactive response system and the user is provided. The methods may include a natural language processor processing the utterance to output an utterance intent. The methods may also include a signal extractor processing the utterance, the utterance intent and previous utterance data to output utterance signals. The methods may additionally include an utterance sentiment classifier using a hierarchy of rules to extract, from a database, a label, the extracting being based on the utterance signals. The methods may further include a sequential neural network classifier using a trained algorithm to process the label and a sequence of historical labels to output a sentiment score.
    Type: Grant
    Filed: December 1, 2021
    Date of Patent: March 5, 2024
    Assignee: Bank of America Corporation
    Inventors: Ramakrishna R. Yannam, Isaac Persing, Emad Noorizadeh, Sushil Golani, Hari Gopalkrishnan, Dana Patrice Morrow Branch
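The hierarchy of rules above can be sketched as an ordered rule list where the first match wins, producing the label that the sequential classifier would then score against the history. The rules, signals, and labels are assumptions, not the patent's actual hierarchy.

```python
# Ordered rule hierarchy: earlier rules take priority over later ones.
RULES = [
    (lambda s: "thank" in s, "satisfied"),
    (lambda s: "not working" in s or "again" in s, "frustrated"),
    (lambda s: True, "neutral"),   # fallback rule, always matches
]

def label_utterance(signals: str) -> str:
    """Return the label from the first matching rule in the hierarchy."""
    for predicate, label in RULES:
        if predicate(signals):
            return label
```

A downstream sequence model (an RNN, for instance) would consume the resulting label sequence to produce the final sentiment score.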
  • Patent number: 11922303
    Abstract: Embodiments described herein provide a training mechanism that transfers the knowledge from a trained BERT model into a much smaller model to approximate the behavior of BERT. Specifically, the BERT model may be treated as a teacher model, and a much smaller student model may be trained using the same inputs to the teacher model and the output from the teacher model. In this way, the student model can be trained within a much shorter time than the BERT teacher model, but with performance comparable to BERT's.
    Type: Grant
    Filed: May 18, 2020
    Date of Patent: March 5, 2024
    Assignee: Salesforce, Inc.
    Inventors: Wenhao Liu, Ka Chun Au, Shashank Harinath, Bryan McCann, Govardana Sachithanandam Ramachandran, Alexis Roos, Caiming Xiong
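The teacher-student transfer above is typically implemented as soft-target cross-entropy with a temperature, as in standard knowledge distillation; the temperature and loss form below are conventional choices, not details taken from the patent.

```python
import math

def softmax(logits, T=2.0):
    """Temperature-softened softmax; higher T spreads probability mass."""
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distill_loss(student_logits, teacher_logits, T=2.0, eps=1e-12):
    """Cross-entropy between the teacher's softened distribution and the
    student's; minimized when the student matches the teacher."""
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    return -sum(t * math.log(s + eps) for t, s in zip(p_teacher, p_student))
```

Because the student trains on the teacher's full output distribution rather than hard labels, it learns from the relative confidences BERT assigns to wrong classes too, which is much of what makes distillation work.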
  • Patent number: 11914968
    Abstract: The application belongs to the field of big data and particularly relates to an official document processing method, device, computer equipment, and storage medium. The method includes the following steps: performing format analysis on the to-be-reviewed official document, acquiring the to-be-reviewed official document in a standard file type, and identifying all file components and contents in that standard-file-type document; synchronously performing text format detection, text content detection, and frame layout detection with a preset text processing model to obtain a format detection result, a content detection result, and a layout detection result; and generating detected error content according to these results, calling up the standard writing rule corresponding to the detected error content, and marking the detected error content and the standard writing rule in the to-be-reviewed official document.
    Type: Grant
    Filed: December 11, 2020
    Date of Patent: February 27, 2024
    Assignee: PING AN TECHNOLOGY (SHENZHEN) CO., LTD.
    Inventors: Xiaohui Jin, Xiaowen Ruan, Liang Xu
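The error-to-rule pairing above can be sketched as running checks over a parsed document and attaching the matching standard writing rule to each detected error. The checks, field names, and rules below are illustrative assumptions only.

```python
# Hypothetical standard writing rules keyed by error type.
WRITING_RULES = {
    "missing_title": "An official document must begin with a title line.",
    "empty_body": "The document body must not be empty.",
}

def review(doc: dict) -> list:
    """Return (error_type, standard_writing_rule) pairs for a parsed document."""
    errors = []
    if not doc.get("title"):
        errors.append(("missing_title", WRITING_RULES["missing_title"]))
    if not doc.get("body"):
        errors.append(("empty_body", WRITING_RULES["empty_body"]))
    return errors
```

In the patented system the format, content, and layout detections come from a trained text processing model rather than hand-written predicates, but each detection is likewise resolved to its corresponding rule for marking in the document.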
  • Patent number: 11908482
    Abstract: This application provides a packet loss retransmission method, a computer-readable storage medium, and an electronic device. The packet loss retransmission method includes: obtaining a loudness corresponding to a target audio data packet; and in response to receiving a packet loss state indicating that the target audio data packet is lost, in accordance with a determination that the loudness corresponding to the target audio data packet meets a first threshold: retransmitting the target audio data packet. The technical solutions of this application may alleviate the problem of long data retransmission time, and improve data transmission efficiency.
    Type: Grant
    Filed: April 26, 2022
    Date of Patent: February 20, 2024
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventor: Junbin Liang
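The retransmission rule above is small enough to sketch directly: a lost packet is retransmitted only when its loudness meets the threshold, so losses that would be inaudible anyway are skipped and bandwidth is saved. The dBFS scale and threshold value are assumptions.

```python
def should_retransmit(lost: bool, loudness_dbfs: float,
                      threshold_dbfs: float = -40.0) -> bool:
    """Retransmit a lost audio packet only if it was loud enough to matter."""
    return lost and loudness_dbfs >= threshold_dbfs
```

This is why the abstract claims shorter retransmission time overall: quiet losses are conceded rather than queued, so the channel is spent only on perceptually significant packets.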
  • Patent number: 11908458
    Abstract: A computer-implemented method for customizing a recurrent neural network transducer (RNN-T) is provided. The computer-implemented method includes synthesizing first domain audio data from first domain text data, and feeding the synthesized first domain audio data into a trained encoder of the recurrent neural network transducer (RNN-T) having an initial condition, wherein the encoder is updated using the synthesized first domain audio data and the first domain text data. The computer-implemented method further includes synthesizing second domain audio data from second domain text data, and feeding the synthesized second domain audio data into the updated encoder of the recurrent neural network transducer (RNN-T), wherein a prediction network of the RNN-T is updated using the synthesized second domain audio data and the second domain text data. The computer-implemented method further includes restoring the updated encoder to the initial condition.
    Type: Grant
    Filed: December 29, 2020
    Date of Patent: February 20, 2024
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Gakuto Kurata, George Andrei Saon, Brian E. D. Kingsbury
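The two-stage flow above (adapt the encoder, adapt the prediction network through the adapted encoder, then roll the encoder back) can be sketched with plain dicts standing in for the real RNN-T modules; this is a shape-of-the-algorithm sketch, not IBM's implementation.

```python
def customize(encoder: dict, prediction: dict,
              domain1_update: dict, domain2_update: dict):
    """Two-stage RNN-T customization with encoder restoration."""
    initial = dict(encoder)            # snapshot the encoder's initial condition
    encoder.update(domain1_update)     # stage 1: encoder update (domain-1 TTS data)
    prediction.update(domain2_update)  # stage 2: prediction-network update (domain-2)
    encoder.clear()
    encoder.update(initial)            # restore the encoder to its initial condition
    return encoder, prediction
```

The net effect is that only the prediction network keeps its domain adaptation, while the encoder ends the procedure exactly where it started.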
  • Patent number: 11900956
    Abstract: The present technology relates to a signal processing device and method, and a program making it possible to reduce the computational complexity of decoding at low cost. A signal processing device includes: a priority information generation unit configured to generate priority information about an audio object on the basis of a plurality of elements expressing a feature of the audio object. The present technology may be applied to an encoding device and a decoding device.
    Type: Grant
    Filed: January 13, 2023
    Date of Patent: February 13, 2024
    Assignee: Sony Group Corporation
    Inventors: Yuki Yamamoto, Toru Chinen, Minoru Tsuji
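The priority generation above can be sketched as a weighted combination of object features, which a decoder could use to process (or skip) audio objects in priority order. The features (gain, motion) and weights are assumed examples of the "plurality of elements expressing a feature" from the abstract.

```python
def object_priority(gain: float, motion: float,
                    w_gain: float = 0.7, w_motion: float = 0.3) -> float:
    """Combine audio-object features into a single priority value."""
    return w_gain * gain + w_motion * motion

def decode_order(objects):
    """objects: list of (name, gain, motion); highest priority decoded first."""
    return sorted(objects, key=lambda o: object_priority(o[1], o[2]),
                  reverse=True)
```

A decoder short on compute can then drop the tail of this ordering, which is how priority information reduces decoding complexity at low cost.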
  • Patent number: 11887610
    Abstract: An audio decoding method includes obtaining an encoded bitstream; performing bitstream demultiplexing on the encoded bitstream, to obtain a high frequency band parameter of a current frame of an audio signal, wherein the high frequency band parameter indicates a location, a quantity, and an amplitude or energy of a tone component comprised in a high frequency band signal of the current frame; obtaining a reconstructed high frequency band signal of the current frame based on the high frequency band parameter; and obtaining an audio output signal of the current frame based on the reconstructed high frequency band signal of the current frame.
    Type: Grant
    Filed: July 12, 2022
    Date of Patent: January 30, 2024
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Bingyin Xia, Jiawei Li, Zhe Wang
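The reconstruction step above can be sketched by placing sinusoidal tone components, described by the transmitted (location, amplitude) parameters, into an otherwise empty high-band frame. The frame length and bin-to-frequency mapping are assumptions.

```python
import math

def reconstruct_highband(tones, n=32):
    """Synthesize a high-band frame from (frequency_bin, amplitude) tone
    parameters decoded from the bitstream."""
    frame = [0.0] * n
    for bin_idx, amp in tones:
        for t in range(n):
            frame[t] += amp * math.cos(2 * math.pi * bin_idx * t / n)
    return frame
```

Parameterizing only the tone components (rather than waveform samples) is what keeps the high-band side information compact while still preserving tonal detail.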
  • Patent number: 11887592
    Abstract: Methods, apparatus, and computer readable media are described related to automated assistants that proactively incorporate, into human-to-computer dialog sessions, unsolicited content of potential interest to a user. In various implementations, based on content of an existing human-to-computer dialog session between a user and an automated assistant, an entity mentioned by the user or automated assistant may be identified. Fact(s) related to the entity or to another entity that is related to the entity may be identified based on entity data contained in database(s). For each of the fact(s), a corresponding measure of potential interest to the user may be determined. Unsolicited natural language content may then be generated that includes one or more of the facts selected based on the corresponding measure(s) of potential interest. The automated assistant may then incorporate the unsolicited content into the existing human-to-computer dialog session or a subsequent human-to-computer dialog session.
    Type: Grant
    Filed: August 25, 2021
    Date of Patent: January 30, 2024
    Assignee: GOOGLE LLC
    Inventors: Vladimir Vuskovic, Stephan Wenger, Zineb Ait Bahajji, Martin Baeuml, Alexandru Dovlecel, Gleb Skobeltsyn