Patents Examined by Douglas Godbold
-
Patent number: 11935532
Abstract: Aspects of the disclosure relate to receiving a stateless application programming interface ("API") request. The API request may store an utterance, previous utterance data and a sequence of labels, each label in the sequence of labels being associated with a previous utterance expressed by a user during an interaction. The previous utterance data may, in certain embodiments, be limited to a pre-determined number of utterances occurring prior to the utterance. Embodiments process the utterance, using a natural language processor in electronic communication with the first processor, to output an utterance intent, a semantic meaning of the utterance and an utterance parameter. The utterance parameter may include words in the utterance and be associated with the intent. The natural language processor may append the utterance intent, the semantic meaning of the utterance and the utterance parameter to the API request. A signal extractor processor may append a plurality of utterance signals to the API request.
Type: Grant
Filed: December 1, 2021
Date of Patent: March 19, 2024
Assignee: Bank of America Corporation
Inventors: Ramakrishna R. Yannam, Emad Noorizadeh, Isaac Persing, Sushil Golani, Hari Gopalkrishnan, Dana Patrice Morrow Branch
-
Patent number: 11935546
Abstract: Audio streaming devices, systems, and methods may employ adaptive differential pulse code modulation (ADPCM) techniques providing for optimum performance even while ensuring robustness against transmission errors. One illustrative device includes: a difference element that produces a sequence of prediction error values by subtracting predicted values from audio samples; a scaling element that produces scaled error values by dividing each prediction error by a corresponding envelope estimate; a quantizer that operates on the scaled error values to produce quantized error values; a multiplier that uses the corresponding envelope estimates to produce reconstructed error values; a predictor that produces the next audio sample values based on the reconstructed error values; and an envelope estimator.
Type: Grant
Filed: May 9, 2022
Date of Patent: March 19, 2024
Assignee: SEMICONDUCTOR COMPONENTS INDUSTRIES, LLC
Inventor: Erkan Onat
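The ADPCM loop described in this abstract can be sketched end to end. The following is a toy illustration, not the patented implementation: the one-tap predictor coefficient, envelope smoothing factor, and uniform quantizer step are all hypothetical choices.

```python
import math

def adpcm_encode_decode(samples, alpha=0.9, beta=0.99, step=0.125):
    """Toy ADPCM loop: difference, envelope scaling, quantization,
    reconstruction, prediction.  alpha (predictor tap) and beta
    (envelope smoothing) are hypothetical parameter choices."""
    prediction, envelope = 0.0, 1.0
    reconstructed = []
    for x in samples:
        error = x - prediction                      # difference element
        scaled = error / envelope                   # scaling element
        q = round(scaled / step) * step             # uniform quantizer
        recon_error = q * envelope                  # multiplier
        y = prediction + recon_error                # reconstructed sample
        reconstructed.append(y)
        prediction = alpha * y                      # one-tap predictor
        envelope = max(beta * envelope + (1 - beta) * abs(recon_error), 1e-6)
    return reconstructed

signal = [math.sin(2 * math.pi * 0.05 * n) for n in range(200)]
decoded = adpcm_encode_decode(signal)
```

Because the envelope and predictor are driven only by reconstructed values, a decoder running the same recursion on the quantized errors stays in sync with the encoder, which is what makes such loops robust to the scale of the input.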
-
Patent number: 11935557
Abstract: Various embodiments set forth systems and techniques for explaining domain-specific terms detected in a media content stream. The techniques include detecting a speech portion included in an audio signal; determining that the speech portion comprises a domain-specific term; determining an explanatory phrase associated with the domain-specific term; and integrating the explanatory phrase associated with the domain-specific term into playback of the audio signal.
Type: Grant
Filed: February 1, 2021
Date of Patent: March 19, 2024
Assignee: Harman International Industries, Incorporated
Inventors: Stefan Marti, Evgeny Burmistrov, Joseph Verbeke, Priya Seshadri
-
Patent number: 11935553
Abstract: A model that outputs embedded vectors used to identify the set of time-frequency points at which the same sound source is dominant can be learned stably and in a short time. The parameters of the neural network, a CNN, are learned from a spectrogram of a signal formed by a plurality of sound sources, such that the embedded vectors the network outputs for time-frequency points at which the same sound source is dominant are similar to one another.
Type: Grant
Filed: February 22, 2019
Date of Patent: March 19, 2024
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Hirokazu Kameoka, Li Li
-
Patent number: 11935551
Abstract: The present invention relates to audio coding systems which make use of a harmonic transposition method for high frequency reconstruction (HFR). A system and a method for generating a high frequency component of a signal from a low frequency component of the signal is described. The system comprises an analysis filter bank providing a plurality of analysis subband signals of the low frequency component of the signal. It also comprises a non-linear processing unit to generate a synthesis subband signal with a synthesis frequency by modifying the phase of a first and a second of the plurality of analysis subband signals and by combining the phase-modified analysis subband signals. Finally, it comprises a synthesis filter bank for generating the high frequency component of the signal from the synthesis subband signal.
Type: Grant
Filed: May 3, 2023
Date of Patent: March 19, 2024
Assignee: DOLBY INTERNATIONAL AB
Inventors: Lars Villemoes, Per Hedelin
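The phase modification and combination step can be illustrated on a single pair of complex subband samples. This is a minimal sketch of a cross-product-style transposition, assuming weighted phase addition and a weighted-geometric-mean magnitude; the weights and the magnitude rule are illustrative assumptions, not the claimed method.

```python
import cmath

def synth_subband(a1, a2, w1=0.5, w2=0.5):
    """Toy cross-product transposition: combine two analysis subband
    samples by adding their (weighted) phases and taking a weighted
    geometric mean of their magnitudes.  w1, w2 are hypothetical."""
    mag = (abs(a1) ** w1) * (abs(a2) ** w2)
    phase = w1 * cmath.phase(a1) + w2 * cmath.phase(a2)
    return cmath.rect(mag, phase)

a1 = cmath.rect(2.0, 0.3)   # analysis subband sample 1
a2 = cmath.rect(8.0, 0.5)   # analysis subband sample 2
out = synth_subband(a1, a2)
```

Adding phases of two lower subbands is what places the synthesized energy at a harmonically related higher frequency once the synthesis filter bank reassembles the signal.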
-
Patent number: 11929077
Abstract: Embodiments of systems and methods for user enrollment in speaker authentication and speaker identification systems are disclosed. In some embodiments, the enrollment process includes collecting speech samples that are examples of multiple speech types spoken by a user, computing a speech representation for each speech sample, and aggregating the example speech representations to form a robust overall representation or user voiceprint of the user's speech.
Type: Grant
Filed: December 22, 2020
Date of Patent: March 12, 2024
Assignee: DTS Inc.
Inventors: Michael M. Goodwin, Teodora Ceanga, Eloy Geenjaar, Gadiel Seroussi, Brandon Smith
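The aggregation step in this enrollment flow can be sketched simply. The abstract does not specify the aggregation rule, so mean-then-normalize here is an assumption for illustration only.

```python
def aggregate_voiceprint(sample_embeddings):
    """Average per-sample speech representations into one user
    voiceprint, then L2-normalize.  The mean + normalization strategy
    is an assumed choice; the patent only says 'aggregating'."""
    n = len(sample_embeddings)
    dim = len(sample_embeddings[0])
    mean = [sum(e[i] for e in sample_embeddings) / n for i in range(dim)]
    norm = sum(v * v for v in mean) ** 0.5 or 1.0
    return [v / norm for v in mean]

# Two (tiny, synthetic) per-sample embeddings for one enrolled user:
voiceprint = aggregate_voiceprint([[1.0, 0.0], [0.0, 1.0]])
```

Averaging over samples of multiple speech types is what makes the resulting voiceprint robust: no single utterance style dominates the stored representation.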
-
Patent number: 11929078
Abstract: Certain embodiments of the present disclosure provide techniques for training a user detection model to identify a user of a software application based on voice recognition. The method generally includes receiving a data set including a plurality of voice interactions with users of a software application. For each respective recording in the data set, a spectrogram representation is generated based on the respective recording. A plurality of voice recognition models are trained, each based on the spectrogram representation for each of the plurality of voice recordings in the data set. The plurality of voice recognition models are deployed to an interactive voice response system.
Type: Grant
Filed: February 23, 2021
Date of Patent: March 12, 2024
Assignee: Intuit, Inc.
Inventors: Shanshan Tuo, Divya Beeram, Meng Chen, Neo Yuchen, Wan Yu Zhang, Nivethitha Kumar, Kavita Sundar, Tomer Tal
-
Patent number: 11929069
Abstract: Methods, apparatus, and computer readable media are described related to automated assistants that proactively incorporate, into human-to-computer dialog sessions, unsolicited content of potential interest to a user. In various implementations, based on content of an existing human-to-computer dialog session between a user and an automated assistant, an entity mentioned by the user or automated assistant may be identified. Fact(s) related to the entity or to another entity that is related to the entity may be identified based on entity data contained in database(s). For each of the fact(s), a corresponding measure of potential interest to the user may be determined. Unsolicited natural language content may then be generated that includes one or more of the facts selected based on the corresponding measure(s) of potential interest. The automated assistant may then incorporate the unsolicited content into the existing human-to-computer dialog session or a subsequent human-to-computer dialog session.
Type: Grant
Filed: August 25, 2021
Date of Patent: March 12, 2024
Assignee: GOOGLE LLC
Inventors: Vladimir Vuskovic, Stephan Wenger, Zineb Ait Bahajji, Martin Baeuml, Alexandru Dovlecel, Gleb Skobeltsyn
-
Patent number: 11929060
Abstract: A method for training a speech recognition model includes receiving a set of training utterance pairs each including a non-synthetic speech representation and a synthetic speech representation of a same corresponding utterance. At each of a plurality of output steps for each training utterance pair in the set of training utterance pairs, the method also includes determining a consistent loss term for the corresponding training utterance pair based on a first probability distribution over possible non-synthetic speech recognition hypotheses generated for the corresponding non-synthetic speech representation and a second probability distribution over possible synthetic speech recognition hypotheses generated for the corresponding synthetic speech representation. The first and second probability distributions are generated for output by the speech recognition model.
Type: Grant
Filed: February 8, 2021
Date of Patent: March 12, 2024
Assignee: Google LLC
Inventors: Zhehuai Chen, Andrew Rosenberg, Bhuvana Ramabhadran, Pedro Jose Moreno Mengibar
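A consistency term comparing the two hypothesis distributions could be realized in several ways; a symmetric KL divergence is one plausible sketch. The symmetric form and the epsilon smoothing are assumptions, since the abstract only states that the loss is based on the two distributions.

```python
import math

def consistency_loss(p_nonsynth, p_synth, eps=1e-12):
    """Symmetric KL divergence between the hypothesis distribution for
    the non-synthetic representation and the one for the synthetic
    representation.  Zero when the model treats both identically."""
    def kl(p, q):
        return sum(pi * math.log((pi + eps) / (qi + eps))
                   for pi, qi in zip(p, q))
    return 0.5 * (kl(p_nonsynth, p_synth) + kl(p_synth, p_nonsynth))
```

Driving this term toward zero pushes the recognizer to score real and TTS-generated renderings of the same utterance the same way, which is the point of training on such pairs.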
-
Patent number: 11922934
Abstract: The present disclosure provides a method and apparatus for generating a response in a human-machine conversation. A first sound input may be received in the conversation. A first audio attribute may be extracted from the first sound input, wherein the first audio attribute indicates a first condition of a user. A second sound input may be received in the conversation. A second audio attribute may be extracted from the second sound input, wherein the second audio attribute indicates a second condition of the user. A difference between the second audio attribute and the first audio attribute is determined, wherein the difference indicates a condition change of the user from the first condition to the second condition. A response to the second sound input is generated based at least on the condition change.
Type: Grant
Filed: April 19, 2018
Date of Patent: March 5, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventors: Jian Luan, Zhe Xiao, Xingyu Na, Chi Xiu, Jianzhong Ju, Xiang Xu
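The attribute-difference step reduces to comparing one scalar per turn. A minimal sketch, assuming the audio attribute is already extracted and normalized to [0, 1] (e.g. mean pitch or energy) and assuming a hypothetical change threshold:

```python
def detect_condition_change(attr_first, attr_second, threshold=0.2):
    """Classify the user's condition change from the difference between
    the second and first audio attribute.  The 0.2 threshold and the
    three-way labeling are illustrative assumptions."""
    delta = attr_second - attr_first
    if delta > threshold:
        return "elevated"   # e.g. user sounds more agitated
    if delta < -threshold:
        return "lowered"    # e.g. user sounds calmer
    return "steady"
```

A response generator could then condition on the returned label, e.g. softening its wording when the change is "elevated".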
-
Patent number: 11922951
Abstract: Techniques are disclosed that enable processing of audio data to generate one or more refined versions of audio data, where each of the refined versions of audio data isolates one or more utterances of a single respective human speaker. Various implementations generate a refined version of audio data that isolates utterance(s) of a single human speaker by processing a spectrogram representation of the audio data (generated by processing the audio data with a frequency transformation) using a mask generated by processing the spectrogram of the audio data and a speaker embedding for the single human speaker using a trained voice filter model. Output generated over the trained voice filter model is processed using an inverse of the frequency transformation to generate the refined audio data.
Type: Grant
Filed: January 3, 2022
Date of Patent: March 5, 2024
Assignee: GOOGLE LLC
Inventors: Quan Wang, Prashant Sridhar, Ignacio Lopez Moreno, Hannah Muckenhirn
-
Patent number: 11922926
Abstract: A system may include processor(s), and memory in communication with the processor(s) and storing instructions configured to cause the system to correct ASR errors. The system may receive a transcription comprising transcribed word(s) and may determine whether the transcribed word(s) exceed associated predefined confidence level(s). Responsive to determining a transcribed word does not exceed a predefined confidence level, the system may generate a predicted word. The system may calculate a distance between numerical representations of the transcribed word and the predicted word and may determine whether the distance exceeds a predefined threshold. Responsive to determining the distance exceeds the predefined threshold, the system may determine whether at least one red flag word of a list of red flag words corresponds to a context of the transcription, and, responsive to making that determination, may classify the transcription as associated with a first category.
Type: Grant
Filed: September 14, 2021
Date of Patent: March 5, 2024
Assignee: CAPITAL ONE SERVICES, LLC
Inventors: Aysu Ezen Can, Feng Qiu, Guadalupe Bonilla, Meredith Leigh Critzer, Michael Mossoba, Alexander Lin, Tyler Maiman, Mia Rodriguez, Vahid Khanagha, Joshua Edwards
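The decision chain in this abstract (low confidence → distance check → red-flag check → category) can be sketched as a small function. Everything concrete here is an assumption: the vector space for the word representations, the Euclidean distance, the threshold value, and the category labels.

```python
def flag_transcription(transcribed_vec, predicted_vec, transcript_words,
                       red_flag_words, distance_threshold=1.0):
    """For a word already known to fall below its confidence level:
    measure the distance between its numerical representation and the
    predicted word's, then consult the red-flag list.  Threshold and
    labels are hypothetical."""
    dist = sum((a - b) ** 2
               for a, b in zip(transcribed_vec, predicted_vec)) ** 0.5
    if dist > distance_threshold:
        # transcription and prediction disagree strongly
        if any(w in transcript_words for w in red_flag_words):
            return "first_category"
        return "needs_review"
    return "accepted"
```

For example, a low-confidence word whose embedding sits far from the predicted word's, inside a transcript containing a red-flag word, lands in the first category.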
-
Patent number: 11922928
Abstract: Apparatus and methods for leveraging machine learning and artificial intelligence to assess a sentiment of an utterance expressed by a user during an interaction between an interactive response system and the user are provided. The methods may include a natural language processor processing the utterance to output an utterance intent. The methods may also include a signal extractor processing the utterance, the utterance intent and previous utterance data to output utterance signals. The methods may additionally include an utterance sentiment classifier using a hierarchy of rules to extract, from a database, a label, the extracting being based on the utterance signals. The methods may further include a sequential neural network classifier using a trained algorithm to process the label and a sequence of historical labels to output a sentiment score.
Type: Grant
Filed: December 1, 2021
Date of Patent: March 5, 2024
Assignee: Bank of America Corporation
Inventors: Ramakrishna R. Yannam, Isaac Persing, Emad Noorizadeh, Sushil Golani, Hari Gopalkrishnan, Dana Patrice Morrow Branch
-
Patent number: 11922303
Abstract: Embodiments described herein provide a training mechanism that transfers the knowledge from a trained BERT model into a much smaller model to approximate the behavior of BERT. Specifically, the BERT model may be treated as a teacher model, and a much smaller student model may be trained using the same inputs to the teacher model and the output from the teacher model. In this way, the student model can be trained within a much shorter time than the BERT teacher model, but with performance comparable to BERT.
Type: Grant
Filed: May 18, 2020
Date of Patent: March 5, 2024
Assignee: Salesforce, Inc.
Inventors: Wenhao Liu, Ka Chun Au, Shashank Harinath, Bryan McCann, Govardana Sachithanandam Ramachandran, Alexis Roos, Caiming Xiong
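The teacher-student transfer described here is commonly trained with a soft-target distillation loss; the abstract does not fix a loss function, so the KL form and the temperature value below are assumptions for illustration.

```python
import math

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL divergence between temperature-softened teacher and student
    output distributions over the same input.  T=2.0 is a hypothetical
    temperature; higher T exposes more of the teacher's 'dark
    knowledge' about non-argmax classes."""
    def softmax(logits):
        m = max(logits)
        exps = [math.exp((l - m) / T) for l in logits]
        s = sum(exps)
        return [e / s for e in exps]
    p = softmax(teacher_logits)   # soft targets from the teacher
    q = softmax(student_logits)   # student's current prediction
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

Minimizing this over the teacher's training inputs is what lets the small student approximate BERT's behavior at a fraction of the training time.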
-
Patent number: 11914968
Abstract: The application belongs to the field of big data, and particularly relates to an official document processing method, device, computer equipment and storage medium. The method includes the following steps: performing format analysis on the to-be-reviewed official document, acquiring the to-be-reviewed official document in a standard file type, and identifying all file components and contents in the to-be-reviewed official document of the standard file type; performing text format detection, text content detection and frame layout detection synchronously with a preset text processing model, obtaining a format detection result, a content detection result and a layout detection result; generating detected error content according to the format detection result, content detection result and layout detection result; and calling up the standard writing rule corresponding to the detected error content and marking the detected error content and the standard writing rule in the to-be-reviewed official document.
Type: Grant
Filed: December 11, 2020
Date of Patent: February 27, 2024
Assignee: PING AN TECHNOLOGY (SHENZHEN) CO., LTD.
Inventors: Xiaohui Jin, Xiaowen Ruan, Liang Xu
-
Patent number: 11908482
Abstract: This application provides a packet loss retransmission method, a computer-readable storage medium, and an electronic device. The packet loss retransmission method includes: obtaining a loudness corresponding to a target audio data packet; and in response to receiving a packet loss state indicating that the target audio data packet is lost, in accordance with a determination that the loudness corresponding to the target audio data packet meets a first threshold: retransmitting the target audio data packet. The technical solutions of this application may alleviate the problem of long data retransmission time, and improve data transmission efficiency.
Type: Grant
Filed: April 26, 2022
Date of Patent: February 20, 2024
Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Inventor: Junbin Liang
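The core decision rule is compact enough to state directly. A minimal sketch, assuming loudness is expressed in dBFS and assuming a hypothetical threshold value (the patent only requires that the loudness "meets a first threshold"):

```python
def should_retransmit(packet_loudness_db, lost, loudness_threshold_db=-40.0):
    """Retransmit a lost audio packet only when it is loud enough to be
    perceptible; near-silent packets are not worth the round trip.
    The -40 dBFS threshold is a hypothetical choice."""
    return lost and packet_loudness_db >= loudness_threshold_db
```

Skipping retransmission of quiet packets is what shortens the average retransmission workload without audibly degrading the stream.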
-
Patent number: 11908458
Abstract: A computer-implemented method for customizing a recurrent neural network transducer (RNN-T) is provided. The computer-implemented method includes synthesizing first domain audio data from first domain text data, and feeding the synthesized first domain audio data into a trained encoder of the recurrent neural network transducer (RNN-T) having an initial condition, wherein the encoder is updated using the synthesized first domain audio data and the first domain text data. The computer-implemented method further includes synthesizing second domain audio data from second domain text data, and feeding the synthesized second domain audio data into the updated encoder of the recurrent neural network transducer (RNN-T), wherein the prediction network is updated using the synthesized second domain audio data and the second domain text data. The computer-implemented method further includes restoring the updated encoder to the initial condition.
Type: Grant
Filed: December 29, 2020
Date of Patent: February 20, 2024
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Gakuto Kurata, George Andrei Saon, Brian E. D. Kingsbury
-
Patent number: 11900956
Abstract: The present technology relates to a signal processing device and method, and a program making it possible to reduce the computational complexity of decoding at low cost. A signal processing device includes: a priority information generation unit configured to generate priority information about an audio object on the basis of a plurality of elements expressing a feature of the audio object. The present technology may be applied to an encoding device and a decoding device.
Type: Grant
Filed: January 13, 2023
Date of Patent: February 13, 2024
Assignee: Sony Group Corporation
Inventors: Yuki Yamamoto, Toru Chinen, Minoru Tsuji
-
Patent number: 11887610
Abstract: An audio decoding method includes obtaining an encoded bitstream; performing bitstream demultiplexing on the encoded bitstream, to obtain a high frequency band parameter of a current frame of an audio signal, wherein the high frequency band parameter indicates a location, a quantity, and an amplitude or energy of a tone component comprised in a high frequency band signal of the current frame; obtaining a reconstructed high frequency band signal of the current frame based on the high frequency band parameter; and obtaining an audio output signal of the current frame based on the reconstructed high frequency band signal of the current frame.
Type: Grant
Filed: July 12, 2022
Date of Patent: January 30, 2024
Assignee: HUAWEI TECHNOLOGIES CO., LTD.
Inventors: Bingyin Xia, Jiawei Li, Zhe Wang
-
Patent number: 11887592
Abstract: Methods, apparatus, and computer readable media are described related to automated assistants that proactively incorporate, into human-to-computer dialog sessions, unsolicited content of potential interest to a user. In various implementations, based on content of an existing human-to-computer dialog session between a user and an automated assistant, an entity mentioned by the user or automated assistant may be identified. Fact(s) related to the entity or to another entity that is related to the entity may be identified based on entity data contained in database(s). For each of the fact(s), a corresponding measure of potential interest to the user may be determined. Unsolicited natural language content may then be generated that includes one or more of the facts selected based on the corresponding measure(s) of potential interest. The automated assistant may then incorporate the unsolicited content into the existing human-to-computer dialog session or a subsequent human-to-computer dialog session.
Type: Grant
Filed: August 25, 2021
Date of Patent: January 30, 2024
Assignee: GOOGLE LLC
Inventors: Vladimir Vuskovic, Stephan Wenger, Zineb Ait Bahajji, Martin Baeuml, Alexandru Dovlecel, Gleb Skobeltsyn