Patents Examined by Jialong He
-
Patent number: 11373655
Abstract: An apparatus includes processor(s) to: perform preprocessing operations of a segmentation technique including divide speech data set into data chunks representing chunks of speech audio, use an acoustic model with each data chunk to identify pauses in the speech audio, and analyze a length of time of each identified pause to identify a candidate set of likely sentence pauses in the speech audio; and perform speech-to-text operations including divide the speech data set into data segments that each represent segments of the speech audio based on the candidate set of likely sentence pauses, use the acoustic model with each data segment to identify likely speech sounds in the speech audio, analyze the identified likely speech sounds to identify candidate sets of words likely spoken in the speech audio, and generate a transcript of the speech data set based at least on the candidate sets of words likely spoken.
Type: Grant
Filed: October 12, 2021
Date of Patent: June 28, 2022
Assignee: SAS INSTITUTE INC.
Inventors: Xiaolong Li, Xiaozhuo Cheng, Xu Yang
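The abstract for 11373655 does not say how pause lengths are analyzed, so the following is only an illustrative sketch. It assumes pauses are already available as (start, end) pairs and uses a hypothetical rule: a pause counts as a likely sentence pause when it is at least `factor` times the median pause length. The function names, the `factor` parameter, and the midpoint cut are assumptions, not the patented method.

```python
from statistics import median

def likely_sentence_pauses(pauses, factor=1.5):
    """Keep pauses at least `factor` times as long as the median pause.

    pauses: list of (start, end) pairs in any consistent time unit.
    The median-based cutoff is an assumed heuristic for illustration.
    """
    if not pauses:
        return []
    durations = [end - start for start, end in pauses]
    cutoff = factor * median(durations)
    return [p for p, d in zip(pauses, durations) if d >= cutoff]

def split_at_pauses(total_length, sentence_pauses):
    """Cut the span [0, total_length] at the midpoint of each selected pause."""
    cuts = sorted((start + end) / 2 for start, end in sentence_pauses)
    bounds = [0.0] + cuts + [total_length]
    return list(zip(bounds[:-1], bounds[1:]))
```

A caller would first select the long pauses and then pass them to `split_at_pauses` to obtain the segments that are handed to the speech-to-text stage.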
-
Patent number: 11367432
Abstract: A method for generating final transcriptions representing numerical sequences of utterances in a written domain includes receiving audio data for an utterance containing a numeric sequence, and decoding, using a sequence-to-sequence speech recognition model, the audio data for the utterance to generate, as output from the sequence-to-sequence speech recognition model, an intermediate transcription of the utterance. The method also includes processing, using a neural corrector/denormer, the intermediate transcription to generate a final transcription that represents the numeric sequence of the utterance in a written domain. The neural corrector/denormer is trained on a set of training samples, where each training sample includes a speech recognition hypothesis for a training utterance and a ground-truth transcription of the training utterance. The ground-truth transcription of the training utterance is in the written domain.
Type: Grant
Filed: March 26, 2020
Date of Patent: June 21, 2022
Assignee: Google LLC
Inventors: Charles Caleb Peyser, Hao Zhang, Tara N. Sainath, Zelin Wu
-
Patent number: 11354512
Abstract: A dialog generation method includes: training a sequence to sequence (seq2seq)-based dialog model using a loss function including topic range constraint information; and generating a dialog using the trained dialog model. With the dialog generation method, topic range constraint information is introduced in the process of dialog model training using a loss function including the topic range constraint information, thus helping to prevent the trained model from producing low-quality meaningless replies.
Type: Grant
Filed: December 5, 2019
Date of Patent: June 7, 2022
Assignee: Advanced New Technologies Co., Ltd.
Inventors: Xiaofu Chang, Linlin Chao, Peng Xu, Xiaolong Li
-
Patent number: 11355105
Abstract: A home appliance including a first microphone that is disposed on a surface of a housing, a second microphone that is disposed on an inside of the housing, and a processor configured to perform signal processing for first voice data that is acquired from the first microphone, and perform voice recognition using the signal-processed first voice data. The processor is further configured to generate noise data using second voice data that is acquired from the second microphone and perform signal processing for the first voice data using the generated noise data.
Type: Grant
Filed: December 23, 2019
Date of Patent: June 7, 2022
Assignee: Samsung Electronics Co., Ltd.
Inventor: Nokhaeng Lee
-
Patent number: 11348571
Abstract: The present disclosure provides methods, computing devices, and storage media for generating a training corpus. The method includes: mining out pieces of data from user behavior logs associated with a target application, each piece of data including a first behavior log and a second behavior log, the first behavior log including a user speech and a corresponding speech recognition result, the second behavior log belonging to the same user as the first behavior log and time-dependent with the first behavior log; and determining the user speech and the corresponding speech recognition result in each piece of data as a positive feedback sample or a negative feedback sample, based on the first behavior log and the second behavior log.
Type: Grant
Filed: March 5, 2020
Date of Patent: May 31, 2022
Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
Inventors: Shiqiang Ding, Jizhou Huang, Zhongwei Jiang, Wentao Ma
-
Patent number: 11340923
Abstract: A plurality of messages are received and for each respective one of the received plurality of messages: a series of steps can be performed, including identifying language in the respective message, determining at least one heuristic, generating a first modified version of the respective message and generating at least one second modified version of the respective message. The second modified version(s) include language not in the respective message and not in the first modified version of the respective message. Further the at least one second modified version represents the determined at least one other heuristic. A device can be prompted to respond to a survey that includes the respective message(s), at least one first modified version, and/or at least one second modified version. A second survey can be generated of the respective message(s), modified version(s), and/or second modified version(s).
Type: Grant
Filed: May 27, 2020
Date of Patent: May 24, 2022
Assignee: NEWRISTICS LLC
Inventor: Gaurav Kapoor
-
Patent number: 11334716
Abstract: Embodiments relate to an intelligent computer platform to selectively amend one or more tokens in a document. A first document set is subjected to natural language processing (NLP) and a vector score is identified for two or more documents of the first document set. Upon receipt of a new document, the new document is subjected to NLP and a new document vector score is identified. The new document is analyzed against the first document set, and the identified vector score of the first document set is compared to the vector score of the new document. One or more tokens of the new document are amended responsive to the comparison, and a new document version is created from the selective amendment.
Type: Grant
Filed: September 9, 2019
Date of Patent: May 17, 2022
Assignee: International Business Machines Corporation
Inventors: Charles E. Beller, Christopher F. Ackermann, Kristen Maria Summers, David McQuenney, Rob High
-
Patent number: 11335350
Abstract: An apparatus includes processor(s) to: perform pre-processing operations including derive an audio noise level of speech audio of a speech data set, derive a first relative weighting for first and second segmentation techniques for identifying likely sentence pauses in the speech audio based on the audio noise level, and select likely sentence pauses for a converged set of likely sentence pauses from likely sentence pauses identified by the first and/or second segmentation techniques based on the first relative weighting; and perform speech-to-text processing operations including divide the speech data set into data segments representing speech segments of the speech audio based on the converged set of likely sentence pauses, and derive a second relative weighting based on the audio noise level for selecting words indicated by an acoustic model or by a language model as being most likely spoken in the speech audio for inclusion in a transcript.
Type: Grant
Filed: October 12, 2021
Date of Patent: May 17, 2022
Assignee: SAS INSTITUTE INC.
Inventors: Xiaolong Li, Xiaozhuo Cheng, Xu Yang
-
Patent number: 11328130
Abstract: The present disclosure is directed to systems, methods and devices for providing real-time translation for group communications. A speech input may be received from a first group communication device associated with a first language. One or more groups to distribute the speech input may be determined, wherein each of the one or more groups comprises at least one group communication device associated with a language that is different than the first language. The received speech input may be translated into a corresponding language for each of the one or more groups, and the translated speech may be sent to each group communication device of the one or more groups in a language corresponding to each of the one or more groups.
Type: Grant
Filed: November 6, 2018
Date of Patent: May 10, 2022
Assignee: Orion Labs, Inc.
Inventors: Justin Black, Gregory Albrecht, Dan Phung
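The grouping step in the abstract for 11328130 can be sketched roughly as follows. This is a simplified illustration working on text rather than audio; the `distribute` function, the device-to-language mapping, and the `translate` callable are hypothetical stand-ins, not an API named by the patent.

```python
from collections import defaultdict

def distribute(speech_text, source_lang, devices, translate):
    """Translate once per target language and fan the result out to each device.

    devices:   dict mapping device id -> language code (assumed representation).
    translate: caller-supplied callable (text, src, dst) -> str (hypothetical).
    Returns a dict mapping device id -> translated text.
    """
    # Group devices whose language differs from the speaker's language.
    groups = defaultdict(list)
    for device_id, lang in devices.items():
        if lang != source_lang:
            groups[lang].append(device_id)

    # One translation per group, delivered to every member of that group.
    deliveries = {}
    for lang, members in groups.items():
        translated = translate(speech_text, source_lang, lang)
        for device_id in members:
            deliveries[device_id] = translated
    return deliveries
```

Grouping by language first means the translation step runs once per language rather than once per device.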
-
Patent number: 11322149
Abstract: Disclosed herein are an artificial intelligence apparatus for generating recipe information including a learning processor configured to generate recipe text including at least one of cooking ingredient information or description text of cooking from cooking content, by providing the cooking content to a recipe text generation model, and a processor configured to generate recipe information of the cooking based on the recipe text, and a method of operating the same.
Type: Grant
Filed: January 23, 2020
Date of Patent: May 3, 2022
Assignee: LG ELECTRONICS INC.
Inventor: Hyeongjin Kim
-
Patent number: 11322145
Abstract: A voice processing device includes a voice receiver that receives a voice, an imager, an image acquirer that acquires a captured image captured by the imager, an utterer identifier that identifies an utterer based on the voice received by the voice receiver and the captured image acquired by the image acquirer, a voice determiner that determines whether the voice is a specific word based on the voice received by the voice receiver and an image of the utterer identified by the utterer identifier, the image being included in the captured image, and a voice transmitter that switches a transmission destination of the voice received by the voice receiver based on a determination result by the voice determiner.
Type: Grant
Filed: February 26, 2020
Date of Patent: May 3, 2022
Assignee: SHARP KABUSHIKI KAISHA
Inventors: Keiko Hirukawa, Satoshi Terada
-
Patent number: 11314942
Abstract: A computer-implemented method for providing agent assisted transcriptions of user utterances. A user utterance is received in response to a prompt provided to the user at a remote client device. An automatic transcription is generated from the utterance using a language model based upon an application or context, and presented to a human agent. The agent reviews the transcription and may replace at least a portion of the transcription with a corrected transcription. As the agent inputs the corrected transcription, accelerants are presented to the user comprising suggested text to be inputted. The accelerants may be determined based upon an agent input, an application or context of the transcription, the portion of the transcription being replaced, or any combination thereof. In some cases, the user provides textual input, to which the agent transcribes an intent associated with the input with the aid of one or more accelerants.
Type: Grant
Filed: March 20, 2020
Date of Patent: April 26, 2022
Assignee: Interactions LLC
Inventors: Ethan Selfridge, Michael Johnston, Robert Lifgren, James Dreher, John Leonard
-
Patent number: 11308946
Abstract: Methods and an apparatus for performing feature extraction on speech in a microphone signal with embedded noise processing to reduce the amount of processing are provided. In embodiments, feature extraction and the noise estimate use an output of the same Fourier Transform, such that the noise filtering of the speech is embedded with the feature extraction of the speech.
Type: Grant
Filed: July 25, 2019
Date of Patent: April 19, 2022
Assignee: Cerence Operating Company
Inventors: Jianzhong Teng, Xiao-Lin Ren, Xingui Zeng, Yi Gao
-
Patent number: 11301625
Abstract: A simultaneous interpretation system using a translation unit bilingual corpus includes a microphone configured to receive an utterance of a user, a memory in which a program for recognizing the utterance of the user and generating a translation result is stored, and a processor configured to execute the program stored in the memory, wherein the processor executes the program so as to convert the received utterance of the user into text, store the text in a speech recognition buffer, perform translation unit recognition with respect to the text on the basis of a learning model for translation unit recognition, and in response to the translation unit recognition being completed, generate a translation result corresponding to the translation unit on the basis of a translation model for translation performance.
Type: Grant
Filed: November 13, 2019
Date of Patent: April 12, 2022
Assignee: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
Inventors: Yoon Hyung Roh, Jong Hun Shin, Young Ae Seo
-
Patent number: 11302332
Abstract: A method for providing a natural language conversation, which is implemented by an interactive agent system, may include receiving a natural language input, determining a user intent based on the natural language input, and providing a natural language response corresponding to the natural language input, based on the natural language input and/or the determined user intent, which is associated with execution of a specific task, provision of specific information, and/or a simple statement. The provision of the natural language response includes determining whether a first condition is satisfied based on whether it is possible to obtain all sufficient information from the natural language input, without having to request additional information, and when the first condition is satisfied, determining whether a second condition is satisfied and providing a natural language response belonging to a category of substantial replies when the second condition is satisfied.
Type: Grant
Filed: October 30, 2018
Date of Patent: April 12, 2022
Assignee: DEEPBRAIN AI INC.
Inventors: Seyoung Jang, Dosang Yoon, Jaeho Seol
-
Patent number: 11302346
Abstract: A computer based system and method for automatically detecting frustration in an interaction, may include: identifying in the interaction using a set of linguistic rules, natural language patterns related to frustration, wherein the linguistic rules further define weights associated with the natural language patterns and rule metadata; reviewing the rule metadata associated with the identified natural language patterns to identify override attributes, wherein if the rule metadata does not include override attributes, then a frustration level in the interaction is determined based on the identified natural language patterns and weights associated with the identified natural language patterns; and if the rule metadata includes override attributes then the frustration level is determined based on the identified override attributes.
Type: Grant
Filed: January 30, 2020
Date of Patent: April 12, 2022
Assignee: NICE LTD.
Inventors: Jessica Perri, Amelie Stephan, Julia Laski, Mark Schmelzenbach, Sara Olson, Shaun Matthews
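The two-branch decision in the abstract for 11302346 (weighted patterns unless an override attribute is present) can be illustrated with a small sketch. The `Rule` class, the regular-expression patterns, the score threshold, and the level labels are all assumptions made for illustration; the abstract does not specify how rules are represented or how weights are combined.

```python
import re
from dataclasses import dataclass
from typing import Optional

@dataclass
class Rule:
    pattern: str                          # regex standing in for a natural language pattern
    weight: float                         # weight associated with the pattern
    override_level: Optional[str] = None  # rule metadata: override attribute, if any

def frustration_level(text, rules, threshold=1.0):
    """Return a frustration label for one interaction (illustrative logic only)."""
    matched = [r for r in rules if re.search(r.pattern, text, re.IGNORECASE)]

    # If any matched rule carries an override attribute, that attribute decides.
    overrides = [r.override_level for r in matched if r.override_level is not None]
    if overrides:
        return overrides[0]

    # Otherwise fall back to the summed weights of the matched patterns.
    score = sum(r.weight for r in matched)
    return "frustrated" if score >= threshold else "not frustrated"
```

Summing weights against a fixed threshold is the simplest possible aggregation and is chosen here only to keep the example short.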
-
Patent number: 11295750
Abstract: An apparatus for encoding an audio input signal to obtain an encoded audio signal is provided. The apparatus comprises a transformation module configured to transform the audio input signal from an original domain to a transform domain to obtain a transformed audio signal. Moreover, the apparatus comprises an encoding module, configured to quantize the transformed audio signal to obtain a quantized signal, and configured to encode the quantized signal to obtain the encoded audio signal. The transformation module is configured to transform the audio input signal depending on a plurality of predefined power values of quantization noise in the original domain.
Type: Grant
Filed: October 25, 2018
Date of Patent: April 5, 2022
Assignee: FRAUNHOFER-GESELLSCHAFT ZUR FÖRDERUNG DER ANGEWANDTEN FORSCHUNG E.V.
Inventors: Johannes Fischer, Tom Bäckström
-
Patent number: 11295088
Abstract: Systems and processes for modifying word predictions are provided. In one example, a user input is received including one or more words. A prediction of a word sequence corresponding to one or more words is obtained, and context information associated with the word sequence is obtained. In accordance with a determination, based on the context information, that the prediction of the word sequence corresponds to a predetermined semantic reference, the prediction of the word sequence is modified, and an output is provided corresponding to the modified prediction of the word sequence. In accordance with a determination, based on the context information, that the prediction of the word sequence does not correspond to a predetermined semantic reference, an output is provided corresponding to the prediction of the word sequence.
Type: Grant
Filed: November 20, 2019
Date of Patent: April 5, 2022
Assignee: Apple Inc.
Inventor: Jerome R. Bellegarda
-
Patent number: 11295733
Abstract: A dialogue system includes: a speech recognizer configured to generate an input sentence by converting a speech of a user into a text; a dialogue manager configured to generate a meaning representation for the input sentence; and a result processor configured to generate a plurality of output sentences corresponding to the meaning representation. The dialogue manager generates a meaning representation for each of the plurality of output sentences. The result processor generates a system response based on the meaning representation for the input sentence and the meaning representation for each of the plurality of output sentences.
Type: Grant
Filed: February 12, 2020
Date of Patent: April 5, 2022
Assignees: HYUNDAI MOTOR COMPANY, KIA MOTORS CORPORATION
Inventors: Youngmin Park, Seona Kim, Jeong-Eom Lee
-
Patent number: 11289103
Abstract: Systems and methods for providing forward error correction for a multi-channel audio signal are described. Blocks of an audio stream are buffered into a frame. A transformation can be applied that compacts the energy of each block into a plurality of transformed channels. The energy compaction transform may compact the most energy of a block into the first transformed channel and to compact decreasing amounts of energy into each subsequent transformed channel. The transformed frame may be encoded using any suitable codec and transmitted in a packet over a network. Improved forward error correction may be provided by attaching a low bit rate encoding of the first transformed channel to a subsequent packet. To reconstruct a lost packet, the low bit rate encoding of the first channel for the lost packet may be combined with a packet loss concealment version of the other channels, constructed from a previously-received packet.
Type: Grant
Filed: July 14, 2020
Date of Patent: March 29, 2022
Assignee: Dolby Laboratories Licensing Corporation
Inventors: Shen Huang, Michael Eckert, Glenn N. Dickins
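The packet-level redundancy described in the abstract for 11289103 can be sketched in a few lines. The sketch assumes frames have already been passed through the energy compaction transform, so channel 0 carries the most energy. Coarse quantization stands in for the low bit rate encoding and simple repetition stands in for packet loss concealment; both substitutions, and all function names, are assumptions for illustration.

```python
def low_rate(channel, step=4):
    """Stand-in for a low bit rate encoding: quantize samples to multiples of `step`."""
    return [step * round(x / step) for x in channel]

def make_packets(frames):
    """Packet n carries frame n plus a low-rate copy of frame n-1's first channel."""
    packets = []
    for n, frame in enumerate(frames):
        fec = low_rate(frames[n - 1][0]) if n > 0 else None
        packets.append({"seq": n, "frame": frame, "fec": fec})
    return packets

def recover(lost_seq, received):
    """Rebuild a lost frame from its neighbours, or return None if that is impossible.

    received: dict mapping sequence number -> packet for packets that arrived.
    """
    prev_pkt = received.get(lost_seq - 1)
    next_pkt = received.get(lost_seq + 1)
    if prev_pkt is None or next_pkt is None or next_pkt["fec"] is None:
        return None
    first_channel = next_pkt["fec"]        # low-rate copy of the lost first channel
    concealed = prev_pkt["frame"][1:]      # repeat the previous frame's other channels
    return [first_channel] + concealed
```

Only the first channel is protected, which keeps the redundancy overhead small while still restoring the channel that holds most of the signal energy.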