Patents Examined by Olujimi A Adesanya
-
Patent number: 11962990
Abstract: In general, disclosed is a device that includes one or more processors, coupled to the memory, configured to perform an energy analysis with respect to one or more audio objects, in the ambisonics domain, in the first time segment. The one or more processors are also configured to perform a similarity measure between the one or more audio objects, in the ambisonics domain, in the first time segment, and the one or more audio objects, in the ambisonics domain, in the second time segment. In addition, the one or more processors are configured to perform a reorder of the one or more audio objects, in the ambisonics domain, in the first time segment with the one or more audio objects, in the ambisonics domain, in the second time segment, to generate one or more reordered audio objects in the first time segment.
Type: Grant
Filed: October 11, 2021
Date of Patent: April 16, 2024
Assignee: QUALCOMM Incorporated
Inventors: Dipanjan Sen, Sang-Uk Ryu
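The reordering step in the abstract above can be sketched as a greedy alignment: each object slot from the earlier segment is matched with the most similar remaining object in the later segment. This is a minimal illustration assuming a generic pairwise `similarity` callable; the patent's energy analysis and ambisonics representation are not modeled.

```python
def reorder(prev_segment, cur_segment, similarity):
    """Greedily permute current-segment objects so position i holds the
    remaining object most similar to prev_segment[i]. Illustrative only."""
    remaining = list(range(len(cur_segment)))
    order = []
    for prev_obj in prev_segment:
        # pick the not-yet-assigned current object most similar to prev_obj
        best = max(remaining, key=lambda i: similarity(prev_obj, cur_segment[i]))
        remaining.remove(best)
        order.append(cur_segment[best])
    return order
```

With a toy similarity (negative absolute difference on scalars), objects swap back into the order of the previous segment.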
-
Patent number: 11948554
Abstract: The acoustic feature extraction means 82 extracts an acoustic feature, using predetermined parameters, from an acoustic pattern obtained as a result of processing on an acoustic signal. The language vector calculation means 83 calculates a language vector from a given label that represents an attribute of a source of the acoustic signal and that is associated with the acoustic pattern. The similarity calculation means 84 calculates a similarity between the acoustic feature and the language vector. The parameter update means 85 learns parameters so that the similarity becomes larger, and updates the predetermined parameters to the parameters obtained by learning.
Type: Grant
Filed: September 20, 2018
Date of Patent: April 2, 2024
Assignee: NEC CORPORATION
Inventors: Tatsuya Komatsu, Reishi Kondo, Sakiko Mishima
-
Patent number: 11935550
Abstract: A system comprising an audio compression device is provided. The audio compression device receives a plurality of pulse code modulated (PCM) samples, performs a quantization of the plurality of the PCM samples, and determines a plurality of time windows for the plurality of quantized PCM samples. The audio compression device further determines a first number of delta bits and a first sample count for a first time window of the plurality of time windows, where the first time window includes a first group of time-domain consecutive samples according to the first sample count. The audio compression device encodes each sample in the first group into the first number of delta bits based on a difference from the previous sample to generate a first stream of delta bits, and encodes the first sample count and the first number of delta bits in a corresponding first header for the first time window.
Type: Grant
Filed: March 31, 2023
Date of Patent: March 19, 2024
Assignee: The ADT Security Corporation
Inventor: Andrew P. Weir
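The delta encoding described above can be illustrated with a minimal sketch: each quantized sample is stored as the signed difference from its predecessor, clamped to the bit width chosen for the window. The function names and the clamping behavior are assumptions for illustration, not the patented bitstream format (window headers and bit packing are omitted).

```python
def delta_encode(samples, delta_bits):
    """Encode samples as fixed-width signed deltas from the previous value,
    clamped to the range representable in `delta_bits` bits."""
    lo = -(1 << (delta_bits - 1))
    hi = (1 << (delta_bits - 1)) - 1
    prev = 0
    deltas = []
    for s in samples:
        d = max(lo, min(hi, s - prev))
        prev += d  # track the value the decoder will reconstruct
        deltas.append(d)
    return deltas

def delta_decode(deltas):
    """Reconstruct samples by accumulating deltas from zero."""
    out, prev = [], 0
    for d in deltas:
        prev += d
        out.append(prev)
    return out
```

When every successive difference fits in the chosen bit width, the round trip is lossless; larger jumps are clamped and the decoder lags behind until later deltas catch up.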
-
Patent number: 11907863
Abstract: Embodiments are provided that relate to a computer system, a computer program product, and a computer-implemented method for improving performance of a dialog system employing an automated virtual dialog agent. Embodiments involve utilizing an automated virtual agent to receive a natural language request and generate a corresponding response, automatically identifying and resolving a corresponding knowledge gap between the request and response, and refining the automated virtual agent with the resolved knowledge gap.
Type: Grant
Filed: July 24, 2020
Date of Patent: February 20, 2024
Assignee: International Business Machines Corporation
Inventors: Daniela Rosu, Ruchi Mahindru
-
Patent number: 11899519
Abstract: Systems, methods, and devices with reduced power consumption in network microphone devices. In one embodiment, a network microphone device is configured to perform a method that includes (i) capturing audio content; (ii) using a first algorithm to perform a keyword detection process for determining whether the audio content includes a keyword; (iii) responsive to determining that the audio content includes the keyword, using a second, more computationally intensive algorithm to perform a wake-word detection process for determining whether the audio content includes a wake word; and (iv) responsive to performing the wake-word detection process, (a) causing a voice service corresponding to the wake word to process the audio content if the wake-word detection process confirms that the audio content includes the wake word or (b) ceasing performance of the wake-word detection process if the wake-word detection process disconfirms that the audio content includes the wake word.
Type: Grant
Filed: October 23, 2018
Date of Patent: February 13, 2024
Assignee: Sonos, Inc.
Inventors: Nick D'Amato, Daniele Giacobello, Joachim Fainberg, Klaus Hartung
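The power-saving idea in steps (i)-(iv) above is a two-stage gate: a cheap detector runs continuously, and only a positive result triggers the expensive wake-word model. A minimal control-flow sketch, where all four callables are illustrative stand-ins rather than the actual Sonos pipeline:

```python
def process_audio(audio, keyword_check, wake_word_check, voice_service):
    """Two-stage gating: a lightweight keyword detector runs first; only on
    a hit does the more computationally intensive wake-word model run."""
    if not keyword_check(audio):      # stage 1: cheap algorithm rejects
        return "idle"
    if wake_word_check(audio):        # stage 2: expensive confirmation
        voice_service(audio)          # confirmed: hand off to voice service
        return "forwarded"
    return "dismissed"                # stage 2 disconfirmed the stage-1 hit
```

Most audio never reaches the second stage, which is where the claimed power reduction comes from.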
-
Patent number: 11900070
Abstract: A computer-implemented method according to one embodiment includes receiving, at a deep neural network (DNN), a plurality of sentences each having an associated label; training the DNN, utilizing the plurality of sentences and associated labels; and producing a linguistic expression (LE) utilizing the trained DNN.
Type: Grant
Filed: February 3, 2020
Date of Patent: February 13, 2024
Assignee: International Business Machines Corporation
Inventors: Prithviraj Sen, Siddhartha Brahma, Yunyao Li, Laura Chiticariu, Rajasekar Krishnamurthy, Shivakumar Vaithyanathan, Marina Danilevsky Hailpern
-
Patent number: 11899699
Abstract: This application discloses a keyword generating method, an apparatus, a device and a storage medium, which relate to the field of natural language processing in the field of artificial intelligence. A specific implementation scheme includes: inputting a target text into a text processing model, obtaining a word sequence corresponding to the target text, and generating a semantic representation sequence corresponding to the word sequence; making a prediction about each semantic representation vector in the semantic representation sequence respectively to obtain a prediction result; and if the prediction result indicates that a word corresponding to the semantic representation vector is capable of triggering generation of a keyword, outputting the keyword based on the semantic representation vector and the prediction result. This method improves the accuracy of generating keywords.
Type: Grant
Filed: June 14, 2021
Date of Patent: February 13, 2024
Assignee: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
Inventors: Kaichun Yao, Chuan Qin, Hengshu Zhu, Chao Ma, Jingshuai Zhang
-
Patent number: 11893991
Abstract: Systems and methods for e-commerce systems using natural language understanding are described. A computing device is configured to receive a user utterance including at least one identified semantic component and at least one missing semantic component and generate a context stack including a set of context entries. Each of the context entries includes a root intent element, an entity list element, and a dialogue stack, and each context entry in the set of context entries is associated with one of a user utterance or a system utterance. The computing device is further configured to review at least one context entry in the set of context entries to locate the at least one missing semantic element within the dialogue stack and generate an intent flow execution request including the at least one semantic element from the first speech data and the missing semantic element.
Type: Grant
Filed: June 24, 2022
Date of Patent: February 6, 2024
Assignee: Walmart Apollo, LLC
Inventors: Snehasish Mukherjee, Shankara Bhargava Subramanya
-
Patent number: 11881205
Abstract: The present disclosure relates to a speech synthesis method and device, and a computer-readable storage medium, and relates to the field of computer technology. The method of the present disclosure includes: dividing a text into a plurality of segments according to a language category to which each of the segments belongs; converting each of the segments into a phoneme corresponding to the segment to generate a phoneme sequence of the text according to the language category to which each of the segments belongs; inputting the phoneme sequence into a speech synthesis model trained in advance and converting the phoneme sequence into a vocoder characteristic parameter; and inputting the vocoder characteristic parameter into a vocoder to generate a speech.
Type: Grant
Filed: March 30, 2020
Date of Patent: January 23, 2024
Assignees: BEIJING JINGDONG SHANGKE INFORMATION TECHNOLOGY CO., LTD., BEIJING JINGDONG CENTURY TRADING CO., LTD.
Inventors: Zhizheng Wu, Zhengchen Zhang, Wei Song, Yonghui Rao, Zhihang Xie, Guanghui Xu, Shuyong Liu, Bosen Ma, Shuangwen Qiu, Junmin Lin
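The first two steps above, segmenting mixed-language text and converting each segment per its language, can be sketched as follows. The crude script test and the `lexicons` lookup tables are illustrative assumptions; a real system would use trained grapheme-to-phoneme models rather than dictionaries.

```python
def text_to_phonemes(text, lexicons):
    """Split text into same-language runs by script, then map each run to
    phonemes via a per-language lexicon. Simplified sketch only."""
    def lang(ch):
        # naive script test: CJK Unified Ideographs vs. everything else
        return "zh" if "\u4e00" <= ch <= "\u9fff" else "en"
    phonemes, run, cur = [], "", None
    for ch in text:
        l = lang(ch)
        if cur is None or l == cur:
            run += ch
            cur = l
        else:
            # language boundary: flush the finished run through its lexicon
            phonemes += lexicons[cur].get(run, [])
            run, cur = ch, l
    if run:
        phonemes += lexicons[cur].get(run, [])
    return phonemes
```

The resulting phoneme sequence would then feed the speech synthesis model and vocoder described in the abstract.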
-
Patent number: 11875116
Abstract: A method including inputting, into a phrase recognition model comprising a neural network, a vector comprising a plurality of ngrams of text. The method also includes applying, using the phrase recognition model, a filter to the plurality of ngrams during execution. The filter has a skip word setting of at least one. The method also includes determining, based on the skip word setting, at least one ngram in the vector to be skipped to form at least one skip word. The method also includes outputting an intermediate score for a set of ngrams that match the filter. The method also includes calculating a scalar number representing a semantic meaning of the at least one skip word. The method also includes generating, based on the scalar number and the intermediate score, a final score for the set of ngrams. A computer action is performed using the final score.
Type: Grant
Filed: December 20, 2019
Date of Patent: January 16, 2024
Assignee: Intuit Inc.
Inventors: Oren Sar Shalom, Alexander Zhicharevich, Adi Shalev, Yehezkel Shraga Resheff
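The skip-word matching at the heart of the abstract above can be illustrated with a small sketch: a pattern matches even when up to `skip` words intervene between its elements, and the intervening words are captured for later scoring. The function is an assumption for illustration; the neural scoring of skip words and the intermediate/final score combination are omitted.

```python
def skip_filter(tokens, pattern, skip=1):
    """Return the list of words skipped while matching `pattern` in `tokens`
    with at most `skip` intervening words between consecutive pattern
    elements, or None if there is no match. Illustrative sketch only."""
    for start in range(len(tokens)):
        pos, skipped = start, []
        for j, p in enumerate(pattern):
            # allow up to `skip` non-matching tokens before each later element
            budget = skip if j > 0 else 0
            while pos < len(tokens) and tokens[pos] != p and budget > 0:
                skipped.append(tokens[pos])
                pos += 1
                budget -= 1
            if pos >= len(tokens) or tokens[pos] != p:
                break
            pos += 1
        else:
            return skipped
    return None
```

In the patented method, the returned skip words would be reduced to a scalar semantic score and combined with the filter's intermediate score.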
-
Patent number: 11868453
Abstract: In one embodiment, certain words or phrases spoken by customers during calls to a call center are used to identify or authenticate the user. Words or phrases such as a customer's name, or an account number or telephone number, are selected for a customer. Recordings of the selected words or phrases spoken by the customer during previous calls are used to generate voiceprints that are stored and associated with the customer. Later, when the customer calls the call center, instances of the customer speaking the selected words are extracted from the call (referred to herein as "audio-of-interest") and are compared against the voiceprints stored for the customer. If the voiceprints match the audio-of-interest, the customer is authenticated.
Type: Grant
Filed: November 7, 2019
Date of Patent: January 9, 2024
Assignee: Verint Americas Inc.
Inventors: Marc Adam Calahan, Timothy Andrew Furnell, Jamie Richard Williams
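The final comparison step can be sketched as an embedding similarity check: the audio-of-interest is embedded, compared against each stored voiceprint, and the caller is authenticated when the best match clears a threshold. The embeddings, the cosine metric, and the 0.8 threshold are all illustrative assumptions, not the Verint implementation.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def authenticate(audio_embedding, stored_voiceprints, threshold=0.8):
    """Authenticate when the best similarity between the audio-of-interest
    embedding and any stored voiceprint meets the threshold."""
    best = max(cosine(audio_embedding, vp) for vp in stored_voiceprints)
    return best >= threshold
```

Keeping several voiceprints per customer (one per selected word or phrase) and taking the best match makes the check robust to which phrase the caller happens to speak.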
-
Patent number: 11861477
Abstract: A device receives document information associated with a document, and receives a request to identify insights in the document information. The device performs, based on the request, natural language processing on the document information to identify words, phrases, and sentences in the document information, and utilizes a first machine learning model with the words, the phrases, and the sentences to identify information indicating abstract insights, concrete insights, and non-insights in the document. The device utilizes a second machine learning model to match the abstract insights with particular concrete insights that are different than the concrete insights, and utilizes a third machine learning model to determine particular insights based on the non-insights. The device generates an insight document that includes the concrete insights, the abstract insights matched with the particular concrete insights, and the particular insights determined based on the non-insights.
Type: Grant
Filed: January 6, 2022
Date of Patent: January 2, 2024
Assignee: Capital One Services, LLC
Inventor: Joni Bridget Jezewski
-
Patent number: 11854562
Abstract: A method (and structure and computer product) to permit zero-shot voice conversion with non-parallel data includes receiving source speaker speech data as input data into a content encoder of a style transfer autoencoder system, the content encoder providing a source speaker disentanglement of the source speaker speech data by reducing speaker style information of the input source speech data while retaining content information, and receiving target speaker input speech as input data into a target speaker encoder. The outputs of the content encoder and the target speaker encoder are combined in a decoder of the style transfer autoencoder, and the output of the decoder provides the content information of the input source speech data in a style of the target speaker speech information.
Type: Grant
Filed: May 14, 2019
Date of Patent: December 26, 2023
Assignee: International Business Machines Corporation
Inventors: Yang Zhang, Shiyu Chang
-
Patent number: 11854575
Abstract: A microphone acquires audio data of a user's speech. The audio data is processed to determine sentiment data indicative of perceived emotional content of the speech. For example, the sentiment data may include values for one or more of valence that is based on a particular change in pitch over time or activation that is based on speech pace. A simplified user interface provides the user with a graphical representation based on the sentiment data and associated descriptors. For example, a visual indicator such as a dot may be shown at particular coordinates onscreen that correspond to the sentiment data. Text descriptors may also be presented near the dot. As the user continues speaking, and new audio data is processed, the interface is dynamically updated. The user may use this information to assess their state of mind, facilitate interactions with others, and so forth.
Type: Grant
Filed: September 30, 2019
Date of Patent: December 26, 2023
Assignee: AMAZON TECHNOLOGIES, INC.
Inventors: Alexander Jonathan Pinkus, Hanhan Wang, Samuel Elbert McGowan, Bilyana Slavova, Narendra Gyanchandani, David Cole
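The dot-at-coordinates display described above amounts to mapping the two sentiment scores onto screen axes. A minimal sketch, assuming valence and activation are normalized to [-1, 1] and a 400x400 pixel canvas (both ranges are illustrative assumptions, not taken from the patent):

```python
def sentiment_to_screen(valence, activation, width=400, height=400):
    """Map valence to the x axis and activation to the y axis, returning
    pixel coordinates for the dot indicator. Screen y grows downward, so
    high activation lands near the top of the canvas."""
    x = int((valence + 1) / 2 * (width - 1))
    y = int((1 - (activation + 1) / 2) * (height - 1))
    return x, y
```

As new audio is processed, recomputing the coordinates and redrawing the dot yields the dynamically updating interface the abstract describes.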
-
Patent number: 11847413
Abstract: Techniques are disclosed for building a dictionary of words from combinations of symbols generated based on input data. A neuro-linguistic behavior recognition system includes a neuro-linguistic module that generates a linguistic model that describes data input from a source (e.g., video data, SCADA data, etc.). To generate words for the linguistic model, a lexical analyzer component in the neuro-linguistic module receives a stream of symbols, each symbol generated based on an ordered stream of normalized vectors generated from input data. The lexical analyzer component determines words from combinations of the symbols based on a hierarchical learning model having one or more levels. Each level indicates a length of the words to be identified at that level. Statistics are evaluated for the words identified at each level. The lexical analyzer component identifies one or more of the words having statistical significance.
Type: Grant
Filed: May 24, 2021
Date of Patent: December 19, 2023
Assignee: Intellective Ai, Inc.
Inventors: Gang Xu, Ming-Jung Seow, Tao Yang, Wesley Kenneth Cobb
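The hierarchical word building above can be sketched by counting symbol combinations level by level, keeping only those that recur. The frequency cutoff `min_count` is a crude stand-in for the statistical-significance test the abstract mentions; all names are illustrative.

```python
from collections import Counter

def build_dictionary(symbols, max_len=3, min_count=2):
    """For each level L from 1 to max_len, count contiguous symbol
    combinations of length L in the stream and keep those occurring at
    least min_count times. Returns {level: {word_tuple: count}}."""
    words = {}
    for length in range(1, max_len + 1):
        counts = Counter(tuple(symbols[i:i + length])
                         for i in range(len(symbols) - length + 1))
        words[length] = {w: c for w, c in counts.items() if c >= min_count}
    return words
```

On the symbol stream `a b a b c`, level 1 keeps `a` and `b`, and level 2 keeps only the recurring combination `(a, b)`.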
-
Patent number: 11823673
Abstract: An electronic device is provided. The electronic device includes a microphone, a communication circuitry, an indicator configured to provide at least one visual indication, a processor configured to be electrically connected with the microphone, the communication circuitry, and the indicator, and a memory. The memory stores instructions that, when executed, cause the processor to receive a first voice input through the microphone, perform a first voice recognition for the first voice input, if a first specified word for waking up the electronic device is included in a result of the first voice recognition, display a first visual indication through the indicator, receive a second voice input through the microphone, perform a second voice recognition for the second voice input, and if a second specified word corresponding to the first visual indication is included in a result of the second voice recognition, wake up the electronic device.
Type: Grant
Filed: November 9, 2020
Date of Patent: November 21, 2023
Assignee: Samsung Electronics Co., Ltd.
Inventors: No Joon Park, Hyo Jung Lee, Jun Hyung Park, Tae Hee Lee, Geon Soo Kim, Han Jib Kim, Yong Joon Jeon
-
Patent number: 11816136
Abstract: For a passage text and a corresponding answer text, perform a word-level soft alignment to obtain contextualized passage embeddings and contextualized answer embeddings, and a hidden-level soft alignment on the contextualized passage embeddings and the contextualized answer embeddings to obtain a passage embedding matrix. Construct a passage graph of the passage text based on the passage embedding matrix, and apply a bidirectional gated graph neural network to the passage graph until a final state embedding is determined, during which intermediate node embeddings are fused from both incoming and outgoing edges. Obtain a graph-level embedding from the final state embedding, and decode the final state embedding to generate an output sequence word-by-word. Train a machine learning model to generate at least one question corresponding to the passage text and the answer text, by evaluating the output sequence with a hybrid evaluator combining cross-entropy evaluation and reinforcement learning evaluation.
Type: Grant
Filed: October 23, 2022
Date of Patent: November 14, 2023
Assignees: International Business Machines Corporation, RENSSELAER POLYTECHNIC INSTITUTE
Inventors: Lingfei Wu, Yu Chen, Mohammed J. Zaki
-
Patent number: 11797610
Abstract: A natural language interfacing system may use a knowledge acquisition tool to obtain structured representations from user input text. The system may initiate interaction with a request for input and a partial statement with blank text slots labeled by field types. The system may receive input text to fill in a slot of the partial statement and perform semantic parsing on the input text to identify a trigger concept. The system may generate a list of templates defining different semantic frames for the trigger concept. A generated template may include additional generated slots and/or suggested slot-fillers to guide user input. In response to a template selection, the partial statement includes the trigger concept annotated with a semantic frame. This process is repeated by iteratively updating the list of templates until the statement is completed. The statement is mapped to a structured representation including semantic frames.
Type: Grant
Filed: September 15, 2020
Date of Patent: October 24, 2023
Assignee: Elemental Cognition Inc.
Inventors: David Ferrucci, Clifton James McFate, Aditya Kalyanpur, Andrea Bradshaw, David Melville
-
Patent number: 11790890
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for a user device to learn offline voice actions. In one aspect, the method includes actions of detecting, by the user device, an utterance at a first time when the user device is connected to a server by a network, providing, by the user device, the utterance to the server using the network, receiving, by the user device and from the server, an update to the grammar of the user device, detecting, by the user device, a subsequent utterance of the utterance at a second time when the user device is not connected to the server by a network, and in response to detecting, by the user device, the subsequent utterance of the utterance at the second time, identifying, by the user device, an operation to perform based on (i) the subsequent utterance, and (ii) the updated grammar.
Type: Grant
Filed: November 8, 2021
Date of Patent: October 17, 2023
Assignee: GOOGLE LLC
Inventors: Vikram Aggarwal, Moises Morgenstern Gali
-
Patent number: 11769018
Abstract: Methods and systems for attention behavioral analysis for a conversational question and answer system are disclosed. A multi-modality input is selected from a plurality of multimodality conversations among two or more users. The system annotates the first modality inputs, and at least one attention region in the first modality input corresponding to a set of entities and semantic relationships in a unified modality is identified by a discrete aspect of information bounded by the attention elements. The system models the representations of the multimodality inputs at different levels of granularity, which include entity level, turn level, and conversational level. The proposed method uses a network that consists of a multilevel encoder-decoder architecture that is used to determine unified focalized attention, and to analyze and construct one or more responses for one or more turns in a conversation.
Type: Grant
Filed: November 24, 2020
Date of Patent: September 26, 2023
Assignee: Openstream Inc.
Inventor: Rajasekhar Tumuluri