Specialized Equations Or Comparisons Patents (Class 704/236)
-
Patent number: 12094455
Abstract: Systems and methods for selectively ignoring an occurrence of a wakeword within audio input data are provided herein. In some embodiments, a wakeword may be detected to have been uttered by an individual within a modified time window, which may account for hardware delays and echoing offsets. The detected wakeword that occurs during this modified time window may, in some embodiments, correspond to a word included within audio that is outputted by a voice activated electronic device. This may cause the voice activated electronic device to activate itself, stopping the audio from being outputted. By identifying when these occurrences of the wakeword within outputted audio are going to happen, the voice activated electronic device may selectively determine when to ignore the wakeword, and furthermore, when not to ignore the wakeword.
Type: Grant
Filed: September 6, 2023
Date of Patent: September 17, 2024
Assignee: Amazon Technologies, Inc.
Inventors: James David Meyers, Kurt Wesley Piersol
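As a rough illustration of the windowing idea in this abstract, the Python sketch below shifts an expected wakeword interval by a hardware delay, widens it by an echo offset, and ignores detections that fall inside the result. The names `Window`, `modified_window`, and `should_ignore`, and the way the two offsets are applied, are assumptions made for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass


@dataclass
class Window:
    start: float  # seconds from the start of audio output
    end: float


def modified_window(expected_start, expected_end, hardware_delay, echo_offset):
    """Shift the expected interval by the hardware delay and extend its end
    by the echo offset (an assumed way of combining the two corrections)."""
    return Window(expected_start + hardware_delay,
                  expected_end + hardware_delay + echo_offset)


def should_ignore(detection_time, window):
    """Ignore a wakeword detection that falls inside the modified window."""
    return window.start <= detection_time <= window.end


window = modified_window(10.0, 10.5, hardware_delay=0.5, echo_offset=0.25)
print(should_ignore(11.0, window))  # detection inside the window
print(should_ignore(10.2, window))  # detection before the window opens
```

A detection outside the window is treated as a genuine utterance by the user, which matches the abstract's point about also deciding when not to ignore the wakeword.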
-
Patent number: 12067979
Abstract: A computer-implemented method includes generating an empirically derived acoustic confusability measure by processing example utterances and iterating from an initial estimate of the acoustic confusability measure to improve the measure. The method can further include using the acoustic confusability measure to selectively limit phrases to make recognizable by a speech recognition application.
Type: Grant
Filed: February 17, 2023
Date of Patent: August 20, 2024
Assignee: PROMPTU SYSTEMS CORPORATION
Inventors: Harry William Printz, Naren Chittar
-
Patent number: 12028433
Abstract: A global architecture (GLP), as disclosed herein, is based on the thin server architectural pattern; it delivers all its services in the form of web services and there are no user interface components executed on the GLP. Each web service exposed by the GLP is stateless, which allows the GLP to be highly scalable. The GLP is further decomposed into components. Each component is a microservice, making the overall architecture fully decoupled. Each microservice has fail-over nodes and can scale up on demand. This means the GLP has no single point of failure, making the platform both highly scalable and available. The GLP architecture provides the capability to build and deploy a microservice instance for each course-recipient-user combination. Because each student interacts with their own microservice, this makes the GLP scale up to the limit of cloud resources available, i.e., near infinity.
Type: Grant
Filed: July 20, 2020
Date of Patent: July 2, 2024
Assignee: Pearson Management Services Limited
Inventors: James Walsh, Suhail Khaki
-
Patent number: 12002486
Abstract: A tag estimation device capable of estimating, for an utterance made among several persons, a tag representing a result of analyzing the utterance is provided. The tag estimation device includes an utterance sequence information vector generation unit that adds a t-th utterance word feature vector and a t-th speaker vector to a (t-1)-th utterance sequence information vector u_{t-1} that includes an utterance word feature vector that precedes the t-th utterance word feature vector and a speaker vector that precedes the t-th speaker vector to generate a t-th utterance sequence information vector u_t, where t is a natural number, and a tagging unit that determines a tag l_t that represents a result of analyzing a t-th utterance from a model parameter set in advance and the t-th utterance sequence information vector u_t.
Type: Grant
Filed: September 13, 2019
Date of Patent: June 4, 2024
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Ryo Masumura, Tomohiro Tanaka
-
Patent number: 11962321
Abstract: There is provided an analog-stochastic converter for converting an analog voltage signal into a pulse signal having a corresponding probability. The analog-stochastic converter is implemented using a threshold switching element and a simple logic circuit, thereby reducing a size of the analog-stochastic converter and enabling a low power operation thereof. In addition, in order to update a weight, instead of an analog signal, a probability signal is applied using the above-described analog-stochastic converter, thereby updating a weight in a fully-parallel manner in a synaptic element array having an intersection structure. Accordingly, it is possible to shorten a time for weight update.
Type: Grant
Filed: November 22, 2021
Date of Patent: April 16, 2024
Assignee: POSTECH RESEARCH AND BUSINESS DEVELOPMENT FOUNDATION
Inventors: Hyun Sang Hwang, Myoung Hoon Kwak
-
Patent number: 11922969
Abstract: A speech emotion detection system may obtain to-be-detected speech data. The system may generate speech frames based on framing processing and the to-be-detected speech data. The system may extract speech features corresponding to the speech frames to form a speech feature matrix corresponding to the to-be-detected speech data. The system may input the speech feature matrix to an emotion state probability detection model. The system may generate, based on the speech feature matrix and the emotion state probability detection model, an emotion state probability matrix corresponding to the to-be-detected speech data. The system may input the emotion state probability matrix and the speech feature matrix to an emotion state transition model. The system may generate an emotion state sequence based on the emotional state probability matrix, the speech feature matrix, and the emotional state transition model. The system may determine an emotion state based on the emotion state sequence.
Type: Grant
Filed: October 8, 2021
Date of Patent: March 5, 2024
Assignee: Tencent Technology (Shenzhen) Company Limited
Inventor: Haibo Liu
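One common way to turn per-frame state probabilities plus a transition model into a state sequence is Viterbi decoding. The sketch below uses it as a stand-in; the abstract does not say that the patented transition model works this way, and it also feeds the speech feature matrix into that model, which this simplified version omits. The fixed transition matrix, the majority vote at the end, and all names are assumptions. Probabilities are assumed to be strictly positive so that the logarithms are defined.

```python
import math
from collections import Counter


def viterbi(frame_probs, transition, initial):
    """Return the most likely state index per frame.

    frame_probs[t][s]: probability of state s at frame t
    transition[p][s]:  probability of moving from state p to state s
    initial[s]:        probability of starting in state s
    """
    n_states = len(initial)
    score = [math.log(initial[s]) + math.log(frame_probs[0][s])
             for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of state s at frame t + 1
    for probs in frame_probs[1:]:
        prev = score
        pointers, score = [], []
        for s in range(n_states):
            best = max(range(n_states),
                       key=lambda p: prev[p] + math.log(transition[p][s]))
            pointers.append(best)
            score.append(prev[best] + math.log(transition[best][s])
                         + math.log(probs[s]))
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


def dominant_state(path):
    """Pick the most frequent state in the decoded sequence."""
    return Counter(path).most_common(1)[0][0]


# States: 0 = neutral, 1 = happy (hypothetical labels).
frame_probs = [[0.9, 0.1], [0.4, 0.6], [0.9, 0.1]]
transition = [[0.9, 0.1], [0.1, 0.9]]
initial = [0.6, 0.4]
path = viterbi(frame_probs, transition, initial)
print(path, dominant_state(path))
```

In this example the middle frame on its own slightly favors state 1, but the sticky transition matrix keeps the decoded sequence in state 0, which is the kind of smoothing a transition model provides over frame-by-frame classification.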
-
Patent number: 11880761
Abstract: Systems and methods for adding a new domain to a natural language understanding system to form an updated language understanding system with multiple domain experts are provided. More specifically, the systems and methods are able to add a new domain utilizing data from one or more of the domains already present in the natural language understanding system while keeping the new domain and the already present domains separate from each other.
Type: Grant
Filed: July 28, 2017
Date of Patent: January 23, 2024
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Imed Zitouni, Dongchan Kim, Young-Bum Kim
-
Patent number: 11741950
Abstract: A processor-implemented method includes performing speech recognition of a speech signal, generating a plurality of first candidate sentences as a result of the performing of the speech recognition, identifying a respective named entity in each of the plurality of first candidate sentences, determining a standard expression corresponding to the identified respective named entity using phonemes of the corresponding named entity, determining whether to replace the identified named entity in each of the plurality of first candidate sentences with the determined standard expression based on a similarity between the named entity and the standard expression corresponding to the named entity and determining a plurality of second candidate sentences based on the determination result; and outputting a final sentence selected from the plurality of second candidate sentences.
Type: Grant
Filed: May 14, 2020
Date of Patent: August 29, 2023
Assignee: Samsung Electronics Co., Ltd.
Inventors: Jeong-Hoon Park, Jihyun Lee, Hoshik Lee
-
Patent number: 11710482
Abstract: Systems and processes for operating a virtual assistant to provide natural assistant interaction are provided. In accordance with one or more examples, a method includes, at an electronic device with one or more processors and memory: receiving a first audio stream including one or more utterances; determining whether the first audio stream includes a lexical trigger; generating one or more candidate text representations of the one or more utterances; determining whether at least one candidate text representation of the one or more candidate text representations is to be disregarded by the virtual assistant. If at least one candidate text representation is to be disregarded, one or more candidate intents are generated based on candidate text representations of the one or more candidate text representations other than the to be disregarded at least one candidate text representation.
Type: Grant
Filed: October 8, 2020
Date of Patent: July 25, 2023
Assignee: Apple Inc.
Inventors: Juan Carlos Garcia, Paul S. McCarthy, Kurt Piersol
-
Patent number: 11704574
Abstract: Techniques for machine-trained analysis for multimodal machine learning vehicle manipulation are described. A computing device captures a plurality of information channels, wherein the plurality of information channels includes contemporaneous audio information and video information from an individual. A multilayered convolutional computing system learns trained weights using the audio information and the video information from the plurality of information channels. The trained weights cover both the audio information and the video information and are trained simultaneously. The learning facilitates cognitive state analysis of the audio information and the video information. A computing device within a vehicle captures further information and analyzes the further information using trained weights. The further information that is analyzed enables vehicle manipulation. The further information can include only video data or only audio data. The further information can include a cognitive state metric.
Type: Grant
Filed: April 20, 2020
Date of Patent: July 18, 2023
Assignee: Affectiva, Inc.
Inventors: Rana el Kaliouby, Seyedmohammad Mavadati, Taniya Mishra, Timothy Peacock, Panu James Turcot
-
Patent number: 11694673
Abstract: Systems and methods for automated communication with Air Traffic Control. The system comprises a processor and memory. The memory stores instructions to execute a method. The method includes receiving audio communication input from an air traffic controller (ATC). The audio communication input is then converted into text input. Next, an aircraft keyword is detected in the text input. The text input is then parsed and one or more data structures are generated from the parsed input. In some examples, the one or more data structures includes command data for controlling the aircraft. Next, the command data in the one or more data structures is verified. The one or more data structures are then transmitted to an onboard flight computer of the aircraft. Last, the one or more data structures are stored in a conversation memory.
Type: Grant
Filed: December 11, 2019
Date of Patent: July 4, 2023
Assignee: The Boeing Company
Inventors: Badi Ebrahimifard, Rohan S. Sharma, Jesse G. Francheshini, Fuzhou Hu
-
Patent number: 11640819
Abstract: A non-transitory computer-readable recording medium having stored therein an update program that causes a computer to execute a procedure, the procedure includes calculating a selection rate of each of a plurality of quantization points included in a quantization table, based on quantization data obtained by quantizing features of a plurality of utterance data, and updating the quantization table by updating the plurality of quantization points based on the selection rate.
Type: Grant
Filed: October 30, 2020
Date of Patent: May 2, 2023
Assignee: FUJITSU LIMITED
Inventor: Naoshi Matsuo
-
Patent number: 11626121
Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag.
Type: Grant
Filed: October 25, 2022
Date of Patent: April 11, 2023
Assignee: Dolby International AB
Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
-
Patent number: 11545043
Abstract: An interface for an educational tool on an electronic device is described. The interface comprises a main menu to display at least an icon for the educational tool. The main menu appears on a display screen of the electronic device. The interface also comprises a summary menu to list a subset of at least one function of the educational tool. The summary menu is accessed directly from the main menu when a user selects the icon for the educational tool. The interface further comprises an exhibitory window to display a rhyming riddle function of the educational tool selected by the user from the subset listed on the summary menu. The rhyming riddle function presents the user with a rhyming riddle to be solved.
Type: Grant
Filed: August 29, 2022
Date of Patent: January 3, 2023
Inventor: Marlyn Andrew Morgan
-
Patent number: 11514354
Abstract: An Artificial Intelligence (AI) based performance prediction system predicts the performance and behavior of an entity via a complex structure made of iterative and parallel machine learning (ML) model rebuilds with real time data collection. The engine selects a best model at every level and scores the entity to help in predicting the behavior of the entity. Model selection is based on various model selection criteria. The selected model determines a propensity score that indicates a likelihood of the entity migrating from a currently categorized segment to another segment of higher or lower value. Accordingly, messages or alerts with one or more of corrective actions or system enhancements can be transmitted based on the status of the entity via various targeting channels and a post treatment analysis is carried out to find the effect of the corrective actions on the entity.
Type: Grant
Filed: June 4, 2018
Date of Patent: November 29, 2022
Assignee: ACCENTURE GLOBAL SOLUTIONS LIMITED
Inventors: Mamta Aggarwal Rajnayak, Charu Nahata, Sorabh Kalra, Harshila Srivastav
-
Patent number: 11482213
Abstract: Systems, methods, and computer-readable media for correcting transcriptions created through automatic speech recognition. A transcription of speech created using an automatic speech recognition system can be received. One or more domain-specific contexts associated with the speech can be identified and a text span that includes a mistranscribed entry can be recognized from the speech based on the one or more domain-specific contexts. Additionally, features can be extracted from the mistranscribed entry and the extracted features can be matched against an index of domain-specific entries to identify a correct entry of the mistranscribed entry. Subsequently, the transcription can be corrected by replacing the mistranscribed entry with the correct entry.
Type: Grant
Filed: January 29, 2019
Date of Patent: October 25, 2022
Assignee: CISCO TECHNOLOGY, INC.
Inventors: Karthik Raghunathan, Arushi Raghuvanshi, Vijay Ramakrishnan Thimmaiyah, Lucien Serapio Carroll, Varsha Ravikumar Embar
-
Patent number: 11468243
Abstract: A computing device can receive a communication including text that can be presented on a display screen of the computing device. A camera of the computing device can capture image data. The computing device can determine, from the image data, an identity represented in the image data. The computing device can determine an amount of the communication to present on the display screen based on the identity. The computing device can determine, from the image data, user attention is directed toward the display screen. The computing device can present the amount of the communication on the display screen. In some embodiments, the computing device can determine which content of the communication to display based on the identity. The computing device can display a summary of the communication. The computing device can display an amount of the summary and/or the content of the summary based on the identity.
Type: Grant
Filed: July 1, 2019
Date of Patent: October 11, 2022
Assignee: Amazon Technologies, Inc.
Inventor: Ryan H. Cassidy
-
Patent number: 11455999
Abstract: Data is received that encapsulates a spoken response to a prompt text comprising a string of words. Thereafter, the received data is transcribed into a string of words. The string of words is then compared with a prompt so that a similarity grid representation of the comparison can be generated that characterizes a level of similarity between the string of words in the spoken response and the string of words in the prompt text. The grid representation is then scored using at least one machine learning model. The score indicates a likelihood of the spoken response having been off-topic. Data providing the encapsulated score can then be provided. Related apparatus, systems, techniques and articles are also described.
Type: Grant
Filed: April 9, 2020
Date of Patent: September 27, 2022
Assignee: Educational Testing Service
Inventors: Xinhao Wang, Su-Youn Yoon, Keelan Evanini, Klaus Zechner, Yao Qian
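To make the "similarity grid" concrete, here is a minimal Python sketch that compares every transcribed response word with every prompt word. Exact case-insensitive matching is used as the similarity measure purely for simplicity; the abstract does not state which measure is used, and the machine learning scorer it mentions is not reproduced here. The function names and the `overlap_fraction` summary feature are hypothetical.

```python
def similarity_grid(response_words, prompt_words):
    """grid[i][j] is 1.0 when response word i equals prompt word j
    (ignoring case), otherwise 0.0."""
    return [[1.0 if r.lower() == p.lower() else 0.0 for p in prompt_words]
            for r in response_words]


def overlap_fraction(grid):
    """Fraction of response words that match at least one prompt word.
    A crude hand-made feature, not the learned score from the abstract."""
    if not grid:
        return 0.0
    return sum(1 for row in grid if any(row)) / len(grid)


grid = similarity_grid("The cat sat".split(), "the cat ran fast".split())
for row in grid:
    print(row)
print(overlap_fraction(grid))
```

A grid like this has one row per response word and one column per prompt word, so it can be passed to an image-style or sequence-style model; a low overlap would be one possible hint that a response is off-topic.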
-
Patent number: 11443734
Abstract: A text search query including one or more words may be received. An ASR index created for an audio recording may be searched over using the query to produce ASR search results including words, each word associated with a confidence score. For each of the words in the ASR search results associated with a confidence score below a threshold (and in some cases having one or more preceding words in the ASR index and one or more subsequent words in the ASR index), a phonetic representation of the audio recording may be searched for the word having the confidence score below the threshold, where it occurs in the audio recording, possibly after the one or more preceding words and in the audio recording before the one or more subsequent words, to produce phonetic search results. Search results may be returned that include ASR and phonetic results.
Type: Grant
Filed: August 26, 2019
Date of Patent: September 13, 2022
Assignee: NICE LTD.
Inventors: William Mark Finlay, Robert William Morris, Peter S. Cardillo, Maria Michaela Kunin
-
Patent number: 11423913
Abstract: An apparatus for generating an error concealment signal includes: an LPC representation generator for generating a replacement LPC representation; an LPC synthesizer for filtering a codebook information using the replacement LPC representation; and a noise estimator for estimating a noise estimate during a reception of good audio frames, wherein the noise estimate depends on the good audio frames, and wherein the LPC representation generator is configured to use the noise estimate estimated by the noise estimator in generating the replacement LPC representation.
Type: Grant
Filed: March 27, 2020
Date of Patent: August 23, 2022
Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
Inventors: Michael Schnabel, Jérémie Lecomte, Ralph Sperschneider, Manuel Jander
-
Patent number: 11410686
Abstract: In one aspect, a computerized method for implementing voice and acupressure-based lifestyle management includes the step of measuring a speed at which a user is speaking. A wearable device records the user's voice with a microphone and communicates a digital recording of the user's voice to a computer processor. The method includes the step of measuring a time spacing between a set of user's words and a length of the set of user's words. The method includes the step of determining at least one anomaly by comparing the digital recording of the user's voice with a benchmark recording of the user's voice. The method includes the step of alerting the user of the detected anomaly.
Type: Grant
Filed: July 2, 2019
Date of Patent: August 9, 2022
Assignee: VOECE, INC.
Inventor: Rashmi Panda
-
Patent number: 11405506
Abstract: Systems and methods are provided for attribute-based client callbacks. A client is prompted to leave a voice message. Attributes are extracted from the voice message and, based on the attributes, tokens are created for the selection of an appropriate agent to be connected to the client, such as an agent having skills or attributes matching one or more tokens. A callback application server transmits prompts and receives requests for client callbacks. An interaction manager determines agent availability and arranges callback handling, and a session management server initiates callbacks to connect the selected agent with the client.
Type: Grant
Filed: June 29, 2020
Date of Patent: August 2, 2022
Assignee: Avaya Management L.P.
Inventors: Manish Dusad, Kazim Hussain
-
Patent number: 11392773
Abstract: Techniques for generating conversational training data are described. In some instances, a request to generate conversational training data for a goal-oriented conversation model is received, a transitional graph of intents is traversed to generate a conversation template for each intent of the transitional graph, each intent being a task to fulfill a request and comprising one or more slots to be filled by a user of the bot machine learning model, the conversation template including a path including at least one placeholder for an utterance or a slot level utterance, and at least utterances from one or more dictionaries are sampled to fill in the placeholders for the utterances of the path to generate conversational training data.
Type: Grant
Filed: January 31, 2019
Date of Patent: July 19, 2022
Assignee: Amazon Technologies, Inc.
Inventors: Rashmi Gangadharaiah, Ajay Mishra, Roger Scott Jenke, Meghana Puvvadi
-
Patent number: 11393479
Abstract: An apparatus for generating an error concealment signal includes an LPC (linear prediction coding) representation generator for generating a first replacement LPC representation and a different second replacement LPC representation; an LPC synthesizer for filtering a first codebook information using the first replacement LPC representation to obtain a first replacement signal and for filtering a different second codebook information using the second replacement LPC representation to obtain a second replacement signal; and a replacement signal combiner for combining the first replacement signal and the second replacement signal to obtain the error concealment signal.
Type: Grant
Filed: March 3, 2020
Date of Patent: July 19, 2022
Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
Inventors: Michael Schnabel, Jérémie Lecomte, Ralph Sperschneider, Manuel Jander
-
Patent number: 11368583
Abstract: One example method of operation may include identifying call data associated with a received call, identifying call parameters from the call data, and the call parameters include one or more call routing parameters associated with call routing of the call and one or more call session parameters associated with a call session of the call, assigning weights to one or more of the call routing parameters and the call session parameters, determining a scam score for the call based on a sum of the weights applied to the call routing parameters and the call session parameters, and blocking the call when the scam score is greater than or equal to a predetermined threshold scam score.
Type: Grant
Filed: November 3, 2020
Date of Patent: June 21, 2022
Assignee: FIRST ORION CORP.
Inventors: Mark Hamilton Botner, Collin Michael Turney, Daniel Francis Kliebhan, Robert Francis Piscopo, Jr., Charles Donald Morgan, Jamelle Adnan Brown, Chee-Fung Choy, Samuel Kenton Welch, Nysia Inet George, Andrew Collin Shaddox
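The weighted-sum rule described in this abstract can be sketched in a few lines of Python. The parameter names, the weights, and the threshold below are invented for illustration; the abstract gives no concrete values, and treating each parameter as a boolean flag is an assumption.

```python
def scam_score(parameters, weights):
    """Sum the weights of every call parameter that is flagged.
    parameters maps a parameter name to True/False; weights maps the
    same names to numeric weights (unknown names contribute 0)."""
    return sum(weights.get(name, 0.0)
               for name, flagged in parameters.items() if flagged)


def should_block(parameters, weights, threshold):
    """Block when the score is greater than or equal to the threshold."""
    return scam_score(parameters, weights) >= threshold


# Hypothetical routing and session parameters with made-up weights.
weights = {
    "spoofed_caller_id": 0.5,    # call routing parameter
    "international_route": 0.25,  # call routing parameter
    "short_session": 0.25,        # call session parameter
}
call = {
    "spoofed_caller_id": True,
    "international_route": False,
    "short_session": True,
}
print(scam_score(call, weights))
print(should_block(call, weights, threshold=0.75))
```

The comparison uses greater-than-or-equal, as in the abstract, so a call whose score lands exactly on the threshold is blocked.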
-
Patent number: 11357431
Abstract: Methods and apparatus to identify an emotion evoked by media are disclosed. An example apparatus includes a synthesizer to generate a first synthesized sample based on a pre-verbal utterance associated with a first emotion. A feature extractor is to identify a first value of a first feature of the first synthesized sample. The feature extractor is to identify a second value of the first feature of first media evoking an unknown emotion. A classification engine is to create a model based on the first feature. The model is to establish a relationship between the first value of the first feature and the first emotion. The classification engine is to identify the first media as evoking the first emotion when the model indicates that the second value corresponds to the first value.
Type: Grant
Filed: October 16, 2020
Date of Patent: June 14, 2022
Assignee: The Nielsen Company (US), LLC
Inventors: Robert T. Knight, Ramachandran Gurumoorthy, Alexander Topchy, Ratnakar Dev, Padmanabhan Soundararajan, Anantha Pradeep
-
Patent number: 11361675
Abstract: Provided is a system and a non-transitory computer-readable medium having computer-executable instructions stored thereon which, when executed by one or more processors effectuate operations comprising dividing a text of a plurality of words in a foreign language into one or more Interpretation Phrases, each of the Interpretation Phrases being made of one or more words chosen at an optimal composition for a user to listen to, read along, and maintain comprehension and engagement, wherein the optimal composition is determined based on the biographical data of the user and the historical usage by the user; reading aloud the first Interpretation Phrase by a narrator; and after reading aloud the first Interpretation Phrase, interpreting aloud the first Interpretation Phrase into the said user's native language to provide understanding of the Interpretation Phrase in the user's native language, to maintain the flow of the story, and to create and promote subconscious associations between native and foreign languag
Type: Grant
Filed: November 7, 2018
Date of Patent: June 14, 2022
Assignee: MAGICAL TRANSLATIONS, INC.
Inventor: Leslie Omana Begert
-
Patent number: 11347803
Abstract: Systems and methods for adaptive question answering are provided in which an answer is adaptive to a user's characteristics, goals and needs by continuously learning from user interactions and adapting both the context and data visualization. An exemplary system comprises software modules embodied on a computer network, and the software modules comprise an interpretation engine, an answering engine and a learning engine.
Type: Grant
Filed: January 27, 2020
Date of Patent: May 31, 2022
Assignee: Cuddle Artificial Intelligence Private Limited
Inventors: Neha Prabhugaonkar, Abhay Parab, Natwar Mall
-
Patent number: 11336972
Abstract: Systems, methods, and computer-readable media are disclosed for automated video preview generation. Example methods may include determining video content, determining a first shot transition, a second shot transition, a third shot transition, and a fourth shot transition in the video content, and determining that human speech is present during the first shot transition and the second shot transition. Example methods may include determining a first timestamp associated with the third shot transition, determining a second timestamp associated with the fourth shot transition, generating a first video preview of the video content, where the first video preview includes a segment of the video content from the first timestamp to the second timestamp, and causing presentation of the first video preview, where the first video preview does not include a segment of the video content between the first shot transition and the second shot transition.
Type: Grant
Filed: January 5, 2021
Date of Patent: May 17, 2022
Assignee: AMAZON TECHNOLOGIES, INC.
Inventors: Muhammad Raffay Hamid, Kewen Chen, Anne TuAnh Thanh Thuy Ho, Guy Friedel, Arun Velayudhan Pillai, Dhaval Damani, Jacob William Jensen, Zuzanna Maria Stepniakowska Coggins, Maciej Tadeusz Golonka, Anantha Krishna Hodrali Srinivasa Bhatta
-
Patent number: 11328733
Abstract: Systems and methods for speaker verification comprise optimizing a neural network by minimizing a generalized negative log likelihood function, including receiving a training batch of audio samples comprising a plurality of utterances for each of a plurality of speakers, extracting features from the audio samples to generate a batch of features, processing the batch of features using a neural network to generate a plurality of embedding vectors configured to differentiate audio samples by speaker, computing a generalized negative log-likelihood loss (GNLL) value for the training batch based, at least in part, on the embedding vectors, and modifying weights of the neural network to reduce the GNLL value. Computing the GNLL may include generating a centroid vector for each of a plurality of speakers, based at least in part on the embedding vectors.
Type: Grant
Filed: September 24, 2020
Date of Patent: May 10, 2022
Assignee: SYNAPTICS INCORPORATED
Inventors: Saeed Mosayyebpour Kaskari, Atabak Pouya
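The abstract does not define its generalized negative log-likelihood, so the sketch below substitutes a plain centroid-based softmax negative log-likelihood to show how per-speaker centroids can enter such a loss. The cosine similarity, the `scale` factor, and all function names are assumptions, and no neural network or weight update is included; the input is simply a batch of already-computed embedding vectors.

```python
import math


def centroid(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def centroid_nll(batch, scale=10.0):
    """Mean negative log-likelihood of each embedding's true speaker under
    a softmax over scaled cosine similarities to all speaker centroids.
    batch maps a speaker id to a list of embedding vectors."""
    centroids = {spk: centroid(vectors) for spk, vectors in batch.items()}
    total, count = 0.0, 0
    for spk, vectors in batch.items():
        for v in vectors:
            logits = {c: scale * cosine(v, cen) for c, cen in centroids.items()}
            peak = max(logits.values())
            # Numerically stable log-sum-exp over all centroids.
            log_norm = peak + math.log(
                sum(math.exp(value - peak) for value in logits.values()))
            total += log_norm - logits[spk]
            count += 1
    return total / count


separated = {"a": [[1.0, 0.0], [0.9, 0.1]], "b": [[0.0, 1.0], [0.1, 0.9]]}
mixed = {"a": [[1.0, 0.0], [0.0, 1.0]], "b": [[0.9, 0.1], [0.1, 0.9]]}
print(centroid_nll(separated) < centroid_nll(mixed))
```

Embeddings that cluster by speaker give a loss near zero, while embeddings whose centroids coincide give a loss of log 2 for two speakers, so reducing this value pushes embeddings of the same speaker together and those of different speakers apart.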
-
Patent number: 11308964
Abstract: Systems, apparatus, methods, and articles of manufacture for cooperatively-overlapped and Artificial Intelligence (AI)-managed interfaces. For example, multiple cooperatively and/or partially overlapped interfaces may be provided (e.g., via an electronic and/or touch-screen device), with such interfaces being dynamically managed by various AI components, such as natural language processing, machine learning techniques, and/or neural network data processing.
Type: Grant
Filed: August 18, 2020
Date of Patent: April 19, 2022
Assignee: The Travelers Indemnity Company
Inventors: Douglas Calegari, Stephen Ziegelmayer
-
Patent number: 11270692
Abstract: A speech recognition method, performed by a computer, with an improved recognition accuracy is disclosed. The method includes: performing speech recognition of an input speech to acquire a plurality of recognition candidates through a plurality of speech recognition processes different from each other for a section having a reliability lower than a predetermined value; verifying similarities between each of the acquired plurality of recognition candidates and meta-information corresponding to the input speech; and determining, based on the verified similarities, a recognition result of the low-reliability section from among the acquired plurality of recognition candidates.
Type: Grant
Filed: June 28, 2019
Date of Patent: March 8, 2022
Assignee: FUJITSU LIMITED
Inventors: Yusuke Hamada, Keisuke Asakura
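The selection step in this abstract, picking among competing recognition candidates by their similarity to meta-information, might look like the following Python sketch. Word-level Jaccard overlap stands in for the similarity measure, which the abstract does not specify, and the candidate strings and meta-information are invented examples.

```python
def jaccard(a, b):
    """Word-level Jaccard similarity between two strings (case-insensitive)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def pick_candidate(candidates, meta_information):
    """Return the candidate most similar to the meta-information.
    On ties, the earliest candidate in the list wins."""
    return max(candidates, key=lambda c: jaccard(c, meta_information))


# Outputs of several hypothetical recognizers for one low-reliability section.
candidates = ["the bucket review", "the budget review", "the budget renew"]
meta = "quarterly budget review agenda"  # e.g. a meeting title
print(pick_candidate(candidates, meta))
```

Only the low-reliability section would be re-decoded and re-ranked this way; sections the first recognizer was confident about keep their original result.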
-
Patent number: 11264012
Abstract: Conversations between agents of a contact center and a customer are often transcribed so that text is maintained. However, text conversations consist only of text and omit significant portions of a conversation that are conveyed outside of the specific words spoken. By determining the emotion, tone, or other aspect in a conversation, which may contradict the text content, a data structure may be maintained such that the textual content is annotated with emotion or tonal information and/or utilized in a routing decision to cause a communication network to be altered, such as to include at least one additional node based upon a particular emotion or tone.
Type: Grant
Filed: December 31, 2019
Date of Patent: March 1, 2022
Assignee: Avaya Inc.
Inventors: Piyush Mital, Nikita Kotak, Asmita Gokhale, Robert E. Braudes
-
Patent number: 11257484
Abstract: According to some embodiments, a multi-layer speech recognition transcript post processing system may include a data-driven, statistical layer associated with a trained automatic speech recognition model that selects an initial transcript. A rule-based layer may receive the initial transcript from the data-driven, statistical layer and execute at least one pre-determined rule to generate a first modified transcript. A machine learning approach layer may receive the first modified transcript from the rule-based layer and perform a neural model inference to create a second modified transcript. A human editor layer may receive the second modified transcript from the machine learning approach layer along with an adjustment from at least one human editor. The adjustment may create, in some embodiments, a final transcript that may be used to fine-tune the data-driven, statistical layer.
Type: Grant
Filed: August 21, 2019
Date of Patent: February 22, 2022
Assignee: Microsoft Technology Licensing, LLC
Inventors: Dimitrios Basile Dimitriadis, Xie Chen, Nanshan Zeng, Yu Shi, Liyang Lu
-
Patent number: 11250840
Abstract: Some embodiments provide a method of training a MT network to detect a wake expression that directs a digital assistant to perform an operation based on a request that follows the expression. The MT network includes processing nodes with configurable parameters. The method iteratively selects different sets of input values with known sets of output values. Each of a first group of input value sets includes a vocative use of the expression. Each of a second group of input value sets includes a non-vocative use of the expression. For each set of input values, the method uses the MT network to process the input set to produce an output value set and computes an error value that expresses an error between the produced output value set and the known output value set. Based on the error values, the method adjusts configurable parameters of the processing nodes of the MT network.
Type: Grant
Filed: April 5, 2019
Date of Patent: February 15, 2022
Assignee: PERCEIVE CORPORATION
Inventor: Steven L. Teig
-
Patent number: 11244698
Abstract: Systems and methods are provided for analyzing voice-based audio inputs. A voice-based audio input associated with a user (e.g., wherein the voice-based audio input is a prompt or a command) is received and measures of one or more features are extracted. One or more parameters are calculated based on the measures of the one or more features. The occurrence of one or more mistriggers is identified by inputting the one or more parameters into a predictive model. Further, systems and methods are provided for identifying human mental health states using mobile device data. Mobile device data (including sensor data) associated with a mobile device corresponding to a user is received. Measurements are derived from the mobile device data and input into a predictive model. The predictive model is executed and outputs probability values of one or more symptoms associated with the user.
Type: Grant
Filed: March 8, 2019
Date of Patent: February 8, 2022
Assignee: Cogito Corporation
Inventors: Joshua Feast, Ali Azarbayejani, Skyler Place
-
Patent number: 11244689
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining voice characteristics are provided. One of the methods includes: obtaining speech data of a speaker; inputting the speech data into a model trained at least by jointly minimizing a first loss function and a second loss function, wherein the first loss function comprises a non-sampling-based loss function and the second loss function comprises a Gaussian mixture loss function with non-unit multi-variant covariance matrix; and obtaining from the trained model one or more voice characteristics of the speaker.
Type: Grant
Filed: March 22, 2021
Date of Patent: February 8, 2022
Assignee: ALIPAY (HANGZHOU) INFORMATION TECHNOLOGY CO., LTD.
Inventors: Zhiming Wang, Kaisheng Yao, Xiaolong Li
-
Patent number: 11238859
Abstract: A natural-language voice chatbot is initiated and a voice session is established between the chatbot and a customer while the customer is operating a vehicle device within a vehicle. A pre-staged order is taken from a customer during the session and the session is suspended until the customer arrives at a store associated with the pre-staged order. A location-based trigger is raised when the customer is detected as being present at a transaction terminal of a store; the session is resumed on the transaction terminal and/or the vehicle device. The pre-stage order is confirmed during the resumed session and payment is obtained from the customer for the order when payment was not already obtained from the customer. The order is sent to a fulfillment station and, in an embodiment, the items associated with the order are delivered to the customer while the customer remains at the terminal.
Type: Grant
Filed: June 28, 2019
Date of Patent: February 1, 2022
Assignee: NCR Corporation
Inventors: Matthew Robert Burris, Shelby Frances Apps, Andrew Cohen, Gary C. Dalton, Jason Robert Dyer, Jodessiah Sumpter
-
Patent number: 11200217
Abstract: A method includes searching for data contained in a structured data structure. The method includes receiving a query. The query includes a structured data structure path and a first element related to the structured data structure path. One or more patterns are created comprising at least a portion of the structured data structure path and one or more elements related to the first element. For each of the one or more patterns, a hash is created. The created hashes are looked-up in a hash index to identify one or more structured data structures correlated to the hashes. The one or more structured data structures are identified to a user.
Type: Grant
Filed: May 26, 2017
Date of Patent: December 14, 2021
Assignee: PERFECT SEARCH CORPORATION
Inventors: Bruce R. Tietjen, Ronald P. Millett
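The pattern-and-hash lookup in the abstract above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the patented scheme: the helper names (`make_patterns`, `build_index`, `search`) are hypothetical, "a portion of the path" is taken to mean each trailing part of a slash-separated path, "related elements" is reduced to a lowercase variant of the value, and SHA-1 is an arbitrary choice of hash.

```python
import hashlib
from collections import defaultdict


def pattern_hash(pattern):
    # Any stable hash would do; SHA-1 is an assumption of this sketch.
    return hashlib.sha1(pattern.encode("utf-8")).hexdigest()


def make_patterns(path, element):
    # Assumption: partial paths are the trailing parts of the full path,
    # and the only "related element" is a lowercase variant.
    parts = path.strip("/").split("/")
    suffixes = ["/".join(parts[i:]) for i in range(len(parts))]
    variants = sorted({element, element.lower()})
    return [f"{suffix}={variant}" for suffix in suffixes for variant in variants]


def build_index(docs):
    # docs maps a document id to a list of (path, value) pairs.
    index = defaultdict(set)
    for doc_id, entries in docs.items():
        for path, value in entries:
            for pattern in make_patterns(path, value):
                index[pattern_hash(pattern)].add(doc_id)
    return index


def search(index, path, element):
    # Hash each query pattern and collect every document the index maps it to.
    hits = set()
    for pattern in make_patterns(path, element):
        hits |= index.get(pattern_hash(pattern), set())
    return hits


docs = {
    "doc-a": [("book/author", "Smith")],
    "doc-b": [("book/title", "Smith")],
}
index = build_index(docs)
result = search(index, "book/author", "smith")
```

Because the index stores only hashes, a lookup never compares the query against document contents directly; the documents are consulted only after a hash matches.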
-
Patent number: 11194998
Abstract: An intelligent assistant records speech spoken by a first user and determines a self-selection score for the first user. The intelligent assistant sends the self-selection score to another intelligent assistant, and receives a remote-selection score for the first user from the other intelligent assistant. The intelligent assistant compares the self-selection score to the remote-selection score. If the self-selection score is greater than the remote-selection score, the intelligent assistant responds to the first user and blocks subsequent responses to all other users until a disengagement metric of the first user exceeds a blocking threshold. If the self-selection score is less than the remote-selection score, the intelligent assistant does not respond to the first user.
Type: Grant
Filed: July 24, 2017
Date of Patent: December 7, 2021
Assignee: Microsoft Technology Licensing, LLC
Inventors: Kazuhito Koishida, Alexander A Popov, Uros Batricevic, Steven Nabil Bathiche
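The score comparison and blocking rule in the abstract above can be sketched as a small state machine. The class and method names are hypothetical, the scores are assumed to be plain floats, and a tie is treated as "do not respond" because the abstract does not cover that case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Assistant:
    name: str
    blocking_threshold: float = 0.5      # assumed value, for illustration only
    engaged_user: Optional[str] = None   # user currently holding the assistant

    def handle(self, user, self_score, remote_score):
        """Return True if this assistant should respond to `user`."""
        # While engaged with one user, responses to all other users are blocked.
        if self.engaged_user is not None and self.engaged_user != user:
            return False
        if self_score > remote_score:
            self.engaged_user = user
            return True
        # Lower score (or a tie, by assumption): stay silent.
        return False

    def update_disengagement(self, user, metric):
        # The block lifts once the engaged user's disengagement metric
        # exceeds the blocking threshold.
        if user == self.engaged_user and metric > self.blocking_threshold:
            self.engaged_user = None


kitchen = Assistant("kitchen")
first = kitchen.handle("alice", self_score=0.9, remote_score=0.4)
blocked = kitchen.handle("bob", self_score=0.95, remote_score=0.1)
kitchen.update_disengagement("alice", metric=0.8)
after_release = kitchen.handle("bob", self_score=0.95, remote_score=0.1)
```

In this sketch each assistant decides locally from the two scores, so no central coordinator is needed as long as both devices exchange scores for the same utterance.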
-
Patent number: 11113607
Abstract: A response generation apparatus ensures accurate output. A computer stores graph knowledge including a response generation module generating a response to an input document including a plurality of sentences, the graph knowledge database includes graph data that manages a structure of each type of graph knowledge, and the response generation module generates a first graph knowledge from each of the sentences; searches a second graph knowledge similar to each of the plurality of first graph knowledge while referring to the graph data on the basis of the plurality of first graph knowledge; identifies the plurality of second graph knowledge included in a dense location where a density of the second graph knowledge is high in a graph space; searches third graph knowledge for generating the response while referring to the graph data on the basis of the identified second graph knowledge; and generates the response using the third graph knowledge.
Type: Grant
Filed: June 7, 2017
Date of Patent: September 7, 2021
Assignee: HITACHI, LTD.
Inventors: Toshinori Miyoshi, Miaomei Lei, Hiroki Sato
-
Patent number: 11102590
Abstract: A hearing device, e.g. a hearing aid, comprises a) a multitude of input units, each providing an electric input signal representing sound in the environment of the user in a time-frequency representation, wherein the sound is a mixture of speech and additive noise or other distortions, e.g. reverberation, b) a multitude of beamformer filtering units, each being configured to receive at least two, e.g. all, of said multitude of electric input signals, each of said multitude of beamformer filtering units being configured to provide a beamformed signal representative of the sound in a different one of a multitude of spatial segments, e.g. spatial cells, around the user, c) a multitude of speech probability estimators each configured to receive the beamformed signal for a particular spatial segment and to estimate a probability that said particular spatial segment contains speech at a given point in time and frequency, wherein at least one, e.g.
Type: Grant
Filed: July 17, 2019
Date of Patent: August 24, 2021
Assignee: Oticon A/S
Inventor: Jesper Jensen
-
Patent number: 11087744
Abstract: Term masking is performed by generating a time-alignment value for a plurality of identifiable units of sound in vocal audio content contained in a mixed audio track, force-aligning each of the plurality of identifiable units of sound to the vocal audio content based on the time-alignment value, thereby generating a plurality of force-aligned identifiable units of sound, identifying from the plurality of force-aligned identifiable units of sound a force-aligned identifiable unit of sound to be muddled, and audio muddling the force-aligned identifiable unit of sound to be muddled.
Type: Grant
Filed: December 17, 2019
Date of Patent: August 10, 2021
Assignee: Spotify AB
Inventors: Andreas Jansson, Eric J. Humphrey, Rachel Malia Bittner, Sravana K. Reddy
-
Patent number: 11023520
Abstract: Implementations relate to techniques for providing context-dependent search results. The techniques can include receiving a query and background audio. The techniques can also include identifying the background audio, establishing concepts related to the background audio and obtaining terms related to the concepts related to the background audio. The techniques can also include obtaining search results based on the query and on at least one of the terms. The techniques can also include providing the search results.
Type: Grant
Filed: January 10, 2019
Date of Patent: June 1, 2021
Assignee: GOOGLE LLC
Inventors: Jason Sanders, John J. Lee, Gabriel Taubman
-
Patent number: 11024311
Abstract: The various implementations described herein include methods and systems for determining device leadership among voice interface devices. In one aspect, a method is performed at a first electronic device of a plurality of electronic devices, each having microphones, a speaker, processors, and memory storing programs for execution by the processors. The first device detects a voice input. It determines a device state and a relevance of the voice input. It identifies a subset of electronic devices from the plurality to which the voice input is relevant. In accordance with a determination that the subset includes the first device, the first device determines a first score of a criterion associated with the voice input and receives second scores of the criterion from other devices in the subset. In accordance with a determination that the first score is higher than the second scores, the first device responds to the detected input.
Type: Grant
Filed: February 10, 2020
Date of Patent: June 1, 2021
Assignee: GOOGLE LLC
Inventors: Kenneth Mixter, Diego Melendo Casado, Alexander Houston Gruenstein, Terry Tai, Christopher Thaddeus Hughes, Matthew Nirvan Sharifi
-
Patent number: 11002789
Abstract: An analog circuit fault feature extraction method based on a parameter random distribution neighbor embedding winner-take-all method, comprising the following steps: (1) collecting a time-domain response signal of an analog circuit under test, wherein the input of the analog circuit under test is excited by using a pulse signal, a voltage signal is sampled at an output end, and the collected time-domain response signal is an output voltage signal of the analog circuit; (2) applying a discrete wavelet packet transform for the collected time-domain response signal to acquire each wavelet node signal; (3) calculating energy values and kurtosis values of the acquired wavelet node signals to form an initial fault feature data set of the analog circuit; and (4) analyzing the initial fault feature data by the parameter random distribution neighbor embedding winner-take-all method, to acquire optimum low-dimensional feature data.
Type: Grant
Filed: October 20, 2018
Date of Patent: May 11, 2021
Assignee: WUHAN UNIVERSITY
Inventors: Yigang He, Wei He, Hui Zhang, Liulu He, Baiqiang Yin, Bing Li
-
Patent number: 10997277
Abstract: An integrated circuit device such as a neural network accelerator can be programmed to select a numerical value based on a multinomial distribution. In various examples, the integrated circuit device can include an execution engine that includes multiple separate execution units. The multiple execution units can operate in parallel on different streams of data. For example, to make a selection based on a multinomial distribution, the execution units can be configured to perform cumulative sums on sets of numerical values, where the numerical values represent probabilities. In this example, to then obtain cumulative sums across the sets of numerical values, the largest values from the sets can be accumulated, and then added, in parallel to the sets. The resulting cumulative sum across all the numerical values can then be used to randomly select a specific index, which can provide a particular numerical value as the selected value.
Type: Grant
Filed: March 26, 2019
Date of Patent: May 4, 2021
Assignee: Amazon Technologies, Inc.
Inventors: Yu Zhou, Vignesh Vivekraja, Ron Diamant
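The two-stage cumulative sum in the abstract above can be sketched in plain Python. This is a sequential illustration of the arithmetic only, not the accelerator's implementation: each chunk stands in for one execution unit, the function names are hypothetical, and the uniform random draw `u` is passed in explicitly so the result is reproducible.

```python
from bisect import bisect_right
from itertools import accumulate


def chunked_cumsum(probs, n_units):
    """Cumulative sum computed chunk by chunk, then stitched together."""
    size = -(-len(probs) // n_units)  # ceiling division: items per unit
    chunks = [probs[i:i + size] for i in range(0, len(probs), size)]
    # Stage 1: each unit computes a local cumulative sum over its own chunk.
    local = [list(accumulate(chunk)) for chunk in chunks]
    # Stage 2: the last (largest) value of each chunk is accumulated to give
    # the offset that every later chunk needs.
    totals = [sums[-1] for sums in local]
    offsets = [0.0] + list(accumulate(totals))[:-1]
    # Stage 3: each offset is added to its chunk (independent per chunk).
    return [value + off for sums, off in zip(local, offsets) for value in sums]


def sample_index(probs, n_units, u):
    """Pick an index from the distribution `probs` using a draw u in [0, 1)."""
    cdf = chunked_cumsum(probs, n_units)
    index = bisect_right(cdf, u * cdf[-1])
    return min(index, len(probs) - 1)  # guard against rounding at the top end


cdf = chunked_cumsum([1, 2, 3, 4, 5], n_units=2)
picked = sample_index([0.1, 0.2, 0.3, 0.4], n_units=2, u=0.5)
```

Scaling the draw by `cdf[-1]` means the input values do not have to sum to exactly 1, which sidesteps a separate normalization pass.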
-
Patent number: 10997964
Abstract: A system, method and computer-readable storage devices are for normalizing text for ASR and TTS in a language-neutral way. The system described herein divides Unicode text into meaningful chunks called “atomic tokens.” The atomic tokens strongly correlate to their actual pronunciation, and not to their meaning. The system combines the tokenization with a data-driven classification scheme, followed by class-determined actions to convert text to normalized form. The classification labels are based on pronunciation, unlike alternative approaches that typically employ Named Entity-based categories. Thus, this approach is relatively simple to adapt to new languages. Non-experts can easily annotate training data because the tokens are based on pronunciation alone.
Type: Grant
Filed: August 16, 2019
Date of Patent: May 4, 2021
Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.
Inventors: Ladan Golipour, Alistair D. Conkie
-
Patent number: 10971135
Abstract: Systems, methods, and computer-readable storage devices for crowd-sourced data labeling. The system requests a respective response from each of a set of entities. The set of entities includes crowd workers. Next, the system incrementally receives a number of responses from the set of entities until one of an accuracy threshold is reached and m responses are received, wherein the accuracy threshold is based on characteristics of the number of responses. Finally, the system generates an output response based on the number of responses.
Type: Grant
Filed: July 11, 2019
Date of Patent: April 6, 2021
Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.
Inventors: Jason Williams, Tirso Alonso, Barbara B. Hollister, Ilya Dan Melamed
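The incremental stopping rule in the abstract above can be sketched as follows. The abstract only says the accuracy threshold depends on characteristics of the responses, so the specific rule used here (share of the leading answer, with a minimum vote count) is an assumption, as are the function name and the default values.

```python
from collections import Counter


def collect_label(responses, m, agreement=0.8, min_votes=2):
    """Consume responses one at a time; stop early on strong agreement.

    Returns (label, number_of_responses_used). Stops when the leading
    answer's share reaches `agreement` (assumed stand-in for the accuracy
    threshold) or when `m` responses have been received.
    """
    votes = Counter()
    for n, answer in enumerate(responses, start=1):
        votes[answer] += 1
        label, count = votes.most_common(1)[0]
        agreed = n >= min_votes and count / n >= agreement
        if agreed or n >= m:
            return label, n
    if not votes:
        raise ValueError("no responses received")
    # The stream ended before either stopping condition was met.
    return votes.most_common(1)[0][0], sum(votes.values())


# Two matching answers are enough; the third worker is never consulted.
label, used = collect_label(iter(["cat", "cat", "dog"]), m=5)
```

Stopping early on agreement is what keeps the cost down: easy items consume only a couple of responses, while contested items run up to the cap of `m`.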
-
Patent number: 10963510
Abstract: A natural language processing system that includes an artificial intelligence (AI) engine and a tagging engine. The AI engine is configured to receive a set of audio files and to identify concepts within the set of audio files. The AI engine is further configured to determine a usage frequency for each of the identified concepts and to generate an AI-defined tag for concepts with a usage frequency that is greater than a usage frequency threshold. The tagging engine is configured to receive an audio file and to identify observed concepts within the audio file. The tagging engine is further configured to compare the observed concepts to the first set of concepts, to determine one or more observed concepts matches concepts linked with AI-defined tags, and to modify metadata for the audio file to include AI-defined tags.
Type: Grant
Filed: August 9, 2018
Date of Patent: March 30, 2021
Assignee: Bank of America Corporation
Inventors: James McCormack, Sean M. Gutman, Manu J. Kurian, Sasidhar Purushothaman, Suki Ramasamy, William P. Jacobson