Creating Patterns For Matching Patents (Class 704/243)
-
Patent number: 12230268
Abstract: Techniques for providing a contextual voice user interface that enables a user to query a speech processing system with respect to the decisions made to answer the user's command are described. The speech processing system may store speech processing pipeline data used to process a command. At some point after the system outputs content deemed responsive to the command, a user may speak an utterance corresponding to an inquiry with respect to the processing performed to respond to the command. For example, the user may state “why did you tell me that?” In response thereto, the speech processing system may determine the stored speech processing pipeline data used to respond to the command, and may generate output audio data that describes the data and computing decisions involved in determining the content deemed responsive to the command.
Type: Grant
Filed: January 30, 2023
Date of Patent: February 18, 2025
Assignee: Amazon Technologies, Inc.
Inventors: Michael James Moniz, Abishek Ravi, Ryan Scott Aldrich, Michael Bennett Adams
-
Patent number: 12226221
Abstract: A dynamic neuropsychological assessment tool according to an embodiment utilizes speech recognition, speech synthesis and machine learning to assess whether a patient is at risk for a neurological disorder, such as Alzheimer's disease. The dynamic neuropsychological assessment tool enables self-administration by a patient. The tool performs pre-test validation operations on the test environment, test equipment, and the patient's capability for performing the test at that time. The tool also enables dynamic modification of a questionnaire presented to the patient while the patient completes the questionnaire, as well as dynamic modification of which tests to present to the patient. The modification can be rule based or modified by a provider. The dynamic neuropsychological assessment tool enables providers and administrators to modify and improve tests and validate them using machine learning based on previously completed assessments and results.
Type: Grant
Filed: January 9, 2023
Date of Patent: February 18, 2025
Assignee: Intraneuron, LLC
Inventors: Vidar Vignisson, Gregory Sahagian
-
Patent number: 12217761
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, relate to a method for target speaker extraction. A target speaker extraction system receives an audio frame of an audio signal. A multi-speaker detection model analyzes the audio frame to determine whether the audio frame includes only a single speaker or multiple speakers. When the audio frame includes only a single speaker, the system inputs the audio frame to a target speaker VAD model to suppress speech in the audio frame from a non-target speaker based on comparing the audio frame to a voiceprint of a target speaker. When the audio frame includes multiple speakers, the system inputs the audio frame to a speech separation model to separate the voice of the target speaker from a voice mixture in the audio frame.
Type: Grant
Filed: October 31, 2021
Date of Patent: February 4, 2025
Assignee: Zoom Video Communications, Inc.
Inventors: Yuhui Chen, Qiyong Liu, Zhengwei Wei, Yangbin Zeng
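The per-frame routing the abstract describes can be sketched as follows. This is a toy illustration, not the patented implementation: the detection, VAD, and separation "models" are lookup stand-ins, and all function names are invented.

```python
# Hypothetical sketch of routing each frame by speaker count, then
# suppressing or separating relative to a target voiceprint.

def count_speakers(frame):
    # Stand-in for the multi-speaker detection model: each "frame" is a
    # dict that lists which speakers are active in it.
    return len(frame["speakers"])

def target_speaker_vad(frame, voiceprint):
    # Stand-in for the target-speaker VAD model: keep the frame only if
    # its single active speaker matches the target voiceprint.
    return frame["audio"] if frame["speakers"][0] == voiceprint else None

def speech_separation(frame, voiceprint):
    # Stand-in for the separation model: pull the target's component out
    # of the multi-speaker mixture.
    return frame["audio"] if voiceprint in frame["speakers"] else None

def extract_target(frames, voiceprint):
    out = []
    for frame in frames:
        if count_speakers(frame) <= 1:
            extracted = target_speaker_vad(frame, voiceprint)
        else:
            extracted = speech_separation(frame, voiceprint)
        if extracted is not None:
            out.append(extracted)
    return out

frames = [
    {"speakers": ["alice"], "audio": "a1"},
    {"speakers": ["bob"], "audio": "b1"},           # suppressed by VAD
    {"speakers": ["alice", "bob"], "audio": "m1"},  # routed to separation
]
print(extract_target(frames, "alice"))  # ['a1', 'm1']
```

The point of the two-branch routing is that single-speaker frames only need a cheap keep/suppress decision, while the heavier separation model runs only on genuinely mixed frames.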
-
Patent number: 12217745
Abstract: A system obtains a first training data set comprising labeled speech data, or both labeled and unlabeled data, corresponding to a high-resource data set, as well as latent speech representations based on the first training data set. The system trains a machine learning model on the first training data set to learn phonetically aware speech representations corresponding to the first training data set. The system applies the latent speech representations to a transformer context network to generate contextual representations. The system aligns each of the contextual representations with a phoneme label to generate phonetically-aware contextual representations. The system causes a refinement engine to further refine the machine learning model.
Type: Grant
Filed: July 3, 2023
Date of Patent: February 4, 2025
Assignee: Microsoft Technology Licensing, LLC
Inventors: Yao Qian, Yu Wu, Kenichi Kumatani, Shujie Liu, Furu Wei, Nanshan Zeng, Xuedong David Huang, Chengyi Wang
-
Patent number: 12205580
Abstract: Techniques for selecting a skill component to process a natural language input are described. When a natural language input is received, natural language understanding (NLU) output data representing the natural language input is generated, and skill components (capable of processing the NLU output data) are determined. Thereafter, rules (for preventing the invocation of skill components) are implemented in a tiered manner, resulting in the determination of a subset of the skill components. The subset of skill components is ranked using a machine learning model(s), and the top-ranked skill component is called to process the NLU output data.
Type: Grant
Filed: September 29, 2020
Date of Patent: January 21, 2025
Assignee: Amazon Technologies, Inc.
Inventors: Joe Pemberton, Michael Schwartz, Vijitha Raji, Archit Jain, Tara Raj, Alexander Go
-
Patent number: 12198718
Abstract: A method for determining synthetic speech includes receiving audio data characterizing speech in audio data obtained by a user device. The method also includes generating, using a trained self-supervised model, a plurality of audio feature vectors, each representative of audio features of a portion of the audio data. The method also includes generating, using a shallow discriminator model, a score indicating a presence of synthetic speech in the audio data based on the corresponding audio features of each audio feature vector of the plurality of audio feature vectors. The method also includes determining whether the score satisfies a synthetic speech detection threshold. When the score satisfies the synthetic speech detection threshold, the method includes determining that the speech in the audio data obtained by the user device comprises synthetic speech.
Type: Grant
Filed: August 9, 2023
Date of Patent: January 14, 2025
Assignee: Google LLC
Inventors: Joel Shor, Alanna Foster Slocum
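The scoring-and-threshold step can be illustrated with a small sketch. The "feature vectors" and "discriminator" here are deliberately trivial placeholders (weighted sums), not the trained self-supervised and shallow models the patent describes; only the pooling and threshold logic is the point.

```python
# Toy sketch: score each feature vector, pool to one utterance-level
# score, and compare against a synthetic-speech detection threshold.

def frame_scores(feature_vectors, weights):
    # Shallow-discriminator stand-in: a weighted sum per feature vector.
    return [sum(w * x for w, x in zip(weights, v)) for v in feature_vectors]

def is_synthetic(feature_vectors, weights, threshold=0.5):
    # Pool per-vector scores into a single score, then threshold it.
    scores = frame_scores(feature_vectors, weights)
    utterance_score = sum(scores) / len(scores)
    return utterance_score >= threshold, utterance_score

vectors = [[0.9, 0.8], [0.7, 0.6]]
flag, score = is_synthetic(vectors, weights=[0.5, 0.5])
print(flag, round(score, 2))  # True 0.75
```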
-
Patent number: 12182683
Abstract: A signal matching apparatus comprising at least one receiving unit adapted to receive a signal; at least one memory unit adapted to store predefined reference data; and at least one neural network configured to compare a signal profile of the received signal and/or signal parameters derived from the received signal with reference data stored in said memory unit to determine a similarity between the received signal and the predefined reference data.
Type: Grant
Filed: October 31, 2018
Date of Patent: December 31, 2024
Assignee: Rohde & Schwarz GmbH & Co. KG
Inventor: Baris Güzelarslan
-
Patent number: 12165644
Abstract: Systems and methods for media playback via a media playback system include capturing sound data via a network microphone device and identifying a candidate wake word in the sound data. Based on identification of the candidate wake word in the sound data, the system selects a first wake-word engine from a plurality of wake-word engines. Via the first wake-word engine, the system analyzes the sound data to detect a confirmed wake word, and, in response to detecting the confirmed wake word, transmits a voice utterance of the sound data to one or more remote computing devices associated with a voice assistant service.
Type: Grant
Filed: September 1, 2023
Date of Patent: December 10, 2024
Assignee: Sonos, Inc.
Inventors: Joachim Fainberg, Daniele Giacobello, Klaus Hartung
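The candidate/confirm flow in the abstract can be sketched as a two-stage gate: a cheap spotting pass selects one of several wake-word engines, and the utterance is transmitted only on confirmation. Engine names and the string-matching "models" are illustrative stand-ins.

```python
# Hypothetical two-stage wake-word flow: spot a candidate, select the
# matching engine, confirm, then transmit to the voice assistant service.

ENGINES = {
    "hey sonos": lambda sound: "hey sonos" in sound,
    "ok assistant": lambda sound: "ok assistant" in sound,
}

def process(sound, transmit):
    for candidate, engine in ENGINES.items():
        if candidate.split()[0] in sound:   # cheap candidate spotting
            if engine(sound):               # confirm with selected engine
                transmit(sound)
                return True
    return False

sent = []
process("hey sonos play jazz", sent.append)
process("hey there", sent.append)           # candidate, but not confirmed
print(sent)  # ['hey sonos play jazz']
```

The design rationale this mirrors: the always-on first stage must be cheap, so false candidates are acceptable; the selected engine does the expensive confirmation before anything leaves the device.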
-
Patent number: 12165628
Abstract: Techniques are disclosed that enable determining and/or utilizing a misrecognition of a spoken utterance, where the misrecognition is generated using an automatic speech recognition (ASR) model. Various implementations include determining a recognition based on the spoken utterance and a previous utterance spoken prior to the spoken utterance. Additionally or alternatively, implementations include personalizing an ASR engine for a user based on the spoken utterance and the previous utterance spoken prior to the spoken utterance (e.g., based on audio data capturing the previous utterance and a text representation of the spoken utterance).
Type: Grant
Filed: July 8, 2020
Date of Patent: December 10, 2024
Assignee: GOOGLE LLC
Inventors: Ágoston Weisz, Ignacio Lopez Moreno, Alexandru Dovlecel
-
Patent number: 12148417
Abstract: Devices and techniques are generally described for confidence score generation for label generation. In some examples, first data may be received from a first computing device. In various further examples, first label data classifying at least one aspect of the first data may be received. First metadata associated with how the first label data was generated may be received. In some cases, the first label data may be generated by a first user. In various examples, a first machine learning model may generate a first confidence score associated with the first label data based at least in part on the first data and second data related to label generation by the first user. In various examples, output data comprising the first confidence score may be sent to the first computing device.
Type: Grant
Filed: June 22, 2021
Date of Patent: November 19, 2024
Assignee: AMAZON TECHNOLOGIES, INC.
Inventors: Aidan Thomas Cardella, Anand Victor, Vipin Gupta, Zheng Du, John Rajiv Malik, Li Erran Li, Jarrett Alegre Bato, Peng Yang, Alejandro Ricardo Mottini D'Oliveira
-
Patent number: 12118304
Abstract: According to one embodiment, a difference extraction device includes processing circuitry. The processing circuitry acquires a text in which an input notation string is described. The processing circuitry converts the input notation string into a pronunciation string. The processing circuitry executes a pronunciation string conversion process in which the pronunciation string is converted into an output notation string. The processing circuitry extracts a difference by comparing the input notation string and the output notation string with each other.
Type: Grant
Filed: August 31, 2021
Date of Patent: October 15, 2024
Assignee: KABUSHIKI KAISHA TOSHIBA
Inventors: Daiki Tanaka, Takehiko Kagoshima, Kenji Iwata, Hiroshi Fujimura
-
Patent number: 12112757
Abstract: A voice identity feature extractor training method includes extracting a voice feature vector of training voice. The method may include determining a corresponding I-vector according to the voice feature vector of the training voice. The method may include adjusting a weight of a neural network model by using the I-vector as a first target output of the neural network model, to obtain a first neural network model. The method may include obtaining a voice feature vector of target detecting voice and determining an output result of the first neural network model for the voice feature vector of the target detecting voice. The method may include determining an I-vector latent variable. The method may include estimating a posterior mean of the I-vector latent variable, and adjusting a weight of the first neural network model using the posterior mean as a second target output, to obtain a voice identity feature extractor.
Type: Grant
Filed: April 14, 2022
Date of Patent: October 8, 2024
Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Inventors: Na Li, Jun Wang
-
Patent number: 12027152
Abstract: Techniques and apparatuses for recognizing accented speech are described. In some embodiments, an accent module recognizes accented speech using an accent library based on device data, uses different speech recognition correction levels based on an application field into which recognized words are set to be provided, or updates an accent library based on corrections made to incorrectly recognized speech.
Type: Grant
Filed: April 14, 2023
Date of Patent: July 2, 2024
Assignee: Google Technology Holdings LLC
Inventor: Kristin A. Gray
-
Patent number: 12014146
Abstract: The present disclosure relates to techniques for identifying out-of-domain utterances.
Type: Grant
Filed: August 2, 2023
Date of Patent: June 18, 2024
Assignee: Oracle International Corporation
Inventors: Thanh Long Duong, Mark Edward Johnson, Vishal Vishnoi, Crystal C. Pan, Vladislav Blinov, Cong Duy Vu Hoang, Elias Luqman Jalaluddin, Duy Vu, Balakota Srinivas Vinnakota
-
Patent number: 11977849
Abstract: The disclosure relates to a system and method for providing Artificial Intelligence (AI) based automated conversation assistance. The method includes analyzing, using a content analyzing model, at least one conversation stream captured during a real-time conversation between a plurality of users. The method includes identifying an assistance requirement of at least one first user of the plurality of users and at least one primary context associated with the at least one conversation stream using the content analyzing model. Further, the method includes identifying at least one intelligent assistive model based on the identified assistance requirement using an AI model. Using the at least one intelligent assistive model, the method generates at least one assistive conversation stream. Contemporaneous to the at least one conversation stream being captured, the method renders the at least one assistive conversation stream to the at least one first user in real-time.
Type: Grant
Filed: April 26, 2021
Date of Patent: May 7, 2024
Inventor: Rajiv Trehan
-
Patent number: 11966562
Abstract: An approach for automatically generating the Natural Language Interface (NLI) directly from the Graphical User Interface (GUI) code is disclosed. The approach leverages a mapping between GUI components and pre-defined NLI components in order to generate the necessary NLI components (e.g., intent examples, entities, etc.) from the GUI code representation. The approach can leverage pre-defined patterns in order to generate these intent examples for each kind of NLI component. The created NLI dialog can be used simultaneously with the GUI or as a standalone feature.
Type: Grant
Filed: March 11, 2021
Date of Patent: April 23, 2024
Assignee: International Business Machines Corporation
Inventors: Offer Akrabi, Erez Lev Meir Bilgory, Sami Sobhe Marreed, Alessandro Donatelli, Asaf Adi, Nir Mashkif
-
Patent number: 11887605
Abstract: A method including searching, on the basis of a voiceprint feature of a speaker, for an identifier of the speaker in a speaker registry, the voiceprint feature of the speaker being a parameter obtained according to a voice signal of the speaker captured by a microphone array; if position information corresponding to the identifier of the speaker in the speaker registry is different from position information of the speaker, updating the speaker registry, the position information of the speaker being a parameter obtained according to the voice signal of the speaker captured by the microphone array; and labeling the voice signal of the speaker with the identifier of the speaker, so as to track the speaker. The present disclosure enables voice tracking of multiple persons.
Type: Grant
Filed: February 26, 2021
Date of Patent: January 30, 2024
Assignee: Alibaba Group Holding Limited
Inventors: Gang Liu, Yunfeng Xu, Tao Yu, Zhang Liu
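The registry logic in the abstract reduces to a lookup-then-update loop, sketched below. The voiceprints are plain string keys and positions are tuples; a real system would match voiceprints by similarity and derive positions from microphone-array localization.

```python
# Minimal sketch: look up a speaker by voiceprint, refresh the stored
# position if the speaker has moved, and return the identifier used to
# label (and thereby track) the voice signal. Data structures invented.

registry = {}       # voiceprint -> {"id": ..., "position": ...}
next_id = [1]

def track(voiceprint, position):
    entry = registry.get(voiceprint)
    if entry is None:                       # unknown speaker: register
        entry = {"id": f"spk{next_id[0]}", "position": position}
        next_id[0] += 1
        registry[voiceprint] = entry
    elif entry["position"] != position:     # speaker moved: update registry
        entry["position"] = position
    return entry["id"]                      # label for this voice signal

print(track("vp-A", (0, 1)))          # spk1
print(track("vp-B", (2, 0)))          # spk2
print(track("vp-A", (1, 1)))          # spk1 (position updated)
print(registry["vp-A"]["position"])   # (1, 1)
```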
-
Patent number: 11881208
Abstract: A system and method for generating disambiguated terms in automatically generated transcripts, and employing the system, are disclosed. Exemplary implementations may: obtain a set of transcripts representing various speech from users; obtain indications of correlated correct and incorrect transcripts of spoken terms; use a vector generation model to generate vectors for individual instances of the correctly transcribed terms and individual instances of the incorrectly transcribed terms based on text and contexts of the individual transcribed terms; and train the vector generation model to reduce spatial separation of the vectors generated for the spoken terms in the correlated correct transcripts and the incorrect transcripts.
Type: Grant
Filed: March 22, 2023
Date of Patent: January 23, 2024
Assignee: Suki AI, Inc.
Inventor: Ahmad Badary
-
Patent number: 11874876
Abstract: An electronic device for predicting an intention of a user is configured to predict at least one first intention of the user based on context information associated with the user. The device is also configured to determine a question based on the at least one first intention of the user. The device is further configured to provide the question to the user. The device is additionally configured to receive a response to the question from the user. The device is also configured to predict at least one second intention of the user based on the at least one first intention of the user and the response to the question from the user.
Type: Grant
Filed: August 31, 2020
Date of Patent: January 16, 2024
Assignee: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Karoliina Taru Katriina Salminen
-
Patent number: 11842735
Abstract: An electronic apparatus and a control method thereof are provided. A method of controlling an electronic apparatus according to an embodiment of the disclosure includes: receiving input of a first utterance, identifying a first task for the first utterance based on the first utterance, providing a response to the first task based on a predetermined response pattern, receiving input of a second utterance, identifying a second task for the second utterance based on the second utterance, determining the degree of association between the first task and the second task, and setting a response pattern for the first task based on the second task based on the determined degree of association satisfying a predetermined condition. The control method of an electronic apparatus may use an artificial intelligence model trained according to at least one of machine learning, a neural network, or a deep learning algorithm.
Type: Grant
Filed: May 31, 2022
Date of Patent: December 12, 2023
Assignee: Samsung Electronics Co., Ltd.
Inventors: Yeonho Lee, Kyenghun Lee, Saebom Jang, Silas Jeon
-
Patent number: 11805378
Abstract: Systems, apparatuses, and methods are described for a privacy blocking device configured to prevent receipt, by a listening device, of video and/or audio data until a trigger occurs. A blocker may be configured to prevent receipt of video and/or audio data by one or more microphones and/or one or more cameras of a listening device. The blocker may use the one or more microphones, the one or more cameras, and/or one or more second microphones and/or one or more second cameras to monitor for a trigger. The blocker may process the data. Upon detecting the trigger, the blocker may transmit data to the listening device. For example, the blocker may transmit all or a part of a spoken phrase to the listening device.
Type: Grant
Filed: October 29, 2020
Date of Patent: October 31, 2023
Inventor: Thomas Stachura
-
Patent number: 11790900
Abstract: A system for audio-visual multi-speaker speech separation. The system includes a processing circuitry and a memory containing instructions that, when executed by the processing circuitry, configure the system to: receive audio signals captured by at least one microphone; receive video signals captured by at least one camera; and apply audio-visual separation on the received audio signals and video signals to provide isolation of sounds from individual sources, wherein the audio-visual separation is based, in part, on angle positions of at least one speaker relative to the at least one camera. The system provides for reliable speech processing and separation in noisy environments and environments with multiple users.
Type: Grant
Filed: April 6, 2020
Date of Patent: October 17, 2023
Assignee: HI AUTO LTD.
Inventors: Yaniv Shaked, Yoav Ramon, Eyal Shapira, Roy Baharav
-
Patent number: 11790611
Abstract: A computer-implemented method, comprising, by an artificial-reality (AR) design tool: receiving, through a user interface (UI) of the AR design tool, instructions to add a voice-command module to an AR effect, the voice-command module having an intent type and at least one slot, the slot associated with one or more entities; establishing, according to instructions received through the UI, a logical connection between the slot and a logic module configured to generate the AR effect depending on a runtime value associated with the slot; and generating, for the AR effect, an executable program configured to: determine that a detected utterance corresponds to the intent type and includes one or more words associated with the slot; select, based on the one or more words, one of the one or more entities as the runtime value for the slot; and send the runtime value to the logic module according to the logical connection.
Type: Grant
Filed: December 30, 2020
Date of Patent: October 17, 2023
Assignee: Meta Platforms, Inc.
Inventors: Stef Marc Smet, Hannes Luc Herman Verlinde, Michael Slater, Benjamin Patrick Blackburne, Ram Kumar Hariharan, Chunjie Jia, Prakarn Nisarat
-
Patent number: 11776550
Abstract: A device includes one or more processors configured to receive an audio data sample and to provide the audio data sample to a dynamic classifier. The dynamic classifier is configured to generate a classification output corresponding to the audio data sample. The one or more processors are further configured to selectively access a particular device based on the classification output.
Type: Grant
Filed: March 9, 2021
Date of Patent: October 3, 2023
Assignee: QUALCOMM Incorporated
Inventor: Taher Shahbazi Mirzahasanloo
-
Patent number: 11769486
Abstract: A method, computer program product, and computing system for defining a model representative of a plurality of acoustic variations to a speech signal, thus defining a plurality of time-varying spectral modifications. The plurality of time-varying spectral modifications may be applied to a plurality of feature coefficients of a target domain of a reference signal, thus generating a plurality of time-varying spectrally-augmented feature coefficients of the reference signal.
Type: Grant
Filed: February 18, 2021
Date of Patent: September 26, 2023
Assignee: Nuance Communications, Inc.
Inventors: Patrick A. Naylor, Dushyant Sharma, Uwe Helmut Jost, William F. Ganong, III
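The core operation, applying time-varying spectral modifications to feature coefficients, can be sketched as multiplying a frames-by-coefficients matrix by per-coefficient gain trajectories. This is a hedged illustration of the arithmetic only; the patent's model of acoustic variations and its choice of filter shapes are not disclosed here.

```python
# Hypothetical sketch: per-frame, per-coefficient gains stand in for the
# "time-varying spectral modifications" applied to feature coefficients.

def spectrally_augment(features, gain_trajectories):
    # features: list of frames, each a list of coefficients.
    # gain_trajectories: one gain-per-frame trajectory per coefficient.
    augmented = []
    for t, frame in enumerate(features):
        augmented.append([c * gain_trajectories[k][t]
                          for k, c in enumerate(frame)])
    return augmented

features = [[1.0, 2.0], [3.0, 4.0]]         # 2 frames x 2 coefficients
gains = [[1.0, 0.5], [2.0, 1.0]]            # trajectory per coefficient
print(spectrally_augment(features, gains))  # [[1.0, 4.0], [1.5, 4.0]]
```

Because the gains vary across frames, the augmentation simulates spectral conditions that change over time, rather than applying one static filter to the whole utterance.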
-
Patent number: 11755930
Abstract: A method and apparatus for controlling learning of a model for estimating an intention of an input utterance are disclosed. A method of controlling learning of a model for estimating an intention of an input utterance among a plurality of intentions includes providing a first index corresponding to the number of registered utterances for each intention, providing a second index corresponding to a learning level for each intention, providing a learning target setting interface such that at least one intention that is to be a learning target is selected from among the intentions based on the first index and the second index, and training the model based on the registered utterances for each intention and setting of the learning target for each intention.
Type: Grant
Filed: May 13, 2020
Date of Patent: September 12, 2023
Assignee: KAKAO CORP.
Inventors: Seung Won Seo, Tae Uk Kim, Il Nam Park, Myeong Cheol Shin, Hye Ryeon Lee, Sung Eun Choi
-
Patent number: 11749267
Abstract: A method for adapting hotword recognition includes receiving audio data characterizing a hotword event detected by a first stage hotword detector in streaming audio captured by a user device. The method also includes processing, using a second stage hotword detector, the audio data to determine whether a hotword is detected by the second stage hotword detector in a first segment of the audio data. When the hotword is not detected by the second stage hotword detector, the method includes classifying the first segment of the audio data as containing a negative hotword that caused a false detection of the hotword event in the streaming audio by the first stage hotword detector. Based on the first segment of the audio data classified as containing the negative hotword, the method includes updating the first stage hotword detector to prevent triggering the hotword event in subsequent audio data that contains the negative hotword.
Type: Grant
Filed: November 20, 2020
Date of Patent: September 5, 2023
Assignee: Google LLC
Inventors: Aleksandar Kracun, Matthew Sharifi
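The adaptation loop in the abstract can be sketched as follows: when the stricter second stage rejects a segment the first stage accepted, the segment is recorded as a negative hotword so the first stage stops triggering on it. Both "detectors" are string-matching stand-ins, not the patent's models.

```python
# Hypothetical sketch of two-stage hotword detection with negative-
# example feedback into the first stage.

class FirstStageDetector:
    def __init__(self, hotword):
        self.hotword = hotword
        self.negatives = set()      # segments known to cause false triggers

    def detect(self, segment):
        return self.hotword in segment and segment not in self.negatives

    def add_negative(self, segment):
        self.negatives.add(segment)

def second_stage(segment, hotword="hey device"):
    return segment.strip() == hotword   # stricter confirmation

first = FirstStageDetector("hey device")
segment = "hey devices on sale"         # triggers stage 1, fails stage 2
if first.detect(segment) and not second_stage(segment):
    first.add_negative(segment)         # classify as negative hotword

print(first.detect(segment))        # False (no longer triggers)
print(first.detect("hey device"))   # True
```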
-
Patent number: 11735171
Abstract: Systems and methods are provided for training a machine learning model to learn speech representations. Labeled speech data, or both labeled and unlabeled data sets, are applied to a feature extractor of a machine learning model to generate latent speech representations. The latent speech representations are applied to a quantizer to generate quantized latent speech representations and to a transformer context network to generate contextual representations. Each contextual representation included in the contextual representations is aligned with a phoneme label to generate phonetically-aware contextual representations. Quantized latent representations are aligned with phoneme labels to generate phonetically aware latent speech representations.
Type: Grant
Filed: May 14, 2021
Date of Patent: August 22, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Yao Qian, Yu Wu, Kenichi Kumatani, Shujie Liu, Furu Wei, Nanshan Zeng, Xuedong David Huang, Chengyi Wang
-
Patent number: 11727923
Abstract: A method for conducting a conversation between a user and a virtual agent is disclosed. The method includes receiving, by an ASR sub-system, a plurality of utterances from the user, and converting, by the ASR sub-system, each utterance of the plurality of utterances into a text message. The method further includes determining, by an NLU sub-system, an intent, at least one entity associated with the intent, or a combination thereof from the text message.
Type: Grant
Filed: November 24, 2020
Date of Patent: August 15, 2023
Assignee: Coinbase, Inc.
Inventors: Arjun Kumeresh Maheswaran, Akhilesh Sudhakar, Bhargav Upadhyay
-
Patent number: 11721358
Abstract: A device for calculating cardiovascular heartbeat information is configured to receive an electronic audio signal with information representative of a human voice signal in the time-domain, the human voice signal comprising a vowel audio sound of a certain duration and a fundamental frequency; generate a power spectral profile of a section of the electronic audio signal, and detect the fundamental frequency (F0) in the generated power spectral profile; filter the received audio signal within a band around at least the detected fundamental frequency (F0), thereby generating a denoised audio signal; generate a time-domain intermediate signal that captures frequency, amplitude and/or phase of the denoised audio signal; and detect and calculate heartbeat information within a human cardiac band in the intermediate signal.
Type: Grant
Filed: June 17, 2020
Date of Patent: August 8, 2023
Assignee: Stichting IMEC Nederland
Inventors: Carlos Agell, Evelien Hermeling, Vojkan Mihajlovic
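The first step of the pipeline, detecting F0 from a power spectral profile, can be shown numerically with a naive DFT power scan. This is a rough stand-in (a synthetic sine instead of a vowel, a 2 Hz frequency scan instead of a proper spectrum); the cardiac-band demodulation that follows in the patent is omitted.

```python
# Sketch: scan the plausible F0 range and return the frequency with the
# highest DFT power, i.e. the peak of the power spectral profile.

import math

def detect_f0(signal, sample_rate, f_min=50, f_max=400, step=2):
    n = len(signal)
    best_f, best_power = f_min, 0.0
    for f in range(f_min, f_max + 1, step):
        re = sum(s * math.cos(2 * math.pi * f * i / sample_rate)
                 for i, s in enumerate(signal))
        im = sum(s * math.sin(2 * math.pi * f * i / sample_rate)
                 for i, s in enumerate(signal))
        power = re * re + im * im
        if power > best_power:
            best_f, best_power = f, power
    return best_f

sr = 2000
# One second of a 120 Hz tone as a crude stand-in for a sustained vowel.
tone = [math.sin(2 * math.pi * 120 * i / sr) for i in range(sr)]
print(detect_f0(tone, sr))  # 120
```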
-
Patent number: 11714960
Abstract: A syntactic analysis apparatus according to an embodiment of the present disclosure may include an input device receiving a phrase uttered from a user, and a learning device performing at least one or more of extension of an intent output layer for classifying an utterance intent of the user from the uttered phrase and extension of a slot output layer for classifying a slot including information of the phrase, and extending a pre-generated utterance syntactic analysis model, such that the uttered phrase is classified into the extended intent output layer and the extended slot output layer, thereby broadly classifying an intent and a slot for the phrase uttered from a user.
Type: Grant
Filed: June 15, 2020
Date of Patent: August 1, 2023
Assignees: HYUNDAI MOTOR COMPANY, KIA MOTORS CORPORATION, HYUNDAI AUTOEVER CORP., SEOUL NATIONAL UNIVERSITY R&DB FOUNDATION
Inventors: Sung Soo Park, Chang Woo Chun, Chan Ill Park, Su Hyun Park, Jung Kuk Lee, Hyun Tae Kim, Sang goo Lee, Kang Min Yoo, You Hyun Shin, Ji Hun Choi, Sang Hwan Bae
-
Patent number: 11705106
Abstract: Processor(s) of a client device can: identify a textual segment stored locally at the client device; process the textual segment, using a speech synthesis model stored locally at the client device, to generate synthesized speech audio data that includes synthesized speech of the identified textual segment; process the synthesized speech, using an on-device speech recognition model that is stored locally at the client device, to generate predicted output; and generate a gradient based on comparing the predicted output to ground truth output that corresponds to the textual segment. In some implementations, the generated gradient is used, by processor(s) of the client device, to update weights of the on-device speech recognition model. In some implementations, the generated gradient is additionally or alternatively transmitted to a remote system for use in remote updating of global weights of a global speech recognition model.
Type: Grant
Filed: September 20, 2021
Date of Patent: July 18, 2023
Assignee: GOOGLE LLC
Inventors: Françoise Beaufays, Johan Schalkwyk, Khe Chai Sim
-
Patent number: 11694681
Abstract: Artificial assistant system notification techniques are described that overcome the challenges of conventional search techniques. In one example, a user profile is generated to describe aspects of products or services learned through natural language conversations between a user and an artificial assistant system. These aspects may include price as well as non-price aspects such as color, texture, material, and so forth. To learn the aspects, the artificial assistant system may leverage spoken utterances and text initiated by the user as well as learn the aspects from digital images output as part of the conversation. Once generated, the user profile is then usable by the artificial assistant system to assist in subsequent searches.
Type: Grant
Filed: January 7, 2019
Date of Patent: July 4, 2023
Assignee: eBay Inc.
Inventors: Farah Abdallah, Joshua Benjamin Tanner, Jessica Erin Bullock, Joel Joseph Chengottusseriyil, Jeff Steven White
-
Patent number: 11696364
Abstract: Disclosed embodiments include a network device having a split network stack that includes a physical (PHY) layer associated with first and second media access control (MAC) protocol sublayers, a processing device, and memory storing instructions that, when executed by the processing device, cause the processing device to select a route through the split network stack that includes one of the first and second MAC protocol sublayers but not the other one of the first and second MAC protocol sublayers.
Type: Grant
Filed: May 18, 2021
Date of Patent: July 4, 2023
Assignee: TEXAS INSTRUMENTS INCORPORATED
Inventors: Xiaolin Lu, Robert Liang, Mehul Soman, Kumaran Vijayasankar, Ramanuja Vedantham
-
Patent number: 11681923
Abstract: Intent determination based on one or more multi-model structures can include generating an output from each of a plurality of domain-specific models in response to a received input. The domain-specific models can comprise simultaneously trained machine learning models that are trained using a corresponding local loss metric for each domain-specific model and a global loss metric for the plurality of domain-specific models. The presence or absence of an intent corresponding to one or more domain-specific models can be determined by classifying the output of each domain-specific model.
Type: Grant
Filed: December 27, 2019
Date of Patent: June 20, 2023
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Yu Wang, Yilin Shen, Yue Deng, Hongxia Jin
-
Patent number: 11682416
Abstract: Providing contextual help in an interactive voice system includes receiving a plurality of user interaction events during a user interaction window, wherein each of the user interaction events comprises one of a low quality voice transcription event from a speech-to-text (STT) service or a no-intent matching event from a natural language processing (NLP) service, and receiving a respective transcription confidence score from the STT service for each of the plurality of user interaction events. For a one of the plurality of user interaction events, a determination is made of how to respond to a user providing the user interaction events based on how many events comprise the plurality of events and the transcription confidence score for the one event; and then instructions are provided to cause the determined response to be presented to the user in accordance with the determination of how to respond.
Type: Grant
Filed: August 3, 2018
Date of Patent: June 20, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Igor Ramos, Marc Dickenson
-
Patent number: 11663411
Abstract: A method for expanding an initial ontology via processing of communication data, wherein the initial ontology is a structural representation of language elements comprising a set of entities, a set of terms, a set of term-entity associations, a set of entity-association rules, a set of abstract relations, and a set of relation instances. A method for extracting a set of significant phrases and a set of significant phrase co-occurrences from an input set of documents further includes utilizing the terms to identify relations within the training set of communication data, wherein a relation is a pair of terms that appear in proximity to one another.
Type: Grant
Filed: April 8, 2021
Date of Patent: May 30, 2023
Assignee: Verint Systems Ltd.
Inventors: Daniel Mark Baum, Uri Segal, Ron Wein, Oana Sidi
-
Patent number: 11646011
Abstract: Methods and systems for training and/or using a language selection model for use in determining a particular language of a spoken utterance captured in audio data. Features of the audio data can be processed using the trained language selection model to generate a predicted probability for each of N different languages, and a particular language selected based on the generated probabilities. Speech recognition results for the particular language can be utilized responsive to selecting the particular language of the spoken utterance. Many implementations are directed to training the language selection model utilizing tuple losses in lieu of traditional cross-entropy losses. Training the language selection model utilizing the tuple losses can result in more efficient training and/or can result in a more accurate and/or robust model, thereby mitigating erroneous language selections for spoken utterances.
Type: Grant
Filed: June 22, 2022
Date of Patent: May 9, 2023
Assignee: GOOGLE LLC
Inventors: Li Wan, Yang Yu, Prashant Sridhar, Ignacio Lopez Moreno, Quan Wang
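The selection step the abstract describes (a probability per language, then a pick) can be sketched as below. The language codes and logit values are assumptions for illustration; the tuple-loss training itself is not shown.

```python
import numpy as np

def select_language(logits, languages):
    # Convert the language selection model's scores into a
    # probability for each of the N candidate languages.
    exp = np.exp(logits - logits.max())
    probs = exp / exp.sum()
    best = int(np.argmax(probs))
    return languages[best], float(probs[best])

languages = ["en-US", "es-ES", "de-DE"]
logits = np.array([2.1, 0.3, -1.0])  # hypothetical model outputs for one utterance
lang, prob = select_language(logits, languages)
# Speech recognition results for `lang` would then be used downstream.
```

The per-language probabilities also make it easy to apply a confidence threshold before committing to one recognizer's output.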
-
Patent number: 11630958
Abstract: The disclosure herein describes determining topics of communication transcripts using trained summarization models. A first communication transcript associated with a first communication is obtained and divided into a first set of communication segments. A first set of topic descriptions is generated based on the first set of communication segments by analyzing each communication segment of the first set of communication segments with a generative language model. A summarization model is trained using the first set of communication segments and associated first set of topic descriptions as training data. The trained summarization model is then applied to a second communication transcript and, based on applying the trained summarization model to the second communication transcript, a second set of topic descriptions of the second communication transcript is generated.
Type: Grant
Filed: June 2, 2021
Date of Patent: April 18, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Royi Ronen, Yarin Kuper, Tomer Rosenthal, Abedelkader Asi, Erez Altus, Rona Shaanan
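The first pipeline step, dividing a transcript into segments, can be sketched as a fixed-size split. The transcript content and segment size are invented for the example; the patent does not specify how segmentation is performed.

```python
def segment_transcript(turns, segment_size=3):
    """Divide a transcript (a list of utterances) into consecutive segments."""
    return [turns[i:i + segment_size] for i in range(0, len(turns), segment_size)]

transcript = [
    "Agent: Thanks for calling, how can I help?",
    "Caller: I want to update my billing address.",
    "Agent: Sure, let me pull up your account.",
    "Caller: Also, can you check my last invoice?",
]
segments = segment_transcript(transcript, segment_size=2)
# Each segment would then be passed to a generative language model to
# produce a topic description (e.g. "billing address update"), and the
# (segment, description) pairs become training data for the summarizer.
```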
-
Patent number: 11625467
Abstract: A computerized method for voice authentication of a customer in a self-service system is provided. A request for authentication of the customer is received and the customer is enrolled in the self-service system with a text-independent voice print. A passphrase from a plurality of passphrases to transmit to the customer is determined based on comparing each of the plurality of passphrases to a text-dependent or text-independent voice biometric model. The passphrase is transmitted to the customer, and when the customer responds, an audio stream of the passphrase is received. The customer is authenticated by comparing the audio stream of the passphrase against the text-independent voice print. If the customer is authenticated, then the audio stream of the passphrase and the topic of the passphrase may be stored.
Type: Grant
Filed: May 25, 2021
Date of Patent: April 11, 2023
Assignee: Nice Ltd.
Inventors: Matan Keret, Amnon Buzaglo
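The comparison of the spoken passphrase against the enrolled voice print can be sketched with cosine similarity over speaker embeddings, a common scoring choice; the embeddings and threshold below are invented, and the patent does not specify this particular scoring function.

```python
import numpy as np

def authenticate(passphrase_embedding, enrolled_voice_print, threshold=0.75):
    """Score the spoken passphrase against the enrolled text-independent
    voice print via cosine similarity and apply an acceptance threshold."""
    a = np.asarray(passphrase_embedding)
    b = np.asarray(enrolled_voice_print)
    score = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return score >= threshold, score

# Hypothetical speaker embeddings (real systems derive these from audio,
# e.g. with i-vector or x-vector extractors).
enrolled = [0.9, 0.1, 0.4]
spoken = [0.85, 0.15, 0.38]
accepted, score = authenticate(spoken, enrolled)
```

On acceptance, the system described above would additionally store the passphrase audio and its topic for later use.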
-
Patent number: 11620987
Abstract: In some cases, one or more heuristics can be automatically generated using a small dataset of segments previously labeled by one or more domain experts. The generated one or more heuristics along with one or more patterns can be used to assign training labels to a large unlabeled dataset of segments. A subset of segments representing an occurrence of verbal harassment can be selected using the assigned training labels. Randomly selected segments can be used as examples indicative of a non-occurrence of verbal harassment. The selected subset of segments and the randomly selected segments can be used to train one or more machine learning models for verbal harassment detection.
Type: Grant
Filed: December 28, 2020
Date of Patent: April 4, 2023
Assignee: BEIJING DIDI INFINITY TECHNOLOGY AND DEVELOPMENT CO., LTD.
Inventors: Ying Lyu, Kun Han
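The weak-labeling step can be sketched as heuristics that either fire (positive label) or abstain, with randomly sampled unlabeled segments standing in for negatives. The lexicon and segments are toy assumptions; real heuristics would be generated from expert-labeled data.

```python
import random

HARASSMENT_TERMS = {"idiot", "stupid", "shut up"}  # toy heuristic lexicon

def heuristic_label(segment):
    """Weakly label a conversation segment: 1 if any heuristic pattern
    fires, None (abstain) otherwise."""
    text = segment.lower()
    return 1 if any(term in text for term in HARASSMENT_TERMS) else None

segments = [
    "you are such an idiot",
    "please take the next left turn",
    "shut up and drive",
    "thanks for the ride",
]
positives = [s for s in segments if heuristic_label(s) == 1]
# Randomly sampled unlabeled segments serve as the non-harassment class.
negatives = random.sample([s for s in segments if heuristic_label(s) is None], 1)
# positives + negatives would then form the training set for the detector.
```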
-
Patent number: 11615783
Abstract: A system and method for generating disambiguated terms in automatically generated transcriptions, including instructions within a knowledge domain and employing the system, are disclosed.
Type: Grant
Filed: December 21, 2021
Date of Patent: March 28, 2023
Assignee: Suki AI, Inc.
Inventor: Ahmad Badary
-
Patent number: 11604457
Abstract: The present invention discloses a smart counting method and system for manufacturing, specifically custom clothing or fabric manufacturing. The method and system use a camera to feed a processing unit real-time image data of a working platform where a worker takes an unfinished clothing or fabric item, processes it, and places the finished item in a finished pile. The processing unit automatically starts a new work order and counts the number of finished products in the work order using computer vision techniques.
Type: Grant
Filed: February 4, 2021
Date of Patent: March 14, 2023
Inventors: Tyler Compton, Bryce Beagle, Alexander Thiel, Xintian Li
-
Patent number: 11600262
Abstract: According to one embodiment, a recognition device includes storage and a processor. The storage is configured to store a first recognition model, a first data set, and tags, for each first recognition model. The processor is configured to acquire a second data set, execute recognition processing of the second recognition target data in the second data set by using the first recognition model, extract a significant tag of the tags stored in the storage in association with the first recognition model, based on the recognition processing result and the second correct data in the second data set, and create a second recognition model based on the acquired second data set and the first data set stored in the storage in association with the extracted tag.
Type: Grant
Filed: June 3, 2019
Date of Patent: March 7, 2023
Assignees: KABUSHIKI KAISHA TOSHIBA, TOSHIBA DIGITAL SOLUTIONS CORPORATION
Inventors: Koji Yasuda, Kenta Cho
-
Patent number: 11574132
Abstract: Methods, systems, and computer program products for unsupervised tunable stylized text transformations are provided herein. A computer-implemented method includes identifying amendable portions of input text by processing at least a portion of the input text using at least one neural network; determining stylistic text modifications to the amendable portions of the input text, the text modifications encompassing a set of stylistic parameters, wherein said determining comprises processing at least a portion of the set of stylistic parameters using at least one neural network; generating a stylized output set of text by transforming at least a portion of the input text, wherein said transforming comprises modifying at least one of the amendable portions of the input text via at least one of the stylistic text modifications encompassed by the set of stylistic parameters; and outputting the stylized output set of text to at least one user.
Type: Grant
Filed: December 23, 2020
Date of Patent: February 7, 2023
Assignee: International Business Machines Corporation
Inventors: Parag Jain, Amar P. Azad, Abhijit Mishra, Karthik Sankaranarayanan
-
Patent number: 11567953
Abstract: Systems and methods of returning location and/or event results using information mined from non-textual information are provided. Non-textual information is captured using a hardware component of a user device. Text-based social media content input on the user device is then retrieved. A location of the user device is determined using a global positioning system module in the user device. The non-textual information is converted to a machine-analyzable format, and the converted non-textual information is compared to a database of converted non-textual information samples to analyze and classify the converted non-textual information. The classification is sent to a server for storage in a database in a manner that ties the classification to the geographical location of the user device.
Type: Grant
Filed: November 18, 2016
Date of Patent: January 31, 2023
Assignee: eBay Inc.
Inventors: Jeremiah Joseph Akin, Jayasree Mekala, Praveen Nuthulapati, Joseph Vernon Paulson, IV, Kamal Zamer
-
Patent number: 11562738
Abstract: A system includes acquisition of a domain grammar, determination of an interpolated grammar based on the domain grammar and a base grammar, determination of a delta domain grammar based on an augmented first grammar and the interpolated grammar, determination of an out-of-vocabulary class based on the domain grammar and the base grammar, insertion of the out-of-vocabulary class into a composed transducer composed of the augmented first grammar and one or more other transducers to generate an updated composed transducer, composition of the delta domain grammar and the updated composed transducer, and application of the composition of the delta domain grammar and the updated composed transducer to an output of an acoustic model.
Type: Grant
Filed: October 28, 2019
Date of Patent: January 24, 2023
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Ziad Al Bawab, Anand U Desai, Shuangyu Chang, Amit K Agarwal, Zoltan Romocsa, Veljko Miljanic, Aadyot Bhatnagar, Hosam Khalil, Christopher Basoglu
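One common way to form an "interpolated grammar" from a domain grammar and a base grammar is linear interpolation of their probabilities, sketched below over a toy unigram vocabulary. This is an assumption about the interpolation step only; the transducer composition and delta-grammar machinery in the abstract are not modeled here.

```python
def interpolate(p_domain, p_base, lam=0.5):
    """Linearly interpolate a domain grammar with a base grammar
    over the union of their vocabularies."""
    vocab = set(p_domain) | set(p_base)
    return {w: lam * p_domain.get(w, 0.0) + (1 - lam) * p_base.get(w, 0.0)
            for w in vocab}

p_base = {"play": 0.5, "music": 0.5}
p_domain = {"play": 0.2, "skywalker": 0.8}  # domain-specific vocabulary
p_interp = interpolate(p_domain, p_base, lam=0.6)
# "skywalker" is out-of-vocabulary for the base grammar but receives
# probability mass in the interpolated grammar from the domain grammar,
# which is the role the out-of-vocabulary class plays in the abstract.
```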
-
Patent number: 11562736
Abstract: A speech recognition method includes segmenting captured voice information to obtain a plurality of voice segments, and extracting voiceprint information of the voice segments; matching the voiceprint information of the voice segments with first stored voiceprint information to determine a set of filtered voice segments having voiceprint information that successfully matches the first stored voiceprint information; combining the set of filtered voice segments to obtain combined voice information, and determining combined semantic information of the combined voice information; and using the combined semantic information as a speech recognition result when the combined semantic information satisfies a preset rule.
Type: Grant
Filed: April 29, 2021
Date of Patent: January 24, 2023
Assignee: TENCENT TECHNOLOGY (SHEN ZHEN) COMPANY LIMITED
Inventor: Qiusheng Wan
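The filter-then-combine step can be sketched as follows: keep only segments whose voiceprint matches the stored print, then join their transcripts. The two-dimensional voiceprints, cosine scoring, and threshold are all invented for the example.

```python
import numpy as np

def cosine(a, b):
    a, b = np.asarray(a), np.asarray(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def filter_and_combine(segments, stored_print, threshold=0.8):
    """Keep only the voice segments whose voiceprint matches the stored
    print, then concatenate their text into combined voice information."""
    kept = [s for s in segments if cosine(s["voiceprint"], stored_print) >= threshold]
    return " ".join(s["text"] for s in kept)

stored = [1.0, 0.0]
segments = [
    {"text": "turn on",    "voiceprint": [0.98, 0.05]},  # target speaker
    {"text": "hey look",   "voiceprint": [0.10, 0.99]},  # background speaker
    {"text": "the lights", "voiceprint": [0.97, 0.08]},  # target speaker
]
combined = filter_and_combine(segments, stored)
```

Filtering before semantic interpretation is what lets the method ignore background speakers when forming the final recognition result.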
-
Patent number: 11557301
Abstract: Systems, methods performed by data processing apparatus, and computer storage media encoded with computer programs for: receiving an utterance from a user in a multi-user environment, each user having an associated set of available resources; determining that the received utterance includes at least one predetermined word; comparing speaker identification features of the uttered predetermined word with speaker identification features of each of a plurality of previous utterances of the predetermined word, the plurality of previous utterances corresponding to different known users in the multi-user environment; attempting to identify the user associated with the uttered predetermined word as matching one of the known users in the multi-user environment; and, based on a result of the attempt to identify, selectively providing the user with access to one or more resources associated with the corresponding known user.
Type: Grant
Filed: August 30, 2019
Date of Patent: January 17, 2023
Assignee: Google LLC
Inventor: Matthew Sharifi
-
Patent number: 11551682
Abstract: An electronic device includes: a camera; a microphone; a display; a memory; and a processor configured to receive an input for activating an intelligent agent service from a user while at least one application is executed, identify context information of the electronic device, control to acquire image information of the user through the camera based on the identified context information, detect movement of the user's lips included in the acquired image information to recognize speech of the user, and perform a function corresponding to the recognized speech.
Type: Grant
Filed: December 13, 2019
Date of Patent: January 10, 2023
Assignee: Samsung Electronics Co., Ltd.
Inventors: Sunok Kim, Sungwoon Jang, Hyelim Woo