Subportions Patents (Class 704/254)
  • Patent number: 11281727
    Abstract: Embodiments for managing virtual assistants are described. Information associated with a user in an internet of things (IoT) device environment having a plurality of IoT devices is received. A request from the user is received. In response to the receiving of the request, a first portion of a response to the request is caused to be rendered utilizing a first of the plurality of IoT devices. Movement of the user within the IoT device environment is detected. In response to the detecting of the movement of the user, a second portion of the response to the request is caused to be rendered utilizing a second of the plurality of IoT devices based on said detected movement of the user and said received information about the user.
    Type: Grant
    Filed: July 3, 2019
    Date of Patent: March 22, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Zachary Silverstein, Robert Grant, Ruchika Bengani, Sarbajit Rakshit
  • Patent number: 11276390
    Abstract: An audio interval detection apparatus has a processor and a storage storing instructions that, when executed by the processor, control the processor to: detect, from a target audio signal, a specified audio interval containing a specified audio signal that represents the same consonant phoneme produced continuously for longer than a specified time; and, by eliminating at least the detected specified audio interval from the target audio signal, detect from the target audio signal an utterance audio interval that includes a speech utterance signal representing a speech utterance uttered by a speaker.
    Type: Grant
    Filed: March 13, 2019
    Date of Patent: March 15, 2022
    Assignee: CASIO COMPUTER CO., LTD.
    Inventor: Hiroki Tomita
  • Patent number: 11262909
    Abstract: A system, method and computer program product for use in providing a linguistic resource for input recognition of multiple input types on a computing device are provided. The computing device is connected to an input interface. A user is able to provide input by applying pressure to or gesturing above the input interface using a finger or an instrument such as a stylus or pen. The computing device has an input management system for recognizing the input. The input management system is configured to allow setting, in the computing device memory, parameters of a linguistic resource for a language model of one or more languages, and cause recognition of input to the input interface of the different input types using the linguistic resource. The resource parameters are set to optimize recognition performance characteristics of each input type while providing the linguistic resource with a pre-determined size.
    Type: Grant
    Filed: July 21, 2016
    Date of Patent: March 1, 2022
    Assignee: MYSCRIPT
    Inventors: Ali Reza Ebadat, Lois Rigouste
  • Patent number: 11238865
    Abstract: One embodiment provides a method, including: receiving, at an information handling device, audible user input; determining, subsequent to the receiving, an intonation with which the audible user input was provided; assigning, based on the determined intonation, an expressive meaning to the audible user input; and performing, based on the expressive meaning, a corresponding function. Other aspects are described and claimed.
    Type: Grant
    Filed: November 18, 2019
    Date of Patent: February 1, 2022
    Assignee: Lenovo (Singapore) Pte. Ltd.
    Inventors: John Weldon Nicholson, Ming Qian, Song Wang, Ryan Charles Knudson, Roderick Echols
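The intonation-to-meaning idea above can be sketched as follows. This is a hypothetical illustration, not the patent's method: the contour classifier, thresholds, and meaning table are all assumptions.

```python
# Hypothetical sketch: classify a pitch contour (Hz samples over an
# utterance) by its net pitch change, then map the label to an
# expressive meaning. Thresholds and labels are illustrative only.

def classify_intonation(pitch_hz, threshold=10.0):
    """Label a contour by net pitch change from start to end."""
    delta = pitch_hz[-1] - pitch_hz[0]
    if delta > threshold:
        return "rising"
    if delta < -threshold:
        return "falling"
    return "flat"

EXPRESSIVE_MEANING = {
    "rising": "question",    # e.g. route to a query handler
    "falling": "statement",  # e.g. treat as a command or assertion
    "flat": "neutral",
}

def assign_meaning(pitch_hz):
    return EXPRESSIVE_MEANING[classify_intonation(pitch_hz)]
```

A real system would derive the contour from pitch tracking rather than raw Hz lists, but the mapping step works the same way.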
  • Patent number: 11232787
    Abstract: When a portion of an original audio track is unsatisfactory, alternate portions are searched for using phonetic matching within phonetically indexed alternate files. The alternates may correspond to recordings captured in a different take from the original where timecode matching is unavailable. An editor may preview one or more candidate alternates and select a preferred one to be used to replace the original. A media editing application, such as a digital audio workstation, automatically aligns and matches the preferred alternative to the original using waveform matching and optional gain matching.
    Type: Grant
    Filed: February 13, 2020
    Date of Patent: January 25, 2022
    Assignee: AVID TECHNOLOGY, INC.
    Inventor: Christopher M. Winsor
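Phonetic matching against indexed alternate takes, as described above, can be illustrated with a toy ranker. `difflib` stands in here for a real phonetic matcher; the index layout and take names are assumptions.

```python
# Illustrative sketch: each alternate take is indexed as a phoneme
# sequence; candidates are ranked by similarity to the phonemes of the
# unsatisfactory original portion.
from difflib import SequenceMatcher

def rank_alternates(original_phonemes, indexed_alternates):
    """Return (take_id, score) pairs, best match first."""
    scored = [
        (take_id, SequenceMatcher(None, original_phonemes, phonemes).ratio())
        for take_id, phonemes in indexed_alternates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

index = {
    "take2": ["HH", "AH", "L", "OW"],   # "hello" in another take
    "take3": ["G", "UH", "D", "B", "AY"],
}
best = rank_alternates(["HH", "EH", "L", "OW"], index)[0][0]
```

The editor would then preview `best` before the waveform- and gain-matching replacement step.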
  • Patent number: 11222165
    Abstract: According to one or more embodiments of the present invention, an input request to a natural language processing (NLP) system is optimized. A window-size is selected for annotating an input corpus. The corpus is divided into partitions of the window-size, each partition processed separately. Further, a first set of entities is identified in a first partition, and a second set of entities in a second partition. Further, a third partition containing a first segment and a second segment is determined. The first segment overlaps the first partition, and the second segment overlaps the second partition. The method further includes identifying a third set of entities in the third partition. In response to the third set of entities being distinct from a set of entities from the first segment and the second segment, the window-size is adjusted. The input request for the NLP system is generated using the adjusted window-size.
    Type: Grant
    Filed: August 18, 2020
    Date of Patent: January 11, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Igor S. Ramos, Andrew J. Lavery, Scott Carrier, Paul Joseph Hake
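The overlapping-window check above can be sketched in a few lines. This is a minimal reading of the idea with assumed details: window-size partitions plus "bridge" partitions straddling each boundary, where a bridge yielding unseen entities would signal that the window size needs adjusting.

```python
# Sketch: split a token sequence into window-size chunks, plus chunks
# that straddle each chunk boundary (half a window on either side).

def partitions_with_bridges(tokens, window):
    """Return (window-size chunks, boundary-straddling chunks)."""
    chunks = [tokens[i:i + window] for i in range(0, len(tokens), window)]
    half = window // 2
    bridges = [
        tokens[i - half:i + half]
        for i in range(window, len(tokens), window)
    ]
    return chunks, bridges

chunks, bridges = partitions_with_bridges(list("abcdefgh"), 4)
```

Entities found only in a bridge (and in neither neighboring chunk) would indicate the annotation window split an entity, prompting a larger window.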
  • Patent number: 11217231
    Abstract: A method of biasing speech recognition includes receiving audio data encoding an utterance and obtaining a set of one or more biasing phrases corresponding to a context of the utterance. Each biasing phrase in the set of one or more biasing phrases includes one or more words. The method also includes processing, using a speech recognition model, acoustic features derived from the audio data and grapheme and phoneme data derived from the set of one or more biasing phrases to generate an output of the speech recognition model. The method also includes determining a transcription for the utterance based on the output of the speech recognition model.
    Type: Grant
    Filed: April 30, 2020
    Date of Patent: January 4, 2022
    Assignee: Google LLC
    Inventors: Rohit Prakash Prabhavalkar, Golan Pundak, Tara N. Sainath, Antoine Jean Bruguier
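The patent injects grapheme and phoneme data from the biasing phrases into the recognition model itself; a much simpler shallow-fusion-style rescoring boost, sketched below, conveys only the general idea of contextual biasing. Scores, phrases, and the boost value are made up.

```python
# Simplified sketch of contextual biasing at the rescoring level:
# hypotheses containing a context phrase get a score bonus.

def bias_rescore(hypotheses, biasing_phrases, boost=2.0):
    """hypotheses: list of (text, score). Return the best after boosting."""
    def biased(text, score):
        bonus = sum(boost for phrase in biasing_phrases if phrase in text)
        return score + bonus
    return max(hypotheses, key=lambda h: biased(*h))

best = bias_rescore(
    [("call joan", 0.9), ("call john", 0.8)],
    biasing_phrases=["john"],  # e.g. drawn from the user's contact list
)
```

Here the contextually likely hypothesis wins despite a lower acoustic score.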
  • Patent number: 11205434
    Abstract: An audio encoder for providing an encoded audio information on the basis of an input audio information has a bandwidth extension information provider configured to provide bandwidth extension information using a variable temporal resolution and a detector configured to detect an onset of a fricative or affricate. The audio encoder is configured to adjust a temporal resolution used by the bandwidth extension information provider such that bandwidth extension information is provided with an increased temporal resolution at least for a predetermined period of time before a time at which an onset of a fricative or affricate is detected and for a predetermined period of time following the time at which the onset of the fricative or affricate is detected. Alternatively or in addition, the bandwidth extension information is provided with an increased temporal resolution in response to a detection of an offset of a fricative or affricate. Audio encoders and methods use a corresponding concept.
    Type: Grant
    Filed: August 12, 2019
    Date of Patent: December 21, 2021
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Sascha Disch, Christian Helmrich, Markus Multrus, Markus Schnell, Arthur Tritthart
  • Patent number: 11183171
    Abstract: The present disclosure relates to a method and system for robust and efficient language identification. The method includes: receiving the speech signal; partitioning the speech signal into a plurality of audio frames; extracting features of the plurality of audio frames; determining, using a neural network, a variable associated with the language identity and one or more auxiliary attributes of the speech signal, for each of the plurality of audio frames; determining scores of the plurality of audio frames based on the extracted features; and determining the language identity of the speech signal based on the variables and scores determined for the plurality of audio frames.
    Type: Grant
    Filed: July 30, 2019
    Date of Patent: November 23, 2021
    Assignee: BEIJING DIDI INFINITY TECHNOLOGY AND DEVELOPMENT CO., LTD.
    Inventor: Tianxiao Fu
  • Patent number: 11183168
    Abstract: A method, computer program, and computer system is provided for converting a first singing voice associated with a first speaker to a second singing voice associated with a second speaker. A context associated with one or more phonemes corresponding to the first singing voice is encoded, and the one or more phonemes are aligned to one or more target acoustic frames based on the encoded context. One or more mel-spectrogram features are recursively generated from the aligned phonemes and target acoustic frames, and a sample corresponding to the first singing voice is converted to a sample corresponding to the second singing voice using the generated mel-spectrogram features.
    Type: Grant
    Filed: February 13, 2020
    Date of Patent: November 23, 2021
    Assignee: TENCENT AMERICA LLC
    Inventors: Chengzhu Yu, Heng Lu, Chao Weng, Dong Yu
  • Patent number: 11164581
    Abstract: An artificial intelligence device includes a speaker, a microphone configured to receive a user's speech, and one or more controllers configured to extract an utterance feature of the received speech, determine a user type corresponding to the extracted utterance feature, map a speech agent associated with the determined user type, and output an audio response through the speaker using the mapped speech agent.
    Type: Grant
    Filed: August 12, 2019
    Date of Patent: November 2, 2021
    Assignee: LG ELECTRONICS INC.
    Inventors: Jonghoon Chae, Yongchul Park, Siyoung Yang, Juyeong Jang, Sungmin Han
  • Patent number: 11164578
    Abstract: A voice recognition apparatus capable of executing a function by a voice instruction, comprises an input unit configured to input a voice, and a control unit configured to enable the voice recognition apparatus to accept a voice instruction for executing a function, if a voice input by the input unit is an activation phrase. The voice recognition apparatus comprises a recognition unit configured to, if a voice input by the input unit after the control unit has enabled the voice recognition apparatus to accept the voice instruction contains the activation phrase, recognize the voice instruction by excluding the activation phrase from the voice.
    Type: Grant
    Filed: May 21, 2019
    Date of Patent: November 2, 2021
    Assignee: HONDA MOTOR CO., LTD.
    Inventor: Keigo Nakada
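The activation-phrase exclusion above is easy to sketch once recognition has produced text. The phrase, function names, and accepting flag are illustrative assumptions.

```python
# Sketch: once the device is accepting voice instructions, a repeated
# leading activation phrase is stripped before the command is recognized.

ACTIVATION = "ok car"

def extract_command(recognized_text, accepting):
    """Return the command text, dropping a redundant activation phrase."""
    if not accepting:
        # Only the activation phrase itself enables accepting commands.
        return None
    text = recognized_text.strip().lower()
    if text.startswith(ACTIVATION):
        text = text[len(ACTIVATION):].strip()
    return text

cmd = extract_command("OK car turn on the radio", accepting=True)
```

So "OK car turn on the radio" and "turn on the radio" resolve to the same instruction once the device is listening.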
  • Patent number: 11164568
    Abstract: A speech recognition method is provided. The method includes: obtaining a voice signal; processing the voice signal according to a speech recognition algorithm to obtain n candidate recognition results, the candidate recognition results including text information corresponding to the voice signal; identifying a target result from among the n candidate recognition results according to a selection rule selected from among m selection rules, the selection rule having an execution sequence of j, the target result being a candidate recognition result that has a highest matching degree with the voice signal in the n candidate recognition results, an initial value of j being 1; and identifying the target result from among the n candidate recognition results according to a selection rule having an execution sequence of j+1 based on the target result not being identified according to the selection rule having the execution sequence of j.
    Type: Grant
    Filed: August 21, 2019
    Date of Patent: November 2, 2021
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LTD.
    Inventors: Ping Zheng, Feng Rao, Li Lu, Tao Li
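The ordered rule cascade above (try the rule with execution sequence j, fall through to j+1 when no target is identified) can be sketched directly. The placeholder rules below are assumptions, not the patent's actual selection rules.

```python
# Sketch: selection rules are tried in execution-sequence order until
# one identifies a target among the n candidate recognition results.

def pick_target(candidates, rules):
    """Each rule returns a candidate or None; fall through in order."""
    for rule in rules:
        target = rule(candidates)
        if target is not None:
            return target
    return None

rules = [
    # j = 1: placeholder rule that matches nothing here
    lambda cs: next((c for c in cs if c.endswith("?")), None),
    # j = 2: fall-back rule preferring the longest candidate
    lambda cs: max(cs, key=len),
]
target = pick_target(["turn it up", "turn it u"], rules)
```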
  • Patent number: 11145315
    Abstract: An electronic device includes an audio capture device receiving audio input. The electronic device includes one or more processors, operable with the audio capture device, and configured to execute a control operation in response to a device command preceded by a trigger phrase identified in the audio input when in a first mode of operation. The one or more processors transition from the first mode of operation to a second mode of operation in response to detecting a predefined operating condition of the electronic device. In the second mode of operation, the one or more processors execute the control operation without requiring the trigger phrase to precede the device command.
    Type: Grant
    Filed: October 16, 2019
    Date of Patent: October 12, 2021
    Assignee: Motorola Mobility LLC
    Inventors: John Gorsica, Thomas Merrell
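The two-mode behavior above reduces to a small decision function. The trigger phrase and the notion of what counts as the predefined operating condition are assumptions for illustration.

```python
# Sketch: in the first mode a device command must be preceded by the
# trigger phrase; in the second mode (entered when a predefined
# condition is detected) the command alone suffices.

TRIGGER = "hello device"

def should_execute(audio_text, command, second_mode):
    text = audio_text.lower()
    if second_mode:
        return command in text
    return text.startswith(TRIGGER) and command in text

first = should_execute("hello device take a photo", "take a photo",
                       second_mode=False)
second = should_execute("take a photo", "take a photo", second_mode=True)
```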
  • Patent number: 11145290
    Abstract: Provided is a system including an electronic device to recognize and process a user's speech and a method of controlling speech recognition on an electronic device. According to an embodiment, an electronic device comprises a communication circuit, an input module, at least one processor, and a memory operatively connected with the at least one processor, the input module, and the communication circuit, wherein the memory stores instructions configured to enable the at least one processor to provide a function according to a first utterance of a user for wake-up, receive a second utterance of the user including a plurality of words with predesignated relevance through the input module while the function is provided, transmit information about the second utterance of the user to another electronic device via the communication circuit, and receive a response related to the second utterance of the user from the other electronic device according to the transmission and provide the received response.
    Type: Grant
    Filed: May 22, 2019
    Date of Patent: October 12, 2021
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Jooyoo Kim, Jaepil Kim, Seongmin Je
  • Patent number: 11138966
    Abstract: A method for generating an automatic speech recognition (ASR) model using unsupervised learning includes obtaining, by a device, text information. The method includes determining, by the device, a set of phoneme sequences associated with the text information. The method includes obtaining, by the device, speech waveform data. The method includes determining, by the device, a set of phoneme boundaries associated with the speech waveform data. The method includes generating, by the device, the ASR model using an output distribution matching (ODM) technique based on determining the set of phoneme sequences associated with the text information and based on determining the set of phoneme boundaries associated with the speech waveform data.
    Type: Grant
    Filed: February 7, 2019
    Date of Patent: October 5, 2021
    Assignee: TENCENT AMERICA LLC
    Inventors: Jianshu Chen, Chengzhu Yu, Dong Yu, Chih-Kuan Yeh
  • Patent number: 11133004
    Abstract: This disclosure describes techniques and systems for enabling accessory devices to output supplemental content that is complementary to content output by a primary user device in situations where a separate music-provider system provides the primary content to the primary device. The techniques may include attempting to identify the primary content using metadata provided by the music-provider system, retrieving existing audio feature data in response to identifying the primary content, and providing the audio feature data to the accessory device for use in outputting the supplemental content. If the primary content is unable to be identified using metadata, then the techniques may include instructing the primary device to generate audio feature data and/or instructing the primary device to generate a fingerprint of the primary content for use in identifying the primary content.
    Type: Grant
    Filed: March 27, 2019
    Date of Patent: September 28, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Apoorv Naik, Pete Klein, Qi Li
  • Patent number: 11132991
    Abstract: Disclosed are a response device determination method and a response device determination apparatus. The method includes receiving audio signals, corresponding to a wake-up voice uttered by a user, from a plurality of devices respectively; extracting a plurality of distance information indicative of distances between the user and the plurality of devices from the audio signals respectively; and determining a response device to respond to the wake-up voice using the extracted plurality of distance information, wherein the response device is determined based on at least one of first and second steps according to a predetermined condition, wherein the first step includes comparing the extracted plurality of distance information with each other and determining the response device based on the comparison result, wherein the second step includes applying the extracted plurality of distance information to a deep neural network (DNN) model to obtain an application result and determining the response device based on the application result.
    Type: Grant
    Filed: April 23, 2019
    Date of Patent: September 28, 2021
    Assignee: LG Electronics Inc.
    Inventors: Heewan Park, Donghoon Yi, Bongki Lee, Yuyong Jeon, Jaewoong Jeong
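The first step described above (direct comparison of per-device distance estimates) is a one-liner; the dictionary layout and device names below are assumptions. The second step would instead feed the same distances into a DNN.

```python
# Sketch: choose the responding device as the one whose extracted
# distance estimate to the user is smallest.

def choose_response_device(distances):
    """distances: {device_id: estimated distance to the user}."""
    return min(distances, key=distances.get)

device = choose_response_device({"tv": 3.2, "speaker": 1.1, "fridge": 4.8})
```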
  • Patent number: 11127407
    Abstract: Captured vocals may be automatically transformed using advanced digital signal processing techniques that provide captivating applications, and even purpose-built devices, in which mere novice user-musicians may generate, audibly render and share musical performances. In some cases, the automated transformations allow spoken vocals to be segmented, arranged, temporally aligned with a target rhythm, meter or accompanying backing tracks and pitch corrected in accord with a score or note sequence. Speech-to-song music applications are one such example. In some cases, spoken vocals may be transformed in accord with musical genres such as rap using automated segmentation and temporal alignment techniques, often without pitch correction. Such applications, which may employ different signal processing and different automated transformations, may nonetheless be understood as speech-to-rap variations on the theme.
    Type: Grant
    Filed: May 13, 2019
    Date of Patent: September 21, 2021
    Assignee: Smule, Inc.
    Inventors: Parag Chordia, Mark Godfrey, Alexander Rae, Prerna Gupta, Perry R. Cook
  • Patent number: 11120804
    Abstract: Implementations set forth herein relate to management of casting requests and user inputs at a rechargeable device, which provides access to an automated assistant and is capable of rendering data that is cast from a separate device. Casting requests can be handled by the rechargeable device despite a device SoC of the rechargeable device operating in a sleep mode. Furthermore, spoken utterances provided by a user for invoking the automated assistant can also be adaptively managed by the rechargeable device in order to mitigate idle power consumption by the device SoC. Such spoken utterances can be initially processed by a digital signal processor (DSP), and, based on one or more features (e.g., voice characteristic, conformity to a particular invocation phrase, etc.) of the spoken utterance, the device SoC can be initialized for an amount of time that is selected based on the features of the spoken utterance.
    Type: Grant
    Filed: April 1, 2019
    Date of Patent: September 14, 2021
    Assignee: GOOGLE LLC
    Inventors: Andrei Pascovici, Victor Lin, Jianghai Zhu, Paul Gyugyi, Shlomi Regev
  • Patent number: 11087746
    Abstract: To improve accuracy in detecting the presence or absence of a target object, a time-series segmentation unit 102 creates first time-series data by segmenting processing target data into frames spanning "n" time zones. Each of the first determination units 103 creates "m" second time-series data by determining each frame of the first time-series data using "m" models having different characteristics. A second determination unit 104 creates a second determination result as a presence probability of the target object for a set of second time-series data comprising n×m data.
    Type: Grant
    Filed: July 24, 2019
    Date of Patent: August 10, 2021
    Assignee: Rakuten, Inc.
    Inventors: Ali Cevahir, Stanley Kok
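A minimal sketch of the n×m determination above, under assumed details: the signal is split into n frames, each of m models labels every frame, and the presence probability is the fraction of positive labels in the resulting n×m grid. The toy models below are placeholders.

```python
# Sketch: n frames x m models of differing characteristics vote on the
# target object; presence probability = share of positive votes.

def presence_probability(frames, models):
    votes = [model(frame) for frame in frames for model in models]
    return sum(votes) / len(votes)

models = [
    lambda f: max(f) > 0.5,   # model sensitive to peaks
    lambda f: sum(f) > 1.0,   # model sensitive to overall energy
]
prob = presence_probability([[0.9, 0.8], [0.1, 0.0]], models)
```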
  • Patent number: 11062233
    Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed that provide an apparatus to monitor watermark encoder operation, the apparatus comprising: a data collector to collect one or more types of heartbeat data from a watermark encoder, the heartbeat data including time-varying data, the one or more types of heartbeat data defined by a software development kit (SDK); a machine learning engine to process the heartbeat data to predict whether the watermark encoder is associated with respective ones of a plurality of failure modes; and an alert generator to, in response to the machine learning engine predicting that the watermark encoder is associated with a first one of the failure modes: generate an alert indicating at least one component to be remedied according to the first one of the failure modes; and transmit the alert to a watermark encoder management agent.
    Type: Grant
    Filed: December 21, 2018
    Date of Patent: July 13, 2021
    Assignee: THE NIELSEN COMPANY (US), LLC
    Inventors: John T. Livoti, Susan Cimino, Stanley Wellington Woodruff, Rajakumar Madhanganesh, Alok Garg
  • Patent number: 11049504
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting hotwords using a server. One of the methods includes receiving an audio signal encoding one or more utterances including a first utterance; determining whether at least a portion of the first utterance satisfies a first threshold of being at least a portion of a key phrase; in response to determining that at least the portion of the first utterance satisfies the first threshold of being at least a portion of a key phrase, sending the audio signal to a server system that determines whether the first utterance satisfies a second threshold of being the key phrase, the second threshold being more restrictive than the first threshold; and receiving tagged text data representing the one or more utterances encoded in the audio signal when the server system determines that the first utterance satisfies the second threshold.
    Type: Grant
    Filed: May 27, 2020
    Date of Patent: June 29, 2021
    Assignee: Google LLC
    Inventors: Alexander H. Gruenstein, Petar Aleksic, Johan Schalkwyk, Pedro J. Moreno Mengibar
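The two-threshold split above is simple to sketch: a permissive on-device check gates whether audio is sent at all, and the server applies a stricter confirmation. The detector producing the scores, and the threshold values, are assumptions.

```python
# Sketch: permissive first threshold on-device, restrictive second
# threshold server-side, applied to key-phrase detection scores.

FIRST_THRESHOLD = 0.4   # permissive, on-device
SECOND_THRESHOLD = 0.8  # restrictive, server-side

def device_should_send(local_score):
    return local_score >= FIRST_THRESHOLD

def server_confirms(server_score):
    return server_score >= SECOND_THRESHOLD

sent = device_should_send(0.5)
confirmed = sent and server_confirms(0.9)
```

The permissive local check keeps false rejections low; the server's stricter check keeps false activations low.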
  • Patent number: 11031007
    Abstract: Implementations are set forth herein for creating an order of execution for actions that were requested by a user, via a spoken utterance to an automated assistant. The order of execution for the requested actions can be based on how each requested action can, or is predicted to, affect other requested actions. In some implementations, an order of execution for a series of actions can be determined based on an output of a machine learning model, such as a model that has been trained according to supervised learning. A particular order of execution can be selected to mitigate waste of processing, memory, and network resources—at least relative to other possible orders of execution. Using interaction data that characterizes past performances of automated assistants, certain orders of execution can be adapted over time, thereby allowing the automated assistant to learn from past interactions with one or more users.
    Type: Grant
    Filed: February 7, 2019
    Date of Patent: June 8, 2021
    Assignee: GOOGLE LLC
    Inventors: Mugurel Ionut Andreica, Vladimir Vuskovic, Joseph Lange, Sharon Stovezky, Marcin Nowak-Przygodzki
  • Patent number: 11023686
    Abstract: Conversational systems must be capable of handling more sophisticated interactions than providing factual answers alone. Such interactions are handled by resolving abstract anaphoric references in conversational systems, which include antecedent fact references and posterior fact references. The present disclosure resolves abstract anaphoric references in conversational systems using hierarchically stacked neural networks. In the present disclosure, a deep hierarchical maxpool network based model is used to obtain a representation of each utterance received from users and a representation of one or more generated sequences of utterances. The obtained representations are further used to identify contextual dependencies within the one or more generated sequences, which helps in resolving abstract anaphoric references in conversational systems.
    Type: Grant
    Filed: July 9, 2019
    Date of Patent: June 1, 2021
    Assignee: TATA CONSULTANCY SERVICES LIMITED
    Inventors: Puneet Agarwal, Prerna Khurana, Gautam Shroff, Lovekesh Vig
  • Patent number: 11019207
    Abstract: Systems and methods for smart dialogue communication are provided. A method may include receiving, from a responder terminal device, a dialogue request configured to request a smart dialogue communication, wherein the dialogue request is associated with an incoming call request that is initiated by a requester via a requester terminal device and satisfies a smart dialogue condition determined by the responder terminal device; performing the smart dialogue communication with the requester terminal device associated with the requester; recording voice information associated with the smart dialogue communication; converting the voice information into text information; and transmitting the text information to the responder terminal device.
    Type: Grant
    Filed: June 2, 2020
    Date of Patent: May 25, 2021
    Assignee: HITHINK ROYALFLUSH INFORMATION NETWORK CO., LTD.
    Inventor: Ming Chen
  • Patent number: 11011155
    Abstract: An example method includes: receiving a test phrase; comparing feature vectors of the test phrase to contents of a first database to generate a first score; comparing the feature vectors of the test phrase to contents of a second database to generate a second score; comparing feature vectors of the contents of the second database to the contents of the first database to generate a third score; comparing the feature vectors of the contents of the second database to a model of the test phrase to generate a fourth score; determining a first difference score based on a difference between the first and second scores; determining a second difference score based on a difference between the third and fourth scores; and generating a difference confidence score based on a lesser of the first and second difference scores.
    Type: Grant
    Filed: August 1, 2018
    Date of Patent: May 18, 2021
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Tarkesh Pande, Lorin Paul Netsch, David Patrick Magee
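The score arithmetic above can be made concrete with made-up scores. Everything beyond the comparisons themselves (how the four scores are produced) is out of scope here; the "lesser of the first and second difference scores" becomes a simple `min()`.

```python
# Worked sketch of the difference-confidence computation, with
# illustrative scores; s1..s4 are the four comparison scores described.

def difference_confidence(s1, s2, s3, s4):
    d1 = abs(s1 - s2)   # first difference score
    d2 = abs(s3 - s4)   # second difference score
    return min(d1, d2)  # confidence from the lesser difference

conf = difference_confidence(0.9, 0.4, 0.7, 0.6)
```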
  • Patent number: 10986225
    Abstract: Embodiments of the present disclosure describe a call recording system and a call recording method for automatically recording, i.e., storing, a call candidate when an active call is detected. The call recording system comprises a sound receiver to receive sound data and to convert sound data to audio representations of sound, a buffer to buffer the audio representations of sound for a predetermined time duration, a call candidate determination unit to determine if the buffered audio representations comprise a call candidate, a call analyzer to analyze the call candidate, wherein the call analyzer determines if the call candidate is a call to be stored, and a storage to store the call candidate as a call. Hence, a reliable system can be provided for automatically storing a call.
    Type: Grant
    Filed: January 30, 2019
    Date of Patent: April 20, 2021
    Assignee: I2X GMBH
    Inventors: Ilya Edrenkin, Evgenii Khamukhin, Evgenii Kazakov, Stefan Decker
  • Patent number: 10971147
    Abstract: In an aspect of the present disclosure, a method for providing an alternate modality of input for filling a form field in response to a failure of voice recognition is disclosed including prompting the user for information corresponding to a field of a form, generating speech data by capturing a spoken response of the user to the prompt using at least one input device, attempting to convert the speech data to text, determining that the attempted conversion has failed, evaluating the failure using at least one speech rule, selecting, based on the evaluation, an alternate input modality to be used for receiving the information corresponding to the field of the form, receiving the information corresponding to the field of the form from the alternate input modality, and injecting the received information into the field of the form.
    Type: Grant
    Filed: March 7, 2019
    Date of Patent: April 6, 2021
    Assignee: International Business Machines Corporation
    Inventors: Robert H. Grant, Trudy L. Hewitt, Mitchell J. Mason, Robert J. Moore, Kenneth A. Winburn
  • Patent number: 10956675
    Abstract: A system and method are provided. The system includes a gateway portion (201), embedded in a gateway device, having an embedded artificial intelligence engine (220) for processing commands using natural language processing. The system further includes a supplemental cloud server portion (202) having a supplemental artificial intelligence engine (280) for processing, using the natural language processing, the commands unable to be processed by the embedded artificial intelligence engine. The gateway portion (201) further includes a configuration and status interface (230) for performing at least one of diagnostic operations, configuration operations, and status operations, on the gateway device responsive to instructions from any of the embedded artificial intelligence engine and the supplemental artificial intelligence engine.
    Type: Grant
    Filed: June 1, 2015
    Date of Patent: March 23, 2021
    Assignee: INTERDIGITAL CE PATENT HOLDINGS
    Inventors: Brian Duane Clevenger, Thomas Patrick Newberry
  • Patent number: 10936636
    Abstract: Textual information related to user information from user service information is identified. A layered matching is performed on the textual information based on preset background identification information in a preset list, wherein the layered matching includes different matching methods, and the preset list includes a plurality of entries storing different preset background identification information related to the user information. The user information is determined based on the layered matching.
    Type: Grant
    Filed: July 7, 2017
    Date of Patent: March 2, 2021
    Assignee: Advanced New Technologies Co., Ltd.
    Inventors: Hui Li, Guanhai Zhong, Yingping Cao
  • Patent number: 10930286
    Abstract: This disclosure relates generally to a method and system for muting classified information in an audio signal using a fuzzy approach. The method comprises converting the received audio signal into text using a speech recognition engine to identify a plurality of classified words from the text to obtain a first set of parameters. Further, a plurality of subwords associated with each classified word are identified to obtain a second set of parameters associated with each subword of the corresponding classified word. A relative score is computed for each subword associated with the classified word based on a plurality of similar pairs for the corresponding classified word. A fuzzy muting function is generated using the first set of parameters, the second set of parameters and the relative score associated with each subword. The plurality of subwords associated with each classified word is muted in accordance with the generated fuzzy muting function.
    Type: Grant
    Filed: January 22, 2019
    Date of Patent: February 23, 2021
    Assignee: Tata Consultancy Services Limited
    Inventors: Imran Ahamad Sheikh, Sunil Kumar Kopparapu, Bhavikkumar Bhagvanbhai Vachhani, Bala Mallikarjunarao Garlapati, Srinivasa Rao Chalamala
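As a rough sketch of the muting idea in the abstract above (the span representation, per-subword scores, and threshold are illustrative assumptions, not the patented fuzzy muting function), subword audio spans with high relative scores can be attenuated in proportion to their score:

```python
import numpy as np

def fuzzy_mute(samples, subword_spans, relative_scores, threshold=0.5):
    """Attenuate audio spans of subwords whose relative score
    (similarity to a classified word) exceeds a threshold.
    `subword_spans` is a list of (start, end) sample indices;
    `relative_scores` holds a score in [0, 1] per span."""
    out = samples.copy().astype(float)
    for (start, end), score in zip(subword_spans, relative_scores):
        if score >= threshold:
            # Higher score -> stronger attenuation; a score of 1.0
            # mutes the span completely.
            gain = max(0.0, 1.0 - score)
            out[start:end] *= gain
    return out
```

A graded gain, rather than a hard cut, is what makes the muting "fuzzy": partial matches near the threshold are softened instead of silenced outright.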
  • Patent number: 10891944
    Abstract: A speech recognition method includes clustering feature vectors of training data to obtain clustered feature vectors of the training data, performing an interpolation calculation on feature vectors of data to be recognized using the clustered feature vectors, and inputting the feature vectors of the data to be recognized, after the interpolation calculation, into a speech recognition model to adaptively adjust the speech recognition model. The techniques of the present disclosure improve speech recognition accuracy and adaptive processing efficiency.
    Type: Grant
    Filed: June 29, 2017
    Date of Patent: January 12, 2021
    Inventor: Shaofei Xue
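The cluster-then-interpolate step above can be illustrated as follows; the nearest-centroid rule and the `alpha` weight are simplifying assumptions for illustration, not the patent's exact interpolation calculation:

```python
import numpy as np

def interpolate_toward_centroids(test_vecs, centroids, alpha=0.5):
    """Pull each test feature vector toward its nearest training-data
    cluster centroid; `alpha` controls the interpolation weight."""
    # Pairwise Euclidean distances: shape (n_test, n_centroids).
    d = np.linalg.norm(test_vecs[:, None, :] - centroids[None, :, :], axis=2)
    nearest = centroids[np.argmin(d, axis=1)]
    # Linear interpolation between the raw vector and its centroid.
    return (1 - alpha) * test_vecs + alpha * nearest
```

Feeding the interpolated vectors to the recognizer nudges unseen data toward regions the model saw during training, which is one plausible reading of the "adaptive adjustment" the abstract describes.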
  • Patent number: 10885916
    Abstract: A control method of an electronic device includes receiving a user voice; based on a user command corresponding to the user voice being already registered, providing at least one audio or visual indication that the user command is unable to be registered as a voice command; based on the user command corresponding to the user voice not being registered yet, providing at least one audio or visual indication that the user command is able to be registered as a voice command; and based on the user command corresponding to the user voice being related to a prohibited expression, providing at least one audio or visual indication that the user command is related to the prohibited expression.
    Type: Grant
    Filed: May 28, 2019
    Date of Patent: January 5, 2021
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Nam-yeong Kwon, Kyung-mi Park
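The three indication cases in the abstract above can be sketched as a simple decision function; the return strings and set-based lookups are placeholders, not the device's actual indications:

```python
def registration_indication(command, registered, prohibited):
    """Return which audio/visual indication to provide for a
    candidate voice command, following the three cases: prohibited
    expression, already registered, or available for registration."""
    if command in prohibited:
        return "related to a prohibited expression"
    if command in registered:
        return "unable to be registered"
    return "able to be registered"
```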
  • Patent number: 10871943
    Abstract: In one aspect, a network microphone device includes a plurality of microphones and is configured to detect sound via the microphones. The network microphone device may capture sound data based on the detected sound in a first buffer, and capture metadata associated with the detected sound in a second buffer. The network microphone device may classify one or more noises in the detected sound and perform an action based on the classification of the respective one or more noises.
    Type: Grant
    Filed: July 31, 2019
    Date of Patent: December 22, 2020
    Assignee: Sonos, Inc.
    Inventors: Nick D'Amato, Kurt Thomas Soto, Connor Kristopher Smith
  • Patent number: 10850664
    Abstract: A monitoring system can monitor, using one or more sensors, conditions external to a vehicle. The system detects one or more external conditions. Based, at least in part, on the external conditions, the system can transmit a signal corresponding to an alert to an output device.
    Type: Grant
    Filed: December 2, 2019
    Date of Patent: December 1, 2020
    Assignee: Uber Technologies, Inc.
    Inventors: Brennan T. Lopez-Hinojosa, Kermit D. Lopez
  • Patent number: 10825465
    Abstract: There is provided a signal processing apparatus for amplifying or attenuating, with respect to a signal in which a desired signal and another signal are mixed, the desired signal and the other signal at different ratios. The signal processing apparatus includes a separator that obtains an estimated first signal and an estimated second signal by receiving a mixed signal in which a first signal (for example, speech) and a second signal (for example, noise) are mixed and estimating the first signal and the second signal. Furthermore, the signal processing apparatus includes a gain adjuster that obtains a gain-adjusted mixed signal by receiving the estimated first signal and the estimated second signal.
    Type: Grant
    Filed: December 20, 2016
    Date of Patent: November 3, 2020
    Assignees: NEC CORPORATION, NEC Platforms, Ltd.
    Inventors: Akihiko Sugiyama, Ryoji Miyahara
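The separator/gain-adjuster pipeline's final recombination step might look like the minimal sketch below; the gain values are illustrative assumptions, chosen only to show the desired signal being amplified while the other is attenuated:

```python
import numpy as np

def adjust_mix(est_speech, est_noise, speech_gain=1.5, noise_gain=0.3):
    """Recombine the separator's two estimates with different gains,
    amplifying the desired signal and attenuating the other."""
    return speech_gain * np.asarray(est_speech) + noise_gain * np.asarray(est_noise)
```

Keeping a small non-zero noise gain, rather than discarding the second estimate entirely, avoids the unnatural-sounding output that full suppression can produce.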
  • Patent number: 10818286
    Abstract: A vehicle-based system and method for receiving voice inputs and determining whether to perform a voice recognition analysis using in-vehicle resources or resources external to the vehicle.
    Type: Grant
    Filed: December 6, 2017
    Date of Patent: October 27, 2020
    Assignee: HONDA MOTOR CO., LTD.
    Inventors: Ritchie Huang, Pedram Vaghefinazari, Stuart Yamamoto
  • Patent number: 10811011
    Abstract: System and method for correcting for impulse noise in speech recognition systems. One example system includes a microphone, a speaker, and an electronic processor. The electronic processor is configured to receive an audio signal representing an utterance. The electronic processor is configured to detect, within the utterance, the impulse noise, and, in response, generate an annotated utterance including a timing of the impulse noise. The electronic processor is configured to segment the annotated utterance into silence, voice content, and other content, and, when a length of the other content is greater than or equal to an average word length for the annotated utterance, determine, based on the voice content, an intent portion and an entity portion. The electronic processor is configured to generate a voice prompt based on the timing of the impulse noise and the intent portion and/or the entity portion, and to play the voice prompt.
    Type: Grant
    Filed: November 21, 2018
    Date of Patent: October 20, 2020
    Assignee: MOTOROLA SOLUTIONS, INC.
    Inventor: Alejandro G. Blanco
  • Patent number: 10747960
    Abstract: Systems and methods are disclosed herein for training a model to accurately determine whether two phrases are conversationally connected. A media guidance application may detect a first phrase and a second phrase, translate each phrase to a string of word types, append each string to the back of a prior string to create a combined string, determine a degree to which any of the individual strings, matches any singleton template, and determine a degree to which the combined string matches any conversational template. Based on the degrees to which the individual and combination strings match the singleton and conversational templates, respectively, strengths of association are correspondingly updated.
    Type: Grant
    Filed: October 11, 2018
    Date of Patent: August 18, 2020
    Assignee: ROVI GUIDES, INC.
    Inventors: Sashikumar Venkataraman, Ahmed Nizam Mohaideen P, Manik Malhotra
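A toy version of the word-type translation and template matching described above might look like this; the lexicon, the positional-overlap matching degree, and all names are simplifying assumptions, not the system's trained model:

```python
def phrase_to_types(phrase, lexicon):
    """Translate a phrase into a string of word types using a lexicon;
    unknown words map to "UNK"."""
    return " ".join(lexicon.get(w.lower(), "UNK") for w in phrase.split())

def match_degree(type_string, template):
    """Degree to which a word-type string matches a template: the
    fraction of template positions with an identical type token."""
    s_tokens = type_string.split()
    t_tokens = template.split()
    hits = sum(1 for a, b in zip(s_tokens, t_tokens) if a == b)
    return hits / max(len(t_tokens), 1)
```

In the described system, the degrees computed against singleton and conversational templates would then drive updates to the stored strengths of association.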
  • Patent number: 10740562
    Abstract: A computer generates semantic structure information from a document. The semantic structure information includes a plurality of semantic structures in a plurality of sentences in the document, and a plurality of morphemes included in each of the plurality of sentences belongs to a corresponding semantic structure. The computer generates a plurality of codes by encoding the plurality of morphemes for each of the plurality of sentences. The computer specifies a specific code that corresponds to a specific morpheme from among the plurality of morphemes included in each of the plurality of sentences, wherein at least one morpheme is potentially missing in the semantic structure that the specific morpheme belongs to. Missing-portion information indicates that the missing morpheme is included in a prior sentence in the document. The computer adds the missing-portion information to the specific code for each of the plurality of sentences.
    Type: Grant
    Filed: July 14, 2017
    Date of Patent: August 11, 2020
    Assignee: FUJITSU LIMITED
    Inventors: Seiji Okura, Masahiro Kataoka, Masao Ideuchi, Fumiaki Nakamura
  • Patent number: 10733978
    Abstract: An electronic device is provided. The electronic device includes a memory configured to store at least a portion of a plurality of pieces of speech information used for voice recognition, and a processor operatively connected to the memory, wherein the processor selects speaker speech information from at least a portion of the plurality of pieces of speech information based on mutual similarity, and generates voice recognition information to be registered as personalized voice information based on the speaker speech information.
    Type: Grant
    Filed: August 20, 2018
    Date of Patent: August 4, 2020
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Chakladar Subhojit
  • Patent number: 10706851
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting hotwords using a server. One of the methods includes receiving an audio signal encoding one or more utterances including a first utterance; determining whether at least a portion of the first utterance satisfies a first threshold of being at least a portion of a key phrase; in response to determining that at least the portion of the first utterance satisfies the first threshold of being at least a portion of a key phrase, sending the audio signal to a server system that determines whether the first utterance satisfies a second threshold of being the key phrase, the second threshold being more restrictive than the first threshold; and receiving tagged text data representing the one or more utterances encoded in the audio signal when the server system determines that the first utterance satisfies the second threshold.
    Type: Grant
    Filed: April 24, 2019
    Date of Patent: July 7, 2020
    Assignee: Google LLC
    Inventors: Alexander H. Gruenstein, Petar Aleksic, Johan Schalkwyk, Pedro J. Moreno Mengibar
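The two-threshold flow above can be sketched as follows; the scores, threshold values, and the `server_score_fn` callback are hypothetical stand-ins for the on-device detector and the server round trip:

```python
def hotword_decision(device_score, server_score_fn,
                     device_threshold=0.4, server_threshold=0.8):
    """Two-stage hotword check: a permissive on-device threshold
    gates whether audio is sent to a stricter server-side check."""
    if device_score < device_threshold:
        return "rejected_on_device"
    # Only audio passing the coarse check reaches the server, which
    # applies a more restrictive threshold before returning tagged text.
    if server_score_fn() >= server_threshold:
        return "accepted"
    return "rejected_by_server"
```

The asymmetry is the point of the design: the cheap first stage errs toward sending candidates onward, and the expensive second stage filters the false positives.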
  • Patent number: 10699704
    Abstract: A system includes at least one communication interface, at least one processor operatively connected to the at least one communication interface, and at least one memory operatively connected to the at least one processor and storing a plurality of natural language understanding (NLU) models. The at least one memory stores instructions that, when executed, cause the processor to receive first information associated with a user from an external electronic device associated with a user account, using the at least one communication interface, to select at least one of the plurality of NLU models, based on at least part of the first information, and to transmit the selected at least one NLU model to the external electronic device, using the at least one communication interface such that the external electronic device uses the selected at least one NLU model for natural language processing.
    Type: Grant
    Filed: August 8, 2019
    Date of Patent: June 30, 2020
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Sean Minsung Kim, Jaeyung Yeo
  • Patent number: 10694298
    Abstract: A hearing aid configured for detecting and enhancing speech within an audio environment is disclosed. An incoming audio stream is continuously monitored for the presence of speech within the audio stream. A Codebook Excited Linear Prediction (“CELP”) encoder analyzes the incoming audio stream and outputs an indication of a presence or absence of human speech within the incoming audio stream. Upon detection of human speech, the hearing aid in real time may: amplify the audio input to make the speech more audible to a wearer; filter non-speech audio through isolation of the speech by passing the output of the CELP encoder directly to a CELP decoder; activate a beam-steering process which makes dominant a microphone closest to a speaker while de-prioritizing input from other microphones of the hearing aid; and/or shape the audio spectrum conveyed by the audio input using a response curve optimized for better clarity of human speech.
    Type: Grant
    Filed: October 22, 2019
    Date of Patent: June 23, 2020
    Inventors: Zeev Neumeier, W. Leo Hoarty
  • Patent number: 10685190
    Abstract: Various embodiments described herein facilitate multi-lingual communications. The systems and methods of some embodiments may enable multi-lingual communications through different modes of communications including, for example, Internet-based chat, e-mail, text-based mobile phone communications, postings to online forums, postings to online social media services, and the like. Certain embodiments may implement communications systems and methods that translate text between two or more languages (e.g., spoken), while handling/accommodating for one or more of the following in the text: specialized/domain-related jargon, abbreviations, acronyms, proper nouns, common nouns, diminutives, colloquial words or phrases, and profane words or phrases.
    Type: Grant
    Filed: August 14, 2019
    Date of Patent: June 16, 2020
    Assignee: MZ IP Holdings, LLC
    Inventors: Gabriel Leydon, Francois Orsini, Nikhil Bojja, Shailen Karur
  • Patent number: 10593321
    Abstract: A method for training a multi-language speech recognition network includes providing utterance datasets corresponding to predetermined languages, inserting language identification (ID) labels into the utterance datasets, wherein each of the utterance datasets is labelled by each of the language ID labels, concatenating the labeled utterance datasets, generating initial network parameters from the utterance datasets, selecting the initial network parameters according to a predetermined sequence, and training, iteratively, an end-to-end network with a series of the selected initial network parameters and the concatenated labeled utterance datasets until a training result reaches a threshold.
    Type: Grant
    Filed: December 15, 2017
    Date of Patent: March 17, 2020
    Assignee: Mitsubishi Electric Research Laboratories, Inc.
    Inventors: Shinji Watanabe, Takaaki Hori, Hiroshi Seki, Jonathan Le Roux, John Hershey
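The language-ID labelling and concatenation steps above might be sketched as below; the `"<en>"`-style ID tokens and the dict-of-lists input shape are assumptions for illustration:

```python
def label_and_concatenate(datasets):
    """Prefix each utterance's token sequence with its language ID
    label, then concatenate all labeled datasets into one training
    list. `datasets` maps a language ID token (e.g. "<en>") to a
    list of token sequences."""
    combined = []
    for lang_id, utterances in datasets.items():
        for tokens in utterances:
            combined.append([lang_id] + list(tokens))
    return combined
```

With the language ID emitted as the first output symbol, a single end-to-end network can learn to identify the language and recognize the utterance jointly.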
  • Patent number: 10586533
    Abstract: Embodiments of the present disclosure provide a method and a device for recognizing speech based on a Chinese-English mixed dictionary. The method includes acquiring a Chinese-English mixed dictionary marked with the International Phonetic Alphabet (IPA), in which the Chinese-English mixed dictionary includes a Chinese dictionary and an English dictionary revised by Chinglish; taking the Chinese-English mixed dictionary as a training dictionary, taking a one-layer Convolutional Neural Network and a five-layer Long Short-Term Memory as a model, taking a state of the IPA as a target, and taking connectionist temporal classification (CTC) as a training criterion, training the model to obtain a trained CTC acoustic model; and performing speech recognition on a Chinese-English mixed language based on the trained CTC acoustic model.
    Type: Grant
    Filed: January 2, 2018
    Date of Patent: March 10, 2020
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Xiangang Li, Xuewei Zhang
  • Patent number: 10552546
    Abstract: Artificial intelligence is introduced into an electronic meeting context to perform various tasks before, during, and/or after electronic meetings. The artificial intelligence may analyze a wide variety of data such as data pertaining to other electronic meetings, data pertaining to organizations and users, and other general information pertaining to any topic. Capability is also provided to create, manage, and enforce meeting rules templates that specify requirements and constraints for various aspects of electronic meetings. Embodiments include improved approaches for translation and transcription using multiple translation/transcription services. Embodiments also include using sensors in conjunction with interactive whiteboard appliances to perform person detection, person identification, attendance tracking, and improved meeting start.
    Type: Grant
    Filed: October 9, 2017
    Date of Patent: February 4, 2020
    Assignee: Ricoh Company, Ltd.
    Inventors: Steven Nelson, Hiroshi Kitada, Lana Wong
  • Patent number: 10540957
    Abstract: Presented herein are embodiments of state-of-the-art speech recognition systems developed using end-to-end deep learning. In embodiments, the model architecture is significantly simpler than traditional speech systems, which rely on laboriously engineered processing pipelines; these traditional systems also tend to perform poorly when used in noisy environments. In contrast, embodiments of the system do not need hand-designed components to model background noise, reverberation, or speaker variation, but instead directly learn a function that is robust to such effects. Neither a phoneme dictionary nor even the concept of a “phoneme” is needed. Embodiments include a well-optimized recurrent neural network (RNN) training system that can use multiple GPUs, as well as a set of novel data synthesis techniques that allow a large amount of varied training data to be obtained efficiently.
    Type: Grant
    Filed: June 9, 2015
    Date of Patent: January 21, 2020
    Assignee: BAIDU USA LLC
    Inventors: Awni Hannun, Carl Case, Jared Casper, Bryan Catanzaro, Gregory Diamos, Erich Elsen, Ryan Prenger, Sanjeev Satheesh, Shubhabrata Sengupta, Adam Coates, Andrew Y. Ng