Recognition Patents (Class 704/231)
  • Patent number: 11710486
    Abstract: A virtual environment platform may receive, from a user device, a request to access a virtual reality (VR) environment and may verify, based on the request, a user of the user device to allow the user device access to the VR environment. The virtual environment platform may receive, after verifying the user of the user device, user voice input and user handwritten input from the user device. The virtual environment platform may generate processed user speech by processing the user voice input, wherein a characteristic of the processed user speech and a corresponding characteristic of the user voice input are different and may generate formatted user text by processing the user handwritten input, wherein the formatted user text is machine-encoded text. The virtual environment platform may cause the processed user speech to be audibly presented and the formatted user text to be visually presented in the VR environment.
    Type: Grant
    Filed: June 11, 2021
    Date of Patent: July 25, 2023
    Assignee: Capital One Services, LLC
    Inventors: Austin Walters, Jeremy Goodsitt, Fardin Abdi Taghi Abad, Vincent Pham, Kenneth Taylor
  • Patent number: 11705127
    Abstract: Coordinating signal processing among computing devices in a voice-driven computing environment is provided. A first and second digital assistant can detect an input audio signal, perform a signal quality check, and provide indications that the first and second digital assistants are operational to process the input audio signal. A system can select the first digital assistant for further processing. The system can receive, from the first digital assistant, data packets including a command. The system can generate, for a network connected device selected from a plurality of network connected devices, an action data structure based on the data packets, and transmit the action data structure to the selected network connected device.
    Type: Grant
    Filed: June 11, 2021
    Date of Patent: July 18, 2023
    Assignee: GOOGLE LLC
    Inventors: Anshul Kothari, Gaurav Bhaya, Tarun Jain
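    The device-selection step this abstract describes can be sketched in a few lines: each assistant reports whether it passed its signal quality check, and the system picks the operational one with the best signal. The device names and the SNR metric below are invented for illustration, not taken from the patent.

    ```python
    # Illustrative sketch (not Google's implementation): choose which of
    # several digital assistants should process an input audio signal,
    # based on a simple signal-quality check.
    from dataclasses import dataclass

    @dataclass
    class AssistantReport:
        device_id: str
        snr_db: float        # signal-to-noise ratio reported by the device
        operational: bool    # device passed its own signal quality check

    def select_assistant(reports):
        """Pick the operational assistant with the best signal quality."""
        candidates = [r for r in reports if r.operational]
        if not candidates:
            return None
        return max(candidates, key=lambda r: r.snr_db).device_id

    reports = [
        AssistantReport("kitchen-speaker", snr_db=12.5, operational=True),
        AssistantReport("living-room-display", snr_db=18.0, operational=True),
        AssistantReport("hallway-speaker", snr_db=20.0, operational=False),
    ]
    ```

    Note the non-operational device is excluded even though it reports the highest SNR; only devices that passed the quality check compete.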
  • Patent number: 11700335
    Abstract: A system may provide for the generation of spatial audio for audiovisual conferences, video conferences, etc. (referred to herein simply as “conferences”). Spatial audio may include audio encoding and/or decoding techniques in which a sound source may be specified at a location, such as on a two-dimensional plane and/or within a three-dimensional field, and/or in which a direction or target for a given sound source may be specified. A conference participant's position within a conference user interface (“UI”) may be set as the source of sound associated with the conference participant, such that different conference participants may be associated with different sound source positions within the conference UI.
    Type: Grant
    Filed: September 7, 2021
    Date of Patent: July 11, 2023
    Assignee: Verizon Patent and Licensing Inc.
    Inventor: Pierre Seigneurbieux
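    The core mapping described here, from a participant's tile position in the conference UI to a sound-source position, can be illustrated with a standard equal-power stereo pan. The pan law and pixel coordinates are assumptions for the sketch; the patent covers 2D and 3D positioning more generally.

    ```python
    # Hedged sketch: map a participant's horizontal tile position in the
    # conference UI to equal-power stereo gains, so their voice appears to
    # come from where their tile is drawn.
    import math

    def pan_gains(tile_x: float, ui_width: float):
        """Equal-power stereo gains for a tile centered at tile_x (pixels)."""
        position = tile_x / ui_width      # 0.0 = far left, 1.0 = far right
        angle = position * math.pi / 2    # map onto a quarter circle
        left_gain = math.cos(angle)
        right_gain = math.sin(angle)
        return left_gain, right_gain
    ```

    With an equal-power law, left² + right² stays constant, so perceived loudness does not change as a tile moves across the window.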
  • Patent number: 11694693
    Abstract: Methods and systems for processing audio signals containing speech data are disclosed. Biometric data associated with at least one speaker are extracted from an audio input. A correspondence is determined between the extracted biometric data and stored biometric data associated with a consenting user profile, where a consenting user profile is a user profile that indicates consent to store biometric data. If no correspondence is determined, the speech data is discarded, optionally after having been processed.
    Type: Grant
    Filed: March 21, 2022
    Date of Patent: July 4, 2023
    Assignee: SoapBox Labs Ltd.
    Inventor: Patricia Scanlon
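    The consent gate in this abstract reduces to a simple control flow: match extracted biometric data against consenting profiles, and signal the caller to discard the speech when no match is found. The similarity function and threshold below are placeholder assumptions, not SoapBox Labs' method.

    ```python
    # Minimal sketch of the consent gate: speech data survives only if the
    # extracted biometric features match a consenting user profile.
    def match_score(extracted, stored):
        # Placeholder similarity: fraction of near-equal feature values.
        hits = sum(1 for a, b in zip(extracted, stored) if abs(a - b) < 0.1)
        return hits / max(len(extracted), 1)

    def process_or_discard(extracted, consenting_profiles, threshold=0.8):
        """Return the matched profile id, or None if speech must be discarded."""
        for profile_id, stored in consenting_profiles.items():
            if match_score(extracted, stored) >= threshold:
                return profile_id
        return None  # no consenting match: caller discards the speech data
    ```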
  • Patent number: 11689380
    Abstract: A method and a device for viewing a conference are provided. In the method, after a wide-view video of a specific conference, related conference event data, and the speech content of each participant are obtained, a highlight video of the specific conference is generated, improving the efficiency of conference viewing.
    Type: Grant
    Filed: November 23, 2021
    Date of Patent: June 27, 2023
    Assignee: ASPEED Technology Inc.
    Inventor: Chen-Wei Chou
  • Patent number: 11688388
    Abstract: Techniques are described for fulfilling an utterance request for an item represented within a video rendered at a client device. In some implementations, a user account associated with the request is identified, enabling a video stream transmitted in association with the user account at the time that the request was uttered to be identified. In one technique, a timestamp associated with the request is used to identify the relevant portion of the video stream. The item represented within the portion of the video stream can be identified using various techniques and/or information such as image recognition, metadata within the video, subtitles, closed captions, and/or a database mapping between the item and a video content item transmitted in the video stream.
    Type: Grant
    Filed: September 24, 2020
    Date of Patent: June 27, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Joshua Danovitz, Lei Li, Lars Christian Ulness, Andrew J. Watts, Amarsingh Buckthasingh Winston, Umut Utkan, Michael Flynn, Girish Bansilal Bajaj
  • Patent number: 11682384
    Abstract: A method for training an alarm system to classify audio of an event, wherein the alarm system is connected to a neural network trained to classify audio as an event type, the method comprising the steps of: receiving audio recorded during a first period of time; transmitting the audio to an external unit; receiving data from the external unit indicating a sub-period of time of the audio and data indicating an event type of the indicated sub-period of time of the audio; and re-training the neural network by inputting a sub-period of the audio corresponding to the indicated sub-period of time of the audio and using the indicated event type as a correct classification of the sub-period of the audio.
    Type: Grant
    Filed: February 23, 2021
    Date of Patent: June 20, 2023
    Assignee: Axis AB
    Inventors: Ingemar Larsson, Daniel Andersson
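    The labeling round-trip in this claim, where the external unit returns a sub-period of time plus an event type and the alarm system cuts that sub-period out of the recording as a training example, can be sketched as straightforward sample-index arithmetic. The sample-rate handling is an assumption for illustration.

    ```python
    # Sketch: slice the externally-labeled sub-period out of the recorded
    # audio to form one (audio, label) training example for re-training.
    def extract_training_example(samples, sample_rate, start_s, end_s, event_type):
        """Slice the labeled sub-period out of the full recording."""
        start = int(start_s * sample_rate)
        end = int(end_s * sample_rate)
        return {"audio": samples[start:end], "label": event_type}

    # e.g. a 10-second clip at 8 kHz, with seconds 2.0-3.0 labeled "glass break"
    clip = list(range(80_000))
    example = extract_training_example(clip, 8000, 2.0, 3.0, "glass break")
    ```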
  • Patent number: 11676576
    Abstract: Systems and methods are provided for acquiring training data and building an organizational-based language model based on the training data. As organizational data is generated via one or more applications associated with an organization, the collected organizational data is aggregated and filtered into training data that is used to train an organizational-based language model for speech processing.
    Type: Grant
    Filed: August 11, 2021
    Date of Patent: June 13, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Ziad Al Bawab, Anand U Desai, Cem Aksoylar, Michael Levit, Xin Meng, Shuangyu Chang, Suyash Choudhury, Dhiresh Rawal, Tao Li, Rishi Girish, Marcus Jager, Ananth Rampura Sheshagiri Rao
  • Patent number: 11670015
    Abstract: Embodiments of the present disclosure provide a method and apparatus for generating a video. The method may include: acquiring a cartoon face image sequence of a target cartoon character from a received cartoon-style video, and generating a cartoon face contour figure sequence based on the cartoon face image sequence; generating a face image sequence for a real face based on the cartoon face contour figure sequence and a received initial face image of the real face, a face expression in the face image sequence matching a face expression in the cartoon face image sequence; generating a cartoon-style face image sequence for the real face according to the face image sequence; and replacing a face image of the target cartoon character in the cartoon-style video with a cartoon-style face image in a cartoon-style face image sequence, to generate a cartoon-style video corresponding to the real face.
    Type: Grant
    Filed: December 4, 2020
    Date of Patent: June 6, 2023
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Yunfeng Liu, Chao Wang, Yuanhang Li, Ting Yun, Guoqing Chen
  • Patent number: 11664031
    Abstract: Implementations of the subject technology provide systems and methods for multi-mode voice triggering for audio devices. An audio device may store multiple voice recognition models, each trained to detect a single corresponding trigger phrase. So that the audio device can detect a specific one of the multiple trigger phrases without consuming the processing and/or power resources to run a voice recognition model that can differentiate between different trigger phrases, the audio device pre-loads a selected one of the voice recognition models for an expected trigger phrase into a processor of the audio device. The audio device may select the one of the voice recognition models for the expected trigger phrase based on a type of a companion device that is communicatively coupled to the audio device.
    Type: Grant
    Filed: March 11, 2021
    Date of Patent: May 30, 2023
    Assignee: Apple Inc.
    Inventors: Dersheet C. Mehta, Dinesh Garg, Sham Anton Koli, Kerry J. Kopp, Hans Bernhard
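    The pre-loading decision described above is essentially a lookup: the companion device type implies an expected trigger phrase, which selects which single-phrase model to load. The device types, phrases, and model file names below are invented for the example; this is not Apple's implementation.

    ```python
    # Hedged illustration: pre-load only the single-phrase recognition
    # model matching the trigger phrase expected for the companion device.
    EXPECTED_PHRASE_BY_DEVICE = {
        "phone": "hey assistant",
        "tablet": "hey assistant",
        "watch": "wake up",
    }

    MODELS = {
        "hey assistant": "model_hey_assistant.bin",
        "wake up": "model_wake_up.bin",
    }

    def preload_model(companion_type, default_phrase="hey assistant"):
        """Select the single-phrase model to load into the processor."""
        phrase = EXPECTED_PHRASE_BY_DEVICE.get(companion_type, default_phrase)
        return MODELS[phrase]
    ```

    Loading one small single-phrase model instead of a multi-phrase discriminator is what saves the processing and power the abstract mentions.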
  • Patent number: 11636342
    Abstract: In implementations of searching for music, a music search system can receive a music search request that includes a music file including music content. The music search system can also receive a selected musical attribute from a plurality of musical attributes. The music search system includes a music search application that can generate musical features of the music content, where a respective one or more of the musical features correspond to a respective one of the musical attributes. The music search application can then compare the musical features that correspond to the selected musical attribute to audio features of audio files, and determine similar audio files to the music file based on the comparison of the musical features to the audio features of the audio files.
    Type: Grant
    Filed: October 3, 2022
    Date of Patent: April 25, 2023
    Assignee: Adobe Inc.
    Inventors: Jongpil Lee, Nicholas J. Bryan, Justin J. Salamon, Zeyu Jin
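    The comparison step here, restricting similarity to the features that correspond to the user-selected musical attribute, can be sketched with a per-attribute feature dictionary and cosine similarity. The feature layout and the cosine metric are assumptions for illustration.

    ```python
    # Sketch: rank library audio files by similarity to the query on the
    # selected attribute's features only (e.g. tempo vs. timbre).
    import math

    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    def rank_by_attribute(query_features, library, attribute):
        """Rank library files by similarity on the selected attribute only."""
        q = query_features[attribute]
        scored = [(cosine(q, feats[attribute]), name)
                  for name, feats in library.items()]
        return [name for _, name in sorted(scored, reverse=True)]

    query = {"tempo": [1.0, 0.0], "timbre": [0.2, 0.8]}
    library = {
        "track_a.wav": {"tempo": [0.9, 0.1], "timbre": [0.9, 0.1]},
        "track_b.wav": {"tempo": [0.0, 1.0], "timbre": [0.1, 0.9]},
    }
    ```

    The same query ranks the two tracks differently depending on which attribute is selected, which is the point of attribute-conditioned search.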
  • Patent number: 11636099
    Abstract: A computer-implemented method for generating a question from an abstracted template is described. A non-limiting example of the computer-implemented method includes receiving, by a processor, a question. The method parses, by the processor, the question into a parse tree and abstracts, by the processor, an abstracted template from the parse tree. The method receives, by the processor, a domain schema and a domain knowledge base and generates, by the processor, a new question based on the abstracted template, the domain schema, and the domain knowledge base.
    Type: Grant
    Filed: August 23, 2019
    Date of Patent: April 25, 2023
    Assignee: International Business Machines Corporation
    Inventors: Laura Chiticariu, Aparna Garimella, Yunyao Li
  • Patent number: 11631398
    Abstract: A voice aware audio system and a method for a user wearing a headset to be aware of an outer sound environment while listening to music or any other audio source. An adjustable sound awareness zone gives the user the flexibility to avoid hearing far distant voices. The outer sound can be analyzed in a frequency domain to select an oscillating frequency candidate and in a time domain to determine if the oscillating frequency candidate is the signal of interest. If the signal directed to the outer sound is determined to be a signal of interest, the outer sound is mixed with audio from the audio source.
    Type: Grant
    Filed: July 27, 2021
    Date of Patent: April 18, 2023
    Assignee: HED Technologies SARL
    Inventors: Timothy Degraye, Liliane Huguet
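    The frequency-domain half of the two-stage analysis above can be illustrated by picking the dominant frequency of the outer sound and checking it against a voice band. The naive DFT scan and the 85-255 Hz pitch band are illustrative assumptions, not HED's implementation.

    ```python
    # Sketch: select an oscillating frequency candidate in the frequency
    # domain, then accept it only if it falls in an assumed voice band.
    import math

    def dominant_frequency(samples, sample_rate, max_hz=1000):
        """Return the scanned frequency (Hz) with the largest DFT magnitude."""
        best_hz, best_mag = 0, -1.0
        for hz in range(1, max_hz + 1, 10):          # coarse 10 Hz grid
            re = sum(s * math.cos(2 * math.pi * hz * i / sample_rate)
                     for i, s in enumerate(samples))
            im = sum(s * math.sin(2 * math.pi * hz * i / sample_rate)
                     for i, s in enumerate(samples))
            mag = re * re + im * im
            if mag > best_mag:
                best_hz, best_mag = hz, mag
        return best_hz

    def is_voice_candidate(samples, sample_rate, band=(85, 255)):
        """Treat the candidate as speech if it lies in a typical pitch band."""
        hz = dominant_frequency(samples, sample_rate)
        return band[0] <= hz <= band[1]

    rate = 8000
    speech_tone = [math.sin(2 * math.pi * 151 * i / rate) for i in range(800)]
    whine_tone = [math.sin(2 * math.pi * 601 * i / rate) for i in range(800)]
    ```

    In the patent this frequency-domain pick is only the first stage; a time-domain check then confirms whether the candidate is actually the signal of interest.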
  • Patent number: 11620487
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting a neural network architecture for performing a machine learning task. In one aspect, a method comprises: obtaining data defining a synaptic connectivity graph representing synaptic connectivity between neurons in a brain of a biological organism; generating data defining a plurality of candidate graphs based on the synaptic connectivity graph; determining, for each candidate graph, a performance measure on a machine learning task of a neural network having a neural network architecture that is specified by the candidate graph; and selecting a final neural network architecture for performing the machine learning task based on the performance measures.
    Type: Grant
    Filed: January 29, 2020
    Date of Patent: April 4, 2023
    Assignee: X Development LLC
    Inventors: Sarah Ann Laszlo, Philip Edwin Watson, Georgios Evangelopoulos
  • Patent number: 11620994
    Abstract: The invention relates to a method for operating and a method for controlling a dialog system, wherein the dialog system comprises a local dialog unit for detecting dialog inputs and for outputting dialog outputs as well as an external dialog unit having a data connection to the local dialog unit for analyzing detected dialog inputs and determining dialog outputs based thereon. At least one probable future course of dialog consisting of dialog inputs and/or dialog outputs is calculated and transmitted to the local dialog unit by the external dialog unit. Said course of dialog is received and saved by the local dialog unit. The saved course of dialog is called up in the event of an interruption to the data connection between the local dialog unit and the external dialog unit. Dialog inputs are detected and/or dialog outputs are output based on the called-up course of dialog.
    Type: Grant
    Filed: February 4, 2020
    Date of Patent: April 4, 2023
    Assignee: VOLKSWAGEN AKTIENGESELLSCHAFT
    Inventors: Sebastian Varges, Spyros Kousidis
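    The fallback behavior described here, where the local unit caches a predicted course of dialog and replays it when the connection to the external unit drops, can be sketched as a small state machine. The connection flag and offline message are assumptions for the example.

    ```python
    # Minimal sketch (not Volkswagen's implementation): the external unit
    # pre-computes a probable future course of dialog; the local unit
    # caches it and falls back to it when the data connection drops.
    class LocalDialogUnit:
        def __init__(self):
            self.cached_course = []   # list of predicted dialog outputs
            self.connected = True

        def receive_predicted_course(self, course):
            self.cached_course = list(course)

        def next_output(self, user_input, external_unit=None):
            if self.connected and external_unit is not None:
                return external_unit(user_input)
            # connection interrupted: fall back to the saved course of dialog
            if self.cached_course:
                return self.cached_course.pop(0)
            return "Sorry, I am offline right now."

    unit = LocalDialogUnit()
    unit.receive_predicted_course(["Which destination?", "Starting navigation."])
    ```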
  • Patent number: 11621000
    Abstract: Embodiments described herein include systems and methods for using image searching with voice recognition commands. Embodiments of a method may include providing a user interface via a target application and receiving a user selection of an area on the user interface by a user, the area including a search image. Embodiments may also include receiving an associated voice command and associating, by the computing device, the associated voice command with the search image.
    Type: Grant
    Filed: April 27, 2021
    Date of Patent: April 4, 2023
    Assignee: Dolbey & Company, Inc.
    Inventor: Curtis A. Weeks
  • Patent number: 11609851
    Abstract: According to one aspect, a method for determining, for a memory allocation, placements in a memory area of data blocks generated by a neural network, comprises a development of an initial sequence of placements of blocks, each placement being selected from several possible placements, the initial sequence being defined as a candidate sequence, a development of at least one modified sequence of placements from a replacement of a given placement of the initial sequence by a memorized unselected placement, and, if the planned size of the memory area obtained by this modified sequence is less than that of the memory area of the candidate sequence, then this modified sequence becomes the candidate sequence, the placements of the blocks for the allocation being those of the placement sequence defined as a candidate sequence once each modified sequence has been developed.
    Type: Grant
    Filed: April 13, 2021
    Date of Patent: March 21, 2023
    Assignees: STMicroelectronics S.r.l., STMicroelectronics (Rousset) SAS
    Inventors: Laurent Folliot, Emanuele Plebani, Mirko Falchetto
  • Patent number: 11605390
    Abstract: Provided are various mechanisms and processes for language acquisition using socio-neurocognitive techniques.
    Type: Grant
    Filed: September 1, 2020
    Date of Patent: March 14, 2023
    Inventor: Malihe Eshghavi
  • Patent number: 11580994
    Abstract: A method includes receiving acoustic features of a first utterance spoken by a first user that speaks with typical speech and processing the acoustic features of the first utterance using a general speech recognizer to generate a first transcription of the first utterance. The operations also include analyzing the first transcription of the first utterance to identify one or more bias terms in the first transcription and biasing the alternative speech recognizer on the one or more bias terms identified in the first transcription. The operations also include receiving acoustic features of a second utterance spoken by a second user that speaks with atypical speech and processing, using the alternative speech recognizer biased on the one or more terms identified in the first transcription, the acoustic features of the second utterance to generate a second transcription of the second utterance.
    Type: Grant
    Filed: January 20, 2021
    Date of Patent: February 14, 2023
    Assignee: Google LLC
    Inventors: Fadi Biadsy, Pedro Jose Moreno Mengibar
  • Patent number: 11581006
    Abstract: An enunciation system (ES) enables users to gain acquaintance, understanding, and mastery of the relationship between letters and sounds in the context of an alphabetic writing system. The ES enables the user to experience the action of sounding out a word, before their own phonics knowledge enables them to sound out the word independently; its continuous, unbroken speech output or input avoids the common confusions that ensue from analyzing words by breaking them up into discrete sounds; its user-controlled pacing allows the user to slow down enunciation at specific points of difficulty within the word; its real-time touch control allows the written word to be “played” like a musical instrument, with expressive and aesthetic possibilities; and its highlighting of the letter cluster that is responsible for the recognized phoneme enunciated by the user as it occurs allows the user to more easily associate the letters with the sounds.
    Type: Grant
    Filed: November 2, 2020
    Date of Patent: February 14, 2023
    Assignee: Tertl Studos, LLC
    Inventor: Christopher Hancock
  • Patent number: 11568859
    Abstract: A computer-implemented method and apparatus for extracting key information from conversational voice data, where the method comprises receiving a first speaker text corresponding to a speech of a first speaker in a conversation with a second speaker, the conversation comprising multiple turns of speech between the first speaker and the second speaker, the first speaker text comprising multiple question lines, each question line corresponding to the speech of the first speaker at a corresponding turn, arranged chronologically. Feature words are identified, and a frequency of occurrence therefor in each question line is determined. Question lines without any of the feature words are removed, to yield candidate question lines, for each of which a mathematical representation is generated. A similarity score for each candidate question line with respect to each subsequent candidate question line is computed, and the line with the highest score is identified as a key question.
    Type: Grant
    Filed: September 25, 2020
    Date of Patent: January 31, 2023
    Assignee: UNIPHORE SOFTWARE SYSTEMS, INC.
    Inventor: Somnath Roy
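    The pipeline in this abstract, filtering question lines by feature words and then scoring each candidate against the candidates that follow it, can be sketched directly. The bag-of-words representation and Jaccard similarity below stand in for the unspecified "mathematical representation" and similarity score.

    ```python
    # Sketch: drop question lines without feature words, score each
    # remaining line against later candidates, and report the highest
    # scorer as the key question.
    def jaccard(a, b):
        a, b = set(a.lower().split()), set(b.lower().split())
        return len(a & b) / len(a | b) if a | b else 0.0

    def key_question(question_lines, feature_words):
        # Keep only candidate lines containing at least one feature word.
        candidates = [q for q in question_lines
                      if any(w in q.lower() for w in feature_words)]
        if not candidates:
            return None
        best, best_score = candidates[0], -1.0
        for i, line in enumerate(candidates):
            score = sum(jaccard(line, later) for later in candidates[i + 1:])
            if score > best_score:
                best, best_score = line, score
        return best
    ```

    Intuitively, a question the agent keeps re-asking in later turns accumulates similarity with its repetitions, so it rises to the top.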
  • Patent number: 11557282
    Abstract: A method for detecting a hotword includes receiving a sequence of input frames that characterize streaming audio captured by a user device and generating a probability score indicating a presence of a hotword in the streaming audio using a memorized neural network. The network includes sequentially-stacked single value decomposition filter (SVDF) layers and each SVDF layer includes at least one neuron. Each neuron includes a respective memory component, a first stage configured to perform filtering on audio features of each input frame individually and output to the memory component, and a second stage configured to perform filtering on all the filtered audio features residing in the respective memory component. The method also includes determining whether the probability score satisfies a hotword detection threshold and initiating a wake-up process on the user device for processing additional terms.
    Type: Grant
    Filed: January 21, 2021
    Date of Patent: January 17, 2023
    Assignee: Google LLC
    Inventors: Raziel Alvarez Guevara, Hyun Jin Park
  • Patent number: 11556307
    Abstract: Example techniques relate to local voice control in a media playback system. A satellite device (e.g., a playback device or microcontroller unit) may be configured to recognize a local set of keywords in voice inputs including context specific keywords (e.g., for controlling an associated smart device) as well as keywords corresponding to a subset of media playback commands for controlling playback devices in the media playback system. The satellite device may fall back to a hub device (e.g., a playback device) configured to recognize a more extensive set of keywords. In some examples, either device may fall back to the cloud for processing of other voice inputs.
    Type: Grant
    Filed: January 31, 2021
    Date of Patent: January 17, 2023
    Assignee: Sonos, Inc.
    Inventors: Sebastien Maury, Joseph Dureau, Thibaut Lorrain, Do Kyun Kim
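    The tiered recognition described above, satellite first, then hub, then cloud, amounts to routing a recognized phrase to the smallest tier whose keyword set covers it. The keyword sets below are invented for the example.

    ```python
    # Hedged sketch: a satellite device handles its small local keyword
    # set, falls back to a hub with a larger set, and otherwise defers
    # the voice input to the cloud.
    SATELLITE_KEYWORDS = {"play", "pause", "lights on", "lights off"}
    HUB_KEYWORDS = SATELLITE_KEYWORDS | {"next track", "group kitchen", "volume up"}

    def route_voice_input(phrase):
        """Return which tier should process the recognized phrase."""
        if phrase in SATELLITE_KEYWORDS:
            return "satellite"
        if phrase in HUB_KEYWORDS:
            return "hub"
        return "cloud"
    ```

    Keeping the common commands on the satellite avoids a network round trip for the inputs users issue most often.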
  • Patent number: 11551676
    Abstract: Techniques are described for using data stored for a user in association with context levels to improve the efficiency and accuracy of dialog processing tasks. A dialog system stores historical dialog data in association with a plurality of configured context levels. The dialog system receives an utterance and identifies a term for disambiguation from the utterance. Based on a determined context level, the dialog system identifies relevant historical data stored to a database. The historical data may be used to perform tasks such as resolving an ambiguity based on user preferences, disambiguating named entities based on a prior dialog, and identifying previously generated answers to queries. Based on the context level, the dialog system can efficiently identify the relevant information and use the identified information to provide a response.
    Type: Grant
    Filed: August 26, 2020
    Date of Patent: January 10, 2023
    Assignee: Oracle International Corporation
    Inventor: Mark Edward Johnson
  • Patent number: 11545144
    Abstract: A method, an electronic device, and computer readable medium is provided. The method includes identifying a frequency of each word that is present within a set of words. The method also includes deriving relatedness values for pairs of words. Each pair of words includes a first word and a second word in the set of words. Each relatedness value corresponds to a respective one of the pairs of words. Each relatedness value is based on the identified frequencies that the first word and the second word of the respective pair of words are present within the set of words. The method further includes generating a matrix representing the relatedness values. The method additionally includes generating a language model that represents relationships between the set of words included in the matrix.
    Type: Grant
    Filed: January 29, 2019
    Date of Patent: January 3, 2023
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Anil Yadav, Mohammad Moazzami, Allan Jay Schwade
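    The steps in this abstract, word frequencies, pairwise relatedness values derived from those frequencies, and a matrix collecting them, can be sketched with a normalized co-occurrence score. The specific score below is an assumption; the abstract does not fix one.

    ```python
    # Sketch: count word and pair frequencies over a set of word
    # sequences, derive a relatedness value per pair, and collect the
    # values into a symmetric matrix.
    from collections import Counter
    from itertools import combinations
    import math

    def relatedness_matrix(sentences):
        words = [w for s in sentences for w in s.split()]
        freq = Counter(words)
        pair_freq = Counter()
        for s in sentences:
            for a, b in combinations(sorted(set(s.split())), 2):
                pair_freq[(a, b)] += 1
        vocab = sorted(freq)
        index = {w: i for i, w in enumerate(vocab)}
        n = len(vocab)
        matrix = [[0.0] * n for _ in range(n)]
        for (a, b), c in pair_freq.items():
            score = c / math.sqrt(freq[a] * freq[b])   # normalized co-occurrence
            matrix[index[a]][index[b]] = matrix[index[b]][index[a]] = score
        return vocab, matrix
    ```

    Normalizing by the individual frequencies keeps very common words from dominating every pair they appear in.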
  • Patent number: 11545147
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classification using neural networks. One method includes receiving audio data corresponding to an utterance; obtaining a transcription of the utterance; generating a representation of the audio data; generating a representation of the transcription of the utterance; and providing (i) the representation of the audio data and (ii) the representation of the transcription of the utterance to a classifier that, based on a given representation of the audio data and a given representation of the transcription of the utterance, is trained to output an indication of whether the utterance associated with the given representation is likely directed to an automated assistant or is likely not directed to an automated assistant.
    Type: Grant
    Filed: May 2, 2019
    Date of Patent: January 3, 2023
    Assignee: Google LLC
    Inventors: Nathan David Howard, Gabor Simko, Maria Carolina Parada San Martin, Ramkarthik Kalyanasundaram, Guru Prakash Arumugam, Srinivas Vasudevan
  • Patent number: 11538461
    Abstract: Some implementations include methods for detecting missing subtitles associated with a media presentation and may include receiving an audio component and a subtitle component associated with a media presentation, the audio component including an audio sequence, the audio sequence divided into a plurality of audio segments; evaluating the plurality of audio segments using a combination of a recurrent neural network and a convolutional neural network to identify refined speech segments associated with the audio sequence, the recurrent neural network trained based on a plurality of languages, the convolutional neural network trained based on a plurality of categories of sound; determining timestamps associated with the identified refined speech segments; and determining missing subtitles based on the timestamps associated with the identified refined speech segments and timestamps associated with subtitles included in the subtitle component.
    Type: Grant
    Filed: March 18, 2021
    Date of Patent: December 27, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Honey Gupta, Mayank Sharma
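    The final comparison step, checking the timestamps of refined speech segments against the timestamps covered by subtitles, reduces to interval-overlap bookkeeping. The (start, end) seconds format is an assumption for the example; the neural-network segment refinement upstream is out of scope here.

    ```python
    # Sketch: flag refined speech segments that no subtitle time range
    # overlaps; those indicate missing subtitles.
    def overlaps(a, b):
        return a[0] < b[1] and b[0] < a[1]

    def missing_subtitles(speech_segments, subtitle_ranges):
        """Return speech segments that have no overlapping subtitle."""
        return [seg for seg in speech_segments
                if not any(overlaps(seg, sub) for sub in subtitle_ranges)]

    speech = [(0.0, 2.5), (10.0, 12.0), (30.0, 33.0)]
    subs = [(0.2, 2.4), (29.5, 34.0)]
    ```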
  • Patent number: 11538463
    Abstract: Methods and systems are provided for generating a customized speech recognition neural network system comprised of an adapted automatic speech recognition neural network and an adapted language model neural network. The automatic speech recognition neural network is first trained in a generic domain and then adapted to a target domain. The language model neural network is first trained in a generic domain and then adapted to a target domain. Such a customized speech recognition neural network system can be used to understand input vocal commands.
    Type: Grant
    Filed: April 12, 2019
    Date of Patent: December 27, 2022
    Assignee: Adobe Inc.
    Inventors: Trung Huu Bui, Subhadeep Dey, Franck Dernoncourt
  • Patent number: 11508367
    Abstract: A dialogue system includes: a storage configured to store a parameter tree including at least one parameter used for performing an action; a speech input device configured to receive speech from a user; an input processor configured to apply a natural language understanding algorithm to the received speech to generate a speech recognition result; a dialogue manager configured to determine an action corresponding to the received speech based on the speech recognition result, to retrieve a parameter tree corresponding to the action from the storage, and to determine additional information needed to perform the action based on the retrieved parameter tree; and a result processor configured to generate a dialogue response for requesting the additional information.
    Type: Grant
    Filed: November 4, 2019
    Date of Patent: November 22, 2022
    Assignees: HYUNDAI MOTOR COMPANY, KIA MOTORS CORPORATION, SOGANG UNIVERSITY RESEARCH & BUSINESS DEVELOPMENT FOUNDATION
    Inventors: Youngmin Park, Seona Kim, Jeong-Eom Lee, Jung Yun Seo
  • Patent number: 11507581
    Abstract: The invention concerns a query response device comprising: an input adapted to receive user queries; a memory (106) adapted to store one or more routing rules; one or more live agent engines (116) configured to support interactions with one or more live agents; one or more virtual assistant engines (120) configured to support interactions with one or more virtual assistants instantiated by an artificial intelligence module (103); and a routing module (104) coupled to said live agent engines and to said virtual assistant engines, the routing module comprising a processing device configured: to select, based on content of at least a first user message from a first user relating to a first user query and on said one or more routing rules, a first of said live agent engines or a first of said virtual assistant engines; and to route one or more further user messages relating to the first user query to the selected engine.
    Type: Grant
    Filed: May 5, 2020
    Date of Patent: November 22, 2022
    Assignee: Accenture Global Services Limited
    Inventors: Anatoly Roytman, Alexandre Naressi
  • Patent number: 11508375
    Abstract: An electronic apparatus is provided. The electronic apparatus includes a microphone, a transceiver, a memory configured to store a control command identification tool based on a control command identified by a voice recognition server that performs voice recognition processing on a user voice received from the electronic apparatus, and at least one processor configured to, based on the user voice being received through the microphone, acquire user intention information by performing the voice recognition processing on the received user voice, receive status information of external devices related to the acquired user intention information from a device control server, identify a control command for controlling a device to be controlled among the plurality of external devices by applying the acquired user intention information and the received status information of the external devices to the control command identification tool, and transmit the identified control command to the device control server.
    Type: Grant
    Filed: June 17, 2020
    Date of Patent: November 22, 2022
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Woojei Choi, Minkyong Kim
  • Patent number: 11501546
    Abstract: In various embodiments, methods and systems for implementing a media management system, for video data processing and adaptation data generation, are provided. At a high level, a video data processing engine relies on different types of video data properties and additional auxiliary data resources to perform video optical character recognition operations for recognizing characters in video data. In operation, video data is accessed to identify recognized characters. A video OCR operation to perform on the video data for character recognition is determined from video character processing and video auxiliary data processing. Video auxiliary data processing includes processing an auxiliary reference object; the auxiliary reference object is an indirect reference object that is a derived input element used as a factor in determining the recognized characters. The video data is processed based on the video OCR operation and based on processing the video data, at least one recognized character is communicated.
    Type: Grant
    Filed: July 27, 2020
    Date of Patent: November 15, 2022
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Royi Ronen, Ika Bar-Menachem, Ohad Jassin, Avner Levi, Olivier Nano, Oron Nir, Mor Geva Pipek, Ori Ziv
  • Patent number: 11494554
    Abstract: A function execution instruction system includes a function execution instruction unit configured to instruct execution of one or more functions, a sentence input unit configured to input a sentence, an execution function determination unit configured to determine a function the execution of which is instructed on the basis of an input sentence, a time information extraction unit configured to extract time information indicating a time from the input sentence, and a time specification unit configured to, in accordance with a determined function, specify a time used for the execution of the function on the basis of extracted time information wherein the function execution instruction unit instructs the execution of the determined function, which uses a specified time.
    Type: Grant
    Filed: December 4, 2018
    Date of Patent: November 8, 2022
    Assignee: NTT DOCOMO, INC.
    Inventors: Hiroshi Fujimoto, Kousuke Kadono
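The pipeline above (determine the function from a sentence, extract time information, then specify the time used for execution) can be sketched with simple keyword matching and a regex. The function names and trigger words are hypothetical, not the patented set:

```python
import re

def parse_command(sentence):
    """Determine the requested function from a sentence and extract
    time information (hour, minute) for its execution."""
    # Illustrative keyword-to-function table.
    functions = {"alarm": "set_alarm", "remind": "set_reminder"}
    function = next(
        (f for key, f in functions.items() if key in sentence.lower()), None
    )
    # Extract a time of the form H:MM or HH:MM from the sentence.
    match = re.search(r"\b(\d{1,2}):(\d{2})\b", sentence)
    time_info = (int(match.group(1)), int(match.group(2))) if match else None
    return function, time_info

print(parse_command("Please set an alarm for 7:30 tomorrow"))
# → ('set_alarm', (7, 30))
```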
  • Patent number: 11494434
    Abstract: The system receives a voice query at an audio interface and converts the voice query to text. The system can determine pronunciation information during conversion and generate metadata that indicates a pronunciation of one or more words of the query, include phonetic information in the text query, or both. A query includes one or more entities, which may be more accurately identified based on pronunciation. The system searches for information, content, or both among one or more databases based on the generated text query, pronunciation information, user profile information, search histories or trends, and optionally other information. The system identifies one or more entities or content items that match the text query, and retrieves the identified information to provide to the user.
    Type: Grant
    Filed: July 31, 2019
    Date of Patent: November 8, 2022
    Assignee: ROVI GUIDES, INC.
    Inventors: Ankur Aher, Indranil Coomar Doss, Aashish Goyal, Aman Puniyani, Kandala Reddy, Mithun Umesh
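One common way to attach pronunciation metadata to transcribed query terms, as the abstract describes, is a phonetic code such as classic Soundex, so that entities matching the spoken form can be found despite transcription variants. This is a stand-in sketch; the patent does not specify Soundex:

```python
def soundex(word):
    """Classic Soundex code, used here as illustrative pronunciation
    metadata for a transcribed query term."""
    codes = {**dict.fromkeys("bfpv", "1"), **dict.fromkeys("cgjkqsxz", "2"),
             **dict.fromkeys("dt", "3"), "l": "4",
             **dict.fromkeys("mn", "5"), "r": "6"}
    word = word.lower()
    result = word[0].upper()
    last = codes.get(word[0], "")
    for ch in word[1:]:
        code = codes.get(ch, "")
        if code and code != last:
            result += code
        if ch not in "hw":  # h and w do not reset the previous code
            last = code
    return (result + "000")[:4]

# "Stephen" and a misheard "Steven" share a pronunciation code, so an
# entity search can match them despite the transcription difference.
print(soundex("Stephen"), soundex("Steven"))
# → S315 S315
```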
  • Patent number: 11488590
    Abstract: According to some embodiments of the disclosure, a method is disclosed. The method includes receiving, by a processing device of an in-ear device, an audio signal from one or more microphones of the in-ear device. The method further includes extracting, by the processing device, one or more features of the audio signal and generating, by the processing device, an in-ear data object based on the one or more features. The method also includes publishing, by the processing device, the in-ear data object to an external system via a network.
    Type: Grant
    Filed: May 9, 2019
    Date of Patent: November 1, 2022
    Assignee: Staton Techiya LLC
    Inventors: Charles Cella, John Keady
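The feature-extraction-and-publish flow above can be sketched as follows. The feature choices (RMS energy, zero-crossing rate) and the JSON field names are assumptions for illustration, not taken from the patent:

```python
import json
import math

def make_inear_object(samples, device_id="earpiece-01"):
    """Extract simple features from one audio frame captured by the
    in-ear microphones and wrap them in a publishable data object."""
    # Root-mean-square energy of the frame.
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    # Zero-crossing rate: fraction of adjacent sample pairs changing sign.
    zcr = sum(1 for a, b in zip(samples, samples[1:]) if a * b < 0) / (len(samples) - 1)
    return json.dumps({
        "device": device_id,
        "features": {"rms": round(rms, 4), "zcr": round(zcr, 4)},
    })

# The serialized object would then be published to an external system.
print(make_inear_object([0.1, -0.1, 0.2, -0.2]))
```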
  • Patent number: 11487502
    Abstract: A portable terminal device in an information processing system and method includes a camera and a microphone. Data of obtained images and voice are transmitted to a server that identifies operations to be executed based on the received voice and image data. The server transmits an identification of one or more results of the plurality of operations to the portable terminal device. When the portable terminal device receives only one result from the server, an operation corresponding to the one result is executed, and when a plurality of results is received, the portable terminal device displays information corresponding to the plurality of results as candidates. Additional voice is captured for selecting one of the plurality of results during the displaying of the information. A determination of one result from the plurality of results is made based on the captured voice, and an operation corresponding to the determined result is executed.
    Type: Grant
    Filed: April 29, 2019
    Date of Patent: November 1, 2022
    Assignee: Maxell, Ltd.
    Inventors: Motoyuki Suzuki, Hideo Nishijima
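The control flow the abstract describes (execute directly on one result, otherwise display candidates and let a follow-up voice input pick one) reduces to a short dispatch routine. Here `ask_user` is a hypothetical callback standing in for capturing and interpreting the additional voice input:

```python
def resolve_operation(results, ask_user):
    """With a single server result, execute it directly; with several,
    present them as candidates and use a follow-up selection
    (e.g. interpreted from additional captured voice) to choose one."""
    if len(results) == 1:
        return results[0]
    choice = ask_user(results)  # e.g. the utterance "the second one" -> 1
    return results[choice]

# Simulated follow-up utterance selecting the second candidate.
picked = resolve_operation(["call Alice", "call Alicia"], lambda opts: 1)
print(picked)
# → call Alicia
```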
  • Patent number: 11481087
    Abstract: An electronic device includes an input interface configured to receive an input signal, a command determination unit configured to determine a plurality of possible commands based on the input signal, and an output interface configured to provide a plurality of output information corresponding to effects associated with each of the plurality of determined possible commands.
    Type: Grant
    Filed: March 18, 2015
    Date of Patent: October 25, 2022
    Assignee: SONY CORPORATION
    Inventors: Fritz Hohl, Stefan Uhlich, Wilhelm Hagg, Thomas Kemp
  • Patent number: 11482215
    Abstract: A method comprising detecting an activation of an intelligent assistant on an electronic device, waking up the intelligent assistant from a sleep mode in response to the activation, and determining an amount of vocabulary the intelligent assistant acts upon during a listening mode based on a type of the activation.
    Type: Grant
    Filed: March 27, 2019
    Date of Patent: October 25, 2022
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Jeffrey C. Olson, Henry N. Holtzman, Jean-David Hsu, Jeffrey A. Morgan
  • Patent number: 11463779
    Abstract: A video stream processing method is provided. First audio stream data in live video stream data is obtained. Speech recognition is performed on the first audio stream data to generate speech recognition text. Caption data is generated according to the speech recognition text, the caption data including caption text and time information corresponding to the caption text. The caption text is added to a corresponding picture frame in the live video stream data according to the time information corresponding to the caption text to generate captioned live video stream data.
    Type: Grant
    Filed: July 7, 2020
    Date of Patent: October 4, 2022
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Xiaohua Hu, Ziheng Luo, Xiuming Zhu
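The captioning step above hinges on aligning recognized text to picture frames via the caption's time information. A minimal sketch of that alignment, assuming frames are keyed by timestamp (a simplification of real live-stream muxing):

```python
import bisect

def attach_captions(frame_times, captions):
    """Attach each caption to the latest frame at or before its start
    time. `captions` is a list of (start_seconds, text) pairs produced
    by speech recognition on the live audio stream."""
    out = {}
    for start, text in captions:
        i = bisect.bisect_right(frame_times, start) - 1
        if i >= 0:
            out.setdefault(frame_times[i], []).append(text)
    return out

frames = [0.0, 0.5, 1.0, 1.5]
subs = [(0.4, "Hello"), (1.2, "welcome to the stream")]
print(attach_captions(frames, subs))
# → {0.0: ['Hello'], 1.0: ['welcome to the stream']}
```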
  • Patent number: 11462217
    Abstract: An electronic apparatus is disclosed.
    Type: Grant
    Filed: April 21, 2020
    Date of Patent: October 4, 2022
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Kihyun Song, Jongjin Park, Shina Kim, Sukhoon Yoon, Wonjae Lee, Jongkeun Lee
  • Patent number: 11461649
    Abstract: In implementations of searching for music, a music search system can receive a music search request that includes a music file including music content. The music search system can also receive a selected musical attribute from a plurality of musical attributes. The music search system includes a music search application that can generate musical features of the music content, where a respective one or more of the musical features correspond to a respective one of the musical attributes. The music search application can then compare the musical features that correspond to the selected musical attribute to audio features of audio files, and determine similar audio files to the music file based on the comparison of the musical features to the audio features of the audio files.
    Type: Grant
    Filed: March 19, 2020
    Date of Patent: October 4, 2022
    Assignee: Adobe Inc.
    Inventors: Jongpil Lee, Nicholas J. Bryan, Justin J. Salamon, Zeyu Jin
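The comparison step above, restricting similarity to the features of the selected musical attribute, can be sketched with cosine similarity over per-attribute feature vectors. Attribute names and the catalog layout here are illustrative assumptions:

```python
import math

def rank_by_attribute(query_feats, catalog, attribute):
    """Rank catalog audio files by cosine similarity to the query,
    using only the feature vector for the selected musical attribute
    (e.g. 'tempo' or 'timbre')."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb)

    q = query_feats[attribute]
    scored = [(name, cosine(q, feats[attribute]))
              for name, feats in catalog.items()]
    return sorted(scored, key=lambda s: s[1], reverse=True)

catalog = {"a.wav": {"tempo": [1.0, 0.0]}, "b.wav": {"tempo": [0.6, 0.8]}}
ranked = rank_by_attribute({"tempo": [0.9, 0.1]}, catalog, "tempo")
print(ranked[0][0])
# → a.wav
```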
  • Patent number: 11462233
    Abstract: An electronic device and method of recognizing an audio scene are provided.
    Type: Grant
    Filed: November 15, 2019
    Date of Patent: October 4, 2022
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Hoon Heo, Sunmin Kim, Kiwoong Kang, Kibeom Kim, Inwoo Hwang
  • Patent number: 11462209
    Abstract: For the problem of waveform synthesis from spectrograms, presented herein are embodiments of an efficient neural network architecture, based on transposed convolutions, that achieves high compute intensity and fast inference. In one or more embodiments, the convolutional vocoder architecture is trained with losses related to perceptual audio quality, as well as a GAN framework in which a critic discerns unrealistic waveforms. While yielding high-quality audio, embodiments of the model can synthesize audio more than 500 times faster than real time. Multi-head convolutional neural network (MCNN) embodiments for waveform synthesis from spectrograms are also disclosed. MCNN embodiments enable significantly better utilization of modern multi-core processors than commonly used iterative algorithms like Griffin-Lim and yield very fast (more than 300× real-time) waveform synthesis.
    Type: Grant
    Filed: March 27, 2019
    Date of Patent: October 4, 2022
    Assignee: Baidu USA LLC
    Inventors: Sercan Arik, Hee Woo Jun, Eric Undersander, Gregory Diamos
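The transposed convolution the abstract builds on is an upsampling primitive: each input sample scatters a scaled copy of the kernel into the output at stride spacing, turning coarse spectrogram frames into many waveform samples. A toy 1-D version of the primitive (not the paper's network):

```python
def conv_transpose_1d(x, kernel, stride):
    """Minimal 1-D transposed convolution: each input value adds a
    scaled copy of the kernel into the output, offset by `stride` per
    input position. Overlapping kernel taps sum together."""
    out = [0.0] * ((len(x) - 1) * stride + len(kernel))
    for i, v in enumerate(x):
        for j, k in enumerate(kernel):
            out[i * stride + j] += v * k
    return out

# Two "frames" upsampled 2x with an overlapping 3-tap kernel.
print(conv_transpose_1d([1.0, 2.0], [0.5, 1.0, 0.5], stride=2))
# → [0.5, 1.0, 1.5, 2.0, 1.0]
```

Stacking several such layers multiplies the strides, which is how a vocoder expands a spectrogram's frame rate up to the audio sample rate in a few steps.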
  • Patent number: 11449672
    Abstract: Disclosed is an electronic device including a communication circuit, a microphone, a memory, a speaker, and a processor, in which the processor receives a speech input and transmits first data associated with the speech input to a first server supporting the speech recognition service; receives, from the first server, second data corresponding to processing of a part of the first data and outputs the second data at a first time, after a first period of time has elapsed since the transmission of the first data; and outputs third data, corresponding to processing of the rest of the first data, at a second time after a second period of time has elapsed from the first time, the third data having been received from the first server or from a second server supporting the speech recognition service before the second time.
    Type: Grant
    Filed: July 18, 2018
    Date of Patent: September 20, 2022
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Yong Wook Kim, Dong Kyu Lee, Ja Min Goo, Gang Heok Kim
  • Patent number: 11450317
    Abstract: The disclosure relates to a smart furniture controller with voice recognition, including a controller body and a control circuit. The control circuit is disposed in the controller body and comprises a main control unit, a voice recognition module, an operation panel, a control output interface and a power module. The voice recognition module is connected to the main control unit; a microphone and a loudspeaker are connected to the voice recognition module; the operation panel and the control output interface are connected to the main control unit. The disclosed embodiments integrate a voice recognition module in the controller: voice commands recognized through the module allow users to control the smart furniture by voice, making operation simpler and more convenient.
    Type: Grant
    Filed: September 27, 2019
    Date of Patent: September 20, 2022
    Assignee: eMoMo Technology Co., Ltd.
    Inventors: Wenji Tang, Jingzhi Chen, Zhigang Wang, Wei Zhou, Ming Kong, Lin Chen, Shunde Yang, Jiandong Li, Qishuang Lu, Zaigui Yang
  • Patent number: 11450323
    Abstract: Speech is digitized and analyzed by a speech-recognition platform to produce raw text sentences. In various embodiments, the recognized words of each sentence are tokenized based on a grammar, which may be selected by a Recognition Context Controller (RCC) using a context database. A Medical Context Semantic Library (MCSL) contains all medically relevant terms recognized by the system and, once the grammar is selected, the MCSL is used to select a semantic template (consisting of one or more hierarchically organized data structures whose root is a “Concept”). Recognized words are mapped to tokens based on the operative grammar to fill the Concept tree(s). The grammar and the Concept trees can potentially shift after each sentence based on the RCC's analysis. The trees accumulate and are filled as sentences are analyzed. Once all of the sentences have been analyzed, the trees have been filled to the extent possible. Concepts may be organized into higher-level Observations.
    Type: Grant
    Filed: March 31, 2020
    Date of Patent: September 20, 2022
    Inventors: Kaushal Shastri, Gerard Muro
  • Patent number: 11443554
    Abstract: One or more computing devices, systems, and/or methods are provided. One or more videos associated with a user may be analyzed to determine a first set of features of the user associated with a first emotion of the user and/or a second set of features of the user associated with a second emotion of the user. A first user emotion profile associated with the user may be generated based upon the first set of features and/or the second set of features. A second video may be presented via a graphical user interface of a first client device. The user may be identified within the second video. It may be determined, based upon the second video and/or the first user emotion profile, that the user is associated with the first emotion. A representation of the first emotion may be displayed via the graphical user interface of the first client device.
    Type: Grant
    Filed: August 6, 2019
    Date of Patent: September 13, 2022
    Assignee: Verizon Patent and Licensing Inc.
    Inventors: Ariel Raviv, Joel Oren, Irena Grabovitch-Zuyev
  • Patent number: 11437035
    Abstract: An agent device is equipped with a plurality of agent controllers that provide a service including causing an output device to output a voice response in accordance with an utterance of an occupant of a vehicle, in which a first agent controller included in the plurality of agent controllers provides an agent controller different from the first agent controller with first service information on the service to be provided to the occupant.
    Type: Grant
    Filed: March 10, 2020
    Date of Patent: September 6, 2022
    Assignee: HONDA MOTOR CO., LTD.
    Inventors: Masaki Kurihara, Hiroshi Honda
  • Patent number: 11430465
    Abstract: A method of recorded message detection is provided. An audio stream is received and a set of landmark features is identified in a section of the audio stream. From these landmark features an audio fingerprint for the section of the audio stream is derived. This audio fingerprint is compared with at least one of a plurality of stored audio fingerprints, each derived from a respective audio stream. The received audio stream is determined to be a recorded message if the derived audio fingerprint is substantially equivalent to one of the plurality of stored audio fingerprints representing a recorded message.
    Type: Grant
    Filed: June 4, 2019
    Date of Patent: August 30, 2022
    Assignee: Magus Communications Limited
    Inventor: Michael Thompson
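Landmark-based audio fingerprinting of the kind the abstract describes is commonly built by hashing pairs of spectral peaks, so that the hash depends only on frequency pairs and their time offset and is therefore shift-invariant. A simplified sketch under those assumptions (the patent does not specify this exact scheme):

```python
def landmark_fingerprint(peaks, fan_out=3):
    """Hash pairs of spectral peaks (time, frequency) into landmark
    fingerprints: (anchor freq, target freq, time delta). Using the
    time delta rather than absolute time makes the print shift-invariant."""
    hashes = set()
    for i, (t1, f1) in enumerate(peaks):
        for t2, f2 in peaks[i + 1:i + 1 + fan_out]:
            hashes.add((f1, f2, t2 - t1))
    return hashes

def is_recorded(query_peaks, stored_prints, threshold=0.8):
    """Flag the stream as a recorded message if its fingerprint is
    substantially equivalent to a stored fingerprint."""
    fp = landmark_fingerprint(query_peaks)
    return any(len(fp & s) / len(s) >= threshold for s in stored_prints)

stored = [landmark_fingerprint([(0, 100), (1, 150), (2, 120)])]
# Same peaks shifted 5 time units later still match exactly.
print(is_recorded([(5, 100), (6, 150), (7, 120)], stored))
# → True
```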
  • Patent number: 11431517
    Abstract: Methods and systems for team cooperation with real-time recording of one or more moment-associating elements. For example, a method includes: delivering, in response to an instruction, an invitation to each member of one or more members associated with a workspace; granting, in response to acceptance of the invitation by one or more subscribers of the one or more members, subscription permission to the one or more subscribers; receiving the one or more moment-associating elements; transforming the one or more moment-associating elements into one or more pieces of moment-associating information; and transmitting at least one piece of the one or more pieces of moment-associating information to the one or more subscribers.
    Type: Grant
    Filed: February 3, 2020
    Date of Patent: August 30, 2022
    Assignee: Otter.ai, Inc.
    Inventors: Simon Lau, Yun Fu, James Mason Altreuter, Brian Francis Williams, Xiaoke Huang, Tao Xing, Wen Sun, Tao Lu, Kaisuke Nakajima, Kean Kheong Chin, Hitesh Anand Gupta, Julius Cheng, Jing Pan, Sam Song Liang