Recognition Patents (Class 704/231)
  • Patent number: 11710486
    Abstract: A virtual environment platform may receive, from a user device, a request to access a virtual reality (VR) environment and may verify, based on the request, a user of the user device to allow the user device access to the VR environment. The virtual environment platform may receive, after verifying the user of the user device, user voice input and user handwritten input from the user device. The virtual environment platform may generate processed user speech by processing the user voice input, wherein a characteristic of the processed user speech and a corresponding characteristic of the user voice input are different and may generate formatted user text by processing the user handwritten input, wherein the formatted user text is machine-encoded text. The virtual environment platform may cause the processed user speech to be audibly presented and the formatted user text to be visually presented in the VR environment.
    Type: Grant
    Filed: June 11, 2021
    Date of Patent: July 25, 2023
    Assignee: Capital One Services, LLC
    Inventors: Austin Walters, Jeremy Goodsitt, Fardin Abdi Taghi Abad, Vincent Pham, Kenneth Taylor
  • Patent number: 11705127
    Abstract: Coordinating signal processing among computing devices in a voice-driven computing environment is provided. A first and second digital assistant can detect an input audio signal, perform a signal quality check, and provide indications that the first and second digital assistants are operational to process the input audio signal. A system can select the first digital assistant for further processing. The system can receive, from the first digital assistant, data packets including a command. The system can generate, for a network connected device selected from a plurality of network connected devices, an action data structure based on the data packets, and transmit the action data structure to the selected network connected device.
    Type: Grant
    Filed: June 11, 2021
    Date of Patent: July 18, 2023
    Assignee: GOOGLE LLC
    Inventors: Anshul Kothari, Gaurav Bhaya, Tarun Jain
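    The device-selection step this abstract describes can be sketched in a few lines: each assistant reports whether it passed its signal quality check, and the system picks the operational one with the best signal. The device names and the SNR metric below are invented for illustration, not taken from the patent.

    ```python
    # Illustrative sketch (not Google's implementation): choose which of
    # several digital assistants should process an input audio signal,
    # based on a simple signal-quality check.
    from dataclasses import dataclass

    @dataclass
    class AssistantReport:
        device_id: str
        snr_db: float        # signal-to-noise ratio reported by the device
        operational: bool    # device passed its own signal quality check

    def select_assistant(reports):
        """Pick the operational assistant with the best signal quality."""
        candidates = [r for r in reports if r.operational]
        if not candidates:
            return None
        return max(candidates, key=lambda r: r.snr_db).device_id

    reports = [
        AssistantReport("kitchen-speaker", snr_db=12.5, operational=True),
        AssistantReport("living-room-display", snr_db=18.0, operational=True),
        AssistantReport("hallway-speaker", snr_db=20.0, operational=False),
    ]
    ```

    Note the non-operational device is excluded even though it reports the highest SNR; only devices that passed the quality check compete.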
  • Patent number: 11700335
    Abstract: A system may provide for the generation of spatial audio for audiovisual conferences, video conferences, etc. (referred to herein simply as “conferences”). Spatial audio may include audio encoding and/or decoding techniques in which a sound source may be specified at a location, such as on a two-dimensional plane and/or within a three-dimensional field, and/or in which a direction or target for a given sound source may be specified. A conference participant's position within a conference user interface (“UI”) may be set as the source of sound associated with the conference participant, such that different conference participants may be associated with different sound source positions within the conference UI.
    Type: Grant
    Filed: September 7, 2021
    Date of Patent: July 11, 2023
    Assignee: Verizon Patent and Licensing Inc.
    Inventor: Pierre Seigneurbieux
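    The core mapping described here, from a participant's tile position in the conference UI to a sound-source position, can be illustrated with a standard equal-power stereo pan. The pan law and pixel coordinates are assumptions for the sketch; the patent covers 2D and 3D positioning more generally.

    ```python
    # Hedged sketch: map a participant's horizontal tile position in the
    # conference UI to equal-power stereo gains, so their voice appears to
    # come from where their tile is drawn.
    import math

    def pan_gains(tile_x: float, ui_width: float):
        """Equal-power stereo gains for a tile centered at tile_x (pixels)."""
        position = tile_x / ui_width      # 0.0 = far left, 1.0 = far right
        angle = position * math.pi / 2    # map onto a quarter circle
        left_gain = math.cos(angle)
        right_gain = math.sin(angle)
        return left_gain, right_gain
    ```

    With an equal-power law, left² + right² stays constant, so perceived loudness does not change as a tile moves across the window.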
  • Patent number: 11694693
    Abstract: Methods and systems for processing audio signals containing speech data are disclosed. Biometric data associated with at least one speaker are extracted from an audio input. A correspondence is determined between the extracted biometric data and stored biometric data associated with a consenting user profile, where a consenting user profile is a user profile that indicates consent to store biometric data. If no correspondence is determined, the speech data is discarded, optionally after having been processed.
    Type: Grant
    Filed: March 21, 2022
    Date of Patent: July 4, 2023
    Assignee: SoapBox Labs Ltd.
    Inventor: Patricia Scanlon
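    The consent gate in this abstract reduces to a simple control flow: match extracted biometric data against consenting profiles, and signal the caller to discard the speech when no match is found. The similarity function and threshold below are placeholder assumptions, not SoapBox Labs' method.

    ```python
    # Minimal sketch of the consent gate: speech data survives only if the
    # extracted biometric features match a consenting user profile.
    def match_score(extracted, stored):
        # Placeholder similarity: fraction of near-equal feature values.
        hits = sum(1 for a, b in zip(extracted, stored) if abs(a - b) < 0.1)
        return hits / max(len(extracted), 1)

    def process_or_discard(extracted, consenting_profiles, threshold=0.8):
        """Return the matched profile id, or None if speech must be discarded."""
        for profile_id, stored in consenting_profiles.items():
            if match_score(extracted, stored) >= threshold:
                return profile_id
        return None  # no consenting match: caller discards the speech data
    ```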
  • Patent number: 11689380
    Abstract: A method and a device for viewing a conference are provided. In the method, after a wide-view video of a specific conference, related conference event data, and the speech content of each participant are obtained, a highlight video of the specific conference is generated, improving the efficiency of conference viewing.
    Type: Grant
    Filed: November 23, 2021
    Date of Patent: June 27, 2023
    Assignee: ASPEED Technology Inc.
    Inventor: Chen-Wei Chou
  • Patent number: 11688388
    Abstract: Techniques are described for fulfilling an utterance request for an item represented within a video rendered at a client device. In some implementations, a user account associated with the request is identified, enabling a video stream transmitted in association with the user account at the time that the request was uttered to be identified. In one technique, a timestamp associated with the request is used to identify the relevant portion of the video stream. The item represented within the portion of the video stream can be identified using various techniques and/or information such as image recognition, metadata within the video, subtitles, closed captions, and/or a database mapping between the item and a video content item transmitted in the video stream.
    Type: Grant
    Filed: September 24, 2020
    Date of Patent: June 27, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Joshua Danovitz, Lei Li, Lars Christian Ulness, Andrew J. Watts, Amarsingh Buckthasingh Winston, Umut Utkan, Michael Flynn, Girish Bansilal Bajaj
  • Patent number: 11682384
    Abstract: A method for training an alarm system to classify audio of an event, wherein the alarm system is connected to a neural network trained to classify audio as an event type, the method comprising the steps of: receiving audio recorded during a first period of time; transmitting the audio to an external unit; receiving data from the external unit indicating a sub-period of time of the audio and data indicating an event type of the indicated sub-period of time of the audio; and re-training the neural network by inputting a sub-period of the audio corresponding to the indicated sub-period of time of the audio and using the indicated event type as a correct classification of the sub-period of the audio.
    Type: Grant
    Filed: February 23, 2021
    Date of Patent: June 20, 2023
    Assignee: Axis AB
    Inventors: Ingemar Larsson, Daniel Andersson
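    The labeling round-trip in this claim, where the external unit returns a sub-period of time plus an event type and the alarm system cuts that sub-period out of the recording as a training example, can be sketched as straightforward sample-index arithmetic. The sample-rate handling is an assumption for illustration.

    ```python
    # Sketch: slice the externally-labeled sub-period out of the recorded
    # audio to form one (audio, label) training example for re-training.
    def extract_training_example(samples, sample_rate, start_s, end_s, event_type):
        """Slice the labeled sub-period out of the full recording."""
        start = int(start_s * sample_rate)
        end = int(end_s * sample_rate)
        return {"audio": samples[start:end], "label": event_type}

    # e.g. a 10-second clip at 8 kHz, with seconds 2.0-3.0 labeled "glass break"
    clip = list(range(80_000))
    example = extract_training_example(clip, 8000, 2.0, 3.0, "glass break")
    ```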
  • Patent number: 11676576
    Abstract: Systems and methods are provided for acquiring training data and building an organizational-based language model based on the training data. As organizational data is generated via one or more applications associated with an organization, the collected organizational data is aggregated and filtered into training data that is used to train an organizational-based language model for speech processing.
    Type: Grant
    Filed: August 11, 2021
    Date of Patent: June 13, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Ziad Al Bawab, Anand U Desai, Cem Aksoylar, Michael Levit, Xin Meng, Shuangyu Chang, Suyash Choudhury, Dhiresh Rawal, Tao Li, Rishi Girish, Marcus Jager, Ananth Rampura Sheshagiri Rao
  • Patent number: 11670015
    Abstract: Embodiments of the present disclosure provide a method and apparatus for generating a video. The method may include: acquiring a cartoon face image sequence of a target cartoon character from a received cartoon-style video, and generating a cartoon face contour figure sequence based on the cartoon face image sequence; generating a face image sequence for a real face based on the cartoon face contour figure sequence and a received initial face image of the real face, a face expression in the face image sequence matching a face expression in the cartoon face image sequence; generating a cartoon-style face image sequence for the real face according to the face image sequence; and replacing a face image of the target cartoon character in the cartoon-style video with a cartoon-style face image in a cartoon-style face image sequence, to generate a cartoon-style video corresponding to the real face.
    Type: Grant
    Filed: December 4, 2020
    Date of Patent: June 6, 2023
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Yunfeng Liu, Chao Wang, Yuanhang Li, Ting Yun, Guoqing Chen
  • Patent number: 11664031
    Abstract: Implementations of the subject technology provide systems and methods for multi-mode voice triggering for audio devices. An audio device may store multiple voice recognition models, each trained to detect a single corresponding trigger phrase. So that the audio device can detect a specific one of the multiple trigger phrases without consuming the processing and/or power resources to run a voice recognition model that can differentiate between different trigger phrases, the audio device pre-loads a selected one of the voice recognition models for an expected trigger phrase into a processor of the audio device. The audio device may select the one of the voice recognition models for the expected trigger phrase based on a type of a companion device that is communicatively coupled to the audio device.
    Type: Grant
    Filed: March 11, 2021
    Date of Patent: May 30, 2023
    Assignee: Apple Inc.
    Inventors: Dersheet C. Mehta, Dinesh Garg, Sham Anton Koli, Kerry J. Kopp, Hans Bernhard
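    The pre-loading decision described above is essentially a lookup: the companion device type implies an expected trigger phrase, which selects which single-phrase model to load. The device types, phrases, and model file names below are invented for the example; this is not Apple's implementation.

    ```python
    # Hedged illustration: pre-load only the single-phrase recognition
    # model matching the trigger phrase expected for the companion device.
    EXPECTED_PHRASE_BY_DEVICE = {
        "phone": "hey assistant",
        "tablet": "hey assistant",
        "watch": "wake up",
    }

    MODELS = {
        "hey assistant": "model_hey_assistant.bin",
        "wake up": "model_wake_up.bin",
    }

    def preload_model(companion_type, default_phrase="hey assistant"):
        """Select the single-phrase model to load into the processor."""
        phrase = EXPECTED_PHRASE_BY_DEVICE.get(companion_type, default_phrase)
        return MODELS[phrase]
    ```

    Loading one small single-phrase model instead of a multi-phrase discriminator is what saves the processing and power the abstract mentions.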
  • Patent number: 11636342
    Abstract: In implementations of searching for music, a music search system can receive a music search request that includes a music file including music content. The music search system can also receive a selected musical attribute from a plurality of musical attributes. The music search system includes a music search application that can generate musical features of the music content, where a respective one or more of the musical features correspond to a respective one of the musical attributes. The music search application can then compare the musical features that correspond to the selected musical attribute to audio features of audio files, and determine similar audio files to the music file based on the comparison of the musical features to the audio features of the audio files.
    Type: Grant
    Filed: October 3, 2022
    Date of Patent: April 25, 2023
    Assignee: Adobe Inc.
    Inventors: Jongpil Lee, Nicholas J. Bryan, Justin J. Salamon, Zeyu Jin
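    The comparison step here, restricting similarity to the features that correspond to the user-selected musical attribute, can be sketched with a per-attribute feature dictionary and cosine similarity. The feature layout and the cosine metric are assumptions for illustration.

    ```python
    # Sketch: rank library audio files by similarity to the query on the
    # selected attribute's features only (e.g. tempo vs. timbre).
    import math

    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    def rank_by_attribute(query_features, library, attribute):
        """Rank library files by similarity on the selected attribute only."""
        q = query_features[attribute]
        scored = [(cosine(q, feats[attribute]), name)
                  for name, feats in library.items()]
        return [name for _, name in sorted(scored, reverse=True)]

    query = {"tempo": [1.0, 0.0], "timbre": [0.2, 0.8]}
    library = {
        "track_a.wav": {"tempo": [0.9, 0.1], "timbre": [0.9, 0.1]},
        "track_b.wav": {"tempo": [0.0, 1.0], "timbre": [0.1, 0.9]},
    }
    ```

    The same query ranks the two tracks differently depending on which attribute is selected, which is the point of attribute-conditioned search.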
  • Patent number: 11636099
    Abstract: A computer-implemented method for generating a question from an abstracted template is described. A non-limiting example of the computer-implemented method includes receiving, by a processor, a question. The method parses, by the processor, the question into a parse tree and abstracts, by the processor, an abstracted template from the parse tree. The method receives, by the processor, a domain schema and a domain knowledge base and generates, by the processor, a new question based on the abstracted template, the domain schema, and the domain knowledge base.
    Type: Grant
    Filed: August 23, 2019
    Date of Patent: April 25, 2023
    Assignee: International Business Machines Corporation
    Inventors: Laura Chiticariu, Aparna Garimella, Yunyao Li
  • Patent number: 11631398
    Abstract: A voice aware audio system and a method for a user wearing a headset to be aware of an outer sound environment while listening to music or any other audio source. An adjustable sound awareness zone gives the user the flexibility to avoid hearing far distant voices. The outer sound can be analyzed in a frequency domain to select an oscillating frequency candidate and in a time domain to determine if the oscillating frequency candidate is the signal of interest. If the signal directed to the outer sound is determined to be a signal of interest, the outer sound is mixed with audio from the audio source.
    Type: Grant
    Filed: July 27, 2021
    Date of Patent: April 18, 2023
    Assignee: HED Technologies SARL
    Inventors: Timothy Degraye, Liliane Huguet
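    The frequency-domain half of the two-stage analysis above can be illustrated by picking the dominant frequency of the outer sound and checking it against a voice band. The naive DFT scan and the 85-255 Hz pitch band are illustrative assumptions, not HED's implementation.

    ```python
    # Sketch: select an oscillating frequency candidate in the frequency
    # domain, then accept it only if it falls in an assumed voice band.
    import math

    def dominant_frequency(samples, sample_rate, max_hz=1000):
        """Return the scanned frequency (Hz) with the largest DFT magnitude."""
        best_hz, best_mag = 0, -1.0
        for hz in range(1, max_hz + 1, 10):          # coarse 10 Hz grid
            re = sum(s * math.cos(2 * math.pi * hz * i / sample_rate)
                     for i, s in enumerate(samples))
            im = sum(s * math.sin(2 * math.pi * hz * i / sample_rate)
                     for i, s in enumerate(samples))
            mag = re * re + im * im
            if mag > best_mag:
                best_hz, best_mag = hz, mag
        return best_hz

    def is_voice_candidate(samples, sample_rate, band=(85, 255)):
        """Treat the candidate as speech if it lies in a typical pitch band."""
        hz = dominant_frequency(samples, sample_rate)
        return band[0] <= hz <= band[1]

    rate = 8000
    speech_tone = [math.sin(2 * math.pi * 151 * i / rate) for i in range(800)]
    whine_tone = [math.sin(2 * math.pi * 601 * i / rate) for i in range(800)]
    ```

    In the patent this frequency-domain pick is only the first stage; a time-domain check then confirms whether the candidate is actually the signal of interest.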
  • Patent number: 11620487
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting a neural network architecture for performing a machine learning task. In one aspect, a method comprises: obtaining data defining a synaptic connectivity graph representing synaptic connectivity between neurons in a brain of a biological organism; generating data defining a plurality of candidate graphs based on the synaptic connectivity graph; determining, for each candidate graph, a performance measure on a machine learning task of a neural network having a neural network architecture that is specified by the candidate graph; and selecting a final neural network architecture for performing the machine learning task based on the performance measures.
    Type: Grant
    Filed: January 29, 2020
    Date of Patent: April 4, 2023
    Assignee: X Development LLC
    Inventors: Sarah Ann Laszlo, Philip Edwin Watson, Georgios Evangelopoulos
  • Patent number: 11620994
    Abstract: The invention relates to a method for operating and a method for controlling a dialog system, wherein the dialog system comprises a local dialog unit for detecting dialog inputs and for outputting dialog outputs as well as an external dialog unit having a data connection to the local dialog unit for analyzing detected dialog inputs and determining dialog outputs based thereon. At least one probable future course of dialog consisting of dialog inputs and/or dialog outputs is calculated and transmitted to the local dialog unit by the external dialog unit. Said course of dialog is received and saved by the local dialog unit. The saved course of dialog is called up in the event of an interruption to the data connection between the local dialog unit and the external dialog unit. Dialog inputs are detected and/or dialog outputs are output based on the called-up course of dialog.
    Type: Grant
    Filed: February 4, 2020
    Date of Patent: April 4, 2023
    Assignee: VOLKSWAGEN AKTIENGESELLSCHAFT
    Inventors: Sebastian Varges, Spyros Kousidis
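    The fallback behavior described here, where the local unit caches a predicted course of dialog and replays it when the connection to the external unit drops, can be sketched as a small state machine. The connection flag and offline message are assumptions for the example.

    ```python
    # Minimal sketch (not Volkswagen's implementation): the external unit
    # pre-computes a probable future course of dialog; the local unit
    # caches it and falls back to it when the data connection drops.
    class LocalDialogUnit:
        def __init__(self):
            self.cached_course = []   # list of predicted dialog outputs
            self.connected = True

        def receive_predicted_course(self, course):
            self.cached_course = list(course)

        def next_output(self, user_input, external_unit=None):
            if self.connected and external_unit is not None:
                return external_unit(user_input)
            # connection interrupted: fall back to the saved course of dialog
            if self.cached_course:
                return self.cached_course.pop(0)
            return "Sorry, I am offline right now."

    unit = LocalDialogUnit()
    unit.receive_predicted_course(["Which destination?", "Starting navigation."])
    ```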
  • Patent number: 11621000
    Abstract: Embodiments described herein include systems and methods for using image searching with voice recognition commands. Embodiments of a method may include providing a user interface via a target application and receiving a user selection of an area on the user interface by a user, the area including a search image. Embodiments may also include receiving an associated voice command and associating, by the computing device, the associated voice command with the search image.
    Type: Grant
    Filed: April 27, 2021
    Date of Patent: April 4, 2023
    Assignee: Dolbey & Company, Inc.
    Inventor: Curtis A. Weeks
  • Patent number: 11609851
    Abstract: According to one aspect, a method for determining, for a memory allocation, placements in a memory area of data blocks generated by a neural network, comprises a development of an initial sequence of placements of blocks, each placement being selected from several possible placements, the initial sequence being defined as a candidate sequence, a development of at least one modified sequence of placements from a replacement of a given placement of the initial sequence by a memorized unselected placement, and, if the planned size of the memory area obtained by this modified sequence is less than that of the memory area of the candidate sequence, then this modified sequence becomes the candidate sequence, the placements of the blocks for the allocation being those of the placement sequence defined as a candidate sequence once each modified sequence has been developed.
    Type: Grant
    Filed: April 13, 2021
    Date of Patent: March 21, 2023
    Assignees: STMicroelectronics S.r.l., STMicroelectronics (Rousset) SAS
    Inventors: Laurent Folliot, Emanuele Plebani, Mirko Falchetto
  • Patent number: 11605390
    Abstract: Provided are various mechanisms and processes for language acquisition using socio-neurocognitive techniques.
    Type: Grant
    Filed: September 1, 2020
    Date of Patent: March 14, 2023
    Inventor: Malihe Eshghavi
  • Patent number: 11580994
    Abstract: A method includes receiving acoustic features of a first utterance spoken by a first user that speaks with typical speech and processing the acoustic features of the first utterance using a general speech recognizer to generate a first transcription of the first utterance. The operations also include analyzing the first transcription of the first utterance to identify one or more bias terms in the first transcription and biasing the alternative speech recognizer on the one or more bias terms identified in the first transcription. The operations also include receiving acoustic features of a second utterance spoken by a second user that speaks with atypical speech and processing, using the alternative speech recognizer biased on the one or more terms identified in the first transcription, the acoustic features of the second utterance to generate a second transcription of the second utterance.
    Type: Grant
    Filed: January 20, 2021
    Date of Patent: February 14, 2023
    Assignee: Google LLC
    Inventors: Fadi Biadsy, Pedro Jose Moreno Mengibar
  • Patent number: 11581006
    Abstract: An enunciation system (ES) enables users to gain acquaintance, understanding, and mastery of the relationship between letters and sounds in the context of an alphabetic writing system. The ES enables the user to experience the action of sounding out a word, before their own phonics knowledge enables them to sound out the word independently; its continuous, unbroken speech output or input avoids the common confusions that ensue from analyzing words by breaking them up into discrete sounds; its user-controlled pacing allows the user to slow down enunciation at specific points of difficulty within the word; its real-time touch control allows the written word to be “played” like a musical instrument, with expressive and aesthetic possibilities; and its highlighting of the letter cluster that is responsible for the recognized phoneme enunciated by the user as it occurs allows the user to more easily associate the letters with the sounds.
    Type: Grant
    Filed: November 2, 2020
    Date of Patent: February 14, 2023
    Assignee: Tertl Studos, LLC
    Inventor: Christopher Hancock
  • Patent number: 11568859
    Abstract: A computer-implemented method and apparatus for extracting key information from conversational voice data, where the method comprises receiving a first speaker text corresponding to a speech of a first speaker in a conversation with a second speaker, the conversation comprising multiple turns of speech between the first speaker and the second speaker, the first speaker text comprising multiple question lines, each question line corresponding to the speech of the first speaker at a corresponding turn, arranged chronologically. Feature words are identified, and a frequency of occurrence therefor in each question line is determined. Question lines without any of the feature words are removed, to yield candidate question lines, for each of which a mathematical representation is generated. A similarity score for each candidate question line with respect to each subsequent candidate question line is computed, and the line with the highest score is identified as a key question.
    Type: Grant
    Filed: September 25, 2020
    Date of Patent: January 31, 2023
    Assignee: UNIPHORE SOFTWARE SYSTEMS, INC.
    Inventor: Somnath Roy
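    The pipeline in this abstract, filtering question lines by feature words and then scoring each candidate against the candidates that follow it, can be sketched directly. The bag-of-words representation and Jaccard similarity below stand in for the unspecified "mathematical representation" and similarity score.

    ```python
    # Sketch: drop question lines without feature words, score each
    # remaining line against later candidates, and report the highest
    # scorer as the key question.
    def jaccard(a, b):
        a, b = set(a.lower().split()), set(b.lower().split())
        return len(a & b) / len(a | b) if a | b else 0.0

    def key_question(question_lines, feature_words):
        # Keep only candidate lines containing at least one feature word.
        candidates = [q for q in question_lines
                      if any(w in q.lower() for w in feature_words)]
        if not candidates:
            return None
        best, best_score = candidates[0], -1.0
        for i, line in enumerate(candidates):
            score = sum(jaccard(line, later) for later in candidates[i + 1:])
            if score > best_score:
                best, best_score = line, score
        return best
    ```

    Intuitively, a question the agent keeps re-asking in later turns accumulates similarity with its repetitions, so it rises to the top.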
  • Patent number: 11557282
    Abstract: A method for detecting a hotword includes receiving a sequence of input frames that characterize streaming audio captured by a user device and generating a probability score indicating a presence of a hotword in the streaming audio using a memorized neural network. The network includes sequentially-stacked single value decomposition filter (SVDF) layers and each SVDF layer includes at least one neuron. Each neuron includes a respective memory component, a first stage configured to perform filtering on audio features of each input frame individually and output to the memory component, and a second stage configured to perform filtering on all the filtered audio features residing in the respective memory component. The method also includes determining whether the probability score satisfies a hotword detection threshold and initiating a wake-up process on the user device for processing additional terms.
    Type: Grant
    Filed: January 21, 2021
    Date of Patent: January 17, 2023
    Assignee: Google LLC
    Inventors: Raziel Alvarez Guevara, Hyun Jin Park
  • Patent number: 11556307
    Abstract: Example techniques relate to local voice control in a media playback system. A satellite device (e.g., a playback device or microcontroller unit) may be configured to recognize a local set of keywords in voice inputs including context specific keywords (e.g., for controlling an associated smart device) as well as keywords corresponding to a subset of media playback commands for controlling playback devices in the media playback system. The satellite device may fall back to a hub device (e.g., a playback device) configured to recognize a more extensive set of keywords. In some examples, either device may fall back to the cloud for processing of other voice inputs.
    Type: Grant
    Filed: January 31, 2021
    Date of Patent: January 17, 2023
    Assignee: Sonos, Inc.
    Inventors: Sebastien Maury, Joseph Dureau, Thibaut Lorrain, Do Kyun Kim
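    The tiered recognition described above, satellite first, then hub, then cloud, amounts to routing a recognized phrase to the smallest tier whose keyword set covers it. The keyword sets below are invented for the example.

    ```python
    # Hedged sketch: a satellite device handles its small local keyword
    # set, falls back to a hub with a larger set, and otherwise defers
    # the voice input to the cloud.
    SATELLITE_KEYWORDS = {"play", "pause", "lights on", "lights off"}
    HUB_KEYWORDS = SATELLITE_KEYWORDS | {"next track", "group kitchen", "volume up"}

    def route_voice_input(phrase):
        """Return which tier should process the recognized phrase."""
        if phrase in SATELLITE_KEYWORDS:
            return "satellite"
        if phrase in HUB_KEYWORDS:
            return "hub"
        return "cloud"
    ```

    Keeping the common commands on the satellite avoids a network round trip for the inputs users issue most often.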
  • Patent number: 11551676
    Abstract: Techniques are described for using data stored for a user in association with context levels to improve the efficiency and accuracy of dialog processing tasks. A dialog system stores historical dialog data in association with a plurality of configured context levels. The dialog system receives an utterance and identifies a term for disambiguation from the utterance. Based on a determined context level, the dialog system identifies relevant historical data stored to a database. The historical data may be used to perform tasks such as resolving an ambiguity based on user preferences, disambiguating named entities based on a prior dialog, and identifying previously generated answers to queries. Based on the context level, the dialog system can efficiently identify the relevant information and use the identified information to provide a response.
    Type: Grant
    Filed: August 26, 2020
    Date of Patent: January 10, 2023
    Assignee: Oracle International Corporation
    Inventor: Mark Edward Johnson
  • Patent number: 11545144
    Abstract: A method, an electronic device, and computer readable medium is provided. The method includes identifying a frequency of each word that is present within a set of words. The method also includes deriving relatedness values for pairs of words. Each pair of words includes a first word and a second word in the set of words. Each relatedness value corresponds to a respective one of the pairs of words. Each relatedness value is based on the identified frequencies that the first word and the second word of the respective pair of words are present within the set of words. The method further includes generating a matrix representing the relatedness values. The method additionally includes generating a language model that represents relationships between the set of words included in the matrix.
    Type: Grant
    Filed: January 29, 2019
    Date of Patent: January 3, 2023
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Anil Yadav, Mohammad Moazzami, Allan Jay Schwade
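    The steps in this abstract, word frequencies, pairwise relatedness values derived from those frequencies, and a matrix collecting them, can be sketched with a normalized co-occurrence score. The specific score below is an assumption; the abstract does not fix one.

    ```python
    # Sketch: count word and pair frequencies over a set of word
    # sequences, derive a relatedness value per pair, and collect the
    # values into a symmetric matrix.
    from collections import Counter
    from itertools import combinations
    import math

    def relatedness_matrix(sentences):
        words = [w for s in sentences for w in s.split()]
        freq = Counter(words)
        pair_freq = Counter()
        for s in sentences:
            for a, b in combinations(sorted(set(s.split())), 2):
                pair_freq[(a, b)] += 1
        vocab = sorted(freq)
        index = {w: i for i, w in enumerate(vocab)}
        n = len(vocab)
        matrix = [[0.0] * n for _ in range(n)]
        for (a, b), c in pair_freq.items():
            score = c / math.sqrt(freq[a] * freq[b])   # normalized co-occurrence
            matrix[index[a]][index[b]] = matrix[index[b]][index[a]] = score
        return vocab, matrix
    ```

    Normalizing by the individual frequencies keeps very common words from dominating every pair they appear in.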
  • Patent number: 11545147
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classification using neural networks. One method includes receiving audio data corresponding to an utterance; obtaining a transcription of the utterance; generating a representation of the audio data; generating a representation of the transcription of the utterance; and providing (i) the representation of the audio data and (ii) the representation of the transcription of the utterance to a classifier that, based on a given representation of the audio data and a given representation of the transcription of the utterance, is trained to output an indication of whether the utterance associated with the given representation is likely directed to an automated assistant or is likely not directed to an automated assistant.
    Type: Grant
    Filed: May 2, 2019
    Date of Patent: January 3, 2023
    Assignee: Google LLC
    Inventors: Nathan David Howard, Gabor Simko, Maria Carolina Parada San Martin, Ramkarthik Kalyanasundaram, Guru Prakash Arumugam, Srinivas Vasudevan
  • Patent number: 11538461
    Abstract: Some implementations include methods for detecting missing subtitles associated with a media presentation and may include receiving an audio component and a subtitle component associated with a media presentation, the audio component including an audio sequence, the audio sequence divided into a plurality of audio segments; evaluating the plurality of audio segments using a combination of a recurrent neural network and a convolutional neural network to identify refined speech segments associated with the audio sequence, the recurrent neural network trained based on a plurality of languages, the convolutional neural network trained based on a plurality of categories of sound; determining timestamps associated with the identified refined speech segments; and determining missing subtitles based on the timestamps associated with the identified refined speech segments and timestamps associated with subtitles included in the subtitle component.
    Type: Grant
    Filed: March 18, 2021
    Date of Patent: December 27, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Honey Gupta, Mayank Sharma
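    The final comparison step, checking the timestamps of refined speech segments against the timestamps covered by subtitles, reduces to interval-overlap bookkeeping. The (start, end) seconds format is an assumption for the example; the neural-network segment refinement upstream is out of scope here.

    ```python
    # Sketch: flag refined speech segments that no subtitle time range
    # overlaps; those indicate missing subtitles.
    def overlaps(a, b):
        return a[0] < b[1] and b[0] < a[1]

    def missing_subtitles(speech_segments, subtitle_ranges):
        """Return speech segments that have no overlapping subtitle."""
        return [seg for seg in speech_segments
                if not any(overlaps(seg, sub) for sub in subtitle_ranges)]

    speech = [(0.0, 2.5), (10.0, 12.0), (30.0, 33.0)]
    subs = [(0.2, 2.4), (29.5, 34.0)]
    ```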
  • Patent number: 11538463
    Abstract: Methods and systems are provided for generating a customized speech recognition neural network system comprised of an adapted automatic speech recognition neural network and an adapted language model neural network. The automatic speech recognition neural network is first trained in a generic domain and then adapted to a target domain. The language model neural network is first trained in a generic domain and then adapted to a target domain. Such a customized speech recognition neural network system can be used to understand input vocal commands.
    Type: Grant
    Filed: April 12, 2019
    Date of Patent: December 27, 2022
    Assignee: Adobe Inc.
    Inventors: Trung Huu Bui, Subhadeep Dey, Franck Dernoncourt
  • Patent number: 11508367
    Abstract: A dialogue system includes: a storage configured to store a parameter tree including at least one parameter used for performing an action; a speech input device configured to receive speech from a user; an input processor configured to apply a natural language understanding algorithm to the received speech to generate a speech recognition result; a dialogue manager configured to determine an action corresponding to the received speech based on the speech recognition result, to retrieve a parameter tree corresponding to the action from the storage, and to determine additional information needed to perform the action based on the retrieved parameter tree; and a result processor configured to generate a dialogue response for requesting the additional information.
    Type: Grant
    Filed: November 4, 2019
    Date of Patent: November 22, 2022
    Assignees: HYUNDAI MOTOR COMPANY, KIA MOTORS CORPORATION, SOGANG UNIVERSITY RESEARCH & BUSINESS DEVELOPMENT FOUNDATION
    Inventors: Youngmin Park, Seona Kim, Jeong-Eom Lee, Jung Yun Seo
  • Patent number: 11507581
    Abstract: The invention concerns a query response device comprising: an input adapted to receive user queries; a memory (106) adapted to store one or more routing rules; one or more live agent engines (116) configured to support interactions with one or more live agents; one or more virtual assistant engines (120) configured to support interactions with one or more virtual assistants instantiated by an artificial intelligence module (103); and a routing module (104) coupled to said live agent engines and to said virtual assistant engines, the routing module comprising a processing device configured: to select, based on content of at least a first user message from a first user relating to a first user query and on said one or more routing rules, a first of said live agent engines or a first of said virtual assistant engines; and to route one or more further user messages relating to the first user query to the selected engine.
    Type: Grant
    Filed: May 5, 2020
    Date of Patent: November 22, 2022
    Assignee: Accenture Global Services Limited
    Inventors: Anatoly Roytman, Alexandre Naressi
  • Patent number: 11508375
    Abstract: An electronic apparatus is provided. The electronic apparatus includes a microphone, a transceiver, a memory configured to store a control command identification tool based on a control command identified by a voice recognition server that performs voice recognition processing on a user voice received from the electronic apparatus, and at least one processor configured to, based on the user voice being received through the microphone, acquire user intention information by performing the voice recognition processing on the received user voice, receive status information of external devices related to the acquired user intention information from a device control server, identify a control command for controlling a device to be controlled among the plurality of external devices by applying the acquired user intention information and the received status information of the external devices to the control command identification tool, and transmit the identified control command to the device control server.
    Type: Grant
    Filed: June 17, 2020
    Date of Patent: November 22, 2022
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Woojei Choi, Minkyong Kim
  • Patent number: 11501546
    Abstract: In various embodiments, methods and systems for implementing a media management system, for video data processing and adaptation data generation, are provided. At a high level, a video data processing engine relies on different types of video data properties and additional auxiliary data resources to perform video optical character recognition operations for recognizing characters in video data. In operation, video data is accessed to identify recognized characters. A video OCR operation to perform on the video data for character recognition is determined from video character processing and video auxiliary data processing. Video auxiliary data processing includes processing an auxiliary reference object; the auxiliary reference object is an indirect reference object that is a derived input element used as a factor in determining the recognized characters. The video data is processed based on the video OCR operation and based on processing the video data, at least one recognized character is communicated.
    Type: Grant
    Filed: July 27, 2020
    Date of Patent: November 15, 2022
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Royi Ronen, Ika Bar-Menachem, Ohad Jassin, Avner Levi, Olivier Nano, Oron Nir, Mor Geva Pipek, Ori Ziv
  • Patent number: 11494554
    Abstract: A function execution instruction system includes a function execution instruction unit configured to instruct execution of one or more functions, a sentence input unit configured to input a sentence, an execution function determination unit configured to determine a function the execution of which is instructed on the basis of an input sentence, a time information extraction unit configured to extract time information indicating a time from the input sentence, and a time specification unit configured to, in accordance with a determined function, specify a time used for the execution of the function on the basis of extracted time information wherein the function execution instruction unit instructs the execution of the determined function, which uses a specified time.
    Type: Grant
    Filed: December 4, 2018
    Date of Patent: November 8, 2022
    Assignee: NTT DOCOMO, INC.
    Inventors: Hiroshi Fujimoto, Kousuke Kadono
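The pipeline above (determine the function from a sentence, extract time information, then specify the time used for execution) can be sketched with simple keyword matching and a regex. The function names and trigger words are hypothetical, not the patented set:

```python
import re

def parse_command(sentence):
    """Determine the requested function from a sentence and extract
    time information (hour, minute) for its execution."""
    # Illustrative keyword-to-function table.
    functions = {"alarm": "set_alarm", "remind": "set_reminder"}
    function = next(
        (f for key, f in functions.items() if key in sentence.lower()), None
    )
    # Extract a time of the form H:MM or HH:MM from the sentence.
    match = re.search(r"\b(\d{1,2}):(\d{2})\b", sentence)
    time_info = (int(match.group(1)), int(match.group(2))) if match else None
    return function, time_info

print(parse_command("Please set an alarm for 7:30 tomorrow"))
# → ('set_alarm', (7, 30))
```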
  • Patent number: 11494434
    Abstract: The system receives a voice query at an audio interface and converts the voice query to text. The system can determine pronunciation information during conversion and generate metadata that indicates a pronunciation of one or more words of the query, include phonetic information in the text query, or both. A query includes one or more entities, which may be more accurately identified based on pronunciation. The system searches for information, content, or both among one or more databases based on the generated text query, pronunciation information, user profile information, search histories or trends, and optionally other information. The system identifies one or more entities or content items that match the text query, and retrieves the identified information to provide to the user.
    Type: Grant
    Filed: July 31, 2019
    Date of Patent: November 8, 2022
    Assignee: ROVI GUIDES, INC.
    Inventors: Ankur Aher, Indranil Coomar Doss, Aashish Goyal, Aman Puniyani, Kandala Reddy, Mithun Umesh
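One common way to attach pronunciation metadata to transcribed query terms, as the abstract describes, is a phonetic code such as classic Soundex, so that entities matching the spoken form can be found despite transcription variants. This is a stand-in sketch; the patent does not specify Soundex:

```python
def soundex(word):
    """Classic Soundex code, used here as illustrative pronunciation
    metadata for a transcribed query term."""
    codes = {**dict.fromkeys("bfpv", "1"), **dict.fromkeys("cgjkqsxz", "2"),
             **dict.fromkeys("dt", "3"), "l": "4",
             **dict.fromkeys("mn", "5"), "r": "6"}
    word = word.lower()
    result = word[0].upper()
    last = codes.get(word[0], "")
    for ch in word[1:]:
        code = codes.get(ch, "")
        if code and code != last:
            result += code
        if ch not in "hw":  # h and w do not reset the previous code
            last = code
    return (result + "000")[:4]

# "Stephen" and a misheard "Steven" share a pronunciation code, so an
# entity search can match them despite the transcription difference.
print(soundex("Stephen"), soundex("Steven"))
# → S315 S315
```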
  • Patent number: 11488590
    Abstract: According to some embodiments of the disclosure, a method is disclosed. The method includes receiving, by a processing device of an in-ear device, an audio signal from one or more microphones of the in-ear device. The method further includes extracting, by the processing device, one or more features of the audio signal and generating, by the processing device, an in-ear data object based on the one or more features. The method also includes publishing, by the processing device, the in-ear data object to an external system via a network.
    Type: Grant
    Filed: May 9, 2019
    Date of Patent: November 1, 2022
    Assignee: Staton Techiya LLC
    Inventors: Charles Cella, John Keady
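The feature-extraction-and-publish flow above can be sketched as follows. The feature choices (RMS energy, zero-crossing rate) and the JSON field names are assumptions for illustration, not taken from the patent:

```python
import json
import math

def make_inear_object(samples, device_id="earpiece-01"):
    """Extract simple features from one audio frame captured by the
    in-ear microphones and wrap them in a publishable data object."""
    # Root-mean-square energy of the frame.
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    # Zero-crossing rate: fraction of adjacent sample pairs changing sign.
    zcr = sum(1 for a, b in zip(samples, samples[1:]) if a * b < 0) / (len(samples) - 1)
    return json.dumps({
        "device": device_id,
        "features": {"rms": round(rms, 4), "zcr": round(zcr, 4)},
    })

# The serialized object would then be published to an external system.
print(make_inear_object([0.1, -0.1, 0.2, -0.2]))
```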
  • Patent number: 11487502
    Abstract: A portable terminal device in an information processing system and method includes a camera and a microphone. Data of obtained images and voice are transmitted to a server that identifies operations to be executed based on the received voice and image data. The server transmits an identification of one or more results of the plurality of operations to the portable terminal device. When the portable terminal device receives only one result from the server, an operation corresponding to the one result is executed, and when a plurality of results is received, the portable terminal device displays information corresponding to the plurality of results as candidates. Additional voice is captured for selecting one of the plurality of results during the displaying of the information. A determination of one result from the plurality of results is made based on the captured voice, and an operation corresponding to the determined result is executed.
    Type: Grant
    Filed: April 29, 2019
    Date of Patent: November 1, 2022
    Assignee: Maxell, Ltd.
    Inventors: Motoyuki Suzuki, Hideo Nishijima
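The control flow the abstract describes (execute directly on one result, otherwise display candidates and let a follow-up voice input pick one) reduces to a short dispatch routine. Here `ask_user` is a hypothetical callback standing in for capturing and interpreting the additional voice input:

```python
def resolve_operation(results, ask_user):
    """With a single server result, execute it directly; with several,
    present them as candidates and use a follow-up selection
    (e.g. interpreted from additional captured voice) to choose one."""
    if len(results) == 1:
        return results[0]
    choice = ask_user(results)  # e.g. the utterance "the second one" -> 1
    return results[choice]

# Simulated follow-up utterance selecting the second candidate.
picked = resolve_operation(["call Alice", "call Alicia"], lambda opts: 1)
print(picked)
# → call Alicia
```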
  • Patent number: 11481087
    Abstract: An electronic device includes an input interface configured to receive an input signal, a command determination unit configured to determine a plurality of possible commands based on the input signal, and an output interface configured to provide a plurality of output information corresponding to effects associated with each of the plurality of determined possible commands.
    Type: Grant
    Filed: March 18, 2015
    Date of Patent: October 25, 2022
    Assignee: SONY CORPORATION
    Inventors: Fritz Hohl, Stefan Uhlich, Wilhelm Hagg, Thomas Kemp
  • Patent number: 11482215
    Abstract: A method comprising detecting an activation of an intelligent assistant on an electronic device, waking up the intelligent assistant from a sleep mode in response to the activation, and determining an amount of vocabulary the intelligent assistant acts upon during a listening mode based on a type of the activation.
    Type: Grant
    Filed: March 27, 2019
    Date of Patent: October 25, 2022
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Jeffrey C. Olson, Henry N. Holtzman, Jean-David Hsu, Jeffrey A. Morgan
  • Patent number: 11463779
    Abstract: A video stream processing method is provided. First audio stream data in live video stream data is obtained. Speech recognition is performed on the first audio stream data to generate speech recognition text. Caption data is generated according to the speech recognition text, the caption data including caption text and time information corresponding to the caption text. The caption text is added to a corresponding picture frame in the live video stream data according to the time information corresponding to the caption text to generate captioned live video stream data.
    Type: Grant
    Filed: July 7, 2020
    Date of Patent: October 4, 2022
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Xiaohua Hu, Ziheng Luo, Xiuming Zhu
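The captioning step above hinges on aligning recognized text to picture frames via the caption's time information. A minimal sketch of that alignment, assuming frames are keyed by timestamp (a simplification of real live-stream muxing):

```python
import bisect

def attach_captions(frame_times, captions):
    """Attach each caption to the latest frame at or before its start
    time. `captions` is a list of (start_seconds, text) pairs produced
    by speech recognition on the live audio stream."""
    out = {}
    for start, text in captions:
        i = bisect.bisect_right(frame_times, start) - 1
        if i >= 0:
            out.setdefault(frame_times[i], []).append(text)
    return out

frames = [0.0, 0.5, 1.0, 1.5]
subs = [(0.4, "Hello"), (1.2, "welcome to the stream")]
print(attach_captions(frames, subs))
# → {0.0: ['Hello'], 1.0: ['welcome to the stream']}
```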
  • Patent number: 11462217
    Abstract: An electronic apparatus is disclosed.
    Type: Grant
    Filed: April 21, 2020
    Date of Patent: October 4, 2022
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Kihyun Song, Jongjin Park, Shina Kim, Sukhoon Yoon, Wonjae Lee, Jongkeun Lee
  • Patent number: 11461649
    Abstract: In implementations of searching for music, a music search system can receive a music search request that includes a music file including music content. The music search system can also receive a selected musical attribute from a plurality of musical attributes. The music search system includes a music search application that can generate musical features of the music content, where a respective one or more of the musical features correspond to a respective one of the musical attributes. The music search application can then compare the musical features that correspond to the selected musical attribute to audio features of audio files, and determine similar audio files to the music file based on the comparison of the musical features to the audio features of the audio files.
    Type: Grant
    Filed: March 19, 2020
    Date of Patent: October 4, 2022
    Assignee: Adobe Inc.
    Inventors: Jongpil Lee, Nicholas J. Bryan, Justin J. Salamon, Zeyu Jin
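The comparison step above, restricting similarity to the features of the selected musical attribute, can be sketched with cosine similarity over per-attribute feature vectors. Attribute names and the catalog layout here are illustrative assumptions:

```python
import math

def rank_by_attribute(query_feats, catalog, attribute):
    """Rank catalog audio files by cosine similarity to the query,
    using only the feature vector for the selected musical attribute
    (e.g. 'tempo' or 'timbre')."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb)

    q = query_feats[attribute]
    scored = [(name, cosine(q, feats[attribute]))
              for name, feats in catalog.items()]
    return sorted(scored, key=lambda s: s[1], reverse=True)

catalog = {"a.wav": {"tempo": [1.0, 0.0]}, "b.wav": {"tempo": [0.6, 0.8]}}
ranked = rank_by_attribute({"tempo": [0.9, 0.1]}, catalog, "tempo")
print(ranked[0][0])
# → a.wav
```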
  • Patent number: 11462233
    Abstract: An electronic device and method of recognizing an audio scene are provided.
    Type: Grant
    Filed: November 15, 2019
    Date of Patent: October 4, 2022
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Hoon Heo, Sunmin Kim, Kiwoong Kang, Kibeom Kim, Inwoo Hwang
  • Patent number: 11462209
    Abstract: For the problem of waveform synthesis from spectrograms, presented herein are embodiments of an efficient neural network architecture, based on transposed convolutions, that achieves high compute intensity and fast inference. In one or more embodiments, the convolutional vocoder architecture is trained with losses related to perceptual audio quality, as well as a GAN framework in which a critic discerns unrealistic waveforms. While yielding high-quality audio, embodiments of the model can synthesize audio more than 500 times faster than real time. Multi-head convolutional neural network (MCNN) embodiments for waveform synthesis from spectrograms are also disclosed. MCNN embodiments enable significantly better utilization of modern multi-core processors than commonly used iterative algorithms like Griffin-Lim and yield very fast (more than 300× real-time) waveform synthesis.
    Type: Grant
    Filed: March 27, 2019
    Date of Patent: October 4, 2022
    Assignee: Baidu USA LLC
    Inventors: Sercan Arik, Hee Woo Jun, Eric Undersander, Gregory Diamos
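The transposed convolution the abstract builds on is an upsampling primitive: each input sample scatters a scaled copy of the kernel into the output at stride spacing, turning coarse spectrogram frames into many waveform samples. A toy 1-D version of the primitive (not the paper's network):

```python
def conv_transpose_1d(x, kernel, stride):
    """Minimal 1-D transposed convolution: each input value adds a
    scaled copy of the kernel into the output, offset by `stride` per
    input position. Overlapping kernel taps sum together."""
    out = [0.0] * ((len(x) - 1) * stride + len(kernel))
    for i, v in enumerate(x):
        for j, k in enumerate(kernel):
            out[i * stride + j] += v * k
    return out

# Two "frames" upsampled 2x with an overlapping 3-tap kernel.
print(conv_transpose_1d([1.0, 2.0], [0.5, 1.0, 0.5], stride=2))
# → [0.5, 1.0, 1.5, 2.0, 1.0]
```

Stacking several such layers multiplies the strides, which is how a vocoder expands a spectrogram's frame rate up to the audio sample rate in a few steps.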
  • Patent number: 11449672
    Abstract: Disclosed is an electronic device including a communication circuit, a microphone, a memory, a speaker, and a processor, in which the processor receives a speech input and transmits first data associated with the speech input to a first server supporting the speech recognition service; receives, from the first server, second data corresponding to processing of a part of the first data and outputs the second data at a first time, after a first period of time has elapsed since the transmission of the first data; and outputs third data, corresponding to processing of the rest of the first data, at a second time after a second period of time has elapsed from the first time, the third data having been received from the first server or from a second server supporting the speech recognition service before the second time.
    Type: Grant
    Filed: July 18, 2018
    Date of Patent: September 20, 2022
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Yong Wook Kim, Dong Kyu Lee, Ja Min Goo, Gang Heok Kim
  • Patent number: 11450317
    Abstract: The disclosure relates to a smart furniture controller with voice recognition, including a controller body and a control circuit. The control circuit is disposed in the controller body and comprises a main control unit, a voice recognition module, an operation panel, a control output interface and a power module. The voice recognition module is connected to the main control unit; a microphone and a loudspeaker are connected to the voice recognition module; the operation panel and the control output interface are connected to the main control unit. The disclosed embodiments integrate a voice recognition module in the controller: voice commands recognized through the module allow users to control the smart furniture by voice, making operation simpler and more convenient.
    Type: Grant
    Filed: September 27, 2019
    Date of Patent: September 20, 2022
    Assignee: eMoMo Technology Co., Ltd.
    Inventors: Wenji Tang, Jingzhi Chen, Zhigang Wang, Wei Zhou, Ming Kong, Lin Chen, Shunde Yang, Jiandong Li, Qishuang Lu, Zaigui Yang
  • Patent number: 11450323
    Abstract: Speech is digitized and analyzed by a speech-recognition platform to produce raw text sentences. In various embodiments, the recognized words of each sentence are tokenized based on a grammar, which may be selected by a Recognition Context Controller (RCC) using a context database. A Medical Context Semantic Library (MCSL) contains all medically relevant terms recognized by the system and, once the grammar is selected, the MCSL is used to select a semantic template (consisting of one or more hierarchically organized data structures whose root is a “Concept”). Recognized words are mapped to tokens based on the operative grammar to fill the Concept tree(s). The grammar and the Concept trees can potentially shift after each sentence based on the RCC's analysis. The trees accumulate and are filled as sentences are analyzed. Once all of the sentences have been analyzed, the trees have been filled to the extent possible. Concepts may be organized into higher-level Observations.
    Type: Grant
    Filed: March 31, 2020
    Date of Patent: September 20, 2022
    Inventors: Kaushal Shastri, Gerard Muro
  • Patent number: 11443554
    Abstract: One or more computing devices, systems, and/or methods are provided. One or more videos associated with a user may be analyzed to determine a first set of features of the user associated with a first emotion of the user and/or a second set of features of the user associated with a second emotion of the user. A first user emotion profile associated with the user may be generated based upon the first set of features and/or the second set of features. A second video may be presented via a graphical user interface of a first client device. The user may be identified within the second video. It may be determined, based upon the second video and/or the first user emotion profile, that the user is associated with the first emotion. A representation of the first emotion may be displayed via the graphical user interface of the first client device.
    Type: Grant
    Filed: August 6, 2019
    Date of Patent: September 13, 2022
    Assignee: Verizon Patent and Licensing Inc.
    Inventors: Ariel Raviv, Joel Oren, Irena Grabovitch-Zuyev
  • Patent number: 11437035
    Abstract: An agent device is equipped with a plurality of agent controllers that provide a service including causing an output device to output a voice response in accordance with an utterance of an occupant of a vehicle, in which a first agent controller included in the plurality of agent controllers provides an agent controller different from the first agent controller with first service information on the service to be provided to the occupant.
    Type: Grant
    Filed: March 10, 2020
    Date of Patent: September 6, 2022
    Assignee: HONDA MOTOR CO., LTD.
    Inventors: Masaki Kurihara, Hiroshi Honda
  • Patent number: 11430465
    Abstract: A method of recorded message detection is provided. An audio stream is received and a set of landmark features is identified in a section of the audio stream. From these landmark features an audio fingerprint for the section of the audio stream is derived. This audio fingerprint is compared with at least one of a plurality of stored audio fingerprints, each derived from a respective audio stream. The received audio stream is determined to be a recorded message if the derived audio fingerprint is substantially equivalent to one of the plurality of stored audio fingerprints representing a recorded message.
    Type: Grant
    Filed: June 4, 2019
    Date of Patent: August 30, 2022
    Assignee: Magus Communications Limited
    Inventor: Michael Thompson
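Landmark-based audio fingerprinting of the kind the abstract describes is commonly built by hashing pairs of spectral peaks, so that the hash depends only on frequency pairs and their time offset and is therefore shift-invariant. A simplified sketch under those assumptions (the patent does not specify this exact scheme):

```python
def landmark_fingerprint(peaks, fan_out=3):
    """Hash pairs of spectral peaks (time, frequency) into landmark
    fingerprints: (anchor freq, target freq, time delta). Using the
    time delta rather than absolute time makes the print shift-invariant."""
    hashes = set()
    for i, (t1, f1) in enumerate(peaks):
        for t2, f2 in peaks[i + 1:i + 1 + fan_out]:
            hashes.add((f1, f2, t2 - t1))
    return hashes

def is_recorded(query_peaks, stored_prints, threshold=0.8):
    """Flag the stream as a recorded message if its fingerprint is
    substantially equivalent to a stored fingerprint."""
    fp = landmark_fingerprint(query_peaks)
    return any(len(fp & s) / len(s) >= threshold for s in stored_prints)

stored = [landmark_fingerprint([(0, 100), (1, 150), (2, 120)])]
# Same peaks shifted 5 time units later still match exactly.
print(is_recorded([(5, 100), (6, 150), (7, 120)], stored))
# → True
```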
  • Patent number: 11431517
    Abstract: Methods and systems for team cooperation with real-time recording of one or more moment-associating elements. For example, a method includes: delivering, in response to an instruction, an invitation to each member of one or more members associated with a workspace; granting, in response to acceptance of the invitation by one or more subscribers of the one or more members, subscription permission to the one or more subscribers; receiving the one or more moment-associating elements; transforming the one or more moment-associating elements into one or more pieces of moment-associating information; and transmitting at least one piece of the one or more pieces of moment-associating information to the one or more subscribers.
    Type: Grant
    Filed: February 3, 2020
    Date of Patent: August 30, 2022
    Assignee: Otter.ai, Inc.
    Inventors: Simon Lau, Yun Fu, James Mason Altreuter, Brian Francis Williams, Xiaoke Huang, Tao Xing, Wen Sun, Tao Lu, Kaisuke Nakajima, Kean Kheong Chin, Hitesh Anand Gupta, Julius Cheng, Jing Pan, Sam Song Liang