Patents Examined by Abul K. Azad

Systems and methods for avoiding inadvertently triggering a voice assistant

Patent number: 11527245

Abstract: Systems and methods are provided herein for avoiding inadvertently triggering a voice assistant with audio played through a speaker. An audio signal is captured by sampling a microphone of the voice assistant at a sampling frequency that is higher than an expected finite sampling frequency of previously recorded audio played through the speaker to generate a voice data sample. A quality metric of the generated voice data sample is calculated by determining whether the generated voice data sample comprises artifacts resulting from previous compression or approximation by the expected finite sampling frequency. Based on the calculated quality metric, it is determined whether the captured audio signal is previously recorded audio played through the speaker. Responsive to the determination that the captured audio signal is previously recorded audio played through the speaker, the voice assistant refrains from being activated.

Type: Grant

Filed: April 29, 2020

Date of Patent: December 13, 2022

Assignee: Rovi Guides, Inc.

Inventors: Ankur Anil Aher, Jeffry Copps Robert Jose
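
One way to picture the quality metric in the abstract above is as a band-limiting check: audio recorded at a lower sampling frequency cannot carry energy above its own Nyquist frequency, so a capture at a higher rate that shows almost no energy in that upper band is probably playback. The sketch below is an illustration under that assumption only. The function names, the `threshold` value, and the use of a plain DFT are hypothetical and are not taken from the patent.

```python
import cmath
import math


def high_band_ratio(samples, capture_rate, expected_rate):
    """Fraction of spectral energy above the expected recording's Nyquist frequency."""
    n = len(samples)
    # Highest DFT bin that audio sampled at expected_rate could occupy.
    nyquist_bin = int((expected_rate / 2) / capture_rate * n)
    total = 0.0
    high = 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power = abs(coeff) ** 2
        total += power
        if k > nyquist_bin:
            high += power
    return high / total if total > 0 else 0.0


def should_activate(samples, capture_rate, expected_rate, threshold=0.01):
    # Hypothetical rule: too little high-band energy suggests previously
    # recorded audio, so the assistant stays inactive.
    return high_band_ratio(samples, capture_rate, expected_rate) >= threshold


n = 64
# Toy signals: "playback" has only a low-frequency tone, "live" adds a tone
# above the expected Nyquist frequency.
playback = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
live = [p + 0.5 * math.sin(2 * math.pi * 24 * t / n) for t, p in enumerate(playback)]
```

With a toy capture rate of 64 and an expected recording rate of 32, `should_activate(playback, 64, 32)` rejects the band-limited signal and `should_activate(live, 64, 32)` accepts the one with upper-band content.
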
Voice command system and voice command method

Patent number: 11521609

Abstract: A voice command system according to a first disclosure comprises a gateway apparatus having an interface configured to receive a voice command, and a controller configured to perform a registration process of registering a speaker permitted to receive the voice command. The controller is configured to perform an authentication process of rejecting a reception of the voice command when a speaker of the voice command is not registered, and permitting a reception of the voice command when a speaker of the voice command is registered. The controller is configured to perform the authentication process for each voice command.

Type: Grant

Filed: September 26, 2018

Date of Patent: December 6, 2022

Assignee: KYOCERA CORPORATION

Inventor: Yumiko Yamamoto
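
The register-then-authenticate flow in the abstract above can be sketched as follows. This is a minimal illustration, assuming the speaker has already been identified from the audio. The `Gateway` class and the `speaker_id` strings are hypothetical names, not part of the patent.

```python
class Gateway:
    """Hypothetical gateway that accepts voice commands only from registered speakers."""

    def __init__(self):
        self._registered = set()

    def register(self, speaker_id):
        # Registration process: mark this speaker as permitted.
        self._registered.add(speaker_id)

    def receive(self, speaker_id, command):
        # Authentication process: runs for each voice command, not once per session.
        if speaker_id not in self._registered:
            return None  # reception rejected
        return command  # reception permitted


gateway = Gateway()
gateway.register("alice")
accepted = gateway.receive("alice", "turn on the lights")
rejected = gateway.receive("bob", "unlock the door")
```

Here `accepted` holds the command text and `rejected` is `None`, because "bob" was never registered.
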
Apparatus and method for selecting one of a first encoding algorithm and a second encoding algorithm

Patent number: 11521631

Abstract: An apparatus for selecting one of a first encoding algorithm having a first characteristic and a second encoding algorithm having a second characteristic for encoding a portion of an audio signal to obtain an encoded version of the portion of the audio signal has a first estimator for estimating a first quality measure for the portion of the audio signal, which is associated with the first encoding algorithm, without actually encoding and decoding the portion of the audio signal using the first encoding algorithm. A second estimator is provided for estimating a second quality measure for the portion of the audio signal, which is associated with the second encoding algorithm, without actually encoding and decoding the portion of the audio signal using the second encoding algorithm. The apparatus has a controller for selecting the first or second encoding algorithms based on a comparison between the first and second quality measures.

Type: Grant

Filed: March 31, 2020

Date of Patent: December 6, 2022

Assignee: Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.

Inventors: Emmanuel Ravelli, Stefan Doehla, Guillaume Fuchs, Eleni Fotopoulou, Christian Helmrich
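
The selection logic in the abstract above reduces to comparing two estimated quality measures, neither of which requires running its encoder. The sketch below shows only that control flow. The two estimator functions are toy placeholders invented for this example, and "higher score is better" is an assumption.

```python
def select_encoder(portion, estimate_first, estimate_second):
    """Pick an encoder label by comparing two estimated quality measures."""
    first_quality = estimate_first(portion)    # no encode/decode performed
    second_quality = estimate_second(portion)  # no encode/decode performed
    return "first" if first_quality >= second_quality else "second"


def toy_first_estimate(portion):
    # Placeholder: scores smooth signals higher (negative mean step size).
    steps = [abs(b - a) for a, b in zip(portion, portion[1:])]
    return -sum(steps) / len(steps)


def toy_second_estimate(portion):
    # Placeholder: a fixed score, independent of the signal.
    return -0.5


smooth_choice = select_encoder([0.0, 0.1, 0.2, 0.3], toy_first_estimate, toy_second_estimate)
noisy_choice = select_encoder([1.0, -1.0, 1.0, -1.0], toy_first_estimate, toy_second_estimate)
```

With these placeholders, the smooth portion selects the first algorithm and the rapidly alternating portion selects the second.
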
Activation of remote devices in a networked system

Patent number: 11514907

Abstract: The present disclosure is generally directed to the generation of voice-activated data flows in an interconnected network. The voice-activated data flows can include input audio signals that include a request and are detected at a client device. The client device can transmit the input audio signal to a data processing system, where the input audio signal can be parsed and passed to the data processing system of a service provider to fulfill the request in the input audio signal. The present solution is configured to conserve network resources by reducing the number of network transmissions needed to fulfill a request.

Type: Grant

Filed: April 28, 2020

Date of Patent: November 29, 2022

Assignee: GOOGLE LLC

Inventors: Gaurav Bhaya, Ulas Kirazci, Bradley Abrams, Adam Coimbra, Ilya Firman, Carey Radebaugh
Methods and systems for enabling human-robot interaction to resolve task ambiguity

Patent number: 11501777

Abstract: The disclosure herein relates to methods and systems for enabling human-robot interaction (HRI) to resolve task ambiguity. Conventional techniques that initiate continuous dialogue with the human to ask a suitable question based on the observed scene until resolving the ambiguity are limited. The present disclosure uses the concept of Talk-to-Resolve (TTR), which initiates a continuous dialogue with the user based on visual uncertainty analysis and by asking a suitable question that conveys the veracity of the problem to the user and seeks guidance until all the ambiguities are resolved. The suitable question is formulated based on the scene understanding and the argument spans present in the natural language instruction. The present disclosure asks questions in a natural way that not only ensures that the user can understand the type of confusion the robot is facing, but also ensures minimal and relevant questioning to resolve the ambiguities.

Type: Grant

Filed: January 29, 2021

Date of Patent: November 15, 2022

Assignee: Tata Consultancy Services Limited

Inventors: Chayan Sarkar, Pradip Pramanick, Snehasis Banerjee, Brojeshwar Bhowmick
System, server, and method for speech recognition of home appliance

Patent number: 11501770

Abstract: Provided is a system, server, and method for speech recognition capable of collectively setting a plurality of setting items for device control through an utterance of a single sentence provided in the form of natural language. The system includes: a home appliance configured to receive a speech command that is generated through an utterance of a single sentence for control of the home appliance; and a server configured to receive the speech command in the single sentence from the home appliance and interpret the speech command of the single sentence through multiple intent determination.

Type: Grant

Filed: August 29, 2018

Date of Patent: November 15, 2022

Assignee: Samsung Electronics Co., Ltd.

Inventors: Eun Jin Chun, Woo Cheol Shin, Nam Gook Cho, Young Soo Do, Min Hyung Lee, Pil Soo Lee
Methods and systems for facilitating accomplishing tasks based on a natural language conversation

Patent number: 11501776

Abstract: Disclosed herein is a system for facilitating accomplishing tasks based on a natural language conversation. Accordingly, the system may include a direct graph unit. Further, the direct graph unit may include a directed graph. Further, the directed graph models a non-linearity of the natural language conversation. Further, the directed graph may include a set of nodes connected by at least one edge. Further, the system may include a context-encoded language understanding unit that may include a learning unit and an inferring unit. Further, the learning unit may be configured for receiving a plurality of inputs. Further, the learning unit may be configured for generating a model based on the plurality of inputs. Further, the inferring unit may be configured for receiving a plurality of inputs. Further, the inferring unit may be configured for generating an output based on the plurality of inputs and the model.

Type: Grant

Filed: January 14, 2021

Date of Patent: November 15, 2022

Assignee: KOSMOS AI TECH INC

Inventor: An Wei
System for speech recognition text enhancement fusing multi-modal semantic invariance

Patent number: 11488586

Abstract: Disclosed is a system for speech recognition text enhancement fusing multi-modal semantic invariance. The system includes an acoustic feature extraction module, an acoustic down-sampling module, an encoder and a decoder fusing multi-modal semantic invariance; the acoustic feature extraction module is configured for frame-dividing processing of speech data, dividing the speech data into short-term audio frames with a fixed length, extracting fbank acoustic features from the short-term audio frames, and inputting the acoustic features into the acoustic down-sampling module for down-sampling to obtain an acoustic representation; inputting the speech data into an existing speech recognition module to obtain input text data, and inputting the input text data into the encoder to obtain an input text encoded representation; inputting the acoustic representation and the input text encoded representation into the decoder to fuse.

Type: Grant

Filed: July 19, 2022

Date of Patent: November 1, 2022

Assignee: INSTITUTE OF AUTOMATION, CHINESE ACADEMY OF SCIENCES

Inventors: Jianhua Tao, Shuai Zhang, Jiangyan Yi
Interactive control method and device for voice and video communications

Patent number: 11487503

Abstract: The present invention discloses an interactive control method executed during instant video communication between a user and one or more other users. The method comprises: monitoring video information collected by a camera during the instant video communication between the user and the one or more other users; performing recognition on the video information after acquiring the video information, to acquire user behavior data inputted by the user in a preset manner; determining whether the user behavior data comprises preset trigger information; when it is determined that the user behavior data comprises the preset trigger information, further determining whether the user behavior data comprises a preset gesture action; and when it is determined that the user behavior data comprises the preset gesture action, determining an operation instruction corresponding to the preset gesture action in a preset operation instruction set, and performing an event corresponding to the operation instruction.

Type: Grant

Filed: June 11, 2020

Date of Patent: November 1, 2022

Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.

Inventor: Feng Li
System for introducing scalability of an action-topic approach to deriving intents from utterances

Patent number: 11475885

Abstract: Methods for mapping intents to utterances using a three-tiered system are provided. Methods may include receiving a plurality of predetermined action-topic pairs and a plurality of predetermined intents. Methods may include mapping the plurality of predetermined action-topic pairs to the plurality of predetermined intents via a one-to-many mapping. Methods may include receiving a linguistic utterance at a first tier of the three-tiered system. Methods may include translating the linguistic utterance into a textual representation at the first tier of the three-tiered system. Methods may include mapping the textual representation to one or more action-topic pairs included in the plurality of action-topic pairs. The mapping may be executed at the second tier of the three-tiered system. Methods may include identifying one or more intents that correlate to the textual representation. The identifying may be executed at the third tier. The identifying may be based on the mapping between the action-topic pairs and the predetermined intents.

Type: Grant

Filed: May 26, 2020

Date of Patent: October 18, 2022

Assignee: Bank of America Corporation

Inventors: Isaac Persing, Emad Noorizadeh
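
The three tiers in the abstract above can be sketched as three small functions chained together. This is an illustration only: the action-topic pairs, the intent names, and the keyword matching in the second tier are invented for the example, and the first tier stands in for real speech-to-text with simple text normalization.

```python
# Hypothetical one-to-many mapping from action-topic pairs to intents.
ACTION_TOPIC_TO_INTENTS = {
    ("check", "balance"): ["view_account_balance", "view_card_balance"],
    ("transfer", "funds"): ["transfer_between_accounts"],
}


def tier1_to_text(utterance):
    # Tier 1: produce a textual representation (here, just normalization).
    return utterance.lower().strip()


def tier2_action_topics(text):
    # Tier 2: map the text to known action-topic pairs by keyword match.
    words = set(text.split())
    return [pair for pair in ACTION_TOPIC_TO_INTENTS
            if pair[0] in words and pair[1] in words]


def tier3_intents(pairs):
    # Tier 3: look up intents through the predetermined mapping, without duplicates.
    intents = []
    for pair in pairs:
        for intent in ACTION_TOPIC_TO_INTENTS[pair]:
            if intent not in intents:
                intents.append(intent)
    return intents


def derive_intents(utterance):
    return tier3_intents(tier2_action_topics(tier1_to_text(utterance)))


intents = derive_intents("  Please CHECK my balance ")
```

Because intents are reached only through action-topic pairs, adding a new intent in this sketch means extending the mapping table rather than changing the matching code.
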
Generating automated assistant responses and/or actions directly from dialog history and resources

Patent number: 11475890

Abstract: Training and/or utilizing a single neural network model to generate, at each of a plurality of assistant turns of a dialog session between a user and an automated assistant, a corresponding automated assistant natural language response and/or a corresponding automated assistant action. For example, at a given assistant turn of a dialog session, both a corresponding natural language response and a corresponding action can be generated jointly and based directly on output generated using the single neural network model. The corresponding response and/or corresponding action can be generated based on processing, using the neural network model, dialog history and a plurality of discrete resources. For example, the neural network model can be used to generate a response and/or action on a token-by-token basis.

Type: Grant

Filed: June 24, 2020

Date of Patent: October 18, 2022

Assignee: GOOGLE LLC

Inventors: Arvind Neelakantan, Daniel Duckworth, Ben Goodrich, Vishaal Prasad, Chinnadhurai Sankar, Semih Yavuz
Robot capable of conversation with another robot and method of controlling the same

Patent number: 11465290

Abstract: A robot capable of conversation with another robot and a method of controlling the same are disclosed. The robot includes a main body having a first region corresponding to a human face and rotatable in left-right directions, a signal generator generating a first data signal to be transmitted to a listener robot and a first robot voice signal corresponding to the first data signal, a communication unit transmitting the first data signal to an external server, a speaker outputting the first robot voice signal, and a controller controlling a rotation direction of the main body such that the first region is directed toward the listener robot at a time point adjacent to a transmission time of the first data signal and controlling the speaker to output the first robot voice signal after the rotation direction of the robot is controlled, wherein the listener robot receives the first data signal transmitted from the external server and is controlled to operate based on the first data signal.

Type: Grant

Filed: August 29, 2019

Date of Patent: October 11, 2022

Assignee: LG ELECTRONICS INC.

Inventors: Ji Yoon Park, Jungkwan Son
Devices, systems, and methods for selectively providing contextual language translation

Patent number: 11461560

Abstract: A device includes a memory adapted to store a list in a file or database comprising a plurality of vocabulary words in a first language and, for each vocabulary word, a corresponding word in a second language, a display device, and a processor. The processor is adapted to receive a plurality of words in the first language, select one or more words among the plurality of words, based on one or more predetermined criteria, translate, match or equate the one or more selected words from the first language to words of the second language, and cause the display device to display the plurality of words, wherein one or more first words that are in the plurality of words and are not among the one or more selected words are displayed in the first language and one or more second words that are in the plurality of words and are among the one or more selected words are displayed in the second language.

Type: Grant

Filed: September 14, 2018

Date of Patent: October 4, 2022

Inventor: Robert F. Deming, Jr.
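
The selective display described in the abstract above amounts to swapping only the selected vocabulary words and leaving the rest in the first language. The sketch below assumes a tiny English-to-Spanish vocabulary and a caller-supplied selection criterion. Both are hypothetical and chosen only for illustration.

```python
# Hypothetical stored vocabulary list: first-language word -> second-language word.
VOCAB = {"dog": "perro", "house": "casa", "water": "agua"}


def render(words, should_translate):
    """Return the text with selected vocabulary words shown in the second language."""
    shown = []
    for word in words:
        if word in VOCAB and should_translate(word):
            shown.append(VOCAB[word])  # selected word: second language
        else:
            shown.append(word)         # everything else: first language
    return " ".join(shown)


# Example criterion: translate only words the learner is currently studying.
study_set = {"dog"}
mixed = render("the dog drinks water".split(), lambda word: word in study_set)
```

In this run only "dog" is replaced, so `mixed` reads "the perro drinks water" even though "water" is also in the vocabulary.
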
Unified speech/audio codec (USAC) processing windows sequence based mode switching

Patent number: 11430458

Abstract: A Unified Speech and Audio Codec (USAC) that may process a window sequence based on mode switching is provided. The USAC may perform encoding or decoding by overlapping between frames based on a folding point when mode switching occurs. The USAC may process different window sequences for each situation to perform encoding or decoding, and thereby may improve a coding efficiency.

Type: Grant

Filed: March 31, 2020

Date of Patent: August 30, 2022

Assignees: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE, KAWNGWOON UNIVERSITY INDUSTRY-ACADEMIC COLLABORATION FOUNDATION

Inventors: Seungkwon Beack, Tae Jin Lee, Min Je Kim, Kyeongok Kang, Dae Young Jang, Jeongil Seo, Jin Woo Hong, Chieteuk Ahn, Ho Chong Park, Young-cheol Park
Method and device for voice activity detection

Patent number: 11417354

Abstract: In accordance with an example embodiment of the present invention, disclosed is a method and an apparatus for voice activity detection (VAD). The VAD comprises creating a signal indicative of a primary VAD decision and determining hangover addition. The determination on hangover addition depends on a short-term activity measure and/or a long-term activity measure. A signal indicative of a final VAD decision is then created.

Type: Grant

Filed: February 18, 2020

Date of Patent: August 16, 2022

Assignee: TELEFONAKTIEBOLAGET LM ERICSSON (PUBL)

Inventor: Martin Sehlstedt
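
The relationship between the primary decision, the hangover addition, and the final decision in the abstract above can be sketched per frame. This is a simplified illustration that uses only a short-term activity measure. The `window`, `threshold`, and `hangover_frames` parameters and their values are hypothetical.

```python
def final_vad(primary, window=4, threshold=0.75, hangover_frames=2):
    """Extend primary VAD decisions with hangover when recent activity is high."""
    final = []
    remaining = 0  # hangover frames still available
    for i, active in enumerate(primary):
        if active:
            # Short-term activity: share of active frames in the recent window.
            recent = primary[max(0, i - window + 1): i + 1]
            short_term_activity = sum(recent) / window
            # Grant hangover only after sustained activity.
            remaining = hangover_frames if short_term_activity >= threshold else 0
            final.append(True)
        elif remaining > 0:
            remaining -= 1
            final.append(True)  # hangover keeps the final decision active
        else:
            final.append(False)
    return final


sustained = final_vad([True, True, True, True, False, False, False, False])
isolated = final_vad([False, True, False, False])
```

A sustained burst is extended by two frames after the primary decision drops, while a single isolated active frame receives no hangover.
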
Method and device for controlling terminal, and computer readable storage medium

Patent number: 11417331

Abstract: The present disclosure provides a method for controlling a terminal, including the following operations: obtaining recognition results corresponding to control signals after receiving the control signals, and determining whether control instructions corresponding to the recognition results conflict, each control signal comprising at least one of a voice signal or a gesture signal; determining a credibility of each control instruction in response to a determination that there exists conflict among control instructions; and sending the control instruction with the highest credibility to a control terminal. The present disclosure further provides a device for controlling a terminal and a computer readable storage medium. When control instructions are received and there exists conflict among control instructions, the control instruction with the highest credibility is sent to the control terminal after the credibility of each control instruction is determined, thereby avoiding conflicting settings.

Type: Grant

Filed: March 6, 2020

Date of Patent: August 16, 2022

Assignees: GD MIDEA AIR-CONDITIONING EQUIPMENT CO., LTD., MIDEA GROUP CO., LTD.

Inventors: Zhicai Ou, Weiying Li
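
The conflict handling in the abstract above can be sketched as a check followed by a selection. The abstract does not say how credibility is computed, so the sketch treats it as a given score. The `Instruction` fields and the definition of a conflict (same setting, different values) are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Instruction:
    setting: str        # e.g. "temperature"
    value: int
    source: str         # "voice" or "gesture"
    credibility: float  # assumed to be supplied by the recognizer


def has_conflict(instructions):
    # Assumed definition: two instructions set the same setting to different values.
    seen = {}
    for ins in instructions:
        if ins.setting in seen and seen[ins.setting] != ins.value:
            return True
        seen.setdefault(ins.setting, ins.value)
    return False


def resolve(instructions):
    """Return the instructions to send to the control terminal."""
    if not has_conflict(instructions):
        return list(instructions)
    # On conflict, keep only the instruction with the highest credibility.
    return [max(instructions, key=lambda ins: ins.credibility)]


voice = Instruction("temperature", 24, "voice", 0.9)
gesture = Instruction("temperature", 26, "gesture", 0.6)
fan = Instruction("fan_speed", 2, "gesture", 0.7)

to_send = resolve([voice, gesture])
no_conflict = resolve([voice, fan])
```

Here the voice and gesture instructions disagree on temperature, so only the voice instruction is sent. The second call has no conflict, and both instructions pass through.
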
Network source identification via audio signals

Patent number: 11410651

Abstract: Network source identification via audio signals is provided. A system receives data packets with an input audio signal from a client device. The system identifies a request. The system selects a digital component provided by a digital component provider device. The system identifies audio chimes stored in memory of the client device. The system matches, based on a policy, an identifier of the digital component provider device to a first audio chime stored in the memory of the client device. The system determines, based on a characteristic of the first audio chime, a configuration to combine the digital component with the first audio chime. The system generates an action data structure with the digital component, an indication of the first audio chime, and the configuration. The system transmits the action data structure to the client device to cause the client device to generate an output audio signal.

Type: Grant

Filed: April 30, 2020

Date of Patent: August 9, 2022

Assignee: GOOGLE LLC

Inventor: Peter Kraker
Response generation for conversational computing interface

Patent number: 11410643

Abstract: A computer-implemented method of responding to a conversational event. The method comprises enacting, by a conversational computing interface, an initial computer-executable plan based on a conversational event received by the conversational computing interface, wherein the initial computer-executable plan is configured to output an initial value based on the conversational event. The method further comprises selecting, by the conversational computing interface, an extended computer-executable plan based on determining that the initial value is insufficient for generating an extended description responsive to the conversational event. The method further comprises enacting, by the conversational computing interface, the extended computer-executable plan to output additional information beyond what the initial computer-executable plan is configured to output, the additional information sufficient for generating the extended description responsive to the conversational event.

Type: Grant

Filed: October 18, 2019

Date of Patent: August 9, 2022

Assignee: Microsoft Technology Licensing, LLC

Inventors: Jacob Daniel Andreas, Jayant Sivarama Krishnamurthy, Alan Xinyu Guo, Andrei Vorobev, John Philip Bufe, III, Jesse Daniel Eskes Rusak, Yuchen Zhang
Apparatus and method for estimating an inter-channel time difference

Patent number: 11410664

Abstract: An apparatus for estimating an inter-channel time difference between a first channel signal and a second channel signal, includes: a calculator for calculating a cross-correlation spectrum for a time block from the first channel signal in the time block and the second channel signal in the time block; a spectral characteristic estimator for estimating a characteristic of a spectrum of the first channel signal or the second channel signal for the time block; a smoothing filter for smoothing the cross-correlation spectrum over time using the spectral characteristic to obtain a smoothed cross-correlation spectrum; and a processor for processing the smoothed cross-correlation spectrum to obtain the inter-channel time difference.

Type: Grant

Filed: February 19, 2020

Date of Patent: August 9, 2022

Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.

Inventors: Stefan Bayer, Eleni Fotopoulou, Markus Multrus, Guillaume Fuchs, Emmanuel Ravelli, Markus Schnell, Stefan Doehla, Wolfgang Jaegers, Martin Dietz, Goran Markovic
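
The pipeline in the abstract above (cross-correlation spectrum per time block, smoothing over time steered by a spectral characteristic, then a time-difference estimate) can be sketched in plain Python. This is an illustration, not the patented method. Spectral flatness as the characteristic, the rule that maps flatness to the smoothing weight `alpha`, and peak picking on the inverse transform are all assumptions. The function expects at least one block per channel.

```python
import cmath
import math


def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def spectral_flatness(spectrum):
    # Geometric mean over arithmetic mean of the power spectrum, in (0, 1].
    power = [abs(c) ** 2 + 1e-12 for c in spectrum]
    geometric = math.exp(sum(math.log(p) for p in power) / len(power))
    return geometric / (sum(power) / len(power))


def estimate_itd(blocks_first, blocks_second):
    """Estimate how many samples the second channel lags the first."""
    smoothed = None
    for x1, x2 in zip(blocks_first, blocks_second):
        s1, s2 = dft(x1), dft(x2)
        cross = [a * b.conjugate() for a, b in zip(s1, s2)]
        # Assumed rule: more tonal (less flat) blocks keep more history.
        alpha = 0.9 - 0.4 * spectral_flatness(s1)
        if smoothed is None:
            smoothed = cross
        else:
            smoothed = [alpha * old + (1 - alpha) * new
                        for old, new in zip(smoothed, cross)]
    n = len(smoothed)
    # Inverse DFT of the smoothed spectrum gives a circular cross-correlation.
    corr = [sum(smoothed[k] * cmath.exp(2j * math.pi * k * m / n)
                for k in range(n)).real / n for m in range(n)]
    peak = max(range(n), key=lambda m: corr[m])
    lag = peak if peak <= n // 2 else peak - n
    return -lag  # positive result: second channel is delayed


first = [1.0 if t == 3 else 0.0 for t in range(16)]
second = [1.0 if t == 5 else 0.0 for t in range(16)]
itd = estimate_itd([first, first], [second, second])
```

In this toy case the second channel is the first one delayed by two samples, and the estimate recovers that delay.
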
Systems and methods for managing voice environments and voice routines

Patent number: 11404062

Abstract: Provided is a voice assistance system with proactive routines that couples a remote server and respective user voice interactive devices to deliver a complete experience to the end user of the device. The user devices can be managed by groups and/or associated entities who manage voice services for their users. For example, the entities can provide pre-configured voice routines that perform actions on behalf of their users. The voice assistance system can also allow users to customize these routines to improve day to day operation. In addition, external services and/or providers can be linked to the system and allowed to define routines that have external system dependencies. Avoiding and managing conflicts in this environment becomes quite challenging. Some approaches use execution queues and priority, others invoke time slices and limitations on assignment of routines to time slices to resolve these issues, among other examples.

Type: Grant

Filed: July 26, 2021

Date of Patent: August 2, 2022

Assignee: LifePod Solutions, Inc.

Inventors: Nirmalya K. De, Alan R. Bugos, Dale M. Smith, Stuart R. Patterson, Jonathan E. Gordon
