Speech Recognition (epo) Patents (Class 704/E15.001)

E Subclasses

Assessment or evaluation of speech recognition systems (epo) (Class 704/E15.002)

Language recognition (epo) (Class 704/E15.003)

Feature extraction for speech recognition; selection of recognition unit (epo) (Class 704/E15.004)

Segmentation or word limit detection (epo) (Class 704/E15.005)

Word boundary detection (EPO) (Class 704/E15.006)

Creation of reference templates; training of speech recognition systems, e.g., adaption to the characteristics of the speaker's voice, etc. (epo) (Class 704/E15.007)

Speech classification or search (epo) (Class 704/E15.014)

Speech recognition techniques for robustness in adverse environments, e.g., in noise, of stress induced speech, etc. (epo) (Class 704/E15.039)

Procedures used during a speech recognition process, e.g., man-machine dialogue, etc. (epo) (Class 704/E15.04)

Speech recognition using nonacoustical features, e.g., position of the lips, etc. (epo) (Class 704/E15.041)

Using position of the lips, movement of the lips, or face analysis (EPO) (Class 704/E15.042)

Speech to text systems (epo) (Class 704/E15.043)

Constructional details of speech recognition systems (epo) (Class 704/E15.046)

Server and method for providing multilingual subtitle service using artificial intelligence learning model, and method for controlling server

Patent number: 11966712

Abstract: Provided are a server and a method for providing a multilingual subtitle service using an artificial intelligence learning model, and a method for controlling the server. The server includes: a communication unit configured to perform data communication with either or both of a first user terminal device of a client requesting translation of a content image and a second user terminal device of a worker performing a translation task; a storage configured to store a worker search list based on learned worker information, and an artificial intelligence learning model for performing a worker's task performance evaluation; and a controller configured to input image information on the content image to the artificial intelligence learning model in accordance with a worker recommendation command of the client to acquire a worker list of workers capable of translating the content image, and control the communication unit to transmit the acquired worker list to the first user terminal device.

Type: Grant

Filed: July 8, 2021

Date of Patent: April 23, 2024

Assignee: GLOZ INC.

Inventors: Kug Koung Lee, Ho Kyun Kim, Bong Wan Kim

Artificial intelligence for intent-based networking

Patent number: 11968088

Abstract: Example implementations include a method, apparatus, and computer-readable medium configured for generating a network configuration using a large language model (LLM). The apparatus receives, at an interface between a user and LLM, a natural language intent for a network configuration. The apparatus requests the large language model to update the network configuration to an updated network configuration that satisfies the natural language intent in a declarative network configuration language. The apparatus verifies whether the updated network configuration satisfies a configuration syntax of the declarative network configuration language to detect an error. The apparatus requests the large language model to update the updated network configuration to correct the error. The apparatus deploys the updated network configuration to a user network.

Type: Grant

Filed: June 7, 2023

Date of Patent: April 23, 2024

Assignee: Microsoft Technology Licensing, LLC

Inventors: Yu Yan, Ryan Andrew Beckett, Paramvir Bahl
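
The generate, verify, and correct loop described in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: `generate_config`, the `llm` and `check_syntax` callables, the prompt wording, and the `max_repairs` bound are all hypothetical names and choices introduced here, and deployment to the user network is left out.

```python
from typing import Callable, Optional


def generate_config(
    intent: str,
    current_config: str,
    llm: Callable[[str], str],
    check_syntax: Callable[[str], Optional[str]],
    max_repairs: int = 3,
) -> str:
    """Ask a language model for an updated configuration and repair syntax errors.

    `llm` maps a prompt to a configuration string (hypothetical interface).
    `check_syntax` returns None when the configuration is valid, otherwise an
    error message (hypothetical interface).
    """
    prompt = (
        "Update this configuration to satisfy the intent.\n"
        f"Intent: {intent}\nConfiguration:\n{current_config}"
    )
    candidate = llm(prompt)
    for _ in range(max_repairs):
        error = check_syntax(candidate)
        if error is None:
            return candidate  # syntactically valid, ready to deploy
        # Feed the detected error back to the model and ask for a correction.
        candidate = llm(
            f"Fix this error: {error}\nConfiguration:\n{candidate}"
        )
    if check_syntax(candidate) is None:
        return candidate
    raise ValueError("no syntactically valid configuration produced")
```

Bounding the number of repair requests is an assumption made here so that the loop always terminates; the abstract does not state a limit.
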
Method and system for providing real-time trustworthiness analysis

Patent number: 11955114

Abstract: Disclosed herein is a method for providing real-time trustworthiness analysis. The method comprises the steps of: receiving, by a speech data receiving module, speech data; delivering, by the speech data receiving module, the speech data to a speech analysis module; analyzing, by the speech analysis module, the speech data to identify one or more speech attributes; quantifying, by the speech analysis module, at least one of the speech attributes with an attribute score; and determining, by a trustworthiness determination module, a trustworthiness level based on the attribute score of the at least one of the speech attributes.

Type: Grant

Filed: July 14, 2023

Date of Patent: April 9, 2024

Inventor: Craig Hancock, Sr.

Conversation-aware proactive notifications for a voice interface device

Patent number: 11908445

Abstract: A method for proactive notifications in a voice interface device includes: receiving a first user voice request for an action with a future performance time; assigning the first user voice request to a voice assistant service for performance; subsequent to the receiving, receiving a second user voice request and in response to the second user voice request initiating a conversation with the user; and during the conversation: receiving a notification from the voice assistant service of performance of the action; triggering a first audible announcement to the user to indicate a transition from the conversation and interrupting the conversation; triggering a second audible announcement to the user to indicate performance of the action; and triggering a third audible announcement to the user to indicate a transition back to the conversation and rejoining the conversation.

Type: Grant

Filed: May 16, 2022

Date of Patent: February 20, 2024

Assignee: Google LLC

Inventors: Kenneth Mixter, Daniel Colish, Tuan Nguyen

Controlling vehicle functions

Patent number: 11904888

Abstract: A system for controlling autonomously-controllable vehicle functions of an autonomous vehicle cooperating with partner subjects includes a database device with information on communication signals from partner subjects, action objectives, and scenarios, and has an autonomous vehicle with autonomously controllable vehicle functions communicatively connected to the database device. The autonomous vehicle includes a control device with a programmable unit and a surround sensor device. The control device receives sensor signals acquired by the surround sensor device of a surrounding area of the vehicle and communication signals originating from at least one partner subject. The control device determines a situation context based on the database information, and converts the captured communication signals into control signals for the autonomously controllable vehicle functions based on the situation context.

Type: Grant

Filed: October 27, 2022

Date of Patent: February 20, 2024

Assignee: Ford Global Technologies, LLC

Inventors: Ahmed Benmimoun, Mohamed Benmimoun, Sufian Ashraf Mazhari, Mohsen Lakehal-Ayat, Muhammad Adeel Awan

Thermal load prediction method and apparatus, readable medium, and electronic device

Patent number: 11887020

Abstract: A thermal load prediction method and apparatus. The method includes configuring multiple prediction states and corresponding error thresholds and forming a prediction model. The prediction model predicts first thermal load magnitudes respectively corresponding to multiple testing time periods, wherein a target steam user uses boiler steam in the multiple testing time periods. The method determines, according to the first thermal load magnitudes, relative prediction errors respectively corresponding to the multiple testing time periods, forms a state transition probability matrix according to the relative prediction errors, and determines a state probability of each prediction state in each of the future time periods according to the state transition probability matrix.

Type: Grant

Filed: September 25, 2019

Date of Patent: January 30, 2024

Assignee: ENNEW Technology Co., Ltd.

Inventors: Shengwei Liu, Xin Huang
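
The steps in this abstract (error thresholds, prediction states, a state transition probability matrix, and future state probabilities) resemble a Markov chain over prediction-error states. The sketch below illustrates that reading only; the function names, the use of measured loads to compute relative errors, and the uniform fallback for unseen states are assumptions introduced here, not details taken from the patent.

```python
from bisect import bisect_right


def relative_errors(predicted, measured):
    """Relative prediction error per testing period (measured loads are assumed known)."""
    return [(p - m) / m for p, m in zip(predicted, measured)]


def to_states(errors, thresholds):
    """Map each error to a state index; thresholds must be sorted ascending."""
    return [bisect_right(thresholds, e) for e in errors]


def transition_matrix(states, n_states):
    """Estimate transition probabilities from consecutive state pairs."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Assumption: a state never left in the data gets a uniform row.
        matrix.append([c / total for c in row] if total else [1.0 / n_states] * n_states)
    return matrix


def future_state_probabilities(matrix, start_state, steps):
    """Probability of each state in each of the next `steps` periods."""
    n = len(matrix)
    dist = [1.0 if i == start_state else 0.0 for i in range(n)]
    history = []
    for _ in range(steps):
        dist = [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]
        history.append(dist)
    return history
```

With two thresholds such as `[-0.05, 0.05]`, the errors fall into three states (under-prediction, acceptable, over-prediction), and each row of the matrix gives the chance of moving from one state to another between consecutive periods.
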
Unified communications call routing and decision based on integrated analytics-driven database and aggregated data

Patent number: 11889026

Abstract: Exemplary aspects involve a data-communications apparatus or system that communicates over a broadband network with a plurality of remotely-located data-communications circuits respectively associated with a plurality of remotely-situated client entities. The system includes a data-communications platform (e.g., UC-CC) that processes incoming data-communication interactions, including different types of digitally-represented communications among which are incoming calls, and that is integrated with a memory circuit including a database of information sets. Each of the information sets includes experience data corresponding to past incoming data-communication interactions processed by the platform, together with aggregated and organized data based on data collected in previous incoming interactions.

Type: Grant

Filed: August 18, 2022

Date of Patent: January 30, 2024

Assignee: 8x8, Inc.

Inventors: Bryan R. Martin, Matt Taylor, Manu Mukerji

Active speaker tracking using a global naming scheme

Patent number: 11863592

Abstract: A method includes, at a media bridge configured to distribute a plurality of media streams among a plurality of client devices connected to the media bridge over a network, receiving the plurality of media streams from the plurality of client devices via the media bridge. The media bridge connects the plurality of client devices. The method further includes assigning a pair of names for each of the plurality of media streams. The pair of names includes a contribution name and a distribution name. The method further includes presenting a first list to the plurality of client devices. The first list includes a plurality of the distribution names for the plurality of media streams received from the plurality of client devices. The method further includes providing an indication of a current active speaker within the plurality of media streams via a signaling process.

Type: Grant

Filed: May 14, 2021

Date of Patent: January 2, 2024

Assignee: CISCO TECHNOLOGY, INC.

Inventors: Jacques Samain, Giovanna Carofiglio, Giulio Grassi, Enrico Loparco, Michele Papalini

Analysis and validation of language models

Patent number: 11829720

Abstract: Systems and methods for analysis and validation of language models trained using data that is unavailable or inaccessible are provided. One example method includes, at an electronic device with one or more processors and memory, obtaining a first set of data corresponding to one or more tokens predicted based on one or more previous tokens. The method determines a probability that the first set of data corresponds to a prediction generated by a first language model trained using a user privacy preserving training process. In accordance with a determination that the probability is within a predetermined range, the method determines that the one or more tokens correspond to a prediction associated with the user privacy preserving training process and outputs a predicted token sequence including the one or more tokens and the one or more previous tokens.

Type: Grant

Filed: December 1, 2020

Date of Patent: November 28, 2023

Assignee: Apple Inc.

Inventors: Jerome R. Bellegarda, Bishal Barman, Brent D. Ramerth

Extended reality based voice command device management

Patent number: 11790908

Abstract: A voice command can be received from a user. One or more voice command devices (VCDs) that the voice command is targeting can be determined. A visual indicator of each of the one or more targeted VCDs can be displayed on an XR device worn by the user, wherein each visual indicator visually indicates a respective targeted VCD the voice command is directed to on the XR device.

Type: Grant

Filed: February 9, 2021

Date of Patent: October 17, 2023

Assignee: International Business Machines Corporation

Inventors: Soma Shekar Naganna, Sarbajit K. Rakshit, Abhishek Seth, Matheen Ahmed Pasha

Intelligent agent for interactive service environments

Patent number: 11776546

Abstract: Techniques are described for providing information during a service session, using an intelligent agent. The intelligent agent executes as a process to monitor communications exchanged during a service session between an individual and a service representative (SR) within a service environment. The agent analyzes the communications to identify questions or other topics that are posed by the individual during the service session. The agent retrieves stored data related to such questions or other topics, and generates a message to address each question or other topic. The message is injected into the service session to be presented to the individual, to supplement the conversation that is taking place between the SR and the individual. In some implementations, the agent monitors the communications, generates the message, and/or injects the message into the service session at least partly autonomously of any explicit action taken by the SR.

Type: Grant

Filed: September 8, 2021

Date of Patent: October 3, 2023

Assignee: United Services Automobile Association (USAA)

Inventors: Michael Waldmeier, Yuibi Fujimoto

Video conference collaboration

Patent number: 11778102

Abstract: A system and method providing an accessibility tool that enhances a graphical user interface of an online meeting application is described. In one aspect, a computer-implemented method performed by an accessibility tool (128), the method includes accessing (802), in real-time, audio data of a session of an online meeting application (120), identifying (804) a target user, a speaking user, and a task based on the audio data, the speaking user indicating the task assigned to the target user in the audio data, generating (806) a message (318) that identifies the speaking user, the target user, and the task, the message (318) including textual content, and displaying (808) the message (318) in a chat pane (906) of a graphical user interface (902) of the online meeting application (120) during the session.

Type: Grant

Filed: April 1, 2022

Date of Patent: October 3, 2023

Assignee: Microsoft Technology Licensing, LLC

Inventors: Shahil Soni, Charles Yin-Che Lee

Biometric authentication

Patent number: 11748462

Abstract: A method for authenticating a user of an electronic device is disclosed. The method comprises: responsive to detection of a trigger event indicative of a user interaction with the electronic device, generating an audio probe signal to play through an audio transducer of the electronic device; receiving a first audio signal comprising a response of the user's ear to the audio probe signal; receiving a second audio signal comprising speech of the user; and applying an ear biometric algorithm to the first audio signal and a voice biometric algorithm to the second audio signal to authenticate the user as an authorised user.

Type: Grant

Filed: December 7, 2020

Date of Patent: September 5, 2023

Assignee: Cirrus Logic Inc.

Inventor: John Paul Lesso

System and methods for training task-oriented dialogue (TOD) language models

Patent number: 11749264

Abstract: Embodiments described herein provide methods and systems for training task-oriented dialogue (TOD) language models. In some embodiments, a TOD language model may receive a TOD dataset including a plurality of dialogues and a model input sequence may be generated from the dialogues using a first token prefixed to each user utterance and a second token prefixed to each system response of the dialogues. In some embodiments, the first token or the second token may be randomly replaced with a mask token to generate a masked training sequence and a masked language modeling (MLM) loss may be computed using the masked training sequence. In some embodiments, the TOD language model may be updated based on the MLM loss.

Type: Grant

Filed: November 3, 2020

Date of Patent: September 5, 2023

Assignee: Salesforce, Inc.

Inventors: Chien-Sheng Wu, Chu Hong Hoi, Richard Socher, Caiming Xiong
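
The sequence construction and masking described in this abstract can be illustrated with a short sketch. It is an assumption-laden example, not the method claimed in the patent: the token strings `[USR]`, `[SYS]`, and `[MASK]`, whitespace tokenization, the 15% default masking rate, and the function names are all chosen here for illustration. Computing the MLM loss itself requires a model and is omitted.

```python
import random

USR, SYS, MASK = "[USR]", "[SYS]", "[MASK]"  # hypothetical token strings


def build_input_sequence(dialogue):
    """Flatten a dialogue into tokens, prefixing each turn with a role token.

    `dialogue` is a list of (speaker, text) pairs with speaker "user" or "system".
    """
    tokens = []
    for speaker, text in dialogue:
        tokens.append(USR if speaker == "user" else SYS)
        tokens.extend(text.split())  # whitespace tokenization is a simplification
    return tokens


def mask_sequence(tokens, mask_prob=0.15, rng=None):
    """Randomly replace tokens (role tokens included) with the mask token.

    Returns the masked sequence and per-position labels: the original token
    where a mask was applied, None elsewhere. An MLM loss would be computed
    only at the positions whose label is not None.
    """
    rng = rng or random.Random()
    masked, labels = [], []
    for token in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK)
            labels.append(token)
        else:
            masked.append(token)
            labels.append(None)
    return masked, labels
```

For example, `build_input_sequence([("user", "book a table"), ("system", "for how many")])` yields one flat token list in which every turn starts with its role token, so a masked role token forces the model to infer who is speaking.
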
Method and apparatus and telephonic system for acoustic scene conversion

Patent number: 11741984

Abstract: An acoustic scene conversion method, comprising: receiving sound signals including user's speech and scenic sounds; processing the sound signals according to an artificial intelligence model to generate enhanced speech signals without scenic sounds; and mixing the enhanced speech signals with new scenic sounds to produce converted sound signals.

Type: Grant

Filed: June 1, 2021

Date of Patent: August 29, 2023

Assignee: ACADEMIA SINICA

Inventors: Tsao Yu, Syu-Siang Wang, Szu-Wei Fu, Alexander Chao-Fu Kang, Hsin-Min Wang

Memory retention system

Patent number: 11709654

Abstract: The present disclosure generally relates to a computer-implemented system for intelligently retaining and recalling memory data. An exemplary method comprises receiving, via a microphone of an electronic device, a speech input of the user; receiving a text input of the user; constructing a first instance of a memory data structure based on the speech input; constructing a second instance of the memory data structure based on the text input; adding the first instance and the second instance of the memory data structure to a memory stack of the user; displaying a user interface for retrieving memory data of the user; receiving, via the user interface, a beginning of a statement from the user; retrieving a particular instance of the memory data structure from the memory stack based on the beginning of the statement; and automatically displaying a completion of the statement.

Type: Grant

Filed: May 19, 2022

Date of Patent: July 25, 2023

Assignee: Human AI Labs, Inc.

Inventors: Suman Kanuganti, Xiaoran Zhang, Kristie Kaiser

Agent control device

Patent number: 11710484

Abstract: An agent control device configured to execute a plurality of agents and including a processor, the processor being configured to store an interruptibility list that stipulates interruptibility of execution for each function of one given agent being executed or for an execution status of the one given agent; request execution of each of the agents at a prescribed trigger, or request execution of another given agent at a specific trigger, reference the interruptibility list in order to set permissibility information relating to executability of the other given agent in conjunction with execution of the one given agent; and perform management such that, in a case in which there is a request at the specific trigger for execution of the other given agent while the one given agent is executing, the other given agent is executed based on the request.

Type: Grant

Filed: April 8, 2021

Date of Patent: July 25, 2023

Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA

Inventor: Satoshi Aihara

Display device

Patent number: 11706482

Abstract: Provided is a display device including a display unit, a storage unit configured to store information on a web page, a microphone configured to receive a user's voice command, a network interface unit configured to perform communication with a natural language processing (NLP) server, and a controller configured to transmit text data of the voice command to the NLP server, to receive intention analysis result information corresponding to the voice command from the NLP server, to select, as a final candidate address, one of a plurality of candidate addresses related to a search word included in the received intention analysis result information if the search word is not stored in the storage unit, and to access a website corresponding to the selected final candidate address.

Type: Grant

Filed: February 20, 2018

Date of Patent: July 18, 2023

Assignee: LG ELECTRONICS INC.

Inventors: Chulmin Son, Seunghyun Heo, Jaekyung Lee

System and method for fast and accurate netlist to RTL reverse engineering

Patent number: 11704460

Abstract: Embodiments herein provide for reverse engineering of integrated circuits (ICs) for design verification. In example embodiments, an apparatus receives a gate-level netlist for an integrated circuit (IC), generates a list of equivalence classes related to signals included in the gate-level netlist, determines control signals of the gate-level netlist based at least in part on the list of equivalence classes, determines a logic flow of a finite state transducer (FST) based at least in part on the control signals, and generates register transfer level (RTL) source code for the IC based on the FST.

Type: Grant

Filed: June 9, 2021

Date of Patent: July 18, 2023

Assignee: UNIVERSITY OF FLORIDA RESEARCH FOUNDATION, INCORPORATED

Inventors: Yier Jin, Shaojie Zhang, James Geist, Travis Meade, Jason Liam Portillo

Multimedia processing method and electronic system

Patent number: 11699429

Abstract: An electronic system is provided. The electronic system includes a host and a display. The host includes an audio processing module and a smart interpreter engine. The audio processing module is utilized for acquiring audio data corresponding to a first language from audio streams processed by an application program executed on the host. The application program executed on the host includes a specific game software. The smart interpreter engine is utilized for receiving the audio data corresponding to the first language from the audio processing module and converting the audio data corresponding to the first language into text data corresponding to a second language according to the game software executed on the host. The display is utilized for receiving the text data corresponding to the second language from the smart interpreter engine and displaying the text data corresponding to the second language.

Type: Grant

Filed: March 3, 2021

Date of Patent: July 11, 2023

Assignee: ACER INCORPORATED

Inventors: Gianna Tseng, Szu-Ting Chou, Shang-Yao Lin, Shih-Cheng Huang

Machine learning for improving quality of voice biometrics

Patent number: 11699446

Abstract: Methods and systems are disclosed herein for improving the quality of audio for use in a biometric. A biometric system may use machine learning to determine whether audio or a portion of the audio should be used as a biometric for a user. A sample of the user's voice may be used to generate a voice signature of the user. Portions of the audio that do not meet a similarity threshold when compared with the voice signature may be removed from the audio. Additionally or alternatively, interfering noises may be detected and removed from the audio to improve the quality of a voice biometric generated from the audio.

Type: Grant

Filed: May 19, 2021

Date of Patent: July 11, 2023

Assignee: Capital One Services, LLC

Inventors: Bozhao Tan, Isabelle Alice Yvonne Moulinier, David Almquist, June Wu

Template-based intent classification for chatbots

Patent number: 11694032

Abstract: The present disclosure relates to chatbot systems and, more particularly, to techniques for determining that an input utterance is representative of a task that a particular chatbot can perform, based on matching the input utterance to a template. Techniques are also described for generating templates based on example utterances that have been provided for a chatbot. In certain embodiments, an initial set of templates is generated based on example utterances. This initial set of templates is then refined using template generalization techniques, which can be performed at the word or sentence level to generate a final set of templates for use at runtime, when the templates are matched against user utterances. The final set of templates may include one or more generalized templates that were derived from the initial set of templates and may also include the initial set of templates.

Type: Grant

Filed: September 3, 2020

Date of Patent: July 4, 2023

Assignee: Oracle International Corporation

Inventors: Stephen Andrew McRitchie, Sunghye Jeon

System and method for facilitating electronic financial transactions during a phone call

Patent number: 11687908

Abstract: A payment button on a device capable of making telephone calls, such as a mobile phone, allows a payer to electronically transfer money while in a phone call with a payee. The payment button also allows a payee to initiate an electronic payment transaction while in a phone call with a payer. The payment button may be a clickable or tappable virtual button presented on a display of the phone when being used to make or receive a call. The payer or the payee can simply enter a payment amount on the phone to complete an electronic payment transaction. A notification of payment is instantly transmitted to the phones being used for the phone call, so that the parties can safely and conveniently conclude a purchase and/or payment transaction during one phone call.

Type: Grant

Filed: June 7, 2021

Date of Patent: June 27, 2023

Assignee: PAYPAL, INC.

Inventors: Saumil Ashvin Gandhi, Ray Hideki Tanaka

Automatic speech recognition triggering system

Patent number: 11683632

Abstract: An automatic speech recognition (ASR) triggering system, and a method of providing an ASR trigger signal, is described. The ASR triggering system can include a microphone to generate an acoustic signal representing an acoustic vibration and an accelerometer worn in an ear canal of a user to generate a non-acoustic signal representing a bone conduction vibration. A processor of the ASR triggering system can receive an acoustic trigger signal based on the acoustic signal and a non-acoustic trigger signal based on the non-acoustic signal, and combine the trigger signals to gate an ASR trigger signal. For example, the ASR trigger signal may be provided to an ASR server only when the trigger signals are simultaneously asserted. Other embodiments are also described and claimed.

Type: Grant

Filed: August 17, 2021

Date of Patent: June 20, 2023

Assignee: Apple Inc.

Inventors: Sorin V. Dusan, Aram M. Lindahl, Robert D. Watson
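
The gating rule given as an example in this abstract, where the ASR trigger is provided only when the acoustic and non-acoustic trigger signals are asserted at the same time, amounts to a logical AND per time step. A minimal sketch follows, assuming both trigger signals are already available as equal-length boolean sequences sampled on the same clock; the function name and that representation are illustrative, not from the patent.

```python
def gate_asr_trigger(acoustic_trigger, non_acoustic_trigger):
    """Assert the ASR trigger only where both input triggers are asserted.

    Both arguments are sequences of booleans, one value per time step.
    """
    if len(acoustic_trigger) != len(non_acoustic_trigger):
        raise ValueError("trigger signals must have the same length")
    return [
        bool(a) and bool(b)
        for a, b in zip(acoustic_trigger, non_acoustic_trigger)
    ]
```

Requiring both signals means that a nearby speaker (acoustic signal only) or jaw movement without speech (bone conduction signal only) does not wake the recognizer in this sketch.
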
End-to-end multi-speaker audio-visual automatic speech recognition

Patent number: 11615781

Abstract: A single audio-visual automated speech recognition model for transcribing speech from audio-visual data includes an encoder frontend and a decoder. The encoder includes an attention mechanism configured to receive an audio track of the audio-visual data and a video portion of the audio-visual data. The video portion of the audio-visual data includes a plurality of video face tracks each associated with a face of a respective person. For each video face track of the plurality of video face tracks, the attention mechanism is configured to determine a confidence score indicating a likelihood that the face of the respective person associated with the video face track includes a speaking face of the audio track. The decoder is configured to process the audio track and the video face track of the plurality of video face tracks associated with the highest confidence score to determine a speech recognition result of the audio track.

Type: Grant

Filed: October 2, 2020

Date of Patent: March 28, 2023

Assignee: Google LLC

Inventor: Otavio Braga

Learning word-level confidence for subword end-to-end automatic speech recognition

Patent number: 11610586

Abstract: A method includes receiving a speech recognition result, and using a confidence estimation module (CEM), for each sub-word unit in a sequence of hypothesized sub-word units for the speech recognition result: obtaining a respective confidence embedding that represents a set of confidence features; generating, using a first attention mechanism, a confidence feature vector; generating, using a second attention mechanism, an acoustic context vector; and generating, as output from an output layer of the CEM, a respective confidence output score for each corresponding sub-word unit based on the confidence feature vector and the acoustic context vector received as input by the output layer of the CEM. For each of the one or more words formed by the sequence of hypothesized sub-word units, the method also includes determining a respective word-level confidence score for the word. The method also includes determining an utterance-level confidence score by aggregating the word-level confidence scores.

Type: Grant

Filed: February 23, 2021

Date of Patent: March 21, 2023

Assignee: Google LLC

Inventors: David Qiu, Qiujia Li, Yanzhang He, Yu Zhang, Bo Li, Liangliang Cao, Rohit Prabhavalkar, Deepti Bhatia, Wei Li, Ke Hu, Tara Sainath, Ian Mcgraw

Information processing apparatus that cooperates with smart speaker, information processing system, control methods, and storage media

Patent number: 11595535

Abstract: An information processing apparatus that is capable of reducing time and effort to set settings of a smart speaker that cooperates with the information processing apparatus when a user starts to use the smart speaker. The information processing apparatus acquires identification information of the user, and acquires audio control information associated with the acquired identification information. Then, the information processing apparatus requests the smart speaker to change the audio setting of the smart speaker based on the acquired audio control information.

Type: Grant

Filed: June 10, 2021

Date of Patent: February 28, 2023

Assignee: CANON KABUSHIKI KAISHA

Inventor: Ryosuke Kasahara

Automated audio-to-text transcription in multi-device teleconferences

Patent number: 11574638

Abstract: A system and method are disclosed for generating a teleconference space for two or more communication devices using a computer coupled with a database and comprising a processor and memory. The computer generates a teleconference space and transmits requests to join the teleconference space to the two or more communication devices. The computer stores in memory identification information, and audiovisual data associated with one or more users, for each of the two or more communication devices. The computer stores audio transcription data, transmitted to the computer by each of the two or more communication devices and associated with one or more communication device users, in the computer memory. The computer merges the audio transcription data from each of the two or more communication devices into a master audio transcript, and transmits the master audio transcript to each of the two or more communication devices.

Type: Grant

Filed: May 9, 2022

Date of Patent: February 7, 2023

Assignee: Nextiva, Inc.

Inventors: Tomas Gorny, Jean-Baptiste Martinoli, Tracy Conrad, Lukas Gorny
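
The merging step in this abstract, which combines per-device transcription data into one master transcript, can be sketched as a merge of time-ordered segments. The abstract does not say how segments are ordered, so ordering by start time is an assumption made here, and the `Segment` fields and function name are hypothetical.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    start_seconds: float  # assumed: each segment carries a start time
    device_id: str
    text: str


def merge_transcripts(per_device):
    """Merge one list of segments per device into a single master transcript."""
    # Sort each device's segments first so heapq.merge sees ordered inputs.
    ordered = [
        sorted(segments, key=lambda s: s.start_seconds)
        for segments in per_device
    ]
    return list(heapq.merge(*ordered, key=lambda s: s.start_seconds))
```

The resulting list interleaves the devices' segments in time order and could then be sent back to every participating device as the master transcript.
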
Phrase recognition model for autonomous vehicles

Patent number: 11562573

Abstract: Aspects of the disclosure relate to training and using a phrase recognition model to identify phrases in images. As an example, a selected phrase list that includes a plurality of phrases is received. Each phrase of the plurality of phrases includes text. An initial plurality of images may be received. A training image set may be selected from the initial plurality of images by identifying the phrase-containing images that include one or more phrases from the selected phrase list. Each given phrase-containing image of the training image set may be labeled with information identifying the one or more phrases from the selected phrase list included in the given phrase-containing image. The model may be trained based on the training image set such that the model is configured to, in response to receiving an input image, output data indicating whether a phrase of the plurality of phrases is included in the input image.

Type: Grant

Filed: December 16, 2020

Date of Patent: January 24, 2023

Assignee: Waymo LLC

Inventors: Victoria Dean, Abhijit S Ogale, Henrik Kretzschmar, David Harrison Silver, Carl Kershaw, Pankaj Chaudhari, Chen Wu, Congcong Li

Information processing device, information processing method, and recording medium

Patent number: 11514787

Abstract: In an information processing device, a first acquirer acquires, from a user, plan information including a scheduled time and a destination. A second acquirer acquires a spare time. A third acquirer acquires travelling schedule information for enabling arrival at the destination earlier than the scheduled time by the spare time or more. A display controller displays, on a display unit, information regarding the travelling schedule information and the spare time.

Type: Grant

Filed: August 1, 2019

Date of Patent: November 29, 2022

Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA

Inventors: Koichi Suzuki, Makoto Akahane

Data mining apparatus, method and system for speech recognition using the same

Patent number: 11495234

Abstract: A data mining device, and a speech recognition method and system using the same are disclosed. The speech recognition method includes selecting speech data including a dialect from speech data, analyzing and refining the speech data including a dialect, and learning an acoustic model and a language model through an artificial intelligence (AI) algorithm using the refined speech data including a dialect. The user is able to use a dialect speech recognition service which is improved using services such as eMBB, URLLC, or mMTC of 5G mobile communications.

Type: Grant

Filed: May 30, 2019

Date of Patent: November 8, 2022

Assignee: LG Electronics Inc.

Inventors: Jee Hye Lee, Seon Yeong Park

Display device and method for controlling same

Patent number: 11488598

Abstract: The present disclosure relates to a display device. The display device includes a display; a signal receiver configured to receive a user's voice signal through at least one of a plurality of devices; and a processor configured to: display an image of at least one of a plurality of programs on the display by executing the plurality of programs, identify a program corresponding to a device receiving the voice signal among the plurality of programs based on matching information set by the user regarding a mutual correspondence between the plurality of programs and the plurality of devices, in response to the user's voice signal received through any one of the plurality of devices, and control the identified program to operate according to a user command corresponding to the received voice signal. As a result, the target program can be controlled by voice command as the user intends, even if the user who inputs the voice command does not separately designate the target program.

Type: Grant

Filed: January 3, 2019

Date of Patent: November 1, 2022

Assignee: SAMSUNG ELECTRONICS CO., LTD.

Inventor: Youngsoo Yun

Multimedia processing circuit and electronic system

Patent number: 11482229

Abstract: A multimedia processing circuit is provided. The multimedia processing circuit includes a smart interpreter engine and an audio engine. The smart interpreter engine includes a speech to text converter, a natural language processing module and a translator. The speech to text converter is utilized for converting speech data into text data corresponding to the first language. The natural language processing module is utilized for converting the text data corresponding to the first language into glossary text data corresponding to the first language according to an application program being executed in a host. The application program comprises a specific game software. The translator is utilized for converting the glossary text data corresponding to the first language into text data corresponding to a second language. The audio engine is utilized for converting the speech data corresponding to the first language into an analog speech signal corresponding to the first language.

Type: Grant

Filed: May 26, 2020

Date of Patent: October 25, 2022

Assignee: ACER INCORPORATED

Inventors: Gianna Tseng, Shih-Cheng Huang, Shang-Yao Lin, Szu-Ting Chou

Method, system for determining electronic device, computer system and readable storage medium

Patent number: 11481036

Abstract: A method for determining an electronic device, a system for determining an electronic device, a computer system, and a computer-readable storage medium are disclosed. The method includes: acquiring a recognition result by recognizing a first action performed by an operating object through a first electronic device (S201); and determining a second electronic device which is controllable by the first electronic device according to the recognition result (S202).

Type: Grant

Filed: April 12, 2019

Date of Patent: October 25, 2022

Assignees: Beijing JingDong ShangKe Information Technology Co., Ltd., Beijing Jingdong Century Trading Co., Ltd.

Inventors: Yazhuo Wang, Yu Guan, Zhongfei Xu

Display device for processing user utterance and control method of display device

Patent number: 11460979

Abstract: According to an embodiment disclosed in the specification, a display device may include a microphone, a display displaying a screen including a plurality of layers, a memory storing a plurality of application programs, and at least one processor displaying a first user interface (UI) for interacting with a user on a first layer among the plurality of layers, displaying a second UI for displaying information obtained by performing the interaction on a second layer among the plurality of layers, and displaying an image at least partly overlapping with the first UI and the second UI on a third layer among the plurality of layers.

Type: Grant

Filed: December 28, 2018

Date of Patent: October 4, 2022

Assignee: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Jibum Moon, Jina Kwon, Kyerim Lee

Speech recognition method, apparatus, and device, and storage medium

Patent number: 11450312

Abstract: A speech recognition method includes: obtaining speech information; and determining beginning and ending positions of a candidate speech segment in the speech information by using a weighted finite state transducer (WFST) network. The candidate speech segment is identified as corresponding to a preset keyword. The method also includes clipping the candidate speech segment from the speech information according to the beginning and ending positions of the candidate speech segment; detecting whether the candidate speech segment includes a preset keyword by using a machine learning model; and determining, upon determining that the candidate speech segment comprises the preset keyword, that the speech information comprises the preset keyword.

Type: Grant

Filed: June 12, 2020

Date of Patent: September 20, 2022

Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED

Inventors: Shilun Lin, Xilin Zhang, Wenhua Ma, Bo Liu, Xinhui Li, Li Lu, Xiucai Jiang

Information processing apparatus and speech analysis method

Patent number: 11404064

Abstract: An information processing apparatus includes a first detector, a textualization device, a second detector, a display device and a display controller. The first detector detects, from audio data in which speech of each person in a group composed of a plurality of persons has been recorded, each utterance made during the speech. The textualization device converts contents of each utterance detected by the first detector into text. The second detector detects predetermined keywords included in each utterance on the basis of text data obtained through textualization by the textualization device. The display controller causes the display device to display the predetermined keywords detected by the second detector.

Type: Grant

Filed: November 2, 2018

Date of Patent: August 2, 2022

Assignee: KYOCERA Document Solutions Inc.

Inventors: Yuki Kobayashi, Nami Nishimura, Tomoko Mano

Processing method of learning face recognition by artificial intelligence module

Patent number: 11403875

Abstract: A processing method of face recognition includes steps of: extracting embedding feature information from a face image; outputting a recognition result of face recognition according to the embedding feature information, wherein the recognition result includes a recognized name and embedding feature distance information; determining whether the recognized name is in a list or not; if the recognized name is in the list, performing a removal checking step for determining whether to remove the recognition result based on the embedding feature distance information; if determining that the recognition result is not to be removed, displaying the recognized name; if determining that the recognition result is to be removed, displaying a negative prompt; and dynamically and instantly providing a feedback and updating a recognition method for the face recognition.

Type: Grant

Filed: November 20, 2020

Date of Patent: August 2, 2022

Assignee: ASKEY COMPUTER CORP.

Inventors: Chien-Fang Chen, Setya Widyawan Prakosa, Huan-Ruei Shiu, Chien-Ming Lee

Method, apparatus, device and computer-readable storage medium for voice interaction

Patent number: 11393490

Abstract: According to embodiments of the present disclosure, a method, apparatus, device, and computer readable storage medium for voice interaction are provided. The method includes: determining, based on a voice feature of a received voice signal, a text corresponding to the voice signal. The method further includes: determining, based on the voice feature and the text, a matching degree between a reference voice feature of an element in the text and a target voice feature of the element. The method further includes: determining a first possibility that the voice signal is an executable command based on the text. The method further includes: determining a second possibility that the voice signal is the executable command based on the voice feature.

Type: Grant

Filed: June 8, 2020

Date of Patent: July 19, 2022

Assignee: Baidu Online Network Technology (Beijing) Co., Ltd.

Inventors: Zhijian Wang, Jinfeng Bai, Sheng Qian, Lei Jia

Method for operating a motor vehicle system of a motor vehicle depending on the driving situation, personalization device, and motor vehicle

Patent number: 11364926

Abstract: The invention relates to a method for operating a motor vehicle system of a motor vehicle depending on the driving situation. The method is performed by a personalization device of the motor vehicle and includes identifying the driver of the motor vehicle and using the identity of the driver to determine multiple driver-specific configuration data sets. Each of the determined configuration data sets describes configuration data of a respective user profile of the identified driver in order to personalize the motor vehicle system. The method further includes determining at least one additional occupant in the motor vehicle and, using the result of that determination, determining an intention of the identified driver. The method further includes using the determined intention to select a personalization mode from a plurality of personalization modes.

Type: Grant

Filed: April 4, 2019

Date of Patent: June 21, 2022

Assignee: AUDI AG

Inventors: Jürgen Lerzer, Nikoletta Sofra, Hans Georg Gruber, André Ebner, Ron Melz

Name Based Initiation of Speech Recognition

Publication number: 20150127345

Abstract: A computer-implemented method includes listening for audio name information indicative of a name of a computer, with the computer configured to listen for the audio name information in a first power mode that promotes a conservation of power; detecting the audio name information indicative of the name of the computer; after detection of the audio name information, switching to a second power mode that promotes a performance of speech recognition; receiving audio command information; and performing speech recognition on the audio command information.

Type: Application

Filed: September 30, 2011

Publication date: May 7, 2015

Inventors: Evan H. Parker, Michal R. Grabowski
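
The two power modes in the abstract above behave like a small state machine, sketched below. Several things here are assumptions for illustration: utterances are plain text rather than audio, `recognize` is a stand-in callable for the full recognizer, and the listener drops back to the low-power mode after one command (the abstract does not say when the device returns to that mode).

```python
from enum import Enum


class PowerMode(Enum):
    LOW_POWER = "listening for name"
    FULL_POWER = "recognizing commands"


class NamedListener:
    """Toy model of name-triggered speech recognition (hypothetical API)."""

    def __init__(self, name, recognize):
        self.name = name.lower()
        self.recognize = recognize      # stand-in for the full recognizer
        self.mode = PowerMode.LOW_POWER

    def hear(self, utterance):
        """Process one utterance; return a recognized command or None."""
        if self.mode is PowerMode.LOW_POWER:
            # Cheap check only: is the computer's name among the spoken words?
            if self.name in utterance.lower().split():
                self.mode = PowerMode.FULL_POWER
            return None
        command = self.recognize(utterance)
        self.mode = PowerMode.LOW_POWER  # assumption: one command per wake-up
        return command
```

Anything heard before the name is ignored without invoking the recognizer, which is where the power saving would come from in a real device.
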
Method of navigating in a sound content

Patent number: 8942980

Abstract: A method of navigating in a sound content wherein at least one key word is stored in association with at least two positions representative of said key word in the sound content, and wherein the method comprises: a step of displaying a representation of the sound content; during playback of the sound content, a step of detecting a current extract representative of a key word stored at a first position; a step of determining at least one second extract representative of said key word and a second position as a function of the stored positions; and a step of highlighting the position of the extracts in the representation of the sound content. The invention also relates to a system adapted to implement the navigation method.

Type: Grant

Filed: February 11, 2011

Date of Patent: January 27, 2015

Assignee: Orange

Inventors: Pascal Le Mer, Delphine Charlet, Marc Denjean, Antoine Gonot

Secure communication network

Patent number: 8930576

Abstract: The present invention is directed to a secure communication network that enables multi-point to multi-point proxy communication over the network. The network employs a smart server that establishes a secure communication link with each of a plurality of smart client devices deployed on local client networks. Each smart client device is in communication with a plurality of agent devices. A plurality of remote devices can access the smart server directly and communicate with an agent device via the secure communication link between the smart server and one of the smart client devices.

Type: Grant

Filed: July 11, 2014

Date of Patent: January 6, 2015

Assignee: KE2 Therm Solutions, Inc.

Inventors: Steve Roberts, Cetin Sert

Compounded Text Segmentation

Publication number: 20140372119

Abstract: In general, the subject matter described in this specification can be embodied in methods, systems, and program products for performing compounded text segmentation. Compounded text that is extracted from one or more search queries submitted to a search engine is received. The compounded text includes a plurality of individual words that are joined together without intervening spaces. An electronic dictionary including words is accessed. A data structure representing possible segmentations of the compounded text is generated based on whether words in the possible segmentations occur in the electronic dictionary. A data store comprising data associated with a same field of usage as the compounded text is accessed to determine a frequency of occurrence for possible segmentations of the data structure. A segmentation of the compounded text that is most probable based on the data is determined. A language model is trained using the determined segmentation of the compounded text.

Type: Application

Filed: September 28, 2009

Publication date: December 18, 2014

Inventors: Carolina Parada, Boulos Harb, Johan Schalkwyk
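
The segmentation steps in the abstract above can be illustrated with a short dynamic-programming sketch. It assumes a toy `word_counts` table that plays the role of both the electronic dictionary and the frequency data store, and it scores candidates with a simple unigram model. That scoring choice is an assumption, since the abstract does not name a probability model, and the final language-model training step is not shown.

```python
import math


def segment(compound, word_counts):
    """Return the most probable split of `compound` into dictionary words.

    `word_counts` maps each dictionary word to its frequency in a corpus from
    the same field of usage (a toy stand-in for the data store in the abstract).
    Returns None when no full segmentation into dictionary words exists.
    """
    total = sum(word_counts.values())
    max_len = max(map(len, word_counts), default=0)
    # best[i] = (log-probability, words) for the best split of compound[:i]
    best = [None] * (len(compound) + 1)
    best[0] = (0.0, [])
    for end in range(1, len(compound) + 1):
        for start in range(max(0, end - max_len), end):
            word = compound[start:end]
            if best[start] is None or word not in word_counts:
                continue
            score = best[start][0] + math.log(word_counts[word] / total)
            if best[end] is None or score > best[end][0]:
                best[end] = (score, best[start][1] + [word])
    return None if best[-1] is None else best[-1][1]
```

The `best` table is a compact form of the "data structure representing possible segmentations": each position keeps only the highest-scoring way to reach it, so competing splits such as "hot" + "els" versus "hotels" are compared by corpus frequency.
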
Secure communication network

Patent number: 8850072

Abstract: The present invention is directed to a secure communication network that enables multi-point to multi-point proxy communication over the network. The network employs a smart server that establishes a secure communication link with each of a plurality of smart client devices installed on local client networks. Each smart client device is in communication with a plurality of agent devices. A plurality of remote devices can access the smart server directly and communicate with agent devices via the secure communication link between the smart server and one of the smart client devices. This communication is enabled without complex configuration of firewall or network parameters by the user.

Type: Grant

Filed: July 25, 2013

Date of Patent: September 30, 2014

Assignee: KE2 Therm Solutions, Inc.

Inventors: Steve Roberts, Cetin Sert

Methods and Systems for Interfaces Allowing Limited Edits to Transcripts

Publication number: 20140249813

Abstract: A transcript interface for displaying a plurality of words of a transcript in a text editor can be provided and configured to receive a command to edit the transcript. Limited edits to data corresponding to the transcript can be made in response to commands received via the user interface module. For example, edits may be limited to selection of a single word in the text editor for editing via a given command. The edit may affect an adjacent word in some instances, such as when two adjacent words are merged. In some embodiments, data corresponding to the selected word of the transcript is changed to reflect the edit without changing data defining the relative timing of those words of the transcript that are not adjacent to the selected word.

Type: Application

Filed: December 1, 2008

Publication date: September 4, 2014

Applicant: Adobe Systems Incorporated

Inventor: Steven Hoeg
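
The idea of editing one word while leaving the timing of non-adjacent words alone can be sketched as follows. The `Word` record and both function names are hypothetical, and the rule that a merged word spans from the first word's start to the second word's end is an assumption, since the abstract does not give the timing rule.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Word:
    """One transcript word with its timing (hypothetical format)."""
    text: str
    start: float  # seconds
    end: float


def edit_word(transcript, index, new_text):
    """Change the text of one word; all timing data stays untouched."""
    edited = list(transcript)
    edited[index] = replace(edited[index], text=new_text)
    return edited


def merge_with_next(transcript, index, new_text=None):
    """Merge word `index` with its right neighbor into one word.

    Only the two merged words change; the merged word spans from the start of
    the first to the end of the second, and every other word keeps its timing.
    """
    first, second = transcript[index], transcript[index + 1]
    merged = Word(new_text or first.text + second.text, first.start, second.end)
    return list(transcript[:index]) + [merged] + list(transcript[index + 2:])
```

Both functions return a new list and never touch words outside the edited position and its neighbor, which mirrors the "limited edit" constraint in the abstract.
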
Extendable voice commands

Patent number: 8731609

Abstract: A mobile device, such as a cellular telephone, includes a voice interface that includes one part that may not be specific to a particular carrier, and a second part that provides an interface to services that are specific to a carrier or to service or information providers that are not necessarily available with all carriers. A voice command interface provides easy access to the carrier services. The set of carrier services is optionally extendible by the carrier.

Type: Grant

Filed: August 9, 2011

Date of Patent: May 20, 2014

Assignee: Nuance Communications, Inc.

Inventors: Daniel L. Roth, Chris Reiner, Mark Furnari, Jordan Cohen

Recognition of Speech With Different Accents

Publication number: 20140129218

Abstract: Computer-based speech recognition can be improved by recognizing words with an accurate accent model. In order to provide a large number of possible accents, while providing real-time speech recognition, a language tree data structure of possible accents is provided in one embodiment such that a computerized speech recognition system can benefit from choosing among accent categories when searching for an appropriate accent model for speech recognition.

Type: Application

Filed: November 6, 2012

Publication date: May 8, 2014

Applicant: Spansion LLC

Inventors: Chen Liu, Richard Fastow

Senone Scoring For Multiple Input Streams

Publication number: 20140129217

Abstract: Embodiments of the present invention include an apparatus, method, and system for calculating senone scores for multiple concurrent input speech streams. The method can include the following: receiving one or more feature vectors from one or more input streams; accessing the acoustic model one senone at a time; and calculating separate senone scores corresponding to each incoming feature vector. The calculation uses a single read access to the acoustic model for a single senone and calculates a set of separate senone scores for the one or more feature vectors, before proceeding to the next senone in the acoustic model.

Type: Application

Filed: November 6, 2012

Publication date: May 8, 2014

Inventor: Ojas A. BAPAT
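
The loop order described in the abstract above (read one senone, score it against every stream's feature vector, then move to the next senone) can be sketched in a few lines. The sketch assumes each senone is a single diagonal-covariance Gaussian stored as a `(mean, variance)` pair, which is a simplification chosen for brevity; the abstract does not describe the model format, and real senones are usually Gaussian mixtures.

```python
import math


def score_senones(acoustic_model, feature_vectors):
    """Score every feature vector against every senone, one senone at a time.

    `acoustic_model` is a list of (mean, variance) pairs, one diagonal Gaussian
    per senone. Returns scores[stream][senone] as log-likelihoods.
    """
    scores = [[0.0] * len(acoustic_model) for _ in feature_vectors]
    for s, (mean, var) in enumerate(acoustic_model):  # single read of senone s
        # The normalization term depends only on the senone, so compute it once.
        log_norm = -0.5 * sum(math.log(2.0 * math.pi * v) for v in var)
        for k, x in enumerate(feature_vectors):       # reuse it for all streams
            dist = sum((xi - m) ** 2 / v for xi, m, v in zip(x, mean, var))
            scores[k][s] = log_norm - 0.5 * dist
    return scores
```

Putting the senone loop on the outside means the model parameters for a senone are fetched once no matter how many input streams are active, which is the memory-access saving the abstract points to.
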
AUGMENTING SPEECH RECOGNITION WITH DEPTH IMAGING

Publication number: 20140122086

Abstract: Embodiments related to the use of depth imaging to augment speech recognition are disclosed. For example, one disclosed embodiment provides, on a computing device, a method including receiving depth information of a physical space from a depth camera, receiving audio information from one or more microphones, identifying a set of one or more possible spoken words from the audio information, determining a speech input for the computing device based upon comparing the set of one or more possible spoken words from the audio information and the depth information, and taking an action on the computing device based upon the speech input determined.

Type: Application

Filed: October 26, 2012

Publication date: May 1, 2014

Applicant: MICROSOFT CORPORATION

Inventors: Jay Kapur, Ivan Tashev, Mike Seltzer, Stephen Edward Hodges
