Speech Recognition (EPO) Patents (Class 704/E15.001)

  • Patent number: 11966712
    Abstract: Provided are a server and a method for providing a multilingual subtitle service using an artificial intelligence learning model, and a method for controlling the server. The server includes: a communication unit configured to perform data communication with either or both of a first user terminal device of a client requesting translation of a content image and a second user terminal device of a worker performing a translation task; a storage configured to store a worker search list based on learned worker information, and an artificial intelligence learning model for performing a worker's task performance evaluation; and a controller configured to input image information on the content image to the artificial intelligence learning model in accordance with a worker recommendation command of the client to acquire a worker list of workers capable of translating the content image, and control the communication unit to transmit the acquired worker list to the first user terminal device.
    Type: Grant
    Filed: July 8, 2021
    Date of Patent: April 23, 2024
    Assignee: GLOZ INC.
    Inventors: Kug Koung Lee, Ho Kyun Kim, Bong Wan Kim
  • Patent number: 11968088
    Abstract: Example implementations include a method, apparatus, and computer-readable medium configured for generating a network configuration using a large language model (LLM). The apparatus receives, at an interface between a user and LLM, a natural language intent for a network configuration. The apparatus requests the large language model to update the network configuration to an updated network configuration that satisfies the natural language intent in a declarative network configuration language. The apparatus verifies whether the updated network configuration satisfies a configuration syntax of the declarative network configuration language to detect an error. The apparatus requests the large language model to update the updated network configuration to correct the error. The apparatus deploys the updated network configuration to a user network.
    Type: Grant
    Filed: June 7, 2023
    Date of Patent: April 23, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yu Yan, Ryan Andrew Beckett, Paramvir Bahl
  • Patent number: 11955114
    Abstract: Disclosed herein is a method for providing real-time trustworthiness analysis. The method comprises the steps of: receiving, by a speech data receiving module, speech data; delivering, by the speech data receiving module, the speech data to a speech analysis module; analyzing, by the speech analysis module, the speech data to identify one or more speech attributes; quantifying, by the speech analysis module, at least one of the speech attributes with an attribute score; and determining, by a trustworthiness determination module, a trustworthiness level based on the attribute score of the at least one of the speech attributes.
    Type: Grant
    Filed: July 14, 2023
    Date of Patent: April 9, 2024
    Inventor: Craig Hancock, Sr.
  • Patent number: 11908445
    Abstract: A method for proactive notifications in a voice interface device includes: receiving a first user voice request for an action with a future performance time; assigning the first user voice request to a voice assistant service for performance; subsequent to the receiving, receiving a second user voice request and in response to the second user voice request initiating a conversation with the user; and during the conversation: receiving a notification from the voice assistant service of performance of the action; triggering a first audible announcement to the user to indicate a transition from the conversation and interrupting the conversation; triggering a second audible announcement to the user to indicate performance of the action; and triggering a third audible announcement to the user to indicate a transition back to the conversation and rejoining the conversation.
    Type: Grant
    Filed: May 16, 2022
    Date of Patent: February 20, 2024
    Assignee: Google LLC
    Inventors: Kenneth Mixter, Daniel Colish, Tuan Nguyen
  • Patent number: 11904888
    Abstract: A system for controlling autonomously-controllable vehicle functions of an autonomous vehicle cooperating with partner subjects includes a database device with information on communication signals from partner subjects, action objectives, and scenarios, and has an autonomous vehicle with autonomously controllable vehicle functions communicatively connected to the database device. The autonomous vehicle includes a control device with a programmable unit and a surround sensor device. The control device receives sensor signals acquired by the surround sensor device of a surrounding area of the vehicle and communication signals originating from at least one partner subject. The control device determines a situation context based on the database information, and converts the captured communication signals into control signals for the autonomously controllable vehicle functions based on the situation context.
    Type: Grant
    Filed: October 27, 2022
    Date of Patent: February 20, 2024
    Assignee: Ford Global Technologies, LLC
    Inventors: Ahmed Benmimoun, Mohamed Benmimoun, Sufian Ashraf Mazhari, Mohsen Lakehal-Ayat, Muhammad Adeel Awan
  • Patent number: 11887020
    Abstract: A thermal load prediction method and apparatus. The method includes configuring multiple prediction states and corresponding error thresholds and forming a prediction model. The prediction model predicts first thermal load magnitudes respectively corresponding to multiple testing time periods, wherein a target steam user uses boiler steam in the multiple testing time periods. The method further includes determining, according to the first thermal load magnitudes, relative prediction errors respectively corresponding to the multiple testing time periods; forming a state transition probability matrix according to the relative prediction errors; and determining a state probability of each prediction state in each future time period of future time periods according to the state transition probability matrix.
    Type: Grant
    Filed: September 25, 2019
    Date of Patent: January 30, 2024
    Assignee: ENNEW Technology Co., Ltd.
    Inventors: Shengwei Liu, Xin Huang
  • Patent number: 11889026
    Abstract: Exemplary aspects involve a data-communications apparatus or system that communicates over a broadband network with a plurality of remotely-located data-communications circuits respectively associated with a plurality of remotely-situated client entities. The system includes a data-communications platform (e.g., UC-CC) that processes incoming data-communication interactions including different types of digitally-represented communications, among which are incoming calls, and that is integrated with a memory circuit including a database of information sets. Each of the information sets includes experience data corresponding to past incoming data-communication interactions processed by the platform, and with aggregated and organized data based on data collected in previous incoming interactions.
    Type: Grant
    Filed: August 18, 2022
    Date of Patent: January 30, 2024
    Assignee: 8x8, Inc.
    Inventors: Bryan R. Martin, Matt Taylor, Manu Mukerji
  • Patent number: 11863592
    Abstract: A method includes, at a media bridge configured to distribute a plurality of media streams among a plurality of client devices connected to the media bridge over a network, receiving the plurality of media streams from the plurality of client devices via the media bridge. The media bridge connects the plurality of client devices. The method further includes assigning a pair of names for each of the plurality of media streams. The pair of names includes a contribution name and a distribution name. The method further includes presenting a first list to the plurality of client devices. The first list includes a plurality of the distribution names for the plurality of media streams received from the plurality of client devices. The method further includes providing an indication of a current active speaker within the plurality of media streams via a signaling process.
    Type: Grant
    Filed: May 14, 2021
    Date of Patent: January 2, 2024
    Assignee: CISCO TECHNOLOGY, INC.
    Inventors: Jacques Samain, Giovanna Carofiglio, Giulio Grassi, Enrico Loparco, Michele Papalini
  • Patent number: 11829720
    Abstract: Systems and methods for analysis and validation of language models trained using data that is unavailable or inaccessible are provided. One example method includes, at an electronic device with one or more processors and memory, obtaining a first set of data corresponding to one or more tokens predicted based on one or more previous tokens. The method determines a probability that the first set of data corresponds to a prediction generated by a first language model trained using a user privacy preserving training process. In accordance with a determination that the probability is within a predetermined range, the method determines that the one or more tokens correspond to a prediction associated with the user privacy preserving training process and outputs a predicted token sequence including the one or more tokens and the one or more previous tokens.
    Type: Grant
    Filed: December 1, 2020
    Date of Patent: November 28, 2023
    Assignee: Apple Inc.
    Inventors: Jerome R. Bellegarda, Bishal Barman, Brent D. Ramerth
  • Patent number: 11790908
    Abstract: A voice command can be received from a user. One or more voice command devices (VCDs) that the voice command is targeting can be determined. A visual indicator of each of the one or more targeted VCDs can be displayed on an XR device worn by the user, wherein each visual indicator visually indicates a respective targeted VCD the voice command is directed to on the XR device.
    Type: Grant
    Filed: February 9, 2021
    Date of Patent: October 17, 2023
    Assignee: International Business Machines Corporation
    Inventors: Soma Shekar Naganna, Sarbajit K. Rakshit, Abhishek Seth, Matheen Ahmed Pasha
  • Patent number: 11776546
    Abstract: Techniques are described for providing information during a service session, using an intelligent agent. The intelligent agent executes as a process to monitor communications exchanged during a service session between an individual and a service representative (SR) within a service environment. The agent analyzes the communications to identify questions or other topics that are posed by the individual during the service session. The agent retrieves stored data related to such questions or other topics, and generates a message to address each question or other topic. The message is injected into the service session to be presented to the individual, to supplement the conversation that is taking place between the SR and the individual. In some implementations, the agent monitors the communications, generates the message, and/or injects the message into the service session at least partly autonomously of any explicit action taken by the SR.
    Type: Grant
    Filed: September 8, 2021
    Date of Patent: October 3, 2023
    Assignee: United Services Automobile Association (USAA)
    Inventors: Michael Waldmeier, Yuibi Fujimoto
  • Patent number: 11778102
    Abstract: A system and method providing an accessibility tool that enhances a graphical user interface of an online meeting application is described. In one aspect, a computer-implemented method performed by an accessibility tool (128) includes accessing (802), in real-time, audio data of a session of an online meeting application (120), identifying (804) a target user, a speaking user, and a task based on the audio data, the speaking user indicating the task assigned to the target user in the audio data, generating (806) a message (318) that identifies the speaking user, the target user, and the task, the message (318) including textual content, and displaying (808) the message (318) in a chat pane (906) of a graphical user interface (902) of the online meeting application (120) during the session.
    Type: Grant
    Filed: April 1, 2022
    Date of Patent: October 3, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Shahil Soni, Charles Yin-Che Lee
  • Patent number: 11748462
    Abstract: A method for authenticating a user of an electronic device is disclosed. The method comprises: responsive to detection of a trigger event indicative of a user interaction with the electronic device, generating an audio probe signal to play through an audio transducer of the electronic device; receiving a first audio signal comprising a response of the user's ear to the audio probe signal; receiving a second audio signal comprising speech of the user; and applying an ear biometric algorithm to the first audio signal and a voice biometric algorithm to the second audio signal to authenticate the user as an authorised user.
    Type: Grant
    Filed: December 7, 2020
    Date of Patent: September 5, 2023
    Assignee: Cirrus Logic Inc.
    Inventor: John Paul Lesso
  • Patent number: 11749264
    Abstract: Embodiments described herein provide methods and systems for training task-oriented dialogue (TOD) language models. In some embodiments, a TOD language model may receive a TOD dataset including a plurality of dialogues and a model input sequence may be generated from the dialogues using a first token prefixed to each user utterance and a second token prefixed to each system response of the dialogues. In some embodiments, the first token or the second token may be randomly replaced with a mask token to generate a masked training sequence and a masked language modeling (MLM) loss may be computed using the masked training sequence. In some embodiments, the TOD language model may be updated based on the MLM loss.
    Type: Grant
    Filed: November 3, 2020
    Date of Patent: September 5, 2023
    Assignee: Salesforce, Inc.
    Inventors: Chien-Sheng Wu, Chu Hong Hoi, Richard Socher, Caiming Xiong
  • Patent number: 11741984
    Abstract: An acoustic scene conversion method, comprising: receiving sound signals including user's speech and scenic sounds; processing the sound signals according to an artificial intelligence model to generate enhanced speech signals without scenic sounds; and mixing the enhanced speech signals with new scenic sounds to produce converted sound signals.
    Type: Grant
    Filed: June 1, 2021
    Date of Patent: August 29, 2023
    Assignee: ACADEMIA SINICA
    Inventors: Tsao Yu, Syu-Siang Wang, Szu-Wei Fu, Alexander Chao-Fu Kang, Hsin-Min Wang
  • Patent number: 11709654
    Abstract: The present disclosure generally relates to a computer-implemented system for intelligently retaining and recalling memory data. An exemplary method comprises receiving, via a microphone of an electronic device, a speech input of the user; receiving a text input of the user; constructing a first instance of a memory data structure based on the speech input; constructing a second instance of the memory data structure based on the text input; adding the first instance and the second instance of the memory data structure to a memory stack of the user; displaying a user interface for retrieving memory data of the user; receiving, via the user interface, a beginning of a statement from the user; retrieving a particular instance of the memory data structure from the memory stack based on the beginning of the statement; and automatically displaying a completion of the statement.
    Type: Grant
    Filed: May 19, 2022
    Date of Patent: July 25, 2023
    Assignee: Human AI Labs, Inc.
    Inventors: Suman Kanuganti, Xiaoran Zhang, Kristie Kaiser
  • Patent number: 11710484
    Abstract: An agent control device configured to execute a plurality of agents and including a processor, the processor being configured to store an interruptibility list that stipulates interruptibility of execution for each function of one given agent being executed or for an execution status of the one given agent; request execution of each of the agents at a prescribed trigger, or request execution of another given agent at a specific trigger; reference the interruptibility list in order to set permissibility information relating to executability of the other given agent in conjunction with execution of the one given agent; and perform management such that, in a case in which there is a request at the specific trigger for execution of the other given agent while the one given agent is executing, the other given agent is executed based on the request.
    Type: Grant
    Filed: April 8, 2021
    Date of Patent: July 25, 2023
    Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA
    Inventor: Satoshi Aihara
  • Patent number: 11706482
    Abstract: Provided is a display device including a display unit, a storage unit configured to store information on a web page, a microphone configured to receive a user's voice command, a network interface unit configured to perform communication with a natural language processing (NLP) server, and a controller configured to transmit text data of the voice command to the NLP server, to receive intention analysis result information corresponding to the voice command from the NLP server, to select, as a final candidate address, one of a plurality of candidate addresses related to a search word included in the received intention analysis result information if the search word is not stored in the storage unit, and to access a website corresponding to the selected final candidate address.
    Type: Grant
    Filed: February 20, 2018
    Date of Patent: July 18, 2023
    Assignee: LG ELECTRONICS INC.
    Inventors: Chulmin Son, Seunghyun Heo, Jaekyung Lee
  • Patent number: 11704460
    Abstract: Embodiments herein provide for reverse engineering of integrated circuits (ICs) for design verification. In example embodiments, an apparatus receives a gate-level netlist for an integrated circuit (IC), generates a list of equivalence classes related to signals included in the gate-level netlist, determines control signals of the gate-level netlist based at least in part on the list of equivalence classes, determines a logic flow of a finite state transducer (FST) based at least in part on the control signals, and generates register transfer level (RTL) source code for the IC based on the FST.
    Type: Grant
    Filed: June 9, 2021
    Date of Patent: July 18, 2023
    Assignee: UNIVERSITY OF FLORIDA RESEARCH FOUNDATION, INCORPORATED
    Inventors: Yier Jin, Shaojie Zhang, James Geist, Travis Meade, Jason Liam Portillo
  • Patent number: 11699429
    Abstract: An electronic system is provided. The electronic system includes a host and a display. The host includes an audio processing module, and a smart interpreter engine. The audio processing module is utilized for acquiring audio data corresponding to a first language from audio streams processed by an application program executed on the host. The application program executed on the host includes a specific game software. The smart interpreter engine is utilized for receiving the audio data corresponding to the first language from the audio processing module and converting the audio data corresponding to the first language into text data corresponding to a second language according to the game software executed on the host. The display is utilized for receiving the text data corresponding to the second language from the smart interpreter engine and displaying the text data corresponding to the second language.
    Type: Grant
    Filed: March 3, 2021
    Date of Patent: July 11, 2023
    Assignee: ACER INCORPORATED
    Inventors: Gianna Tseng, Szu-Ting Chou, Shang-Yao Lin, Shih-Cheng Huang
  • Patent number: 11699446
    Abstract: Methods and systems are disclosed herein for improving the quality of audio for use in a biometric. A biometric system may use machine learning to determine whether audio or a portion of the audio should be used as a biometric for a user. A sample of the user's voice may be used to generate a voice signature of the user. Portions of the audio that do not meet a similarity threshold when compared with the voice signature may be removed from the audio. Additionally or alternatively, interfering noises may be detected and removed from the audio to improve the quality of a voice biometric generated from the audio.
    Type: Grant
    Filed: May 19, 2021
    Date of Patent: July 11, 2023
    Assignee: Capital One Services, LLC
    Inventors: Bozhao Tan, Isabelle Alice Yvonne Moulinier, David Almquist, June Wu
  • Patent number: 11694032
    Abstract: The present disclosure relates to chatbot systems and, more particularly, to techniques for determining that an input utterance is representative of a task that a particular chatbot can perform, based on matching the input utterance to a template. Techniques are also described for generating templates based on example utterances that have been provided for a chatbot. In certain embodiments, an initial set of templates is generated based on example utterances. This initial set of templates is then refined using template generalization techniques, which can be performed at the word or sentence level to generate a final set of templates for use at runtime, when the templates are matched against user utterances. The final set of templates may include one or more generalized templates that were derived from the initial set of templates and may also include the initial set of templates.
    Type: Grant
    Filed: September 3, 2020
    Date of Patent: July 4, 2023
    Assignee: Oracle International Corporation
    Inventors: Stephen Andrew McRitchie, Sunghye Jeon
  • Patent number: 11687908
    Abstract: A payment button on a device capable of making telephone calls, such as a mobile phone, allows a payer to electronically transfer money while in a phone call with a payee. The payment button also allows a payee to initiate an electronic payment transaction while in a phone call with a payer. The payment button may be a clickable or tappable virtual button presented on a display of the phone when being used to make or receive a call. The payer or the payee can simply enter a payment amount on the phone to complete an electronic payment transaction. A notification of payment is instantly transmitted to the phones being used for the phone call, so that the parties can safely and conveniently conclude a purchase and/or payment transaction during one phone call.
    Type: Grant
    Filed: June 7, 2021
    Date of Patent: June 27, 2023
    Assignee: PAYPAL, INC.
    Inventors: Saumil Ashvin Gandhi, Ray Hideki Tanaka
  • Patent number: 11683632
    Abstract: An automatic speech recognition (ASR) triggering system, and a method of providing an ASR trigger signal, is described. The ASR triggering system can include a microphone to generate an acoustic signal representing an acoustic vibration and an accelerometer worn in an ear canal of a user to generate a non-acoustic signal representing a bone conduction vibration. A processor of the ASR triggering system can receive an acoustic trigger signal based on the acoustic signal and a non-acoustic trigger signal based on the non-acoustic signal, and combine the trigger signals to gate an ASR trigger signal. For example, the ASR trigger signal may be provided to an ASR server only when the trigger signals are simultaneously asserted. Other embodiments are also described and claimed.
    Type: Grant
    Filed: August 17, 2021
    Date of Patent: June 20, 2023
    Assignee: Apple Inc.
    Inventors: Sorin V. Dusan, Aram M. Lindahl, Robert D. Watson
  • Patent number: 11615781
    Abstract: A single audio-visual automated speech recognition model for transcribing speech from audio-visual data includes an encoder frontend and a decoder. The encoder includes an attention mechanism configured to receive an audio track of the audio-visual data and a video portion of the audio-visual data. The video portion of the audio-visual data includes a plurality of video face tracks each associated with a face of a respective person. For each video face track of the plurality of video face tracks, the attention mechanism is configured to determine a confidence score indicating a likelihood that the face of the respective person associated with the video face track includes a speaking face of the audio track. The decoder is configured to process the audio track and the video face track of the plurality of video face tracks associated with the highest confidence score to determine a speech recognition result of the audio track.
    Type: Grant
    Filed: October 2, 2020
    Date of Patent: March 28, 2023
    Assignee: Google LLC
    Inventor: Otavio Braga
  • Patent number: 11610586
    Abstract: A method includes receiving a speech recognition result, and using a confidence estimation module (CEM), for each sub-word unit in a sequence of hypothesized sub-word units for the speech recognition result: obtaining a respective confidence embedding that represents a set of confidence features; generating, using a first attention mechanism, a confidence feature vector; generating, using a second attention mechanism, an acoustic context vector; and generating, as output from an output layer of the CEM, a respective confidence output score for each corresponding sub-word unit based on the confidence feature vector and the acoustic feature vector received as input by the output layer of the CEM. For each of the one or more words formed by the sequence of hypothesized sub-word units, the method also includes determining a respective word-level confidence score for the word. The method also includes determining an utterance-level confidence score by aggregating the word-level confidence scores.
    Type: Grant
    Filed: February 23, 2021
    Date of Patent: March 21, 2023
    Assignee: Google LLC
    Inventors: David Qiu, Qiujia Li, Yanzhang He, Yu Zhang, Bo Li, Liangliang Cao, Rohit Prabhavalkar, Deepti Bhatia, Wei Li, Ke Hu, Tara Sainath, Ian Mcgraw
  • Patent number: 11595535
    Abstract: An information processing apparatus that is capable of reducing time and effort to set settings of a smart speaker that cooperates with the information processing apparatus when a user starts to use the smart speaker. The information processing apparatus acquires identification information of the user, and acquires audio control information associated with the acquired identification information. Then, the information processing apparatus requests the smart speaker to change the audio setting of the smart speaker based on the acquired audio control information.
    Type: Grant
    Filed: June 10, 2021
    Date of Patent: February 28, 2023
    Assignee: CANON KABUSHIKI KAISHA
    Inventor: Ryosuke Kasahara
  • Patent number: 11574638
    Abstract: A system and method are disclosed for generating a teleconference space for two or more communication devices using a computer coupled with a database and comprising a processor and memory. The computer generates a teleconference space and transmits requests to join the teleconference space to the two or more communication devices. The computer stores in memory identification information, and audiovisual data associated with one or more users, for each of the two or more communication devices. The computer stores audio transcription data, transmitted to the computer by each of the two or more communication devices and associated with one or more communication device users, in the computer memory. The computer merges the audio transcription data from each of the two or more communication devices into a master audio transcript, and transmits the master audio transcript to each of the two or more communication devices.
    Type: Grant
    Filed: May 9, 2022
    Date of Patent: February 7, 2023
    Assignee: Nextiva, Inc.
    Inventors: Tomas Gorny, Jean-Baptiste Martinoli, Tracy Conrad, Lukas Gorny
  • Patent number: 11562573
    Abstract: Aspects of the disclosure relate to training and using a phrase recognition model to identify phrases in images. As an example, a selected phrase list that includes a plurality of phrases is received. Each phrase of the plurality of phrases includes text. An initial plurality of images may be received. A training image set may be selected from the initial plurality of images by identifying the phrase-containing images that include one or more phrases from the selected phrase list. Each given phrase-containing image of the training image set may be labeled with information identifying the one or more phrases from the selected phrase list included in the given phrase-containing images. The model may be trained based on the training image set such that the model is configured to, in response to receiving an input image, output data indicating whether a phrase of the plurality of phrases is included in the input image.
    Type: Grant
    Filed: December 16, 2020
    Date of Patent: January 24, 2023
    Assignee: Waymo LLC
    Inventors: Victoria Dean, Abhijit S Ogale, Henrik Kretzschmar, David Harrison Silver, Carl Kershaw, Pankaj Chaudhari, Chen Wu, Congcong Li
  • Patent number: 11514787
    Abstract: In an information processing device, a first acquirer acquires, from a user, plan information including a scheduled time and a destination. A second acquirer acquires a spare time. A third acquirer acquires travelling schedule information for enabling arrival at the destination earlier than the scheduled time by the spare time or more. A display controller displays, on a display unit, information regarding the travelling schedule information and the spare time.
    Type: Grant
    Filed: August 1, 2019
    Date of Patent: November 29, 2022
    Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA
    Inventors: Koichi Suzuki, Makoto Akahane
  • Patent number: 11495234
    Abstract: A data mining device, and a speech recognition method and system using the same are disclosed. The speech recognition method includes selecting speech data including a dialect from speech data, analyzing and refining the speech data including a dialect, and learning an acoustic model and a language model through an artificial intelligence (AI) algorithm using the refined speech data including a dialect. The user is able to use a dialect speech recognition service which is improved using services such as eMBB, URLLC, or mMTC of 5G mobile communications.
    Type: Grant
    Filed: May 30, 2019
    Date of Patent: November 8, 2022
    Assignee: LG Electronics Inc.
    Inventors: Jee Hye Lee, Seon Yeong Park
  • Patent number: 11488598
    Abstract: The present disclosure relates to a display device. The display device includes a display; a signal receiver configured to receive a user's voice signal through at least one of a plurality of devices; and a processor configured to: display an image of at least one of a plurality of programs on the display by executing the plurality of programs, identify a program corresponding to a device receiving the voice signal among the plurality of programs based on matching information set by the user regarding a mutual correspondence between the plurality of programs and the plurality of devices, in response to the user's voice signal received through any one of the plurality of devices, and control the identified program to operate according to a user command corresponding to the received voice signal. Thereby, it is possible to control a control target program according to a user's intention by a voice command even if the user who inputs the voice command does not separately designate the control target program.
    Type: Grant
    Filed: January 3, 2019
    Date of Patent: November 1, 2022
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventor: Youngsoo Yun
  • Patent number: 11482229
    Abstract: A multimedia processing circuit is provided. The multimedia processing circuit includes a smart interpreter engine and an audio engine. The smart interpreter engine includes a speech to text converter, a natural language processing module and a translator. The speech to text converter is utilized for converting speech data into text data corresponding to the first language. The natural language processing module is utilized for converting the text data corresponding to the first language into glossary text data corresponding to the first language according to an application program being executed in a host. The application program comprises a specific game software. The translator is utilized for converting the glossary text data corresponding to the first language into text data corresponding to a second language. The audio engine is utilized for converting the speech data corresponding to the first language into an analog speech signal corresponding to the first language.
    Type: Grant
    Filed: May 26, 2020
    Date of Patent: October 25, 2022
    Assignee: ACER INCORPORATED
    Inventors: Gianna Tseng, Shih-Cheng Huang, Shang-Yao Lin, Szu-Ting Chou
  • Patent number: 11481036
    Abstract: A method for determining an electronic device, a system for determining an electronic device, a computer system, and a computer-readable storage medium are provided. The method includes: acquiring a recognition result by recognizing a first action performed by an operating object through a first electronic device (S201); and determining a second electronic device which is controllable by the first electronic device according to the recognition result (S202).
    Type: Grant
    Filed: April 12, 2019
    Date of Patent: October 25, 2022
    Assignees: Beijing JingDong ShangKe Information Technology Co., Ltd., Beijing Jingdong Century Trading Co., Ltd.
    Inventors: Yazhuo Wang, Yu Guan, Zhongfei Xu
  • Patent number: 11460979
    Abstract: According to an embodiment disclosed in the specification, a display device may include a microphone, a display displaying a screen including a plurality of layers, a memory storing a plurality of application programs, and at least one processor displaying a first user interface (UI) for interacting with a user on a first layer among the plurality of layers, displaying a second UI for displaying information obtained by performing the interaction on a second layer among the plurality of layers, and displaying an image at least partly overlapping with the first UI and the second UI on a third layer among the plurality of layers.
    Type: Grant
    Filed: December 28, 2018
    Date of Patent: October 4, 2022
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Jibum Moon, Jina Kwon, Kyerim Lee
  • Patent number: 11450312
    Abstract: A speech recognition method includes: obtaining speech information; and determining beginning and ending positions of a candidate speech segment in the speech information by using a weighted finite state transducer (WFST) network. The candidate speech segment is identified as corresponding to a preset keyword. The method also includes clipping the candidate speech segment from the speech information according to the beginning and ending positions of the candidate speech segment; detecting whether the candidate speech segment includes a preset keyword by using a machine learning model; and determining, upon determining that the candidate speech segment comprises the preset keyword, that the speech information comprises the preset keyword.
    Type: Grant
    Filed: June 12, 2020
    Date of Patent: September 20, 2022
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Shilun Lin, Xilin Zhang, Wenhua Ma, Bo Liu, Xinhui Li, Li Lu, Xiucai Jiang
  • Patent number: 11404064
    Abstract: An information processing apparatus includes a first detector, a textualization device, a second detector, a display device and a display controller. The first detector detects, from audio data in which speech of each person in a group composed of a plurality of persons has been recorded, each utterance made during the speech. The textualization device converts contents of each utterance detected by the first detector into text. The second detector detects predetermined keywords included in each utterance on the basis of text data obtained through textualization by the textualization device. The display controller causes the display device to display the predetermined keywords detected by the second detector.
    Type: Grant
    Filed: November 2, 2018
    Date of Patent: August 2, 2022
    Assignee: KYOCERA Document Solutions Inc.
    Inventors: Yuki Kobayashi, Nami Nishimura, Tomoko Mano
  • Patent number: 11403875
    Abstract: A processing method of face recognition includes steps of: extracting embedding feature information from a face image; outputting a recognition result of face recognition according to the embedding feature information, wherein the recognition result includes a recognized name and embedding feature distance information; determining whether the recognized name is in a list or not; if the recognized name is in the list, performing a removal checking step for determining whether to remove the recognition result based on the embedding feature distance information; if determining that the recognition result is not to be removed, displaying the recognized name; if determining that the recognition result is to be removed, displaying a negative prompt; and dynamically and instantly providing a feedback and updating a recognition method for the face recognition.
    Type: Grant
    Filed: November 20, 2020
    Date of Patent: August 2, 2022
    Assignee: ASKEY COMPUTER CORP.
    Inventors: Chien-Fang Chen, Setya Widyawan Prakosa, Huan-Ruei Shiu, Chien-Ming Lee
  • Patent number: 11393490
    Abstract: According to embodiments of the present disclosure, a method, apparatus, device, and computer readable storage medium for voice interaction are provided. The method includes: determining a text corresponding to the voice signal based on a voice feature of a received voice signal. The method further includes: determining, based on the voice feature and the text, a matching degree between a reference voice feature of an element in the text and a target voice feature of the element. The method further includes: determining a first possibility that the voice signal is an executable command based on the text. The method further includes: determining a second possibility that the voice signal is the executable command based on the voice feature.
    Type: Grant
    Filed: June 8, 2020
    Date of Patent: July 19, 2022
    Assignee: Baidu Online Network Technology (Beijing) Co., Ltd.
    Inventors: Zhijian Wang, Jinfeng Bai, Sheng Qian, Lei Jia
  • Patent number: 11364926
    Abstract: The invention relates to a method for operating a motor vehicle system of a motor vehicle regardless of the driving situation. The method is performed by a personalization device of the motor vehicle and includes identifying the driver of the motor vehicle and using the identity of the driver to determine multiple driver-specific configuration data sets. Each of the determined configuration data sets describes configuration data of a respective user profile of the identified driver in order to personalize the motor vehicle system. The method further includes determining at least one additional occupant in the motor vehicle and, using the result of the determination of the at least one occupant, determining an intention of the determined driver. The method further includes using the determined intention to select a personalization mode from a plurality of personalization modes.
    Type: Grant
    Filed: April 4, 2019
    Date of Patent: June 21, 2022
    Assignee: AUDI AG
    Inventors: Jürgen Lerzer, Nikoletta Sofra, Hans Georg Gruber, André Ebner, Ron Melz
  • Publication number: 20150127345
    Abstract: A computer-implemented method includes listening for audio name information indicative of a name of a computer, with the computer configured to listen for the audio name information in a first power mode that promotes a conservation of power; detecting the audio name information indicative of the name of the computer; after detection of the audio name information, switching to a second power mode that promotes a performance of speech recognition; receiving audio command information; and performing speech recognition on the audio command information.
    Type: Application
    Filed: September 30, 2011
    Publication date: May 7, 2015
    Inventors: Evan H. Parker, Michal R. Grabowski
  • Patent number: 8942980
    Abstract: A method of navigating in a sound content wherein at least one key word is stored in association with at least two positions representative of said key word in the sound content, and wherein the method comprises: a step of displaying a representation of the sound content; during playback of the sound content, a step of detecting a current extract representative of a key word stored at a first position; a step of determining at least one second extract representative of said key word and a second position as a function of the stored positions; and a step of highlighting the position of the extracts in the representation of the sound content. The invention also relates to a system adapted to implement the navigation method.
    Type: Grant
    Filed: February 11, 2011
    Date of Patent: January 27, 2015
    Assignee: Orange
    Inventors: Pascal Le Mer, Delphine Charlet, Marc Denjean, Antoine Gonot
  • Patent number: 8930576
    Abstract: The present invention is directed to a secure communication network that enables multi-point to multi-point proxy communication over the network. The network employs a smart server that establishes a secure communication link with each of a plurality of smart client devices deployed on local client networks. Each smart client device is in communication with a plurality of agent devices. A plurality of remote devices can access the smart server directly and communicate with an agent device via the secure communication link between the smart server and one of the smart client devices.
    Type: Grant
    Filed: July 11, 2014
    Date of Patent: January 6, 2015
    Assignee: KE2 Therm Solutions, Inc.
    Inventors: Steve Roberts, Cetin Sert
  • Publication number: 20140372119
    Abstract: In general, the subject matter described in this specification can be embodied in methods, systems, and program products for performing compounded text segmentation. Compounded text that is extracted from one or more search queries submitted to a search engine is received. The compounded text includes a plurality of individual words that are joined together without intervening spaces. An electronic dictionary including words is accessed. A data structure representing possible segmentations of the compounded text is generated based on whether words in the possible segmentations occur in the electronic dictionary. A data store comprising data associated with a same field of usage as the compounded text is accessed to determine a frequency of occurrence for possible segmentations of the data structure. A segmentation of the compounded text that is most probable based on the data is determined. A language model is trained using the determined segmentation of the compounded text.
    Type: Application
    Filed: September 28, 2009
    Publication date: December 18, 2014
    Inventors: Carolina Parada, Boulos Harb, Johan Schalkwyk
  • Patent number: 8850072
    Abstract: The present invention is directed to a secure communication network that enables multi-point to multi-point proxy communication over the network. The network employs a smart server that establishes a secure communication link with each of a plurality of smart client devices installed on local client networks. Each smart client device is in communication with a plurality of agent devices. A plurality of remote devices can access the smart server directly and communicate with agent devices via the secure communication link between the smart server and one of the smart client devices. This communication is enabled without complex configuration of firewall or network parameters by the user.
    Type: Grant
    Filed: July 25, 2013
    Date of Patent: September 30, 2014
    Assignee: KE2 Therm Solutions, Inc.
    Inventors: Steve Roberts, Cetin Sert
  • Publication number: 20140249813
    Abstract: A transcript interface for displaying a plurality of words of a transcript in a text editor can be provided and configured to receive a command to edit the transcript. Limited edits to data corresponding to the transcript can be made in response to commands received via the user interface module. For example, edits may be limited to selection of a single word in the text editor for editing via a given command. The edit may affect an adjacent word in some instances, such as when two adjacent words are merged. In some embodiments, data corresponding to the selected word of the transcript is changed to reflect the edit without changing data defining the relative timing of those words of the transcript that are not adjacent to the selected word.
    Type: Application
    Filed: December 1, 2008
    Publication date: September 4, 2014
    Applicant: Adobe Systems Incorporated
    Inventor: Steven Hoeg
  • Patent number: 8731609
    Abstract: A mobile device, such as a cellular telephone, includes a voice interface that includes one part that may not be specific to a particular carrier, and a second part that provides an interface to services that are specific to a carrier or to service or information providers that are not necessarily available with all carriers. A voice command interface provides easy access to the carrier services. The set of carrier services is optionally extendible by the carrier.
    Type: Grant
    Filed: August 9, 2011
    Date of Patent: May 20, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Daniel L. Roth, Chris Reiner, Mark Furnari, Jordan Cohen
  • Publication number: 20140129218
    Abstract: Computer-based speech recognition can be improved by recognizing words with an accurate accent model. In order to provide a large number of possible accents, while providing real-time speech recognition, a language tree data structure of possible accents is provided in one embodiment such that a computerized speech recognition system can benefit from choosing among accent categories when searching for an appropriate accent model for speech recognition.
    Type: Application
    Filed: November 6, 2012
    Publication date: May 8, 2014
    Applicant: Spansion LLC
    Inventors: Chen Liu, Richard Fastow
  • Publication number: 20140129217
    Abstract: Embodiments of the present invention include an apparatus, method, and system for calculating senone scores for multiple concurrent input speech streams. The method can include the following: receiving one or more feature vectors from one or more input streams; accessing the acoustic model one senone at a time; and calculating separate senone scores corresponding to each incoming feature vector. The calculation uses a single read access to the acoustic model for a single senone and calculates a set of separate senone scores for the one or more feature vectors, before proceeding to the next senone in the acoustic model.
    Type: Application
    Filed: November 6, 2012
    Publication date: May 8, 2014
    Inventor: Ojas A. Bapat
  • Publication number: 20140122086
    Abstract: Embodiments related to the use of depth imaging to augment speech recognition are disclosed. For example, one disclosed embodiment provides, on a computing device, a method including receiving depth information of a physical space from a depth camera, receiving audio information from one or more microphones, identifying a set of one or more possible spoken words from the audio information, determining a speech input for the computing device based upon comparing the set of one or more possible spoken words from the audio information and the depth information, and taking an action on the computing device based upon the speech input determined.
    Type: Application
    Filed: October 26, 2012
    Publication date: May 1, 2014
    Applicant: MICROSOFT CORPORATION
    Inventors: Jay Kapur, Ivan Tashev, Mike Seltzer, Stephen Edward Hodges