Speech Recognition Using Nonacoustical Features, E.g., Position Of The Lips, Etc. (EPO) Patents (Class 704/E15.041)
  • Patent number: 11948581
    Abstract: A smart interpreter engine is provided. The smart interpreter engine includes a speech to text converter, a natural language processing module and a translator. The speech to text converter is utilized for converting speech data corresponding to a first language into text data corresponding to the first language. The natural language processing module is utilized for converting the text data corresponding to the first language into glossary text data corresponding to the first language according to a game software. The translator is utilized for converting the glossary text data corresponding to the first language into text data corresponding to a second language.
    Type: Grant
    Filed: May 18, 2022
    Date of Patent: April 2, 2024
    Assignee: ACER INCORPORATED
    Inventors: Gianna Tseng, Shih-Cheng Huang, Shang-Yao Lin, Szu-Ting Chou
  • Patent number: 11941039
    Abstract: Systems and methods for transcription analysis of a recording are provided. The recording includes an audio and screenshot/video portion. The audio portion is transcribed using a machine learned model. Models may be selected by the recording quality and potentially accents or other speech patterns that are present. The transcription is then linked to the video/screen capture chronology, so that automatic scrolling, clip selection from the transcription, and searching to a video time are possible. There are improvements to user experience question generation, review of study results, and management of the study participants.
    Type: Grant
    Filed: January 12, 2023
    Date of Patent: March 26, 2024
    Assignee: USERZOOM TECHNOLOGIES, INC.
    Inventors: Xavier Mestres, Alfonso de la Nuez, Albert Recolons, Francesc del Castillo, Jordi Ibañez, Anna Barba, Andrew Jensen
  • Patent number: 11908482
    Abstract: This application provides a packet loss retransmission method, a computer-readable storage medium, and an electronic device. The packet loss retransmission method includes: obtaining a loudness corresponding to a target audio data packet; and in response to receiving a packet loss state indicating that the target audio data packet is lost, in accordance with a determination that the loudness corresponding to the target audio data packet meets a first threshold: retransmitting the target audio data packet. The technical solutions of this application may alleviate the problem of long data retransmission time, and improve data transmission efficiency.
    Type: Grant
    Filed: April 26, 2022
    Date of Patent: February 20, 2024
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventor: Junbin Liang
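The retransmission decision described in the abstract of patent 11908482 can be illustrated with a minimal sketch. The abstract does not say what "meets a first threshold" means, so this example assumes it means "at or above a loudness threshold"; the `AudioPacket` type, the `loudness_db` field, and the default threshold value are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class AudioPacket:
    seq: int            # sequence number of the packet
    loudness_db: float  # hypothetical loudness measure for the packet's audio


def should_retransmit(packet: AudioPacket, lost: bool,
                      threshold_db: float = -40.0) -> bool:
    """Retransmit only lost packets whose loudness meets the threshold.

    Assumption: "meets" is read as "is at or above"; quiet packets are skipped.
    """
    return lost and packet.loudness_db >= threshold_db
```

Under this reading, a lost packet that carries near-silence is not resent, which is one way the retransmission time could be reduced.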
  • Patent number: 11900935
    Abstract: Embodiments disclosed herein include security systems and methods for securing an electronic device. In some embodiments, the security system may include an electronic device including surveillance components, including a microphone and a camera. The security device may be secured to the electronic device in a way that blocks or interferes with one or more of these surveillance components. Additionally, the security device may be placed near or on the electronic device. The security device may also include security components to distort sounds from a sound source to be detected by the microphone. As a result, this may prevent third parties from at least remotely streaming or recording live audio and/or video from various audio and video recording components present on the electronic device.
    Type: Grant
    Filed: February 7, 2022
    Date of Patent: February 13, 2024
    Assignee: Insight Holding Group, Inc.
    Inventors: Kimberly Christopher, James Christopher, Ken Tirman
  • Patent number: 11900920
    Abstract: A sound pickup device includes a plurality of microphone elements, a sensitivity correcting unit that corrects a difference in sensitivity among the microphone elements by multiplying an output signal of each of the microphone elements by a gain. The sound pickup device also includes a target sound detecting unit that detects a voice of a speaker as a target sound, a sensitivity correction control unit that controls the gain based on a result of detecting the target sound, and a directivity synthesizing unit that picks up the target sound in a boosted manner using the output signals from the microphone elements of which difference in sensitivity is corrected. The sensitivity correction control unit updates the gain based on the output signals from the microphone elements if the voice of the speaker is detected and does not update the gain if the voice of the speaker is not detected.
    Type: Grant
    Filed: November 16, 2020
    Date of Patent: February 13, 2024
    Assignee: PANASONIC INTELLECTUAL PROPERTY CORPORATION OF AMERICA
    Inventors: Hiroki Furukawa, Shinichi Yuzuriha
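The gating rule in the abstract of patent 11900920 (update the per-microphone gain only while the speaker's voice is detected) can be sketched as follows. The abstract does not specify how the gain is computed, so this sketch assumes each gain is nudged toward the value that equalizes the microphone's level with the mean level across microphones; `update_gains`, `frame_levels`, and `step` are hypothetical names.

```python
from typing import List


def update_gains(gains: List[float], frame_levels: List[float],
                 voice_detected: bool, step: float = 0.1) -> List[float]:
    """Return new per-microphone gains for one audio frame.

    Gains are left unchanged when no voice is detected. Otherwise each gain
    moves toward (mean level / own level), an assumed equalization target.
    """
    if not voice_detected:
        return list(gains)
    reference = sum(frame_levels) / len(frame_levels)
    updated = []
    for gain, level in zip(gains, frame_levels):
        if level <= 0.0:
            updated.append(gain)  # no usable signal on this element
            continue
        target = reference / level
        updated.append(gain + step * (target - gain))
    return updated
```

Freezing the gains during non-speech frames keeps background noise from skewing the sensitivity correction.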
  • Patent number: 11817126
    Abstract: Methods and devices related to converting sign language are described. In an example, a method can include receiving, at a processing resource of a computing device via a radio of the computing device, first signaling including at least one of text data, audio data, or video data, or any combination thereof, converting, at the processing resource, at least one of the text data, the audio data, or the video data to data representing a sign language, generating, at the processing resource, different video data based at least in part on the data representing the sign language, wherein the different video data comprises instructions for display of a performance of the sign language, transmitting second signaling representing the different video data from the processing resource to a user interface, and displaying the performance of the sign language on the user interface in response to the user interface receiving the second signaling.
    Type: Grant
    Filed: April 20, 2021
    Date of Patent: November 14, 2023
    Assignee: Micron Technology, Inc.
    Inventors: Ariela E. Gruszka, Angela S. Parekh, Mandy W. Fortunati, Teresa M. Di Dio
  • Patent number: 11749073
    Abstract: Method of identifying actions of a shopper to account for taken items by the shopper in a cashierless checkout includes sampling a shopping environment using one or more video cameras to generate video features related to a shopper in connection to an item and sampling using one or more supplemental sensors to generate supplemental sensor feature data, receiving output of the sampled video and supplemental sensor features as feature inputs to a deep learning model used for making inferences related to the state of a scenario involving shopper action of taking the item into their possession or held and other actions including moving outside a zone initially associated with the item.
    Type: Grant
    Filed: December 16, 2019
    Date of Patent: September 5, 2023
    Inventors: Gary M. Zalewski, Albert S. Penilla
  • Patent number: 11706267
    Abstract: The present invention relates to a method for generating and causing display of a communication interface that facilitates the sharing of emotions through the creation of 3D avatars, and more particularly with the creation of such interfaces for displaying 3D avatars for use with mobile devices, cloud based systems and the like.
    Type: Grant
    Filed: May 24, 2022
    Date of Patent: July 18, 2023
    Assignee: Snap Inc.
    Inventors: Jesse Chand, Jeremy Voss
  • Patent number: 11704983
    Abstract: Disclosed are methods to manipulate a given parametrized haptic curve in order to yield a smooth phase function for each acoustic transducer which minimizes unwanted parametric audio. Further, the impulse response of a haptic system describes the behavior of the system over time and can be convolved with a given input to simulate a response to that input. To produce a specific response, a deconvolution with the impulse response is necessary to generate an input.
    Type: Grant
    Filed: December 21, 2018
    Date of Patent: July 18, 2023
    Assignee: ULTRAHAPTICS IP LTD
    Inventors: Benjamin John Oliver Long, Brian Kappus
  • Patent number: 11682398
    Abstract: An electronic apparatus includes: a communication device configured to receive a signal from each of a plurality of acceleration sensors attached to a face of a user; a memory configured to store a classification learning model that classifies words based on a plurality of sensor output values; and a processor configured to determine a word corresponding to a mouth shape of the user by inputting a value of the received signal to the classification learning model, when the signal is received from each of the plurality of acceleration sensors.
    Type: Grant
    Filed: November 16, 2020
    Date of Patent: June 20, 2023
    Assignee: INDUSTRY-UNIVERSITY COOPERATION FOUNDATION HANYANG UNIVERSITY
    Inventors: Chang-Hwan Im, Jinuk Kwon, Younsoo Chae
  • Patent number: 11601750
    Abstract: According to examples, an apparatus may include a processor and a non-transitory computer readable medium on which are stored instructions that the processor may execute to access an audio signal captured by a microphone of a user's speech while the microphone is in a muted state. The processor may also execute the instructions to analyze a spectral or frequency content of the accessed audio signal to determine whether the user was facing the microphone while the user spoke. In addition, based on a determination that the user was facing the microphone while the user spoke, the processor may execute the instructions to unmute the microphone.
    Type: Grant
    Filed: December 17, 2018
    Date of Patent: March 7, 2023
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Srikanth Kuthuru, Sunil Bharitkar
  • Patent number: 11561760
    Abstract: An electronic device for changing a voice of a personal assistant function, and a method therefor, are provided. The electronic device includes a display, a transceiver, a processor, and a memory for storing commands executable by the processor. The processor is configured to: based on receiving a user command requesting acquisition of a voice data feature of a person included in media content displayed on the display, control the display to display information of a person; based on receiving a user input selecting one item of the information of a person, acquire voice data corresponding to an utterance of a person related to the selected information; acquire a voice data feature from the acquired voice data; and control the transceiver to transmit the acquired voice data feature to a server.
    Type: Grant
    Filed: June 10, 2019
    Date of Patent: January 24, 2023
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Jaehong Kim, Sangkyung Lee, Jihak Jung
  • Patent number: 11551681
    Abstract: Devices and techniques are generally described for a speech processing routing architecture. In various examples, first data comprising a first feature definition is received. The first feature definition may include a first indication of first source data and first instructions for generating feature data using the first source data. In various examples, the feature data may be generated according to the first feature definition. In some examples, a speech processing system may receive a first request to process a first utterance. The feature data may be retrieved from a non-transitory computer-readable memory. The speech processing system may determine a first skill for processing the first utterance based at least in part on the feature data.
    Type: Grant
    Filed: December 13, 2019
    Date of Patent: January 10, 2023
    Assignee: AMAZON TECHNOLOGIES, INC.
    Inventors: Rajesh Kumar Pandey, Ruhi Sarikaya, Shubham Katiyar, Arun Kumar Thenappan, Isaac Joseph Madwed, Jihwan Lee, David Thomas, Julia Kennedy Nemer, Mohamed Farouk AbdelHady, Joe Pemberton, Young-Bum Kim, Arima Vu Ram Thayumanavar, Wangyao Ge
  • Patent number: 11516570
    Abstract: Implementations of the subject matter described herein provide a silent voice input solution without being noticed by surroundings. Compared with conventional voice input solutions which are based on normal speech or whispering, the proposed “silent” voice input method is performed by using ingressive voice during the user's breathing-in process. By placing the apparatus very close to the user's mouth with an ultra-small gap formed between the microphone and the apparatus, the proposed silent voice input solution can realize a very small voice leakage, thereby allowing the user to use ultra-low voice speech input in public and mobile situations, without disturbing surrounding people.
    Type: Grant
    Filed: July 1, 2021
    Date of Patent: November 29, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventor: Masaaki Fukumoto
  • Patent number: 11488596
    Abstract: A method for recording audio content in a group conversation among a plurality of members includes: controlling an image capturing device to continuously capture images of the members; executing an image processing procedure on the images of the members to determine whether a specific gesture is detected; when the determination is affirmative, controlling an audio recording device to activate and perform directional audio collection with respect to a direction that is associated with the specific gesture to record audio data; and controlling a data storage to store the audio data and a time stamp associated with the audio data as an entry of conversation record.
    Type: Grant
    Filed: April 27, 2020
    Date of Patent: November 1, 2022
    Inventor: Hsiao-Han Chen
  • Patent number: 10878226
    Abstract: In an approach, a computer determines, based at least in part on a video of an attendee of a video conference, a first sentiment of the attendee, wherein the first sentiment includes at least a sentiment from a sentiment analysis of one or more facial expressions of the attendee and a sentiment from a sentiment analysis of a plurality of the attendee's spoken words. The approach includes a computer receiving an indication of an attendee activity in at least a first application in computing devices accessed by the attendee and determining whether the first sentiment of the attendee is related to the video conference based, in part, on the attendee activity in at least the first application. Responsive to determining that the first sentiment of the attendee is not related to the video conference, the computer discards the first sentiment that is unrelated to the video conference.
    Type: Grant
    Filed: March 8, 2017
    Date of Patent: December 29, 2020
    Assignee: International Business Machines Corporation
    Inventors: Hernan A. Cunico, Asima Silva
  • Patent number: 9898170
    Abstract: An approach is provided for automatically generating user-specific interaction modes for processing question and answers at the information handling system by receiving a question from a user, extracting user context parameters identifying a usage scenario for the user, identifying first input and output presentation modes for the user based on the extracted user context parameters, monitoring user interaction with the system in relation to the question, and adjusting the first input and output presentation modes based on the extracted user context parameters and detected user interaction with the system.
    Type: Grant
    Filed: December 10, 2014
    Date of Patent: February 20, 2018
    Assignee: International Business Machines Corporation
    Inventors: John P. Bufe, Donna K. Byron, Mary D. Swift, Timothy Winkler
  • Patent number: 8635066
    Abstract: Methods, systems, and articles are described herein for receiving an audio input and a facial image sequence for a period of time, in which the audio input includes speech input from multiple speakers. The speech input of a particular speaker is extracted from the audio input based on the received facial image sequence.
    Type: Grant
    Filed: April 14, 2010
    Date of Patent: January 21, 2014
    Assignee: T-Mobile USA, Inc.
    Inventor: Andrew R. Morrison
  • Publication number: 20110099013
    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for improving speech recognition accuracy using textual context. The method includes retrieving a recorded utterance, capturing text from a device display associated with the spoken dialog and viewed by one party to the recorded utterance, and identifying words in the captured text that are relevant to the recorded utterance. The method further includes adding the identified words to a dynamic language model, and recognizing the recorded utterance using the dynamic language model. The recorded utterance can be a spoken dialog. A time stamp can be assigned to each identified word. The method can include adding identified words to and/or removing identified words from the dynamic language model based on their respective time stamps. A screen scraper can capture text from the device display associated with the recorded utterance. The device display can contain customer service data.
    Type: Application
    Filed: October 23, 2009
    Publication date: April 28, 2011
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Dan MELAMED, Srinivas Bangalore, Michael Johnston
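The time-stamped vocabulary handling described in the abstract of publication 20110099013 (adding identified words to, and removing them from, a dynamic language model based on their time stamps) can be sketched in a few lines. This is only an illustration of the bookkeeping, not of a language model itself; the `DynamicVocabulary` class, its method names, and the age-based removal rule are assumptions.

```python
class DynamicVocabulary:
    """Tracks words captured from a display, each with its latest time stamp."""

    def __init__(self) -> None:
        self._last_seen = {}  # word -> most recent time stamp (seconds)

    def add(self, word: str, timestamp: float) -> None:
        key = word.lower()
        previous = self._last_seen.get(key, timestamp)
        self._last_seen[key] = max(previous, timestamp)

    def expire(self, now: float, max_age: float) -> None:
        """Drop words whose time stamp is older than max_age (assumed rule)."""
        self._last_seen = {
            word: seen for word, seen in self._last_seen.items()
            if now - seen <= max_age
        }

    def words(self) -> set:
        return set(self._last_seen)
```

For example, a word scraped from a customer service screen at the start of a call could be dropped from the active vocabulary once the screen has changed and enough time has passed.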
  • Publication number: 20100250250
    Abstract: A hybrid text generator is disclosed that generates a hybrid text string from multiple text strings that are produced from an audio input by multiple automated speech recognition systems. The hybrid text generator receives metadata that describes the time-location at which each word from the multiple text strings is located in the audio input. The hybrid text generator matches words between the multiple text strings using the metadata and generates a hybrid text string that includes the matched words. The hybrid text generator utilizes confidence scores associated with words that do not match between the multiple text strings to determine whether to add an unmatched word to the hybrid text string.
    Type: Application
    Filed: March 29, 2010
    Publication date: September 30, 2010
    Inventor: Jonathan Wiggs
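The matching step in the abstract of publication 20100250250 can be sketched for the two-recognizer case. The abstract does not give the matching tolerance or the confidence rule, so this example assumes that words match when their text is identical and their start times differ by at most `tolerance` seconds, and that an unmatched word is kept when its confidence is at least `min_confidence`. The `Word` type and both parameters are hypothetical.

```python
from typing import List, NamedTuple


class Word(NamedTuple):
    text: str
    start: float       # time-location of the word in the audio, in seconds
    confidence: float  # recognizer confidence in [0, 1]


def hybrid_text(a: List[Word], b: List[Word],
                tolerance: float = 0.05, min_confidence: float = 0.8) -> str:
    """Merge two recognizer outputs into one hybrid text string."""
    chosen = []
    used_b = set()
    for wa in a:
        match = None
        for i, wb in enumerate(b):
            same_time = abs(wb.start - wa.start) <= tolerance
            if i not in used_b and wb.text == wa.text and same_time:
                match = i
                break
        if match is not None:
            used_b.add(match)   # matched words are always included
            chosen.append(wa)
        elif wa.confidence >= min_confidence:
            chosen.append(wa)   # unmatched word kept on confidence alone
    for i, wb in enumerate(b):
        if i not in used_b and wb.confidence >= min_confidence:
            chosen.append(wb)
    chosen.sort(key=lambda w: w.start)
    return " ".join(w.text for w in chosen)
```

This sketch does not resolve the case where both recognizers emit different high-confidence words at the same time-location; both would be kept.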
  • Publication number: 20100185447
    Abstract: Embodiments are provided for selecting and utilizing multiple recognizers to process an utterance based on a markup language document. The markup language document and an utterance are received in a computing device. One or more recognizers are selected from among the multiple recognizers for returning a results set for the utterance based on markup language in the markup language document. The results set is received from the one or more selected recognizers in a format determined by a processing method specified in the markup language document. An event is then executed on the computing device in response to receiving the results set.
    Type: Application
    Filed: January 22, 2009
    Publication date: July 22, 2010
    Applicant: Microsoft Corporation
    Inventors: Andrew K. Krumel, Pierre-Alexandre F. Masse, Joseph A. Ruff
  • Publication number: 20090326941
    Abstract: A speech recognition circuit comprises an input buffer for receiving processed speech parameters. A lexical memory contains lexical data for word recognition. The lexical data comprises a plurality of lexical tree data structures. Each lexical tree data structure comprises a model of words having common prefix components. An initial component of each lexical tree structure is unique. A plurality of lexical tree processors are connected in parallel to the input buffer for processing the speech parameters in parallel to perform parallel lexical tree processing for word recognition by accessing the lexical data in the lexical memory. A results memory is connected to the lexical tree processors for storing processing results from the lexical tree processors and lexical tree identifiers to identify lexical trees to be processed by the lexical tree processors.
    Type: Application
    Filed: September 4, 2009
    Publication date: December 31, 2009
    Inventor: Mark Catchpole
  • Publication number: 20090171662
    Abstract: The performance of traditional speech recognition systems (as applied to information extraction or translation) decreases significantly with larger domain size and scarce training data, as well as under noisy environmental conditions. This invention mitigates these problems through the introduction of a novel predictive feature extraction method which combines linguistic and statistical information for representation of information embedded in a noisy source language. The predictive features are combined with text classifiers to map the noisy text to one of the semantically or functionally similar groups. The features used by the classifier can be syntactic, semantic, and statistical.
    Type: Application
    Filed: December 27, 2007
    Publication date: July 2, 2009
    Applicant: SEHDA, INC.
    Inventors: Jun Huang, Yookyung Kim, Youssef Billawala, Farzad Ehsani, Demitrios Master
  • Publication number: 20090048838
    Abstract: Provided is a system and method for building and managing a customized voice of an end-user, comprising the steps of designing a set of prompts for collection from the user, wherein the prompts are selected both by an analysis tool and by the user's own choosing to capture voice characteristics unique to the user. The prompts are delivered to the user over a network to allow the user to save a user recording on a server of a service provider. This recording is then retrieved and stored on the server and then set up on the server to build a voice database using text-to-speech synthesis tools. A graphical interface allows the user to continuously refine the data file to improve the voice and customize parameter and configuration settings, thereby forming a customized voice database which can be deployed or accessed.
    Type: Application
    Filed: May 29, 2008
    Publication date: February 19, 2009
    Inventors: Craig F. Campbell, Kevin A. Lenzo, Alexandre D. Cox
  • Publication number: 20090006093
    Abstract: A speaker recognition system generates a codebook store with codebooks representing voice samples of speakers, referred to as trainers. The speaker recognition system may use multiple classifiers and generate a codebook store for each classifier. Each classifier uses a different set of features of a voice sample as its features. A classifier inputs a voice sample of a person and tries to authenticate or identify the person. A classifier generates a sequence of feature vectors for the input voice sample and then a code vector for that sequence. The classifier uses its codebook store to recognize the person. The speaker recognition system then combines the scores of the classifiers to generate an overall score. If the score satisfies a recognition criterion, then the speaker recognition system indicates that the voice sample is from that speaker.
    Type: Application
    Filed: June 29, 2007
    Publication date: January 1, 2009
    Applicant: Microsoft Corporation
    Inventor: Amitava Das
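The final step in the abstract of publication 20090006093 (combining per-classifier scores into an overall score and checking it against a recognition criterion) can be sketched as below. The abstract does not state how the scores are combined, so a weighted average and a fixed threshold are assumed here; `combine_scores`, `is_recognized`, and the default threshold are hypothetical.

```python
from typing import List, Optional


def combine_scores(scores: List[float],
                   weights: Optional[List[float]] = None) -> float:
    """Weighted average of per-classifier scores (an assumed fusion rule)."""
    if weights is None:
        weights = [1.0] * len(scores)
    total_weight = sum(weights)
    return sum(s * w for s, w in zip(scores, weights)) / total_weight


def is_recognized(scores: List[float], threshold: float = 0.75) -> bool:
    """Assumed recognition criterion: overall score at or above a threshold."""
    return combine_scores(scores) >= threshold
```

Each entry in `scores` stands for the output of one classifier that compared the input voice sample against its own codebook store.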
  • Publication number: 20080300880
    Abstract: This application discloses a multi-lingual output device for output of transactional information for a given customer. The device includes a database for determining what transaction information needs to be outputted, the local language in which the information is to be outputted, and the preferred language of the customer in which the information is to be outputted; and a local transaction subsystem in communication with said database, wherein said local transaction subsystem includes input device receiving means for accepting an input device and output generating means for generating a signal to an output device.
    Type: Application
    Filed: January 22, 2008
    Publication date: December 4, 2008
    Inventor: Lawrence Stephen Gelbman
  • Publication number: 20080103758
    Abstract: A method for language translation of a toolkit menu is provided, which includes receiving, by a Subscriber Identity Module (SIM) toolkit module, the toolkit menu from a SIM card module, determining, by the SIM toolkit module, whether a language of the toolkit menu matches a user-defined language, and translating, by the SIM toolkit module, the language of the toolkit menu into the user-defined language, if the language of the toolkit menu is different from the user-defined language.
    Type: Application
    Filed: October 25, 2007
    Publication date: May 1, 2008
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventor: Suraparaju VENKATESWARLU
  • Patent number: RE42868
    Abstract: A method and apparatus accesses a database where entries are linked to at least two sets of patterns. One or more patterns of a first set of patterns are recognized within a received signal. The recognized patterns are used to identify entries and compile a list of patterns in a second set of patterns to which those entries are also linked. The list is then used to recognize a second received signal. The received signals may, for example, be voice signals or signals indicating the origin or destination of the received signals.
    Type: Grant
    Filed: October 25, 1995
    Date of Patent: October 25, 2011
    Assignee: Cisco Technology, Inc.
    Inventors: David J. Attwater, Steven J. Whittaker, Francis J. Scahill, Alison D. Simons