Endpoint Detection Patents (Class 704/253)
  • Patent number: 11790931
    Abstract: A first VAD system outputs a pulse stream for zero crossings in an audio signal. The pulse density of the pulse stream is evaluated to identify speech. The audio signal may have noise added to it before evaluating zero crossings. A second VAD system rectifies each audio signal sample and processes each rectified sample by updating a first statistic and evaluating the rectified sample per a first threshold condition that is a function of the first statistic. Rectified samples meeting the first threshold condition may be used to update a second statistic and the rectified sample evaluated per a second threshold condition that is a function of the second statistic. Rectified samples meeting the second threshold condition may be used to update a third statistic. The audio signal sample may be selected as speech if the second statistic is less than a downscaled third statistic.
    Type: Grant
    Filed: October 27, 2020
    Date of Patent: October 17, 2023
    Assignee: Ambiq Micro, Inc.
    Inventor: Roger David Serwy
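    A minimal sketch, assuming float audio samples in a NumPy array, of the pulse-density idea described by the first VAD in 11790931 above. The frame length, density threshold, dither level, and the direction of the comparison are illustrative assumptions, not Ambiq's implementation.
    ```python
    import numpy as np

    def pulse_density_vad(x, frame_len=256, density_threshold=0.12, dither_std=1e-4, seed=0):
        """Toy VAD: emit a pulse at each zero crossing and flag frames whose
        pulse density crosses a hypothetical threshold."""
        rng = np.random.default_rng(seed)
        x = x + rng.normal(0.0, dither_std, size=x.shape)      # optional added noise, per the abstract
        pulses = np.signbit(x[:-1]) != np.signbit(x[1:])       # pulse stream: True at each zero crossing
        decisions = []
        for start in range(0, len(pulses) - frame_len + 1, frame_len):
            density = pulses[start:start + frame_len].mean()   # pulses per sample in this frame
            decisions.append(bool(density > density_threshold))
        return decisions
    ```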
  • Patent number: 11790895
    Abstract: An electronic device for natural language understanding includes at least one memory and at least one processor coupled to the at least one memory. The at least one processor is configured to process an utterance using a trained model. The at least one processor is also configured to replace a first portion of the utterance with a first token, where the first token represents a semantic role of the first portion of the utterance based on a slot vocabulary. The at least one processor is further configured to determine a slot value in the utterance based on the first token. In addition, the at least one processor is configured to perform a task corresponding to the utterance based on the determined slot value.
    Type: Grant
    Filed: October 23, 2019
    Date of Patent: October 17, 2023
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Avik Ray, Hongxia Jin
  • Patent number: 11386134
    Abstract: Systems and methods for correcting a voice query based on a subsequent voice query with a lower pronunciation rate. In some aspects, the systems and methods calculate first and second pronunciation rates of first and second voice queries. The systems and methods determine that the second pronunciation rate is lower than the first pronunciation rate and determine a first candidate pronunciation time for a first candidate word from the first voice query. The systems and methods determine a second candidate pronunciation time, adjusted to the first pronunciation rate, for the second candidate word from the second voice query. The systems and methods determine that the first candidate pronunciation time matches the second candidate pronunciation time and generate a third voice query based on the first voice query by replacing the first candidate word with the second candidate word.
    Type: Grant
    Filed: July 7, 2020
    Date of Patent: July 12, 2022
    Assignee: Rovi Guides, Inc.
    Inventor: Arun Sreedhara
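    A toy rendering of the rate-matching step in 11386134 above. Representing each query as (word, duration) pairs, the matching tolerance, and the single-pass replacement are assumptions for illustration, not the patented procedure.
    ```python
    def correct_first_query(first_words, second_words, tolerance_s=0.15):
        """first_words, second_words: lists of (word, duration_in_seconds)."""
        rate1 = len(first_words) / sum(d for _, d in first_words)    # pronunciation rate, words/sec
        rate2 = len(second_words) / sum(d for _, d in second_words)
        corrected = [w for w, _ in first_words]
        if rate2 >= rate1:                      # second query not spoken more slowly: nothing to correct
            return corrected
        for w2, d2 in second_words:
            d2_at_rate1 = d2 * rate2 / rate1    # rescale the slowly spoken word to the first query's rate
            for i, (w1, d1) in enumerate(first_words):
                if abs(d1 - d2_at_rate1) <= tolerance_s and w1 != w2:
                    corrected[i] = w2           # replace the time-matched but differing word
                    break
        return corrected
    ```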
  • Patent number: 11295761
    Abstract: The present disclosure discloses a method for constructing a voice detection model and a voice endpoint detection system, and belongs to the technical field of voice recognition. In the method for constructing a voice detection model according to the present disclosure, audio data is first collected and a mixed voice is synthesized, feature extraction is performed on the mixed voice to obtain a 62-dimensional feature, and then the 62-dimensional feature is input to a recurrent neural network (RNN) model for training to obtain a voice detection model. The voice endpoint detection system according to the present disclosure includes a collecting unit, a calculating unit, a transmitting unit, and a terminal, the collecting unit being electrically connected to the calculating unit, and the calculating unit and the terminal being respectively connected to the transmitting unit. The voice detection model can be applied to a real-time conference communication device.
    Type: Grant
    Filed: May 11, 2020
    Date of Patent: April 5, 2022
    Assignee: YEALINK (XIAMEN) NETWORK TECHNOLOGY CO., LTD.
    Inventors: Zehuang Fang, Yuanxun Kang, Wanjian Feng
  • Patent number: 11240603
    Abstract: A speaker apparatus includes a measuring unit configured to measure a first distance to a detection object, the detection object existing on one side in a vertical direction substantially orthogonal to a sound radiation direction, and a second distance to a detection object, the detection object existing on another side in the vertical direction.
    Type: Grant
    Filed: February 5, 2018
    Date of Patent: February 1, 2022
    Assignee: SONY CORPORATION
    Inventor: Fumito Yoshida
  • Patent number: 11195535
    Abstract: A voice recognition device includes a memory and a processor including hardware. The processor is configured to extract a feature of input voice data and set a duration of a silent state after transition of the voice data to the silent state. The duration is used for determining that an input of the voice data is completed.
    Type: Grant
    Filed: September 6, 2019
    Date of Patent: December 7, 2021
    Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA
    Inventor: Fumio Wada
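    Illustrative only: a frame-energy endpointer in the spirit of 11195535 above, with the required trailing-silence duration as a plain parameter (the patent sets it from features of the input voice data); the threshold values are made up.
    ```python
    import numpy as np

    def utterance_complete(frames, energy_threshold=1e-3, required_silence_frames=30):
        """frames: iterable of NumPy arrays, one per audio frame.
        Returns True once enough consecutive low-energy frames follow speech."""
        seen_speech = False
        silent_run = 0
        for frame in frames:
            if float(np.mean(frame ** 2)) >= energy_threshold:
                seen_speech = True
                silent_run = 0
            elif seen_speech:
                silent_run += 1
                if silent_run >= required_silence_frames:
                    return True        # input judged complete after the configured silence
        return False
    ```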
  • Patent number: 11100913
    Abstract: An always-listening-capable decoupled cap for a computing device having a communication module is disclosed, comprising an electronic sensor and a gatekeeping module. All data received by the communications module based on data from the electronic sensor passes through the gatekeeping module while a gatekeeping function is disabled, no data based on data from the electronic sensor passes through the communications module while the gatekeeping function is enabled, all data input to the gatekeeping module is received via an exclusive input lead from the electronic sensor, and all data output from the gatekeeping module is transmitted via an exclusive output lead to a component other than the electronic sensor. The cap interferes with a function of the computing device, determines that user input recorded by the first electronic sensor comprises a first input content, and changes a nature of interference with the function of the computing device.
    Type: Grant
    Filed: November 21, 2018
    Date of Patent: August 24, 2021
    Inventor: Thomas Stachura
  • Patent number: 11062055
    Abstract: A computer device and method for managing privilege delegation to control execution of commands on files on the computer device is described. An agent plugin intercepts a request in a user account of a logged-in user to execute a command therein on a file having first privileges assigned thereto, wherein the agent plugin is provided for the file. The agent plugin obtains information related to the request and forwards the information to an agent service cooperating with an operating system of the computer device. The agent service determines whether to execute the command on the file in the user account according to second privileges different from the first privileges. The agent service launches an agent proxy process having the second privileges assigned thereto by the agent service if it is determined to execute the command on the file in the user account according to the second privileges.
    Type: Grant
    Filed: September 26, 2018
    Date of Patent: July 13, 2021
    Assignee: AVECTO LIMITED
    Inventor: John Goodridge
  • Patent number: 11004448
    Abstract: The present disclosure provides a method and a device for recognizing a text segmentation position. The method includes: receiving a continuous voice message inputted by a user, and recognizing the continuous voice message to generate a text message corresponding to the continuous voice message; analyzing the text message to determine an interval position, and sequentially inserting a sentence end and sentence begin sign at each interval position; calculating a segmentation value corresponding to the sentence end and sentence begin sign inserted at a present interval position according to a preset algorithm; and determining whether the segmentation value is greater than a preset threshold, and determining the present interval position as a segmentation position when the segmentation value is greater than the preset threshold.
    Type: Grant
    Filed: June 20, 2018
    Date of Patent: May 11, 2021
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Sheng Qian, Qiang Cheng
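    A hedged sketch of the thresholding flow in 11004448 above. The scoring function stands in for the patent's unspecified "preset algorithm", and the demo scorer is purely hypothetical.
    ```python
    def find_segmentation_positions(tokens, score_fn, threshold=0.5):
        """tokens: recognized words; score_fn(prefix, suffix) -> value estimating how
        well a sentence-end/sentence-begin sign fits between prefix and suffix."""
        positions = []
        for i in range(1, len(tokens)):                # each interval position
            value = score_fn(tokens[:i], tokens[i:])   # segmentation value for this position
            if value > threshold:
                positions.append(i)                    # keep positions that pass the preset threshold
        return positions

    # Usage with a trivial stand-in scorer that favors boundaries after longer prefixes.
    demo_score = lambda prefix, suffix: min(len(prefix) / 8.0, 1.0)
    print(find_segmentation_positions("turn on the light then play some music".split(), demo_score))
    ```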
  • Patent number: 10978055
    Abstract: In an information processing apparatus, an acquisition unit acquires speech information on a user's speech. A derivation unit derives a level of understanding of an intent of the user's speech based on the speech information acquired by the acquisition unit. A display control unit causes a display unit to display an image corresponding to the level of understanding derived by the derivation unit. An output unit outputs a response to the user's speech based on the speech information acquired by the acquisition unit.
    Type: Grant
    Filed: February 12, 2019
    Date of Patent: April 13, 2021
    Assignee: Toyota Jidosha Kabushiki Kaisha
    Inventors: Chikage Kubo, Takuji Yamada
  • Patent number: 10949621
    Abstract: A method, system, and computer program product for identifying intra sentence vague components includes: receiving a first sentence, assigning tokens to each component of the first sentence, comparing a total number of tokens of the first sentence to a threshold number of tokens, in response to determining that the total number of tokens is less than the threshold number of tokens, identifying one or more relevant complete sentences from a database, determining a best sentence from the one or more relevant complete sentences, and determining a vague component of the first sentence by comparing the best sentence to the first sentence.
    Type: Grant
    Filed: March 28, 2019
    Date of Patent: March 16, 2021
    Assignee: International Business Machines Corporation
    Inventors: Anil Manohar Omanwar, Anand Shantilal Borse, Gandhi Sivakumar, Lalit Agarwalla
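    A toy illustration of the flow in 10949621 above; the shared-token relevance measure and the token threshold are placeholders rather than IBM's method.
    ```python
    def find_vague_component(sentence, reference_sentences, token_threshold=6):
        """Returns (vague_tokens, best_reference) for a short sentence, or None if the
        sentence already meets the token threshold."""
        tokens = sentence.split()
        if len(tokens) >= token_threshold:
            return None                               # treated as complete; nothing to resolve
        def overlap(candidate):                       # hypothetical relevance score
            return len(set(tokens) & set(candidate.split()))
        best = max(reference_sentences, key=overlap)  # "best sentence" among relevant ones
        vague = [t for t in tokens if t not in best.split()]
        return vague, best
    ```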
  • Patent number: 10916235
    Abstract: Systems, methods, and computer programs are described which utilize the structure of syllables as an organizing element of automated speech recognition processing to overcome variations in pronunciation, to efficiently resolve confusable aspects, to exploit context, and to map the speech to orthography.
    Type: Grant
    Filed: July 10, 2018
    Date of Patent: February 9, 2021
    Assignee: VOX FRONTERA, INC.
    Inventors: Mark B. Pinson, Darrel T. Pinson
  • Patent number: 10885914
    Abstract: The speech correction system includes a storage device and a processing device. The storage device stores a first database. The processing device includes an audio receiver, a speech recognition engine, a calculation module, and a determination module. The audio receiver receives multiple voice inputs. The speech recognition engine recognizes the voice inputs, generates multiple candidate vocabularies corresponding to each of the voice inputs, and generates a vocabulary probability corresponding to each of the candidate vocabularies. The calculation module performs a specific operation on the vocabulary probabilities corresponding to the same candidate vocabulary, to generate a plurality of corresponding operation results. The determination module determines whether each of the operation results is greater than a score threshold, and stores at least one output result that is greater than the score threshold to the first database.
    Type: Grant
    Filed: January 24, 2019
    Date of Patent: January 5, 2021
    Assignee: QUANTA COMPUTER INC.
    Inventors: Kuan-Chung Chen, Chih-Wei Sung, Yu-Cheng Chien, Yi-Ling Chen
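    A minimal sketch of the aggregation step in 10885914 above. Using the mean as the "specific operation" is an assumption; the abstract leaves the operation unspecified.
    ```python
    from collections import defaultdict

    def select_corrections(recognitions, score_threshold=0.7):
        """recognitions: one dict per voice input, mapping candidate vocabulary -> probability.
        Returns candidates whose aggregated score exceeds the threshold (to be stored)."""
        per_candidate = defaultdict(list)
        for candidates in recognitions:
            for vocab, prob in candidates.items():
                per_candidate[vocab].append(prob)
        results = {v: sum(p) / len(p) for v, p in per_candidate.items()}   # mean over inputs
        return {v: s for v, s in results.items() if s > score_threshold}
    ```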
  • Patent number: 10867623
    Abstract: An always-listening-capable computing device is disclosed, comprising a camera for recording video of human actions, a module for communication with a remote server, and a gate-keeping module that, when enabled, prevents the communication module from transmitting data external to the device. The device determines, based on the content of the video, a user desire or situation requiring human attention and transmits messages as appropriate for satisfying that desire or addressing that situation. Additional methods for handling user input directed to a recipient other than the device and for ensuring data security via controlling the device's network access are also disclosed.
    Type: Grant
    Filed: June 18, 2018
    Date of Patent: December 15, 2020
    Inventor: Thomas Stachura
  • Patent number: 10867054
    Abstract: An always-listening-capable decoupled accessory for a computing device is disclosed. The accessory comprises an electronic sensor, configured to record user input comprising an utterance or gesture, and a gatekeeping module implemented by a processor, wherein all data received by the communications module based on data from the first electronic sensor passes through the gatekeeping module while a gatekeeping function is disabled, wherein no data based on data from the first electronic sensor passes through the communications module while the gatekeeping function is enabled, wherein all data input to the gatekeeping module is received via an exclusive input lead from the first electronic sensor, and wherein all data output from the gatekeeping module is transmitted via an exclusive output lead to a component other than the first electronic sensor. The processor determines that user input comprises a first input content and in response automatically transmits a message to the computing device.
    Type: Grant
    Filed: November 21, 2018
    Date of Patent: December 15, 2020
    Inventor: Xiao Ming Mai
  • Patent number: 10831366
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for cross input modality learning in a mobile device are disclosed. In one aspect, a method includes activating a first modality user input mode in which user inputs by way of a first modality are recognized using a first modality recognizer; and receiving a user input by way of the first modality. The method includes, obtaining, as a result of the first modality recognizer recognizing the user input, a transcription that includes a particular term; and generating an input context data structure that references at least the particular term. The method further includes, transmitting, by the first modality recognizer, the input context data structure to a second modality recognizer for use in updating a second modality recognition model associated with the second modality recognizer.
    Type: Grant
    Filed: December 29, 2016
    Date of Patent: November 10, 2020
    Assignee: Google LLC
    Inventors: Yu Ouyang, Diego Melendo Casado, Mohammadinamul Hasan Sheik, Francoise Beaufays, Dragan Zivkovic, Meltem Oktem
  • Patent number: 10810999
    Abstract: A secure remote actuation system is described herein that operates based on voice commands provided by a user and/or owner of the system. The system may include: a remote input receptor having a user interface for receiving one or more user inputs from a user, the user interface having a voice input processor, and the user inputs including vocalization; and a cloud-based network storing one or more acceptable inputs and including a network device for obtaining said one or more user inputs from the remote input receptor. The network device may obtain said one or more user inputs from the remote input receptor while the user is using the user interface. The cloud-based network may compare said one or more user inputs to said one or more acceptable inputs. The voice input processor may include a microphone, a speaker, or both, and may perform various types of voice recognition.
    Type: Grant
    Filed: July 27, 2017
    Date of Patent: October 20, 2020
    Assignee: Hall Labs LLC
    Inventors: David R. Hall, Mark Hall, Craig Boswell, Joe Fox, Jedediah Knight
  • Patent number: 10776419
    Abstract: Disclosed computer-based systems and methods for analyzing a plurality of audio files corresponding to text-based news stories and received from a plurality of audio file creators are configured to (i) compare quality and/or accuracy metrics of individual audio files against corresponding quality and/or accuracy thresholds, and (ii) based on the comparison: (a) accept audio files meeting the quality and/or accuracy thresholds for distribution to a plurality of subscribers for playback, (b) reject audio files failing to meet one or more certain quality and/or accuracy thresholds, (c) remediate audio files failing to meet certain quality thresholds, and (d) designate for human review, audio files failing to meet one or more certain quality and/or accuracy thresholds by a predetermined margin.
    Type: Grant
    Filed: May 16, 2014
    Date of Patent: September 15, 2020
    Assignee: GRACENOTE DIGITAL VENTURES, LLC
    Inventors: Gregory P. Defouw, Venkatarama Anilkumar Panguluri
  • Patent number: 10672397
    Abstract: The present teaching relates to facilitating a guided dialog with a user. In one example, an input utterance is obtained from the user. One or more task sets are estimated based on the input utterance. Each of the one or more task sets includes a plurality of tasks estimated to be requested by the user via the input utterance and is associated with a confidence score computed based on statistics with respect to the plurality of tasks in the task set. At least one of the one or more task sets is selected based on their respective confidence scores. A response is generated based on the tasks in the selected at least one task set. The response is provided to the user.
    Type: Grant
    Filed: July 26, 2019
    Date of Patent: June 2, 2020
    Assignee: Oath Inc.
    Inventors: Sungjin Lee, Amanda Stent
  • Patent number: 10667155
    Abstract: A method, a device, and a non-transitory storage medium for estimating voice call quality include performing automatic speech recognition, for each of a plurality of voice calls, to generate recognized text for both an originating device acoustic signal and a receiving device acoustic signal. The recognized text for both the originating device acoustic signal and the receiving device acoustic signal is compared to the reference text to identify recognition errors, and a voice call quality score for each of the originating device acoustic signal and the receiving device acoustic signal is determined. A correlation between the network conditions and the voice call quality scores is then determined.
    Type: Grant
    Filed: July 16, 2018
    Date of Patent: May 26, 2020
    Assignee: Verizon Patent and Licensing Inc.
    Inventors: Ye Ouyang, Krishna Pichumani Iyer, Zhenyi Lin, Le Su
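    A hedged sketch of the scoring idea in 10667155 above: a word-error-rate style comparison of recognized text against reference text for each call leg, plus a simple correlation with a network metric. The use of WER and Pearson correlation here are illustrative assumptions.
    ```python
    import numpy as np

    def word_error_rate(reference, hypothesis):
        """Levenshtein word distance normalized by reference length (a quality-score proxy)."""
        ref, hyp = reference.split(), hypothesis.split()
        d = np.zeros((len(ref) + 1, len(hyp) + 1), dtype=int)
        d[:, 0] = np.arange(len(ref) + 1)
        d[0, :] = np.arange(len(hyp) + 1)
        for i in range(1, len(ref) + 1):
            for j in range(1, len(hyp) + 1):
                cost = 0 if ref[i - 1] == hyp[j - 1] else 1
                d[i, j] = min(d[i - 1, j] + 1, d[i, j - 1] + 1, d[i - 1, j - 1] + cost)
        return d[len(ref), len(hyp)] / max(len(ref), 1)

    def quality_correlation(call_scores, network_metric):
        """Pearson correlation between per-call quality scores and, e.g., packet loss."""
        return float(np.corrcoef(call_scores, network_metric)[0, 1])
    ```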
  • Patent number: 10582046
    Abstract: A voice recognition-based dialing method and a voice recognition-based dialing system are provided. The method includes: determining a recognition result based on a user's voice input, at least one acoustic model and at least one language model, where the at least one acoustic model and the at least one language model are obtained based on information collected in an electronic device. The system is configured to: obtain at least one acoustic model and at least one language model based on information collected in an electronic device; and determine a recognition result based on a user's voice input, the at least one acoustic model and the at least one language model. The acoustic models and the language models are updated based on the information collected in the electronic device, which may be helpful for voice recognition-based dialing.
    Type: Grant
    Filed: December 30, 2014
    Date of Patent: March 3, 2020
    Assignee: HARMAN INTERNATIONAL INDUSTRIES, INCORPORATED
    Inventors: Jianjun Ma, Liping Hu, Richard Allen Kreifeldt
  • Patent number: 10523897
    Abstract: One or more sensor devices detect a condition of each of users. With regard to a user set including at least two of the users as elements, an information processing apparatus calculates an agreement degree representing a degree of agreement between the at least two users being the elements in the user set, based on the condition information on the at least two users which is in the obtained condition information on all the users. An information presentation device provides presentation information based on a result of the agreement degree calculation.
    Type: Grant
    Filed: August 30, 2017
    Date of Patent: December 31, 2019
    Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.
    Inventors: Yuri Nishikawa, Masayuki Misaki
  • Patent number: 10447675
    Abstract: A method for delivering primary information that exists in at least one electronic form includes transmission of the primary information via at least one communication network to at least one communication system allocated to an addressee of the primary information; creation of verification information relating to the acknowledgement of the primary information by the addressee; and saving and/or transmission of the verification information via at least one communication network. Individualized data is thus made available which not only documents the receipt of the primary information by the addressee but also the acknowledgement of the receipt of the primary information by the addressee. A telecommunication arrangement and a telecommunication unit which are suitable for carrying out the method are also disclosed.
    Type: Grant
    Filed: November 30, 2007
    Date of Patent: October 15, 2019
    Assignee: Sigram Schindler Beteiligungsgesellschaft MbH
    Inventors: Sigram Schindler, Juergen Schulze
  • Patent number: 10403273
    Abstract: The present teaching relates to facilitating a guided dialog with a user. In one example, an input utterance is obtained from the user. One or more task sets are estimated based on the input utterance. Each of the one or more task sets includes a plurality of tasks estimated to be requested by the user via the input utterance and is associated with a confidence score computed based on statistics with respect to the plurality of tasks in the task set. At least one of the one or more task sets is selected based on their respective confidence scores. A response is generated based on the tasks in the selected at least one task set. The response is provided to the user.
    Type: Grant
    Filed: September 9, 2016
    Date of Patent: September 3, 2019
    Assignee: Oath Inc.
    Inventors: Sungjin Lee, Amanda Stent
  • Patent number: 10402500
    Abstract: Provided are a method and electronic device for voice translation. The electronic device includes a voice receiver configured to receive a voice signal; a processor configured to divide the voice signal into a plurality of voice segments, determine an input language and a speaker that correspond to each of the plurality of voice segments, determine a translation direction based on the input language and the speaker of the voice segments, and translate the voice segments according to the translation direction to generate a translation result; and an output device configured to output the translation result.
    Type: Grant
    Filed: February 8, 2017
    Date of Patent: September 3, 2019
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Marcin Chochowski, Pawel Przybysz, Elzbieta Gajewska-Dendek
  • Patent number: 10276150
    Abstract: A correction system of the embodiment includes an interface system, a calculator, a generator, and a display controller. The interface system receives correction information for correcting a voice recognition result. The calculator estimates a part of the voice recognition result to be corrected and calculates a degree of association between the part to be corrected and the correction information. The generator generates corrected display information comprising at least one of the correction information and the part to be corrected using a display format corresponding to the degree of association. The display controller outputs the corrected display information on a display.
    Type: Grant
    Filed: February 23, 2017
    Date of Patent: April 30, 2019
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Kosei Fume, Taira Ashikawa, Masayuki Ashikawa, Hiroshi Fujimura
  • Patent number: 10049657
    Abstract: Phoneme boundaries may be determined from a signal corresponding to recorded audio by extracting auditory attention features from the signal and extracting phoneme posteriors from the signal. The auditory attention features and phoneme posteriors may then be combined to detect boundaries in the signal.
    Type: Grant
    Filed: May 26, 2017
    Date of Patent: August 14, 2018
    Assignee: SONY INTERACTIVE ENTERTAINMENT INC.
    Inventor: Ozlem Kalinli-Akbacak
  • Patent number: 10002259
    Abstract: An always-listening-capable computing device includes a receiver for input from a user, a module for communication with a remote server, and a gate-keeping module that, when enabled, prevents the communication module from transmitting data external to the device. After determining that user input includes a first wake up phrase, the device processor automatically transmits a representation of user input subsequent to the phrase, activates an always-receiving mode to transmit a stream of user input captured subsequent to the phrase, deactivates the always-receiving mode to prevent transmission of user input received subsequent to the phrase, unless also preceded by a second wake up phrase, or enables the gate-keeping module to prevent transmission of data external to the device.
    Type: Grant
    Filed: November 14, 2017
    Date of Patent: June 19, 2018
    Inventor: Xiao Ming Mai
  • Patent number: 9972308
    Abstract: Methods, a system, and a classifier are provided. A method includes preparing, by a processor, pairs for an information retrieval task. Each pair includes (i) a training-stage speech recognition result for a respective sequence of training words and (ii) an answer label corresponding to the training-stage speech recognition result. The method further includes obtaining, by the processor, a respective rank for the answer label included in each pair to obtain a set of ranks. The method also includes determining, by the processor, for each pair, an end of question part in the training-stage speech recognition result based on the set of ranks. The method additionally includes building, by the processor, the classifier such that the classifier receives a recognition-stage speech recognition result and returns a corresponding end of question part for the recognition-stage speech recognition result, based on the end of question part determined for the pairs.
    Type: Grant
    Filed: November 8, 2016
    Date of Patent: May 15, 2018
    Assignee: International Business Machines Corporation
    Inventors: Tohru Nagano, Ryuki Tachibana
  • Patent number: 9954507
    Abstract: Various aspects of this disclosure describe setting an audio compressor threshold using averaged audio measurements. Examples include calculating one or more average values of amplitude values of an audio file, and setting a threshold used in the audio compressor based on the calculated average values. Samples of the audio file with amplitude values above the threshold are attenuated, while samples of the audio file with amplitude values below the threshold are not attenuated. The threshold can be set equal to a calculated average value, or from a function of one or more calculated average values. Different audio channels comprising the audio file can be processed to set a respective compressor threshold for each audio channel.
    Type: Grant
    Filed: August 1, 2016
    Date of Patent: April 24, 2018
    Assignee: ADOBE SYSTEMS INCORPORATED
    Inventor: Matthew Gehring Stegner
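    Illustrative compressor-threshold setting in the spirit of 9954507 above, for one float-valued channel; the fixed attenuation ratio is a placeholder, not Adobe's implementation.
    ```python
    import numpy as np

    def compress_with_average_threshold(samples, ratio=4.0):
        """samples: 1-D float NumPy array for one audio channel (e.g., in [-1, 1])."""
        threshold = float(np.mean(np.abs(samples)))   # threshold from the averaged amplitude
        out = samples.astype(float).copy()
        over = np.abs(out) > threshold                # only samples above the threshold are touched
        excess = np.abs(out[over]) - threshold
        out[over] = np.sign(out[over]) * (threshold + excess / ratio)   # attenuate the excess
        return out, threshold
    ```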
  • Patent number: 9947321
    Abstract: Methods and systems for handling speech recognition processing in effectively real-time, via the Internet, in order that users do not experience noticeable delays from the start until they receive responsive feedback. A user uses a client to access the Internet and a server supporting speech recognition processing. The user inputs speech to the client, which transmits the user speech to the server in approximate real-time. The server evaluates the user speech, and provides responsive feedback to the client, again, in approximate real-time, with minimum latency delays. The client upon receiving responsive feedback from the server, displays, or otherwise provides, the feedback to the user.
    Type: Grant
    Filed: February 21, 2017
    Date of Patent: April 17, 2018
    Assignee: PEARSON EDUCATION, INC.
    Inventor: Christopher S. Jochumson
  • Patent number: 9922640
    Abstract: The disclosure describes a system and method for detecting one or more segments of desired speech utterances from an audio stream using timings of events from other modes that are correlated to the timings of the desired segments of speech. The redundant information from other modes results in highly accurate and robust utterance detection.
    Type: Grant
    Filed: February 3, 2014
    Date of Patent: March 20, 2018
    Inventor: Ashwin P Rao
  • Patent number: 9886943
    Abstract: A method and system for improving the accuracy of a speech recognition system using word confidence score (WCS) processing is introduced. Parameters in a decoder are selected to minimize a weighted total error rate, such that deletion errors are weighted more heavily than substitution and insertion errors. The occurrence distribution in WCS is different depending on whether the word was correctly identified and based on the type of error. This is used to determine thresholds in WCS for insertion and substitution errors. By processing the hypothetical word (HYP) (output of the decoder), a mHYP (modified HYP) is determined. In some circumstances, depending on the WCS's value in relation to insertion and substitution threshold values, mHYP is set equal to: null, a substituted HYP, or HYP.
    Type: Grant
    Filed: January 13, 2017
    Date of Patent: February 6, 2018
    Assignee: Adacel Inc.
    Inventor: Chang-Qing Shu
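    A toy rendering of the post-processing rule shared by 9886943 and the two related Adacel patents further down the list; the thresholds and the substitution lookup are hypothetical.
    ```python
    def modify_hyp(hyp, wcs, insertion_threshold=0.3, substitution_threshold=0.55,
                   substitution_table=None):
        """Map a decoder word (HYP) and its word confidence score (WCS) to mHYP."""
        if wcs < insertion_threshold:
            return None                            # treated as an insertion error: drop the word
        if wcs < substitution_threshold and substitution_table and hyp in substitution_table:
            return substitution_table[hyp]         # likely substitution error: use the alternative
        return hyp                                 # confident enough: keep the decoder output
    ```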
  • Patent number: 9805714
    Abstract: The disclosure is directed to a directional keyword verification method and an electronic device using the same method. According to an exemplary embodiment, the proposed keyword verification method would include receiving an audio stream; analyzing the audio stream to obtain at least a word; determining whether the word matches a key word from a keyword database; assigning the word as a filler if the word does not match the keyword from the keyword database; determining whether a vowel pattern of the word matches the vowel pattern of the keyword if the word matches the key word from the keyword database; assigning the first word as a trigger or command word if the vowel pattern of the word matches the vowel pattern of the key word; and otherwise assigning the word as a filler if the vowel pattern of the word does not match the vowel pattern of the key word.
    Type: Grant
    Filed: March 22, 2016
    Date of Patent: October 31, 2017
    Assignee: ASUSTeK COMPUTER INC.
    Inventors: Bhoomek D. Pandya, Hsing-Yu Tsai, Min-Hong Wang, Cheng-Chung Hsu
  • Patent number: 9806740
    Abstract: A device for data compression includes a processing unit, a temporary memory, and a storage device. The temporary memory is used to temporarily store data to be compressed. The storage device includes multiple physical blocks. Each physical block has a same volume size. The processing unit compresses the to-be-compressed data, generates compressed data, and stores the compressed data into one of the physical blocks. The processing unit compares a data size of the compressed data and a volume size of one physical block, and when the data size of the compressed data is smaller than the volume size, the processing unit stores remnant data into the same physical block as the compressed data stored in, wherein the total data size of the remnant data plus the compressed data is equal to the volume size of the physical block both are stored in.
    Type: Grant
    Filed: July 19, 2017
    Date of Patent: October 31, 2017
    Assignee: ACCELSTOR, INC.
    Inventors: An-Nan Chang, Shih-Chiang Tsao, Pao-Chien Li, Chih-Kang Nung
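    A toy sketch of the block-packing idea in 9806740 above: compress, then fill the rest of the same fixed-size physical block with remnant data. The block size, the use of zlib, and the remnant source are assumptions.
    ```python
    import zlib

    BLOCK_SIZE = 4096   # hypothetical physical-block volume size, in bytes

    def pack_block(data: bytes, remnant_source: bytes) -> bytes:
        compressed = zlib.compress(data)
        if len(compressed) > BLOCK_SIZE:
            raise ValueError("compressed data does not fit one physical block")
        remnant = remnant_source[:BLOCK_SIZE - len(compressed)]   # fill the remaining space
        return compressed + remnant    # equals BLOCK_SIZE when enough remnant data is available
    ```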
  • Patent number: 9711133
    Abstract: A desired character train included in a predefined reference character train, such as lyrics, is set as a target character train, and a user designates a target phoneme train that is indirectly representative of the target character train by use of a limited plurality of kinds of particular phonemes, such as vowels and particular consonants. A reference phoneme train indirectly representative of the reference character train by use of the particular phonemes is prepared in advance. Based on a comparison between the target phoneme train and the reference phoneme train, a sequence of the particular phonemes in the reference phoneme train that matches the target phoneme train is identified, and a character sequence in the reference character train that corresponds to the identified sequence of the particular phonemes is identified. The thus-identified character sequence estimates the target character train.
    Type: Grant
    Filed: July 29, 2015
    Date of Patent: July 18, 2017
    Assignee: YAMAHA CORPORATION
    Inventor: Kazuhiko Yamamoto
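    A simplified sketch of the matching in 9711133 above, using English vowel letters as the "particular phonemes" and plain substring search; both simplifications are assumptions, not the patent's phoneme inventory or matching method.
    ```python
    VOWELS = set("aeiou")

    def find_target_in_reference(reference_text, target_vowel_train):
        """Locate the character span whose vowel train matches the target vowel train."""
        lowered = reference_text.lower()
        positions = [i for i, ch in enumerate(lowered) if ch in VOWELS]
        ref_vowels = "".join(lowered[i] for i in positions)       # reference phoneme train
        hit = ref_vowels.find(target_vowel_train)
        if hit < 0:
            return None
        start = positions[hit]
        end = positions[hit + len(target_vowel_train) - 1]
        return reference_text[start:end + 1]

    # find_target_in_reference('twinkle twinkle little star', 'ie') -> 'inkle'
    ```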
  • Patent number: 9646603
    Abstract: A method, apparatus, and system are described for a continuous speech recognition engine that includes a fine speech recognizer model, a coarse sound representation generator, and a coarse match generator. The fine speech recognizer model receives a time coded sequence of sound feature frames, applies a speech recognition process to the sound feature frames and determines at least a best guess at each recognizable word that corresponds to the sound feature frames. The coarse sound representation generator generates a coarse sound representation of the recognized word. The coarse match generator determines a likelihood of the coarse sound representation actually being the recognized word based on comparing the coarse sound representation of the recognized word to a database containing the known sound of that recognized word and assigns the likelihood as a robust confidence level parameter to that recognized word.
    Type: Grant
    Filed: February 27, 2009
    Date of Patent: May 9, 2017
    Assignee: LONGSAND LIMITED
    Inventor: Mahapathy Kadirkamanathan
  • Patent number: 9583094
    Abstract: A method and system for improving the accuracy of a speech recognition system using word confidence score (WCS) processing is introduced. Parameters in a decoder are selected to minimize a weighted total error rate, such that deletion errors are weighted more heavily than substitution and insertion errors. The occurrence distribution in WCS is different depending on whether the word was correctly identified and based on the type of error. This is used to determine thresholds in WCS for insertion and substitution errors. By processing the hypothetical word (HYP) (output of the decoder), a mHYP (modified HYP) is determined. In some circumstances, depending on the WCS's value in relation to insertion and substitution threshold values, mHYP is set equal to: null, a substituted HYP, or HYP.
    Type: Grant
    Filed: September 22, 2016
    Date of Patent: February 28, 2017
    Assignee: ADACEL, INC.
    Inventor: Chang-Qing Shu
  • Patent number: 9542945
    Abstract: Methods, systems, and apparatuses, including computer programs encoded on a computer storage medium, for adjusting language models. In one aspect, a method includes accessing audio data. Information that indicates a first context is accessed, the first context being associated with the audio data. At least one term is accessed. Information that indicates a second context is accessed, the second context being associated with the term. A similarity score is determined that indicates a degree of similarity between the second context and the first context. A language model is adjusted based on the accessed term and the determined similarity score to generate an adjusted language model. Speech recognition is performed on the audio data using the adjusted language model to select one or more candidate transcriptions for a portion of the audio data.
    Type: Grant
    Filed: June 10, 2015
    Date of Patent: January 10, 2017
    Assignee: Google Inc.
    Inventor: Matthew I. Lloyd
  • Patent number: 9489371
    Abstract: A method for detecting data in a sequence of characters or text using both a statistical engine and a pattern engine. The statistical engine is trained to recognize certain types of data and the pattern engine is programmed to recognize the grammatical pattern of certain types of data. The statistical engine may scan the sequence of characters to output first data, and the pattern engine may break down the first data into subsets of data. Alternatively, the statistical engine may output items that have a predetermined probability or greater of being a certain type of data and the pattern engine may then detect the data from the output items and/or remove incorrect information from the output items.
    Type: Grant
    Filed: July 12, 2013
    Date of Patent: November 8, 2016
    Assignee: Apple Inc.
    Inventors: Olivier Bonnet, Frederick de Jaeger, Romain Goyet, Jean-Pierre Ciudad
  • Patent number: 9478218
    Abstract: A method and system for improving the accuracy of a speech recognition system using word confidence score (WCS) processing is introduced. Parameters in a decoder are selected to minimize a weighted total error rate, such that deletion errors are weighted more heavily than substitution and insertion errors. The occurrence distribution in WCS is different depending on whether the word was correctly identified and based on the type of error. This is used to determine thresholds in WCS for insertion and substitution errors. By processing the hypothetical word (HYP) (output of the decoder), a mHYP (modified HYP) is determined. In some circumstances, depending on the WCS's value in relation to insertion and substitution threshold values, mHYP is set equal to: null, a substituted HYP, or HYP.
    Type: Grant
    Filed: October 24, 2008
    Date of Patent: October 25, 2016
    Assignee: Adacel, Inc.
    Inventor: Chang-Qing Shu
  • Patent number: 9401145
    Abstract: A method for converting speech to text in a speech analytics system is provided. The method includes receiving audio data containing speech made up of sounds from an audio source, processing the sounds with a phonetic module resulting in symbols corresponding to the sounds, and processing the symbols with a language module and occurrence table resulting in text. The method also includes determining a probability of correct translation for each word in the text, comparing the probability of correct translation for each word in the text to the occurrence table, and adjusting the occurrence table based on the probability of correct translation for each word in the text.
    Type: Grant
    Filed: May 5, 2014
    Date of Patent: July 26, 2016
    Assignee: VERINT SYSTEMS LTD.
    Inventors: Omer Ziv, Ran Achituv, Ido Shapira
  • Patent number: 9378728
    Abstract: Methods and systems are provided for gathering research data that includes information pertaining to audio signals received on a portable device, such as a cell phone. Frequency domain data is received or produced, a signature is extracted from the frequency domain data and an ancillary code is read from the frequency domain data.
    Type: Grant
    Filed: April 25, 2014
    Date of Patent: June 28, 2016
    Assignee: The Nielsen Company (US), LLC
    Inventor: Alan R. Neuhauser
  • Patent number: 9330676
    Abstract: A method to filter out speech interference is provided. The method includes defining a time threshold by using a probability distribution model. When a current instruction from a speech input is recognized, a reference instruction recognized from the speech input is obtained. The current instruction is recognized right after the recognition of the reference instruction, wherein the reference instruction and the current instruction correspond to a first time point and a second time point respectively. The method includes determining whether speech interference occurs according to a comparison result of the time threshold and an interval between the first time point and the second time point as well as a state corresponding to the first time point. The method includes filtering out the reference instruction and the current instruction if the speech interference occurs, and outputting the reference instruction or the current instruction if the speech interference does not occur.
    Type: Grant
    Filed: October 18, 2013
    Date of Patent: May 3, 2016
    Assignee: Wistron Corporation
    Inventor: Hsi-Chun Hsiao
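    Illustrative only: the interval check in 9330676 above with a time threshold taken from a percentile of previously observed intervals (a stand-in for the patent's probability distribution model); the state check tied to the first time point is omitted here.
    ```python
    import numpy as np

    def is_interference(ref_time_s, cur_time_s, observed_intervals_s, percentile=5):
        """Flag a suspiciously short gap between two consecutively recognized instructions."""
        threshold = float(np.percentile(observed_intervals_s, percentile))   # hypothetical threshold
        return (cur_time_s - ref_time_s) < threshold
    ```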
  • Patent number: 9324316
    Abstract: There is provided a prosody generator that generates prosody information for implementing highly natural speech synthesis without unnecessarily collecting large quantities of learning data. A data dividing means 81 divides into subspaces the data space of a learning database as an assembly of learning data indicative of the feature quantities of speech waveforms. A density information extracting means 82 extracts density information indicative of the density state in terms of information quantity of the learning data in each of the subspaces divided by the data dividing means 81. A prosody information generating method selecting means 83 selects either a first method or a second method as a prosody information generating method based on the density information, the first method involving generating the prosody information using a statistical technique, the second method involving generating the prosody information using rules based on heuristics.
    Type: Grant
    Filed: May 10, 2012
    Date of Patent: April 26, 2016
    Assignee: NEC CORPORATION
    Inventors: Yasuyuki Mitsui, Reishi Kondo, Masanori Kato
  • Patent number: 9318103
    Abstract: An automatic speech recognition system for recognizing a user voice command in noisy environment, including: matching means for matching elements retrieved from speech units forming said command with templates in a template library; characterized by processing means including a MultiLayer Perceptron for computing posterior templates (P(O_template(q))) stored as said templates in said template library; means for retrieving posterior vectors (P(O_test(q))) from said speech units, said posterior vectors being used as said elements. The present invention relates also to a method for recognizing a user voice command in noisy environments.
    Type: Grant
    Filed: February 21, 2013
    Date of Patent: April 19, 2016
    Assignee: VEOVOX SA
    Inventors: John Dines, Jorge Carmona, Olivier Masson, Guillermo Aradilla
  • Patent number: 9263061
    Abstract: Methods and systems are provided for detecting chop in an audio signal. A time-frequency representation, such as a spectrogram, is created for an audio signal and used to calculate a gradient of mean power per frame of the audio signal. Positive and negative gradients are defined for the signal based on the gradient of mean power, and a maximum overlap offset between the positive and negative gradients is determined by calculating a value that maximizes the cross-correlation of the positive and negative gradients. The negative gradient values may be combined (e.g., summed) with the overlap offset, and the combined values then compared with a threshold to estimate the amount of chop present in the audio signal. The chop detection model provided is low-complexity and is applicable to narrowband, wideband, and superwideband speech.
    Type: Grant
    Filed: May 21, 2013
    Date of Patent: February 16, 2016
    Assignee: GOOGLE INC.
    Inventors: Andrew J. Hines, Jan Skoglund, Naomi Harte, Anil Kokaram
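    A rough NumPy-only sketch of the chop-detection recipe in 9263061 above; the framing, power floor, and the way the offset negative gradient is combined with the positive gradient are illustrative choices, not Google's model.
    ```python
    import numpy as np

    def chop_score(x, frame_len=320, hop=160):
        """Return a single value to compare against a chop threshold."""
        frames = [x[i:i + frame_len] for i in range(0, len(x) - frame_len, hop)]
        power = np.array([np.mean(np.abs(np.fft.rfft(f)) ** 2) for f in frames])
        grad = np.gradient(10.0 * np.log10(power + 1e-12))   # gradient of mean power (dB) per frame
        pos = np.where(grad > 0, grad, 0.0)                  # positive gradient
        neg = np.where(grad < 0, -grad, 0.0)                 # negative gradient (magnitude)
        xcorr = np.correlate(pos, neg, mode="full")
        offset = int(np.argmax(xcorr)) - (len(neg) - 1)      # maximum-overlap offset
        combined = np.roll(neg, offset) * pos                # combine gradients at that offset
        return float(np.sum(combined))
    ```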
  • Patent number: 9117460
    Abstract: The present invention relates to speech recognition systems, especially to arranging detection of end-of-utterance in such systems. A speech recognizer of the system is configured to determine whether the recognition result determined from received speech data is stabilized. The speech recognizer is configured to process values of best state scores and best token scores associated with frames of received speech data for end-of-utterance detection purposes. Further, the speech recognizer is configured to determine whether end of utterance is detected or not, based on the processing, if the recognition result is stabilized.
    Type: Grant
    Filed: May 12, 2004
    Date of Patent: August 25, 2015
    Assignee: Core Wireless Licensing S.A.R.L.
    Inventor: Tommi Lahti
  • Patent number: 9099098
    Abstract: In speech processing systems, compensation is made for sudden changes in the background noise in the average signal-to-noise ratio (SNR) calculation. SNR outlier filtering may be used, alone or in conjunction with weighting the average SNR. Adaptive weights may be applied on the SNRs per band before computing the average SNR. The weighting function can be a function of noise level, noise type, and/or instantaneous SNR value. Another weighting mechanism applies a null filtering or outlier filtering which sets the weight in a particular band to be zero. This particular band may be characterized as the one that exhibits an SNR that is several times higher than the SNRs in other bands.
    Type: Grant
    Filed: November 6, 2012
    Date of Patent: August 4, 2015
    Assignee: QUALCOMM Incorporated
    Inventors: Venkatraman Srinivasa Atti, Venkatesh Krishnan
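    A hedged sketch of the weighted-average-SNR idea in 9099098 above, with per-band linear SNRs; the weighting function and the outlier ("several times higher") rule use made-up constants.
    ```python
    import numpy as np

    def weighted_average_snr(band_snrs, outlier_factor=4.0):
        """band_snrs: per-band linear (not dB) SNR estimates."""
        snrs = np.asarray(band_snrs, dtype=float)
        median = np.median(snrs)
        weights = np.ones_like(snrs)
        weights[snrs > outlier_factor * median] = 0.0   # null-filter bands far above the others
        weights *= snrs / (snrs + median + 1e-12)       # illustrative SNR-dependent weighting
        if weights.sum() <= 0:
            return float(snrs.mean())
        return float(np.sum(weights * snrs) / np.sum(weights))
    ```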
  • Patent number: 9076445
    Abstract: Methods, systems, and apparatuses, including computer programs encoded on a computer storage medium, for adjusting language models. In one aspect, a method includes accessing audio data. Information that indicates a first context is accessed, the first context being associated with the audio data. At least one term is accessed. Information that indicates a second context is accessed, the second context being associated with the term. A similarity score is determined that indicates a degree of similarity between the second context and the first context. A language model is adjusted based on the accessed term and the determined similarity score to generate an adjusted language model. Speech recognition is performed on the audio data using the adjusted language model to select one or more candidate transcriptions for a portion of the audio data.
    Type: Grant
    Filed: December 5, 2012
    Date of Patent: July 7, 2015
    Assignee: Google Inc.
    Inventor: Matthew I. Lloyd