Endpoint Detection Patents (Class 704/253)
  • Patent number: 11790895
    Abstract: An electronic device for natural language understanding includes at least one memory and at least one processor coupled to the at least one memory. The at least one processor is configured to process an utterance using a trained model. The at least one processor is also configured to replace a first portion of the utterance with a first token, where the first token represents a semantic role of the first portion of the utterance based on a slot vocabulary. The at least one processor is further configured to determine a slot value in the utterance based on the first token. In addition, the at least one processor is configured to perform a task corresponding to the utterance based on the determined slot value.
    Type: Grant
    Filed: October 23, 2019
    Date of Patent: October 17, 2023
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Avik Ray, Hongxia Jin
  • Patent number: 11790931
    Abstract: A first VAD system outputs a pulse stream for zero crossings in an audio signal. The pulse density of the pulse stream is evaluated to identify speech. The audio signal may have noise added to it before evaluating zero crossings. A second VAD system rectifies each audio signal sample and processes each rectified sample by updating a first statistic and evaluating the rectified sample per a first threshold condition that is a function of the first statistic. Rectified samples meeting the first threshold condition may be used to update a second statistic and the rectified sample evaluated per a second threshold condition that is a function of the second statistic. Rectified samples meeting the second threshold condition may be used to update a third statistic. The audio signal sample may be selected as speech if the second statistic is less than a downscaled third statistic.
    Type: Grant
    Filed: October 27, 2020
    Date of Patent: October 17, 2023
    Assignee: Ambiq Micro, Inc.
    Inventor: Roger David Serwy
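    Illustrative sketch (not part of the patent record): the first VAD stage in the abstract above can be pictured as a pulse stream of zero crossings whose density is measured per window. The abstract does not say how the density is evaluated, so the speech band in `looks_like_speech` (the `low` and `high` bounds) is purely an assumption, as are all function names.

```python
def zero_crossing_pulses(samples):
    """Emit 1 wherever the sign changes between consecutive samples, else 0."""
    return [1 if (prev < 0) != (cur < 0) else 0
            for prev, cur in zip(samples, samples[1:])]


def pulse_density(pulses, window):
    """Fraction of pulses set in each non-overlapping window of `window` pulses."""
    return [sum(pulses[i:i + window]) / window
            for i in range(0, len(pulses) - window + 1, window)]


def looks_like_speech(density, low=0.05, high=0.5):
    # Assumed decision rule: densities inside [low, high] count as speech.
    # The bounds are hypothetical placeholders, not values from the patent.
    return low <= density <= high
```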
  • Patent number: 11386134
    Abstract: Systems and methods are described for correcting a voice query based on a subsequent voice query with a lower pronunciation rate. In some aspects, the systems and methods calculate first and second pronunciation rates of first and second voice queries. The systems and methods determine that the second pronunciation rate is lower than the first pronunciation rate and determine a first candidate pronunciation time for a first candidate word from the first voice query. The systems and methods determine a second candidate pronunciation time, adjusted to the first pronunciation rate, for a second candidate word from the second voice query. The systems and methods determine that the first candidate pronunciation time matches the second candidate pronunciation time and generate a third voice query based on the first voice query by replacing the first candidate word with the second candidate word.
    Type: Grant
    Filed: July 7, 2020
    Date of Patent: July 12, 2022
    Assignee: Rovi Guides, Inc.
    Inventor: Arun Sreedhara
  • Patent number: 11295761
    Abstract: The present disclosure discloses a method for constructing a voice detection model and a voice endpoint detection system, and belongs to the technical field of voice recognition. In the method for constructing a voice detection model according to the present disclosure, audio data is first collected and a mixed voice is synthesized, feature extraction is performed on the mixed voice to obtain a 62-dimensional feature, and then the 62-dimensional feature is input to a recurrent neural network (RNN) model for training to obtain a voice detection model. The voice endpoint detection system according to the present disclosure includes a collecting unit, a calculating unit, a transmitting unit, and a terminal, the collecting unit being electrically connected to the calculating unit, and the calculating unit and the terminal being respectively connected to the transmitting unit. The voice detection model can be applied to a real-time conference communication device.
    Type: Grant
    Filed: May 11, 2020
    Date of Patent: April 5, 2022
    Assignee: YEALINK (XIAMEN) NETWORK TECHNOLOGY CO., LTD.
    Inventors: Zehuang Fang, Yuanxun Kang, Wanjian Feng
  • Patent number: 11240603
    Abstract: A speaker apparatus includes a measuring unit configured to measure a first distance to a detection object, the detection object existing on one side in a vertical direction substantially orthogonal to a sound radiation direction, and a second distance to a detection object, the detection object existing on another side in the vertical direction.
    Type: Grant
    Filed: February 5, 2018
    Date of Patent: February 1, 2022
    Assignee: SONY CORPORATION
    Inventor: Fumito Yoshida
  • Patent number: 11195535
    Abstract: A voice recognition device includes a memory and a processor including hardware. The processor is configured to extract a feature of input voice data and set a duration of a silent state after transition of the voice data to the silent state. The duration is used for determining that an input of the voice data is completed.
    Type: Grant
    Filed: September 6, 2019
    Date of Patent: December 7, 2021
    Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA
    Inventor: Fumio Wada
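    Illustrative sketch (not part of the patent record): the abstract above uses a silence duration to decide that voice input is complete. A minimal version of that idea, assuming per-frame speech/silence flags are already available, is shown below. How the duration is derived from the extracted feature is not stated in the abstract, so `silence_ms` is simply passed in; the function name and parameters are hypothetical.

```python
def find_endpoint(is_speech_frames, frame_ms, silence_ms):
    """Return the index of the frame at which input is judged complete, or None.

    Input counts as complete once speech has been heard and is followed by
    at least `silence_ms` of consecutive silent frames.
    """
    needed = -(-silence_ms // frame_ms)  # ceiling division: frames of silence required
    silent_run = 0
    heard_speech = False
    for i, is_speech in enumerate(is_speech_frames):
        if is_speech:
            heard_speech = True
            silent_run = 0
        elif heard_speech:
            silent_run += 1
            if silent_run >= needed:
                return i
    return None
```

    A shorter `silence_ms` ends the utterance sooner but risks cutting off a speaker who pauses mid-sentence, which is presumably why the duration is made adjustable.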
  • Patent number: 11100913
    Abstract: An always-listening-capable decoupled cap for a computing device having a communication module is disclosed, comprising an electronic sensor and a gatekeeping module. All data received by the communications module based on data from the electronic sensor passes through the gatekeeping module while a gatekeeping function is disabled, no data based on data from the electronic sensor passes through the communications module while the gatekeeping function is enabled, all data input to the gatekeeping module is received via an exclusive input lead from the electronic sensor, and all data output from the gatekeeping module is transmitted via an exclusive output lead to a component other than the electronic sensor. The cap interferes with a function of the computing device, determines that user input recorded by the first electronic sensor comprises a first input content, and changes a nature of interference with the function of the computing device.
    Type: Grant
    Filed: November 21, 2018
    Date of Patent: August 24, 2021
    Inventor: Thomas Stachura
  • Patent number: 11062055
    Abstract: A computer device and method for managing privilege delegation to control execution of commands on files on the computer device is described. An agent plugin intercepts a request in a user account of a logged-in user to execute a command therein on a file having first privileges assigned thereto, wherein the agent plugin is provided for the file. The agent plugin obtains information related to the request and forwards the information to an agent service cooperating with an operating system of the computer device. The agent service determines whether to execute the command on the file in the user account according to second privileges different from the first privileges. The agent service launches an agent proxy process having the second privileges assigned thereto by the agent service if it is determined to execute the command on the file in the user account according to the second privileges.
    Type: Grant
    Filed: September 26, 2018
    Date of Patent: July 13, 2021
    Assignee: AVECTO LIMITED
    Inventor: John Goodridge
  • Patent number: 11004448
    Abstract: The present disclosure provides a method and a device for recognizing a text segmentation position. The method includes: receiving a continuous voice message inputted by a user, and recognizing the continuous voice message to generate a text message corresponding to the continuous voice message; analyzing the text message to determine an interval position, and sequentially inserting a sentence end and sentence begin sign at each interval position; calculating a segmentation value corresponding to the sentence end and sentence begin sign inserted at a present interval position according to a preset algorithm; and determining whether the segmentation value is greater than a preset threshold, and determining the present interval position as a segmentation position when the segmentation value is greater than the preset threshold.
    Type: Grant
    Filed: June 20, 2018
    Date of Patent: May 11, 2021
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Sheng Qian, Qiang Cheng
  • Patent number: 10978055
    Abstract: In an information processing apparatus, an acquisition unit acquires speech information on a user's speech. A derivation unit derives a level of understanding of an intent of the user's speech based on the speech information acquired by the acquisition unit. A display control unit causes a display unit to display an image corresponding to the level of understanding derived by the derivation unit. An output unit outputs a response to the user's speech based on the speech information acquired by the acquisition unit.
    Type: Grant
    Filed: February 12, 2019
    Date of Patent: April 13, 2021
    Assignee: Toyota Jidosha Kabushiki Kaisha
    Inventors: Chikage Kubo, Takuji Yamada
  • Patent number: 10949621
    Abstract: A method, system, and computer program product for identifying intra sentence vague components includes: receiving a first sentence, assigning tokens to each component of the first sentence, comparing a total number of tokens of the first sentence to a threshold number of tokens, in response to determining that the total number of tokens is less than the threshold number of tokens, identifying one or more relevant complete sentences from a database, determining a best sentence from the one or more relevant complete sentences, and determining a vague component of the first sentence by comparing the best sentence to the first sentence.
    Type: Grant
    Filed: March 28, 2019
    Date of Patent: March 16, 2021
    Assignee: International Business Machines Corporation
    Inventors: Anil Manohar Omanwar, Anand Shantilal Borse, Gandhi Sivakumar, Lalit Agarwalla
  • Patent number: 10916235
    Abstract: Systems, methods, and computer programs are described which utilize the structure of syllables as an organizing element of automated speech recognition processing to overcome variations in pronunciation, to efficiently resolve confusable aspects, to exploit context, and to map the speech to orthography.
    Type: Grant
    Filed: July 10, 2018
    Date of Patent: February 9, 2021
    Assignee: VOX FRONTERA, INC.
    Inventors: Mark B. Pinson, Darrel T. Pinson
  • Patent number: 10885914
    Abstract: The speech correction system includes a storage device and a processing device. The storage device stores a first database. The processing device includes an audio receiver, a speech recognition engine, a calculation module, and a determination module. The audio receiver receives multiple voice inputs. The speech recognition engine recognizes the voice inputs, generates multiple candidate vocabularies corresponding to each of the voice inputs, and generates a vocabulary probability corresponding to each of the candidate vocabularies. The calculation module performs a specific operation on the vocabulary probabilities corresponding to the same candidate vocabulary, to generate a plurality of corresponding operation results. The determination module determines whether each of the operation results is greater than a score threshold, and stores at least one output result that is greater than the score threshold to the first database.
    Type: Grant
    Filed: January 24, 2019
    Date of Patent: January 5, 2021
    Assignee: QUANTA COMPUTER INC.
    Inventors: Kuan-Chung Chen, Chih-Wei Sung, Yu-Cheng Chien, Yi-Ling Chen
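    Illustrative sketch (not part of the patent record): the abstract above combines the probabilities that one candidate vocabulary receives across several voice inputs and keeps results above a score threshold. The "specific operation" is not named in the abstract; the sketch below assumes a plain sum, and the function name and data layout are hypothetical.

```python
from collections import defaultdict


def accumulate_candidates(recognitions, score_threshold):
    """Sum each candidate's probability over all voice inputs, then filter.

    `recognitions` is a list with one dict per voice input, mapping a
    candidate vocabulary to its probability. Summation is an assumed
    stand-in for the unspecified operation.
    """
    totals = defaultdict(float)
    for candidates in recognitions:
        for vocabulary, probability in candidates.items():
            totals[vocabulary] += probability
    # Keep only results strictly greater than the threshold.
    return {v: s for v, s in totals.items() if s > score_threshold}
```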
  • Patent number: 10867054
    Abstract: An always-listening-capable decoupled accessory for a computing device is disclosed. The accessory comprises an electronic sensor, configured to record user input comprising an utterance or gesture, and a gatekeeping module implemented by a processor, wherein all data received by the communications module based on data from the first electronic sensor passes through the gatekeeping module while a gatekeeping function is disabled, wherein no data based on data from the first electronic sensor passes through the communications module while the gatekeeping function is enabled, wherein all data input to the gatekeeping module is received via an exclusive input lead from the first electronic sensor, and wherein all data output from the gatekeeping module is transmitted via an exclusive output lead to a component other than the first electronic sensor. The processor determines that user input comprises a first input content and in response automatically transmits a message to the computing device.
    Type: Grant
    Filed: November 21, 2018
    Date of Patent: December 15, 2020
    Inventor: Xiao Ming Mai
  • Patent number: 10867623
    Abstract: An always-listening-capable computing device is disclosed, comprising a camera for recording video of human actions, a module for communication with a remote server, and a gate-keeping module that, when enabled, prevents the communication module from transmitting data external to the device. The device determines, based on the content of the video, a user desire or situation requiring human attention and transmits messages as appropriate for satisfying that desire or addressing that situation. Additional methods for handling user input directed to a recipient other than the device and for ensuring data security via controlling the device's network access are also disclosed.
    Type: Grant
    Filed: June 18, 2018
    Date of Patent: December 15, 2020
    Inventor: Thomas Stachura
  • Patent number: 10831366
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for cross input modality learning in a mobile device are disclosed. In one aspect, a method includes activating a first modality user input mode in which user inputs by way of a first modality are recognized using a first modality recognizer; and receiving a user input by way of the first modality. The method includes obtaining, as a result of the first modality recognizer recognizing the user input, a transcription that includes a particular term; and generating an input context data structure that references at least the particular term. The method further includes transmitting, by the first modality recognizer, the input context data structure to a second modality recognizer for use in updating a second modality recognition model associated with the second modality recognizer.
    Type: Grant
    Filed: December 29, 2016
    Date of Patent: November 10, 2020
    Assignee: Google LLC
    Inventors: Yu Ouyang, Diego Melendo Casado, Mohammadinamul Hasan Sheik, Francoise Beaufays, Dragan Zivkovic, Meltem Oktem
  • Patent number: 10810999
    Abstract: A secure remote actuation system is described herein that operates based on voice commands provided by a user and/or owner of the system. The system may include: a remote input receptor having a user interface for receiving one or more user inputs from a user, the user interface having a voice input processor, and the user inputs including vocalization; and a cloud-based network storing one or more acceptable inputs and including a network device for obtaining said one or more user inputs from the remote input receptor. The network device may obtain said one or more user inputs from the remote input receptor while the user is using the user interface. The cloud-based network may compare said one or more user inputs to said one or more acceptable inputs. The voice input processor may include a microphone, a speaker, or both, and may perform various types of voice recognition.
    Type: Grant
    Filed: July 27, 2017
    Date of Patent: October 20, 2020
    Assignee: Hall Labs LLC
    Inventors: David R. Hall, Mark Hall, Craig Boswell, Joe Fox, Jedediah Knight
  • Patent number: 10776419
    Abstract: Disclosed computer-based systems and methods for analyzing a plurality of audio files corresponding to text-based news stories and received from a plurality of audio file creators are configured to (i) compare quality and/or accuracy metrics of individual audio files against corresponding quality and/or accuracy thresholds, and (ii) based on the comparison: (a) accept audio files meeting the quality and/or accuracy thresholds for distribution to a plurality of subscribers for playback, (b) reject audio files failing to meet one or more certain quality and/or accuracy thresholds, (c) remediate audio files failing to meet certain quality thresholds, and (d) designate for human review, audio files failing to meet one or more certain quality and/or accuracy thresholds by a predetermined margin.
    Type: Grant
    Filed: May 16, 2014
    Date of Patent: September 15, 2020
    Assignee: GRACENOTE DIGITAL VENTURES, LLC
    Inventors: Gregory P. Defouw, Venkatarama Anilkumar Panguluri
  • Patent number: 10672397
    Abstract: The present teaching relates to facilitating a guided dialog with a user. In one example, an input utterance is obtained from the user. One or more task sets are estimated based on the input utterance. Each of the one or more task sets includes a plurality of tasks estimated to be requested by the user via the input utterance and is associated with a confidence score computed based on statistics with respect to the plurality of tasks in the task set. At least one of the one or more task sets is selected based on their respective confidence scores. A response is generated based on the tasks in the selected at least one task set. The response is provided to the user.
    Type: Grant
    Filed: July 26, 2019
    Date of Patent: June 2, 2020
    Assignee: Oath Inc.
    Inventors: Sungjin Lee, Amanda Stent
  • Patent number: 10667155
    Abstract: A method, a device, and a non-transitory storage medium for estimating voice call quality include performing automatic speech recognition, for each of a plurality of voice calls, to generate recognized text for both an originating device acoustic signal and a receiving device acoustic signal. The recognized text for both the originating device acoustic signal and the receiving device acoustic signal is compared to the reference text to identify recognition errors, and a voice call quality score for each of the originating device acoustic signal and the receiving device acoustic signal is determined. A correlation between the network conditions and the voice call quality scores is then determined.
    Type: Grant
    Filed: July 16, 2018
    Date of Patent: May 26, 2020
    Assignee: Verizon Patent and Licensing Inc.
    Inventors: Ye Ouyang, Krishna Pichumani Iyer, Zhenyi Lin, Le Su
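    Illustrative sketch (not part of the patent record): the abstract above compares recognized text with reference text to find recognition errors and derive a quality score. One common way to count such errors is word-level edit distance, used here as an assumed stand-in; the scoring formula and both function names are hypothetical and are not taken from the patent.

```python
def word_errors(reference, recognized):
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    ref, hyp = reference.split(), recognized.split()
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # reference word deleted
                           cur[j - 1] + 1,           # extra word inserted
                           prev[j - 1] + (r != h)))  # match or substitution
        prev = cur
    return prev[-1]


def quality_score(reference, recognized):
    """Assumed score: 1 minus the word error rate, floored at 0."""
    n = len(reference.split())
    if n == 0:
        return 0.0
    return max(0.0, 1.0 - word_errors(reference, recognized) / n)
```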
  • Patent number: 10582046
    Abstract: A voice recognition-based dialing method and a voice recognition-based dialing system are provided. The method includes: determining a recognition result based on a user's voice input, at least one acoustic model and at least one language model, where the at least one acoustic model and the at least one language model are obtained based on information collected in an electronic device. The system is configured to: obtain at least one acoustic model and at least one language model based on information collected in an electronic device; and determine a recognition result based on a user's voice input, the at least one acoustic model and the at least one language model. The acoustic models and the language models are updated based on the information collected in the electronic device, which may be helpful to the voice recognition-based dialing.
    Type: Grant
    Filed: December 30, 2014
    Date of Patent: March 3, 2020
    Assignee: HARMAN INTERNATIONAL INDUSTRIES, INCORPORATED
    Inventors: Jianjun Ma, Liping Hu, Richard Allen Kreifeldt
  • Patent number: 10523897
    Abstract: One or more sensor devices detect a condition of each of users. With regard to a user set including at least two of the users as elements, an information processing apparatus calculates an agreement degree representing a degree of agreement between the at least two users being the elements in the user set, based on the condition information on the at least two users which is in the obtained condition information on all the users. An information presentation device provides presentation information based on a result of the agreement degree calculation.
    Type: Grant
    Filed: August 30, 2017
    Date of Patent: December 31, 2019
    Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.
    Inventors: Yuri Nishikawa, Masayuki Misaki
  • Patent number: 10447675
    Abstract: A method for delivering primary information that exists in at least one electronic form, includes transmission of the primary information via at least one communication network to at least one communication system allocated to an addressee of the primary information; creation of verification information relating to the acknowledgement of the primary information by the addressee; and saving and/or transmission of the verification information via at least one communication network. Individualized data is thus made available which not only documents the receipt of the primary information by the addressee but also the acknowledgement of the receipt of the primary information by the addressee. A telecommunication arrangement and a telecommunication unit which are suitable for carrying out the method are also disclosed.
    Type: Grant
    Filed: November 30, 2007
    Date of Patent: October 15, 2019
    Assignee: Sigram Schindler Beteiligungsgesellschaft MbH
    Inventors: Sigram Schindler, Juergen Schulze
  • Patent number: 10403273
    Abstract: The present teaching relates to facilitating a guided dialog with a user. In one example, an input utterance is obtained from the user. One or more task sets are estimated based on the input utterance. Each of the one or more task sets includes a plurality of tasks estimated to be requested by the user via the input utterance and is associated with a confidence score computed based on statistics with respect to the plurality of tasks in the task set. At least one of the one or more task sets is selected based on their respective confidence scores. A response is generated based on the tasks in the selected at least one task set. The response is provided to the user.
    Type: Grant
    Filed: September 9, 2016
    Date of Patent: September 3, 2019
    Assignee: Oath Inc.
    Inventors: Sungjin Lee, Amanda Stent
  • Patent number: 10402500
    Abstract: Provided are a method and electronic device for voice translation. The electronic device includes a voice receiver configured to receive a voice signal; a processor configured to divide the voice signal into a plurality of voice segments, determine an input language and a speaker that correspond to each of the plurality of voice segments, determine a translation direction based on the input language and the speaker of the voice segments, and translate the voice segments according to the translation direction to generate a translation result; and an output device configured to output the translation result.
    Type: Grant
    Filed: February 8, 2017
    Date of Patent: September 3, 2019
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Marcin Chochowski, Pawel Przybysz, Elzbieta Gajewska-Dendek
  • Patent number: 10276150
    Abstract: A correction system of the embodiment includes an interface system, a calculator, a generator, and a display controller. The interface system receives correction information for correcting a voice recognition result. The calculator estimates a part of the voice recognition result to be corrected and calculates a degree of association between the part to be corrected and the correction information. The generator generates corrected display information comprising at least one of the correction information and the part to be corrected using a display format corresponding to the degree of association. The display controller outputs the corrected display information on a display.
    Type: Grant
    Filed: February 23, 2017
    Date of Patent: April 30, 2019
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Kosei Fume, Taira Ashikawa, Masayuki Ashikawa, Hiroshi Fujimura
  • Patent number: 10049657
    Abstract: Phoneme boundaries may be determined from a signal corresponding to recorded audio by extracting auditory attention features from the signal and extracting phoneme posteriors from the signal. The auditory attention features and phoneme posteriors may then be combined to detect boundaries in the signal.
    Type: Grant
    Filed: May 26, 2017
    Date of Patent: August 14, 2018
    Assignee: SONY INTERACTIVE ENTERTAINMENT INC.
    Inventor: Ozlem Kalinli-Akbacak
  • Patent number: 10002259
    Abstract: An always-listening-capable computing device includes a receiver for input from a user, a module for communication with a remote server, and a gate-keeping module that, when enabled, prevents the communication module from transmitting data external to the device. After determining that user input includes a first wake up phrase, the device processor automatically transmits a representation of user input subsequent to the phrase, activates an always-receiving mode to transmit a stream of user input captured subsequent to the phrase, deactivates the always-receiving mode to prevent transmission of user input received subsequent to the phrase, unless also preceded by a second wake up phrase, or enables the gate-keeping module to prevent transmission of data external to the device.
    Type: Grant
    Filed: November 14, 2017
    Date of Patent: June 19, 2018
    Inventor: Xiao Ming Mai
  • Patent number: 9972308
    Abstract: Methods, a system, and a classifier are provided. A method includes preparing, by a processor, pairs for an information retrieval task. Each pair includes (i) a training-stage speech recognition result for a respective sequence of training words and (ii) an answer label corresponding to the training-stage speech recognition result. The method further includes obtaining, by the processor, a respective rank for the answer label included in each pair to obtain a set of ranks. The method also includes determining, by the processor, for each pair, an end of question part in the training-stage speech recognition result based on the set of ranks. The method additionally includes building, by the processor, the classifier such that the classifier receives a recognition-stage speech recognition result and returns a corresponding end of question part for the recognition-stage speech recognition result, based on the end of question part determined for the pairs.
    Type: Grant
    Filed: November 8, 2016
    Date of Patent: May 15, 2018
    Assignee: International Business Machines Corporation
    Inventors: Tohru Nagano, Ryuki Tachibana
  • Patent number: 9954507
    Abstract: Various aspects of this disclosure describe setting an audio compressor threshold using averaged audio measurements. Examples include calculating one or more average values of amplitude values of an audio file, and setting a threshold used in the audio compressor based on the calculated average values. Samples of the audio file with amplitude values above the threshold are attenuated, while samples of the audio file with amplitude values below the threshold are not attenuated. The threshold can be set equal to a calculated average value, or from a function of one or more calculated average values. Different audio channels comprising the audio file can be processed to set a respective compressor threshold for each audio channel.
    Type: Grant
    Filed: August 1, 2016
    Date of Patent: April 24, 2018
    Assignee: ADOBE SYSTEMS INCORPORATED
    Inventor: Matthew Gehring Stegner
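    Illustrative sketch (not part of the patent record): the abstract above sets a compressor threshold from an average of amplitude values and attenuates only samples above it. The sketch below takes the simplest reading, with the threshold equal to the mean absolute amplitude of one channel. The `ratio` parameter and the attenuation formula are assumptions; the abstract only says that samples above the threshold are attenuated.

```python
def average_amplitude(samples):
    """Mean absolute amplitude of one channel."""
    return sum(abs(s) for s in samples) / len(samples)


def compress(samples, ratio=4.0):
    """Attenuate samples whose magnitude exceeds the averaged threshold."""
    if not samples:
        return []
    threshold = average_amplitude(samples)
    out = []
    for s in samples:
        mag = abs(s)
        if mag > threshold:
            # Assumed rule: scale the overshoot above the threshold by 1/ratio.
            mag = threshold + (mag - threshold) / ratio
        out.append(mag if s >= 0 else -mag)
    return out
```

    For a multi-channel file, calling `compress` once per channel gives each channel its own threshold, in line with the last sentence of the abstract.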
  • Patent number: 9947321
    Abstract: Methods and systems for handling speech recognition processing in effectively real-time, via the Internet, in order that users do not experience noticeable delays from the start until they receive responsive feedback. A user uses a client to access the Internet and a server supporting speech recognition processing. The user inputs speech to the client, which transmits the user speech to the server in approximate real-time. The server evaluates the user speech, and provides responsive feedback to the client, again, in approximate real-time, with minimum latency delays. The client upon receiving responsive feedback from the server, displays, or otherwise provides, the feedback to the user.
    Type: Grant
    Filed: February 21, 2017
    Date of Patent: April 17, 2018
    Assignee: PEARSON EDUCATION, INC.
    Inventor: Christopher S. Jochumson
  • Patent number: 9922640
    Abstract: The disclosure describes a system and method for detecting one or more segments of desired speech utterances from an audio stream using timings of events from other modes that are correlated to the timings of the desired segments of speech. The redundant information from other modes results in a highly accurate and robust utterance detection.
    Type: Grant
    Filed: February 3, 2014
    Date of Patent: March 20, 2018
    Inventor: Ashwin P Rao
  • Patent number: 9886943
    Abstract: A method and system for improving the accuracy of a speech recognition system using word confidence score (WCS) processing is introduced. Parameters in a decoder are selected to minimize a weighted total error rate, such that deletion errors are weighted more heavily than substitution and insertion errors. The occurrence distribution in WCS is different depending on whether the word was correctly identified and based on the type of error. This is used to determine thresholds in WCS for insertion and substitution errors. By processing the hypothetical word (HYP) (output of the decoder), a mHYP (modified HYP) is determined. In some circumstances, depending on the WCS's value in relation to insertion and substitution threshold values, mHYP is set equal to: null, a substituted HYP, or HYP.
    Type: Grant
    Filed: January 13, 2017
    Date of Patent: February 6, 2018
    Assignee: Adadel Inc.
    Inventor: Chang-Qing Shu
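    Illustrative sketch (not part of the patent record): the abstract above sets the modified hypothesis (mHYP) to null, a substituted HYP, or HYP depending on how the word confidence score (WCS) compares with insertion and substitution thresholds. The abstract does not give the ordering of the thresholds or the source of the substitute word, so the ordering below (insertion threshold below substitution threshold) and the `substitute` argument are assumptions.

```python
def modified_hyp(hyp, wcs, insertion_threshold, substitution_threshold, substitute):
    """Return mHYP for one decoded word.

    Assumed rule: very low confidence is treated as a likely insertion error
    (drop the word), middling confidence as a likely substitution error
    (swap in `substitute`), and anything else keeps the decoder output.
    """
    if wcs < insertion_threshold:
        return None
    if wcs < substitution_threshold:
        return substitute
    return hyp
```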
  • Patent number: 9806740
    Abstract: A device for data compression includes a processing unit, a temporary memory, and a storage device. The temporary memory is used to temporarily store data to be compressed. The storage device includes multiple physical blocks. Each physical block has the same volume size. The processing unit compresses the to-be-compressed data, generates compressed data, and stores the compressed data into one of the physical blocks. The processing unit compares a data size of the compressed data and a volume size of one physical block, and when the data size of the compressed data is smaller than the volume size, the processing unit stores remnant data into the same physical block as the compressed data, wherein the total data size of the remnant data plus the compressed data is equal to the volume size of the physical block in which both are stored.
    Type: Grant
    Filed: July 19, 2017
    Date of Patent: October 31, 2017
    Assignee: ACCELSTOR, INC.
    Inventors: An-Nan Chang, Shih-Chiang Tsao, Pao-Chien Li, Chih-Kang Nung
  • Patent number: 9805714
    Abstract: The disclosure is directed to a directional keyword verification method and an electronic device using the same method. According to an exemplary embodiment, the proposed keyword verification method would include receiving an audio stream; analyzing the audio stream to obtain at least a word; determining whether the word matches a keyword from a keyword database; assigning the word as a filler if the word does not match the keyword from the keyword database; determining whether a vowel pattern of the word matches the vowel pattern of the keyword if the word matches the keyword from the keyword database; assigning the word as a trigger or command word if the vowel pattern of the word matches the vowel pattern of the keyword; and otherwise assigning the word as a filler if the vowel pattern of the word does not match the vowel pattern of the keyword.
    Type: Grant
    Filed: March 22, 2016
    Date of Patent: October 31, 2017
    Assignee: ASUSTeK COMPUTER INC.
    Inventors: Bhoomek D. Pandya, Hsing-Yu Tsai, Min-Hong Wang, Cheng-Chung Hsu
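The decision flow in the abstract above (keyword lookup, then a vowel-pattern check) can be sketched roughly as follows. This is a minimal illustration and not the patented implementation: the `RecognizedWord` type, the `KEYWORD_VOWELS` table, and the way the vowel pattern is represented are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RecognizedWord:
    text: str      # word hypothesis from the recognizer
    vowels: tuple  # vowel sequence decoded from the audio (hypothetical representation)

# Hypothetical keyword database: keyword -> expected vowel pattern.
KEYWORD_VOWELS = {
    "hello": ("e", "o"),
    "forward": ("o", "a"),
}

def classify(word: RecognizedWord) -> str:
    """Return "keyword" (trigger or command word) or "filler"."""
    expected = KEYWORD_VOWELS.get(word.text)
    if expected is None:
        # The word does not match any keyword in the database.
        return "filler"
    if word.vowels == expected:
        # The text matches and the vowel pattern confirms it.
        return "keyword"
    # The text matched but the vowel pattern did not.
    return "filler"
```

For example, `classify(RecognizedWord("hello", ("a", "o")))` yields `"filler"` because the decoded vowels disagree with the stored pattern, even though the text matches.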
  • Patent number: 9711133
    Abstract: A desired character train included in a predefined reference character train, such as lyrics, is set as a target character train, and a user designates a target phoneme train that is indirectly representative of the target character train by use of a limited plurality of kinds of particular phonemes, such as vowels and particular consonants. A reference phoneme train indirectly representative of the reference character train by use of the particular phonemes is prepared in advance. Based on a comparison between the target phoneme train and the reference phoneme train, a sequence of the particular phonemes in the reference phoneme train that matches the target phoneme train is identified, and a character sequence in the reference character train that corresponds to the identified sequence of the particular phonemes is identified. The thus-identified character sequence estimates the target character train.
    Type: Grant
    Filed: July 29, 2015
    Date of Patent: July 18, 2017
    Assignee: YAMAHA CORPORATION
    Inventor: Kazuhiko Yamamoto
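The matching idea in the abstract above can be illustrated with a toy sketch. It assumes, purely for illustration, that the "particular phonemes" are the English vowel letters and that the reference character train is plain text; the function name `find_by_vowels` is hypothetical and real phoneme conversion is not attempted.

```python
VOWELS = set("aeiou")  # assumption: vowel letters stand in for the particular phonemes

def find_by_vowels(reference: str, target_vowels: str):
    """Return the span of `reference` whose vowel sequence matches `target_vowels`.

    The span runs from the first matched vowel to the last matched vowel.
    Returns None when there is no match or the target is empty.
    """
    if not target_vowels:
        return None
    # Positions of every vowel in the reference text (the reference phoneme train).
    positions = [i for i, ch in enumerate(reference) if ch.lower() in VOWELS]
    ref_vowels = "".join(reference[i].lower() for i in positions)
    start = ref_vowels.find(target_vowels)
    if start == -1:
        return None
    first = positions[start]
    last = positions[start + len(target_vowels) - 1]
    return reference[first:last + 1]
```

With the reference `"twinkle twinkle little star"`, the vowel train is `ieieiea`; a target of `iea` matches only at the end and maps back to the characters between the `i` of "little" and the `a` of "star".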
  • Patent number: 9646603
    Abstract: A method, apparatus, and system are described for a continuous speech recognition engine that includes a fine speech recognizer model, a coarse sound representation generator, and a coarse match generator. The fine speech recognizer model receives a time coded sequence of sound feature frames, applies a speech recognition process to the sound feature frames and determines at least a best guess at each recognizable word that corresponds to the sound feature frames. The coarse sound representation generator generates a coarse sound representation of the recognized word. The coarse match generator determines a likelihood of the coarse sound representation actually being the recognized word based on comparing the coarse sound representation of the recognized word to a database containing the known sound of that recognized word and assigns the likelihood as a robust confidence level parameter to that recognized word.
    Type: Grant
    Filed: February 27, 2009
    Date of Patent: May 9, 2017
    Assignee: LONGSAND LIMITED
    Inventor: Mahapathy Kadirkamanathan
  • Patent number: 9583094
    Abstract: A method and system for improving the accuracy of a speech recognition system using word confidence score (WCS) processing is introduced. Parameters in a decoder are selected to minimize a weighted total error rate, such that deletion errors are weighted more heavily than substitution and insertion errors. The occurrence distribution in WCS is different depending on whether the word was correctly identified and based on the type of error. This is used to determine thresholds in WCS for insertion and substitution errors. By processing the hypothetical word (HYP) (output of the decoder), a mHYP (modified HYP) is determined. In some circumstances, depending on the WCS's value in relation to insertion and substitution threshold values, mHYP is set equal to: null, a substituted HYP, or HYP.
    Type: Grant
    Filed: September 22, 2016
    Date of Patent: February 28, 2017
    Assignee: ADACEL, INC.
    Inventor: Chang-Qing Shu
  • Patent number: 9542945
    Abstract: Methods, systems, and apparatuses, including computer programs encoded on a computer storage medium, for adjusting language models. In one aspect, a method includes accessing audio data. Information that indicates a first context is accessed, the first context being associated with the audio data. At least one term is accessed. Information that indicates a second context is accessed, the second context being associated with the term. A similarity score is determined that indicates a degree of similarity between the second context and the first context. A language model is adjusted based on the accessed term and the determined similarity score to generate an adjusted language model. Speech recognition is performed on the audio data using the adjusted language model to select one or more candidate transcriptions for a portion of the audio data.
    Type: Grant
    Filed: June 10, 2015
    Date of Patent: January 10, 2017
    Assignee: Google Inc.
    Inventor: Matthew I. Lloyd
  • Patent number: 9489371
    Abstract: A method for detecting data in a sequence of characters or text using both a statistical engine and a pattern engine. The statistical engine is trained to recognize certain types of data and the pattern engine is programmed to recognize the grammatical pattern of certain types of data. The statistical engine may scan the sequence of characters to output first data, and the pattern engine may break down the first data into subsets of data. Alternatively, the statistical engine may output items that have a predetermined probability or greater of being a certain type of data and the pattern engine may then detect the data from the output items and/or remove incorrect information from the output items.
    Type: Grant
    Filed: July 12, 2013
    Date of Patent: November 8, 2016
    Assignee: Apple Inc.
    Inventors: Olivier Bonnet, Frederick de Jaeger, Romain Goyet, Jean-Pierre Ciudad
  • Patent number: 9478218
    Abstract: A method and system for improving the accuracy of a speech recognition system using word confidence score (WCS) processing is introduced. Parameters in a decoder are selected to minimize a weighted total error rate, such that deletion errors are weighted more heavily than substitution and insertion errors. The occurrence distribution in WCS is different depending on whether the word was correctly identified and based on the type of error. This is used to determine thresholds in WCS for insertion and substitution errors. By processing the hypothetical word (HYP) (output of the decoder), a mHYP (modified HYP) is determined. In some circumstances, depending on the WCS's value in relation to insertion and substitution threshold values, mHYP is set equal to: null, a substituted HYP, or HYP.
    Type: Grant
    Filed: October 24, 2008
    Date of Patent: October 25, 2016
    Assignee: Adacel, Inc.
    Inventor: Chang-Qing Shu
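The last step of the abstract above (setting mHYP to null, a substituted HYP, or HYP) might look roughly like the sketch below. The abstract does not say how the thresholds are ordered or where the substitute word comes from, so the threshold values, their ordering, and the `alternative` argument are assumptions for illustration only.

```python
from typing import Optional

def modify_hypothesis(
    hyp: str,
    wcs: float,
    alternative: Optional[str],
    insertion_threshold: float = 0.2,     # assumed value
    substitution_threshold: float = 0.5,  # assumed value
) -> Optional[str]:
    """Return mHYP: None (null), a substituted word, or the original HYP."""
    if wcs < insertion_threshold:
        # Very low confidence: treat the word as a likely insertion error and drop it.
        return None
    if wcs < substitution_threshold and alternative is not None:
        # Moderate confidence: treat it as a likely substitution error and swap in
        # the alternative (assumed here to be supplied by the caller).
        return alternative
    # Otherwise keep the decoder output unchanged.
    return hyp
```

For instance, `modify_hypothesis("left", 0.3, "lift")` returns `"lift"`, while the same call with a WCS of 0.9 keeps `"left"`.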
  • Patent number: 9401145
    Abstract: A method for converting speech to text in a speech analytics system is provided. The method includes receiving audio data containing speech made up of sounds from an audio source, processing the sounds with a phonetic module resulting in symbols corresponding to the sounds, and processing the symbols with a language module and occurrence table resulting in text. The method also includes determining a probability of correct translation for each word in the text, comparing the probability of correct translation for each word in the text to the occurrence table, and adjusting the occurrence table based on the probability of correct translation for each word in the text.
    Type: Grant
    Filed: May 5, 2014
    Date of Patent: July 26, 2016
    Assignee: VERINT SYSTEMS LTD.
    Inventors: Omer Ziv, Ran Achituv, Ido Shapira
  • Patent number: 9378728
    Abstract: Methods and systems are provided for gathering research data that includes information pertaining to audio signals received on a portable device, such as a cell phone. Frequency domain data is received or produced, a signature is extracted from the frequency domain data and an ancillary code is read from the frequency domain data.
    Type: Grant
    Filed: April 25, 2014
    Date of Patent: June 28, 2016
    Assignee: The Nielsen Company (US), LLC
    Inventor: Alan R. Neuhauser
  • Patent number: 9330676
    Abstract: A method to filter out speech interference is provided. The method includes defining a time threshold by using a probability distribution model. When a current instruction from a speech input is recognized, a reference instruction recognized from the speech input is obtained. The current instruction is recognized right after the recognition of the reference instruction, wherein the reference instruction and the current instruction correspond to a first time point and a second time point respectively. The method includes determining whether speech interference occurs according to a comparison result of the time threshold and an interval between the first time point and the second time point as well as a state corresponding to the first time point. The method includes filtering out the reference instruction and the current instruction if the speech interference occurs, and outputting the reference instruction or the current instruction if the speech interference does not occur.
    Type: Grant
    Filed: October 18, 2013
    Date of Patent: May 3, 2016
    Assignee: Wistron Corporation
    Inventor: Hsi-Chun Hsiao
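One way to read the abstract above is sketched below. The abstract only mentions "a probability distribution model"; the sketch assumes a normal distribution fitted to intervals observed during normal use, with a low quantile used as the time threshold. The boolean `followup_expected` is a hypothetical stand-in for the "state corresponding to the first time point", and returning only the current instruction when no interference is found is also an assumption.

```python
from statistics import NormalDist, mean, stdev

def time_threshold(normal_intervals, quantile=0.05):
    """Fit a normal model to known-good intervals and return a low quantile of it."""
    model = NormalDist(mean(normal_intervals), stdev(normal_intervals))
    return model.inv_cdf(quantile)

def is_interference(t_ref, t_cur, threshold, followup_expected):
    """Flag interference when two instructions arrive implausibly close together."""
    return (t_cur - t_ref) < threshold and not followup_expected

def filter_instructions(ref, cur, threshold, followup_expected):
    """`ref` and `cur` are (time_in_seconds, instruction) tuples.

    Returns the list of instructions to output: empty when interference is
    detected (both instructions are filtered out), else the current instruction.
    """
    if is_interference(ref[0], cur[0], threshold, followup_expected):
        return []
    return [cur[1]]
```

A threshold fitted on intervals such as `[2.0, 3.0, 4.0, 3.0, 2.5, 3.5]` would reject two instructions recognized 0.4 seconds apart, unless the state indicates that a quick follow-up was expected.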
  • Patent number: 9324316
    Abstract: There is provided a prosody generator that generates prosody information for implementing highly natural speech synthesis without unnecessarily collecting large quantities of learning data. A data dividing means 81 divides into subspaces the data space of a learning database as an assembly of learning data indicative of the feature quantities of speech waveforms. A density information extracting means 82 extracts density information indicative of the density state in terms of information quantity of the learning data in each of the subspaces divided by the data dividing means 81. A prosody information generating method selecting means 83 selects either a first method or a second method as a prosody information generating method based on the density information, the first method involving generating the prosody information using a statistical technique, the second method involving generating the prosody information using rules based on heuristics.
    Type: Grant
    Filed: May 10, 2012
    Date of Patent: April 26, 2016
    Assignee: NEC CORPORATION
    Inventors: Yasuyuki Mitsui, Reishi Kondo, Masanori Kato
  • Patent number: 9318103
    Abstract: An automatic speech recognition system for recognizing a user voice command in noisy environment, including: matching means for matching elements retrieved from speech units forming said command with templates in a template library; characterized by processing means including a MultiLayer Perceptron for computing posterior templates (P(Otemplate(q))) stored as said templates in said template library; means for retrieving posterior vectors (P(Otest(q))) from said speech units, said posterior vectors being used as said elements. The present invention relates also to a method for recognizing a user voice command in noisy environments.
    Type: Grant
    Filed: February 21, 2013
    Date of Patent: April 19, 2016
    Assignee: VEOVOX SA
    Inventors: John Dines, Jorge Carmona, Olivier Masson, Guillermo Aradilla
  • Patent number: 9263061
    Abstract: Methods and systems are provided for detecting chop in an audio signal. A time-frequency representation, such as a spectrogram, is created for an audio signal and used to calculate a gradient of mean power per frame of the audio signal. Positive and negative gradients are defined for the signal based on the gradient of mean power, and a maximum overlap offset between the positive and negative gradients is determined by calculating a value that maximizes the cross-correlation of the positive and negative gradients. The negative gradient values may be combined (e.g., summed) with the overlap offset, and the combined values then compared with a threshold to estimate the amount of chop present in the audio signal. The chop detection model provided is low-complexity and is applicable to narrowband, wideband, and superwideband speech.
    Type: Grant
    Filed: May 21, 2013
    Date of Patent: February 16, 2016
    Assignee: GOOGLE INC.
    Inventors: Andrew J. Hines, Jan Skoglund, Naomi Harte, Anil Kokaram
  • Patent number: 9117460
    Abstract: The present invention relates to speech recognition systems, especially to arranging detection of end-of-utterance in such systems. A speech recognizer of the system is configured to determine whether recognition result determined from received speech data is stabilized. The speech recognizer is configured to process values of best state scores and best token scores associated with frames of received speech data for end of utterance detection purposes. Further, the speech recognizer is configured to determine whether end of utterance is detected or not, based on the processing, if the recognition result is stabilized.
    Type: Grant
    Filed: May 12, 2004
    Date of Patent: August 25, 2015
    Assignee: Core Wireless Licensing S.A.R.L.
    Inventor: Tommi Lahti
  • Patent number: 9099098
    Abstract: In speech processing systems, compensation is made for sudden changes in the background noise in the average signal-to-noise ratio (SNR) calculation. SNR outlier filtering may be used, alone or in conjunction with weighting the average SNR. Adaptive weights may be applied on the SNRs per band before computing the average SNR. The weighting function can be a function of noise level, noise type, and/or instantaneous SNR value. Another weighting mechanism applies a null filtering or outlier filtering which sets the weight in a particular band to be zero. This particular band may be characterized as the one that exhibits an SNR that is several times higher than the SNRs in other bands.
    Type: Grant
    Filed: November 6, 2012
    Date of Patent: August 4, 2015
    Assignee: QUALCOMM Incorporated
    Inventors: Venkatraman Srinivasa Atti, Venkatesh Krishnan
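The null (outlier) filtering described in the abstract above can be sketched as a weighted average in which a band's weight drops to zero when its SNR is several times higher than the others. The comparison against the median of the remaining bands and the factor of 4 are assumptions chosen for illustration; the abstract does not specify either. SNR values are taken as linear, positive ratios.

```python
from statistics import median

def average_snr(band_snrs, outlier_factor=4.0):
    """Average per-band SNRs, giving zero weight to bands that look like outliers."""
    band_snrs = list(band_snrs)
    weights = []
    for i, snr in enumerate(band_snrs):
        others = band_snrs[:i] + band_snrs[i + 1:]
        # A band is an outlier if its SNR is several times the median of the rest.
        is_outlier = bool(others) and snr > outlier_factor * median(others)
        weights.append(0.0 if is_outlier else 1.0)
    total = sum(weights)
    if total == 0:
        return 0.0  # no usable bands (for example, empty input)
    return sum(w * s for w, s in zip(weights, band_snrs)) / total
```

With bands `[2.0, 3.0, 2.5, 40.0]`, the last band is excluded and the average is taken over the first three. The binary weights could be replaced by adaptive weights that depend on noise level or noise type, as the abstract also mentions.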
  • Patent number: 9076453
    Abstract: The present invention relates to a postfilter and a postfilter control to be associated with a postfilter for improving perceived quality of speech reconstructed at a speech decoder. The postfilter control comprises means for measuring stationarity of a speech signal reconstructed at a decoder, means for determining a coefficient to a postfilter control parameter based on the measured stationarity, and means for transmitting the determined coefficient to a postfilter, such that the postfilter can process the reconstructed speech signal by applying the determined coefficient to the postfilter control parameter to obtain an enhanced speech signal.
    Type: Grant
    Filed: May 15, 2014
    Date of Patent: July 7, 2015
    Assignee: TELEFONAKTIEBOLAGET LM ERICSSON (PUBL)
    Inventor: Volodya Grancharov