Correlation Patents (Class 704/237)
  • Patent number: 11813076
    Abstract: The present invention provides systems, methods, and articles for stress reduction and sleep promotion. A stress reduction and sleep promotion system includes at least one remote device, at least one body sensor, and at least one remote server. In other embodiments, the stress reduction and sleep promotion system includes machine learning.
    Type: Grant
    Filed: November 18, 2019
    Date of Patent: November 14, 2023
    Assignee: SLEEPME INC.
    Inventors: Todd Youngblood, Tara Youngblood
  • Patent number: 11354808
    Abstract: An image processing apparatus including a unit configured to acquire a current image from an input video and a background model which comprises a background image and foreground/background classification information of visual elements; a unit configured to determine first similarity measures between visual elements in the current image and visual elements in the background model; and a unit configured to classify the visual elements in the current image as foreground or background according to the current image, the background image in the background model, and the first similarity measures. The visual elements used from the background model are those whose classification information is background and which neighbour the corresponding portions of the visual elements in the current image. Accordingly, the accuracy of foreground detection can be improved.
    Type: Grant
    Filed: September 24, 2018
    Date of Patent: June 7, 2022
    Assignee: Canon Kabushiki Kaisha
    Inventors: Qin Yang, Tsewei Chen
  • Patent number: 11282530
    Abstract: Methods, an encoder and a decoder are configured for transition between frames with different internal sampling rates. Linear predictive (LP) filter parameters are converted from a sampling rate S1 to a sampling rate S2. A power spectrum of a LP synthesis filter is computed, at the sampling rate S1, using the LP filter parameters. The power spectrum of the LP synthesis filter is modified to convert it from the sampling rate S1 to the sampling rate S2. The modified power spectrum of the LP synthesis filter is inverse transformed to determine autocorrelations of the LP synthesis filter at the sampling rate S2. The autocorrelations are used to compute the LP filter parameters at the sampling rate S2.
    Type: Grant
    Filed: October 7, 2019
    Date of Patent: March 22, 2022
    Assignee: VOICEAGE EVS LLC
    Inventors: Redwan Salami, Vaclav Eksler
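The conversion pipeline in this abstract (sample the LP synthesis filter's power spectrum, modify it for the new rate, inverse transform to autocorrelations, then re-derive LP parameters) can be sketched in plain Python. The function names and the crude band-truncation step are illustrative assumptions, not the codec's actual method:

```python
import math

def lp_power_spectrum(a, n_points):
    """Sample the LP synthesis filter power spectrum 1/|A(e^jw)|^2 on a uniform
    grid, with A(z) = 1 - a[0]z^-1 - a[1]z^-2 - ..."""
    spec = []
    for m in range(n_points):
        w = 2.0 * math.pi * m / n_points
        re = 1.0 - sum(ak * math.cos(w * k) for k, ak in enumerate(a, 1))
        im = sum(ak * math.sin(w * k) for k, ak in enumerate(a, 1))
        spec.append(1.0 / (re * re + im * im))
    return spec

def spectrum_to_autocorr(spec, order):
    """Inverse transform of the (real, symmetric) power spectrum to
    autocorrelations r[0..order]."""
    n = len(spec)
    return [sum(spec[m] * math.cos(2.0 * math.pi * m * k / n) for m in range(n)) / n
            for k in range(order + 1)]

def levinson_durbin(r, order):
    """Recover LP filter parameters from autocorrelations via the standard
    Levinson-Durbin recursion."""
    a, err = [0.0] * order, r[0]
    for i in range(order):
        k = (r[i + 1] - sum(a[j] * r[i - j] for j in range(i))) / err
        a = [a[j] - k * a[i - 1 - j] for j in range(i)] + [k] + a[i + 1:]
        err *= (1.0 - k * k)
    return a

def resample_power_spectrum(spec, s1, s2):
    """Crude sketch of the 'modify' step for downsampling s1 -> s2: keep the
    low band and mirror it back into a full symmetric circle."""
    n2 = int(round(len(spec) * s2 / s1))
    half = n2 // 2
    low = spec[:half + 1]
    return low + low[-2:0:-1]
```

A same-rate round trip (spectrum, then autocorrelations, then Levinson-Durbin) recovers the original LP parameters; a real codec would also interpolate and rescale the modified spectrum rather than simply truncate it.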
  • Patent number: 11079594
    Abstract: Disclosed is a head-up display device. The head-up display device includes: a host (1), configured to send an image signal to a display device (2); a display device (2) provided at one side of the host (1), configured to display the image signal; a curved semi-transparent reflecting screen (3) provided in a position opposite to the display device (2), configured to receive an image displayed by the display device (2), and display the image on the other side of the semi-transparent reflecting screen (3) at one side of the host (1), wherein the semi-transparent reflecting screen (3) is further configured to display the image after ghost is removed; a voice interaction unit (20) pre-configured in the host (1), and a driver detection device (12) mounted in the host (1). The head-up display device supports voice interaction throughout a process, driver real-time monitoring, and advance warning, and has good display effect.
    Type: Grant
    Filed: November 4, 2016
    Date of Patent: August 3, 2021
    Assignee: BEIJING ILEJA TECH. CO. LTD.
    Inventors: Yanlong Wang, Shunping Miao, Jianhui Wang, Jianfeng Min
  • Patent number: 11069366
    Abstract: A method for evaluating performance of a speech enhancement algorithm includes: acquiring a first speech signal including noise and a second speech signal including noise, wherein the first speech signal is acquired from a near-end audio acquisition device close to a sound source, the second speech signal is acquired from a far-end audio acquisition device far from the sound source, and the near-end audio acquisition device is closer to the sound source than the far-end audio acquisition device; acquiring a pseudo-pure speech signal based on the first speech signal and the second speech signal, as a reference speech signal; enhancing the second speech signal by using a preset speech enhancement algorithm, to obtain a denoised speech signal to be tested; and acquiring a correlation coefficient between the reference speech signal and the speech signal to be tested, for evaluating the speech enhancement algorithm.
    Type: Grant
    Filed: May 13, 2020
    Date of Patent: July 20, 2021
    Assignee: Beijing Xiaomi Mobile Software Co., Ltd.
    Inventors: Yuhong Yang, Linjun Cai, Fei Xiang, Shicong Li, Jiaqian Feng, Weiping Tu, Haojun Ai
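The evaluation metric in this abstract reduces to a correlation coefficient between the reference (pseudo-pure) signal and the denoised signal under test. A minimal sketch, assuming both signals are equal-length sequences of samples:

```python
import math

def pearson_corr(x, y):
    """Pearson correlation coefficient between two equal-length signals;
    values near 1.0 indicate the denoised signal tracks the reference."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

In the patent's scheme the score would be computed between the pseudo-pure reference and each candidate enhancement algorithm's output, with higher correlation indicating better enhancement.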
  • Patent number: 10966034
    Abstract: A method of training an algorithm for optimizing intelligibility of speech components of a sound signal in hearing aids, headsets, etc.
    Type: Grant
    Filed: January 16, 2019
    Date of Patent: March 30, 2021
    Assignee: OTICON A/S
    Inventors: Asger Heidemann Andersen, Jan M. De Haan, Jesper Jensen
  • Patent number: 10916344
    Abstract: A device receives historical information associated with an individual to be monitored, wherein the historical information includes at least one of information associated with a health history of the individual, health histories of other individuals, activities of the individual, or activities of the other individuals. The device receives monitored information associated with the individual from one or more client devices associated with the individual, and pre-processes the monitored information to generate pre-processed monitored information that is understood by a trained machine learning model. The device processes the pre-processed monitored information, with the trained machine learning model, to identify one or more activities of the individual and one or more deviations from the one or more activities by the individual, and performs one or more actions based on the one or more activities of the individual and/or the one or more deviations from the one or more activities.
    Type: Grant
    Filed: November 14, 2018
    Date of Patent: February 9, 2021
    Assignee: Accenture Global Solutions Limited
    Inventors: Laetitia Cailleteau Eriksson, Kar Lok Chan, Faisal Ahmed Valli, Christopher Paul Ashley
  • Patent number: 10909623
    Abstract: A method and apparatus use hardware logic deployed on a reconfigurable logic device to process a stream of financial information at hardware speeds. The hardware logic can be configured to perform data reduction operations on the financial information stream. Examples of such data reduction operations include data processing operations to compute a latest stock price, a minimum stock price, and a maximum stock price.
    Type: Grant
    Filed: November 21, 2011
    Date of Patent: February 2, 2021
    Assignee: IP Reservoir, LLC
    Inventors: Ronald S. Indeck, Ron Kaplan Cytron, Mark Allen Franklin, Roger D. Chamberlain
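The data-reduction operations named in this abstract (latest, minimum, and maximum price) map to a simple streaming reducer. This hypothetical `TickReducer` only illustrates the reduction logic; the patent implements it in reconfigurable hardware, not software:

```python
class TickReducer:
    """Streaming data reduction: track the latest, minimum and maximum
    price seen so far for each symbol."""
    def __init__(self):
        self.state = {}  # symbol -> (latest, min, max)

    def update(self, symbol, price):
        if symbol not in self.state:
            self.state[symbol] = (price, price, price)
        else:
            _, lo, hi = self.state[symbol]
            self.state[symbol] = (price, min(lo, price), max(hi, price))
        return self.state[symbol]
```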
  • Patent number: 10867611
    Abstract: A low power sound recognition sensor is configured to receive an analog signal that may contain a signature sound. Sparse sound parameter information is extracted from the analog signal. The extracted sparse sound parameter information is processed using a speaker dependent sound signature database stored in the sound recognition sensor to identify sounds or speech contained in the analog signal. The sound signature database may include several user enrollments for a sound command each representing an entire word or multiword phrase. The extracted sparse sound parameter information may be compared to the multiple user enrolled signatures using cosine distance, Euclidean distance, correlation distance, etc., for example.
    Type: Grant
    Filed: August 11, 2016
    Date of Patent: December 15, 2020
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventor: Bozhao Tan
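The matching step in this abstract compares extracted sparse sound parameters against multiple user-enrolled signatures using distances such as cosine or Euclidean. A minimal sketch with hypothetical enrollment records (the `label`/`vector` fields are illustrative, not the patent's data layout):

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity; smaller means a closer signature match."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (nu * nv)

def euclidean_distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def best_enrollment(features, enrollments):
    """Return the enrolled signature closest (cosine) to the extracted
    sparse sound parameter vector."""
    return min(enrollments, key=lambda e: cosine_distance(features, e["vector"]))
```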
  • Patent number: 10783887
    Abstract: An application processor is provided. The application processor includes a system bus, a host processor, a voice trigger system, and an interrupt pad. The host processor is electrically connected to the system bus. The voice trigger system is electrically connected to the system bus and performs a voice trigger operation and issues a trigger event based on a trigger input signal that is provided through a trigger interface. The interrupt pad receives a first interrupt signal in response to an operating environment changing from a low noise environment to a noisy environment, and a part of the voice trigger system is changed from an idle state to a normal state to perform the voice trigger operation in response to the first interrupt signal being received.
    Type: Grant
    Filed: November 16, 2018
    Date of Patent: September 22, 2020
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventor: Sun-Kyu Kim
  • Patent number: 10713262
    Abstract: Approaches are described for ranking multiple products or other items, such as products obtained in response to a search request submitted to a server. The ranking system determines a ranking score for the products based on both data available online and item data that must be computed offline due to longer computation time or unavailability of data. The ranking score can be used to rank the products and determine which products are the most relevant to the user. A hybrid boosting method is used to first train an online ranking function to produce an online ranking score for the item. In the second phase, an offline ranking function is trained to produce a second ranking score for the item. The online ranking score is combined with the offline ranking score at query time to produce a combined rank for the items in the search results.
    Type: Grant
    Filed: October 26, 2016
    Date of Patent: July 14, 2020
    Assignee: A9.com, Inc.
    Inventors: Yue Zhou, Francois Huet
  • Patent number: 10693905
    Abstract: An invalidity detection electronic control unit connected to a bus used by a plurality of electronic control units (ECUs) to communicate with one another in accordance with controller area network (CAN) protocol includes a receiving unit that receives a frame for which transmission is started and a transmitting unit that transmits an error frame on the bus before a tail end of the frame is transmitted if the frame received by the receiving unit meets a predetermined condition indicating invalidity and transmits a normal frame that conforms to the CAN protocol after the error frame is transmitted. Even when a reception error counter of the ECU connected to the bus is incremented due to the impact of the error frame, the reception error counter is decremented by the normal frame.
    Type: Grant
    Filed: January 25, 2018
    Date of Patent: June 23, 2020
    Assignee: PANASONIC INTELLECTUAL PROPERTY CORPORATION OF AMERICA
    Inventors: Hiroshi Amano, Toshihisa Nakano, Natsume Matsuzaki, Tomoyuki Haga, Yoshihiro Ujiie, Takeshi Kishikawa
  • Patent number: 10629226
    Abstract: System and method for acoustic signal processing are disclosed. An exemplary device for acoustic signal processing includes a voice activity detector configured to detect a speech of a user. The device includes a microphone configured to receive acoustic signals from the user. The device further includes at least one processor configured to process the acoustic signals in response to detecting the speech of the user. The at least one processor is in an idle state before the speech of the user is detected.
    Type: Grant
    Filed: April 29, 2019
    Date of Patent: April 21, 2020
    Assignee: BESTECHNIC (SHANGHAI) CO., LTD.
    Inventors: Weifeng Tong, Qian Li, Liang Zhang
  • Patent number: 10199053
    Abstract: A method and apparatus for eliminating popping sounds at the beginning of audio includes: examining audio frames within a pre-set time period at the beginning of audio to determine a popping residing section; applying popping elimination to audio frames in the popping residing section; calculating an average value of amplitudes of M audio frames preceding the popping residing section and an average value of amplitudes of K audio frames succeeding the popping residing section; setting the amplitudes of the audio frames in the popping residing section to zero in response to a determination that the two average values are both smaller than a pre-set sound reduction threshold; weakening the amplitudes of the audio frames in the popping residing section in response to a determination that both the two average values are not smaller than a pre-set sound reduction threshold; M and K are integers larger than one.
    Type: Grant
    Filed: September 20, 2017
    Date of Patent: February 5, 2019
    Assignee: Tencent Technology (Shenzhen) Company Limited
    Inventor: Lingcheng Kong
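The decision rule in this abstract (zero the pop section when both surrounding averages are below a reduction threshold, otherwise only weaken it) can be sketched directly. For simplicity each frame is represented by a single average-amplitude value, and the attenuation factor is an assumed parameter:

```python
def suppress_pop(frames, start, end, m, k, threshold, atten=0.1):
    """Zero or attenuate frames[start:end] (the popping section) based on the
    average amplitude of the M frames before it and the K frames after it."""
    before = frames[max(0, start - m):start]
    after = frames[end:end + k]
    avg_before = sum(abs(f) for f in before) / len(before)
    avg_after = sum(abs(f) for f in after) / len(after)
    if avg_before < threshold and avg_after < threshold:
        factor = 0.0    # silence on both sides: remove the pop entirely
    else:
        factor = atten  # audible context: only weaken the pop
    out = frames[:]
    for i in range(start, end):
        out[i] *= factor
    return out
```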
  • Patent number: 10157619
    Abstract: A method and a device for searching according to speech based on artificial intelligence are provided. The method includes: identifying an input speech of a user to determine whether the input speech is child speech; filtering a search result obtained according to the input speech to obtain a filtered search result, if the input speech is child speech; and feeding the filtered search result back to the user.
    Type: Grant
    Filed: November 28, 2017
    Date of Patent: December 18, 2018
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Chao Li, Xiangang Li, Jue Sun
  • Patent number: 10134425
    Abstract: A system for determining an endpoint of an utterance during automatic speech recognition (ASR) processing that accounts for the direction and duration of the incoming speech. Beamformers of the ASR system may identify a source direction of the audio. The system may track the duration for which speech has been received from that source direction so that, if speech is detected in another direction, the original source speech may be weighted differently for purposes of determining an endpoint of the utterance. Speech from a new direction may be discarded or treated like non-speech for purposes of determining an endpoint of speech from an original direction.
    Type: Grant
    Filed: June 29, 2015
    Date of Patent: November 20, 2018
    Assignee: Amazon Technologies, Inc.
    Inventor: Charles Melvin Johnson, Jr.
  • Patent number: 10121471
    Abstract: An automatic speech recognition (ASR) system detects an endpoint of an utterance using the active hypotheses under consideration by a decoder. The ASR system calculates the amount of non-speech detected by a plurality of hypotheses and weights the non-speech duration by the probability of each hypothesis. When the aggregate weighted non-speech exceeds a threshold, an endpoint may be declared.
    Type: Grant
    Filed: June 29, 2015
    Date of Patent: November 6, 2018
    Assignee: Amazon Technologies, Inc.
    Inventors: Bjorn Hoffmeister, Ariya Rastrow, Baiyang Liu
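The endpointing rule in this abstract (weight each hypothesis's trailing non-speech duration by its probability, then threshold the aggregate) is a one-liner in spirit. A minimal sketch with an assumed `(probability, trailing_non_speech_frames)` representation of each active hypothesis:

```python
def endpoint_detected(hypotheses, threshold_frames):
    """hypotheses: (probability, trailing_non_speech_frames) pairs for the
    decoder's active hypotheses. Declare an endpoint when the
    probability-weighted non-speech duration exceeds the threshold."""
    total = sum(p for p, _ in hypotheses)
    weighted = sum(p * frames for p, frames in hypotheses) / total
    return weighted > threshold_frames
```

The normalization by `total` keeps the rule stable when hypothesis scores are unnormalized; whether the patent normalizes this way is an assumption of the sketch.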
  • Patent number: 9905240
    Abstract: Systems, methods, and devices for intelligent speech recognition and processing are disclosed. According to one embodiment, a method for improving intelligibility of a speech signal may include (1) at least one processor receiving an incoming speech signal comprising a plurality of sound elements; (2) the at least one processor recognizing a sound element in the incoming speech signal to improve the intelligibility thereof; (3) the at least one processor processing the sound element by at least one of modifying and replacing the sound element; and (4) the at least one processor outputting the processed speech signal comprising the processed sound element.
    Type: Grant
    Filed: October 19, 2015
    Date of Patent: February 27, 2018
    Assignee: AUDIMAX, LLC
    Inventor: Harry Levitt
  • Patent number: 9690776
    Abstract: Methods and systems are provided for contextual language understanding. A natural language expression may be received at a single-turn model and a multi-turn model for determining an intent of a user. For example, the single-turn model may determine a first prediction of at least one of a domain classification, intent classification, and slot type of the natural language expression. The multi-turn model may determine a second prediction of at least one of a domain classification, intent classification, and slot type of the natural language expression. The first prediction and the second prediction may be combined to produce a final prediction relative to the intent of the natural language expression. An action may be performed based on the final prediction of the natural language expression.
    Type: Grant
    Filed: December 1, 2014
    Date of Patent: June 27, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Ruhi Sarikaya, Puyang Xu, Alexandre Rochette, Asli Celikyilmaz
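The combination step in this abstract (merge the single-turn and multi-turn predictions into a final intent prediction) could be as simple as linear interpolation of per-intent scores. The `alpha` weight and the score-dictionary representation are assumptions of this sketch, not the patent's combination method:

```python
def combine_predictions(single_turn, multi_turn, alpha=0.5):
    """Linearly interpolate per-intent scores from the two models; the
    highest combined score becomes the final intent prediction."""
    intents = set(single_turn) | set(multi_turn)
    combined = {i: alpha * single_turn.get(i, 0.0)
                   + (1.0 - alpha) * multi_turn.get(i, 0.0)
                for i in intents}
    return max(combined, key=combined.get), combined
```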
  • Patent number: 9484022
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a deep neural network. One of the methods includes generating a plurality of feature vectors that each model a different portion of an audio waveform, generating a first posterior probability vector for a first feature vector using a first neural network, determining whether one of the scores in the first posterior probability vector satisfies a first threshold value, generating a second posterior probability vector for each subsequent feature vector using a second neural network, wherein the second neural network is trained to identify the same key words and key phrases and includes more inner layer nodes than the first neural network, and determining whether one of the scores in the second posterior probability vector satisfies a second threshold value.
    Type: Grant
    Filed: May 23, 2014
    Date of Patent: November 1, 2016
    Assignee: Google Inc.
    Inventor: Alexander H. Gruenstein
  • Patent number: 9247376
    Abstract: A method of recommending an application, which is capable of selecting and recommending an application with a high possibility of use, the method including: receiving, in a server, frequencies of use of a plurality of applications that are classified according to a time when each application is executed or a location where each application is executed; selecting an application from among the plurality of applications based on time and location information of where a mobile terminal is located and the frequency of use of the application; and transmitting application recommendation information including the selected application from the server to the mobile terminal.
    Type: Grant
    Filed: April 11, 2012
    Date of Patent: January 26, 2016
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Ji-in Nam, Moon-sang Lee, Min-soo Koo, Seung-hyun Yoon
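The selection step in this abstract ranks applications by their frequency of use in the terminal's current time-and-location context. A minimal sketch, assuming usage counts are already bucketed by `(hour, location)` pairs (the bucketing scheme is illustrative):

```python
def recommend(usage, hour, location, top_n=2):
    """usage: {app: {(hour, location): count}}. Rank apps by how often they
    were used in the current context and return the top candidates."""
    scores = {app: buckets.get((hour, location), 0)
              for app, buckets in usage.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_n]
```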
  • Patent number: 9245529
    Abstract: A method of encoding samples in a digital signal is provided that includes receiving a plurality of samples of the digital signal, and encoding the plurality of samples, wherein an output number of bits is adapted for coding efficiency when a value in a range of possible distinct data values of the plurality of samples is not found in the plurality of samples.
    Type: Grant
    Filed: June 18, 2010
    Date of Patent: January 26, 2016
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Jacek Piotr Stachurski, Lorin Paul Netsch
  • Patent number: 9123330
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving audio data encoding ambient sounds, identifying media content that matches the audio data, and a timestamp corresponding to a particular portion of the identified media content, identifying a speaker associated with the particular portion of the identified media content corresponding to the timestamp, and providing information identifying the speaker associated with the particular portion of the identified media content for output.
    Type: Grant
    Filed: May 1, 2013
    Date of Patent: September 1, 2015
    Assignee: Google Inc.
    Inventors: Matthew Sharifi, Dominik Roblek
  • Patent number: 9117458
    Abstract: A method of processing an audio signal is disclosed. The method comprises: receiving, by an audio processing apparatus, spectral data including a current block, and substitution type information indicating whether a shape prediction scheme is applied to the current block; when the substitution type information indicates that the shape prediction scheme is applied to the current block, receiving lag information indicating an interval between the spectral coefficients of the current block and a predictive shape vector of the current frame or a previous frame; and obtaining spectral coefficients by substituting for a spectral hole included in the current block using the predictive shape vector.
    Type: Grant
    Filed: November 2, 2010
    Date of Patent: August 25, 2015
    Assignees: LG Electronics Inc., Industry-Academic Cooperation Foundation, Yonsei University
    Inventors: Hyen-O Oh, Chang Heon Lee, Hong Goo Kang
  • Patent number: 9020817
    Abstract: Methods and apparatus, including computer program products, for using speech to text for detecting commercials and aligning edited episodes with transcripts. A method includes, receiving an original video or audio having a transcript, receiving an edited video or audio of the original video or audio, applying a speech-to-text process to the received original video or audio having a transcript, applying a speech-to-text process to the received edited video or audio, and applying an alignment to determine locations of the edits.
    Type: Grant
    Filed: January 18, 2013
    Date of Patent: April 28, 2015
    Assignee: Ramp Holdings, Inc.
    Inventors: R Paul Johnson, Raymond Lau
  • Publication number: 20150088508
    Abstract: In embodiments, apparatuses, methods and storage media are described that are associated with training adaptive speech recognition systems (“ASR”) using audio and text obtained from captioned video. In various embodiments, the audio and caption may be aligned for identification, such as according to a start and end time associated with a caption, and the alignment may be adjusted to better fit audio to a given caption. In various embodiments, the aligned audio and caption may then be used for training if an error value associated with the audio and caption demonstrates that the audio and caption will aid in training the ASR. In various embodiments, filters may be used on audio and text prior to training. Such filters may be used to exclude potential training audio and text based on filter criteria. Other embodiments may be described and claimed.
    Type: Application
    Filed: September 25, 2013
    Publication date: March 26, 2015
    Applicant: Verizon Patent and Licensing Inc.
    Inventors: Sujeeth S. Bharadwaj, Suri B. Medapati
  • Patent number: 8942978
    Abstract: Parameters for distributions of a hidden trajectory model, including means and variances, are estimated using an acoustic likelihood function for observation vectors as an objective function for optimization. The estimation uses only acoustic data and not any intermediate estimates of hidden dynamic variables. Gradient ascent methods can be developed for optimizing the acoustic likelihood function.
    Type: Grant
    Filed: July 14, 2011
    Date of Patent: January 27, 2015
    Assignee: Microsoft Corporation
    Inventors: Li Deng, Dong Yu, Xiaolong Li, Alejandro Acero
  • Publication number: 20150012273
    Abstract: An apparatus includes a function module, a strength module, and a filter module. The function module compares an input signal, which has a component, to a first delayed version of the input signal and a second delayed version of the input signal to produce a multi-dimensional model. The strength module calculates a strength of each extremum from a plurality of extrema of the multi-dimensional model based on a value of at least one opposite extremum of the multi-dimensional model. The strength module then identifies a first extremum from the plurality of extrema, which is associated with a pitch of the component of the input signal, that has the strength greater than the strength of the remaining extrema. The filter module extracts the pitch of the component from the input signal based on the strength of the first extremum.
    Type: Application
    Filed: March 3, 2014
    Publication date: January 8, 2015
    Applicant: University Of Maryland, College Park
    Inventors: Carol Espy-Wilson, Srikanth Vishnubhotla
  • Publication number: 20150012275
    Abstract: A semiconductor integrated circuit device for speech recognition includes a scenario setting unit that receives a command designating scenario flow information and selects prescribed speech reproduction data in a speech reproduction data storage and a prescribed conversion list, in accordance with the scenario flow information, a standard pattern extraction unit that extracts a standard pattern corresponding to at least part of individual words or sentences included in the prescribed conversion list from a speech recognition database, a speech signal synthesizer that synthesizes an output speech signal, a signal processor that generates a feature pattern representing the distribution state of the frequency component of an input speech signal, and a match detector that compares the feature pattern with the standard pattern and outputs a speech recognition result.
    Type: Application
    Filed: July 7, 2014
    Publication date: January 8, 2015
    Inventor: Tsutomu NONAKA
  • Publication number: 20150012274
    Abstract: An apparatus for extracting features for speech recognition in accordance with the present invention includes: a frame forming portion configured to separate input speech signals in frame units having a prescribed size; a static feature extracting portion configured to extract a static feature vector for each frame of the speech signals; a dynamic feature extracting portion configured to extract a dynamic feature vector representing a temporal variance of the extracted static feature vector by use of a basis function or a basis vector; and a feature vector combining portion configured to combine the extracted static feature vector with the extracted dynamic feature vector to configure a feature vector stream.
    Type: Application
    Filed: May 15, 2014
    Publication date: January 8, 2015
    Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Sung-Joo LEE, Byung-Ok Kang, Hoon Chung, Ho-Young Jung, Hwa-Jeon Song, Yoo-Rhee Oh, Yun-Keun Lee
  • Patent number: 8924216
    Abstract: A method for synchronizing sound data and text data, said text data being obtained by manual transcription of said sound data during playback of the latter. The proposed method comprises the steps of repeatedly querying said sound data and said text data to obtain a current time position corresponding to a currently played sound datum and a currently transcribed text datum, respectively, correcting said current time position by applying a time correction value in accordance with a transcription delay, and generating at least one association datum indicative of a synchronization association between said corrected time position and said currently transcribed text datum. Thus, the proposed method achieves cost-effective synchronization of sound and text in connection with the manual transcription of sound data.
    Type: Grant
    Filed: September 27, 2013
    Date of Patent: December 30, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Andreas Neubacher, Miklos Papai
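The synchronization method in this abstract polls the current audio position and the currently transcribed text, then subtracts a transcription-delay correction before emitting association data. A minimal sketch; the dictionary shape of the association data and the clamping at zero are assumptions:

```python
def synchronize(samples, delay):
    """samples: (current_audio_time, transcribed_text) pairs captured by
    repeatedly querying the player and the editor. Applies the
    transcription-delay correction and emits association data."""
    return [{"time": max(0.0, t - delay), "text": text} for t, text in samples]
```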
  • Patent number: 8892436
    Abstract: A method of recognizing speech is provided. The method includes the operations of (a) dividing first speech that is input to a speech recognizing apparatus into frames; (b) converting the frames of the first speech into frames of second speech by applying conversion rules to the divided frames, respectively; and (c) recognizing, by the speech recognizing apparatus, the frames of the second speech, wherein (b) comprises converting the frames of the first speech into the frames of the second speech by reflecting at least one frame from among the frames that are previously positioned with respect to a frame of the first speech.
    Type: Grant
    Filed: October 19, 2011
    Date of Patent: November 18, 2014
    Assignees: Samsung Electronics Co., Ltd., Seoul National University Industry Foundation
    Inventors: Ki-wan Eom, Chang-woo Han, Tae-gyoon Kang, Nam-soo Kim, Doo-hwa Hong, Jae-won Lee, Hyung-joon Lim
  • Patent number: 8825480
    Abstract: A system is provided for transmitting information through a speech codec (in-band) such as found in a wireless communication network. A modulator transforms the data into a spectrally noise-like signal based on the mapping of a shaped pulse to predetermined positions within a modulation frame, and the signal is efficiently encoded by a speech codec. A synchronization sequence provides modulation frame timing at the receiver and is detected based on analysis of a correlation peak pattern. A request/response protocol provides reliable transfer of data using message redundancy, retransmission, and/or robust modulation modes dependent on the communication channel conditions.
    Type: Grant
    Filed: June 3, 2009
    Date of Patent: September 2, 2014
    Assignee: Qualcomm Incorporated
    Inventors: Christoph A. Joetten, Christian Sgraja, Georg Frank, Pengjun Huang, Christian Pietsch, Marc W. Werner, Ethan R. Duni, Eugene J. Baik
  • Patent number: 8781825
    Abstract: Embodiments of the present invention improve methods of performing speech recognition. In one embodiment, the present invention includes a method comprising receiving a spoken utterance, processing the spoken utterance in a speech recognizer to generate a recognition result, determining consistencies of one or more parameters of component sounds of the spoken utterance, wherein the parameters are selected from the group consisting of duration, energy, and pitch, and wherein each component sound of the spoken utterance has a corresponding value of said parameter, and validating the recognition result based on the consistency of at least one of said parameters.
    Type: Grant
    Filed: August 24, 2011
    Date of Patent: July 15, 2014
    Assignee: Sensory, Incorporated
    Inventors: Jonathan Shaw, Pieter Vermeulen, Stephen Sutton, Robert Savoie
  • Patent number: 8694317
    Abstract: Methods for processing audio data containing speech to produce a searchable index file and for subsequently searching such an index file are provided. The processing method uses a phonetic approach and models each frame of the audio data with a set of reference phones. A score for each of the reference phones, representing the difference of the audio from the phone model, is stored in the searchable data file for each of the phones in the reference set. A consequence of storing information regarding each of the reference phones is that the accuracy of searches carried out on the index file is not compromised by the rejection of information about particular phones. A subsequent search method is also provided which uses a simple and efficient dynamic programming search to locate instances of a search term in the audio. The methods of the present invention have particular application to the field of audio data mining.
    Type: Grant
    Filed: February 6, 2006
    Date of Patent: April 8, 2014
    Assignee: Aurix Limited
    Inventors: Adrian I Skilling, Howard A K Wright
  • Patent number: 8655655
    Abstract: A sound event detecting module detects whether a sound event with a repeating characteristic is generated. A sound end recognizing unit recognizes the ends of sounds according to a sound signal to generate sound sections and, correspondingly, multiple sets of feature vectors of the sound sections. A storage unit stores at least M sets of feature vectors. A similarity comparing unit compares the at least M sets of feature vectors with each other and correspondingly generates a similarity score matrix, which stores the similarity scores of any two of the at least M sound sections. A correlation arbitrating unit determines, according to the similarity score matrix, the number of sound sections with high correlations to each other. When this number is greater than a threshold value, the correlation arbitrating unit indicates that the sound event with the repeating characteristic is generated.
    Type: Grant
    Filed: December 30, 2010
    Date of Patent: February 18, 2014
    Assignee: Industrial Technology Research Institute
    Inventors: Yuh-Ching Wang, Kuo-Yuan Li
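The similarity-matrix test above can be sketched in a few lines. All specifics here are assumptions for illustration: cosine similarity stands in for whatever score the patent uses, and the feature vectors and thresholds are made up.

```python
def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv) if nu and nv else 0.0

def is_repeating(sections, sim_threshold=0.9, count_threshold=1):
    """sections: one feature vector per sound section. The event counts as
    repeating when more than `count_threshold` sections correlate strongly
    with at least one other section."""
    m = len(sections)
    # similarity score matrix over all pairs of sections
    sim = [[cosine(sections[i], sections[j]) for j in range(m)] for i in range(m)]
    correlated = sum(
        1 for i in range(m)
        if any(sim[i][j] >= sim_threshold for j in range(m) if j != i)
    )
    return correlated > count_threshold
```

Two near-identical sections plus one outlier trip the repetition flag; two dissimilar sections do not.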
  • Patent number: 8589165
    Abstract: The present disclosure provides a method and system for converting a free-text expression of an identity to a phonetic equivalent code. The conversion follows a set of rules based on phonetic groupings and compresses the expression to a shorter series of characters. The phonetic equivalent code may be compared to one or more other phonetic equivalent codes to establish a correlation between the codes. The phonetic equivalent code of the free-text expression may be associated with the code of a known identity, and the known identity may be provided to a user for confirmation. Further, a plurality of expressions stored in a database may be consolidated by converting the expressions to phonetic equivalent codes, comparing the codes to find correlations, and, if appropriate, reducing the number of expressions or mapping them to a smaller set of expressions.
    Type: Grant
    Filed: January 24, 2012
    Date of Patent: November 19, 2013
    Assignee: United Services Automobile Association (USAA)
    Inventors: Gregory Brian Meyer, James Elden Nicholson
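The patent does not publish its exact rule set, but the classic American Soundex scheme illustrates the same idea of phonetic grouping plus compression to a short fixed-length code; the implementation below is standard Soundex, shown only as an analogue.

```python
def soundex(name):
    """Compress a free-text name to a 4-character phonetic code:
    first letter, then digits for phonetically grouped consonants."""
    groups = {**dict.fromkeys("BFPV", "1"), **dict.fromkeys("CGJKQSXZ", "2"),
              **dict.fromkeys("DT", "3"), "L": "4",
              **dict.fromkeys("MN", "5"), "R": "6"}
    name = "".join(c for c in name.upper() if c.isalpha())
    if not name:
        return ""
    code = name[0]
    prev = groups.get(name[0], "")
    for c in name[1:]:
        digit = groups.get(c, "")
        if digit and digit != prev:
            code += digit
        if c not in "HW":  # H and W do not separate letters of the same group
            prev = digit
    return (code + "000")[:4]
```

Correlated spellings collapse to the same code, which is exactly what makes code comparison useful for consolidating name variants.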
  • Patent number: 8576961
    Abstract: A method for determining an overlap and add length estimate comprises determining a plurality of correlation values of a plurality of ordered frequency domain samples obtained from a data frame; comparing the correlation values of a first subset of the samples to a first predetermined threshold to determine a first edge sample; comparing the correlation values of a second subset of the samples to a second predetermined threshold to determine a second edge sample; using the first and second edge samples to determine an overlap and add length estimate; and providing the overlap and add length estimate to an overlap and add circuit.
    Type: Grant
    Filed: June 15, 2009
    Date of Patent: November 5, 2013
    Assignee: Olympus Corporation
    Inventors: Haidong Zhu, Dumitru Mihai Ionescu, Abu Amanullah
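A minimal sketch of the edge-sample search follows; the scan directions, threshold handling, and length arithmetic are assumptions, since the abstract does not fix them.

```python
def overlap_add_length(corr, thresh_lo, thresh_hi):
    """corr: one correlation value per ordered frequency-domain sample.
    The first sample (scanning forward) whose correlation reaches
    `thresh_lo` and the last sample (scanning backward) reaching
    `thresh_hi` are taken as the two edge samples; their distance is
    the overlap-and-add length estimate passed to the OLA circuit."""
    first = next(i for i, c in enumerate(corr) if c >= thresh_lo)
    second = next(i for i in reversed(range(len(corr))) if corr[i] >= thresh_hi)
    return max(0, second - first + 1)
```

With correlations `[0.1, 0.2, 0.8, 0.9, 0.7, 0.2]` and both thresholds at 0.5, the edges land at indices 2 and 4, giving an estimate of 3 samples.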
  • Patent number: 8560327
    Abstract: A method for synchronizing sound data and text data, said text data being obtained by manual transcription of said sound data during playback of the latter. The proposed method comprises the steps of repeatedly querying said sound data and said text data to obtain a current time position corresponding to a currently played sound datum and a currently transcribed text datum, respectively, correcting said current time position by applying a time correction value in accordance with a transcription delay, and generating at least one association datum indicative of a synchronization association between said corrected time position and said currently transcribed text datum. Thus, the proposed method achieves cost-effective synchronization of sound and text in connection with the manual transcription of sound data.
    Type: Grant
    Filed: August 18, 2006
    Date of Patent: October 15, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Andreas Neubacher, Miklos Papai
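The synchronization step above reduces to a simple correction pass; the sketch below assumes the repeated queries have already produced (playback time, text position) pairs, and the fixed delay value is illustrative.

```python
def associate(samples, delay):
    """samples: (playback_time_s, text_position) pairs obtained by
    repeatedly querying the sound playback and the transcription cursor.
    Each playback time is shifted back by the transcription delay, and
    the corrected time is paired with the text position as an
    association datum."""
    return [(max(0.0, t - delay), pos) for t, pos in samples]
```

For a typist lagging about 2 s behind the audio, `associate([(10.0, 5), (12.5, 9)], delay=2.0)` yields `[(8.0, 5), (10.5, 9)]`, i.e. the text at position 5 is associated with the audio at 8.0 s rather than 10.0 s.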
  • Patent number: 8515753
    Abstract: The example embodiment of the present invention provides an acoustic model adaptation method for enhancing recognition performance for a non-native speaker's speech. To adapt the acoustic models, pronunciation variations are first examined by analyzing a non-native speaker's speech. Thereafter, based on these pronunciation variations, the acoustic models are adapted in a state-tying step during their training process. When the present invention for adapting acoustic models is combined with a conventional acoustic model adaptation scheme, further enhanced recognition performance can be obtained. The example embodiment of the present invention enhances recognition performance for a non-native speaker's speech while reducing the degradation of recognition performance for a native speaker's speech.
    Type: Grant
    Filed: March 30, 2007
    Date of Patent: August 20, 2013
    Assignee: Gwangju Institute of Science and Technology
    Inventors: Hong Kook Kim, Yoo Rhee Oh, Jae Sam Yoon
  • Patent number: 8494847
    Abstract: A weighting factor learning system includes an audio recognition section that recognizes learning audio data and outputs the recognition result; a weighting factor updating section that updates a weighting factor applied to a score obtained from an acoustic model and a language model so that the difference between a correct-answer score, calculated using a correct-answer text of the learning audio data, and a score of the recognition result becomes large; a convergence determination section that determines, using the score after updating, whether to return to the weighting factor updating section to update the weighting factor again; and a weighting factor convergence determination section that determines, using the score after updating, whether to return to the audio recognition section to perform the process again and update the weighting factor using the weighting factor updating section.
    Type: Grant
    Filed: February 19, 2008
    Date of Patent: July 23, 2013
    Assignee: NEC Corporation
    Inventors: Tadashi Emori, Yoshifumi Onishi
  • Publication number: 20130166291
    Abstract: Mental state of a person is classified in an automated manner by analysing natural speech of the person. A glottal waveform is extracted from a natural speech signal. Pre-determined parameters defining at least one diagnostic class of a class model are retrieved, the parameters determined from selected training glottal waveform features. The selected glottal waveform features are extracted from the signal. Current mental state of the person is classified by comparing extracted glottal waveform features with the parameters and class model. Feature extraction from a glottal waveform or other natural speech signal may involve determining spectral amplitudes of the signal, setting spectral amplitudes below a pre-defined threshold to zero and, for each of a plurality of sub bands, determining an area under the thresholded spectral amplitudes, and deriving signal feature parameters from the determined areas in accordance with a diagnostic class model.
    Type: Application
    Filed: August 23, 2010
    Publication date: June 27, 2013
    Applicant: RMIT UNIVERSITY
    Inventors: Margaret Lech, Nicholas Brian Allen, Ian Shaw Burnett, Ling He
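The feature-extraction recipe in the last sentence of this abstract can be sketched directly; NumPy, the band count, and the threshold value are assumptions, and a plain band-wise sum stands in for the "area under the thresholded spectral amplitudes".

```python
import numpy as np

def subband_area_features(signal, n_bands=4, threshold=0.1):
    """Spectral amplitudes below `threshold` are set to zero, the spectrum
    is split into `n_bands` sub-bands, and the area under the thresholded
    amplitudes is returned per band as a feature."""
    spec = np.abs(np.fft.rfft(signal))
    spec[spec < threshold] = 0.0              # zero sub-threshold amplitudes
    bands = np.array_split(spec, n_bands)     # contiguous sub-bands
    return [float(b.sum()) for b in bands]    # area per sub-band
```

These per-band areas are then what a diagnostic class model would consume as signal feature parameters.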
  • Patent number: 8447605
    Abstract: A game apparatus includes a CPU core for creating an input envelope and a registered envelope. The input envelope has a plurality of envelope values detected from a voice waveform input in real time through a microphone. The registered envelope has a plurality of envelope values detected from a voice waveform previously input. Both of the input envelope and the registered envelope are stored in a RAM. The CPU core evaluates difference of the envelope values between the input envelope and the registered envelope. When an evaluated value satisfies a condition, the CPU core executes a process according to a command assigned to the registered envelope.
    Type: Grant
    Filed: June 3, 2005
    Date of Patent: May 21, 2013
    Assignee: Nintendo Co., Ltd.
    Inventor: Yoji Inagaki
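The envelope comparison can be sketched as follows; the distance metric, tolerance, and command table are illustrative assumptions, not Nintendo's evaluation function.

```python
def envelope_distance(input_env, registered_env):
    """Mean absolute difference between corresponding envelope values."""
    n = min(len(input_env), len(registered_env))
    return sum(abs(a - b) for a, b in zip(input_env, registered_env)) / n

def matched_command(input_env, commands, tolerance=0.2):
    """commands: {command_name: registered_envelope}. Returns the name of
    the best-matching registered envelope, or None when no envelope is
    within tolerance (i.e. no command should fire)."""
    best = min(commands, key=lambda c: envelope_distance(input_env, commands[c]))
    if envelope_distance(input_env, commands[best]) <= tolerance:
        return best
    return None
```

A voice input whose envelope closely tracks a registered one triggers that command; an in-between input triggers nothing.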
  • Patent number: 8433568
    Abstract: A method for measuring speech intelligibility includes inputting a speech waveform to a system. At least one acoustic feature is extracted from the waveform. From the acoustic feature, at least one phoneme is segmented. At least one acoustic correlate measure is extracted from the at least one phoneme and at least one intelligibility measure is determined. The at least one acoustic correlate measure is mapped to the at least one intelligibility measure.
    Type: Grant
    Filed: March 29, 2010
    Date of Patent: April 30, 2013
    Assignee: Cochlear Limited
    Inventors: Lee Krause, Mark Skowranski, Bonny Banerjee
  • Patent number: 8417518
    Abstract: A voice recognition system comprises: a voice input unit that receives an input signal from a voice input element and outputs it; a voice detection unit that detects an utterance segment in the input signal; a voice recognition unit that performs voice recognition for the utterance segment; and a control unit that outputs a control signal to at least one of the voice input unit and the voice detection unit and suppresses a detection frequency if the detection frequency satisfies a predetermined condition.
    Type: Grant
    Filed: February 27, 2008
    Date of Patent: April 9, 2013
    Assignee: NEC Corporation
    Inventor: Toru Iwasawa
  • Publication number: 20130013308
    Abstract: An approach is provided for determining a user age range. An age estimator causes, at least in part, acquisition of voice data. Next, the age estimator calculates a first set of probability values, wherein each of the probability values represents a probability that the voice data is in a respective one of a plurality of predefined age ranges, and the predefined age ranges are segments of a lifespan. Then, the age estimator derives a second set of probability values by applying a correlation matrix to the first set of probability values, wherein the correlation matrix associates the first set of probability values with probabilities of the voice data matching individual ages over the lifespan. Then, the age estimator, for each of the predefined age ranges, calculates a sum of the probabilities in the second set of probability values corresponding to the individual ages within the respective predefined age ranges.
    Type: Application
    Filed: March 23, 2010
    Publication date: January 10, 2013
    Applicant: NOKIA CORPORATION
    Inventors: Yang Cao, Feng Ding, Jilei Tian
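The two-step refinement above can be sketched with made-up numbers: range-level probabilities are spread over individual ages by the correlation matrix, then summed back per range. The matrix layout (one row per individual age, one column per predefined range) is an assumed convention.

```python
def refine_age_probs(range_probs, correlation):
    """range_probs: first-set probability per predefined age range.
    correlation: matrix associating range probabilities with per-age
    probabilities (row = individual age, column = range).
    Returns the second set: a probability per individual age."""
    return [sum(row[j] * range_probs[j] for j in range(len(range_probs)))
            for row in correlation]

def sum_per_range(age_probs, range_slices):
    """range_slices: (start, stop) age-index pairs defining each range.
    Sums the per-age probabilities back into per-range totals."""
    return [sum(age_probs[a] for a in range(lo, hi)) for lo, hi in range_slices]
```

With two ranges covering ages 0-1 and 2-3 and a matrix that splits each range's mass evenly over its two ages, the per-range sums recover the refined distribution.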
  • Patent number: 8296313
    Abstract: A method for controlling a relational database system in which a query statement composed of keywords is analyzed, the RTN (recursive transition network) being formed of independent RTN building blocks. Each RTN building block has an inner, directed decision graph which is defined independently of the inner, directed decision graphs of the other RTN building blocks, with at least one decision position along at least one decision path. The inner decision graphs of all RTN building blocks are run over the keywords in a selection step, and all possible paths of each decision graph are followed until either the decision graph determines no match with the respectively selected path and the process is interrupted, or the respectively chosen path is run to the end.
    Type: Grant
    Filed: June 7, 2010
    Date of Patent: October 23, 2012
    Assignee: Mediareif Moestl & Reif Kommunikations-und Informationstechnologien OEG
    Inventor: Matthias Moestl
  • Patent number: 8265932
    Abstract: A system and method for identifying audio command prompts for use in a voice response environment is provided. A signature is generated for audio samples, each having preceding audio, reference phrase audio, and trailing audio segments. The trailing segment is removed, and each of the preceding and reference phrase segments is divided into buffers. The buffers are transformed into discrete Fourier transform buffers. One of the discrete Fourier transform buffers from the reference phrase segment that is dissimilar to each of the discrete Fourier transform buffers from the preceding segment is selected as the signature. Audio command prompts are processed to generate a discrete Fourier transform. Each discrete Fourier transform for the audio command prompts is compared with each of the signatures, and a correlation value is determined. An audio command prompt matches a signature when the correlation value for that prompt satisfies a threshold.
    Type: Grant
    Filed: October 3, 2011
    Date of Patent: September 11, 2012
    Assignee: Intellisist, Inc.
    Inventor: Martin R. M. Dunsmuir
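The signature-selection step can be condensed to one comparison; cosine similarity over toy two-element "DFT buffers" stands in for whatever similarity measure the patent applies, so everything here is illustrative.

```python
def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    den = (sum(a * a for a in u) ** 0.5) * (sum(b * b for b in v) ** 0.5)
    return dot / den if den else 0.0

def pick_signature(preceding_buffers, phrase_buffers, similarity=cosine):
    """Choose, from the reference-phrase buffers, the one whose worst-case
    similarity to any preceding-audio buffer is smallest, i.e. the buffer
    most distinctive of the phrase. That buffer becomes the signature."""
    return min(phrase_buffers,
               key=lambda p: max(similarity(p, q) for q in preceding_buffers))
```

Picking the buffer least like anything in the preceding audio is what lets later correlation against incoming prompts discriminate reliably.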
  • Patent number: 8249871
    Abstract: A clustering tool generates word clusters. In the embodiments described, the clustering tool includes a clustering component that generates word clusters for words or word combinations in input data. In the illustrated embodiments, the word clusters are used to modify or update a grammar for a closed-vocabulary speech recognition application.
    Type: Grant
    Filed: November 18, 2005
    Date of Patent: August 21, 2012
    Assignee: Microsoft Corporation
    Inventor: Kunal Mukerjee
  • Publication number: 20120197642
    Abstract: Embodiments of the present invention relate to a signal identifying method, including: obtaining signal characteristics of a current frame of input signals; deciding, according to the signal characteristics of the current frame and updated signal characteristics of a background signal frame before the current frame, whether the current frame is a background signal frame; detecting whether the current frame serving as a background signal frame is in a first type signal state; and adjusting a signal classification decision threshold according to whether the current frame serving as a background signal frame is in the first type signal state to enhance the speech signal identification capability.
    Type: Application
    Filed: April 12, 2012
    Publication date: August 2, 2012
    Applicant: Huawei Technologies Co., Ltd.
    Inventors: Yuanyuan Liu, Zhe Wang, Eyal Shlomot
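The threshold adjustment in the last clause can be sketched minimally. The meaning of the "first type signal state" and the energy-based decision are assumptions; the point is only that detecting that state relaxes the classification threshold so speech is less likely to be misclassified as background.

```python
def adjust_threshold(base, in_first_type_state, relief=0.2):
    """Lower the decision threshold when the background frame is in the
    first type signal state (relief amount is an assumed constant)."""
    return base - relief if in_first_type_state else base

def classify_frame(energy, base_threshold, in_first_type_state):
    """Toy decision: a frame is speech when its energy-like characteristic
    exceeds the (possibly adjusted) classification threshold."""
    thr = adjust_threshold(base_threshold, in_first_type_state)
    return "speech" if energy > thr else "background"
```

The same frame can flip from "background" to "speech" once the first-type state is detected, which is the identification-capability enhancement the abstract claims.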