Voiced Or Unvoiced Patents (Class 704/214)
-
Patent number: 9183177
Abstract: Methods, apparatuses, and media for filtering a data stream are provided. The data stream is partitioned into a plurality of data stream segments. An acoustic parameter of each of the data stream segments is measured, and it is determined whether the acoustic parameter of each of the data stream segments satisfies a predetermined condition. Extraneous segments of the data stream segments are identified in which the predetermined condition is satisfied, and it is determined whether the extraneous segments have a predetermined relationship in the data stream. The extraneous segments are deleted from the data stream to produce a filtered data stream in response to the extraneous segments having the predetermined relationship.
Type: Grant
Filed: April 22, 2013
Date of Patent: November 10, 2015
Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.
Inventors: Yeon-Jun Kim, I. Dan Melamed, Steven Neil Tischer, Bernard S. Renger
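
The filtering steps in this abstract can be sketched roughly as follows. This is only an illustration under assumptions that the abstract does not state: the acoustic parameter is taken to be mean-square energy, the predetermined condition is "energy below a floor", and the predetermined relationship is "at least `min_run` consecutive extraneous segments". The function name and all parameter names are hypothetical.

```python
from itertools import groupby

def filter_stream(samples, segment_len=4, energy_floor=1.0, min_run=2):
    """Delete runs of low-energy segments (assumed reading of the abstract)."""
    # Partition the stream into fixed-length segments.
    segments = [samples[i:i + segment_len]
                for i in range(0, len(samples), segment_len)]

    def is_extraneous(seg):
        # Assumed acoustic parameter: mean-square energy of the segment.
        return sum(x * x for x in seg) / len(seg) < energy_floor

    kept = []
    for extraneous, run in groupby(segments, key=is_extraneous):
        run = list(run)
        # Assumed relationship: extraneous segments must be consecutive
        # and form a run of at least min_run segments to be deleted.
        if extraneous and len(run) >= min_run:
            continue
        for seg in run:
            kept.extend(seg)
    return kept

stream = [5, 5, 5, 5, 0, 0, 0, 0, 0, 0, 0, 0, 6, 6, 6, 6, 0, 0, 0, 0, 7, 7, 7, 7]
filtered = filter_stream(stream)
```

In this toy stream the two adjacent quiet segments are removed, while the single isolated quiet segment is kept because it does not meet the assumed run-length relationship.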
-
Patent number: 9135809
Abstract: A remote control device includes a digital audio storage device, a talk button, and an optical distance measurer. The digital audio storage device is configured to continually record an audio input for a specific amount of time. The talk button is coupled to the digital audio storage device and is configured to initiate a transmission of the audio input to a set-top box device. The optical distance measurer is coupled to the talk button and is configured to automatically measure a distance to a user in response to the talk button being pressed.
Type: Grant
Filed: June 20, 2008
Date of Patent: September 15, 2015
Assignee: AT&T Intellectual Property I, LP
Inventors: Hisao M. Chang, Iker Arizmendi
-
Patent number: 9070375
Abstract: A voice activity detection method in a low SNR environment. The voice activity detection is performed by extracting a long-term spectrum variation component and a harmonic structure as feature vectors from a speech signal and increasing difference in feature vectors between speech and non-speech (i) using the long-term spectrum variation component feature or (ii) using a long-term spectrum variation component extraction and a harmonic structure feature extraction. A correct rate and an accuracy rate of the voice activity detection are improved over conventional methods by using a long-term spectrum variation component having a window length over an average phoneme duration of an utterance in the speech signal. The voice activity detection system and method provides speech processing, automatic speech recognition, and speech output capable of very accurate voice activity detection.
Type: Grant
Filed: February 27, 2009
Date of Patent: June 30, 2015
Assignee: International Business Machines Corporation
Inventors: Takashi Fukuda, Osamu Ichikawa, Masafumi Nishimura
-
Patent number: 9009034
Abstract: A Voice Activity Detection/Silence Suppression (VAD/SS) system is connected to a channel of a transmission pipe. The channel provides a pathway for the transmission of energy. A method for operating a VAD/SS system includes detecting the energy on the channel, and activating or suppressing activation of the VAD/SS system depending upon the nature of the energy detected on the channel.
Type: Grant
Filed: November 12, 2014
Date of Patent: April 14, 2015
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Bing Chen, James H. James
-
Patent number: 9009054
Abstract: This invention relates to retrieval for multimedia content, and provides a program endpoint time detection apparatus for detecting an endpoint time of a program by performing processing on audio signals of said program, comprising an audio classification unit for classifying said audio signals into a speech signal portion and a non-speech signal portion; a keyword retrieval unit for retrieving, as a candidate endpoint keyword, an endpoint keyword indicating start or end of the program from said speech signal portion; a content analysis unit for performing content analysis on context of the candidate endpoint keyword retrieved by the keyword retrieval unit to determine whether the candidate endpoint keyword is a valid endpoint keyword; and a program endpoint time determination unit for performing statistics analysis based on the retrieval result of said keyword retrieval unit and the determination result of said content analysis unit, and determining the endpoint time of the program.
Type: Grant
Filed: October 28, 2010
Date of Patent: April 14, 2015
Assignees: Sony Corporation, Institute of Acoustics, Chinese Academy of Sciences
Inventors: Kun Liu, Weiguo Wu, Li Lu, Qingwei Zhao, Yonghong Yan, Hongbin Suo
-
Patent number: 8996380
Abstract: Systems and methods of synchronizing media are provided. A client device may be used to capture a sample of a media stream being rendered by a media rendering source. The client device sends the sample to a position identification module to determine a time offset indicating a position in the media stream corresponding to the sampling time of the sample, and optionally a timescale ratio indicating a speed at which the media stream is being rendered by the media rendering source based on a reference speed of the media stream. The client device calculates a real-time offset using a present time, a timestamp of the media sample, the time offset, and optionally the timescale ratio. The client device then renders a second media stream at a position corresponding to the real-time offset to be in synchrony to the media stream being rendered by the media rendering source.
Type: Grant
Filed: May 4, 2011
Date of Patent: March 31, 2015
Assignee: Shazam Entertainment Ltd.
Inventors: Avery Li-Chun Wang, Rahul Powar, William Michael Mills, Christopher Jacques Penrose Barton, Philip Georges Inghelbrecht, Dheeraj Shankar Mukherjee
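
One plausible way to combine the four quantities named in this abstract is to advance the identified time offset by the time elapsed since the sample was captured, scaled by the timescale ratio. The abstract does not give the formula, so the sketch below is an assumption, and the function and parameter names are hypothetical.

```python
def real_time_offset(present_time, sample_timestamp, time_offset,
                     timescale_ratio=1.0):
    """Estimate the current position in the media stream, in seconds.

    Assumed formula: offset at sampling time plus the elapsed wall-clock
    time, scaled by the rendering speed relative to the reference speed.
    """
    elapsed = present_time - sample_timestamp
    return time_offset + timescale_ratio * elapsed

# Sample taken at t = 100 s and matched to position 30 s in the stream;
# five seconds later the stream should be at about 35 s at normal speed.
position_normal = real_time_offset(105.0, 100.0, 30.0)
position_fast = real_time_offset(105.0, 100.0, 30.0, timescale_ratio=1.5)
```

With the timescale ratio omitted, the position simply advances at wall-clock speed, which corresponds to the "optionally" wording in the abstract.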
-
Patent number: 8996389
Abstract: Various techniques are disclosed for reducing artifacts generated by time compression by adapting the time compression based on the state of the received audio. The amount of time compression may be bounded based on audio characteristics. Another feature provides a way of determining the most correlated portions of segments of audio. Voiced speech may be distinguished from unvoiced speech. Another feature provides a way of distinguishing between silence, voiced speech, and unvoiced speech. Time compression may be adapted during periods of lengthy silence. Another feature allows for reducing time compression during sensitive portions of the received audio. One or more of these features may be present in different embodiments.
Type: Grant
Filed: June 14, 2011
Date of Patent: March 31, 2015
Assignee: Polycom, Inc.
Inventor: Eric David Elias
-
Patent number: 8990079
Abstract: When a voice-activated device or application is first started, the signal levels corresponding to spoken commands are initially unknown, making it difficult to set detection thresholds. The inventive method provides an initial command-detection threshold based on the noise level alone. The first command is detected using this initial threshold. Then the threshold is revised according to the first command sound, and a second command is detected using the revised threshold. After detecting each command, the detection threshold is further refined according to the current noise and command sounds. Methods are also disclosed for optimizing the thresholds, adjusting parameters according to sound, and detecting voiced and unvoiced sounds separately. These capabilities enable many emerging voice-activated products and applications.
Type: Grant
Filed: September 17, 2014
Date of Patent: March 24, 2015
Assignee: Zanavox
Inventor: David Edward Newman
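
The threshold bootstrapping described in this abstract might look something like the sketch below. The specific rules are assumptions made for illustration, not taken from the patent: the initial threshold is a fixed multiple of the noise level, and the refined threshold is the midpoint between smoothed noise and command levels. The class name, method names, and constants are hypothetical.

```python
class AdaptiveThreshold:
    """Command-detection threshold that starts from noise alone."""

    def __init__(self, noise_level, noise_factor=3.0):
        self.noise = noise_level
        self.command = None  # command level is unknown at start-up
        # Assumed initial rule: a fixed multiple of the noise level.
        self.threshold = noise_factor * noise_level

    def detect(self, level):
        return level > self.threshold

    def refine(self, noise_level, command_level, alpha=0.5):
        # Exponentially smooth the noise and command level estimates.
        self.noise += alpha * (noise_level - self.noise)
        if self.command is None:
            self.command = command_level
        else:
            self.command += alpha * (command_level - self.command)
        # Assumed refinement rule: midpoint between noise and command.
        self.threshold = 0.5 * (self.noise + self.command)

detector = AdaptiveThreshold(noise_level=1.0)
initial_threshold = detector.threshold
first_detected = detector.detect(10.0)   # first command, initial threshold
detector.refine(noise_level=1.0, command_level=10.0)
second_detected = detector.detect(4.0)   # weaker sound, revised threshold
```

After the first command the threshold moves up toward the observed command level, so a sound that would have passed the noise-only threshold is no longer accepted.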
-
Patent number: 8990094
Abstract: An electronic device for coding a transient frame is described. The electronic device includes a processor and executable instructions stored in memory that is in electronic communication with the processor. The electronic device obtains a current transient frame. The electronic device also obtains a residual signal based on the current transient frame. Additionally, the electronic device determines a set of peak locations based on the residual signal. The electronic device further determines whether to use a first coding mode or a second coding mode for coding the current transient frame based on at least the set of peak locations. The electronic device also synthesizes an excitation based on the first coding mode if the first coding mode is determined. The electronic device also synthesizes an excitation based on the second coding mode if the second coding mode is determined.
Type: Grant
Filed: September 8, 2011
Date of Patent: March 24, 2015
Assignee: QUALCOMM Incorporated
Inventors: Venkatesh Krishnan, Ananthapadmanabhan Arasanipalai Kandhadai
-
Patent number: 8984061
Abstract: The conferencing system is composed of computers, a moderator's computer, and a projector connected on a network. The moderator's computer receives image data from the computers, and generates synthesized image data therefrom, which is transmitted to the projector for display of the synthesized image. The moderator's computer has the capability to switch the image being projected by the projector from the synthesized image to an image handled by one of the computers or by the moderator's computer. With such an arrangement, utilizing existing hardware resources it will be possible to display in a single split-screen display the images handled by the terminals connected on the network. Additionally, it will be possible to switch smoothly between on-screen displays, and to reduce the burden on the on-screen display operator in a networked conferencing system.
Type: Grant
Filed: August 6, 2008
Date of Patent: March 17, 2015
Assignee: Seiko Epson Corporation
Inventor: Noboru Inoue
-
Patent number: 8982971
Abstract: A multi-carrier signal is typically comprised of many equidistant sub-carriers. This results in periodicity of spectrum within the bandwidth of such a multi-carrier signal. An unknown multi-carrier signal with equidistant sub-carriers can thus be sensed together with its sub-carrier spacing by finding a discernable local maximum in the cepstrum (Fourier transform of the log spectrum) of the multi-carrier signal.
Type: Grant
Filed: March 29, 2012
Date of Patent: March 17, 2015
Assignee: QRC, Inc.
Inventors: Sinisa Peric, Thomas F. Callahan, III
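
The cepstral idea in this abstract can be illustrated with a small, self-contained sketch. It takes a power spectrum directly (rather than a received time-domain signal), computes the Fourier transform of its logarithm with a naive DFT, and reads the sub-carrier spacing from the lowest strong peak. The synthetic spectrum, the 99% peak rule, and the function names are assumptions made for illustration.

```python
import cmath
import math

def cepstrum_magnitude(power_spectrum):
    """Magnitude of the DFT of the log power spectrum (naive O(N^2) DFT)."""
    n = len(power_spectrum)
    log_spec = [math.log(p) for p in power_spectrum]
    return [
        abs(sum(log_spec[k] * cmath.exp(-2j * math.pi * k * q / n)
                for k in range(n)))
        for q in range(n)
    ]

def estimate_spacing_bins(power_spectrum, rel_tol=0.99):
    """Estimate sub-carrier spacing (in bins) from the lowest strong peak."""
    n = len(power_spectrum)
    ceps = cepstrum_magnitude(power_spectrum)
    candidates = range(1, n // 2 + 1)  # skip q = 0, the mean log level
    peak = max(ceps[q] for q in candidates)
    # A periodic spectrum also peaks at multiples of the fundamental,
    # so take the first index that comes close to the maximum.
    first = next(q for q in candidates if ceps[q] >= rel_tol * peak)
    return n / first

# Synthetic spectrum: 64 bins with a carrier every 8 bins over a noise floor.
spectrum = [1.0 if k % 8 == 0 else 1e-3 for k in range(64)]
spacing = estimate_spacing_bins(spectrum)
```

For this idealized spectrum the log spectrum repeats every 8 bins, so the cepstrum is concentrated at index 8 and its multiples, and the estimated spacing comes out as 8 bins.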
-
Publication number: 20150073783
Abstract: In accordance with an embodiment of the present invention, a method for speech processing includes determining an unvoicing/voicing parameter reflecting a characteristic of unvoiced/voicing speech in a current frame of a speech signal comprising a plurality of frames. A smoothed unvoicing/voicing parameter is determined to include information of the unvoicing/voicing parameter in a frame prior to the current frame of the speech signal. A difference between the unvoicing/voicing parameter and the smoothed unvoicing/voicing parameter is computed. The method further includes generating an unvoiced/voiced decision point for determining whether the current frame comprises unvoiced speech or voiced speech using the computed difference as a decision parameter.
Type: Application
Filed: September 3, 2014
Publication date: March 12, 2015
Inventor: Yang Gao
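
As a rough illustration of using the difference between a per-frame parameter and its smoothed version as the decision parameter, the sketch below applies first-order recursive smoothing and switches the decision only when the difference crosses a threshold. The smoothing form, the hysteresis rule, the threshold values, and the assumption that a larger parameter means "more unvoiced" are all illustrative choices, not details taken from the application.

```python
def classify_frames(unvoicing_params, alpha=0.5, threshold=0.2):
    """Label each frame 'voiced' or 'unvoiced' from a per-frame parameter."""
    smoothed = unvoicing_params[0]
    decision = "voiced"  # assumed starting state
    decisions = []
    for p in unvoicing_params:
        # The smoothed value carries information from earlier frames.
        smoothed = alpha * smoothed + (1.0 - alpha) * p
        diff = p - smoothed  # decision parameter
        if diff > threshold:
            decision = "unvoiced"
        elif diff < -threshold:
            decision = "voiced"
        # Otherwise keep the previous decision (assumed hysteresis).
        decisions.append(decision)
    return decisions

labels = classify_frames([0.1, 0.1, 0.9, 0.9, 0.1])
```

Because the smoothed value lags the raw parameter, the difference is large right at a transition, which is where the decision flips.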
-
Patent number: 8976906
Abstract: A multi-carrier signal is typically comprised of many equidistant sub-carriers. This results in periodicity of spectrum within the bandwidth of such a multi-carrier signal. An unknown multi-carrier signal with equidistant sub-carriers can thus be sensed together with its sub-carrier spacing by finding a discernible local maximum in the cepstrum (Fourier transform of the log spectrum) of the multi-carrier signal.
Type: Grant
Filed: March 29, 2012
Date of Patent: March 10, 2015
Assignee: QRC, Inc.
Inventors: Sinisa Peric, Thomas F. Callahan, III
-
Patent number: 8965773
Abstract: A method is provided for hierarchical coding of a digital audio signal comprising, for a current frame of the input signal: a core coding, delivering a scalar quantization index for each sample of the current frame and at least one enhancement coding delivering indices of scalar quantization for each coded sample of an enhancement signal. The enhancement coding comprises a step of obtaining a filter for shaping the coding noise used to determine a target signal and in that the indices of scalar quantization of said enhancement signal are determined by minimizing the error between a set of possible values of scalar quantization and said target signal. The coding method can also comprise a shaping of the coding noise for the core bitrate coding. A coder implementing the coding method is also provided.
Type: Grant
Filed: November 17, 2009
Date of Patent: February 24, 2015
Assignee: Orange
Inventors: Balazs Kovesi, Stéphane Ragot, Alain Le Guyader
-
Patent number: 8959025
Abstract: Methods and systems for extracting speech from such packet streams. The methods and systems analyze the encoded speech in a given packet stream, and automatically identify the actual speech coding scheme that was used to produce it. These techniques may be used, for example, in interception systems where the identity of the actual speech coding scheme is sometimes unavailable or inaccessible. For instance, the identity of the actual speech coding scheme may be sent in a separate signaling stream that is not intercepted. As another example, the identity of the actual speech coding scheme may be sent in the same packet stream as the encoded speech, but in encrypted form.
Type: Grant
Filed: April 28, 2011
Date of Patent: February 17, 2015
Assignee: Verint Systems Ltd.
Inventor: Genady Malinsky
-
Patent number: 8942975
Abstract: Techniques are described herein that suppress noise in a Mel-filtered spectral domain. For example, a window may be applied to a representation of a speech signal in a time domain. The windowed representation in the time domain may be converted to a subsequent representation of the speech signal in the Mel-filtered spectral domain. A noise suppression operation may be performed with respect to the subsequent representation to provide noise-suppressed Mel coefficients.
Type: Grant
Filed: March 22, 2011
Date of Patent: January 27, 2015
Assignee: Broadcom Corporation
Inventor: Jonas Borgstrom
-
Patent number: 8930183
Abstract: A method of converting speech from the characteristics of a first voice to the characteristics of a second voice, the method comprising: receiving a speech input from a first voice, dividing said speech input into a plurality of frames; mapping the speech from the first voice to a second voice; and outputting the speech in the second voice, wherein mapping the speech from the first voice to the second voice comprises, deriving kernels demonstrating the similarity between speech features derived from the frames of the speech input from the first voice and stored frames of training data for said first voice, the training data corresponding to different text to that of the speech input and wherein the mapping step uses a plurality of kernels derived for each frame of input speech with a plurality of stored frames of training data of the first voice.
Type: Grant
Filed: August 25, 2011
Date of Patent: January 6, 2015
Assignee: Kabushiki Kaisha Toshiba
Inventors: Byung Ha Chun, Mark John Francis Gales
-
Patent number: 8930184
Abstract: A signal bandwidth extending apparatus including: a bandwidth extending section configured to extend a frequency bandwidth of a target signal, the target signal included in an input signal; a calculating section configured to calculate a degree of the target signal included in the input signal; and a controller configured to change a method of extending the frequency bandwidth by the bandwidth extending section according to a result of the calculating section.
Type: Grant
Filed: September 14, 2009
Date of Patent: January 6, 2015
Assignee: Kabushiki Kaisha Toshiba
Inventors: Takashi Sudo, Masataka Osada
-
Patent number: 8924200
Abstract: A method for decoding an audio signal in a decoder having a CELP-based decoder element including a fixed codebook component, at least one pitch period value, and a first decoder output, wherein a bandwidth of the audio signal extends beyond a bandwidth of the CELP-based decoder element. The method includes obtaining an up-sampled fixed codebook signal by up-sampling the fixed codebook component to a higher sample rate, obtaining an up-sampled excitation signal based on the up-sampled fixed codebook signal and an up-sampled pitch period value, and obtaining a composite output signal based on the up-sampled excitation signal and an output signal of the CELP-based decoder element, wherein the composite output signal includes a bandwidth portion that extends beyond a bandwidth of the CELP-based decoder element.
Type: Grant
Filed: September 28, 2011
Date of Patent: December 30, 2014
Assignee: Motorola Mobility LLC
Inventors: Jonathan A. Gibbs, James P. Ashley, Udar Mittal
-
Patent number: 8924209
Abstract: A method is disclosed for identifying a spoken command by detecting intervals of voiced and unvoiced sound, and then comparing the order of voiced and unvoiced sounds to a set of templates. Each template represents one of the predetermined acceptable commands of the application, and is associated with a predetermined action. When the order of voiced and unvoiced intervals in the spoken command matches the order in one of the templates, the associated action is thus selected. Silent intervals in the command may also be included for enhanced recognition. Efficient protocols are disclosed for discriminating voiced and unvoiced sounds, and for detecting the beginning and ending of each sound interval in the command, and for comparing the command sequence to the templates. In a sparse-command application, this method provides fast and robust recognition, and can be implemented with low-cost hardware and extremely minimal software.
Type: Grant
Filed: September 12, 2012
Date of Patent: December 30, 2014
Assignee: Zanavox
Inventor: David Edward Newman
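
The template comparison in this abstract can be sketched as a lookup keyed on the order of voiced and unvoiced intervals. The sketch assumes that per-frame labels ("V" voiced, "U" unvoiced, "S" silent) are already available from some upstream detector, and that silence only separates intervals rather than being matched. The template table and action names are hypothetical.

```python
from itertools import groupby

# Hypothetical templates: interval order -> associated action.
TEMPLATES = {
    ("V",): "action_a",
    ("U", "V"): "action_b",
    ("V", "U", "V"): "action_c",
}

def interval_sequence(frame_labels):
    """Collapse per-frame labels into the order of sound intervals."""
    # Consecutive identical labels form one interval; silent intervals
    # are treated as separators and dropped (an assumption).
    return tuple(label for label, _ in groupby(frame_labels) if label != "S")

def match_command(frame_labels, templates=TEMPLATES):
    """Return the action whose template matches, or None if none does."""
    return templates.get(interval_sequence(frame_labels))

selected = match_command(list("SSUUUVVVVSS"))
unmatched = match_command(list("VVSSVV"))
```

The second call yields two voiced intervals separated by silence, which is not in the hypothetical table, so no action is selected.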
-
Patent number: 8924204
Abstract: Unlike sound based pressure waves that go everywhere, air turbulence caused by wind is usually a fairly local event. Therefore, in a system that utilizes two or more spatially separated microphones to pick up sound signals (e.g., speech), wind noise picked up by one of the microphones often will not be picked up (or at least not to the same extent) by the other microphone(s). Embodiments of methods and apparatuses that utilize this fact and others to effectively detect and suppress wind noise using multiple microphones that are spatially separated are described.
Type: Grant
Filed: September 30, 2011
Date of Patent: December 30, 2014
Assignee: Broadcom Corporation
Inventors: Juin-Hwey Chen, Jes Thyssen, Xianxian Zhang, Huaiyu Zeng
-
Patent number: 8909519
Abstract: A Voice Activity Detection/Silence Suppression (VAD/SS) system is connected to a channel of a transmission pipe. The channel provides a pathway for the transmission of energy. A method for operating a VAD/SS system includes detecting the energy on the channel, and activating or suppressing activation of the VAD/SS system depending upon the nature of the energy detected on the channel.
Type: Grant
Filed: March 10, 2014
Date of Patent: December 9, 2014
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Bing Chen, James H. James
-
Patent number: 8898058
Abstract: Systems, methods, apparatus, and machine-readable media for voice activity detection in a single-channel or multichannel audio signal are disclosed.
Type: Grant
Filed: October 24, 2011
Date of Patent: November 25, 2014
Assignee: QUALCOMM Incorporated
Inventors: Jongwon Shin, Erik Visser, Ian Ernan Liu
-
Patent number: 8886527
Abstract: A purpose is to suppress recognition process delay generated due to load in signal processing. Included is a speech input means 10 that inputs a speech signal, an output evaluation means 20 that evaluates whether or not the speech signal input by the speech input means 10 is the speech signal in a sound section, which is a speech section assuming that a speaker is speaking, and outputs the speech signal as a speech signal to be processed only when evaluated as the speech signal in the sound section, a signal processing means 30 that performs signal processing to the speech signal, which is output by the output evaluation means 20 as the speech signal to be processed, and a speech recognition processing means 40 that performs a speech recognition process to the speech signal which is signal-processed by the signal processing means 30.
Type: Grant
Filed: April 16, 2009
Date of Patent: November 11, 2014
Assignee: NEC Corporation
Inventor: Toru Iwasawa
-
Patent number: 8879762
Abstract: A method and apparatus to evaluate a quality of an audio signal, in which the number of effective channels is determined for each of a reference signal of a current frame and a test signal indicative of the reference signal that has passed through an audio codec, and an audio quality evaluation score of the current frame is calculated by evaluating an audio quality of the current frame based on the determined number of effective channels for each of the reference signal and the test signal by means of a predetermined evaluator.
Type: Grant
Filed: January 28, 2010
Date of Patent: November 4, 2014
Assignee: Samsung Electronics Co., Ltd.
Inventor: In-Yong Choi
-
Patent number: 8880409
Abstract: A system provided herein may perform automatic temporal alignment between music audio signal and lyrics with higher accuracy than ever. A non-fricative section extracting portion 4 extracts non-fricative sound sections, where no fricative sounds exist, from the music audio signal. An alignment portion 17 includes a phone model 15 for singing voice capable of estimating phonemes corresponding to temporal-alignment features. The alignment portion 17 performs an alignment operation using as inputs temporal-alignment features obtained from a temporal-alignment feature extracting portion 11, information on vocal and non-vocal sections obtained from a vocal section estimating portion 9, and a phoneme network SN on conditions that no phonemes exist at least in non-vocal sections and that no fricative phonemes exist in non-fricative sound sections.
Type: Grant
Filed: February 5, 2009
Date of Patent: November 4, 2014
Assignee: National Institute of Advanced Industrial Science and Technology
Inventors: Hiromasa Fujihara, Masataka Goto
-
Patent number: 8868432
Abstract: A method for decoding an audio signal having a bandwidth that extends beyond a bandwidth of a CELP excitation signal in an audio decoder including a CELP-based decoder element. The method includes obtaining a second excitation signal having an audio bandwidth extending beyond the audio bandwidth of the CELP excitation signal, obtaining a set of signals by filtering the second excitation signal with a set of bandpass filters, scaling the set of signals using a set of energy-based parameters, and obtaining a composite output signal by combining the scaled set of signals with a signal based on the audio signal decoded by the CELP-based decoder element.
Type: Grant
Filed: September 28, 2011
Date of Patent: October 21, 2014
Assignee: Motorola Mobility LLC
Inventors: Jonathan A. Gibbs, James P. Ashley, Udar Mittal
-
Patent number: 8862465
Abstract: An electronic device for determining a set of pitch cycle energy parameters is described. The electronic device includes a processor and executable instructions stored in memory. The electronic device obtains a frame, a set of filter coefficients and a residual signal based on the frame and the set of filter coefficients. The electronic device determines a set of peak locations based on the residual signal and segments the residual signal such that each segment includes one peak. The electronic device determines a first set of pitch cycle energy parameters based on a frame region between two consecutive peak locations and maps regions between peaks in the residual signal to regions between peaks in a synthesized excitation signal to produce a mapping. The electronic device determines a second set of pitch cycle energy parameters based on the first set of pitch cycle energy parameters and the mapping.
Type: Grant
Filed: September 8, 2011
Date of Patent: October 14, 2014
Assignee: QUALCOMM Incorporated
Inventors: Venkatesh Krishnan, Stephane Pierre Villette
-
Patent number: 8842534
Abstract: According to one embodiment of the invention, a method for managing time-sensitive packetized data streams at a receiver includes receiving a time-sensitive packet of a data stream, analyzing an energy level of a payload signal of the packet, and determining whether to drop the packet based on the energy level of the payload signal.
Type: Grant
Filed: January 23, 2012
Date of Patent: September 23, 2014
Assignee: Cisco Technology, Inc.
Inventors: Paul S. Hahn, Michael E. Knappe, Richard A. Dunlap, Luke K. Surazski
-
Patent number: 8825478
Abstract: Audio content is converted to text using speech recognition software. The text is then associated with a distinct voice or a generic placeholder label if no distinction can be made. From the text and voice information, a word cloud is generated based on key words and key speakers. A visualization of the cloud displays as it is being created. Words grow in size in relation to their dominance. When it is determined that the predominant words or speakers have changed, the word cloud is complete. That word cloud continues to be displayed statically and a new word cloud display begins based upon a new set of predominant words or a new predominant speaker or set of speakers. This process may continue until the meeting is concluded. At the end of the meeting, the completed visualization may be saved to a storage device, sent to selected individuals, removed, or any combination of the preceding.
Type: Grant
Filed: January 10, 2011
Date of Patent: September 2, 2014
Assignee: Nuance Communications, Inc.
Inventors: Susan Marie Cox, Janani Janakiraman, Fang Lu, Loulwa F Salem
-
Patent number: 8792777
Abstract: The present invention is directed to system(s), method(s), and apparatus for accurate fast forward rate when performing trick play with variable distance between frames. In one embodiment, there is presented a circuit for providing a fast forward video sequence. The circuit comprises a system time clock for providing a time reference, said time reference incremented at a predetermined fast forward rate; a comparator for comparing the time reference with timing information associated with a picture; and a controller for determining whether to display the picture based at least in part on the comparison between the timing information and the time reference.
Type: Grant
Filed: January 10, 2007
Date of Patent: July 29, 2014
Assignee: Broadcom Corporation
Inventor: Tim Ross
-
Patent number: 8781821
Abstract: A method is disclosed for controlling a voice-activated device by interpreting a spoken command as a series of voiced and non-voiced intervals. A responsive action is then performed according to the number of voiced intervals in the command. The method is well-suited to applications having a small number of specific voice-activated response functions. Applications using the inventive method offer numerous advantages over traditional speech recognition systems including speaker universality, language independence, no training or calibration needed, implementation with simple microcontrollers, and extremely low cost. For time-critical applications such as pulsers and measurement devices, where fast reaction is crucial to catch a transient event, the method provides near-instantaneous command response, yet versatile voice control.
Type: Grant
Filed: April 30, 2012
Date of Patent: July 15, 2014
Assignee: Zanavox
Inventor: David Edward Newman
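
Selecting an action from the number of voiced intervals can be sketched as counting the transitions from non-voiced to voiced frames and using the count as a lookup key. The per-frame voiced flags are assumed to come from some upstream detector, and the action table is hypothetical.

```python
# Hypothetical mapping from number of voiced intervals to an action.
ACTIONS = {1: "start", 2: "stop", 3: "reset"}

def count_voiced_intervals(voiced_flags):
    """Count runs of consecutive voiced frames."""
    count = 0
    previous = False
    for flag in voiced_flags:
        # A new voiced interval begins on each non-voiced -> voiced edge.
        if flag and not previous:
            count += 1
        previous = bool(flag)
    return count

def action_for(voiced_flags, actions=ACTIONS):
    """Return the action for this command, or None if the count is unknown."""
    return actions.get(count_voiced_intervals(voiced_flags))

# Two voiced bursts separated by a non-voiced gap.
chosen = action_for([0, 1, 1, 0, 0, 1, 0])
```

Because only the count matters, the same command table works regardless of which words are spoken, which is consistent with the language independence mentioned in the abstract.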
-
Patent number: 8775168
Abstract: A Yule-Walker based, low-complexity voice activity detector (VAD) is disclosed. An input signal is typically noisy speech (i.e., corrupted with, for example, babble noise). In one embodiment, a first initialization stage of the VAD computes an occurrence of a silent period within the input signal and the AR parameters. The VAD could accordingly compute a tentative adaptive threshold and output hypothesis H1 (which means speech is present) during this stage. During the second initialization stage, the VAD generally builds a database of associated values and computes the adaptive threshold accordingly. The second initialization stage could also output tentative VAD decisions based on the tentative threshold computed in the first initialization stage. Finally, the VAD periodically retrains or updates AR parameters, threshold values and/or the database and outputs VAD decisions accordingly.
Type: Grant
Filed: August 3, 2007
Date of Patent: July 8, 2014
Assignee: STMicroelectronics Asia Pacific PTE, Ltd.
Inventors: Karthik Muralidhar, Anoop Kumar Krishna
-
Patent number: 8762158
Abstract: A method and apparatus for generating synthesis audio signals are provided. The method includes decoding a bitstream; splitting the decoded bitstream into n sub-band signals; generating n transformed sub-band signals by transforming the n sub-band signals in a frequency domain; and generating synthesis audio signals by respectively multiplying the n transformed sub-band signals by values corresponding to synthesis filter bank coefficients.
Type: Grant
Filed: August 5, 2011
Date of Patent: June 24, 2014
Assignee: Samsung Electronics Co., Ltd.
Inventors: Hyun-wook Kim, Han-gil Moon, Sang-hoon Lee
-
Patent number: 8762139
Abstract: A noise suppression device includes: a power spectrum calculator converting an input signal of time domain into power spectra of frequency domain; a voice/noise determination unit determining whether the power spectra indicate voice or noise; a noise spectrum estimation unit estimating noise spectra of the power spectra; a period component estimation unit analyzing a harmonic structure constituting the power spectra and estimating periodical information about the power spectra; a weighting coefficient calculator calculating a weighting coefficient for weighting the power spectra; a suppression coefficient calculator calculating a suppression coefficient for suppressing noise included in the power spectra; a spectrum suppression unit suppressing amplitude of the power spectra in accordance with the suppression coefficient; and an inverse Fourier transformer converting the power spectra output by the spectrum suppression unit into a signal of time domain to generate a noise-suppressed signal.
Type: Grant
Filed: September 21, 2010
Date of Patent: June 24, 2014
Assignee: Mitsubishi Electric Corporation
Inventors: Satoru Furuta, Hirohisa Tasaki
-
Patent number: 8762144
Abstract: A method and apparatus for detecting voice activity are disclosed. The method of detecting voice activity includes: extracting a feature parameter from a frame signal; determining whether the frame signal is a voice signal or a noise signal by comparing the feature parameter with model parameters of a plurality of comparison signals, respectively; and outputting the frame signal when the frame signal is determined to be a voice signal. The apparatus includes a classifier module which extracts a feature parameter from a frame signal, and generating labeling information with respect to the frame signal by comparing the feature parameter with model parameters of a plurality of comparison signals; and a voice detection unit which determines whether the frame signal is a noise signal or a voice signal with reference to the labeling information, and outputting the frame signal when the frame signal is determined to be a voice signal.
Type: Grant
Filed: May 3, 2011
Date of Patent: June 24, 2014
Assignee: Samsung Electronics Co., Ltd.
Inventors: Nam-gook Cho, Eun-kyoung Kim
-
Patent number: 8751222
Abstract: Streaming voice signals, such as might be received at a contact center or similar operation, are analyzed to detect the occurrence of one or more unprompted, predetermined utterances. The predetermined utterances preferably constitute a vocabulary of words and/or phrases having particular meaning within the context in which they are uttered. Detection of one or more of the predetermined utterances during a call causes a determination of response-determinative significance of the detected utterance(s). Based on the response-determinative significance of the detected utterance(s), a responsive action may be further determined. Additionally, long term storage of the call corresponding to the detected utterance may also be initiated. Conversely, calls in which no predetermined utterances are detected may be deleted from short term storage.
Type: Grant
Filed: May 22, 2009
Date of Patent: June 10, 2014
Assignee: Accenture Global Services Limited Dublin
Inventors: Thomas J. Ryan, Biji K. Janan
-
Patent number: 8744839Abstract: Target word recognition includes: obtaining a candidate word set and corresponding characteristic computation data, the candidate word set comprising text data, and characteristic computation data being associated with the candidate word set; performing segmentation of the characteristic computation data to generate a plurality of text segments; combining the plurality of text segments to form a text data combination set; determining an intersection of the candidate word set and the text data combination set, the intersection comprising a plurality of text data combinations; determining a plurality of designated characteristic values for the plurality of text data combinations; based at least in part on the plurality of designated characteristic values and according to at least a criterion, recognizing among the plurality of text data combinations target words whose characteristic values fulfill the criterion.Type: GrantFiled: September 22, 2011Date of Patent: June 3, 2014Assignee: Alibaba Group Holding LimitedInventors: Haibo Sun, Yang Yang, Yining Chen
-
Patent number: 8738367Abstract: A speech signal processing device is equipped with a power acquisition unit, a probability distribution acquisition unit, and a correspondence degree determination unit. The power acquisition unit accepts an inputted speech signal and, based on the accepted speech signal, acquires power representing the intensity of a speech sound represented by the speech signal. The probability distribution acquisition unit acquires a probability distribution using the intensity of the power acquired by the power acquisition unit as a random variable. The correspondence degree determination unit determines whether a correspondence degree representing a degree that power acquired by the power acquisition unit in a case that a predetermined reference speech signal is inputted into the power acquisition unit corresponds with predetermined reference power is higher than a predetermined reference correspondence degree, based on the probability distribution acquired by the probability distribution acquisition unit.Type: GrantFiled: February 18, 2010Date of Patent: May 27, 2014Assignee: NEC CorporationInventor: Tadashi Emori
-
Patent number: 8731914Abstract: A system and method for locating a preferable playback start location after a winding or rewinding action in an audio playing device. In response to an adjustment of the playing location for audio content to a desired playing position, the system determines whether at least one non-speech or silent period of at least a predetermined duration exists within the vicinity of the desired playing position. If at least one such non-speech or silent period exists within the vicinity of the desired playing position, the system adjusts the playing position to fall within one of the at least one non-speech period or silent period.Type: GrantFiled: November 15, 2005Date of Patent: May 20, 2014Assignee: Nokia CorporationInventors: Janne Vainio, Hannu J. Mikkola, Jari M. Makinen
-
Patent number: 8731209Abstract: In order to generate a multi-channel signal having a number of output channels greater than a number of input channels, a mixer is used for upmixing the input signal to form at least a direct channel signal and at least an ambience channel signal. A speech detector is provided for detecting a section of the input signal, the direct channel signal or the ambience channel signal in which speech portions occur. Based on this detection, a signal modifier modifies the input signal or the ambience channel signal in order to attenuate speech portions in the ambience channel signal, whereas such speech portions in the direct channel signal are attenuated to a lesser extent or not at all. A loudspeaker signal outputter then maps the direct channel signals and the ambience channel signals to loudspeaker signals which are associated to a defined reproduction scheme, such as, for example, a 5.1 scheme.Type: GrantFiled: October 1, 2008Date of Patent: May 20, 2014Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.Inventors: Christian Uhle, Oliver Hellmuth, Juergen Herre, Harald Popp, Thorsten Kastner
-
Patent number: 8731207Abstract: An embodiment of an apparatus for computing control information for a suppression filter for filtering a second audio signal to suppress an echo based on a first audio signal includes a computer having a value determiner for determining at least one energy-related value for a band-pass signal of at least two temporally successive data blocks of at least one signal of a group of signals. The computer further includes a mean value determiner for determining at least one mean value of the at least one determined energy-related value for the band-pass signal. The computer further includes a modifier for modifying the at least one energy-related value for the band-pass signal on the basis of the determined mean value for the band-pass signal. The computer further includes a control information computer for computing the control information for the suppression filter on the basis of the at least one modified energy-related value.Type: GrantFiled: January 12, 2009Date of Patent: May 20, 2014Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung e.V.Inventors: Fabian Kuech, Markus Kallinger, Christof Faller, Alexis Favrot
-
Patent number: 8725508Abstract: A computer-implemented method and apparatus for searching for an element sequence, the method comprising: receiving a signal; determining an initial segment of the signal; inputting the initial segment into an element extraction engine to obtain a first element sequence; determining one or more second segments, each of the second segments at least partially overlapping with the initial segment; inputting the second segments into the element extraction engine to obtain at least one second element sequence; and searching for an element subsequence common to at least a predetermined number of sequences of the first element sequence and the second element sequences.Type: GrantFiled: March 27, 2012Date of Patent: May 13, 2014Assignee: NovospeechInventor: Yossef Ben-Ezra
-
Patent number: 8725498Abstract: A computer-implemented method for digital speech processing, including (1) receiving, at a server computer, digital speech data from a computing device, the digital speech data comprising data points sampled at respective time points; (2) computing, by the server computer, a tonal feature of the digital speech data, the tonal feature comprising information encoding fundamental frequencies at the respective time points; (3) computing, by the server computer, a logarithm of the tonal feature at the respective time points; and (4) processing, by the server computer, the logarithm of the tonal feature based on a characterization of the digital speech data at the respective time points.Type: GrantFiled: July 24, 2012Date of Patent: May 13, 2014Assignee: Google Inc.Inventors: Yun-hsuan Sung, Meihong Wang, Xin Lei
-
Patent number: 8700390Abstract: A Voice Activity Detection/Silence Suppression (VAD/SS) system is connected to a channel of a transmission pipe. The channel provides a pathway for the transmission of energy. A method for operating a VAD/SS system includes detecting the energy on the channel, and activating or suppressing activation of the VAD/SS system depending upon the nature of the energy detected on the channel.Type: GrantFiled: October 7, 2013Date of Patent: April 15, 2014Assignee: AT&T Intellectual Property II, L.P.Inventors: Bing Chen, James H. James
-
Patent number: 8694308Abstract: A system for voice detection includes a feature value calculation unit that calculates a feature value from an input signal sliced on a per frame basis, a provisional voice/non-voice decision unit that provisionally decides a voiced interval and a non-voiced interval from the feature value calculated on a per frame basis, and a voice/non-voice decision unit that determines a voiced interval duration threshold value or a non-voiced interval duration threshold value, using a ratio of the feature value found on a per frame basis to a threshold value for the feature value and that re-decides the voiced interval and the non-voiced interval, using the voiced interval duration threshold value determined and the non-voiced interval duration threshold value determined.Type: GrantFiled: November 26, 2008Date of Patent: April 8, 2014Assignee: NEC CorporationInventors: Takayuki Arakawa, Masanori Tsujikawa
-
Patent number: 8694326Abstract: A communication terminal includes a decoder which decodes an input bitstream received from another communication terminal, to generate an output audio signal and outputs the generated output audio signal to a speaker; and an echo canceller which obtains an input audio signal representing sound captured by a microphone placed in a space to which the speaker outputs the sound, and removes, for respective subbands, an echo component included in the obtained input audio signal and corresponding to the output audio signal, to generate an audio signal for transmission. An encoder codes the audio signal for transmission to generate an output bitstream and transmits the generated output bitstream to another communication terminal; and a control unit controls, for the respective subbands, echo cancellation processing according to a reproduction band of at least one of the output audio signal and the audio signal for transmission.Type: GrantFiled: August 21, 2012Date of Patent: April 8, 2014Assignee: Panasonic CorporationInventors: Shuji Miyasaka, Kosuke Nishio, Ichiro Kawashima
-
Patent number: 8687831Abstract: An external device for a hearing implant system and a hearing implant system having an external device is described. An external transmitter generates a radio-frequency inductive link signal to an implanted receiver including a sequence of data word segments which communicate data to the implanted receiver, and a sequence of data word pause segments between each data word segment which communicate energy without data to the implanted receiver. A data word pause controller controls the inductive link signal during the data word pause segments according to an energy management rule.Type: GrantFiled: October 26, 2012Date of Patent: April 1, 2014Assignee: Med-El Elektromedizinische Geraete GmbHInventors: Martin Stoffaneller, Peter Schleich, Thomas Schwarzenbeck
-
Patent number: 8688438Abstract: A speech processing system includes a plurality of signal analyzers that extract salient signal attributes of an input voice signal. A difference module computes the differences in the salient signal attributes. One or more control modules control a plurality of speech generators using an output signal from the difference module in a speech-locked loop (SLL); the speech generators use the output signal to generate a voice signal.Type: GrantFiled: February 9, 2010Date of Patent: April 1, 2014Assignee: Massachusetts Institute of TechnologyInventors: Keng Hoong Wee, Lorenzo Turicchia, Rahul Sarpeshkar
-
Publication number: 20140088958Abstract: The present invention is a method and system to convert a speech signal into a parametric representation in terms of timbre vectors, and to recover the speech signal from that representation. The speech signal is first segmented into non-overlapping frames using the glottal closure instant information, each frame is converted into an amplitude spectrum using a Fourier analyzer, and Laguerre functions are then used to generate a set of coefficients which constitute a timbre vector. A sequence of timbre vectors can be subject to a variety of manipulations. The new timbre vectors are converted back into voice signals by first transforming them into amplitude spectra using Laguerre functions, then generating phase spectra from the amplitude spectra using Kramers-Kronig relations. A Fourier transformer converts the amplitude spectra and phase spectra into elementary acoustic waves, which are then superposed to become the output voice. The method and system can be used for voice transformation, speech synthesis, and automatic speech recognition.Type: ApplicationFiled: December 3, 2012Publication date: March 27, 2014Inventor: Chengjun Julian Chen