Zero Crossing Patents (Class 704/213)
-
Patent number: 11367445
Abstract: Aspects of the disclosure relate to various systems and techniques that provide for a method and apparatus for transmitting speech as text to a remote server and converting the text stream back to speech for delivery to a remote application. For example, a person, through workspace virtualization, is accessing a remote application that accepts speech as its input. The user, using a microphone, would speak into the microphone where the speech would be converted into text with a local speech-to-text converter. The text version of speech is sent to a remote server, which converts the text back to speech using a remote server based text-to-speech converter where the reconstructed speech is usable as input to a remote application or device.
Type: Grant
Filed: February 5, 2020
Date of Patent: June 21, 2022
Assignee: Citrix Systems, Inc.
Inventors: Pawan Kumar Dixit, Dinesh Jidugu
-
Patent number: 11067661
Abstract: An information processing device including an acquisition unit that acquires a sound collection result of a sound from each of one or more sound sources obtained by a sound collection portion of which positional information indicating at least one of a position and a direction is changed and an estimation unit that estimates a direction of each of the one or more sound sources on a basis of a change in a frequency of a sound collected by the sound collection portion in association with a change in the positional information of the sound collection portion.
Type: Grant
Filed: September 28, 2016
Date of Patent: July 20, 2021
Assignee: Sony Corporation
Inventors: Naoya Takahashi, Yuhki Mitsufuji
-
Patent number: 10824391
Abstract: A method comprises converting an audio frequency domain signal into one or more voltage signals. Then the characteristics of the one or more voltage signals are determined. Afterwards the characteristics of the one or more voltage signals are compared with one or more characteristics of an audio trigger command. Activation of an audio user interface is then activated on the basis of the comparison.
Type: Grant
Filed: March 14, 2018
Date of Patent: November 3, 2020
Assignee: Nokia Technologies Oy
Inventors: Jari Tuomas Savolainen, Jukka Mikael Jalkanen, Jyrki Porio
-
Patent number: 10803885
Abstract: An audio event detection system that processes audio data into audio feature data and processes the audio feature data using pre-configured candidate interval lengths to identify top candidate regions of the feature data that may include an audio event. The feature data from the top candidate regions are then scored by a classifier, where the score indicates a likelihood that the candidate region corresponds to a desired audio event. The scores are compared to a threshold, and if the threshold is satisfied, the top scoring candidate region is determined to include an audio event.
Type: Grant
Filed: June 29, 2018
Date of Patent: October 13, 2020
Assignee: Amazon Technologies, Inc.
Inventors: Chieh-Chi Kao, Chao Wang, Weiran Wang, Ming Sun
-
Patent number: 10720154
Abstract: Provided is an information processing device including: a collected sound data acquisition portion that acquires collected sound data; and an output controller that causes an output portion to output at least whether or not a state of the collected sound data is suitable for speech recognition.
Type: Grant
Filed: September 15, 2015
Date of Patent: July 21, 2020
Assignee: SONY CORPORATION
Inventors: Shinichi Kawano, Yuhei Taki, Takashi Shibuya
-
Patent number: 9921803
Abstract: A method comprises converting an audio frequency domain signal into one or more voltage signals. Then the characteristics of the one or more voltage signals are determined. Afterwards the characteristics of the one or more voltage signals are compared with one or more characteristics of an audio trigger command. Activation of an audio user interface is then activated on the basis of the comparison.
Type: Grant
Filed: August 23, 2010
Date of Patent: March 20, 2018
Assignee: Nokia Technologies Oy
Inventors: Jari Tuomas Savolainen, Jukka Mikael Jalkanen, Jyrki Porio
-
Patent number: 9865253
Abstract: The present invention is a system and method for discriminating between human and synthetic speech. The method and system include memory for storing a speaker verification application, a communication network that receives from a client device a speech signal having one or more discriminating features, and a processor for executing instructions stored in memory. The execution of the instructions by the processor extracts the one or more discriminating features from the speech signal and classifies the speech signal as human or synthetic based on the extracted features.
Type: Grant
Filed: August 21, 2014
Date of Patent: January 9, 2018
Assignee: VoiceCipher, Inc.
Inventors: Phillip L. De Leon, Steven Spence, Bryan Stewart, Junichi Yamagishi
-
Patent number: 9749762
Abstract: The disclosed embodiments provide a system that performs a sound-recognition operation. During operation, the system recognizes a sequence of sound primitives in an audio stream, wherein a sound primitive is associated with a semantic label comprising one or more words that describe a sound characterized by the sound primitive. Next, the system feeds the sequence of sound primitives into a finite-state automaton that recognizes events associated with sequences of sound primitives. Finally, the system feeds the recognized events into an output system that generates an output associated with the recognized events to be displayed to a user.
Type: Grant
Filed: July 13, 2016
Date of Patent: August 29, 2017
Assignee: OtoSense, Inc.
Inventors: Sebastien J. V. Christian, Thor C. Whalen
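Illustrative sketch (not taken from the patent): the idea of feeding labelled sound primitives into a finite-state automaton could look roughly like the following. The state names, primitive labels, and the restart-on-mismatch rule are all hypothetical choices made for this example.

```python
def recognize_events(primitives, transitions, accepting, start="idle"):
    """Run a small finite-state automaton over a sequence of primitive labels.

    transitions maps (state, primitive) -> next state.
    accepting maps a state -> the event name emitted on reaching it.
    Both tables are hypothetical inputs supplied by the caller.
    """
    state = start
    events = []
    for primitive in primitives:
        next_state = transitions.get((state, primitive))
        if next_state is None:
            # No edge from the current state: retry from the start state so a
            # primitive that begins a new pattern is not lost.
            next_state = transitions.get((start, primitive), start)
        state = next_state
        if state in accepting:
            events.append(accepting[state])
            state = start  # reset after an event is recognized
    return events


# Hypothetical example tables.
TRANSITIONS = {
    ("idle", "impact"): "heard_impact",
    ("heard_impact", "shatter"): "broken",
}
ACCEPTING = {"broken": "window_broken"}
```

With these tables, the sequence `["hum", "impact", "impact", "shatter", "hum"]` yields a single `window_broken` event.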
-
Patent number: 9552807
Abstract: A system and method for automatically dubbing a video in a first language into a second language, comprising: an audio/video pre-processor configured to provide separate original audio and video files of the same media; a text analysis unit to receive a first text file of the video's subtitles in the first language and a second text file of the video's subtitles in the second language, and re-divide them into text sentences; a text-to-speech unit to receive the text sentences in the first and second languages from the text analysis unit and produce therefrom first and second standard TTS spoken sentences; a prosody unit to receive the first and second spoken sentences, the separated audio file and timing parameters and produce therefrom dubbing recommendations; and a dubbing unit configured to receive the second spoken sentence and the recommendations and produce therefrom an automatically dubbed sentence in the second language.
Type: Grant
Filed: March 11, 2014
Date of Patent: January 24, 2017
Assignee: Video Dubber LTD.
Inventors: Boaz Rossano, Jacob Dvir
-
Patent number: 9426335
Abstract: In one method embodiment, providing a multiplex of compressed versions of a first video stream and a first audio stream, each corresponding to an audiovisual (A/V) program, the first video stream and the first audio stream each corresponding to a first playout rate and un-synchronized with each other for an initial playout portion; and providing a compressed version of a second audio stream, the second audio stream corresponding to a pitch-preserving, second playout rate different than the first playout rate, the second audio stream synchronized to the initial playout portion of the first video stream when the first video stream is played out at the second playout rate, the first audio stream replaceable by the second audio stream for the initial playout portion.
Type: Grant
Filed: January 14, 2014
Date of Patent: August 23, 2016
Assignee: Cisco Technology, Inc.
Inventors: Ali C. Begen, Tankut Akgul, Michael A. Ramalho, David R. Oran, William C. Ver Steeg
-
Patent number: 8947499
Abstract: Methods and systems for communicating with rate control. A communication is sent and received from a first device to a second device over a network, wherein the communication comprises at least one audio stream and a second communication stream. A capacity of the network is probed at the first device for the sending and receiving the communication. A presence of a voice in the at least one audio stream is detected at the first device via a voice activity detection of the at least one audio stream. A rate limit is set for the sending and receiving the communication at the first device based on the capacity of the network and the detection of the presence of the at least one audio stream.
Type: Grant
Filed: December 6, 2012
Date of Patent: February 3, 2015
Assignee: TangoMe, Inc.
Inventors: Alexander Subbotin, Olivier Furon, Shaowei Su, Yevgeni Litvin, Xu Liu
-
Patent number: 8924200
Abstract: A method for decoding an audio signal in a decoder having a CELP-based decoder element including a fixed codebook component, at least one pitch period value, and a first decoder output, wherein a bandwidth of the audio signal extends beyond a bandwidth of the CELP-based decoder element. The method includes obtaining an up-sampled fixed codebook signal by up-sampling the fixed codebook component to a higher sample rate, obtaining an up-sampled excitation signal based on the up-sampled fixed codebook signal and an up-sampled pitch period value, and obtaining a composite output signal based on the up-sampled excitation signal and an output signal of the CELP-based decoder element, wherein the composite output signal includes a bandwidth portion that extends beyond a bandwidth of the CELP-based decoder element.
Type: Grant
Filed: September 28, 2011
Date of Patent: December 30, 2014
Assignee: Motorola Mobility LLC
Inventors: Jonathan A. Gibbs, James P. Ashley, Udar Mittal
-
Patent number: 8903720
Abstract: Provided is an encoding apparatus for integrally encoding and decoding a speech signal and an audio signal, and may include: an input signal analyzer to analyze a characteristic of an input signal; a stereo encoder to down mix the input signal to a mono signal when the input signal is a stereo signal, and to extract stereo sound image information; a frequency band expander to expand a frequency band of the input signal; a sampling rate converter to convert a sampling rate; a speech signal encoder to encode the input signal using a speech encoding module when the input signal is a speech characteristics signal; an audio signal encoder to encode the input signal using an audio encoding module when the input signal is an audio characteristic signal; and a bitstream generator to generate a bitstream.
Type: Grant
Filed: July 14, 2009
Date of Patent: December 2, 2014
Assignee: Electronics and Telecommunications Research Institute
Inventors: Tae Jin Lee, Seung-Kwon Baek, Min Je Kim, Dae Young Jang, Jeongil Seo, Kyeongok Kang, Jin-Woo Hong, Hochong Park, Young-Cheol Park
-
Patent number: 8892231
Abstract: Embodiments for audio classification are described. An audio classification system includes at least one device which executes a process of audio classification on an audio signal. The at least one device can operate in at least two modes requiring different resources. The audio classification system also includes a complexity controller which determines a combination and instructs the at least one device to operate according to the combination. For each of the at least one device, the combination specifies one of the modes of the device, and the resources requirement of the combination does not exceed maximum available resources. By controlling the modes, the audio classification system has improved scalability to an execution environment.
Type: Grant
Filed: August 22, 2012
Date of Patent: November 18, 2014
Assignee: Dolby Laboratories Licensing Corporation
Inventors: Bin Cheng, Lie Lu
-
Patent number: 8879762
Abstract: A method and apparatus to evaluate a quality of an audio signal, in which the number of effective channels is determined for each of a reference signal of a current frame and a test signal indicative of the reference signal that has passed through an audio codec, and an audio quality evaluation score of the current frame is calculated by evaluating an audio quality of the current frame based on the determined number of effective channels for each of the reference signal and the test signal by means of a predetermined evaluator.
Type: Grant
Filed: January 28, 2010
Date of Patent: November 4, 2014
Assignee: Samsung Electronics Co., Ltd.
Inventor: In-Yong Choi
-
Patent number: 8868432
Abstract: A method for decoding an audio signal having a bandwidth that extends beyond a bandwidth of a CELP excitation signal in an audio decoder including a CELP-based decoder element. The method includes obtaining a second excitation signal having an audio bandwidth extending beyond the audio bandwidth of the CELP excitation signal, obtaining a set of signals by filtering the second excitation signal with a set of bandpass filters, scaling the set of signals using a set of energy-based parameters, and obtaining a composite output signal by combining the scaled set of signals with a signal based on the audio signal decoded by the CELP-based decoder element.
Type: Grant
Filed: September 28, 2011
Date of Patent: October 21, 2014
Assignee: Motorola Mobility LLC
Inventors: Jonathan A. Gibbs, James P. Ashley, Udar Mittal
-
Patent number: 8762158
Abstract: A method and apparatus for generating synthesis audio signals are provided. The method includes decoding a bitstream; splitting the decoded bitstream into n sub-band signals; generating n transformed sub-band signals by transforming the n sub-band signals in a frequency domain; and generating synthesis audio signals by respectively multiplying the n transformed sub-band signals by values corresponding to synthesis filter bank coefficients.
Type: Grant
Filed: August 5, 2011
Date of Patent: June 24, 2014
Assignee: Samsung Electronics Co., Ltd.
Inventors: Hyun-wook Kim, Han-gil Moon, Sang-hoon Lee
-
Patent number: 8682654
Abstract: Disclosed are systems, methods, and computer readable media having programs for classifying sports video. In one embodiment, a method includes: extracting, from an audio stream of a video clip, a plurality of key audio components contained therein; and classifying, using at least one of the plurality of key audio components, a sport type contained in the video clip. In one embodiment, a computer readable medium having a computer program for classifying sports video includes: logic configured to extract a plurality of key audio components from a video clip; and logic configured to classify a sport type corresponding to the video clip.
Type: Grant
Filed: April 25, 2006
Date of Patent: March 25, 2014
Assignee: Cyberlink Corp.
Inventors: Ming-Jun Chen, Jiun-Fu Chen, Shih-Min Tang, Ho-Chao Huang
-
Patent number: 8655156
Abstract: In one method embodiment, providing a multiplex of compressed versions of a first video stream and a first audio stream, each corresponding to an audiovisual (A/V) program, the first video stream and the first audio stream each corresponding to a first playout rate and un-synchronized with each other for an initial playout portion; and providing a compressed version of a second audio stream, the second audio stream corresponding to a pitch-preserving, second playout rate different than the first playout rate, the second audio stream synchronized to the initial playout portion of the first video stream when the first video stream is played out at the second playout rate, the first audio stream replaceable by the second audio stream for the initial playout portion.
Type: Grant
Filed: March 2, 2010
Date of Patent: February 18, 2014
Assignee: Cisco Technology, Inc.
Inventors: Ali C. Begen, Tankut Akgul, Michael A. Ramalho, David R. Oran, William C. Ver Steeg
-
Patent number: 8635065
Abstract: The present invention discloses an apparatus for automatic extraction of important events in audio signals comprising: signal input means for supplying audio signals; audio signal fragmenting means for partitioning audio signals supplied by the signal input means into audio fragments of a predetermined length and for allocating a sequence of one or more audio fragments to a respective audio window; feature extracting means for analyzing acoustic characteristics of the audio signals comprised in the audio fragments and for analyzing acoustic characteristics of the audio signals comprised in the audio windows; and important event extraction means for extracting important events in audio signals supplied by the audio signal fragmenting means based on predetermined important event classifying rules depending on acoustic characteristics of the audio signals comprised in the audio fragments and on acoustic characteristics of the audio signals comprised in the audio windows, wherein each important event extracted
Type: Grant
Filed: November 10, 2004
Date of Patent: January 21, 2014
Assignee: Sony Deutschland GmbH
Inventors: Silke Goronzy-Thomae, Thomas Kemp, Ralf Kompe, Yin Hay Lam, Krzysztof Marasek, Raquel Tato
-
Patent number: 8554560
Abstract: Discrimination between two classes comprises receiving a set of frames including an input signal and determining at least two different feature vectors for each of the frames. Discrimination between two classes further comprises classifying the two different feature vectors using sets of preclassifiers trained for at least two classes of events and from that classification, and determining values for at least one weighting factor. Discrimination between two classes still further comprises calculating a combined feature vector for each of the received frames by applying the weighting factor to the feature vectors and classifying the combined feature vector for each of the frames by using a set of classifiers trained for at least two classes of events.
Type: Grant
Filed: September 4, 2012
Date of Patent: October 8, 2013
Assignee: International Business Machines Corporation
Inventor: Zica Valsan
-
Patent number: 8554547
Abstract: A voice activity detection method and apparatus, and an electronic device are provided. The method includes: obtaining a time domain parameter and a frequency domain parameter from an audio frame; obtaining a first distance between the time domain parameter and a long-term-sliding mean of the time domain parameter in a history background noise frame, and obtaining a second distance between the frequency domain parameter and a long-term-sliding mean of the frequency domain parameter in the history background noise frame; and judging whether the audio frame is a foreground voice frame or a background noise frame according to the first distance, the second distance and a set of decision inequalities based on the first distance and the second distance. The above technical solutions enable the judgment criterion to have an adaptive adjustment capability, thus improving the performance of the voice activity detection.
Type: Grant
Filed: July 11, 2012
Date of Patent: October 8, 2013
Assignee: Huawei Technologies Co., Ltd.
Inventor: Zhe Wang
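Illustrative sketch (not taken from the patent): the abstract describes comparing each frame's parameters with long-term sliding means gathered from background-noise frames. The version below is a loose reading of that idea. The exponential moving average, the absolute-difference distance, and the simple "either distance exceeds its limit" rule are assumptions standing in for the unspecified decision inequalities.

```python
class AdaptiveVad:
    """Toy voice activity detector tracking long-term means of noise frames."""

    def __init__(self, time_mean, freq_mean, alpha=0.05,
                 time_limit=1.0, freq_limit=1.0):
        self.time_mean = time_mean    # sliding mean of the time-domain parameter
        self.freq_mean = freq_mean    # sliding mean of the frequency-domain parameter
        self.alpha = alpha            # hypothetical smoothing factor
        self.time_limit = time_limit  # hypothetical decision thresholds
        self.freq_limit = freq_limit

    def is_voice(self, time_param, freq_param):
        first_distance = abs(time_param - self.time_mean)
        second_distance = abs(freq_param - self.freq_mean)
        # Assumed decision rule: voice if either distance is large.
        voice = (first_distance > self.time_limit
                 or second_distance > self.freq_limit)
        if not voice:
            # Only background-noise frames update the long-term means, so the
            # reference point adapts to slowly changing noise.
            self.time_mean += self.alpha * (time_param - self.time_mean)
            self.freq_mean += self.alpha * (freq_param - self.freq_mean)
        return voice
```

Because the means move only on frames judged to be noise, the effective decision boundary drifts with the background, which is one way to obtain the adaptive behaviour the abstract mentions.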
-
Patent number: 8489404
Abstract: A method for detecting a transient in an audio signal that has been broken up into frames includes obtaining a time domain feature of the frames and comparing the time domain feature with a predetermined value. If the time domain feature is greater than the predetermined value, the frames are taken as transient and if the time domain feature is less than the predetermined value, the frames are taken as non-transient. The method has a low computational intensity and is thus very suitable for devices with limited processing resources.
Type: Grant
Filed: March 15, 2011
Date of Patent: July 16, 2013
Assignee: Freescale Semiconductor, Inc.
Inventors: Zhongsong Lin, Shidong Shang, Shengjiu Wang
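Illustrative sketch (not taken from the patent): the abstract does not say which time domain feature is used. As a stand-in, the example below uses the ratio of a frame's energy to the previous frame's energy, compared against a hypothetical threshold of 4.0.

```python
def frame_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(sample * sample for sample in frame)


def is_transient(previous_frame, frame, threshold=4.0):
    """Flag a frame whose energy jumps sharply relative to the previous frame.

    The energy-ratio feature and the threshold value are assumptions made for
    illustration only.
    """
    previous_energy = frame_energy(previous_frame)
    current_energy = frame_energy(frame)
    if previous_energy == 0:
        # Any energy after pure silence counts as a jump.
        return current_energy > 0
    return current_energy / previous_energy > threshold
```

A feature of this kind needs only one multiply-accumulate per sample, which fits the abstract's emphasis on low computational cost.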
-
Voice analysis device, voice analysis method, voice analysis program, and system integration circuit
Patent number: 8478587
Abstract: A sound analysis device comprises: a sound parameter calculation unit operable to acquire an audio signal and calculate a sound parameter for each of partial audio signals, the partial audio signals each being the acquired audio signal in a unit of time; a category determination unit operable to determine, from among a plurality of environmental sound categories, which environmental sound category each of the partial audio signals belongs to, based on a corresponding one of the calculated sound parameters; a section setting unit operable to sequentially set judgement target sections on a time axis as time elapses, each of the judgment target sections including two or more of the units of time, the two or more of the units of time being consecutive; and an environment judgment unit operable to judge, based on a number of partial audio signals in each environmental sound category determined in at least a most recent judgment target section, an environment that surrounds the sound analysis device in at least the
Type: Grant
Filed: March 13, 2008
Date of Patent: July 2, 2013
Assignee: Panasonic Corporation
Inventors: Takashi Kawamura, Ryouichi Kawanishi
-
Patent number: 8326612
Abstract: A non-speech section detecting device generating a plurality of frames having a given time length on the basis of sound data obtained by sampling sound, and detecting a non-speech section having a frame not containing voice data based on speech uttered by a person, the device including: a calculating part calculating a bias of a spectrum obtained by converting sound data of each frame into components on a frequency axis; a judging part judging whether the bias is greater than or equal to a given threshold or alternatively smaller than or equal to a given threshold; a counting part counting the number of consecutive frames judged as having a bias greater than or equal to the threshold or alternatively smaller than or equal to the threshold; a count judging part judging whether the obtained number of consecutive frames is greater than or equal to a given value.
Type: Grant
Filed: April 5, 2010
Date of Patent: December 4, 2012
Assignee: Fujitsu Limited
Inventors: Nobuyuki Washio, Shoji Hayakawa
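Illustrative sketch (not taken from the patent): the judging, counting, and count-judging parts described above amount to finding runs of consecutive frames whose spectral bias passes a threshold. The example assumes the per-frame bias values have already been computed elsewhere; the threshold and minimum run length are hypothetical.

```python
def non_speech_sections(bias_per_frame, threshold=0.5, min_frames=3):
    """Return (start, end) frame index pairs, end exclusive, for each run of
    at least min_frames consecutive frames with bias >= threshold."""
    sections = []
    run_start = None
    for index, bias in enumerate(bias_per_frame):
        if bias >= threshold:
            if run_start is None:
                run_start = index  # a new run begins here
        else:
            if run_start is not None and index - run_start >= min_frames:
                sections.append((run_start, index))
            run_start = None
    # Close a run that extends to the last frame.
    if run_start is not None and len(bias_per_frame) - run_start >= min_frames:
        sections.append((run_start, len(bias_per_frame)))
    return sections
```

Requiring a minimum run length keeps isolated high-bias frames from being reported as non-speech sections.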
-
Patent number: 8311813
Abstract: Discrimination between at least two classes of events in an input signal is carried out in the following way. A set of frames containing an input signal is received, and at least two different feature vectors are determined for each of said frames. Said at least two different feature vectors are classified using respective sets of preclassifiers trained for said at least two classes of events. Values for at least one weighting factor are determined based on outputs of said preclassifiers for each of said frames. A combined feature vector is calculated for each of said frames by applying said at least one weighting factor to said at least two different feature vectors. Said combined feature vector is classified using a set of classifiers trained for said at least two classes of events.
Type: Grant
Filed: October 26, 2007
Date of Patent: November 13, 2012
Assignee: International Business Machines Corporation
Inventor: Zica Valsan
-
Patent number: 8296133
Abstract: A voice activity detection method and apparatus, and an electronic device are provided. The method includes: obtaining a time domain parameter and a frequency domain parameter from an audio frame; obtaining a first distance between the time domain parameter and a long-term sliding mean of the time domain parameter in a history background noise frame, and obtaining a second distance between the frequency domain parameter and a long-term sliding mean of the frequency domain parameter in the history background noise frame; and judging whether the audio frame is a foreground voice frame or a background noise frame according to the first distance, the second distance and a set of decision inequalities based on the first distance and the second distance. The above technical solutions enable the judgment criterion to have an adaptive adjustment capability, thus improving the performance of the voice activity detection.
Type: Grant
Filed: November 30, 2011
Date of Patent: October 23, 2012
Assignee: Huawei Technologies Co., Ltd.
Inventor: Zhe Wang
-
Patent number: 8248109
Abstract: Methods and systems for detection of zero crossings in a signal are described. For example, true zero crossings in an alternating voltage power source signal can be detected in the presence of noise pulses. The zero crossing detections are performed by establishing a value of a signal status counter, and at a repeating interval if the signal is a logic low value, the value of the signal status counter is decremented if the signal status counter is greater than a first value otherwise a flag is set to enable detection of a zero crossing in the signal. In addition, at the repeating interval, if the signal is a logic high value, the value of the signal status counter is incremented, and if after incrementing the signal status counter is equal to a second value and the flag is set, a zero crossing of the signal is declared.
Type: Grant
Filed: January 11, 2010
Date of Patent: August 21, 2012
Assignee: ASCO Power Technologies, L.P.
Inventor: William Scholder
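Illustrative sketch (not taken from the patent): the counter-and-flag scheme in the abstract can be approximated as below. The input is assumed to be a list of logic levels sampled at the repeating interval. Capping the counter at the second value and clearing the flag after each declared crossing are assumptions added here so the example behaves sensibly over long inputs.

```python
def detect_zero_crossings(levels, first_value=0, second_value=3):
    """Return sample indices at which a debounced low-to-high crossing is declared."""
    counter = first_value
    armed = False  # the flag that enables detection
    crossings = []
    for index, level in enumerate(levels):
        if not level:
            if counter > first_value:
                counter -= 1  # drain the counter while the signal is low
            else:
                armed = True  # signal has been low long enough: enable detection
        else:
            if counter < second_value:  # assumed cap, not stated in the abstract
                counter += 1
            if counter == second_value and armed:
                crossings.append(index)
                armed = False  # assumed: re-arm only after the next low period
    return crossings


# A short noise pulse at index 3 is ignored; crossings are declared only after
# three high samples accumulate.
levels = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1]
print(detect_zero_crossings(levels))  # [8, 16]
```

The counter acts as a debounce filter: a brief pulse raises it by one step and the following low samples drain it again, so no crossing is declared.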
-
Patent number: 8160887
Abstract: Digital audio sample data are adaptively processed for interpolation based on whether the frequency at which the digital audio signal samples reverse polarity is at least equal to a predetermined threshold, the threshold being determined by their sampling frequency. If so, the digital audio signal samples are subjected to zero-order interpolation, with zero-inserting between the samples followed by lowpass filtering; if not, the samples are subjected to Lagrange (spline) interpolation processing.
Type: Grant
Filed: March 10, 2005
Date of Patent: April 17, 2012
Assignee: D&M Holdings, Inc.
Inventor: Mitsugi Fukushima
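Illustrative sketch (not taken from the patent): the selection step in the abstract can be pictured as counting polarity reversals per second and comparing that rate with a threshold tied to the sampling frequency. The fraction of 0.25 and the returned method names are hypothetical; the interpolation itself is not implemented here.

```python
def polarity_reversals_per_second(samples, sample_rate):
    """Rate at which consecutive samples change sign (zero counts as positive)."""
    if len(samples) < 2 or sample_rate <= 0:
        return 0.0
    reversals = sum((a >= 0) != (b >= 0) for a, b in zip(samples, samples[1:]))
    duration_seconds = len(samples) / sample_rate
    return reversals / duration_seconds


def choose_interpolation(samples, sample_rate, fraction=0.25):
    """Pick an interpolation method name from the polarity-reversal rate.

    The threshold is fraction * sample_rate; the fraction is an assumed value.
    """
    threshold = fraction * sample_rate
    rate = polarity_reversals_per_second(samples, sample_rate)
    return "zero_insert_lowpass" if rate >= threshold else "lagrange"
```

A rapidly alternating input selects the zero-insertion path, while a slowly varying one selects Lagrange interpolation.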
-
Patent number: 8156179
Abstract: Disclosed herein are systems and methods for a distributed computing system having a service-oriented architecture. The system is configured to receive workloads from client applications and to execute workloads on service hosts. The distributed computing system dynamically assigns the workloads to the applications running on the service hosts, with the workloads being assigned according to the service needs and the availability of service hosts and other resources on the system. The presently disclosed systems and methods provide for high-throughput communications through an asynchronous binary or a synchronous binary communications protocol. Further disclosed embodiments include flexible failover and upgrade techniques, isolation between execution users of the system, virtualization through mobility and the ability to grow and shrink assigned resources, and for a software development kit adapted for the present architecture.
Type: Grant
Filed: April 26, 2007
Date of Patent: April 10, 2012
Assignee: Platform Computing Corporation
Inventors: Onkar S. Parmar, Yonggang Hu
-
Patent number: 8069039
Abstract: In a sound signal processing apparatus, a frame information generation section generates frame information of each frame of a sound signal. A storage stores the frame information generated by the frame information generation section. A first interval determination section determines a first utterance interval in the sound signal. A second interval determination section determines a second utterance interval based on the frame information of the first utterance interval stored in the storage such that the second utterance interval is made shorter than the first utterance interval and confined within the first utterance interval by trimming frames from either of a start point or an end point of the first utterance interval.
Type: Grant
Filed: December 21, 2007
Date of Patent: November 29, 2011
Assignee: Yamaha Corporation
Inventor: Yasuo Yoshioka
-
Patent number: 8046215
Abstract: A method and apparatus to detect voice activity by using a zero-crossing rate includes removing noise included in an audio signal, adding a random signal having energy of a predetermined size to the audio signal from which noise is removed, extracting predetermined voice detection parameters from the audio signal to which the random signal is added, and comparing the extracted predetermined voice detection parameters with a threshold value and determining voice and non-voice activities.
Type: Grant
Filed: May 23, 2008
Date of Patent: October 25, 2011
Assignee: SAMSUNG Electronics Co., Ltd.
Inventor: Jae-youn Cho
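Illustrative sketch (not taken from the patent): the example below computes a zero-crossing rate and shows the effect of adding a small random signal before measuring it. The uniform dither, its amplitude, and the fixed seed are assumptions. The noise-removal step and the final threshold comparison from the abstract are left out.

```python
import random


def zero_crossing_rate(frame):
    """Fraction of consecutive sample pairs whose signs differ (zero counts as positive)."""
    if len(frame) < 2:
        return 0.0
    changes = sum((a >= 0) != (b >= 0) for a, b in zip(frame, frame[1:]))
    return changes / (len(frame) - 1)


def add_dither(frame, amplitude=1e-3, seed=0):
    """Add a low-level uniform random signal; amplitude and seed are hypothetical."""
    rng = random.Random(seed)
    return [sample + rng.uniform(-amplitude, amplitude) for sample in frame]
```

An all-zero frame has a zero-crossing rate of 0, which makes it look like low-frequency content. After dithering, the same frame crosses zero on roughly half of its sample pairs, while a strong low-frequency tone is almost unaffected. That is one way the added random signal can make silence easier to separate from voiced frames.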
-
Patent number: 8032366
Abstract: To increase channel capacity, mobile phone carriers have deployed speech coders, such as Advanced MultiBand Excitation coding (AMBE), in networks to reduce the bit rate of each call. One undesired consequence of employing such speech coders is that the voice quality can be much worse as compared to higher bit-rate speech coders. A method or corresponding apparatus in an example embodiment of the present invention performs voice quality enhancement transparently within a network by detecting use of a coder applying rate reduction to a speech signal and known to have an adverse effect on a coded speech signal. Upon detection of the use of such coder, the coded speech signal is corrected based on components introduced into the coded speech signal due to the rate reduction. As a result of applying the voice quality enhancement, adverse effects of speech coders can be reduced, while maintaining high quality voice signals.
Type: Grant
Filed: May 16, 2008
Date of Patent: October 4, 2011
Assignee: Tellabs Operations, Inc.
Inventors: Daniel Mapes-Riordan, Steve R. Page
-
Patent number: 8015000
Abstract: An audio decoding system performs frame loss concealment (FLC) when portions of a bit stream representing an audio signal are lost within the context of a digital communication system. The audio decoding system employs two different FLC methods: one designed to perform well for music, and the other designed to perform well for speech. When a frame is deemed lost, the audio decoding system analyzes a previously-decoded audio signal corresponding to previously-decoded frames of an audio bit-stream. Based on the results of the analysis, the lost frame is classified as either speech or music. Using this classification, other signal analysis, and knowledge of the employed FLC methods, the audio decoding system selects the appropriate FLC method which then performs FLC on the lost frame.
Type: Grant
Filed: April 13, 2007
Date of Patent: September 6, 2011
Assignee: Broadcom Corporation
Inventors: Robert W. Zopf, Juin-Hwey Chen, Jes Thyssen
-
Patent number: 7983906
Abstract: There is provided a voice activity detection method for indicating an active voice mode and an inactive voice mode. The method comprises receiving a first portion of an input signal; determining that the first portion of the input signal includes an active voice signal; indicating the active voice mode in response to the determining that the first portion of the input signal includes the active voice signal; receiving a second portion of the input signal immediately following the first portion of the input signal; determining that the second portion of the input signal includes an inactive voice signal; extending the indicating the active voice mode for a period of time after determining that the second portion of the input signal includes the inactive voice signal, wherein the period of time varies based on one or more conditions; and indicating the inactive voice mode after expiration of the period of time.
Type: Grant
Filed: January 26, 2006
Date of Patent: July 19, 2011
Assignee: Mindspeed Technologies, Inc.
Inventors: Yang Gao, Eyal Shlomot, Adil Benyassine
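Illustrative sketch (not taken from the patent): extending the active-voice indication after speech ends is often called a hangover. The example uses a fixed hangover length for simplicity, whereas the abstract describes a period that varies with conditions. The function name and default value are hypothetical.

```python
def apply_hangover(raw_flags, hangover_frames=2):
    """Extend each active-voice run by hangover_frames inactive frames.

    raw_flags holds per-frame booleans from some upstream detector.
    A fixed hangover length is an assumed simplification.
    """
    extended = []
    remaining = 0
    for active in raw_flags:
        if active:
            remaining = hangover_frames  # restart the hangover timer
            extended.append(True)
        elif remaining > 0:
            remaining -= 1               # still inside the hangover period
            extended.append(True)
        else:
            extended.append(False)
    return extended
```

For a raw decision sequence of one active frame followed by four inactive frames, a hangover of two frames keeps the active indication for the first three frames and releases it for the last two.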
-
Patent number: 7970121
Abstract: In a voice activity detection (VAD) device a method for defining tone signals comprises defining a threshold for zero amplitude change, calculating a zero crossing rate of a signal, extracting a set of parameters from a plurality of duration periods of the signal, defining a tolerance threshold between the plurality of duration periods when a zero amplitude change occurs, calculating a maximum difference between the plurality of duration periods, and comparing the maximum difference with the threshold. The method is implemented in the International Telecommunications Union (ITU) recommendation G.729 Annex B VAD.
Type: Grant
Filed: August 29, 2007
Date of Patent: June 28, 2011
Assignee: Texas Instruments Incorporated
Inventor: Dunling Li
-
Patent number: 7929520
Abstract: In a method, apparatus and system for transmitting packet loss concealment (PLC) information, a subscriber device divides a voice sample into a plurality of packets, each including a plurality of successive frames having portions of the voice sample. The subscriber device determines if a predetermined look ahead time duration from the final frame of the plurality of successive frames in a current packet of the plurality of packets includes a noise to voice transition. When the predetermined look ahead time duration is determined to include the noise to voice transition, the subscriber device packs packing information regarding the predetermined look ahead time duration into the current packet. Finally, the subscriber device encodes the plurality of successive frames into the current packet for transmission.
Type: Grant
Filed: May 2, 2008
Date of Patent: April 19, 2011
Assignee: Texas Instruments Incorporated
Inventor: Dunling Li
-
Patent number: 7805297
Abstract: A system and method for performing frame loss concealment (FLC) when portions of a bit stream representing an audio signal are lost within the context of a digital communication system. The system and method utilizes a plurality of different FLC techniques, wherein each technique is tuned or designed for a different kind of audio signal. When a frame is lost, a previously-decoded audio signal corresponding to one or more previously-received good frames is analyzed. Based on the result of the analysis, the FLC technique that is most likely to perform well for the previously-decoded audio signal is chosen to perform the FLC operation for the current lost frame. In one implementation, the plurality of different FLC techniques include an FLC technique designed for music, such as a frame repeat FLC technique, and an FLC technique designed for speech, such as a periodic waveform extrapolation (PWE) technique.
Type: Grant
Filed: November 23, 2005
Date of Patent: September 28, 2010
Assignee: Broadcom Corporation
Inventor: Juin-Hwey Chen
-
Patent number: 7702502
Abstract: The present invention provides a system and method for representing quasi-periodic (“qp”) waveforms comprising, representing a plurality of limited decompositions of the qp waveform, wherein each decomposition includes a first and second amplitude value and at least one time value. In some embodiments, each of the decompositions is phase adjusted such that the arithmetic sum of the plurality of limited decompositions reconstructs the qp waveform. These decompositions are stored into a data structure having a plurality of attributes. Optionally, these attributes are used to reconstruct the qp waveform, or patterns or features of the qp wave can be determined by using various pattern-recognition techniques. Some embodiments provide a system that uses software, embedded hardware or firmware to carry out the above-described method. Some embodiments use a computer-readable medium to store the data structure and/or instructions to execute the method.
Type: Grant
Filed: February 23, 2006
Date of Patent: April 20, 2010
Assignee: Digital Intelligence, L.L.C.
Inventors: Carlos A. Ricci, Vladimir V. Kovtun
-
Patent number: 7555429
Abstract: Speech likeliness or a degree of speech is determined with a simple configuration or with a small amount of processing, and speech parts are separated from an input sound signal. The input sound signal is subjected to a waveform slicing process in frame units. The increase and decrease rate of a half wavelength in the frame is computed. The rate of a zero cross in the frame is computed. The increase and decrease rate of a half wavelength is computed by determining the rate of the portion where the upward half-wavelength or the downward half-wavelength of the waveform of the input sound signal changes to increase and decrease alternately or to decrease and increase alternately. The degree of speech is determined using each rate. Speech processing for separating or accentuating/attenuating speech and background noise in accordance with the degree of speech is performed on the sound signal for each frame.
Type: Grant
Filed: June 30, 2005
Date of Patent: June 30, 2009
Assignee: Sony Corporation
Inventors: Tetsujiro Kondo, Junichi Shima, Hiroshi Ichiki, Akihiko Arimitsu
-
Patent number: 7447279
Abstract: A device, method, and computer readable medium are used in connection with providing an indication of zero crossings corresponding to a signal. The signal (113) is received. Noise is removed from the signal (103). In response to the signal with noise removed (115), pairs of points and a time value corresponding to each point are determined (105), wherein the points of each pair are proximate to a predetermined change in an amplitude of the signal. In response to the pairs and the corresponding time values (117), a zero crossing time is determined for each pair (107). A variation in the plurality of zero crossing times (119) is corrected (109). A signal (123) or indication representative of the corrected zero crossing times is output.
Type: Grant
Filed: January 31, 2005
Date of Patent: November 4, 2008
Assignee: Freescale Semiconductor, Inc.
Inventor: David L. Wilson
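The step of deriving a zero crossing time from a pair of points on either side of the crossing can be sketched in Python as below. Linear interpolation is an assumption made here for illustration, since the abstract does not state how the time is computed; the function name `zero_crossing_times` is hypothetical, and the noise-removal and variation-correction stages are not modelled.

```python
def zero_crossing_times(times, values):
    """Estimate zero-crossing times from timestamped samples.

    For each adjacent pair of points whose values have opposite signs,
    the crossing time is found by linear interpolation (an assumed
    method, not taken from the patent). A sample that is exactly zero
    is reported at its own timestamp.
    """
    crossings = []
    points = list(zip(times, values))
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if v0 == 0:
            crossings.append(t0)
        elif v0 * v1 < 0:
            # Fraction of the interval at which the line hits zero.
            frac = -v0 / (v1 - v0)
            crossings.append(t0 + (t1 - t0) * frac)
    return crossings
```

With samples at times 0, 1, 2, 3 and values -1, 1, 3, -1, the sketch places the crossings at t = 0.5 and t = 2.75.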
-
Publication number: 20080154585
Abstract: In a sound signal processing apparatus, a frame information generation section generates frame information of each frame of a sound signal. A storage stores the frame information generated by the frame information generation section. A first interval determination section determines a first utterance interval in the sound signal. A second interval determination section determines a second utterance interval based on the frame information of the first utterance interval stored in the storage such that the second utterance interval is made shorter than the first utterance interval and confined within the first utterance interval by trimming frames from either of a start point or an end point of the first utterance interval.
Type: Application
Filed: December 21, 2007
Publication date: June 26, 2008
Applicant: Yamaha Corporation
Inventor: Yasuo Yoshioka
-
Publication number: 20080140394
Abstract: An implementation of the present invention comprises a voice encoder and decoder method and system that uses voice excitation, eliminating the voiced/unvoiced pitch tracking, and the first formant up to 2400 Hertz for synchronous and up to 1600 Hertz for asynchronous, does not use pulse code modulation encoding, but uses the zero crossings only of the first formant, frequency dividing by two and sampling at the formant frequency. The resulting combination uses half or less of the bit rate for excitation and the remainder for short-term spectrum analysis. The spectrum could be updated each 20 milliseconds using 49 bits for the spectrum frame and 49 bits for excitation and one frame bit for synchronous operation. Asynchronous operation could be updated at 21.25 milliseconds using 49 bits for the spectrum information and 34 bits for excitation with one bit for frame synchronization.
Type: Application
Filed: February 15, 2008
Publication date: June 12, 2008
Inventor: Clyde Holmes
-
Patent number: 7363232
Abstract: The present invention provides a method and system for processing an audio signal. According to an exemplary method, an audio signal such as a digital voice signal is received and divided into one or more individual unit cycles. An audio speed conversion operation is enabled by repeating or removing one or more of the individual unit cycles. In particular, repeating one or more of the individual unit cycles decreases audio speed, and removing one or more of the individual unit cycles increases audio speed.
Type: Grant
Filed: June 29, 2001
Date of Patent: April 22, 2008
Assignee: Thomson Licensing
Inventors: Magdy Megeid, Markus Inkamp
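A toy Python sketch of the idea in this abstract follows: split the signal into unit cycles, then drop cycles to speed playback up or repeat them to slow it down. The abstract does not define how a unit cycle is delimited, so the sketch assumes cycles start at positive-going zero crossings; the function names and the "every N-th cycle" selection rule are likewise hypothetical.

```python
def split_unit_cycles(samples):
    """Split a sample list at positive-going zero crossings.

    The cycle boundary is an assumption; the abstract does not say how
    unit cycles are identified.
    """
    cycles = []
    start = 0
    for i in range(1, len(samples)):
        if samples[i - 1] < 0 <= samples[i]:
            cycles.append(samples[start:i])
            start = i
    cycles.append(samples[start:])
    return cycles


def change_speed(samples, every, mode):
    """Drop ('faster') or repeat ('slower') every `every`-th unit cycle."""
    if mode not in ("faster", "slower"):
        raise ValueError("mode must be 'faster' or 'slower'")
    out = []
    for n, cycle in enumerate(split_unit_cycles(samples), start=1):
        if n % every == 0:
            if mode == "faster":
                continue           # removing a cycle shortens the audio
            out.extend(cycle)      # repeating a cycle lengthens it
        out.extend(cycle)
    return out
```

Because whole cycles are removed or duplicated at zero crossings, the joins occur where the waveform passes through zero, which is the usual motivation for cutting at such points.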
-
Patent number: 7277537
Abstract: In a voice activity detection (VAD) device a method for defining tone signals comprises defining a threshold for zero amplitude change, calculating a zero crossing rate of a signal, extracting a set of parameters from a plurality of duration periods of the signal, defining a tolerance threshold between the plurality of duration periods when a zero amplitude change occurs, calculating a maximum difference between the plurality of duration periods, and comparing the maximum difference with the threshold. The method is implemented in the International Telecommunications Union (ITU) recommendation G.729 Annex B VAD.
Type: Grant
Filed: September 2, 2003
Date of Patent: October 2, 2007
Assignee: Texas Instruments Incorporated
Inventor: Dunling Li
-
Patent number: 7253600
Abstract: A sample-data analog circuit includes a level-crossing detector. The level-crossing detector controls sampling switches to provide a precise sample of the output voltage when the level-crossing detector senses the predetermined level crossing of the input signal. A multiple segment ramp waveform generator is used in the sample-data analog circuits. The ramp waveform generator includes an amplifier, a variable current source, and a voltage detection circuit coupled to the current source to control the change in the amplitude of the current. The ramp generator produces constant slope within each segment regardless of the load condition. The sample-data analog circuit also utilizes variable bandwidths and thresholds.
Type: Grant
Filed: July 18, 2006
Date of Patent: August 7, 2007
Assignee: Cambridge Analog Technology, LLC
Inventor: Hae-Seung Lee
-
Patent number: 7184952
Abstract: A simple and efficient method for producing an obfuscated speech signal which may be used to mask a stream of speech, is disclosed. A speech signal representing the speech stream to be masked is obtained. The speech signal is then temporally partitioned into segments, preferably corresponding to phonemes within the speech stream. The segments are then stored in a memory, and some or all of the segments are subsequently selected, retrieved, and assembled into an obfuscated speech signal representing an unintelligible speech stream that, when combined with the speech signal or reproduced and combined with the speech stream, provides a masking effect. While the presently preferred embodiment finds application most readily in an open plan office, embodiments suitable for use in restaurants, classrooms, and in telecommunications systems are also disclosed.
Type: Grant
Filed: July 12, 2006
Date of Patent: February 27, 2007
Assignee: Applied Minds, Inc.
Inventors: W. Daniel Hillis, Bran Ferren, Russel Howe
-
Patent number: 7020177
Abstract: Briefly, in accordance with an embodiment of the invention, a method and apparatus to transfer information is provided, wherein the method includes transferring information between at least two wireless devices using a waveform that includes a first sinusoidal signal and a second sinusoidal signal, wherein the second sinusoidal signal has more zero-crossings than the first signal and wherein a duration of the first sinusoidal signal is less than a duration of the second sinusoidal signal.
Type: Grant
Filed: October 1, 2002
Date of Patent: March 28, 2006
Assignee: Intel Corporation
Inventors: David G. Leeper, David G. England
-
Patent number: 6993477
Abstract: A signal processing device utilizes a stochastic approximation of a gradient descent algorithm for updating a transform. The signal processing device is configured to implement the transform for producing a desired transformed output signal, and the transform is updated using the stochastic approximation of the gradient algorithm based on received data associated with the signal being processed. The transform is represented in a reduced-parameter form, such as a Givens parameterized form or a Householder form, such that the reduced-parameter form for an N×N transform comprises fewer than N² parameters. The updating process is implemented using computations involving the reduced-parameter form, and an adaptation of the transform is represented directly as one or more changes in the reduced-parameter form. The gradient algorithm may be configured to minimize a negative gradient of a pairwise energy compaction property of the transform.
Type: Grant
Filed: June 8, 2000
Date of Patent: January 31, 2006
Assignee: Lucent Technologies Inc.
Inventor: Vivek K. Goyal
-
Patent number: 6832194
Abstract: The present invention includes a novel audio recognition peripheral system and method. The audio recognition peripheral system comprises an audio recognition peripheral and a programmable processor such as a microprocessor or microcontroller. In one embodiment, the audio recognition peripheral includes a feature extractor and vector processor. The feature extractor receives an audio signal and extracts recognition features. The extracted audio recognition features are transmitted to the programmable processor and processed in accordance with an audio recognition algorithm. During execution of the audio recognition algorithm, the programmable processor signals the audio recognition peripheral to perform vector operations. Thus, computationally intensive recognition operations are advantageously offloaded to the peripheral.
Type: Grant
Filed: October 26, 2000
Date of Patent: December 14, 2004
Assignee: Sensory, Incorporated
Inventors: Forrest S. Mozer, Robert E. Savoie, William T. Teasley