Silence Decision Patents (Class 704/210)
-
Patent number: 8762158Abstract: A method and apparatus for generating synthesis audio signals are provided. The method includes decoding a bitstream; splitting the decoded bitstream into n sub-band signals; generating n transformed sub-band signals by transforming the n sub-band signals in a frequency domain; and generating synthesis audio signals by respectively multiplying the n transformed sub-band signals by values corresponding to synthesis filter bank coefficients.Type: GrantFiled: August 5, 2011Date of Patent: June 24, 2014Assignee: Samsung Electronics Co., Ltd.Inventors: Hyun-wook Kim, Han-gil Moon, Sang-hoon Lee
-
Publication number: 20140160227Abstract: Methods and systems for communicating with rate control. A communication is sent and received from a first device to a second device over a network, wherein the communication comprises at least one audio stream and a second communication stream. A capacity of the network is probed at the first device for the sending and receiving the communication. A presence of a voice in the at least one audio stream is detected at the first device via a voice activity detection of the at least one audio stream. A rate limit is set for the sending and receiving the communication at the first device based on the capacity of the network and the detection of the presence of the at least one audio stream.Type: ApplicationFiled: December 6, 2012Publication date: June 12, 2014Applicant: TANGOME, INC.Inventors: Alexander Subbotin, Olivier Furon, Shaowei Su, Yevgeni Litvin, Xu Liu
-
Patent number: 8744846Abstract: Provided are a noise state determination method and an apparatus and a computer readable recording medium therefor. A noisy speech signal processing method according to the present invention includes calculating a transformed spectrum by transforming an input noisy speech signal to a frequency domain; calculating a smoothed magnitude spectrum by reducing magnitude differences of the transformed spectrum between neighboring frames; calculating a search spectrum which represents an estimated noise component of the smoothed magnitude spectrum; and calculating an identification ratio which represents a ratio of a noise component included in the input noisy speech signal, by using the smoothed magnitude spectrum and the search spectrum. Since a small amount of calculation is required and a large-capacity memory is not required, the present invention may be easily implemented as hardware or software.Type: GrantFiled: November 27, 2008Date of Patent: June 3, 2014Assignee: Transono Inc.Inventors: Sung Il Jung, Dong Gyung Ha
-
Patent number: 8725499Abstract: Disclosed configurations include systems, methods, and apparatus arranged to generate a sequence of spectral tilt values that is based on inactive frames of a speech signal. For each of a plurality of inactive frames of the speech signal, a transmit decision is made according to a change calculated among at least two corresponding values of the sequence. The outcome of the transmit decision determines whether a silence description is transmitted for the corresponding inactive frame.Type: GrantFiled: July 30, 2007Date of Patent: May 13, 2014Assignee: QUALCOMM IncorporatedInventors: Vivek Rajendran, Ananthapadmanabhan A. Kandhadai
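The transmit decision described in this abstract can be illustrated with a minimal sketch. It is not the patented method: the tilt measure (first normalized autocorrelation coefficient), the comparison against the last transmitted tilt, and the names `spectral_tilt`, `sid_transmit_decisions`, and `threshold` are all assumptions made for illustration.

```python
def spectral_tilt(frame):
    """Tilt proxy (assumption): first normalized autocorrelation coefficient r[1]/r[0]."""
    r0 = sum(x * x for x in frame)
    if r0 == 0.0:
        return 0.0
    r1 = sum(a * b for a, b in zip(frame, frame[1:]))
    return r1 / r0


def sid_transmit_decisions(inactive_frames, threshold=0.2):
    """Return one boolean per inactive frame; True means send a silence description.

    Hypothetical rule: send when the tilt has moved by more than `threshold`
    since the last silence description that was sent.
    """
    decisions = []
    last_sent_tilt = None
    for frame in inactive_frames:
        tilt = spectral_tilt(frame)
        send = last_sent_tilt is None or abs(tilt - last_sent_tilt) > threshold
        if send:
            last_sent_tilt = tilt
        decisions.append(send)
    return decisions
```

Under this rule, a run of inactive frames with a stable spectral shape yields a single silence description, and a new one is sent only when the background noise changes character.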
-
Patent number: 8712768Abstract: A method, device, system, and computer program product expand narrowband speech signals to wideband speech signals. The method includes determining signal type information from a signal, obtaining characteristics for forming an upper band signal using the determined signal type information, determining signal noise information, using the determined signal noise information to modify the obtained characteristics for forming the upper band signal, and forming the upper band signal using the modified characteristics.Type: GrantFiled: May 25, 2004Date of Patent: April 29, 2014Assignee: Nokia CorporationInventors: Laura Laaksonen, Päivi Valve
-
Patent number: 8700390Abstract: A Voice Activity Detection/Silence Suppression (VAD/SS) system is connected to a channel of a transmission pipe. The channel provides a pathway for the transmission of energy. A method for operating a VAD/SS system includes detecting the energy on the channel, and activating or suppressing activation of the VAD/SS system depending upon the nature of the energy detected on the channel.Type: GrantFiled: October 7, 2013Date of Patent: April 15, 2014Assignee: AT&T Intellectual Property II, L.P.Inventors: Bing Chen, James H. James
-
Patent number: 8687831Abstract: An external device for a hearing implant system and a hearing implant system having an external device is described. An external transmitter generates a radio-frequency inductive link signal to an implanted receiver including a sequence of data word segments which communicate data to the implanted receiver, and a sequence of data word pause segments between each data word segment which communicate energy without data to the implanted receiver. A data word pause controller controls the inductive link signal during the data word pause segments according to an energy management rule.Type: GrantFiled: October 26, 2012Date of Patent: April 1, 2014Assignee: Med-El Elektromedizinische Geraete GmbHInventors: Martin Stoffaneller, Peter Schleich, Thomas Schwarzenbeck
-
Patent number: 8682666Abstract: A computer implemented method, data processing system, apparatus and computer program product for determining current behavioral, psychological and speech styles characteristics of a speaker in a given situation and context, through analysis of current speech utterances of the speaker. The analysis calculates different prosodic parameters of the speech utterances, consisting of unique secondary derivatives of the primary pitch and amplitude speech parameters, and compares these parameters with pre-obtained reference speech data, indicative of various behavioral, psychological and speech styles characteristics. The method includes the formation of the classification speech parameters reference database, as well as the analysis of the speaker's speech utterances in order to determine the current behavioral, psychological and speech styles characteristics of the speaker in the given situation.Type: GrantFiled: May 7, 2012Date of Patent: March 25, 2014Assignee: Voicesense Ltd.Inventors: Yoav Degani, Yishai Zamir
-
Patent number: 8682662Abstract: In accordance with an example embodiment of the invention, there is provided an apparatus for detecting voice activity in an audio signal. The apparatus comprises a first voice activity detector for making a first voice activity detection decision based at least in part on the voice activity of a first audio signal received from a first microphone. The apparatus also comprises a second voice activity detector for making a second voice activity detection decision based at least in part on an estimate of a direction of the first audio signal and an estimate of a direction of a second audio signal received from a second microphone. The apparatus further comprises a classifier for making a third voice activity detection decision based at least in part on the first and second voice activity detection decisions.Type: GrantFiled: August 13, 2012Date of Patent: March 25, 2014Assignee: Nokia CorporationInventors: Riitta Elina Niemistö, Päivi Marianna Valve
-
Patent number: 8676572Abstract: A computer-implemented system and method for enhancing audio to individuals participating in a conversation is provided. Audio data for individuals participating in one or more conversations is analyzed. Possible conversational configurations of the individuals are generated based on the audio data, and each possible conversational configuration includes one or more subconfigurations of at least two of the individuals. A probability weight is assigned to each of the subconfigurations and includes a likelihood that the individuals of that subconfiguration are participating in one of the conversations. A probability of each possible conversational configuration is determined by combining the probability weights for the subconfigurations of that possible conversational configuration. The possible conversational configuration with the highest probability is selected as a most probable configuration. The individuals participating in the conversations are determined based on the most probable configuration.Type: GrantFiled: March 14, 2013Date of Patent: March 18, 2014Assignee: Palo Alto Research Center IncorporatedInventors: Paul M. Aoki, Margaret H. Szymanski, James D. Thornton, Daniel H. Wilson, Allison G. Woodruff
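The selection step in this abstract (combine subconfiguration weights, then pick the most probable configuration) could look roughly like the sketch below. The abstract does not say how the weights are combined; multiplying them is an assumption, and `most_probable_configuration` and the dictionary layout are hypothetical.

```python
import math


def most_probable_configuration(configurations):
    """Pick the configuration whose combined subconfiguration weight is highest.

    `configurations` maps a configuration label to the list of probability
    weights of its subconfigurations. Combining by product is an assumption.
    """
    scores = {name: math.prod(weights) for name, weights in configurations.items()}
    return max(scores, key=scores.get)
```

For example, a configuration with subconfiguration weights 0.9 and 0.5 (combined 0.45) would be preferred over one with weights 0.5 and 0.5 (combined 0.25).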
-
Patent number: 8645133Abstract: Encoding audio signals with selecting an encoding mode for encoding the signal categorizing the signal into active segments having voice activity and non-active segments having substantially no voice activity by using categorization parameters depending on the selected encoding mode and encoding at least the active segments using the selected encoding mode.Type: GrantFiled: February 7, 2013Date of Patent: February 4, 2014Assignee: Core Wireless Licensing S.a.r.l.Inventors: Kari Järvinen, Pasi Ojala, Ari Lakaniemi
-
Patent number: 8626498Abstract: A voice activity detection (VAD) system includes a first voice activity detector, a second voice activity detector and control logic. The first voice activity detector is included in a device and produces a first VAD signal. The second voice activity detector is located externally to the device and produces a second VAD signal. The control logic combines the first and second VAD signals into a VAD output signal. Voice activity may be detected based on the VAD output signal. The second VAD signal can be represented as a flag included in a packet containing digitized audio. The packet can be transmitted to the device from the externally located VAD over a wireless link.Type: GrantFiled: February 24, 2010Date of Patent: January 7, 2014Assignee: QUALCOMM IncorporatedInventor: Te-Won Lee
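As a rough illustration of the control logic in this abstract, the sketch below merges a device-side VAD decision with a flag carried in an incoming audio packet. The abstract does not specify the combining rule; the OR/AND modes, the `AudioPacket` type, and `combine_vad` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AudioPacket:
    """Hypothetical packet: digitized audio plus the external detector's VAD flag."""
    samples: bytes
    vad_flag: bool


def combine_vad(internal_vad: bool, packet: AudioPacket, mode: str = "or") -> bool:
    """Combine the first (internal) and second (external) VAD signals."""
    if mode == "or":   # report voice if either detector fires
        return internal_vad or packet.vad_flag
    if mode == "and":  # report voice only if both detectors agree
        return internal_vad and packet.vad_flag
    raise ValueError(f"unknown mode: {mode}")
```

An OR rule favors not clipping speech, while an AND rule favors rejecting noise; which one suits a given device is a design choice this sketch leaves open.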
-
Patent number: 8626495Abstract: The invention relates to a method of identifying and correcting errors in a noisy binary mask. An object of the present invention is to provide a scheme for improving a binary mask representing speech. The problem is solved in that the method comprises a) providing a noisy binary mask comprising a binary representation of the power density of an acoustic signal comprising a target signal and a noise signal at a predefined number of discrete frequencies and a number of discrete time instances; b) providing a statistical model of a clean binary mask representing the power density of the target signal; and c) using the statistical model to detect and correct errors in the noisy binary mask. This has the advantage of providing an alternative and relatively simple way of improving an estimate of a binary mask representing a speech signal. The invention may e.g. be used for speech processing, e.g. in a hearing instrument.Type: GrantFiled: August 4, 2010Date of Patent: January 7, 2014Assignee: Oticon A/SInventors: Jesper Bünsow Boldt, Ulrik Kjems, Michael Syskind Pedersen, Mads Graesbøll Christensen, Søren Holdt Jensen
-
Publication number: 20130325456Abstract: A speech speed conversion factor determining device has a physical index calculation unit including a sound/silence judgment unit that distinguishes between sound and silent intervals of an input signal, a fundamental frequency calculation unit that calculates a fundamental frequency of the signal in the sound intervals and determines stable and unstable intervals, a frequency smoothing unit that smoothes the fundamental frequency in the stable intervals, a pseudo fundamental frequency calculation unit that calculates, for the intervals, a pseudo fundamental frequency by interpolation, and a fundamental frequency general shape connection unit that connects the smoothed and pseudo frequencies to obtain sampled values of a general shape of the frequency, such that the sampled values are output as an index, based on which conversion factors are calculated.Type: ApplicationFiled: January 27, 2012Publication date: December 5, 2013Applicant: NIPPON HOSO KYOKAIInventors: Tohru Takagi, Atsushi Imai, Nobumasa Seiyama, Reiko Saitou
-
Patent number: 8595018Abstract: The invention relates to a technique of operating a call control node controlling at least one section of a call path. The call path includes between two opposite edge nodes a multi-section harmonization path along which codec selection is to be harmonized. A method embodiment of the technique, wherein the call control node is a transfer node in the harmonization path between the edge nodes, comprises the steps of determining if the call control node is a transfer node of the harmonization path; determining if a codec used for the at least one section controlled by the call control node fulfills a predefined harmonization criterion; and providing, in case the used codec does not fulfill the harmonization criterion, a harmonization trigger indication to at least one of the edge nodes of the harmonization path for initiating harmonization.Type: GrantFiled: January 18, 2007Date of Patent: November 26, 2013Assignee: Telefonaktiebolaget L M Ericsson (publ)Inventors: Dirk Kampmann, Andreas Witzel, Karl Hellwig
-
Patent number: 8583428Abstract: Described is a multiple phase process/system that combines spatial filtering with regularization to separate sound from different sources such as the speech of two different speakers. In a first phase, frequency domain signals corresponding to the sensed sounds are processed into separated spatially filtered signals including by inputting the signals into a plurality of beamformers (which may include nullformers) followed by nonlinear spatial filters. In a regularization phase, the separated spatially filtered signals are input into an independent component analysis mechanism that is configured with multi-tap filters, followed by secondary nonlinear spatial filters. Separated audio signals are then provided via an inverse-transform.Type: GrantFiled: June 15, 2010Date of Patent: November 12, 2013Assignee: Microsoft CorporationInventors: Ivan Tashev, Lae-Hoon Kim, Alejandro Acero, Jason Scott Flaks
-
Patent number: 8577675Abstract: In one aspect thereof the invention provides a method for noise suppression of a speech signal that includes, for a speech signal having a frequency domain representation dividable into a plurality of frequency bins, determining a value of a scaling gain for at least some of said frequency bins and calculating smoothed scaling gain values. Calculating smoothed scaling gain values includes, for the at least some of the frequency bins, combining a currently determined value of the scaling gain and a previously determined value of the smoothed scaling gain. In another aspect a method partitions the plurality of frequency bins into a first set of contiguous frequency bins and a second set of contiguous frequency bins having a boundary frequency there between, where the boundary frequency differentiates between noise suppression techniques, and changes a value of the boundary frequency as a function of the spectral content of the speech signal.Type: GrantFiled: December 22, 2004Date of Patent: November 5, 2013Assignee: Nokia CorporationInventor: Milan Jelinek
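The gain smoothing step in this abstract, which combines the current scaling gain with the previous smoothed gain per frequency bin, can be sketched as a first-order recursion. The fixed smoothing factor `alpha` and the function name `smooth_gains` are assumptions; the abstract does not state how the two values are weighted.

```python
def smooth_gains(current_gains, previous_smoothed, alpha=0.9):
    """Per-bin recursion: smoothed = alpha * previous_smoothed + (1 - alpha) * current.

    `alpha` is a hypothetical fixed smoothing factor in [0, 1]; larger values
    make the smoothed gain follow the current gain more slowly.
    """
    if len(current_gains) != len(previous_smoothed):
        raise ValueError("gain vectors must cover the same frequency bins")
    return [
        alpha * prev + (1.0 - alpha) * cur
        for cur, prev in zip(current_gains, previous_smoothed)
    ]
```

Calling the function once per frame, and feeding its output back in as `previous_smoothed`, limits frame-to-frame gain fluctuation in each bin.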
-
Patent number: 8577674Abstract: A Voice Activity Detection/Silence Suppression (VAD/SS) system is connected to a channel of a transmission pipe. The channel provides a pathway for the transmission of energy. A method for operating a VAD/SS system includes detecting the energy on the channel, and activating or suppressing activation of the VAD/SS system depending upon the nature of the energy detected on the channel.Type: GrantFiled: December 12, 2012Date of Patent: November 5, 2013Assignee: AT&T Intellectual Property II, L.P.Inventors: Bing Chen, James H. James
-
Publication number: 20130282367Abstract: This application relates to a voice activity detection (VAD) apparatus configured to provide a voice activity detection decision for an input audio signal. The VAD apparatus includes a state detector and a voice activity calculator. The state detector is configured to determine, based on the input audio signal, a current working state of the VAD apparatus among at least two different working states. Each of the at least two different working states is associated with a corresponding working state parameter decision set which includes at least one voice activity decision parameter. The voice activity calculator is configured to calculate a voice activity detection parameter value for the at least one voice activity decision parameter of the working state parameter decision set associated with the current working state, and to provide the voice activity detection decision by comparing the calculated voice activity detection parameter value with a threshold.Type: ApplicationFiled: June 24, 2013Publication date: October 24, 2013Inventor: Zhe Wang
-
Patent number: 8566107Abstract: Disclosed is a method of processing a signal, which includes receiving at least one of a first signal and a second signal, receiving mode information, and decoding the at least one of the first signal and the second signal using at least one of a first coding scheme and a second coding scheme according to the mode information. The mode information is information for indicating that a prescribed mode corresponds to one of at least three modes. The method includes detecting when a restricted mode change occurs and changing at least one mode when detecting a restricted mode change.Type: GrantFiled: October 15, 2008Date of Patent: October 22, 2013Assignees: LG Electronics Inc., Intellectual Discovery Co., Ltd.Inventors: Hyen-O Oh, Hong Goo Kang, Chang Heon Lee, Sang Wook Shin, Yang Won Jung
-
Patent number: 8560301Abstract: A language expression apparatus and a method based on a context and an intent awareness are provided. The apparatus and method may recognize a context and an intent of a user and may generate a language expression based on the recognized context and the recognized intent, thereby providing an interpretation/translation service and/or providing an education service for learning a language.Type: GrantFiled: March 2, 2010Date of Patent: October 15, 2013Assignee: Samsung Electronics Co., Ltd.Inventor: Yeo Jin Kim
-
Publication number: 20130268265Abstract: The present invention relates to a method for processing an audio signal, and the method comprises the steps of: receiving an audio signal; determining a coding mode corresponding to a current frame, by receiving network information for indicating the coding mode; encoding the current frame of said audio signal according to said coding mode; and transmitting said encoded current frame, wherein said coding mode is determined by the combination of a bandwidth and bitrate, and said bandwidth includes two or more bands among narrowband, wideband, and super wideband.Type: ApplicationFiled: July 1, 2011Publication date: October 10, 2013Inventors: Gyuhyeok Jeong, Hyejeong Jeon, Lagyoung Kim, Byungsuk Lee, Ingyu Kang
-
Patent number: 8554564Abstract: A rule-based end-pointer isolates spoken utterances contained within an audio stream from background noise and non-speech transients. The rule-based end-pointer includes a plurality of rules to determine the beginning and/or end of a spoken utterance based on various speech characteristics. The rules may analyze an audio stream or a portion of an audio stream based upon an event, a combination of events, the duration of an event, or a duration relative to an event. The rules may be manually or dynamically customized depending upon factors that may include characteristics of the audio stream itself, an expected response contained within the audio stream, or environmental conditions.Type: GrantFiled: April 25, 2012Date of Patent: October 8, 2013Assignee: QNX Software Systems LimitedInventors: Phil Hetherington, Alex Escott
-
Patent number: 8554560Abstract: Discrimination between two classes comprises receiving a set of frames including an input signal and determining at least two different feature vectors for each of the frames. Discrimination between two classes further comprises classifying the two different feature vectors using sets of preclassifiers trained for at least two classes of events and from that classification, and determining values for at least one weighting factor. Discrimination between two classes still further comprises calculating a combined feature vector for each of the received frames by applying the weighting factor to the feature vectors and classifying the combined feature vector for each of the frames by using a set of classifiers trained for at least two classes of events.Type: GrantFiled: September 4, 2012Date of Patent: October 8, 2013Assignee: International Business Machines CorporationInventor: Zica Valsan
-
Patent number: 8548173Abstract: A sound volume correcting device includes: a variable gain unit controlling a gain of an input audio signal on the basis of a gain control signal; a voice average level detector detecting an average level of a human voice signal in the input audio signal; and a gain control signal generator generating the gain control signal for controlling the gain of the input audio signal using the average level of the human voice signal detected by the voice average level detector as a reference level and supplying the generated gain control signal to the variable gain unit.Type: GrantFiled: December 2, 2009Date of Patent: October 1, 2013Assignee: Sony CorporationInventor: Masayoshi Noguchi
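The three units in this abstract (voice average level detector, gain control signal generator, variable gain unit) can be illustrated with the minimal sketch below. It assumes that frames are already labeled as voice or non-voice and that the gain pulls the voice level toward a fixed target; `target_level`, `max_gain`, and all function names are hypothetical.

```python
def voice_average_level(frames, is_voice):
    """Mean absolute amplitude over frames flagged as voice; None if there are none."""
    levels = [
        sum(abs(x) for x in frame) / len(frame)
        for frame, voiced in zip(frames, is_voice)
        if voiced and frame
    ]
    return sum(levels) / len(levels) if levels else None


def gain_control(voice_level, target_level=0.25, max_gain=4.0):
    """Gain that moves the detected voice level toward `target_level` (assumed rule)."""
    if not voice_level:
        return 1.0  # no voice detected: leave the signal unchanged
    return min(max_gain, target_level / voice_level)


def apply_gain(frame, gain):
    """Variable gain unit: scale every sample of the frame."""
    return [x * gain for x in frame]
```

Because only voice frames contribute to the reference level, loud music or effects in this sketch do not pull the gain down the way a whole-signal average would.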
-
Patent number: 8542983Abstract: A method of generating a summary of an audio/visual data stream is provided, the data stream comprising a plurality of consecutive frames having audio and visual properties. A plurality of shots of an audio/visual data stream are detected (step 204). A plurality of segments of the audio/visual data stream are determined (step 206), each segment comprising a plurality of the shots of the data stream having similar visual properties. A segment of the determined plurality of segments is selected (step 208). For each shot of said selected segment of said data stream, the audio in a plurality of consecutive frames which occur after the end of said shot is extracted (step 210). At least one of the shots is selected based on the extracted audio (step 212). A summary is generated to include the selected at least one of the shots (step 214).Type: GrantFiled: June 2, 2009Date of Patent: September 24, 2013Assignee: Koninklijke Philips N.V.Inventors: Milan Pastrnak, Pedro Fonseca
-
Patent number: 8489406Abstract: A stereo encoding method and apparatus are provided, so as to reduce distortion caused by delay adjustment. The stereo encoding method includes: extracting a current interchannel delay of a stereo signal and a previous delay adjacent to the current interchannel delay; performing adjustment frame judgment according to characteristics of the current stereo signal when the current delay and the previous delay are different; and performing delay adjustment on the stereo signal by using the current interchannel delay if it is judged that a frame where the current delay occurs is an adjustment frame.Type: GrantFiled: August 12, 2011Date of Patent: July 16, 2013Assignee: Huawei Technologies Co., Ltd.Inventors: Wenhai Wu, Yue Lang, Lei Miao, Zexin Liu, Chen Hu, Qing Zhang
-
Voice analysis device, voice analysis method, voice analysis program, and system integration circuit
Patent number: 8478587Abstract: A sound analysis device comprises: a sound parameter calculation unit operable to acquire an audio signal and calculate a sound parameter for each of partial audio signals, the partial audio signals each being the acquired audio signal in a unit of time; a category determination unit operable to determine, from among a plurality of environmental sound categories, which environmental sound category each of the partial audio signals belongs to, based on a corresponding one of the calculated sound parameters; a section setting unit operable to sequentially set judgment target sections on a time axis as time elapses, each of the judgment target sections including two or more of the units of time, the two or more of the units of time being consecutive; and an environment judgment unit operable to judge, based on a number of partial audio signals in each environmental sound category determined in at least a most recent judgment target section, an environment that surrounds the sound analysis device in at least theType: GrantFiled: March 13, 2008Date of Patent: July 2, 2013Assignee: Panasonic CorporationInventors: Takashi Kawamura, Ryouichi Kawanishi
-
Patent number: 8463600Abstract: A system and method for automatically adjusting floor controls based on conversational characteristics is provided. Audio streams are received, which each originate from an audio source. Floor controls for a current configuration including at least a portion of the audio streams are maintained. Conversational characteristics shared by two or more of the audio sources are determined. Possible configurations for the audio streams are identified based on the conversational characteristics. An analysis of the current configuration and the possible configurations is performed. A change threshold comprising a minimum number of timeslices for at least one of the current configuration and one of the possible configurations is applied to the analysis. When the analysis satisfies the change threshold, the floor controls are automatically adjusted. The audio streams are mixed into one or more outputs based on the adjusted floor controls.Type: GrantFiled: February 27, 2012Date of Patent: June 11, 2013Assignee: Palo Alto Research Center IncorporatedInventors: Paul Masami Aoki, Margaret H. Szymanski, James D. Thornton, Daniel H. Wilson, Allison Gyle Woodruff
-
Publication number: 20130144614Abstract: An apparatus for extending the bandwidth of an audio signal, the apparatus being configured to: generate an excitation signal from an audio signal, wherein the audio signal comprises a plurality of frequency components; extract a feature vector from the audio signal, wherein the feature vector comprises at least one frequency domain component feature and at least one time domain component feature; determine at least one spectral shape parameter from the feature vector, wherein the at least one spectral shape parameter corresponds to a sub band signal comprising frequency components which belong to a further plurality of frequency components; and generate the sub band signal by filtering the excitation signal through a filter bank and weighting the filtered excitation signal with the at least one spectral shape parameter.Type: ApplicationFiled: May 25, 2010Publication date: June 6, 2013Applicant: NOKIA CORPORATIONInventors: Ville Mikael Myllyla, Laura Laaksonen, Hannu Juhani Pulakka, Paavo Ilmari Alku
-
Patent number: 8452591Abstract: A device comprising an audio information processor to receive at least one audio stream encoded according to a first protocol by a remote network processing device, the audio stream having associated comfort noise information to indicate a level of background noise available for presentation during silence periods associated with the audio stream, the audio information processor to decode the received audio stream according to the first protocol and to encode the decoded audio stream according to a second protocol, and a background noise translator to convert the comfort noise information received with the audio stream into a format compatible with the second protocol.Type: GrantFiled: April 11, 2008Date of Patent: May 28, 2013Assignee: Cisco Technology, Inc.Inventors: Herbert Wildfeuer, Robert Simon
-
Patent number: 8438016Abstract: A client for silence-based adaptive real-time voice and video (SAVV) transmission methods and systems, detects the activity of a voice stream of conversational speech and aggressively transmits the corresponding video frames if silence in the sending or receiving voice stream has been detected, and adaptively generates and transmits key frames of the video stream according to characteristics of the conversational speech. In one aspect, a coordination management module generates video frames, segmentation and transmission strategies according to feedback from a voice encoder of the SAVV client and the user's instructions. In another aspect, the coordination management module generates video frames, segmentation and transmission strategies according to feedback from a voice decoder of the SAVV client and the user's instructions. In one example, the coordination management module adaptively generates a key video frame when silence is detected in the receiving voice stream.Type: GrantFiled: April 10, 2008Date of Patent: May 7, 2013Assignee: City University of Hong KongInventors: Weijia Jia, Lizhuo Zhang, Huan Li, Wenyan Lu
-
Patent number: 8433582Abstract: A method (100) includes receiving (101) an input digital audio signal comprising a narrow-band signal. The input digital audio signal is processed (102) to generate a processed digital audio signal. A high-band energy level corresponding to the input digital audio signal is estimated (103) based on a transition-band of the processed digital audio signal within a predetermined upper frequency range of a narrow-band bandwidth. A high-band digital audio signal is generated (104) based on the high-band energy level and an estimated high-band spectrum corresponding to the high-band energy level.Type: GrantFiled: February 1, 2008Date of Patent: April 30, 2013Assignee: Motorola Mobility LLCInventors: Tenkasi V. Ramabadran, Mark A. Jasiuk
-
Publication number: 20130103395Abstract: A Voice Activity Detection/Silence Suppression (VAD/SS) system is connected to a channel of a transmission pipe. The channel provides a pathway for the transmission of energy. A method for operating a VAD/SS system includes detecting the energy on the channel, and activating or suppressing activation of the VAD/SS system depending upon the nature of the energy detected on the channel.Type: ApplicationFiled: December 12, 2012Publication date: April 25, 2013Applicant: AT&T INTELLECTUAL PROPERTY II, L.P.Inventor: AT&T INTELLECTUAL PROPERTY II, L.P.
-
Patent number: 8428953Abstract: An audio decoding device of the present invention includes: a decoding unit decoding a stream to a spectrum coefficient, and outputting stream information when a frame included in the stream cannot be decoded; an orthogonal transformation unit transforming the spectrum coefficient to a time signal; a correction unit generating a correction time signal based on an output waveform within a reference section that is in a section that overlaps between an error frame section to which the stream information is outputted and an adjacent frame section and that is a section in the middle of the adjacent frame section, when the decoding unit outputs the stream information; and an output unit generating the output waveform by synthesizing the correction time signal and the time signal.Type: GrantFiled: May 20, 2008Date of Patent: April 23, 2013Assignee: Panasonic CorporationInventors: Kojiro Ono, Takeshi Norimatsu, Yoshiaki Takagi, Takashi Katayama
-
Patent number: 8417524Abstract: Analyzing an audio interaction is provided. At least one change in an emotion of a speaker in an audio interaction and at least one aspect of the audio interaction are identified. The at least one change in an emotion is analyzed in conjunction with the at least one aspect to determine a relationship between the at least one change in an emotion and the at least one aspect, and a result of the analysis is provided.Type: GrantFiled: February 11, 2010Date of Patent: April 9, 2013Assignee: International Business Machines CorporationInventors: Om D. Deshmukh, Chitra Dorai, Shailesh Joshi, Maureen E. Rzasa, Ashish Verma, Karthik Visweswariah, Gary J. Wright, Sai Zeng
-
Publication number: 20130073281Abstract: A non-speech section detecting device generating a plurality of frames having a given time length on the basis of sound data obtained by sampling sound, and detecting a non-speech section having a frame not containing voice data based on speech uttered by a person, the device including: a calculating part calculating a bias of a spectrum obtained by converting sound data of each frame into components on a frequency axis; a judging part judging whether the bias is greater than or equal to a given threshold or alternatively smaller than or equal to a given threshold; a counting part counting the number of consecutive frames judged as having a bias greater than or equal to the threshold or alternatively smaller than or equal to the threshold; and a count judging part judging whether the obtained number of consecutive frames is greater than or equal to a given value.Type: ApplicationFiled: November 13, 2012Publication date: March 21, 2013Applicant: FUJITSU LIMITEDInventor: Fujitsu Limited
-
Patent number: 8391373Abstract: A method is provided for concealing a transmission error in a digital signal chopped into a plurality of successive frames associated with different time intervals in which, on reception, the signal may comprise erased frames and valid frames, the valid frames comprising information relating to the concealment of frame loss. The method is implemented during a hierarchical decoding using a core decoding and a transform-based decoding using windows introducing a time delay of less than a frame with respect to the core decoding. The method includes concealing a first set of missing samples for the erased frame, implemented in a first time interval; a step of concealing a second set of missing samples utilizing information of said valid frame and implemented in a second time interval; and a step of transition between the first and the second set of missing samples to obtain at least part of the missing frame.Type: GrantFiled: March 20, 2009Date of Patent: March 5, 2013Assignee: France TelecomInventors: David Virette, Pierrick Philippe, Balazs Kovesi
-
Patent number: 8380500Abstract: A spectrum calculating unit calculates, for each of the frames, a spectrum by performing a frequency analysis on an acoustic signal. An estimating unit estimates a noise spectrum. An energy calculating unit calculates an energy characteristic amount. An entropy calculating unit calculates a normalized spectral entropy value. A generating unit generates a characteristic vector based on the energy characteristic amounts and the normalized spectral entropy values that have been calculated for a plurality of frames. A likelihood calculating unit calculates a speech likelihood value of a target frame that corresponds to the characteristic vector. In a case where the speech likelihood value is larger than a threshold value, a judging unit judges that the target frame is a speech frame.Type: GrantFiled: September 22, 2008Date of Patent: February 19, 2013Assignee: Kabushiki Kaisha ToshibaInventors: Koichi Yamamoto, Masami Akamine
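The normalized spectral entropy feature named in this abstract can be illustrated with a minimal sketch. The abstract does not give the formula, so the code below assumes the common definition: treat the normalized power spectrum of a frame as a probability distribution and divide its Shannon entropy by the logarithm of the number of bins. The function name and the zero-energy convention are assumptions for illustration, not taken from the patent.

```python
import math


def normalized_spectral_entropy(magnitudes):
    """Return the spectral entropy of one frame, scaled to the range [0, 1].

    `magnitudes` is a sequence of non-negative spectral magnitudes for one frame.
    """
    power = [m * m for m in magnitudes]
    total = sum(power)
    n = len(power)
    if total == 0.0 or n < 2:
        # Silent or degenerate frame: entropy is undefined, so return 0
        # by convention (an assumption of this sketch).
        return 0.0
    entropy = 0.0
    for p in power:
        if p > 0.0:
            q = p / total  # probability mass of this frequency bin
            entropy -= q * math.log(q)
    # Dividing by log(n) maps a perfectly flat spectrum to 1.0.
    return entropy / math.log(n)
```

A flat, noise-like spectrum scores close to 1, while a spectrum dominated by a single bin scores close to 0, which is why the value is useful as one component of a speech/non-speech feature vector.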
-
Patent number: 8380522Abstract: A device is disclosed for compressing data contained in input frames to be compressed constituted of stream frames defining portions of TRAU and signaling frames that have to be transmitted within a communication network and each of which is constituted of at least a header containing control data representative at least of the type of stream frame and where applicable payload data, certain types containing critical and/or non-critical data. The device analyzes each TRAU or signaling frame header contained in successively received input frames in order to determine its type and generates periodically compressed frames to be transmitted that are divided into first and second sections of variable size. The first section contains critical data compressed synchronously and the second section contains non-critical data compressed asynchronously.Type: GrantFiled: December 14, 2004Date of Patent: February 19, 2013Assignee: Alcatel LucentInventors: Emmanuelle Chevallier, Jean Farineau, Jean-Noël Lignon, Christophe Gerrier, Xavier Denis, Christelle Aime
-
Patent number: 8374852Abstract: Disclosed is a code conversion method to convert a first code sequence conforming to a first speech coding scheme into a second code sequence conforming to a second speech coding scheme. The method includes the following steps. The first step discriminates whether the first code sequence corresponds to a speech part or to a non-speech part, and generates a numerical value that indicates the discrimination result as a control flag. The second step converts the first code sequence into the second code sequence and outputs said second code sequence, when the value of the control flag corresponds to the speech part. The third step outputs the second code sequence that corresponds to the value of the control flag, when the value of the control flag corresponds to the non-speech part.Type: GrantFiled: March 16, 2006Date of Patent: February 12, 2013Assignee: NEC CorporationInventor: Atsushi Murashima
-
Patent number: 8374860Abstract: Encoding audio signals by selecting an encoding mode for encoding the signal, categorizing the signal into active segments having voice activity and non-active segments having substantially no voice activity by using categorization parameters depending on the selected encoding mode, and encoding at least the active segments using the selected encoding mode.Type: GrantFiled: September 29, 2011Date of Patent: February 12, 2013Assignee: Core Wireless Licensing S.A.R.L.Inventors: Kari Jarvinen, Pasi Ojala, Ari Lakaniemi
-
Patent number: 8370132Abstract: Apparatus and methods are provided for measuring perceptual quality of a signal transmitted over a communication network, such as a circuit-switching network, packet-switching network, or a combination thereof. In accordance with one embodiment, a distributed apparatus is provided for measuring perceptual quality of a signal transmitted over a communication network. The distributed apparatus includes communication ports located at various locations in the network. The distributed apparatus may also include a signal processor including a processor for providing non-intrusive measurement of the perceptual quality of the signal. The distributed apparatus may further include recorders operatively connected to the communication ports and to the signal processor, wherein at least one of the recorders processes the signal at one of the communication ports and the recorder sends the signal to the signal processor to measure the perceptual quality of the signal.Type: GrantFiled: November 21, 2005Date of Patent: February 5, 2013Assignee: Verizon Services Corp.Inventor: Adrian E. Conway
-
Patent number: 8346543Abstract: A VAD/SS system is connected to a channel of a transmission pipe. The channel provides a pathway for the transmission of energy. A method for operating a VAD/SS system includes detecting the energy on the channel, and activating or suppressing activation of the VAD/SS system depending upon the nature of the energy detected on the channel.Type: GrantFiled: March 17, 2011Date of Patent: January 1, 2013Assignee: AT&T Intellectual Property II, L.P.Inventors: Bing Chen, James H. James
-
Patent number: 8340960Abstract: Techniques for implementing vocoders in parallel digital signal processors are described. A preferred approach is implemented in conjunction with the BOPS® Manifold Array (ManArray™) processing architecture so that in an array of N parallel processing elements, N channels of voice communication are processed in parallel. Techniques for forcing vocoder processing of one data-frame to take the same number of cycles are described. Improved throughput and lower clock rates can be achieved.Type: GrantFiled: June 16, 2009Date of Patent: December 25, 2012Assignee: Altera CorporationInventors: Ali Soheil Sadri, Navin Jaffer, Anissim A. Silivra, Bin Huang, Matthew Plonski
-
Patent number: 8326612Abstract: A non-speech section detecting device generating a plurality of frames having a given time length on the basis of sound data obtained by sampling sound, and detecting a non-speech section having a frame not containing voice data based on speech uttered by a person, the device including: a calculating part calculating a bias of a spectrum obtained by converting sound data of each frame into components on a frequency axis; a judging part judging whether the bias is greater than or equal to a given threshold or alternatively smaller than or equal to a given threshold; a counting part counting the number of consecutive frames judged as having a bias greater than or equal to the threshold or alternatively smaller than or equal to the threshold; and a count judging part judging whether the obtained number of consecutive frames is greater than or equal to a given value.Type: GrantFiled: April 5, 2010Date of Patent: December 4, 2012Assignee: Fujitsu LimitedInventors: Nobuyuki Washio, Shoji Hayakawa
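The judging, counting, and count-judging parts described in this abstract can be sketched as a run-length check over per-frame values. The abstract does not define how the spectral bias is computed, so this sketch takes the bias values as given input and covers only the "greater than or equal to" variant. The function name and parameter names are hypothetical.

```python
def find_non_speech_sections(biases, threshold, min_frames):
    """Return (first_frame, last_frame) index pairs for each run of at least
    `min_frames` consecutive frames whose bias is >= `threshold`."""
    sections = []
    run_start = None  # index where the current qualifying run began
    for i, bias in enumerate(biases):
        if bias >= threshold:
            if run_start is None:
                run_start = i
        else:
            # The run just ended; keep it only if it was long enough.
            if run_start is not None and i - run_start >= min_frames:
                sections.append((run_start, i - 1))
            run_start = None
    # Handle a run that extends to the final frame.
    if run_start is not None and len(biases) - run_start >= min_frames:
        sections.append((run_start, len(biases) - 1))
    return sections
```

Requiring a minimum run length keeps a single outlier frame from being reported as a non-speech section.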
-
Patent number: 8315865Abstract: A conversation detector and detection method is based on voice band energy detection. The detector is formed of a signal preconditioner, a comparator and an analysis unit. The comparator generates signal pulses reduced in resolution and sample rate (e.g., single bit data) and indicative of energy level and/or duration of activity detected in subject audio signals. The analysis unit determines from the generated signal pulses whether a conversation exists in the subject audio signal. The detector is also able to adapt to environmental noise change, automatically calibrate and operate in low power consumption mode.Type: GrantFiled: May 4, 2004Date of Patent: November 20, 2012Assignee: Hewlett-Packard Development Company, L.P.Inventor: Benjamin Kuris
-
Publication number: 20120278068Abstract: A voice activity detection method and apparatus, and an electronic device are provided. The method includes: obtaining a time domain parameter and a frequency domain parameter from an audio frame; obtaining a first distance between the time domain parameter and a long-term slip mean of the time domain parameter in a history background noise frame, and obtaining a second distance between the frequency domain parameter and a long-term slip mean of the frequency domain parameter in the history background noise frame; and determining whether the audio frame is a foreground voice frame or a background noise frame according to the first distance, the second distance and a set of decision inequalities based on the first distance and the second distance. The above technical solutions enable the determination criterion to have an adaptive adjustment capability, thus improving the performance of the voice activity detection.Type: ApplicationFiled: July 11, 2012Publication date: November 1, 2012Applicant: Huawei Technologies Co., Ltd.Inventor: Zhe Wang
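The distance-based decision in this abstract can be sketched as follows. The abstract does not state the actual decision inequalities or how the long-term mean is maintained, so this sketch assumes two fixed thresholds and an exponentially weighted mean that is updated only on frames classified as background noise. The class name, parameters, and bootstrap rule are all hypothetical.

```python
class DistanceVad:
    """Classify frames by their distance from a long-term noise mean."""

    def __init__(self, time_threshold, freq_threshold, smoothing=0.95):
        self.time_threshold = time_threshold
        self.freq_threshold = freq_threshold
        self.smoothing = smoothing  # weight given to the existing mean
        self.time_mean = None
        self.freq_mean = None

    def is_voice(self, time_param, freq_param):
        if self.time_mean is None:
            # Assumption: the first frame is treated as background noise
            # and used to initialize both long-term means.
            self.time_mean, self.freq_mean = time_param, freq_param
            return False
        d_time = abs(time_param - self.time_mean)  # first distance
        d_freq = abs(freq_param - self.freq_mean)  # second distance
        # Assumed decision rule: voice if either distance exceeds its threshold.
        voice = d_time > self.time_threshold or d_freq > self.freq_threshold
        if not voice:
            # Only noise frames update the long-term means.
            a = self.smoothing
            self.time_mean = a * self.time_mean + (1 - a) * time_param
            self.freq_mean = a * self.freq_mean + (1 - a) * freq_param
        return voice
```

Because the means track the background noise over time, the effective decision boundary moves with the noise floor, which is the adaptive behavior the abstract refers to.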
-
Patent number: 8296132Abstract: The disclosure provides a method for noise generation, including: determining an initial value of a reconstructed parameter; determining a random value range based on the initial value of the reconstructed parameter; taking a value in the random value range randomly as a reconstructed noise parameter; and generating noise by using the reconstructed noise parameter. The disclosure also provides an apparatus for noise generation.Type: GrantFiled: March 26, 2010Date of Patent: October 23, 2012Assignee: Huawei Technologies Co., Ltd.Inventors: Deming Zhang, Jinliang Dai
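The four steps in this abstract can be sketched in a few lines. The abstract does not say how the random value range is derived from the initial value or which parameter is reconstructed, so this sketch assumes a symmetric range of fixed relative width around the initial value and uses the drawn value as the standard deviation of Gaussian noise. All names and defaults are hypothetical.

```python
import random


def reconstruct_noise_parameter(initial_value, relative_spread=0.1, rng=random):
    """Draw a reconstructed parameter uniformly from a range around the initial value."""
    half_width = abs(initial_value) * relative_spread
    low, high = initial_value - half_width, initial_value + half_width
    return rng.uniform(low, high)


def generate_noise(initial_value, num_samples, rng=random):
    """Generate noise samples whose level is set by the reconstructed parameter."""
    level = reconstruct_noise_parameter(initial_value, rng=rng)
    # Assumption: the parameter acts as the standard deviation of the noise.
    return [rng.gauss(0.0, abs(level)) for _ in range(num_samples)]
```

Passing a seeded `random.Random` instance as `rng` makes the output repeatable, for example `generate_noise(0.5, 160, rng=random.Random(0))`.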
-
Patent number: 8280731Abstract: A speech enhancement method operative for devices having limited available memory is described. The method is appropriate for very noisy environments and is capable of estimating the relative strengths of speech and noise components during both the presence as well as the absence of speech.Type: GrantFiled: March 14, 2008Date of Patent: October 2, 2012Assignee: Dolby Laboratories Licensing CorporationInventor: Rongshan Yu