Patents Examined by V. Paul Harper
-
Patent number: 7171365
Abstract: In general, the present invention converts speech, preferably recorded on a portable recorder, to text, analyzes the text, and determines voice commands and times when the voice commands occurred. Task names are associated with voice commands and time segments. These time segments and tasks may be packaged as time increments and stored (e.g., in a file or database) for further processing. Preferably, phrase grammar rules are used when analyzing the text, as this helps to determine voice commands. Using phrase grammar rules also allows the text to contain a variety of topics, only some of which are pertinent to tracking time.
Type: Grant
Filed: February 16, 2001
Date of Patent: January 30, 2007
Assignee: International Business Machines Corporation
Inventors: James William Cooper, Donna Karen Byron
-
Patent number: 7167823
Abstract: Paired image information and text information correlated to each other are retrieved as information sets. Frequency information on words used in text is extracted from text information in a group of information sets, and text information features are extracted based on frequency information. Text features are used to lay out information sets in a virtual space such that similar pieces of text are located close to each other, and images are displayed at those positions. Further, important words are extracted from those words extracted from text information in a group of information sets, and those words are laid out in the virtual space in the same manner as with information sets and displayed as labels.
Type: Grant
Filed: November 27, 2002
Date of Patent: January 23, 2007
Assignee: Fujitsu Limited
Inventors: Susumu Endo, Yuusuke Uehara, Daiki Masumoto, Syuuichi Shiitani
-
Patent number: 7162418
Abstract: A buffering process for real-time digital audio is provided to reduce the effect of network “jitter” from inconsistent network packet delivery rates. The buffering algorithm is particularly useful for audio data including distinct bursts separated by silence, such as speech. The process holds incoming audio packets in a queue until either: (a) the buffer contents meet a predetermined threshold; or (b) the end packet of a burst is received. The result is that silent periods between bursts may expand or decrease relative to the original audio pattern, allowing cumulative jitter to be played out as silence. The threshold is sized such that the deviation in silence is unnoticeable by a listener. In an optional embodiment, the process periodically adjusts the threshold to adapt to network conditions.
Type: Grant
Filed: November 15, 2001
Date of Patent: January 9, 2007
Assignee: Microsoft Corporation
Inventors: Ivan J. Leichtling, Ido Ben-Shachar
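The hold-and-release rule in the abstract above can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Packet` and `BurstBuffer` names, the packet-count threshold, and the `end_of_burst` flag are all assumptions made for illustration, and the optional periodic threshold adjustment is left out.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Packet:
    payload: bytes
    end_of_burst: bool = False  # hypothetical marker for the last packet of a burst


@dataclass
class BurstBuffer:
    threshold: int  # assumed unit: number of queued packets
    _queue: deque = field(default_factory=deque, init=False)
    _playing: bool = field(default=False, init=False)

    def push(self, packet: Packet) -> list:
        """Queue a packet and return the packets released for playback."""
        self._queue.append(packet)
        # Start playback once (a) enough packets are buffered, or
        # (b) the burst has ended before the threshold was reached.
        if not self._playing and (
            len(self._queue) >= self.threshold or packet.end_of_burst
        ):
            self._playing = True
        if not self._playing:
            return []
        released = list(self._queue)
        self._queue.clear()
        # After a burst ends, the next burst is buffered afresh, so
        # accumulated jitter is absorbed by the silence between bursts.
        if packet.end_of_burst:
            self._playing = False
        return released
```

With `threshold=3`, the first two packets of a burst are held, the third releases all three, and later packets of the same burst pass straight through until the end-of-burst packet resets the buffer.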
-
Patent number: 7158933
Abstract: The present invention is generally directed to a system and method for enhancing speech using a multi-channel noise filtering process that is based on psychoacoustic masking effects. A speech enhancement/noise reduction scheme according to the present invention is designed to satisfy the psychoacoustic masking principle and to minimize the signal total distortion by exploiting multiple microphone signals to enhance the useful speech signal at reduced level of artifacts.
Type: Grant
Filed: May 10, 2002
Date of Patent: January 2, 2007
Assignee: Siemens Corporate Research, Inc.
Inventors: Radu Victor Balan, Justinian Rosca
-
Patent number: 7155387
Abstract: A method for reducing noise in a voice signal, and a voice operated system utilizing the same are presented. A noise component in a compressed digital signal representative of the voice signal is determined, and subtracted from the compressed digital signal.
Type: Grant
Filed: January 8, 2001
Date of Patent: December 26, 2006
Assignee: Art - Advanced Recognition Technologies Ltd.
Inventor: Amir Globerson
-
Patent number: 7146312
Abstract: There is disclosed in a packet switched network a method of encoding speech packets into blocks, each speech packet including a speech header and a payload comprised of a speech frame, wherein at least two speech frames are encoded into a single block. Each speech frame may be associated with a different user. An EDGE network utilizes such a method.
Type: Grant
Filed: March 28, 2000
Date of Patent: December 5, 2006
Assignee: Lucent Technologies Inc.
Inventors: Christian Demetrescu, Konstantinos Samaras, Jian Jun Wu
-
Patent number: 7146314
Abstract: Data handling dynamically responds to changing noise power conditions to separate valid data from noise. A reference power level acts as a threshold between dynamically assumed noise and valid data, and dynamically refers to the reference power level changing adaptively with the background noise. The introduction of dynamic noise control in VOX (Voice Activated Transmission) improves a VOX device operation in a noisy environment, even when the background noise profiles are changing. Processing is on a frame by frame basis for successive frames. The threshold is adaptively changed when a comparison of frame signal power to the threshold indicates speech or the absence of speech in the compared frame repeatedly and continuously for a period of time involving plural successive frames having no valid speech or noise above the threshold to correspondingly reduce or increase the threshold by changing the threshold to a value that is a function of the input signal power.
Type: Grant
Filed: December 20, 2001
Date of Patent: December 5, 2006
Assignee: Renesas Technology Corporation
Inventor: Yunbiao Wang
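As a rough illustration of the frame-by-frame threshold adaptation in the abstract above, the following Python sketch resets the threshold after a run of successive frames that all fall on the same side of it. The `AdaptiveVox` class, the `hold_frames` run length, and the `margin` multiplier are assumptions for illustration only. The abstract says just that the new threshold is "a function of the input signal power", so a scaled mean power is used here as one possible choice.

```python
def frame_power(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)


class AdaptiveVox:
    def __init__(self, threshold, hold_frames=5, margin=2.0):
        self.threshold = threshold      # current reference power level
        self.hold_frames = hold_frames  # assumed run length that triggers adaptation
        self.margin = margin            # assumed scale applied to the mean run power
        self._last = None               # decision for the previous frame
        self._run = 0                   # length of the current run of equal decisions
        self._power_sum = 0.0           # total frame power over the current run

    def process(self, frame):
        """Classify one frame as above (True) or below (False) the threshold."""
        power = frame_power(frame)
        above = power > self.threshold
        if above == self._last:
            self._run += 1
            self._power_sum += power
        else:
            self._last = above
            self._run = 1
            self._power_sum = power
        # A sustained run on one side of the threshold is treated as a change
        # in background noise, and the threshold follows the observed power.
        if self._run >= self.hold_frames:
            self.threshold = self.margin * self._power_sum / self._run
            self._last = None
            self._run = 0
            self._power_sum = 0.0
        return above
```

In this simplified form a long stretch of genuine speech would also raise the threshold, so `hold_frames` would need to be chosen longer than typical utterances. The abstract does not state how the period is chosen.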
-
Patent number: 7133828
Abstract: The present invention provides an audio analysis intelligence tool that provides ad-hoc search capabilities using spoken words as an organized data form. The present invention provides an SQL like interface to process and search audio data and combine it with other traditional data forms.
Type: Grant
Filed: October 20, 2003
Date of Patent: November 7, 2006
Assignee: SER Solutions, Inc.
Inventors: Robert Scarano, Lawrence Mark
-
Patent number: 7133832
Abstract: A lossless encoding apparatus encodes audio data and a lossless decoding apparatus restores the losslessly compression encoded audio data on a real-time basis, and a method therefor. The lossless encoding apparatus includes a lossless compression unit which losslessly compression encodes the audio data stored in an input buffer in units of predetermined data and outputs the encoded data in sequence, and an output buffer which stores the encoded audio data output from the lossless compression unit.
Type: Grant
Filed: October 28, 2003
Date of Patent: November 7, 2006
Assignee: Samsung Electronics Co., Ltd.
Inventor: Jae-Hoon Heo
-
Patent number: 7130797
Abstract: A method of locating a talker in a reverberant environment comprises receiving multiple audio signals from a microphone array that include direct path audio signal and reverberation signal components. The direct path audio signal components of the multiple audio signals are detected and are used to weight the multiple audio signals. A position estimate based on the weighted audio signals is then calculated. Periods of speech activity are detected and a final position estimate is generated during the periods of speech activity.
Type: Grant
Filed: August 15, 2002
Date of Patent: October 31, 2006
Assignee: Mitel Networks Corporation
Inventors: Franck Beaucoup, Michael Tetelbaum
-
Patent number: 7127402
Abstract: A conversation manager processes a spoken utterance from a user of a computer that is directed to an application program hosted on the computer. The conversation manager includes a reasoning facility which accesses goal-directed rules stored in a rules base (e.g., database). The reasoning facility also has access to a conversational record that includes a record of previous utterances and a semantic analysis for each utterance. The reasoning facility processes a representation of the utterance by using the goal-directed rules. The reasoning facility uses means-end analysis to determine the proper rules to execute, and thus the script calls to make to achieve the goal of processing the utterance. While processing the utterance, the reasoning facility attempts to resolve any ambiguities in the representation of the utterance and to fill in any missing information that is needed to achieve its goal.
Type: Grant
Filed: January 10, 2002
Date of Patent: October 24, 2006
Assignee: International Business Machines Corporation
Inventors: Steven I. Ross, Elizabeth A. Brownholtz, Jeffrey G. MacAllister
-
Patent number: 7120579
Abstract: A method and device for boosting an input signal to overcome noise. Both the input signal, S(t), and an estimate of the noise, N(t), are bandpassed in adjacent pass bands to produce signal and noise subbands. Preferably, the input signal is delayed before being bandpassed. The power envelopes of the signal subbands are converted to signal masking functions, (70), that incorporate the phenomena of forward and backward masking. Signal masking functions whose amplitudes are below the amplitudes of their frequency neighbors are nulled. Similarly, noise subbands whose powers are below the powers of neighboring noise subbands are nulled. The surviving signal masking functions are compared to the corresponding surviving noise power envelopes to determine the degree to which the surviving signal subbands must be amplified, (78), to overcome the noise. The surviving signal subbands are so amplified and summed to provide the output signal, S′(t).
Type: Grant
Filed: July 27, 2000
Date of Patent: October 10, 2006
Assignee: Clear Audio Ltd.
Inventor: Zvi (Tsvika) Licht
-
Patent number: 7120577
Abstract: A system and terminal for facilitating a “virtual presence” allows users on a communication network to simply begin speaking through other users. A system immediately detects the destination party's name, and begins routing the audio signal to a particular destination without any noticeable call set-up. Additionally, the system performs pitch corrected speed control in order to allow the detection and processing of a speech pattern without causing delay to an end user.
Type: Grant
Filed: January 9, 2003
Date of Patent: October 10, 2006
Assignee: Intel Corporation
Inventor: Howard Bubb
-
Patent number: 7117158
Abstract: An IVR system may be designed by accepting designer inputs to generate, on a display screen, a flowchart of interconnected flowchart processing blocks and flowchart decision blocks that represent a process flow of processing steps and branches, respectively, in the IVR system. By allowing the designer to generate a flowchart of interconnected flowchart processing blocks and flowchart decision blocks on a single display screen, a potentially simplified graphical user interface may be provided for designing an IVR system. The flowchart of interconnected flowchart processing blocks and flowchart decision blocks may be executed, based on at least one designer input on a keypad image, to simulate or test the IVR system. A self-documenting audit trail may be provided during the design of the IVR system. These audit trails may be associated with a version of the IVR system, so that multiple versions of the system may be managed.
Type: Grant
Filed: April 25, 2002
Date of Patent: October 3, 2006
Assignee: Bilcare, Inc.
Inventors: Phyllis Marie Dyer Weldon, Edwin Bruce Shankle, III, James Arthur Klein, Jr.
-
Patent number: 7117156
Abstract: The invention concerns a method and apparatus for performing packet loss or Frame Erasure Concealment (FEC) for a speech coder that does not have a built-in or standard FEC process. A receiver with a decoder receives encoded frames of compressed speech information transmitted from an encoder. A lost frame detector at the receiver determines if an encoded frame has been lost or corrupted in transmission, or erased. If the encoded frame is not erased, the encoded frame is decoded by a decoder and a temporary memory is updated with the decoder's output. A predetermined delay period is applied and the audio frame is then output. If the lost frame detector determines that the encoded frame is erased, a FEC module applies a frame concealment process to the signal. The FEC processing produces natural sounding synthetic speech for the erased frames.
Type: Grant
Filed: April 19, 2000
Date of Patent: October 3, 2006
Assignee: AT&T Corp.
Inventor: David A. Kapilow
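The receiver flow in the abstract above (decode and remember good frames, apply a fixed delay, conceal erased frames) can be sketched in Python as follows. This is only a sketch under stated assumptions. The `receive` and `conceal` names, the use of `None` to mark an erased frame, and the `delay_frames` and `attenuation` parameters are hypothetical. The concealment step is a deliberately simple stand-in that repeats the last frame at reduced level, not the patent's synthesis method.

```python
from collections import deque


def conceal(history, frame_len, attenuation):
    """Stand-in concealment: repeat the last output frame at reduced level."""
    if history is None:
        return [0.0] * frame_len  # nothing to extrapolate from yet
    return [s * attenuation for s in history]


def receive(frames, decode, frame_len, delay_frames=1, attenuation=0.5):
    """Process encoded frames; None marks a frame flagged as erased."""
    history = None        # temporary memory holding the most recent output frame
    delay_line = deque()  # models the predetermined output delay
    output = []
    for encoded in frames:
        if encoded is None:
            audio = conceal(history, frame_len, attenuation)
        else:
            audio = decode(encoded)
        history = audio   # concealed frames also update the memory
        delay_line.append(audio)
        if len(delay_line) > delay_frames:
            output.append(delay_line.popleft())
    output.extend(delay_line)  # flush the delayed frames at end of stream
    return output
```

Because concealed frames also update the memory, consecutive erasures decay progressively toward silence in this sketch.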
-
Patent number: 7113910
Abstract: Methods of document expansion for a speech retrieval document by a recognizer. A database of vectors of automatic transcriptions of documents is accessed and the vectors are truncated by removing all terms that are not recognizable by the recognizer to create truncated vectors. Terms in the vectors are then weighted to associate the truncated vectors with the untruncated vectors. Terms not recognized by the recognizer are then added back to the weighted, truncated vectors. The retrieval effectiveness may then be measured.
Type: Grant
Filed: December 19, 2000
Date of Patent: September 26, 2006
Assignee: AT&T Corp.
Inventors: Fernando Carlos Pereira, Amitabh Kumar Singhal
-
Patent number: 7110937
Abstract: An application archive is searched for an existing translation for a text string in an application to be localized. The text string is associated with context information that identifies a location of the text string in the application. If an existing translation is found that matches the text string, and all, or alternately part of, the context information, the existing translation is logically linked to the text string. In one aspect, the existing translation is selected from multiple matches based on number of occurrences. In another aspect, the existing translation is submitted to a manual validation process.
Type: Grant
Filed: June 20, 2002
Date of Patent: September 19, 2006
Assignee: Siebel Systems, Inc.
Inventors: Shu Lei, Sergey Parievsky, Mark Hastings
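A small Python sketch of the archive lookup described in the abstract above: match on the text string plus all or part of its context, then pick among multiple matches by number of occurrences. The archive layout (a list of `(text, context, translation)` tuples), the representation of context as a tuple of location components, and the `find_translation` name are all assumptions for illustration. The manual validation step is not modelled.

```python
from collections import Counter


def find_translation(text, context, archive, require_full_context=True):
    """Return the most frequent archived translation matching text and context.

    archive: iterable of (text, context, translation) tuples, where context
    is a tuple of location components (a hypothetical representation).
    """
    matches = []
    for archived_text, archived_context, translation in archive:
        if archived_text != text:
            continue
        if require_full_context:
            context_ok = archived_context == context
        else:
            # Partial match: at least one context component in common.
            context_ok = bool(set(archived_context) & set(context))
        if context_ok:
            matches.append(translation)
    if not matches:
        return None
    # Select among multiple matches by number of occurrences.
    return Counter(matches).most_common(1)[0][0]
```

Returning `None` when nothing matches leaves the string untranslated, so a caller can route it to a translator instead of linking a poor match.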
-
Patent number: 7103551
Abstract: A described computer network includes a first computer system and a second computer system. The first computer system transmits screen image information and corresponding speech information to the second computer system. The screen image information includes information corresponding to a screen image intended for display within the first computer system. The speech information conveys a verbal description of the screen image. When the screen image includes one or more objects (e.g., menus, dialog boxes, icons, and the like) having corresponding semantic information, the speech information includes the corresponding semantic information. The second computer system responds to the speech information by producing an output (e.g., human speech via an audio output device, a tactile output via a Braille output device, and the like). The semantic information conveyed by the output allows a visually-impaired user of the second computer system to know intended purposes of the objects.
Type: Grant
Filed: May 2, 2002
Date of Patent: September 5, 2006
Assignee: International Business Machines Corporation
Inventors: Charles J. King, Hidemasa Muta, Richard Scott Schwerdtfeger, Andrea Snow-Weaver
-
Patent number: 7096183
Abstract: A method is provided for customizing the speaking style of a speech synthesizer. The method includes: receiving input text; determining semantic information for the input text; determining a speaking style for rendering the input text based on the semantic information; and customizing the audible speech output of the speech synthesizer based on the identified speaking style.
Type: Grant
Filed: February 27, 2002
Date of Patent: August 22, 2006
Assignee: Matsushita Electric Industrial Co., Ltd.
Inventor: Jean-Claude Junqua
-
Patent number: 7089188
Abstract: An electronic document searching system or word searching system which when given an input, expands the input as a function of acoustic similarity and/or word sequence occurrence frequency. Results of the system are alternative input words or phrases. The alternative input words or phrases are output from the system for further processing.
Type: Grant
Filed: March 27, 2002
Date of Patent: August 8, 2006
Assignee: Hewlett-Packard Development Company, L.P.
Inventors: Beth T. Logan, Jean-Manuel Van Thong, Pedro J. Moreno