Pattern Display Patents (Class 704/276)
  • Patent number: 6694297
    Abstract: The present invention has as its object to provide text information to a listener by voice when music is reproduced from a medium on which the text information is stored together with music data, and to make use of that text information easy and smooth. The invention is a text information read-out device for reading out text information from a medium on which text information is stored together with music data, including a text information extraction unit for extracting the text information, a voice synthesizer for obtaining voice data from the extracted text information, and a controller for controlling the read-out timing of the voice data in synchronism with reproduction of the music data.
    Type: Grant
    Filed: December 18, 2000
    Date of Patent: February 17, 2004
    Assignee: Fujitsu Limited
    Inventor: Tatsuhiro Sato
  • Patent number: 6687675
    Abstract: A portable device for displaying an image and generating an associated audible message to a user includes a compartment having an area for visually displaying an image to a user and a system for generating an audio message associated with the image from stored audio information. The system includes a memory for storing the audio information, a speech synthesizer and a speaker for converting the audio information to the audio message and a microprocessor connected between the memory and the speech synthesizer for controlling the generation of the audio message.
    Type: Grant
    Filed: May 31, 2000
    Date of Patent: February 3, 2004
    Inventor: Lurley Archambeau
  • Publication number: 20040015365
    Abstract: In the disclosed speech recognition based interactive information retrieval scheme, the recognition target words in the speech recognition database are divided into prioritized recognition target words that constitute a number of data that can be processed by the speech recognition processing in the prescribed processing time and that have relatively higher importance levels based on statistical information, and the other non-prioritized recognition target words. Then, the speech recognition processing for the speech input with respect to the prioritized recognition target words is carried out at higher priority, and a confirmation process is carried out when the recognition result satisfies a prescribed condition for judging that the retrieval key can be determined only by a confirmation process with the user.
    Type: Application
    Filed: July 15, 2003
    Publication date: January 22, 2004
    Inventors: Kumiko Ohmori, Masanobu Higashida, Noriko Mizusawa
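The prioritization scheme in the abstract above can be sketched roughly as follows. This is an illustrative reading, not the patented implementation: recognition target words are split into a prioritized subset that fits a processing budget, ranked by an assumed statistical importance score, with the remainder deferred to a second recognition pass.

```python
def partition_vocabulary(words_with_scores, budget):
    """Split (word, importance) pairs into a prioritized set of at most
    `budget` words and the non-prioritized remainder.

    `words_with_scores` and `budget` are hypothetical inputs standing in
    for the patent's speech recognition database and processing-time limit.
    """
    # Rank by importance (e.g. derived from statistical usage information).
    ranked = sorted(words_with_scores, key=lambda ws: ws[1], reverse=True)
    prioritized = [w for w, _ in ranked[:budget]]
    deferred = [w for w, _ in ranked[budget:]]
    return prioritized, deferred
```

The prioritized list would be searched first; only if no confident match is found would the deferred words be considered.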
  • Patent number: 6678661
    Abstract: A method for highlighting a desired portion in an audio sequence for use in a visual display challenged environment. The method includes storing the audio sequence in memory. Next, the user selects a desired portion of the audio sequence and the selected portion is distinguished from the remainder of the audio sequence by automatically varying an audio characteristic of the selected portion during playback, without permanently altering the selected portion. In a related embodiment, the audio characteristic that is varied is pitch of the selected portion.
    Type: Grant
    Filed: February 11, 2000
    Date of Patent: January 13, 2004
    Assignee: International Business Machines Corporation
    Inventors: Gordon James Smith, George Willard Van Leeuwen
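The pitch-variation highlighting described in the entry above can be sketched as a playback-time transform that leaves the stored audio untouched. A minimal sketch under assumed conditions (integer sample lists, naive resampling as the pitch shift), not the patented method:

```python
def play_with_highlight(samples, start, end, pitch_factor=1.25):
    """Return a playback copy in which samples[start:end] are pitch-shifted
    by naive resampling; the stored sequence is left unmodified."""
    region = samples[start:end]
    # Resampling by pitch_factor raises pitch and shortens the region;
    # a real system would time-stretch to preserve duration.
    shifted = [region[min(int(i * pitch_factor), len(region) - 1)]
               for i in range(int(len(region) / pitch_factor))]
    return samples[:start] + shifted + samples[end:]
```

Because the function builds a new list, the selected portion is distinguished during playback without permanently altering the source, matching the abstract's constraint.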
  • Publication number: 20040006480
    Abstract: A method of presenting a multi-modal help dialog move to a user in a multi-modal dialog system is disclosed. The method comprises presenting an audio portion of the multi-modal help dialog move that explains available ways of user inquiry and presenting a corresponding graphical action performed on a user interface associated with the audio portion. The multi-modal help dialog move is context-sensitive and uses current display information and dialog contextual information to present a multi-modal help move that is currently related to the user. A user request or a problematic dialog detection module may trigger the multi-modal help move.
    Type: Application
    Filed: December 19, 2002
    Publication date: January 8, 2004
    Inventors: Patrick Ehlen, Helen Hastie, Michael Johnston
  • Publication number: 20040006481
    Abstract: A transcription tool assists a user in transcribing audio. The transcription tool includes an audio classification component that classifies an incoming audio stream based on whether portions of the audio stream contain speech data. The transcription tool plays the portions of the audio that contain speech data back to the user and skips the portions of the audio that do not contain speech data. Using a relatively simple command set, the user can control the transcription tool and annotate transcribed text.
    Type: Application
    Filed: July 2, 2003
    Publication date: January 8, 2004
    Inventors: Daniel Kiecza, Francis Kubala
  • Patent number: 6662161
    Abstract: A method for generating animated sequences of talking heads in text-to-speech applications wherein a processor samples a plurality of frames comprising image samples. Representative parameters are extracted from the image samples and stored in an animation library. The processor also samples a plurality of multiphones comprising images together with their associated sounds. The processor extracts parameters from these images comprising data characterizing mouth shapes, maps, rules, or equations, and stores the resulting parameters and sound information in a coarticulation library. The animated sequence begins with the processor considering an input phoneme sequence, recalling from the coarticulation library parameters associated with that sequence, and selecting appropriate image samples from the animation library based on that sequence. The image samples are concatenated together, and the corresponding sound is output, to form the animated synthesis.
    Type: Grant
    Filed: September 7, 1999
    Date of Patent: December 9, 2003
    Assignee: AT&T Corp.
    Inventors: Eric Cosatto, Hans Peter Graf, Juergen Schroeter
  • Publication number: 20030220798
    Abstract: A user interface is described that informs the user as to the status of the operation of a voice recognition application. The user interface displays an indicator, such as a volume bar, each time that the voice recognition application records and identifies a volume event. The user interface also displays an indicator when the voice recognition application recognizes a volume event corresponding to a displayed volume event indicator. The interface thus confirms to a user that the voice recognition application is both recording and recognizing the words being spoken by the user. It also graphically informs the user of the delay the application is currently experiencing in recognizing the words that the user is speaking.
    Type: Application
    Filed: May 24, 2002
    Publication date: November 27, 2003
    Applicant: Microsoft Corporation
    Inventors: Philipp H. Schmid, Marieke Iwema, Robert L. Chambers, Adrian Garside
  • Patent number: 6636219
    Abstract: The present invention provides an easy to use tool for preparing animated characters for use on the Internet. Requiring only limited user input and selection, the system of the present invention automatically choreographs and synchronizes reusable animation components with dialog streams. Once generated, the resulting choreography may be embedded into a hypertext markup language (HTML) web page with an appropriate audio player plug-in to deliver any number of animated dialogues with minimal wait time and minimal developer effort.
    Type: Grant
    Filed: February 26, 1998
    Date of Patent: October 21, 2003
    Assignee: Learn.Com, Inc.
    Inventors: Richard Merrick, Michael Thenhaus, Wesley Bell, Mark Zartler
  • Publication number: 20030177014
    Abstract: A system and method for displaying the text of a musical work in multiple languages as the work is performed or played.
    Type: Application
    Filed: February 12, 2003
    Publication date: September 18, 2003
    Inventors: Robert E. Crawford, Sol Guber
  • Publication number: 20030171932
    Abstract: A method and apparatus for automatically controlling the operation of a speech recognition system without requiring unusual or unnatural activity of the speaker by passively determining if received sound is speech of the user before activating the speech recognition system. A video camera and microphone are located in a hand-held device. The video camera records a video image of the speaker's face, i.e., of speech articulators of the user such as the lips and/or mouth. The recorded characteristics of the articulators are analyzed to identify the sound that the articulators would be expected to make, as in “lip reading”. A microphone concurrently records the acoustic properties of received sound proximate the user. The recorded acoustic properties of the received sound are then compared to the characteristics of speech that would be expected to be generated by the recorded speech articulators to determine whether they match.
    Type: Application
    Filed: March 7, 2002
    Publication date: September 11, 2003
    Inventors: Biing-Hwang Juang, Jialin Zhong
  • Patent number: 6609090
    Abstract: A computer based system, computer program product, and method for managing geographically distributed assets. A computer based system, computer program product, and method are provided for managing geographically distributed electric power transmission assets. Mapping, routing, and asset location information is managed and combined with real time Global Positioning System (GPS) location information to alarm field maintenance and inspection crews prior to inadvertent entry into restricted areas, such as environmentally protected or otherwise restricted lands. Detailed information is maintained on the electric power transmission assets providing management insight into past performance, as well as predictive information as to future costs and performance.
    Type: Grant
    Filed: March 9, 2000
    Date of Patent: August 19, 2003
    Assignee: Public Service Company of New Mexico
    Inventors: Christopher W. Hickman, Theodor E. Kircher, III, Gathen L. Garcia
  • Publication number: 20030105639
    Abstract: The invention includes an apparatus and method of providing information using an information appliance coupled to a network. The method includes storing text files in a database at a remote location and converting, at the remote location, the text files into speech files. A portion of the speech files requested are downloaded to the information appliance and presented through an audio speaker. The speech files may include audio of electronic program guide (EPG) information, weather information, news information or other information. The method also includes converting the text files into speech files at the remote location using an English text-to-speech (TTS) synthesizer, a Spanish TTS synthesizer, or another language synthesizer. A voice personality may be selected to announce the speech files.
    Type: Application
    Filed: November 30, 2001
    Publication date: June 5, 2003
    Inventors: Saiprasad V. Naimpally, Vasanth Shreesha
  • Publication number: 20030055655
    Abstract: Certain disclosed methods and systems perform multiple different types of message recognition using a shared language model. Message recognition of a first type is performed responsive to a first type of message input (e.g., speech), to provide text data in accordance with both the shared language model and a first model specific to the first type of message recognition (e.g., an acoustic model). Message recognition of a second type is performed responsive to a second type of message input (e.g., handwriting), to provide text data in accordance with both the shared language model and a second model specific to the second type of message recognition (e.g., a model that determines basic units of handwriting conveyed by freehand input). Accuracy of both such message recognizers can be improved by user correction of misrecognition by either one of them. Numerous other methods and systems are also disclosed.
    Type: Application
    Filed: January 28, 2002
    Publication date: March 20, 2003
    Inventor: Edwin A. Suominen
  • Publication number: 20030046088
    Abstract: A comprehensive system is provided for designing a voice activated user interface (VA UI) having a semantic and syntactic structure adapted to the culture and conventions of spoken language for the intended users. The system decouples the content dimension of speech (semantics) and the manner-of-speaking dimension (syntax) in a systematic way. By decoupling these dimensions, the VA UI can be optimized with respect to each dimension independently and jointly. The approach is general across languages and encompasses universal variables of language and culture. Also provided are voice activated user interfaces with semantic and syntactic structures so adapted, as well as a prompting grammar and error handling methods adapted to such user interfaces.
    Type: Application
    Filed: August 13, 2002
    Publication date: March 6, 2003
    Applicant: Comverse Network Systems, Inc.
    Inventor: Matthew John Yuschik
  • Patent number: 6529870
    Abstract: A method and apparatus for identifying voice mail messages uses speaker identification to identify a voice mail message. The method preferably includes comparing the voice mail message to existing voice samples in order to determine a matching coefficient. If the matching coefficient is within an allowed range, the voice mail message is categorized as a matched voice mail message and a name indicator is coupled to the voice mail message. The apparatus includes a user interface, a processing unit, and a storage media. The user interface receives the voice mail message and allows access to the voice mail message by a voice mail recipient. The processing unit compares the voice mail message to the existing voice samples, determines the matching coefficient, and assigns the name indicator. The storage media stores the voice mail message.
    Type: Grant
    Filed: October 4, 1999
    Date of Patent: March 4, 2003
    Assignee: Avaya Technology Corporation
    Inventor: Rajendra Prasad Mikkilineni
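The matching-coefficient idea in the voice mail entry above can be sketched with cosine similarity over speaker feature vectors. The feature extraction, names, and threshold here are assumptions for illustration, not the patented method:

```python
def identify_caller(message_vec, samples, threshold=0.85):
    """Compare a message's feature vector against named voice samples;
    return the best-matching name if its coefficient is within the
    allowed range, else None (message stays uncategorized)."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = sum(x * x for x in a) ** 0.5
        nb = sum(y * y for y in b) ** 0.5
        return dot / (na * nb)

    best_name, best = None, threshold
    for name, vec in samples.items():
        c = cosine(message_vec, vec)
        if c >= best:
            best_name, best = name, c
    return best_name
```

A matched message would then have the name indicator coupled to it by the voice mail system; unmatched messages fall through with `None`.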
  • Publication number: 20030040916
    Abstract: An animation system includes a voice engine which processes audio input signals, typically speech signals, and converts them to a digital signal for processing. The digital signal is analysed to generate a value characteristic of each sample of the input signal and which is related to the maximum amplitude of the sample. The voice engine compares each value obtained in this way to the number of possible predetermined value ranges corresponding to a predetermined graphic showing a mouth position, and thus matches the input speech signal to a variety of possible mouth positions. The mouth graphics are superimposed on an image of a character substantially in real-time, providing an animated display of a character with its mouth synchronised to the input speech signal.
    Type: Application
    Filed: August 2, 2001
    Publication date: February 27, 2003
    Inventor: Ronald Leslie Major
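The amplitude-to-mouth-position matching in the entry above is concrete enough to sketch: each sample's characteristic value is binned into one of a fixed number of value ranges, each range selecting a predetermined mouth graphic. The shape names and normalized amplitude range are illustrative assumptions:

```python
# Hypothetical mouth graphics, ordered from closed to fully open.
MOUTH_SHAPES = ["closed", "slightly_open", "open", "wide_open"]

def mouth_for_frame(frame):
    """Pick a mouth graphic from the frame's peak amplitude, binned into
    as many equal ranges as there are predefined mouth positions.
    Assumes amplitudes are normalised to [-1, 1]."""
    peak = max(abs(s) for s in frame)          # characteristic value of the sample
    bin_width = 1.0 / len(MOUTH_SHAPES)
    index = min(int(peak / bin_width), len(MOUTH_SHAPES) - 1)
    return MOUTH_SHAPES[index]
```

Superimposing the selected graphic on the character image each frame gives the real-time lip-sync effect the abstract describes.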
  • Publication number: 20030009342
    Abstract: Mark R. Haley has created software which converts text from any language into realistically sounding human speech in any language on any device, such as off-the-shelf computers or servers, with the option to show related videos and photos (or any multimedia) associated with that text.
    Type: Application
    Filed: July 6, 2001
    Publication date: January 9, 2003
    Inventor: Mark R. Haley
  • Publication number: 20020194006
    Abstract: A visual speech system for converting emoticons into facial expressions on a displayable animated facial image. The system comprises: (1) a data import system for receiving text data that includes at least one emoticon string, wherein the at least one emoticon string is associated with a predetermined facial expression; and (2) a text-to-animation system for generating a displayable animated face image that can simulate at least one facial movement corresponding to the predetermined facial expression. The system is preferably implemented remotely over a network, such as in an on-line chat environment.
    Type: Application
    Filed: March 29, 2001
    Publication date: December 19, 2002
    Applicant: Koninklijke Philips Electronics N.V.
    Inventor: Kiran Challapali
  • Publication number: 20020184036
    Abstract: This invention discloses a system and method for providing a visible indication of speech, the system including a speech analyzer operative to receive input speech (10), and to provide a phoneme-based output indication (14) representing the input speech, and a visible display receiving the phoneme-based output indication (16) and providing an animated representation of the input speech based on the phoneme-based output indication (16).
    Type: Application
    Filed: May 30, 2002
    Publication date: December 5, 2002
    Inventor: Nachshon Margaliot
  • Publication number: 20020161587
    Abstract: A method and system for providing natural language processing in a communication system is disclosed. A voice request is generated with a remote terminal that is transmitted to a base station. A voice recognition application is then used to identify a plurality of words that are contained in the voice request. After the words are identified, a grammar associated with each word is also identified. Once the grammars have been identified, each word is categorized into a respective grammar category. A structured response is then generated to the voice request with a response generation application.
    Type: Application
    Filed: April 25, 2002
    Publication date: October 31, 2002
    Inventors: Ashton F. Pitts, Steve L. Dempsen, Vinny Wai-yan Che
  • Publication number: 20020152080
    Abstract: The invention creates a device for the entry of names into a navigation system (12). The selection probability is taken into consideration in entering the names. At least one statistically collected detail or a detail recorded by measurement technology concerning the local characteristics of the area designated by the name is used as a measure for the selection probability.
    Type: Application
    Filed: May 13, 2002
    Publication date: October 17, 2002
    Inventor: Jens Ehrke
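The selection-probability weighting in the navigation entry above can be sketched as combining a recognizer's acoustic score with a prior derived from local characteristics of the named area (population is an assumed example; the place names and scores are hypothetical):

```python
def rank_name_candidates(acoustic_scores, priors):
    """Combine acoustic likelihoods with a selection-probability prior
    (e.g. derived from the population of the named place) and return
    candidate names ranked best-first."""
    combined = {name: acoustic_scores[name] * priors.get(name, 1e-6)
                for name in acoustic_scores}
    return sorted(combined, key=combined.get, reverse=True)
```

A slightly worse acoustic match for a far more probable destination can thereby win, which is the point of weighting name entry by selection probability.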
  • Patent number: 6456274
    Abstract: A text processing system displays text as a page of characters. The user interface comprises a set of editing actions that are activatable by mouse and/or keyboard actuations. Furthermore, secondary display modes are provided for audio or fax, which can be edited by at least a subset of the text editing actions. Audio is represented as pseudo characters.
    Type: Grant
    Filed: October 16, 1997
    Date of Patent: September 24, 2002
    Assignee: U.S. Philips Corporation
    Inventor: Jan P. Van Hemert
  • Patent number: 6434525
    Abstract: A device is provided that generates the gestures and expressions of a human image on a computer without expending a great amount of labor. The words for the system response to the input of a user and the state of the dialogue are described in a dialogue flow memory unit, a dialogue flow analysis unit analyzes the spoken text of the flow, extracts the key words associated with a movement pattern by referring to a text movement association table, and the movement expression generation unit generates the movements corresponding to the movement pattern. In the generation of the movement, movement patterns determined in advance are selected according to the state of the dialogue written in the dialogue flow, and the movement pattern is determined or modified by the key words.
    Type: Grant
    Filed: May 26, 1999
    Date of Patent: August 13, 2002
    Assignee: NEC Corporation
    Inventors: Izumi Nagisa, Dai Kusui
  • Publication number: 20020077832
    Abstract: A computer-based method of analyzing an electronic document that includes text and graphics in which common reference symbols designate text components and respective graphics components. The method comprises processing the document text and graphics into an index that identifies the text locations and graphic locations of reference symbols; displaying (70) text that includes at least some of the text reference symbols and/or displaying (68) at least some of the graphic reference symbols; and linking the common text and graphic reference symbols such that user selection of a particular text reference symbol or graphic reference symbol causes display of a respective graphic segment or text segment that includes the selected common reference symbol. Other features include displaying a component list, selecting component identities to display graphic segments, using voice recognition for user control, and synthesized speech for audio text response.
    Type: Application
    Filed: November 2, 2001
    Publication date: June 20, 2002
    Inventors: Leonid Batchilo, Valery Tsourikov, Edward Dreyfus
  • Patent number: 6405167
    Abstract: An electronic interactive book allowing a child to learn the pronunciation of various words. The book will be provided with a permanent or impermanent display onto which a plurality of words, phrases or sentences would be provided. The electronic book includes a microphone as well as a speech recognition unit for recognizing a word that the child pronounces. A highlighting device such as a light emitting diode or a means for illuminating a particular word in a manner different than the surrounding words would be engaged when the child correctly pronounces that word.
    Type: Grant
    Filed: July 16, 1999
    Date of Patent: June 11, 2002
    Inventor: Mary Ann Cogliano
  • Patent number: 6397185
    Abstract: A pronunciation training system and methods are provided as a series of programmed routines stored on an item of removable storage media, and select information generated by a speech analysis engine to compute and display graphical representations of metrics useful to a student. The student selects from among a plurality of pre-recorded utterances spoken by a native speaker, and the student then records his or her pronunciation of the utterance. The software computes and displays graphical metrics for the native speaker's utterance and the student's utterance, in any of a variety of formats, on a side-by-side basis. The system also permits the student to repeat selected phrases and to monitor improvement by similarity between the graphical metrics.
    Type: Grant
    Filed: March 29, 1999
    Date of Patent: May 28, 2002
    Assignee: Betteraccent, LLC
    Inventors: Julia Komissarchik, Edward Komissarchik
  • Patent number: 6389396
    Abstract: A device for prosody generation at visual synthesis. A number of half-syllables are stored together with registered movement patterns in a face. When synthesizing speech, a number of half-syllables are put together into words and sentences. The words and sentences are given a stress and pattern of intonation corresponding to the intended language. In the face, a number of points and their movement patterns are further registered. In connection with the generation of words and sentences, the movement patterns of the different points are amplified depending on a given stress and sentence intonation. The given movement patterns are after that applied to a model, which is applied to a real face at which a life-like animation is obtained, at for instance a translation of a person's speech in a first language to a second language.
    Type: Grant
    Filed: November 29, 1999
    Date of Patent: May 14, 2002
    Assignee: Telia AB
    Inventor: Bertil Lyberg
  • Patent number: 6385580
    Abstract: A method of speech synthesis for reproduction of facial movements of a person with synchronized speech synthesis. The speech is put together of polyphones that are taken from a database. A databank is established containing polyphones with the polyphones belonging to facial movement patterns of the face of a first person. Polyphones from a second person are also registered and stored in the database. The duration of sound segments in corresponding polyphones in the databank and the database are compared and the facial movements in the databank are modified in accordance with the deviation. The modified facial movements are stored in the databank and related to the polyphone in question. These registered polyphones are then utilized to put together words and sentences at the same time as corresponding facial movements build up a face model from the modified facial movements in the databank.
    Type: Grant
    Filed: November 18, 1999
    Date of Patent: May 7, 2002
    Assignee: Telia AB
    Inventors: Bertil Lyberg, Mats Wiren
  • Publication number: 20020049599
    Abstract: An information presentation computer receiving news articles distributed from an information distribution computer performs voice synthesis based on text information included in received send data, outputs obtained synthetic voice, and displays an animation image imitating a speaker of synthetic voice. In addition, a text to be spoken by synthetic voice is displayed in a letter color corresponding to each animation image.
    Type: Application
    Filed: September 28, 2001
    Publication date: April 25, 2002
    Inventors: Kazue Kaneko, Hideo Kuboyama, Shinji Hisamoto
  • Patent number: 6351732
    Abstract: Audio signals are presented to a user by separating the audio signals into plural discrete frequency components extending from a low frequency to a high frequency, translating each of the frequency components into control signals, and applying the control signals to an array of light emitting devices for sensing by the user.
    Type: Grant
    Filed: February 7, 2001
    Date of Patent: February 26, 2002
    Inventors: Elmer H. Hara, Edward R. McRae
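The frequency-separation step in the entry above can be sketched with a direct DFT: a frame of audio is split into discrete frequency components, and each band's average magnitude becomes the control signal for one light-emitting device. A minimal sketch, assuming normalized mono samples and equal-width bands:

```python
import math

def band_levels(samples, n_bands):
    """Split a frame into frequency components via a direct DFT and return
    one drive level (average magnitude) per band, low to high."""
    n = len(samples)
    mags = []
    for k in range(n // 2):                      # one-sided spectrum
        re = sum(samples[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(samples[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        mags.append(math.hypot(re, im))
    per_band = len(mags) // n_bands
    return [sum(mags[b * per_band:(b + 1) * per_band]) / per_band
            for b in range(n_bands)]
```

Each returned level would be translated to an LED brightness, so low-frequency energy lights the low end of the array and high-frequency energy the high end.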
  • Publication number: 20010053970
    Abstract: A data collection and automatic database population system which combines global positioning system (GPS), speech recognition software, radio frequency (RF) communications, and geographic information system (GIS) to allow rapid capture of field data, asset tracking, and automatic transfer of the data to a GIS database. A pre-defined grammar allows observations to be continuously captured along with GPS location and time, and stored on the field mobile unit. A mobile unit's location is tracked in real time or post processed through wireless RF transmission of location information between the mobile unit and a central processing station. The captured data is electronically transferred to a central processing station for quality assurance and automatic population of the GIS database. The system provides for automatic correlation of field data with other GIS database layers. Tools to generate predefined or user defined reports, work orders, and general data queries allow exploitation of the GIS database.
    Type: Application
    Filed: June 12, 2001
    Publication date: December 20, 2001
    Inventors: Terry Edward Ford, John Anthony Yotka, Richard James Turek
  • Patent number: 6332123
    Abstract: A picture synthesizing apparatus and method for synthesizing a moving picture of a person's face having mouth-shape variations from a train of input characters. The method develops a train of phonemes from the train of input characters and, using a speech synthesis technique, outputs for each phoneme a corresponding vocal sound feature including the articulation mode and duration of that phoneme. It then determines for each phoneme a mouth-shape feature on the basis of the corresponding vocal sound feature, the mouth-shape feature including the degree of opening of the mouth, the degree of roundness of the lips, the height of the lower jaw in a raised and a lowered position, and the degree to which the tongue is seen.
    Type: Grant
    Filed: January 19, 1994
    Date of Patent: December 18, 2001
    Assignee: Kokusai Denshin Denwa Kabushiki Kaisha
    Inventors: Masahide Kaneko, Atsushi Koike, Yoshinori Hatori, Seiichi Yamamoto, Norio Higuchi
  • Publication number: 20010039489
    Abstract: A data collection and automatic database population system which combines global positioning system (GPS), speech recognition software, radio frequency (RF) communications, and geographic information system (GIS) to allow rapid capture of field data, asset tracking, and automatic transfer of the data to a GIS database. A pre-defined grammar allows observations to be continuously captured along with GPS location and time, and stored on the field mobile unit. A mobile unit's location is tracked in real time or post processed through wireless RF transmission of location information between the mobile unit and a central processing station. The captured data is electronically transferred to a central processing station for quality assurance and automatic population of the GIS database. The system provides for automatic correlation of field data with other GIS database layers. Tools to generate predefined or user defined reports, work orders, and general data queries allow exploitation of the GIS database.
    Type: Application
    Filed: May 23, 2001
    Publication date: November 8, 2001
    Inventors: Terry Edward Ford, John Anthony Yotka, Richard James Turek
  • Patent number: 6311160
    Abstract: The operation mode control section sets an operation mode flag that authorizes the loudspeaker to replay the speech input through the microphone while the speech is being compressed and encoded into speech data by the compression/encoding section and then further processed by the encoding processing section. When an order is issued by the user to confirm the input speech by means of the replay operation section during the encoding operation, the speech output control section receives a permit signal for reproducing the speech through the loudspeaker after expanding the speech data by means of the speech data expansion processing section. The encoding operation proceeds concurrently during the speech reproducing operation.
    Type: Grant
    Filed: October 1, 1998
    Date of Patent: October 30, 2001
    Assignee: Olympus Optical Co., Ltd.
    Inventor: Shinichi Imade
  • Patent number: 6308157
    Abstract: A method and system efficiently identifies voice commands for a user of a speech recognition system. The method involves a series of steps including: receiving input from a user; monitoring the computer system to log system events and ascertain a current system state; predicting a probable next event according to the current system state and logged events; and identifying acceptable voice commands to perform the next event. The system events include commands, system control activities, timed activities, and application activation. These events are statistically analyzed in light of the current system state to determine the probable next event. The voice commands for performing the probable next event are displayed to the user.
    Type: Grant
    Filed: June 8, 1999
    Date of Patent: October 23, 2001
    Assignee: International Business Machines Corp.
    Inventors: Ronald E. Vanbuskirk, James R. Lewis, Kerry A. Ortega, Huifang Wang, Amado Nassiff
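The event-prediction step in the entry above can be sketched as transition counting over a logged event sequence: the most frequent event observed after the current system state is the probable next event, whose voice commands would then be displayed. The event names are hypothetical, and real systems would weight in more context than a single previous state:

```python
from collections import Counter, defaultdict

def build_predictor(event_log):
    """Count state -> next-event transitions from a logged event sequence."""
    transitions = defaultdict(Counter)
    for prev, nxt in zip(event_log, event_log[1:]):
        transitions[prev][nxt] += 1
    return transitions

def probable_next_event(transitions, current_state):
    """Most frequent event observed after the current state, if any."""
    counts = transitions.get(current_state)
    return counts.most_common(1)[0][0] if counts else None
```

The display layer would look up the voice commands that perform the predicted event and show them to the user.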
  • Patent number: 6272457
    Abstract: A data collection and automatic database population system which combines global positioning system (GPS), speech recognition software, radio frequency (RF) communications and geographic information system (GIS) to allow rapid capture of field data, asset tracking and automatic transfer of the data to a GIS database. A pre-defined grammar allows observations to be continuously captured, GPS location and time to be captured and stored on the field unit. Other sensor data is combined with the observations and combined with GPS and time information. A mobile unit's location is tracked real time or post processed through wireless RF transmission of location information between the mobile unit and a central processing station. Real time position correction is provided by Differential GPS (DGPS). The captured data is electronically transferred to a central processing station for operator performed quality assurance and automatic population of the GIS database.
    Type: Grant
    Filed: September 16, 1996
    Date of Patent: August 7, 2001
    Assignee: Datria Systems, Inc.
    Inventors: Terry Edward Ford, John Anthony Yotka, Richard James Turek, Jr.
  • Patent number: 6269335
    Abstract: A method of identifying homophones of a word uttered by a user from at least a portion of existing words of a vocabulary of a speech recognition engine comprises the steps of: a user uttering the word; decoding the uttered word; computing respective measures between the decoded word and at least a portion of the other existing vocabulary words, the respective measures indicative of acoustic similarity between the word and the at least a portion of other existing words; if at least one measure is within a threshold range, indicating, to the user, results associated with the at least one measure, the results preferably including the decoded word and the other existing vocabulary word associated with the at least one measure; and the user preferably making a selection depending on the word the user intended to utter.
    Type: Grant
    Filed: August 14, 1998
    Date of Patent: July 31, 2001
    Assignee: International Business Machines Corporation
    Inventors: Abraham Ittycheriah, Stephane Herman Maes, Michael Daniel Monkowski, Jeffrey Scott Sorensen
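The core loop of this abstract, compute a similarity measure between the decoded word and the rest of the vocabulary, then report the words whose measure falls within a threshold, can be sketched as below. The patent's measure is acoustic; here plain Levenshtein edit distance over spellings stands in for it, and the threshold value is arbitrary.

```python
def edit_distance(a, b):
    # classic dynamic-programming Levenshtein distance
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def find_homophones(decoded, vocabulary, threshold=2):
    """Return vocabulary words whose measure to the decoded word is
    within the threshold range (stand-in for acoustic similarity)."""
    return [w for w in vocabulary
            if w != decoded and edit_distance(decoded, w) <= threshold]

vocab = ["their", "there", "they're", "then", "cat"]
print(find_homophones("their", vocab, threshold=3))
```

In the patented flow the user is then shown the decoded word alongside these candidates and selects the one actually intended.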
  • Patent number: 6266641
    Abstract: A control program for controlling voice data processing by a computer causes the computer to selectively display a list display screen for displaying a list of file information concerning voice data of each file, and an operating screen, excluding the list of file information, for displaying an operating portion for performing voice data playback processing. The control program causes the computer to play back voice data of a file subsequent to a currently played-back file according to an order in the list displayed on the first screen, if a predetermined operation is executed on the second screen while the second screen is displayed.
    Type: Grant
    Filed: June 5, 1998
    Date of Patent: July 24, 2001
    Assignee: Olympus Optical Co., Ltd.
    Inventor: Hiroshi Takaya
  • Patent number: 6260018
    Abstract: An operation mode control section sets an operation mode flag so as to prohibit any speech replay operation using the loudspeaker while the printer is driven for a printing operation, and permits the speech replay operation only after the printing operation completes. When a speech replay/input operation using the loudspeaker is specified by the replay operation section, the operation of the speech output control section is prohibited because the operation mode flag is set to prohibit any speech replay operation using the loudspeaker, so the loudspeaker is held in a standby state until a speech replay operation is permitted by the operation mode flag or the operation of the printer is terminated.
    Type: Grant
    Filed: September 29, 1998
    Date of Patent: July 10, 2001
    Assignee: Olympus Optical Co., Ltd.
    Inventor: Shinichi Imade
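The mode-flag behaviour in this abstract is essentially a small state machine: replay requests issued while the printer is busy are refused and held in standby, then released once printing completes. A minimal sketch, with invented class and attribute names:

```python
class SpeechReplayController:
    """Operation-mode-flag sketch: replay through the loudspeaker is
    prohibited while printing and resumed afterwards (illustrative)."""

    def __init__(self):
        self.printing = False   # the operation mode flag
        self.pending = None     # clip held while the loudspeaker is in standby
        self.played = []

    def start_print(self):
        # flag set: any speech replay using the loudspeaker is prohibited
        self.printing = True

    def request_replay(self, clip):
        if self.printing:
            self.pending = clip   # hold the loudspeaker in a standby state
            return "standby"
        self.played.append(clip)
        return "playing"

    def finish_print(self):
        # printing complete: replay permitted again, release the held clip
        self.printing = False
        if self.pending is not None:
            self.played.append(self.pending)
            self.pending = None

ctl = SpeechReplayController()
ctl.start_print()
print(ctl.request_replay("memo.wav"))  # "standby": printer still busy
ctl.finish_print()
print(ctl.played)                      # the held clip plays afterwards
```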
  • Patent number: 6260015
    Abstract: A method for correcting errors of a speech recognition application in natural speech recognition of words or phrases represented by characters can include the steps of selecting a character corresponding to an incorrectly recognized spoken word or phrase and observing a list of alternative characters. The method also can include replacing the incorrect character with one of the alternative characters, if the alternative characters correctly represent the incorrectly recognized word or phrase. If none of the alternative characters in the list correctly represents the spoken word or phrase, the step of drawing some features of a character correctly representing the incorrectly recognized word or phrase by moving a cursor can be included. Also included in the method can be the step of updating the list of alternative characters responsive to the drawing step; and, repeating the replacing, drawing and updating steps until a character correctly representing the word or phrase is selected from the updated list.
    Type: Grant
    Filed: September 3, 1998
    Date of Patent: July 10, 2001
    Assignee: International Business Machines Corp.
    Inventors: Huifang Wang, Ronald VanBuskirk, James R. Lewis, Qing Gong
  • Patent number: 6243685
    Abstract: A voice-operated, interactive message display system designed for inter-vehicle and extra-vehicle communications. The system includes one or more display units having a matrix of light-emitting elements to transmit a message to other vehicles either to the front, rear, side, or combination thereof. The display units are controlled by a central control unit having a voice recognition and voice synthesis system, which is used to interactively determine a message to display and on which displays to present it. Upon manual activation, the message system vocally prompts the user for the message to display and the parameters for its display. The system may be implemented in at least two different embodiments. In one embodiment a powerful vocabulary memory unit is used and the content of the message is run-time programmable. If a less expensive vocabulary unit is used, the user may choose from a series of preprogrammed messages.
    Type: Grant
    Filed: February 8, 1999
    Date of Patent: June 5, 2001
    Inventors: Henry L. Welch, Rick C. Bergman
  • Patent number: 6230139
    Abstract: Presenting audio signals to a user comprises receiving audio signals to be presented, separating the audio signals into plural discrete frequency components extending from a low frequency to a high frequency, translating each of the frequency components into control signals, and applying the control signals to an array of tactile transducers for sensing by the user.
    Type: Grant
    Filed: February 6, 1998
    Date of Patent: May 8, 2001
    Inventors: Elmer H. Hara, Edward R. McRae
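The pipeline in this abstract, split the audio into frequency components and translate each into a transducer control signal, can be sketched as follows. A naive DFT stands in for whatever filter bank the patent actually uses, and the 0-255 drive levels are an invented scaling.

```python
import math

def band_levels(samples, num_bands):
    """Split a sample window into equal-width frequency bands, low to
    high, and scale each band's energy to a 0-255 transducer drive
    level (illustrative sketch, naive DFT)."""
    n = len(samples)
    mags = []
    for k in range(1, n // 2):  # positive-frequency bins
        re = sum(s * math.cos(2 * math.pi * k * t / n)
                 for t, s in enumerate(samples))
        im = -sum(s * math.sin(2 * math.pi * k * t / n)
                  for t, s in enumerate(samples))
        mags.append(math.hypot(re, im))
    # group bins into one band per tactile transducer
    size = len(mags) // num_bands
    bands = [sum(mags[i * size:(i + 1) * size]) for i in range(num_bands)]
    peak = max(bands) or 1.0
    return [round(255 * b / peak) for b in bands]

# a pure low-frequency tone should drive mainly the lowest transducer
tone = [math.sin(2 * math.pi * 2 * t / 64) for t in range(64)]
levels = band_levels(tone, 4)
print(levels)
```

A practical implementation would use an FFT and run this per audio frame, but the band-energy-to-control-signal mapping is the same idea.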
  • Patent number: 6230132
    Abstract: In a method for real time speech input of a destination address into a navigation system, the speech statements that are entered by a user are recognized by a speech recognition device and classified in accordance with their recognition probability. The speech statement with the greatest recognition probability is identified as the input speech statement, with at least one speech statement being an admissible speech command that activates the operating functions of the navigation system associated with this speech command. (All the admissible speech statements being stored in at least one database.) According to the invention, at least one operating function of the navigation system comprises an input dialogue. Following activation of that operating function, depending on the input dialogue, at least one lexicon is generated in real time from the admissible speech statements stored in at least one database, and the generated lexicon is loaded as vocabulary into the speech recognition device.
    Type: Grant
    Filed: March 10, 1998
    Date of Patent: May 8, 2001
    Assignee: DaimlerChrysler AG
    Inventors: Fritz Class, Thomas Kuhn, Carsten-Uwe Moeller, Frank Reh, Gerhard Nuessle
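The key idea here, generating a small lexicon at run time from the database entries admissible for the active input dialogue, then loading it as the recognizer's vocabulary, can be sketched as below. The database contents, dialogue names, and recognizer interface are all invented stand-ins; real decoding is reduced to an exact match against the loaded lexicon.

```python
# invented stand-in for the navigation system's admissible-statement database
DATABASE = {
    "city": ["Berlin", "Munich", "Stuttgart", "Ulm"],
    "street": ["Hauptstrasse", "Bahnhofstrasse"],
    "command": ["enter destination", "route guidance"],
}

class Recognizer:
    def __init__(self):
        self.vocabulary = []

    def load_lexicon(self, words):
        # the generated lexicon becomes the active vocabulary
        self.vocabulary = list(words)

    def recognize(self, utterance):
        # stand-in for real decoding: exact match against the lexicon
        return utterance if utterance in self.vocabulary else None

def start_dialogue(recognizer, dialogue):
    # generate the lexicon for this input dialogue in real time
    recognizer.load_lexicon(DATABASE[dialogue])

rec = Recognizer()
start_dialogue(rec, "city")
print(rec.recognize("Ulm"))                # admissible in the city dialogue
print(rec.recognize("enter destination"))  # not in the loaded lexicon
```

Restricting the vocabulary per dialogue is what makes real-time recognition of large address databases tractable: the recognizer only ever scores the statements admissible at the current step.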
  • Patent number: 6192339
    Abstract: In one embodiment of the method and apparatus for managing multiple speech applications, a common development platform and a common environment are provided. The common environment interfaces with the speech applications, receives information from an application information storage and a plurality of speech input sources, allows the speech applications to execute simultaneously and transitions from one said speech application to another seamlessly. In addition, the speech applications are developed based on the common development platform. Thus, application developers may utilize the common development platform to design and implement the speech applications independently.
    Type: Grant
    Filed: November 4, 1998
    Date of Patent: February 20, 2001
    Assignee: Intel Corporation
    Inventor: Cory W. Cox
  • Patent number: 6185538
    Abstract: A system for non-linearly editing video and audio information, uses a device for recognizing speech in the audio information and for generating a character sequence, particularly an ASCII character sequence, to produce an edit decision list (EDL). The generated character sequence is displayed on the display screen of an indicator. With reference to marked parts of the character sequence displayed on the display screen of the indicator, editing data is derived for the EDL.
    Type: Grant
    Filed: July 29, 1998
    Date of Patent: February 6, 2001
    Assignee: US Philips Corporation
    Inventor: Axel Schulz
  • Patent number: 6175820
    Abstract: A method for providing voice dynamics of human utterances converted to and represented by text within a data processing system. A plurality of predetermined parameters for recognition and representation of dynamics in human utterances are selected. An enhanced human speech recognition software program is created implementing the predetermined parameters on a data processing system. The enhanced software program includes an ability to monitor and record human voice dynamics and provide speech-to-text recognition. The dynamics in a human utterance is captured utilizing the enhanced human speech recognition software. The human utterance is converted into a textual representation utilizing the speech-to-text ability of the software. Finally, the dynamics are merged along with the textual representation of the human utterance to produce a marked-up text document on the data processing system.
    Type: Grant
    Filed: January 28, 1999
    Date of Patent: January 16, 2001
    Assignee: International Business Machines Corporation
    Inventor: Timothy Alan Dietz
  • Patent number: 6175862
    Abstract: The present invention extends a standard HTML browser to support a new data type, the Uniform Resource Locator Sequence (URLS). The URLS consists of a header and a sequence of URLs. The method of the present invention receives the URLS data then sequentially accesses the data of each URL comprising the URLS, obtains statistics on the response time to the requests for URLs, and times the calls for subsequent URLs in the sequence accordingly so that the arrival of the linked data nearly simulates actual streaming.
    Type: Grant
    Filed: June 17, 1998
    Date of Patent: January 16, 2001
    Assignee: International Business Machines Corporation
    Inventors: Jeane Shu-Chun Chen, Ephraim Feig
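The timing logic in this abstract, use statistics on observed response times to schedule the request for each subsequent URL so the linked data arrives roughly when it is needed, can be sketched as follows. The header fields and the running-mean latency estimate are invented simplifications of whatever the patent actually specifies.

```python
def schedule_fetches(durations, response_times):
    """Return the request start time for each URL in a URL Sequence.

    durations      -- playback time of each item in seconds (from the header)
    response_times -- response times observed for earlier requests

    Each request is issued early enough, by the mean observed latency,
    that the item arrives about when the previous one finishes,
    approximating streaming (illustrative sketch).
    """
    starts = []
    clock = 0.0  # when each item is needed on the playback timeline
    for d in durations:
        latency = (sum(response_times) / len(response_times)
                   if response_times else 0.0)
        starts.append(max(0.0, clock - latency))
        clock += d
    return starts

print(schedule_fetches([5.0, 5.0, 5.0], [1.0, 3.0]))
```

With a mean latency of 2 seconds, the second and third items are requested 2 seconds before their playback slots, so their arrival "nearly simulates actual streaming" as the abstract puts it.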
  • Patent number: 6173264
    Abstract: A reading system includes a computer and a mass storage device including software comprising instructions for causing a computer to accept an image file generated from optically scanning an image of a document. The software converts the image file into a converted text file that includes text information and positional information associating the text with the position of its representation in the image file. The reading system can therefore display the image representation of the scanned document on a computer monitor and permit a user to control operation of the reader with respect to that displayed image by using the positional information associated with the converted text file. Also described are techniques for dual highlighting of spoken text and for determining the nearest word to a position selected with a mouse or other pointing device operating on the displayed image representation.
    Type: Grant
    Filed: December 15, 1998
    Date of Patent: January 9, 2001
    Inventors: Raymond C. Kurzweil, Firdaus Bhathena
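The nearest-word technique mentioned at the end of this abstract can be sketched by assuming each converted-text word carries the bounding box of its representation in the scanned image: the word nearest the pointer is the one whose box is closest (distance zero if the pointer is inside it). The page data below is invented for illustration.

```python
def box_distance(x, y, box):
    """Euclidean distance from point (x, y) to an axis-aligned box;
    zero when the point lies inside the box."""
    left, top, right, bottom = box
    dx = max(left - x, 0, x - right)
    dy = max(top - y, 0, y - bottom)
    return (dx * dx + dy * dy) ** 0.5

def nearest_word(x, y, words):
    # words: list of (text, (left, top, right, bottom)) pairs from the
    # converted text file's positional information
    return min(words, key=lambda w: box_distance(x, y, w[1]))[0]

page = [("reading",  (10, 10,  60, 22)),
        ("system",   (70, 10, 115, 22)),
        ("includes", (10, 30,  70, 42))]
print(nearest_word(72, 15, page))  # pointer inside the "system" box
print(nearest_word(0, 40, page))   # off-text click snaps to "includes"
```

This is what lets the reader speak, or highlight, the word a user clicks on in the image view even though the click lands on pixels, not text.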
  • Patent number: 6151577
    Abstract: The subject invention concerns a system for phonological training comprising a sound reception device (1), an operating device (5) for controlling the system, interpreting and processing devices (2), and a presentation device (3). The presentation device (3) includes a display screen divided into a plurality of windows (11-17) for simultaneous presentation of a graphic reproduction of the desired sound, of the sound produced by the user and received by the sound reception device (1), and of an animated reproduction of the speech organs. The system is adapted to reproduce the sound by fields (41, 42, 51, 52), the longitudinal extension of each field in one direction reflecting the time during which the sound is produced, and the graphic content within each field, such as colours, shading or the like, denoting the place of formation of the sound in the oral cavity.
    Type: Grant
    Filed: June 25, 1999
    Date of Patent: November 21, 2000
    Assignee: Ewa Braun
    Inventor: Ewa Braun