Pattern Display Patents (Class 704/276)
-
Publication number: 20100066554
Abstract: A home appliance system includes a home appliance outputting product information as a sound and a mobile terminal confirming the product information based on the sound. The mobile terminal can receive the sound, convert the sound into the product information and output the product information to an external user and a repairman.
Type: Application
Filed: September 1, 2009
Publication date: March 18, 2010
Inventors: Phal Jin LEE, Hoi Jin JEONG, Jong Hye HAN, Young Soo KIM, In Haeng CHO, Si Moon JEON
-
Patent number: 7680853
Abstract: Search results are provided in a format that allows users to efficiently determine whether audio or video documents identified from a search query actually contain the words in the query. This is achieved by returning snippets of text around query term matches and allowing the user to play a segment of the audio signal by selecting a word in the snippet. In other embodiments, markers are placed on a timeline that represents the duration of the audio signal. Each marker represents a query term match and when selected causes the audio signal to begin to play near the temporal location represented by the marker.
Type: Grant
Filed: April 10, 2006
Date of Patent: March 16, 2010
Assignee: Microsoft Corporation
Inventors: Roger Peng Yu, Frank Torsten Seide, Kaijiang Chen
-
Patent number: 7676373
Abstract: Displays a character string representing content of speech in synchronization with reproduction of the speech. An apparatus includes: a unit for obtaining scenario data representing the speech; a unit for dividing textual data resulting from recognition of the speech to generate pieces of recognition data; a unit for detecting in the scenario data a character matching each character contained in each piece of recognition data for which no matching character string has been detected, to detect in the scenario data a character string that matches the piece of recognition data; and a unit for setting the display timing of displaying each of the character strings contained in the scenario data to the timing at which speech recognized as the piece of recognition data that matches the character string is reproduced.
Type: Grant
Filed: June 2, 2008
Date of Patent: March 9, 2010
Assignee: Nuance Communications, Inc.
Inventors: Kohtaroh Miyamoto, Midori Shoji
-
Patent number: 7664649
Abstract: A control apparatus for enabling a user to communicate by speech with a processor-controlled apparatus, 1) controls a display of text data which includes a speech link that can be activated by a spoken command, 2) determines the location of a cursor displayed on a display from gaze input information, 3) changes a shape of the cursor when the cursor is located over the speech link, and 4) outputs a prompt identifying speech commands that can be used to activate the speech link when the cursor is displayed on the display in a changed state for a predetermined time located over the speech link.
Type: Grant
Filed: April 5, 2007
Date of Patent: February 16, 2010
Assignee: Canon Kabushiki Kaisha
Inventors: Uwe Helmut Jost, Yuan Shao
-
Patent number: 7650284
Abstract: A method, system and apparatus for enabling voice clicks in a multimodal page. In accordance with the present invention, a method for enabling voice clicks in a multimodal page can include toggling a display of indicia binding selected user interface elements in the multimodal page to corresponding voice logic; and processing a selection of the selected user interface elements in the multimodal page through different selection modalities. In particular, the toggling step can include toggling a display of both indexing indicia for the selected user interface elements, and also a text display indicating that a voice selection of the selected user interface elements is supported.
Type: Grant
Filed: November 19, 2004
Date of Patent: January 19, 2010
Assignee: Nuance Communications, Inc.
Inventors: Charles W. Cross, Marc White
-
Patent number: 7643999
Abstract: A system and method for positioning a software User Interface (UI) window on a display screen is provided, wherein the method includes displaying the software UI window on the display screen and identifying at least one suitable location on the display screen responsive to an active target window area of a target application UI window. The method further includes determining whether the software UI window is disposed at the at least one suitable location on the display screen and, if the software UI window is disposed in a location other than the at least one suitable location on the display screen, positioning the software UI window at the at least one suitable location on the display screen.
Type: Grant
Filed: November 24, 2004
Date of Patent: January 5, 2010
Assignee: Microsoft Corporation
Inventors: Robert L. Chambers, Oliver Scholz, Oscar E. Murillo, David Mowatt
-
Patent number: 7624019
Abstract: A system is configured to enable a user to assert voice-activated commands. When the user issues a non-ambiguous command, the system activates a corresponding control. The area of activity on the user interface is visually highlighted to emphasize to the user that what they spoke caused an action. In one specific embodiment, the highlighting involves floating text the user uttered to a visible user interface component.
Type: Grant
Filed: October 17, 2005
Date of Patent: November 24, 2009
Assignee: Microsoft Corporation
Inventor: Felix Andrew
-
Publication number: 20090281810
Abstract: A method of visually presenting audio signals includes receiving an audio signal to be presented; generating a predetermined number of discrete frequency components from the audio signal; assigning a graphical object to each of the frequency components, each of the graphical objects being specified by a geometrical shape, a position information and a size information; and displaying all of the graphical objects associated with all of the frequency components simultaneously on a graphic display. The system includes a microphone for generating audio signals; an audio interface unit for sampling the audio signals and transforming them into digital signals; a processing unit for translating digital signals into a predetermined number of discrete frequency components and for assigning a graphical object to each of the discrete frequency components; a video interface unit for generating a video signal; and a graphic display for displaying a sonogram based on the video signal.
Type: Application
Filed: June 25, 2007
Publication date: November 12, 2009
Applicant: Ave-Fon Kft.
Inventors: Istvan Sziklai, Istvan Hazman, Jozsef Imrek
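The pipeline this abstract describes (audio frame → fixed number of discrete frequency components → one graphical object per component, sized by the component's energy) can be sketched as below. The function name `frame_to_objects`, the band count, and the circle/position/size encoding are illustrative assumptions, not the applicant's actual implementation.

```python
import numpy as np

def frame_to_objects(frame, n_bands=8):
    """Reduce one audio frame to n_bands discrete frequency components
    and describe each as a graphical object (shape, position, size)."""
    spectrum = np.abs(np.fft.rfft(frame))
    # Split the magnitude spectrum into n_bands equal-width groups.
    bands = np.array_split(spectrum, n_bands)
    energies = np.array([b.mean() for b in bands])
    peak = energies.max() or 1.0          # avoid dividing by zero on silence
    return [{"shape": "circle",           # fixed geometry per component
             "x": i / n_bands,            # position encodes the band index
             "size": float(e / peak)}     # size encodes relative band energy
            for i, e in enumerate(energies)]

# A 440 Hz tone (8 kHz sampling) concentrates energy in the lowest band.
t = np.arange(1024) / 8000
objs = frame_to_objects(np.sin(2 * np.pi * 440 * t))
```

All the objects would then be drawn simultaneously each frame, giving the sonogram-like display the abstract mentions.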
-
Patent number: 7617108
Abstract: A vehicle mounted control apparatus includes a voice recognition section for recognizing a voice command input by a voice input device, and a control section that analyzes the cause when the voice command cannot be recognized by the voice recognition section and gives notice of the result of the analysis.
Type: Grant
Filed: December 12, 2003
Date of Patent: November 10, 2009
Assignee: Mitsubishi Denki Kabushiki Kaisha
Inventors: Tsutomu Matsubara, Masato Hirai
-
Publication number: 20090259475
Abstract: A text edit apparatus presents, based on language analysis information regarding a text, a portion of the text where voice quality may change when the text is read aloud, predicting the likelihood of the voice quality change and judging whether or not the voice quality change will occur.
Type: Application
Filed: June 5, 2006
Publication date: October 15, 2009
Inventors: Katsuyoshi Yamagami, Yumiko Kato, Shinobu Adachi
-
Arrangement for Creating and Using a Phonetic-Alphabet Representation of a Name of a Party to a Call
Publication number: 20090248421
Abstract: A first party creates and edits a phonetic-alphabet representation of its name. The phonetic representation is conveyed to a second party as “caller-identification” information by messages that set up a call between the parties. The phonetic representation of the name is displayed to the second party, converted to speech, and/or converted to an alphabet of a language of the second party and then displayed to the second party.
Type: Application
Filed: March 31, 2008
Publication date: October 1, 2009
Applicant: Avaya Inc.
Inventors: Paul Roller Michaelis, David Mohler, Charles Wrobel
-
Publication number: 20090231347
Abstract: Natural inter-viseme animation of a 3D head model driven by speech recognition is calculated by applying limitations to the velocity and/or acceleration of a normalized parameter vector, each element of which may be mapped to animation node outputs of a 3D model based on mesh blending and weighted by a mix of key frames.
Type: Application
Filed: March 6, 2009
Publication date: September 17, 2009
Inventor: Masanori Omote
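The core idea above — chase a sequence of target viseme vectors while capping the per-frame velocity and acceleration of the animated parameter vector — can be sketched as follows. `smooth_trajectory` and the limit values are hypothetical; the patent does not specify them.

```python
import numpy as np

def smooth_trajectory(targets, v_max=0.2, a_max=0.05):
    """Follow a sequence of target parameter vectors while clamping the
    per-frame velocity and acceleration of the output vector, so viseme
    transitions stay smooth instead of jumping between key poses."""
    pos = np.array(targets[0], dtype=float)
    vel = np.zeros_like(pos)
    out = [pos.copy()]
    for target in targets[1:]:
        desired_vel = np.asarray(target, dtype=float) - pos
        # Clamp the change of velocity (acceleration), then velocity itself.
        accel = np.clip(desired_vel - vel, -a_max, a_max)
        vel = np.clip(vel + accel, -v_max, v_max)
        pos = pos + vel
        out.append(pos.copy())
    return out
```

Each output vector would then be mapped onto the model's animation node outputs (mesh blend weights) exactly as an unsmoothed vector would be.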
-
Publication number: 20090228799
Abstract: A method for visualizing audio data corresponding to a piece of music, comprising the steps of: determining a structure of said piece of music based on said audio data, said structure comprising music structure segments each having a music structure segment length; allocating a predetermined graphical object to said piece of music, said graphical object having a predetermined size; segmenting said graphical object into graphical segments, wherein each graphical segment has a size representing said music structure segment length; and displaying said graphical object and said graphical segments on a display.
Type: Application
Filed: February 5, 2009
Publication date: September 10, 2009
Applicant: Sony Corporation
Inventors: Mathieu VERBEECK, Henning Solum
-
Patent number: 7574361
Abstract: A user interface for a communication device includes a light emitting diode (LED) (200) providing both a transmit-carrier indicator and transmit-audio feedback to the user. By varying the intensity (202, 204, 206) and/or color spectrum (302, 304, 306) of the LED (200) relative to changes in transmitted audio, the user is provided with transmit-audio feedback. If LED (200) is a bi-color LED, then receive-audio feedback can also be indicated to the user by varying the second color's intensity and/or spectrum.
Type: Grant
Filed: September 30, 2005
Date of Patent: August 11, 2009
Assignee: Motorola, Inc.
Inventors: David M. Yeager, Peter B. Gilmore, Deborah A. Gruenhagen, Charles E. Kline
-
Patent number: 7567908
Abstract: Differential dynamic content delivery including providing a session document for a presentation, wherein the session document includes a session grammar and a session structured document; selecting from the session structured document a classified structural element in dependence upon user classifications of a user participant in the presentation; presenting the selected structural element to the user; streaming presentation speech to the user including individual speech from at least one user participating in the presentation; converting the presentation speech to text; detecting whether the presentation speech contains simultaneous individual speech from two or more users; and displaying the text if the presentation speech contains simultaneous individual speech from two or more users.
Type: Grant
Filed: January 13, 2004
Date of Patent: July 28, 2009
Assignee: International Business Machines Corporation
Inventors: William Kress Bodin, Michael John Burkhart, Daniel G. Eisenhauer, Daniel Mark Schumacher, Thomas J. Watson
-
Publication number: 20090171667
Abstract: A method for assisting in the communication of a medical care provider and a patient is disclosed. The method may include displaying a first display section, the first display section including a plurality of anatomical features, each anatomical feature associated with an indicia indicating the location of the anatomical feature, the anatomical feature also associated with a first name provided in a first language and a second name provided in a second language. The method may also include displaying a second display section, the second display section including a plurality of questions relating to patient intake, where each question is provided in the first language and the second language.
Type: Application
Filed: December 28, 2007
Publication date: July 2, 2009
Inventor: Carmen Hansen Rivera
-
Patent number: 7539618
Abstract: A system for operating an electronic device enabling the same agent software to be used in common among a plurality of devices, where a car navigation system or audio system, when the agent software and voice recognition engine are transferred from a portable data terminal, runs the transferred agent software so as to display a simulated human animated character which converses with a user, recognizes speech obtained from that conversation by a voice recognition engine, prepares script reflecting the content of the conversation, and executes the prepared script to perform predetermined processing.
Type: Grant
Filed: November 22, 2005
Date of Patent: May 26, 2009
Assignee: DENSO CORPORATION
Inventor: Ichiro Yoshida
-
Publication number: 20090132257
Abstract: A system and a method for inputting edited translation words or sentence are provided to solve the problem that editing translation words or sentence and inputting the edited translation words or sentence cannot be performed by successive actions. In this system and method, the input words or sentence input into an input region by a user are intercepted and translated into translation words or sentence, and a function of editing the translation words or sentence is provided in a display region for displaying the translation words or sentence, thereby achieving the efficacy of inputting the edited translation words or sentence into the input region directly.
Type: Application
Filed: November 19, 2007
Publication date: May 21, 2009
Applicant: INVENTEC CORPORATION
Inventors: Chaucer Chiu, Jenny Xu
-
Publication number: 20090125312
Abstract: Disclosed is a method for providing by a news information-providing server news information using a 3D character to a wireless communication terminal having accessed the news information-providing server through a wireless communication network, the method including the steps of: (a) generating voice information by converting news information received in real-time into voice data, and analyzing content of the voice information; (b) extracting mouth shape data and facial expression data corresponding to the content of the voice information analyzed at step (a); (c) applying the mouth shape data and facial expression data to the 3D character, and generating 3D character data by synthesizing the 3D character with a background image and/or background music; (d) generating 3D character news by synchronizing the voice information with the 3D character data; and (e) transmitting the 3D character news to the wireless communication terminal in a streaming mode.
Type: Application
Filed: February 15, 2006
Publication date: May 14, 2009
Applicant: SK TELECOM CO., LTD.
Inventors: Inseong HWANG, Jongmin KIM, Hoojong KIM, Wonhee SULL
-
Publication number: 20090099850
Abstract: Methods, apparatus, and products are disclosed for displaying speech for a user of a surface computer, the surface computer comprising a surface, the surface computer capable of receiving multi-touch input through the surface and rendering display output on the surface, that include: registering, by the surface computer, a plurality of users with the surface computer; allocating, by the surface computer to each registered user, a portion of the surface for interaction between that registered user and the surface computer; detecting, by the surface computer, a speech utterance from one of the plurality of users; determining, by the surface computer using a speech engine, speech text in dependence upon the speech utterance; creating, by the surface computer, display text in dependence upon the speech text; and rendering, by the surface computer, the display text on at least one of the allocated portions of the surface.
Type: Application
Filed: October 10, 2007
Publication date: April 16, 2009
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Lydia M. Do, Pamela A. Nesbitt, Lisa A. Seacat
-
Patent number: 7505911
Abstract: A handheld device with both large-vocabulary speech recognition and audio recording allows users to switch between at least two of the following three modes: (1) recording audio without corresponding speech recognition; (2) recording with speech recognition; and (3) speech recognition without audio recording. A handheld device with both large-vocabulary speech recognition and audio recording enables a user to select a portion of previously recorded sound and have speech recognition performed upon it. A system enables a user to search for a text label associated with portions of unrecognized recorded sound by uttering the label's words. A large-vocabulary system allows users to switch between playing back recorded audio and speech recognition with a single input, with successive audio playbacks automatically starting slightly before the end of prior playback. Also described is a cell phone that allows both large-vocabulary speech recognition and audio recording and playback.
Type: Grant
Filed: December 5, 2004
Date of Patent: March 17, 2009
Inventors: Daniel L. Roth, Jordan R. Cohen, David F. Johnston, Edward W. Porter
-
Publication number: 20090007346
Abstract: A method for controlling an information display using an avatar of a washing machine is disclosed, which displays all information associated with usage- and control-information of the washing machine via the avatar, and allows a user of the washing machine to easily recognize the usage- and control-information of the washing machine, resulting in increased information transmission characteristics of the washing machine. Basic usages of the washing machine, and various methods for displaying operation- and control-states of the washing machine, will be indicated by at least one dynamic character, such that a user of the washing machine can easily recognize necessary information of the washing machine, resulting in not only increased information transmission characteristics of the washing machine but also emphasized entertainment elements required by modern consumers of the washing machine.
Type: Application
Filed: June 27, 2006
Publication date: January 8, 2009
Applicant: LG ELECTRONICS INC.
Inventors: Mi Kyung Ha, Gyeong Ho Moon, Sang Su Lee
-
Publication number: 20080319755
Abstract: According to an aspect of an embodiment, an apparatus for converting text data into a sound signal comprises: a phoneme determiner for determining phoneme data corresponding to a plurality of phonemes and pause data corresponding to a plurality of pauses to be inserted among a series of phonemes in the text data to be converted into a sound signal; a phoneme length adjuster for modifying the phoneme data and the pause data by determining lengths of the phonemes, respectively, in accordance with a speed of the sound signal and selectively adjusting the length of at least one of the phonemes which is placed immediately after one of the pauses so that the at least one of the phonemes is relatively extended timewise as compared to other phonemes; and an output unit for outputting a sound signal on the basis of the phoneme data and pause data adjusted by the phoneme length adjuster.
Type: Application
Filed: June 24, 2008
Publication date: December 25, 2008
Applicant: FUJITSU LIMITED
Inventors: Rika Nishiike, Hitoshi Sasaki, Nobuyuki Katae, Kentaro Murase, Takuya Noda
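The adjuster this abstract describes — scale every phoneme length by the speaking speed, then stretch the first phoneme after each pause — can be sketched as below. The function name, the `(kind, length)` representation, and the stretch factor are assumptions for illustration only.

```python
def adjust_phoneme_lengths(units, speed=1.0, stretch=1.3):
    """Scale phoneme/pause lengths (ms) by speaking speed, and extend the
    phoneme placed immediately after a pause so that it is relatively
    longer timewise than the other phonemes."""
    out = []
    after_pause = False
    for kind, length in units:          # kind is "phoneme" or "pause"
        length = length / speed
        if kind == "phoneme" and after_pause:
            length *= stretch           # selective post-pause extension
        out.append((kind, length))
        after_pause = (kind == "pause")
    return out

# The phoneme following the pause gets stretched; the others do not.
adjusted = adjust_phoneme_lengths(
    [("phoneme", 100), ("pause", 50), ("phoneme", 100), ("phoneme", 100)])
```

An output unit would then synthesize the sound signal from the adjusted phoneme and pause data.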
-
Publication number: 20080312930
Abstract: According to MPEG-4's TTS architecture, facial animation can be driven by two streams simultaneously — text, and Facial Animation Parameters. In this architecture, text input is sent to a Text-To-Speech converter at a decoder that drives the mouth shapes of the face. Facial Animation Parameters are sent from an encoder to the face over the communication channel. The present invention includes codes (known as bookmarks) in the text string transmitted to the Text-to-Speech converter, which bookmarks are placed between words as well as inside them. According to the present invention, the bookmarks carry an encoder time stamp. Due to the nature of text-to-speech conversion, the encoder time stamp does not relate to real-world time, and should be interpreted as a counter. In addition, the Facial Animation Parameter stream carries the same encoder time stamp found in the bookmark of the text.
Type: Application
Filed: August 18, 2008
Publication date: December 18, 2008
Applicant: AT&T Corp.
Inventors: Andrea Basso, Mark Charles Beutnagel, Joern Ostermann
-
Publication number: 20080294431
Abstract: Displays a character string representing content of speech in synchronization with reproduction of the speech. An apparatus includes: a unit for obtaining scenario data representing the speech; a unit for dividing textual data resulting from recognition of the speech to generate pieces of recognition data; a unit for detecting in the scenario data a character matching each character contained in each piece of recognition data for which no matching character string has been detected, to detect in the scenario data a character string that matches the piece of recognition data; and a unit for setting the display timing of displaying each of the character strings contained in the scenario data to the timing at which speech recognized as the piece of recognition data that matches the character string is reproduced.
Type: Application
Filed: June 2, 2008
Publication date: November 27, 2008
Inventors: Kohtaroh Miyamoto, Midori Shoji
-
Patent number: 7457756
Abstract: A method of generating a time-frequency representation of a signal that preserves phase information by receiving the signal, calculating a joint time-frequency domain of the signal, estimating instantaneous frequencies of the joint time-frequency domain, modifying each estimated instantaneous frequency, if necessary, to correspond to a frequency of the joint time-frequency domain to which it most closely compares, redistributing the elements within the joint time-frequency domain according to the estimated instantaneous frequencies as modified, computing a magnitude for each element in the joint time-frequency domain as redistributed, and plotting the results as the time-frequency representation of the signal.
Type: Grant
Filed: June 9, 2005
Date of Patent: November 25, 2008
Assignee: The United States of America as represented by the Director of the National Security Agency
Inventors: Douglas J. Nelson, David Charles Smith
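The steps in this abstract (joint time-frequency domain → instantaneous frequency estimates → snap to the nearest bin → redistribute magnitudes) resemble frequency reassignment. A simplified sketch, using a phase-vocoder-style instantaneous frequency estimate from the phase advance between STFT frames, is below; the patent's exact estimator is not specified here, and the function name and parameters are assumptions.

```python
import numpy as np

def reassigned_spectrogram(x, n_fft=256, hop=64):
    """Time-frequency representation whose magnitudes are moved to the
    bin of their estimated instantaneous frequency (simplified sketch
    of phase-based frequency reassignment)."""
    win = np.hanning(n_fft)
    frames = [x[i:i + n_fft] * win
              for i in range(0, len(x) - n_fft, hop)]
    stft = np.array([np.fft.rfft(f) for f in frames])    # (time, freq)
    n_bins = stft.shape[1]
    out = np.zeros(stft.shape, dtype=float)
    k = np.arange(n_bins)
    expected = 2 * np.pi * k * hop / n_fft               # nominal phase advance
    for t in range(1, stft.shape[0]):
        # Instantaneous frequency from the measured phase advance.
        dphi = np.angle(stft[t]) - np.angle(stft[t - 1])
        dev = np.angle(np.exp(1j * (dphi - expected)))   # wrap to (-pi, pi]
        inst_bin = k + dev * n_fft / (2 * np.pi * hop)
        # Snap each estimate to the nearest bin and move the magnitude there.
        target = np.clip(np.rint(inst_bin).astype(int), 0, n_bins - 1)
        np.add.at(out[t], target, np.abs(stft[t]))
    return out
```

For a pure tone, the smeared energy of neighboring bins collapses onto the bin nearest the true frequency, sharpening the plot.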
-
Publication number: 20080228497
Abstract: The invention describes a method for communication by means of a communication device (DS), in which synthesized speech (ss) is output from the communication device (DS), and in which light signals (ls) are output simultaneously with the synthesized speech (ss) in accordance with the semantic content of the synthesized speech (ss). Furthermore, an appropriate communication device (DS) is described.
Type: Application
Filed: July 3, 2006
Publication date: September 18, 2008
Applicant: KONINKLIJKE PHILIPS ELECTRONICS, N.V.
Inventors: Thomas Portele, Holger R. Scholl
-
Publication number: 20080228498
Abstract: A portable information system that, when activated, presents certain data to a user, wherein the information is prepositioned within the portable system or provided to it. The information may be resident within fixed memory, added by means of a smart card, or wirelessly transmitted to the information system. The information conveyed to the user may be visual, such as a text screen or a video display, as well as audible, such as a play-by-play broadcast. The information may include local areas of interest or locations within the venue in which the apparatus is used.
Type: Application
Filed: March 14, 2008
Publication date: September 18, 2008
Inventor: Samuel N. Gasque
-
Publication number: 20080221904
Abstract: A method for generating animated sequences of talking heads in text-to-speech applications wherein a processor samples a plurality of frames comprising image samples. The processor reads first data comprising one or more parameters associated with noise-producing orifice images of sequences of at least three concatenated phonemes which correspond to an input stimulus. The processor reads, based on the first data, second data comprising images of a noise-producing entity. The processor generates an animated sequence of the noise-producing entity.
Type: Application
Filed: May 19, 2008
Publication date: September 11, 2008
Applicant: AT&T Corp.
Inventors: Eric Cosatto, Hans Peter Graf, Juergen Schroeter
-
Patent number: 7406409
Abstract: A system and method summarizes multimedia stored in a compressed multimedia file partitioned into a sequence of segments, where the content of the multimedia is, for example, video signals, audio signals, text, and binary data. An associated metadata file includes index information and an importance level for each segment. The importance information is continuous over a closed interval. An importance level threshold is selected in the closed interval, and only segments of the multimedia having a particular importance level greater than the importance level threshold are reproduced. The importance level can also be determined for fixed-length windows of multiple segments, or a sliding window. Furthermore, the importance level can be weighted by a factor, such as the audio volume.
Type: Grant
Filed: February 13, 2004
Date of Patent: July 29, 2008
Assignee: Mitsubishi Electric Research Laboratories, Inc.
Inventors: Isao Otsuka, Ajay Divakaran, Masaharu Ogawa, Kazuhiko Nakane
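The selection rule in this abstract — reproduce only segments whose continuous importance level exceeds a threshold chosen on the closed interval — is simple enough to sketch directly. The segment dictionary layout and function name are illustrative assumptions.

```python
def summarize(segments, threshold):
    """Keep only the segments whose importance level (continuous on a
    closed interval such as [0, 1]) exceeds the chosen threshold."""
    return [s for s in segments if s["importance"] > threshold]

# Hypothetical metadata for three segments of a multimedia file.
segments = [
    {"start": 0.0,  "end": 10.0, "importance": 0.2},
    {"start": 10.0, "end": 25.0, "importance": 0.9},
    {"start": 25.0, "end": 40.0, "importance": 0.6},
]
summary = summarize(segments, threshold=0.5)
```

Raising the threshold shortens the summary; the weighting mentioned in the abstract would simply scale each segment's importance (e.g. by audio volume) before the comparison.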
-
Patent number: 7386437
Abstract: A vehicle mounted translation system for providing language translation to a driver of a vehicle. The translation system may be associated with a vehicle navigation system. The translation system includes a translation device and a storage unit for storing language and translation information. The system further includes the ability to enter information to be translated into the system, data processing for retrieving a translation from storage based on the input of the first information, and the ability to provide the retrieved translation to the driver. Output of the translated information may be accomplished by speech-to-speech and/or text-to-speech conversion of words, and/or a text or image output to a visual display.
Type: Grant
Filed: August 13, 2004
Date of Patent: June 10, 2008
Assignee: Harman Becker Automotive Systems GmbH
Inventor: Christian Brülle-Drews
-
Publication number: 20080114594
Abstract: The present invention generally relates to a method and system for providing VoIP clients with the ability to confirm accuracy in conversation data over a digital communication channel. More specifically, a method and system is provided for verifying, via a visual representation, whether a receiving client captures accurate information from a particular portion of the digital voice conversation. In response to a triggering event, a visual representation, including information extracted from the particular portion of the digital voice conversation, may be generated for verifying the accuracy. Based on the needs of the clients engaging in the conversation, one or more visual representations and corresponding verifications can be exchanged. In this manner, a multi-tiered oral agreement with authentication may be generated over a digital communication channel.
Type: Application
Filed: November 14, 2006
Publication date: May 15, 2008
Applicant: MICROSOFT CORPORATION
Inventors: Scott C. Forbes, Linda Criddle, David Milstein, Lon-Chan Chu, Kuansan Wang, David A. Howell
-
Patent number: 7366671
Abstract: A speech displaying system and method can display playing progress by waveform and synchronously display text of a speech file using rolling subtitles when playing the speech file. After the speech file is loaded via a loading module, a sentence unit determining module partitions content of the speech file into a plurality of sentence units to produce a list of sentence units. A subtitle highlighting speed calculating module calculates a speed of highlighting every single letter or character contained in the subtitles in the sentence unit index for a sentence unit. A subtitle rolling module displays content of the list of sentence units. When the speech file is played, the subtitles in the sentence unit index are clearly marked, and every letter or character of the subtitles is highlighted. A waveform displaying module marks positions of sentence pauses and playing progress on an oscillogram for the speech file by lines.
Type: Grant
Filed: December 6, 2004
Date of Patent: April 29, 2008
Assignee: Inventec Corporation
Inventors: Jenny Xu, Chaucer Chiu
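The "subtitle highlighting speed" the abstract mentions reduces to dividing a sentence unit's playback interval evenly across its characters. A minimal sketch, assuming a per-character schedule representation that the patent does not actually specify:

```python
def highlight_schedule(sentence, start_time, end_time):
    """Assign each character of a sentence unit an equal share of the
    unit's playback interval, yielding (char, highlight_time) pairs so
    the rolling subtitle can highlight characters in sync with audio."""
    per_char = (end_time - start_time) / len(sentence)
    return [(ch, start_time + i * per_char)
            for i, ch in enumerate(sentence)]

# A 5-character unit played over one second highlights a character
# every 0.2 s.
schedule = highlight_schedule("hello", 0.0, 1.0)
```

The sentence-unit boundaries themselves would come from the partitioning module, and the same times can drive the pause markers drawn on the waveform display.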
-
Patent number: 7366670
Abstract: Facial animation in MPEG-4 can be driven by a text stream and a Facial Animation Parameters (FAP) stream. Text input is sent to a TTS converter that drives the mouth shapes of the face. FAPs are sent from an encoder to the face over the communication channel. Disclosed are codes (bookmarks) in the text string transmitted to the TTS converter. Bookmarks are placed between and inside words and carry an encoder time stamp. The encoder time stamp does not relate to real-world time. The FAP stream carries the same encoder time stamp found in the bookmark of the text. The system reads the bookmark and provides the encoder time stamp as well as a real-time time stamp to the facial animation system. The facial animation system associates the correct facial animation parameter with the real-time time stamp using the encoder time stamp of the bookmark as a reference.
Type: Grant
Filed: August 11, 2006
Date of Patent: April 29, 2008
Assignee: AT&T Corp.
Inventors: Andrea Basso, Mark Charles Beutnagel, Joern Ostermann
-
Patent number: 7356470
Abstract: A multi-mail system and method is disclosed in which a sender may convey and a recipient can realize emotional aspects associated with substantive content of a multi-mail message by receiving a message that is more than textual in nature. Voice recognition technology and programmatic relation of sound and graphics may be used to produce a talking image. In one embodiment, the image may include the user's own visual and/or audio likeness. In an alternate embodiment, the image may comprise any available visual and/or audio display selected by the user. The multi-mail message may be inputted by a user in a text format and transposed into a format including the selected image and/or voice. In an alternate embodiment, a spoken message may be converted into a format including the selected image and/or voice. The formatted messages are then stored and/or transmitted via an email system or some other electronic network.
Type: Grant
Filed: October 18, 2005
Date of Patent: April 8, 2008
Inventors: Adam Roth, Geoffrey O'Sullivan, Barclay A. Dunn
-
Patent number: 7353177
Abstract: A system and method of controlling the movement of a virtual agent while the agent is listening to a human user during a conversation is disclosed. The method comprises receiving speech data from the user, performing a prosodic analysis of the speech data, and controlling the virtual agent movement according to the prosodic analysis.
Type: Grant
Filed: September 28, 2005
Date of Patent: April 1, 2008
Assignee: AT&T Corp.
Inventors: Eric Cosatto, Hans Peter Graf, Thomas M. Isaacson, Volker Franz Storm
-
Patent number: 7349852
Abstract: A system and method of controlling the movement of a virtual agent while the agent is listening to a human user during a conversation is disclosed. The method comprises receiving speech data from the user, performing a prosodic analysis of the speech data, and controlling the virtual agent movement according to the prosodic analysis.
Type: Grant
Filed: September 28, 2005
Date of Patent: March 25, 2008
Assignee: AT&T Corp.
Inventors: Eric Cosatto, Peter Graf Hans, Thomas M. Isaacson, Franz Storm Volker
-
Patent number: 7349851Abstract: A speech recognition system having a user interface that provides both visual and auditory feedback to a user is described. In one aspect, a response time in which to receive an audible utterance is initiated. A graphic representing the response time is displayed. A first sound is played when an audible utterance is recognized. The graphic is changed to indicate passage of the response time such that the graphic diminishes in size from an original size with the passage of time. Responsive to recognizing an utterance, the graphic is presented in the original size. Responsive to expiration of the response time before the audible utterance has been recognized, a second sound is emitted to indicate that the speech recognition system has entered a dormant state.Type: GrantFiled: March 21, 2005Date of Patent: March 25, 2008Assignee: Microsoft CorporationInventors: Sarah E. Zuberec, Cynthia DuVal, Benjamin N. Rabelos
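The diminishing response-time graphic lends itself to a small sketch. This is an illustrative reconstruction only (the text-bar rendering, window length, and function names are assumptions, not taken from the patent): the indicator shrinks from its original size as the response window elapses, and would be redrawn at full size whenever an utterance is recognized.

```python
# Hedged sketch of the shrinking response-time indicator; a text "bar"
# stands in for the on-screen graphic.
def bar(remaining_fraction, width=20):
    """Render the indicator at a given fraction of its original size."""
    filled = round(width * remaining_fraction)
    return "[" + "#" * filled + " " * (width - filled) + "]"

def tick(elapsed, window=5.0):
    """Indicator state after `elapsed` seconds of the response window."""
    remaining = max(0.0, 1.0 - elapsed / window)
    return bar(remaining)

print(tick(0.0))  # full bar at the start of the response window
print(tick(2.5))  # half the window has passed; the graphic has shrunk
print(tick(5.0))  # window expired: empty bar; the second sound would play
```

On recognition, the system would simply call `tick(0.0)` again, restoring the graphic to its original size, as the abstract describes.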
-
Patent number: 7333865Abstract: The invention aligns two wide-bandwidth, high-resolution data streams, in a manner that retains the full bandwidth of the data streams, by using magnitude-only spectrograms as inputs to the cross-correlation and sampling the cross-correlation at a coarse sampling rate that is the final alignment quantization period. The invention also enables selection of stable and distinctive audio segments for cross-correlation by evaluating the energy in local audio segments and the variance in energy among nearby audio segments.Type: GrantFiled: January 3, 2006Date of Patent: February 19, 2008Assignee: YesVideo, Inc.Inventors: Michele M. Covell, Harold G. Sampson
-
Publication number: 20080027731Abstract: A computerized method of teaching spoken language skills includes receiving multiple user utterances into a computer system, receiving criteria for pronunciation errors, analyzing the user utterances to detect pronunciation errors according to basic sound units and pronunciation error criteria, and providing feedback to the user in accordance with the analysis.Type: ApplicationFiled: April 12, 2005Publication date: January 31, 2008Applicant: Burlington English Ltd.Inventor: Zeev Shpiro
-
Patent number: 7321854Abstract: The present method incorporates audio and visual cues from human gesticulation for automatic recognition. The methodology articulates a framework for co-analyzing gestures and prosodic elements of a person's speech, and can be applied to a wide range of algorithms involving analysis of gesticulating individuals. Examples of interactive technology applications range from information kiosks to personal computers. The video analysis of human activity provides a basis for the development of automated surveillance technologies in public places such as airports, shopping malls, and sporting events.Type: GrantFiled: September 19, 2003Date of Patent: January 22, 2008Assignee: The Penn State Research FoundationInventors: Rajeev Sharma, Mohammed Yeasin, Sanshzar Kettebekov
-
Publication number: 20070288243Abstract: A game machine has an LCD, a cross key, and a memory. The memory stores a plurality of words and a plurality of translations, each corresponding to one of the words. A computer of the game machine displays at least one of the words on the LCD and, in response to an operation of a certain push portion of the cross key, replaces the currently displayed word with another word. While an operation of another push portion of the cross key is held, the translation of the currently displayed word is displayed in place of the word; when that operation is cancelled, the word that was displayed before the operation is restored in place of the translation.Type: ApplicationFiled: April 6, 2007Publication date: December 13, 2007Applicant: Nintendo Co., Ltd.Inventors: Shinya Takahashi, Toshiaki Suzuki
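The hold-to-reveal behavior reduces to a small state machine. A minimal sketch, with assumed names (`WordViewer`, `press_translate`, the sample vocabulary) standing in for the cross-key push portions described in the abstract:

```python
# Hedged sketch: while the "translate" push portion is held, the translation
# replaces the word on screen; releasing it restores the word.
WORDS = {"inu": "dog", "neko": "cat"}  # hypothetical word/translation pairs

class WordViewer:
    def __init__(self, words):
        self.entries = list(words.items())
        self.index = 0
        self.held = False

    def next_word(self):
        """Cross-key press: step the display to another word."""
        self.index = (self.index + 1) % len(self.entries)

    def press_translate(self):
        self.held = True

    def release_translate(self):
        self.held = False

    @property
    def display(self):
        """What the LCD currently shows."""
        word, translation = self.entries[self.index]
        return translation if self.held else word

v = WordViewer(WORDS)
print(v.display)        # "inu"
v.press_translate()
print(v.display)        # "dog"  (shown only while the button is held)
v.release_translate()
v.next_word()
print(v.display)        # "neko"
```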
-
Patent number: 7299188Abstract: A method and apparatus for generating a pronunciation score by receiving a user phrase intended to conform to a reference phrase and processing the user phrase in accordance with at least one of an articulation-scoring engine, a duration scoring engine and an intonation-scoring engine to derive thereby the pronunciation score. The scores provided by the various scoring engines are adapted to provide a visual and/or numerical feedback that provides information pertaining to correctness or incorrectness in one or more speech-features such as intonation, articulation, voicing, phoneme error and relative word duration. Such useful interactive feedback will allow a user to quickly identify the problem area and take remedial action in reciting “tutor” sentences or phrases.Type: GrantFiled: February 10, 2003Date of Patent: November 20, 2007Assignee: Lucent Technologies Inc.Inventors: Sunil K. Gupta, ZiYi Lu, Prabhu Raghavan, Zulfiquar Sayeed, Aravind Sethuraman, Chetan Vinchhi
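The three-engine structure can be sketched as follows. The engine internals here are trivial placeholders, and the weights, feature names, and per-feature formulas are assumptions rather than Lucent's models; the point is the combination of articulation, duration, and intonation scores into an overall score plus per-feature feedback.

```python
# Hedged sketch of the scoring structure only; not the patented engines.
def articulation_score(user, ref):
    # placeholder: fraction of phonemes matching position-for-position
    hits = sum(u == r for u, r in zip(user["phonemes"], ref["phonemes"]))
    return hits / max(len(ref["phonemes"]), 1)

def duration_score(user, ref):
    # placeholder: penalize relative word-duration error
    err = abs(user["duration"] - ref["duration"]) / ref["duration"]
    return max(0.0, 1.0 - err)

def intonation_score(user, ref):
    # placeholder: penalize relative mean-pitch deviation
    err = abs(user["mean_pitch"] - ref["mean_pitch"]) / ref["mean_pitch"]
    return max(0.0, 1.0 - err)

def pronunciation_score(user, ref, weights=(0.5, 0.25, 0.25)):
    """Combine the three engines; also name the weakest feature as feedback."""
    scores = {
        "articulation": articulation_score(user, ref),
        "duration": duration_score(user, ref),
        "intonation": intonation_score(user, ref),
    }
    overall = sum(w * s for w, s in zip(weights, scores.values()))
    weakest = min(scores, key=scores.get)  # the problem area to flag
    return overall, scores, weakest

ref = {"phonemes": ["t", "uw", "t", "er"], "duration": 0.5, "mean_pitch": 180}
user = {"phonemes": ["t", "uw", "d", "er"], "duration": 0.6, "mean_pitch": 180}
overall, scores, weakest = pronunciation_score(user, ref)
print(round(overall, 3), weakest)  # → 0.825 articulation
```

The `weakest` label is the kind of targeted feedback the abstract describes: it tells the user which speech feature to work on when reciting the tutor phrase again.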
-
Patent number: 7292986Abstract: A graphical user interface provides a graphical volume meter indicating the volume of the user's speech and a speech recognition meter showing the progress of a speech recognizer. The graphical volume meter and recognition meter are both located near each other on the display such that the user can focus on both meters at the same time. One aspect of the present invention is that a speech recognition meter is placed on the display near the insertion point where the user intends their speech to take effect. Thus, the user does not have to divert their view from the insertion point in order to check the progress of the speech recognizer.Type: GrantFiled: October 20, 1999Date of Patent: November 6, 2007Assignee: Microsoft CorporationInventors: Daniel S. Venolia, Scott D. Quinn
-
Patent number: 7272563Abstract: A quiet user interface lets a user conduct a telephone conversation without speaking. It does this by moving the participant in the public situation to a quiet mode of communication (e.g., keyboard, buttons, touchscreen), while all the other participants continue using their usual audible technology (e.g., telephones) over the existing telecommunications infrastructure. The quiet user interface transforms the user's silent input selections into equivalent audible signals that may be transmitted directly to the other parties in the conversation.Type: GrantFiled: December 30, 2005Date of Patent: September 18, 2007Assignee: Fuji Xerox Co., Ltd.Inventor: Lester D. Nelson
-
Patent number: 7240009Abstract: A control apparatus controls communication between a user and at least one processor-controlled device, such as a printer or copier, capable of carrying out at least one task. The control apparatus includes a processor configured to: conduct a dialog with the user to determine the task that the user wishes the device to carry out; instruct the device to carry out the determined task; receive event information related to events; determine whether the user is involved with another task when the event information is received; identify interrupt status information associated with at least one of the event for which event information is received and said other task; determine whether or not the user can be interrupted on the basis of the identified interrupt status information; and advise the user of received event information.Type: GrantFiled: September 25, 2001Date of Patent: July 3, 2007Assignee: Canon Kabushiki KaishaInventors: Uwe Helmut Jost, Yuan Shao
-
Patent number: 7227994Abstract: A method and apparatus for finding a reference pattern (RP) with K elements embedded in an input pattern (IP) with repeating substrings uses dual pointers to point to elements in the RP to compare with input elements sequentially clocked from the IP. The dual pointers are loaded with a pointer address corresponding to the first reference element in the RP, and the pointer addresses are either incremented to the next position or reset back to the address of the first reference element in response to results of comparing the reference element they access to the presently clocked input element and results of comparing their respective pointer addresses.Type: GrantFiled: March 20, 2003Date of Patent: June 5, 2007Assignee: International Business Machines CorporationInventors: Kerry A. Kravec, Ali G. Saidi, Jan M. Slyfield, Pascal R. Tannhof
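The pointer scheme can be sketched in a few lines. This is a hedged simplification, not the patented increment/reset comparison logic: a small fixed pool of pointers into the reference pattern advances on matches and is dropped on mismatches, and each incoming input element may load a fresh pointer at the first reference element, which is what catches overlapping repeating substrings.

```python
# Hedged sketch of streaming pattern search with a capped pointer pool,
# echoing (but not reproducing) the patent's dual-pointer scheme.
def find_pattern(reference, stream):
    """Return offsets in `stream` at which `reference` finishes matching."""
    k = len(reference)
    pointers = []                    # in-flight addresses into `reference`
    matches = []
    for i, elem in enumerate(stream):
        survivors = []
        for p in pointers + [0]:     # [0]: each element may start a match
            if elem == reference[p]:
                if p + 1 == k:
                    matches.append(i)    # pattern completed at offset i
                elif p + 1 not in survivors:
                    survivors.append(p + 1)
            # on a mismatch the pointer is simply dropped (reset)
        pointers = survivors[:2]     # keep at most two, as in the abstract
    return matches

print(find_pattern("aba", "ababa"))  # overlapping matches end at 2 and 4
```

A single pointer that resets to zero on mismatch would miss the second, overlapping occurrence; the spare pointer is what makes repeating substrings tractable.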
-
Control apparatus for enabling a user to communicate by speech with a processor-controlled apparatus
Patent number: 7212971Abstract: A control apparatus controls the display of text data which includes a speech link that can be activated by spoken command. The shape of a pointing device cursor displayed on a display is then changed by the apparatus when the pointing device cursor is located over the speech link included in displayed text data. The apparatus is arranged to output a prompt identifying speech commands that can be used to activate the speech link if the pointing device cursor is displayed on a display located over the speech link in a changed state for a predetermined time.Type: GrantFiled: December 18, 2002Date of Patent: May 1, 2007Assignee: Canon Kabushiki KaishaInventors: Uwe Helmut Jost, Yuan Shao
-
Patent number: 7197455Abstract: A client 2 includes a transmission unit 2d for transmitting input speech information over a network to a server system 3 and an output unit 2b for receiving contents selection information from the server system 3 over the network and outputting the received information. The server system 3 includes a prepared-information storage unit 9b for storing one or more pieces of preparation information pertinent to each piece of content, and an information preparing server 7 for preparing the contents selection information based on the speech information received from the client 2 over the network and on the preparation information, and for sending the prepared contents selection information to the client 2 over the network.Type: GrantFiled: March 3, 2000Date of Patent: March 27, 2007Assignee: Sony CorporationInventors: Fukuharu Sudo, Makoto Akabane, Toshitada Doi
-
Patent number: RE41002Abstract: An electronic communications system for the deaf includes a video apparatus for observing and digitizing the facial, body and hand and finger signing motions of a deaf person, an electronic translator for translating the digitized signing motions into words and phrases, and an electronic output for the words and phrases. The video apparatus desirably includes both a video camera and a video display which will display signing motions provided by translating spoken words of a hearing person into digitized images. The system may function as a translator by outputting the translated words and phrases as synthetic speech at the deaf person's location for another person at that location, and that person's speech may be picked up, translated, and displayed as signing motions on a display in the video apparatus.Type: GrantFiled: June 23, 2000Date of Patent: November 24, 2009Inventor: Raanan Liebermann