Transformation Of Speech Into A Nonaudible Representation, E.g., Speech Visualization, Speech Processing For Tactile Aids, Etc. (epo) Patents (Class 704/E21.019)
  • Patent number: 8766765
    Abstract: A device, method, and computer program product that provide tactile feedback to a visually impaired person to assist that person in maintaining eye contact with another person during conversation. The eyeglass-based tactile response device includes a frame with sensors mounted to it and a track that interconnects the eyepieces. In one example, a motor and a wheel are coupled to the track and are driven along the track by an amount that indicates to the user how far he or she should turn his or her head.
    Type: Grant
    Filed: September 14, 2012
    Date of Patent: July 1, 2014
    Inventor: Hassan Wael Hamadallah
  • Patent number: 8547208
    Abstract: An electronic device includes a storage unit, a sound signal receiving unit, a remote signal generating unit, a processing unit, and a transmitting unit. The storage unit stores a control table that records RC devices, sound parameters, and control commands. Each RC device is associated with at least one sound parameter, and each sound parameter corresponds to one control command. The sound signal receiving unit receives sound signals. The processing unit determines a sound parameter according to the received sound signals, searches the control table for the control command corresponding to that sound parameter for a selected RC device, and controls the remote signal generating unit to generate a remote signal corresponding to the determined control command. The transmitting unit transmits the remote signal to the selected RC device.
    Type: Grant
    Filed: December 1, 2010
    Date of Patent: October 1, 2013
    Assignee: Hon Hai Precision Industry Co., Ltd.
    Inventors: Ping-Yang Chuang, Ying-Chuan Yu
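The control-table lookup this abstract describes (each RC device associated with sound parameters, each parameter mapped to one command) can be sketched as a nested mapping. The device names, parameters, and commands below are invented for illustration:

```python
from typing import Optional

# Hypothetical control table: RC device -> sound parameter -> command.
CONTROL_TABLE = {
    "tv": {"clap": "power_toggle", "whistle": "mute"},
    "fan": {"clap": "speed_up", "whistle": "power_toggle"},
}

def resolve_command(selected_device: str, sound_parameter: str) -> Optional[str]:
    """Search the control table for the command that the determined
    sound parameter corresponds to, given the selected RC device."""
    return CONTROL_TABLE.get(selected_device, {}).get(sound_parameter)
```

A real implementation would first derive the sound parameter (for example, frequency, duration, or amplitude features) from the received sound signal before performing the lookup.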
  • Publication number: 20130183944
    Abstract: Embodiments of the present invention are directed toward systems, methods, and devices for improving information access and device control in a home automation environment. Functionality of multiple household devices, such as lights, sound, entertainment, HVAC, and communication devices, can be activated via voice commands. The voice commands are detected by a nearby control device and relayed via a network communication medium to the control device to which the desired device or system is connected. Each control device, disposed throughout the home, can detect a voice command intended for another control device and household device and relay the voice command to the intended control device. In such systems, a user can initiate a telephone call by speaking a voice command to a local control device, which forwards the control signal to a mobile phone connected to another control device.
    Type: Application
    Filed: February 8, 2012
    Publication date: July 18, 2013
    Applicant: SENSORY, INCORPORATED
    Inventors: Todd F. Mozer, Forrest S. Mozer
  • Publication number: 20130166285
    Abstract: This specification describes technologies relating to multi core processing for parallel speech-to-text processing. In some implementations, a computer-implemented method is provided that includes the actions of receiving an audio file; analyzing the audio file to identify portions of the audio file as corresponding to one or more audio types; generating a time-ordered classification of the identified portions, the time-ordered classification indicating the one or more audio types and position within the audio file of each portion; generating a queue using the time-ordered classification, the queue including a plurality of jobs where each job includes one or more identifiers of a portion of the audio file classified as belonging to the one or more speech types; distributing the jobs in the queue to a plurality of processors; performing speech-to-text processing on each portion to generate a corresponding text file; and merging the corresponding text files to generate a transcription file.
    Type: Application
    Filed: December 10, 2008
    Publication date: June 27, 2013
    Applicant: Adobe Systems Incorporated
    Inventors: Walter Chang, Michael J. Welch
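The classify, queue, distribute, and merge pipeline described in this abstract can be sketched with stand-in stubs for the classifier and speech-to-text engine (both are placeholders, not the patented method); an order-preserving parallel map stands in for the job queue:

```python
from concurrent.futures import ThreadPoolExecutor

def classify(audio):
    # Stand-in classifier: returns a time-ordered list of
    # (start, end, audio_type) portions of the audio file.
    return [(0, 5, "speech"), (5, 8, "music"), (8, 12, "speech")]

def transcribe(portion):
    # Stand-in speech-to-text worker for one portion.
    start, end, _ = portion
    return f"<text {start}-{end}>"

def parallel_transcribe(audio, workers=4):
    # Build the job queue from portions classified as speech.
    jobs = [p for p in classify(audio) if p[2] == "speech"]
    # Distribute jobs across workers; map() preserves submission order,
    # so the per-portion texts can simply be merged in time order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        texts = list(pool.map(transcribe, jobs))
    return " ".join(texts)
```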
  • Publication number: 20130144610
    Abstract: An automated technique is disclosed for processing audio data and generating one or more actions in response thereto. In particular embodiments, the audio data can be obtained during a phone conversation and post-call actions can be provided to the user with contextually relevant entry points for completion by an associated application. Audio transcription services available on a remote server can be leveraged. The entry points can be generated based on keyword recognition in the transcription and passed to the application in the form of parameters.
    Type: Application
    Filed: December 5, 2011
    Publication date: June 6, 2013
    Applicant: Microsoft Corporation
    Inventors: Clif Gordon, Kerry D. Woolsey
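One plausible sketch of the keyword-to-entry-point step follows; the keyword list, action names, and parameter shape are all invented for illustration:

```python
import re

# Hypothetical mapping from transcript keywords to application actions.
ACTIONS = {
    "meeting": "calendar.create_event",
    "email": "mail.compose",
}

def entry_points(transcript):
    """Scan a call transcript for keywords and emit entry points,
    passing the matched keyword to the application as a parameter."""
    found = []
    for keyword, action in ACTIONS.items():
        if re.search(rf"\b{keyword}\b", transcript, re.IGNORECASE):
            found.append({"action": action, "param": keyword})
    return found
```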
  • Publication number: 20130054250
    Abstract: Methods and arrangements for visually representing audio content in a voice application. A display is connected to a voice application, and an image is displayed on the display, the image comprising a main portion and at least one subsidiary portion, the main portion representing a contextual entity of the audio content and the at least one subsidiary portion representing at least one participatory entity of the audio content. The at least one subsidiary portion is displayed without text, and the image is changed responsive to changes in audio content in the voice application.
    Type: Application
    Filed: August 29, 2012
    Publication date: February 28, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Amit Anil Nanavati, Nitendra Rajput
  • Publication number: 20130054249
    Abstract: Methods and arrangements for visually representing audio content in a voice application. A display is connected to a voice application, and an image is displayed on the display, the image comprising a main portion and at least one subsidiary portion, the main portion representing a contextual entity of the audio content and the at least one subsidiary portion representing at least one participatory entity of the audio content. The at least one subsidiary portion is displayed without text, and the image is changed responsive to changes in audio content in the voice application.
    Type: Application
    Filed: August 24, 2011
    Publication date: February 28, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Amit Anil Nanavati, Nitendra Rajput
  • Publication number: 20130041646
    Abstract: In accordance with embodiments of the present invention, a system and method for enabling preview, editing, and transmission of emergency notification messages are provided. The system includes a controller, a microphone, and a speech-to-text engine for receiving an audio message input to the microphone and converting the audio message to a text message. The resulting text message is displayed on a local display, where a user can edit the message via a text editor. Text and/or audio notification devices are provided for presenting the edited text data as a text message. Other embodiments are disclosed and claimed.
    Type: Application
    Filed: August 10, 2011
    Publication date: February 14, 2013
    Applicant: SIMPLEXGRINNELL LP
    Inventors: Daniel G. Farley, Matthew Farley, John R. Haynes
  • Publication number: 20130024187
    Abstract: A system that incorporates teachings of the present disclosure may include, for example, transmitting a request to initiate a communication session with a member device of a social network, activating a speech capture element, maintaining activation of the speech capture element in accordance with a pattern of prior speech messages, detecting a speech message at the activated speech capture element, and transmitting the detected speech message, or a derivative thereof, to the member device of the social network. Other embodiments are disclosed.
    Type: Application
    Filed: July 18, 2011
    Publication date: January 24, 2013
    Applicant: AT&T Intellectual Property I, LP
    Inventors: HISAO CHANG, David Mornhineway
  • Publication number: 20120330669
    Abstract: The disclosed embodiments relate to communication, and more particularly to picture-based communication systems and methods. The described techniques allow such systems to be created rapidly for a large number of languages. The system also has a number of other benefits for people who are not necessarily disabled. For example, it could be incorporated into software running on PCs and mobile devices as part of a message composition system; this allows language-independent messages to be constructed, which can be de-constructed into any language on the receiver's side. The techniques would also assist people with language difficulties, dyslexia, or illiteracy to communicate effectively.
    Type: Application
    Filed: December 8, 2011
    Publication date: December 27, 2012
    Inventor: Ajit Narayanan
  • Publication number: 20120226499
    Abstract: Methods of adding data identifiers and speech/voice recognition functionality are disclosed. A telnet client runs one or more scripts that add data identifiers to data fields in a telnet session. Input data is inserted into the corresponding fields based on the data identifiers. The scripts run only on the telnet client, without modifications to the server applications. Further disclosed are methods for providing speech recognition and voice functionality to telnet clients. Portions of the input data are converted to voice and played to the user. A user may also provide input to certain fields of the telnet session by voice; scripts running on the telnet client convert the user's voice into text, which is inserted into the corresponding fields.
    Type: Application
    Filed: May 9, 2012
    Publication date: September 6, 2012
    Applicant: WAVELINK CORPORATION
    Inventors: LAMAR JOHN VAN WAGENEN, BRANT DAVID THOMSEN, SCOTT ALLEN CADDES
  • Publication number: 20120158405
    Abstract: A speech recognition device (1) processes speech data (SD) of a dictation and establishes recognized text information (ETI) and link information (LI) of the dictation. In a synchronous playback mode of the speech recognition device (1), during acoustic playback of the dictation a correction device (10) synchronously marks the word of the recognized text information (ETI) that the link information (LI) relates to the speech data (SD) just played back; the currently marked word indicates the position of an audio cursor (AC). When a user of the speech recognition device (1) notices an incorrect word, he positions a text cursor (TC) at the incorrect word and corrects it. Cursor synchronization means (15) make it possible to synchronize the text cursor (TC) with the audio cursor (AC), or the audio cursor (AC) with the text cursor (TC), so that positioning the respective cursor (AC, TC) is simplified considerably.
    Type: Application
    Filed: February 13, 2012
    Publication date: June 21, 2012
    Applicant: Nuance Communications Austria GmbH
    Inventor: Wolfgang Gschwendtner
  • Publication number: 20120116778
    Abstract: A system and method are disclosed that use screen-reader-like functionality to speak information presented on a graphical user interface displayed by a media presentation system, including information that is not navigable by a remote control device. Information can be spoken in an order that follows its relative importance, based on a characteristic of the information or its location within the graphical user interface. A history of previously spoken information is monitored to avoid speaking information more than once for a given graphical user interface. A different pitch can be used to speak information based on a characteristic of the information. Information that is not navigable by the remote control device can be spoken after a time delay. Voice prompts can be provided for a remote-driven virtual keyboard displayed by the media presentation system. The voice prompts can be spoken with different voice pitches.
    Type: Application
    Filed: November 4, 2010
    Publication date: May 10, 2012
    Applicant: APPLE INC.
    Inventors: Christopher B. Fleizach, Reginald Dean Hudson, Eric Taylor Seymour
  • Publication number: 20120078628
    Abstract: The head-mounted text display system for the hearing impaired is a speech-to-text system, in which spoken words are converted into a visual textual display and displayed to the user in passages containing a selected number of words. The system includes a head-mounted visual display, such as eyeglass-type dual liquid crystal displays or the like, and a controller. The controller includes an audio receiver, such as a microphone or the like, for receiving spoken language and converting the spoken language into electrical signals. The controller further includes a speech-to-text module for converting the electrical signals representative of the spoken language to a textual data signal representative of individual words. A transmitter associated with the controller transmits the textual data signal to a receiver associated with the head-mounted display. The textual data is then displayed to the user in passages containing a selected number of individual words.
    Type: Application
    Filed: September 28, 2010
    Publication date: March 29, 2012
    Inventor: MAHMOUD M. GHULMAN
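The final step, displaying the text in passages of a selected number of words, reduces to simple chunking. A minimal sketch, assuming the recognized words arrive as a list:

```python
def passages(words, size):
    """Group recognized words into passages of at most `size` words
    for delivery to the head-mounted display."""
    for i in range(0, len(words), size):
        yield " ".join(words[i:i + size])
```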
  • Publication number: 20120029907
    Abstract: A digital pen designed to assist users in spelling words as they write. The invention is an electronic pen with a speaker located near the top of the device. A microphone may be located directly under the speaker in the form of a small screened concave or convex aperture. A switch on the back of the pen allows the user to choose between three settings: Medical Dictionary (D), Off (O), and Prescription Drug List (P). The device works by the user speaking the desired word into the microphone. The word then appears on the digital display screen, which lights up. The pen asks the user to confirm or deny the displayed word, and the user says “yes” or “no” into the microphone. If denied, the pen displays another word until the correct word is located. Once confirmed, the pen audibly and visibly spells the word one letter at a time as the user writes. The pen may be switched to the prescription drug list mode as needed.
    Type: Application
    Filed: December 30, 2010
    Publication date: February 2, 2012
    Inventors: Angela Loggins, Tamara S. Loggins
  • Publication number: 20120016666
    Abstract: According to one embodiment, an AV device comprises a receiving section, a processing section, a storage section, and a control section. The receiving section receives a digital voice signal. The processing section applies a predetermined signal processing operation to the digital voice signal received by the receiving section. The storage section stores information indicating the time required for the signal processing operation at the processing section; when the voice has been set to a mute state, this stored time information is rewritten to a value that cannot normally occur. The control section outputs the information stored in the storage section upon an external request. Other embodiments are also described.
    Type: Application
    Filed: September 23, 2011
    Publication date: January 19, 2012
    Inventors: Takanobu Mukaide, Masahiko Mawatari
  • Publication number: 20110304774
    Abstract: Embodiments are disclosed that relate to the automatic tagging of recorded content. For example, one disclosed embodiment provides a computing device comprising a processor and memory having instructions executable by the processor to receive input data comprising one or more of a depth data, video data, and directional audio data, identify a content-based input signal in the input data, and apply one or more filters to the input signal to determine whether the input signal comprises a recognized input. Further, if the input signal comprises a recognized input, then the instructions are executable to tag the input data with the contextual tag associated with the recognized input and record the contextual tag with the input data.
    Type: Application
    Filed: June 11, 2010
    Publication date: December 15, 2011
    Applicant: MICROSOFT CORPORATION
    Inventors: Stephen Latta, Christopher Vuchetich, Matthew Eric Haigh, JR., Andrew Robert Campbell, Darren Bennett, Relja Markovic, Oscar Omar Garza Santos, Kevin Geisner, Kudo Tsunoda
  • Publication number: 20110301937
    Abstract: The present invention provides an electronic reading device. At the device, a voice is captured by a capturing unit, and then the reference information stored in a storing unit is received by a processing unit for converting the voice to a visual image signal based on the reference information. Afterwards, the visual image corresponding to the visual image signal is shown on a display unit. Therefore, the device provides the function of speech recognition anywhere and anytime and is suitable for prolonged use due to the features of power saving and easy reading.
    Type: Application
    Filed: February 24, 2011
    Publication date: December 8, 2011
    Applicant: E INK HOLDINGS INC.
    Inventors: TZU-MING WANG, KAI-CHENG CHUANG
  • Publication number: 20110300840
    Abstract: A mobile or in-vehicle communication system and method facilitate communication among groups. The system and method also facilitate the creation of such groups. The system and method may convert speech from one member of the group to text for distribution to other members of the group, for whom the text is converted to audible speech.
    Type: Application
    Filed: June 7, 2011
    Publication date: December 8, 2011
    Inventor: Otman A. Basir
  • Publication number: 20110237301
    Abstract: Various methods and systems are provided that allow a user to perform a free-form action, such as making a mark on a device, speaking into a device, and/or moving the device, to cause a step to be performed that conventionally was performed by the user having to locate and select a button or link on the device.
    Type: Application
    Filed: March 23, 2010
    Publication date: September 29, 2011
    Applicant: eBay INC.
    Inventors: AMOL BHASKER PATEL, SURAJ SATHEESAN MENON
  • Publication number: 20110231194
    Abstract: In an embodiment, a method of interactive speech preparation is disclosed. The method may include or comprise displaying an interactive speech application on a display device, wherein the interactive speech application has a text display window. The method may also include or comprise accessing text stored in an external storage device over a communication network, and displaying the text within the text display window while capturing video and audio data with video and audio data capturing devices, respectively.
    Type: Application
    Filed: December 16, 2010
    Publication date: September 22, 2011
    Inventor: Steven Lewis
  • Publication number: 20110205149
    Abstract: A system and method for providing voice prompts that identify task selections from a list of task selections in a vehicle, where the user employs an input device, such as a scroll wheel, to activate a particular task and where the speed of the voice prompt increases and decreases depending on how fast the user rotates the scroll wheel.
    Type: Application
    Filed: February 24, 2010
    Publication date: August 25, 2011
    Applicant: GM GLOBAL TECHNOLOGY OPERATIONS, INC.
    Inventor: Alfred C. Tom
  • Publication number: 20110208524
    Abstract: This is directed to processing voice inputs received by an electronic device. In particular, this is directed to receiving a voice input and identifying the user providing the voice input. The voice input can be processed using a subset of words from a library used to identify the words or phrases of the voice input. The particular subset can be selected such that voice inputs provided by the user are more likely to include words from the subset. The subset of the library can be selected using any suitable approach, including for example based on the user's interests and words that relate to those interests. For example, the subset can include one or more words related to media items selected by the user for storage on the electronic device, names of the user's contacts, applications or processes used by the user, or any other words relating to the user's interactions with the device.
    Type: Application
    Filed: February 25, 2010
    Publication date: August 25, 2011
    Applicant: Apple Inc.
    Inventor: Allen P. Haughay
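The subset-selection idea can be sketched as follows; the library contents, core command words, and matching step are all illustrative stand-ins, not a real recognizer:

```python
# Hypothetical full recognition library.
FULL_LIBRARY = {"call", "play", "alice", "bob", "jazz", "quantum", "photosynthesis"}

def build_user_subset(contacts, media_titles, app_names):
    """Select the library subset a given user is likely to say:
    assumed core commands plus words drawn from the user's own data."""
    subset = {"call", "play"}  # assumed core command words
    subset |= {w.lower() for w in contacts + media_titles + app_names}
    return subset & FULL_LIBRARY  # keep only words the library knows

def match_words(utterance, subset):
    # Trivial stand-in for recognition: keep words found in the subset.
    return [w for w in utterance.lower().split() if w in subset]
```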
  • Publication number: 20110195659
    Abstract: A vehicle-based computing apparatus includes a computer processor in communication with persistent and non-persistent memory. The apparatus also includes a local wireless transceiver in communication with the computer processor and configured to communicate wirelessly with a wireless device located at the vehicle. The processor is operable to receive, through the wireless transceiver, a connection request sent from a nomadic wireless device, the connection request including at least a name of an application seeking to communicate with the processor. The processor is further operable to receive at least one secondary communication from the nomadic device, once the connection request has been processed. The secondary communication is at least one of a speak alert command, a display text command, a create phrase command, and a prompt and listen command.
    Type: Application
    Filed: February 5, 2010
    Publication date: August 11, 2011
    Applicant: FORD GLOBAL TECHNOLOGIES, LLC
    Inventors: David P. Boll, Nello Joseph Santori, Joseph N. Ross, Mark Shaker, Micah J. Kaiser, Brian Woogeun Joh, Mark Schunder
  • Publication number: 20110184740
    Abstract: A method, computer program product, and system are provided for performing a voice command on a client device. The method can include translating, using a first speech recognizer located on the client device, an audio stream of a voice command to a first machine-readable voice command and generating a first query result using the first machine-readable voice command to query a client database. In addition, the audio stream can be transmitted to a remote server device that translates the audio stream to a second machine-readable voice command using a second speech recognizer. Further, the method can include receiving a second query result from the remote server device, where the second query result is generated by the remote server device using the second machine-readable voice command and displaying the first query result and the second query result on the client device.
    Type: Application
    Filed: June 7, 2010
    Publication date: July 28, 2011
    Applicant: Google Inc.
    Inventors: Alexander GRUENSTEIN, William J. Byrne
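The two-path flow (a fast local recognizer queries the client database while the same audio goes to a server-side recognizer, and both query results are displayed) might be sketched like this; every function below is a stand-in:

```python
def local_recognize(audio):
    # Stand-in for the first, on-device speech recognizer.
    return "call bob"

def remote_recognize(audio):
    # Stand-in for the second, server-side speech recognizer.
    return "call bob mobile"

def query(database, command):
    # Stand-in query: rows whose text contains the command.
    return [row for row in database if command in row]

def handle_voice_command(audio):
    client_db = ["call bob", "call bob mobile"]
    first_result = query(client_db, local_recognize(audio))    # fast, local
    second_result = query(client_db, remote_recognize(audio))  # arrives later
    # Both query results are displayed on the client device.
    return {"local": first_result, "remote": second_result}
```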
  • Publication number: 20110125495
    Abstract: Disclosed are a quantizer, an encoder, and methods thereof that reduce the computational load of quantizing values related to the transform coefficients of a principal component analysis transform when such a transform is applied to stereo coding.
    Type: Application
    Filed: June 18, 2009
    Publication date: May 26, 2011
    Applicant: PANASONIC CORPORATION
    Inventors: Toshiyuki Morii, Hiroyuki Ehara, Koji Yoshida
  • Publication number: 20110098544
    Abstract: There is provided a system and method for integrating voice with a medical device. More specifically, in one embodiment, there is provided a medical device comprising a speech recognition system configured to receive a processed voice, compare the processed voice to a speech database, identify a command for the medical device corresponding to the processed voice based on the comparison, and execute the identified medical device command.
    Type: Application
    Filed: December 30, 2010
    Publication date: April 28, 2011
    Applicant: NELLCOR PURITAN BENNETT LLC
    Inventors: Jayesh Shah, Scott Amundson
  • Publication number: 20110099020
    Abstract: A method for dynamically arranging DSP tasks. The method comprises receiving an audio bit stream; checking the remaining execution time as the DSP transforms the audio information into spectral information; simplifying the transformation step when the DSP detects that the remaining execution time is shorter than a predetermined interval; and skipping one section of the audio information and decoding the remaining section when the execution time is less than a predetermined interval.
    Type: Application
    Filed: January 4, 2011
    Publication date: April 28, 2011
    Applicant: MEDIATEK INC.
    Inventors: Chih-Chiang Chuang, Pei-Yun Kuo
  • Publication number: 20110093274
    Abstract: Disclosed are an apparatus and method of manufacturing an article using sound, which render sound waveforms of living things (including the human voice) in various shapes and manufacture articles corresponding to those shapes. The apparatus generates a sampling waveform based on the sound waveform. Next, the sampling waveform is converted into a two-dimensional image file, and the two-dimensional image is in turn converted into a three-dimensional image file. Thereafter, an article is manufactured based on the two-dimensional or three-dimensional image file. Because the article is manufactured from the sampling waveform generated by sampling the sound waveform, a simplified article is obtained.
    Type: Application
    Filed: May 16, 2008
    Publication date: April 21, 2011
    Inventor: Kwanyoung Lee
  • Publication number: 20110087493
    Abstract: The invention relates to a communication system having a display unit (2) and a virtual being (3) that can be visually represented on the display unit (2) and that is designed for communication by means of natural speech with a natural person. At least one interaction symbol (6, 7) can be represented on the display unit (2), by means of which the natural speech dialog between the virtual being (3) and the natural person is supported such that an achieved dialog state can be indicated and/or additional information depending on the achieved dialog state and/or information can be redundantly invoked. The invention further relates to a method for representing information of a communication between a virtual being and a natural person.
    Type: Application
    Filed: May 15, 2009
    Publication date: April 14, 2011
    Inventors: Stefan Sellschopp, Valentin Nicolescu, Helmut Krcmar
  • Publication number: 20110054885
    Abstract: For a bandwidth extension of an audio signal, a signal spreader temporally spreads the audio signal by a spread factor greater than 1. The temporally spread audio signal is then supplied to a decimator, which decimates the temporally spread version by a decimation factor matched to the spread factor. The band generated by this decimation operation is extracted and distorted, and finally combined with the audio signal to obtain a bandwidth-extended audio signal. A phase vocoder in a filterbank or transform implementation may be used for the signal spreading.
    Type: Application
    Filed: January 20, 2009
    Publication date: March 3, 2011
    Inventors: Frederik Nagel, Sascha Disch, Max Neuendorf
  • Publication number: 20110043832
    Abstract: A printed audio format includes a printed encoding of an audio signal, and a plurality of spaced-apart and parallel rails. The printed encoding of the audio signal is located between the plurality of rails and each rail comprises at least one marker. The printed encoding comprises a first portion and a second portion, each portion comprises a plurality of code frames, and each frame represents a time segment of an audio signal. The first portion encodes a first time period of the audio signal and the second portion encodes a second time period of the audio signal. The second portion is encoded in reverse order with respect to the first portion so that the joining part is on the same end of both portions.
    Type: Application
    Filed: October 29, 2010
    Publication date: February 24, 2011
    Applicant: Creative Technology Ltd
    Inventors: Wong Hoo Sim, Desmond Toh Onn Hii, Tur We Chan, Chin Fang Lim, Willie Png, Morgun Phay
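The reversed second portion can be illustrated with a simple frame list; this is a minimal sketch of the layout idea only, not the actual printed code-frame format:

```python
def layout_frames(frames):
    """Split a time-ordered frame sequence into two portions, writing
    the second portion in reverse so the joining part falls at the
    same end of both portions."""
    mid = len(frames) // 2
    first = frames[:mid]          # first time period, forward order
    second = frames[mid:][::-1]   # second time period, reversed
    return first, second
```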
  • Publication number: 20110035212
    Abstract: In a method of perceptual transform coding of audio signals in a telecommunication system, performing the steps of determining transform coefficients representative of a time to frequency transformation of a time segmented input audio signal; determining a spectrum of perceptual sub-bands for said input audio signal based on said determined transform coefficients; determining masking thresholds for each said sub-band based on said determined spectrum; computing scale factors for each said sub-band based on said determined masking thresholds, and finally adapting said computed scale factors for each said sub-band to prevent energy loss for perceptually relevant sub-bands.
    Type: Application
    Filed: August 26, 2008
    Publication date: February 10, 2011
    Applicant: Telefonaktiebolaget L M Ericsson (publ)
    Inventors: Manuel Briand, Anisse Taleb
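The listed steps (band energies from transform coefficients, per-band masking thresholds, scale factors adapted with a floor) can be sketched numerically; the 12 dB offset and the floor value below are illustrative placeholders, not the patented rules:

```python
import math

def band_energies(coeffs, bands):
    """Sum of squared transform coefficients per sub-band."""
    return [sum(c * c for c in coeffs[lo:hi]) for lo, hi in bands]

def masking_thresholds(energies, offset_db=12.0):
    """Toy masking model: threshold a fixed number of dB below the
    band energy."""
    return [e / (10 ** (offset_db / 10)) for e in energies]

def scale_factors(thresholds, floor=1e-9):
    """Scale factor per band; the floor keeps every factor positive,
    preventing total energy loss in perceptually relevant bands."""
    return [math.sqrt(max(t, floor)) for t in thresholds]
```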
  • Publication number: 20110001878
    Abstract: A TV uses optical character recognition (OCR) to extract text from a TV image and/or voice recognition to extract text from the TV audio and if a geographic place name is recognized, displays a relevant map in a picture-in-picture window on the TV. The user may be given the option of turning the map feature on and off, defining how long the map is displayed, and defining the scale of the map to be displayed.
    Type: Application
    Filed: July 2, 2009
    Publication date: January 6, 2011
    Inventors: Libiao Jiang, Yang Yu
  • Publication number: 20110004477
    Abstract: A method, a system and a computer program product for using speech/voice recognition technology to update digital video recorder (DVR) program recording patterns, based on program viewer/listener feedback. A speech controlled pattern modification (SCPM) utility utilizes a DVR recording sub-system integrated with speech processing functionality to compare control phrases with phrases uttered by a viewer. If a control phrase matches a phrase uttered by the viewer, the SCPM utility modifies the DVR recording patterns, according to a set of pre-programmed governing rules. For example, the SCPM utility may avoid modifying the recording patterns for programs within a list of “favorite” programs but may modify the recording patterns for programs excluded from the list. The SCPM utility determines priority of the uttered phrases by identifying users and retrieving a preset priority level of the identified users. The priority level is then used to control changes to the recording patterns.
    Type: Application
    Filed: July 2, 2009
    Publication date: January 6, 2011
    Applicant: International Business Machines Corporation
    Inventors: Ravi P. Bansal, Mike V. Macias, Saidas T. Kottawar, Salil P. Gandhi, Sandip D. Mahajan
  • Publication number: 20100333163
    Abstract: Various embodiments facilitate voice control of a receiving device, such as a set-top box. In one embodiment, a voice enabled media presentation system (“VEMPS”) includes a receiving device and a remote-control device having an audio input device. The VEMPS is configured to obtain audio data via the audio input device, the audio data received from a user and representing a spoken command to control the receiving device. The VEMPS is further configured to determine the spoken command by performing speech recognition on the obtained audio data, and to control the receiving device based on the determined command. This abstract is provided to comply with rules requiring an abstract, and it is submitted with the intention that it will not be used to interpret or limit the scope or meaning of the claims.
    Type: Application
    Filed: June 25, 2009
    Publication date: December 30, 2010
    Applicant: ECHOSTAR TECHNOLOGIES L.L.C.
    Inventor: Curtis N. Daly
  • Publication number: 20100312559
    Abstract: A method of playing pictures comprises the steps of: receiving (11) a voice message; extracting (12) a key feature from the voice message; selecting (13) pictures by matching the key feature with pre-stored picture information; generating (14) a picture-voice sequence by integrating the selected pictures and the voice message; and playing (15) the picture-voice sequence. An electronic apparatus comprises a processing unit for implementing the different steps of the method.
    Type: Application
    Filed: December 11, 2008
    Publication date: December 9, 2010
    Applicant: Koninklijke Philips Electronics N.V.
    Inventors: Sheng Jin, Xin Chen, Yang Peng, Ningjiang Chen, Yunji Xia
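One way the matching step in the abstract above might work is to extract keywords from a (transcribed) voice message and select pictures whose stored tags overlap. The tag names, picture names, and the transcription step are illustrative assumptions, not details from the publication:

```python
# Hypothetical pre-stored picture information: each picture carries a tag set.
PICTURE_INFO = {
    "beach.jpg": {"beach", "sea", "summer"},
    "party.jpg": {"birthday", "cake", "friends"},
}

def select_pictures(transcript: str):
    """Select pictures whose tags intersect the words of the voice message."""
    words = set(transcript.lower().split())
    return [name for name, tags in PICTURE_INFO.items() if words & tags]
```

The selected pictures would then be interleaved with the voice message to form the picture-voice sequence that the method plays back.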
  • Publication number: 20100250253
    Abstract: A speech-directed user interface system includes at least one speaker for delivering an audio signal to a user and at least one microphone for capturing speech utterances of a user. An interface device interfaces with the speaker and microphone and provides a plurality of audio signals to the speaker to be heard by the user. A control circuit is operably coupled with the interface device and is configured for selecting at least one of the plurality of audio signals as a foreground audio signal for delivery to the user through the speaker. The control circuit is operable for recognizing speech utterances of a user and using the recognized speech utterances to control the selection of the foreground audio signal.
    Type: Application
    Filed: March 27, 2009
    Publication date: September 30, 2010
    Inventor: Yangmin Shen
  • Publication number: 20100211397
    Abstract: An avatar facial expression representation technology is provided. The technology estimates changes in emotion and emphasis in a user's voice from vocal information, and changes in the user's mouth shape from pronunciation information of the voice. It also tracks the user's facial movements and changes in facial expression from image information, and may represent avatar facial expressions based on the results of these operations. Accordingly, avatar facial expressions can be obtained that are similar to the actual facial expressions of the user.
    Type: Application
    Filed: January 28, 2010
    Publication date: August 19, 2010
    Inventors: Chi-youn PARK, Young-Kyoo HWANG, Jung-bae KIM
  • Publication number: 20100198583
    Abstract: The present invention relates to an indicating method for a speech recognition system comprising a multimedia electronic product and a speech recognition device. In this method, a user enters voice commands through a voice input unit, which converts them into speech signals; the signals are acquired and stored by a recording unit, converted by a microprocessor into a volume-indicating oscillogram, and then displayed by a display module. Compliance with the speech recognition conditions is determined during this process.
    Type: Application
    Filed: February 4, 2009
    Publication date: August 5, 2010
    Applicant: AIBELIVE CO., LTD.
    Inventors: Chen-Wei Su, Chun-Ping Fang, Min-Ching Wu
  • Publication number: 20100145710
    Abstract: A method for developing a voice user interface for a statistical semantic system is described. A set of semantic meanings is defined that reflect semantic classification of a user input dialog. Then, a set of speech dialog prompts is automatically developed from an annotated transcription corpus for directing user inputs to corresponding final semantic meanings. The statistical semantic system may be a call routing application where the semantic meanings are call routing destinations.
    Type: Application
    Filed: December 8, 2008
    Publication date: June 10, 2010
    Applicant: NUANCE COMMUNICATIONS, INC.
    Inventor: Real Tremblay
  • Publication number: 20100063822
    Abstract: A communication system that is specifically designed for the needs of speech impaired individuals, particularly aphasia victims, makes use of a speech generating mobile terminal communication device (SGMTD) (12) that is designed to be hand held and operated by a speech disabled individual. The SGMTD includes a database of audio files that are accessed to generate full sentences in response to single word or short phrase entries selected from a plurality of menus by the disabled user. A second, companion mobile terminal device (COMTD) (14) enables a caregiver to communicate with the speech disabled individual's SGMTD to assist the individual in communicating with the caregiver by causing the SGMTD to switch to a particular menu or list from which the caregiver wants the disabled individual to make a selection. The SGMTD also includes software that enables the device to communicate with other SGMTDs via wireless communications and thereby simulate a verbal conversation between speech impaired individuals.
    Type: Application
    Filed: April 21, 2008
    Publication date: March 11, 2010
    Inventors: Daniel C. O'Brien, Edward T. Buchholz
  • Publication number: 20100057466
    Abstract: A method and communication device disclosed includes displaying a video on a display, converting voice audio data to textual data by applying voice-to-text conversion, and displaying the textual data as scrolling text displayed along with the video on the display and either above, below or across the video. The method may further include receiving a voice call indication from a network, providing the voice call indication to a user interface where the voice call indication corresponds to an incoming voice call; and receiving a user input for receiving the voice call and displaying the voice call as scrolling text. In another embodiment, a method includes displaying application related data on a display; converting voice audio data to textual data by applying voice-to-text conversion; converting the textual data to a video format; and displaying the textual data as scrolling text over the application related data on said display.
    Type: Application
    Filed: September 17, 2008
    Publication date: March 4, 2010
    Applicant: ATI Technologies ULC
    Inventors: Dinesh Kumar Garg, Manish Poddar
  • Publication number: 20100049528
    Abstract: A method for providing an audible prompt to a user within a vehicle. The method includes retrieving one or more data files from a memory device. The data files define certain characteristics of an audio prompt. The method also includes creating the audio prompt from the data files and outputting the audio prompt as an audio signal.
    Type: Application
    Filed: January 4, 2008
    Publication date: February 25, 2010
    Inventors: Mark Zeinstra, Richard J. Chutorash, Jeffrey Golden, Jon M. Skekloff
  • Publication number: 20100049512
    Abstract: Disclosed are an encoding device and related methods capable of suppressing quantization distortion while limiting the increase in bit rate when encoding audio or the like. In the device, a dynamic range calculation unit (12) calculates the dynamic range of an input spectrum as an index indicating a peak of the input spectrum, a pulse quantity decision unit (13) decides the number of pulses of a vector candidate output from a shape codebook (14), and the shape codebook (14), under control of the search unit (17), outputs a vector candidate having the number of pulses decided by the pulse quantity decision unit (13), using vector candidate elements {−1, 0, +1}.
    Type: Application
    Filed: December 14, 2007
    Publication date: February 25, 2010
    Applicant: PANASONIC CORPORATION
    Inventors: Masahiro Oshikiri, Tomofumi Yamanashi
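A heavily simplified sketch of the idea in the abstract above: measure the dynamic range of an input spectrum and use it to decide how many nonzero pulses (each drawn from {−1, 0, +1}) a shape-codebook vector may carry. The dB measure and the pulse-count rule here are illustrative assumptions, not the patent's actual quantization scheme:

```python
import math

def dynamic_range_db(spectrum):
    """Dynamic range of a spectrum in dB: ratio of largest to smallest magnitude."""
    mags = [abs(x) for x in spectrum if abs(x) > 0]
    return 20 * math.log10(max(mags) / min(mags))

def decide_pulse_count(dr_db, max_pulses=8):
    # A peaky spectrum (large dynamic range) can be approximated with few
    # pulses; a flatter one needs more. Clamp to at least one pulse.
    return max(1, max_pulses - int(dr_db // 10))
```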
  • Publication number: 20100049502
    Abstract: Methods and systems for performing user input recognition are disclosed. A digital directory comprising listings is accessed, and metadata information describing the individual listings is associated with them. The metadata information is modified to generate transformed metadata information as a function of context information relating to a typical user interaction with the listings. Information for aiding an automated user input recognition process is then generated based on the transformed metadata information.
    Type: Application
    Filed: November 2, 2009
    Publication date: February 25, 2010
    Applicant: Microsoft Corporation
    Inventors: Kyle Oppenheim, David Mitby, Nick Kibre
  • Publication number: 20090313012
    Abstract: A teleconference terminal apparatus (200) including: an input unit (201) which receives a speech signal; an analyzing unit (202) which calculates a target size on a predetermined segment basis of a speech signal; a coding unit (203) which codes the speech signal to generate a data stream, so that the coded data size on a predetermined segment basis becomes the target size corresponding to each of the predetermined segments; a stream transmitting unit (204) which transmits the generated data stream to a network; a receiving unit (205) which receives the data stream transmitted from another terminal apparatus; a filtering unit (206) which determines, on the basis of the data size of each predetermined segment in the received data stream, whether or not segment data included in the data stream is to be decoded; a decoding unit (207) which decodes segment data determined to be decoded to generate a speech signal; and an output unit (209) which outputs the speech signal generated by the decoding unit.
    Type: Application
    Filed: October 24, 2008
    Publication date: December 17, 2009
    Inventor: Kojiro Ono
  • Publication number: 20090313011
    Abstract: A method for identifying a frame type is disclosed. The method includes receiving current frame type information, obtaining previously received frame type information, generating frame identification information of a current frame using the current and previous frame type information, and identifying the current frame using the frame identification information. Also disclosed is a method that includes receiving a backward type bit corresponding to the current frame type information, obtaining a forward type bit corresponding to the previous frame type information, and generating frame identification information of the current frame by placing the backward type bit at a first position and the forward type bit at a second position.
    Type: Application
    Filed: May 8, 2009
    Publication date: December 17, 2009
    Applicant: LG Electronics INC.
    Inventors: Sang Bae CHON, Lae Hoon Kim, Koeng Mo Sung
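The bit-placement step in the abstract above can be sketched as a two-bit identifier, with the backward type bit in the first (high) position and the forward type bit in the second (low) position. The function name and bit widths are assumptions; the actual bitstream layout is defined by the claims:

```python
def frame_id(backward_type_bit: int, forward_type_bit: int) -> int:
    """Combine the current frame's backward type bit (first position) with
    the previous frame's forward type bit (second position) into one ID."""
    return (backward_type_bit << 1) | forward_type_bit
```

A decoder could then distinguish the four possible combinations (0 through 3) with a single comparison, without re-parsing either frame.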
  • Publication number: 20090306978
    Abstract: A method of encoding and decoding languages for international communication. A set of core words may be encoded, although the full vocabulary of the language might also be covered. The result is particularly suitable for use by people in relation to the keypad of a mobile phone, but may also be implemented in translation or communication software to create a language database for example. The encoding includes assigning digital symbols to selected words in the language, assigning alphanumeric representations to the digital symbols, and assigning pronounceable elements to the alphanumeric representations.
    Type: Application
    Filed: November 2, 2006
    Publication date: December 10, 2009
    Applicant: LISTED VENTURES PTY LTD
    Inventor: Robert Andrew McMahon McNeilly
  • Publication number: 20090281810
    Abstract: A method of visually presenting audio signals includes receiving an audio signal to be presented; generating a predetermined number of discrete frequency components from the audio signal; assigning a graphical object to each of the frequency components, each graphical object being specified by a geometrical shape, position information and size information; and displaying all of the graphical objects associated with all of the frequency components simultaneously on a graphic display. The system includes a microphone for generating audio signals; an audio interface unit for sampling the audio signals and transforming them into digital signals; a processing unit for translating the digital signals into a predetermined number of discrete frequency components and assigning a graphical object to each; a video interface unit for generating a video signal; and a graphic display for displaying a sonogram based on the video signal.
    Type: Application
    Filed: June 25, 2007
    Publication date: November 12, 2009
    Applicant: Ave-Fon Kft.
    Inventors: Istvan Sziklai, Istvan Hazman, Jozsef Imrek
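Read as a signal-processing pipeline, the abstract above maps naturally onto an FFT followed by band grouping. The sketch below uses NumPy's real FFT to derive a fixed number of discrete frequency components and assigns each one a simple graphical object (shape, position, size); all names and the band-averaging step are illustrative assumptions, not details from the publication:

```python
import numpy as np

def frequency_components(samples, n_components):
    """Split the magnitude spectrum into n_components bands and average each."""
    spectrum = np.abs(np.fft.rfft(samples))
    return [float(band.mean()) for band in np.array_split(spectrum, n_components)]

def to_graphical_objects(components):
    """Each component becomes a circle: x encodes the band, size the magnitude."""
    return [{"shape": "circle", "x": i, "size": mag}
            for i, mag in enumerate(components)]

# Example: a pure 440 Hz tone sampled at 8 kHz for one second.
t = np.arange(8000) / 8000.0
comps = frequency_components(np.sin(2 * np.pi * 440 * t), 16)
objs = to_graphical_objects(comps)  # 16 objects; the band containing 440 Hz dominates
```

Rendering all sixteen objects at once, frame after frame, yields the simultaneous sonogram-style display the claims describe.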