Transformation Of Speech Into A Nonaudible Representation, E.g., Speech Visualization, Speech Processing For Tactile Aids, Etc. (epo) Patents (Class 704/E21.019)
  • Patent number: 8766765
    Abstract: A device, method, and computer program product that provide tactile feedback to a visually impaired person to assist that person in maintaining eye contact with another person during conversation. The eyeglass-based tactile response device includes a frame with sensors mounted to it and a track that interconnects the eyepieces. In one example, a motor and a wheel are coupled to the track and are driven along the track by an amount that indicates to the user how far he or she should turn his or her head.
    Type: Grant
    Filed: September 14, 2012
    Date of Patent: July 1, 2014
    Inventor: Hassan Wael Hamadallah
  • Patent number: 8547208
    Abstract: An electronic device includes a storage unit, a sound signal receiving unit, a remote signal generating unit, a processing unit, and a transmitting unit. The storage unit stores a control table that records RC devices, sound parameters, and control commands. Each RC device is associated with at least one sound parameter, and each sound parameter corresponds to one control command. The sound signal receiving unit receives sound signals. The processing unit determines a sound parameter according to the received sound signals, searches the control table for the control command corresponding to that sound parameter for a selected RC device, and controls the remote signal generating unit to generate a remote signal corresponding to the determined control command. The transmitting unit transmits the remote signal to the selected RC device.
    Type: Grant
    Filed: December 1, 2010
    Date of Patent: October 1, 2013
    Assignee: Hon Hai Precision Industry Co., Ltd.
    Inventors: Ping-Yang Chuang, Ying-Chuan Yu
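The control-table lookup this abstract describes (each RC device associated with sound parameters, each parameter mapped to one command) can be sketched as a nested mapping. The device names, parameters, and commands below are invented for illustration:

```python
from typing import Optional

# Hypothetical control table: RC device -> sound parameter -> command.
CONTROL_TABLE = {
    "tv": {"clap": "power_toggle", "whistle": "mute"},
    "fan": {"clap": "speed_up", "whistle": "power_toggle"},
}

def resolve_command(selected_device: str, sound_parameter: str) -> Optional[str]:
    """Search the control table for the command that the determined
    sound parameter corresponds to, given the selected RC device."""
    return CONTROL_TABLE.get(selected_device, {}).get(sound_parameter)
```

A real implementation would first derive the sound parameter (for example, frequency, duration, or amplitude features) from the received sound signal before performing the lookup.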
  • Publication number: 20130183944
    Abstract: Embodiments of the present invention are directed toward systems, methods, and devices for improving information access and device control in a home automation environment. Functionality of multiple household devices, such as lights, sound, entertainment, HVAC, and communication devices, can be activated via voice commands. The voice commands are detected by a nearby control device and relayed via a network communication medium to the control device to which the desired device or system is connected. Each control device, disposed throughout the home, can detect a voice command intended for another control device and household device and relay the voice command to the intended control device. In such systems, a user can initiate a telephone call by speaking a voice command to a local control device, which forwards the control signal to a mobile phone connected to another control device.
    Type: Application
    Filed: February 8, 2012
    Publication date: July 18, 2013
    Applicant: SENSORY, INCORPORATED
    Inventors: Todd F. Mozer, Forrest S. Mozer
  • Publication number: 20130166285
    Abstract: This specification describes technologies relating to multi core processing for parallel speech-to-text processing. In some implementations, a computer-implemented method is provided that includes the actions of receiving an audio file; analyzing the audio file to identify portions of the audio file as corresponding to one or more audio types; generating a time-ordered classification of the identified portions, the time-ordered classification indicating the one or more audio types and position within the audio file of each portion; generating a queue using the time-ordered classification, the queue including a plurality of jobs where each job includes one or more identifiers of a portion of the audio file classified as belonging to the one or more speech types; distributing the jobs in the queue to a plurality of processors; performing speech-to-text processing on each portion to generate a corresponding text file; and merging the corresponding text files to generate a transcription file.
    Type: Application
    Filed: December 10, 2008
    Publication date: June 27, 2013
    Applicant: Adobe Systems Incorporated
    Inventors: Walter Chang, Michael J. Welch
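The classify, queue, distribute, and merge pipeline described in this abstract can be sketched with stand-in stubs for the classifier and speech-to-text engine (both are placeholders, not the patented method); an order-preserving parallel map stands in for the job queue:

```python
from concurrent.futures import ThreadPoolExecutor

def classify(audio):
    # Stand-in classifier: returns a time-ordered list of
    # (start, end, audio_type) portions of the audio file.
    return [(0, 5, "speech"), (5, 8, "music"), (8, 12, "speech")]

def transcribe(portion):
    # Stand-in speech-to-text worker for one portion.
    start, end, _ = portion
    return f"<text {start}-{end}>"

def parallel_transcribe(audio, workers=4):
    # Build the job queue from portions classified as speech.
    jobs = [p for p in classify(audio) if p[2] == "speech"]
    # Distribute jobs across workers; map() preserves submission order,
    # so the per-portion texts can simply be merged in time order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        texts = list(pool.map(transcribe, jobs))
    return " ".join(texts)
```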
  • Publication number: 20130144610
    Abstract: An automated technique is disclosed for processing audio data and generating one or more actions in response thereto. In particular embodiments, the audio data can be obtained during a phone conversation and post-call actions can be provided to the user with contextually relevant entry points for completion by an associated application. Audio transcription services available on a remote server can be leveraged. The entry points can be generated based on keyword recognition in the transcription and passed to the application in the form of parameters.
    Type: Application
    Filed: December 5, 2011
    Publication date: June 6, 2013
    Applicant: Microsoft Corporation
    Inventors: Clif Gordon, Kerry D. Woolsey
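One plausible sketch of the keyword-to-entry-point step follows; the keyword list, action names, and parameter shape are all invented for illustration:

```python
import re

# Hypothetical mapping from transcript keywords to application actions.
ACTIONS = {
    "meeting": "calendar.create_event",
    "email": "mail.compose",
}

def entry_points(transcript):
    """Scan a call transcript for keywords and emit entry points,
    passing the matched keyword to the application as a parameter."""
    found = []
    for keyword, action in ACTIONS.items():
        if re.search(rf"\b{keyword}\b", transcript, re.IGNORECASE):
            found.append({"action": action, "param": keyword})
    return found
```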
  • Publication number: 20130054250
    Abstract: Methods and arrangements for visually representing audio content in a voice application. A display is connected to a voice application, and an image is displayed on the display, the image comprising a main portion and at least one subsidiary portion, the main portion representing a contextual entity of the audio content and the at least one subsidiary portion representing at least one participatory entity of the audio content. The at least one subsidiary portion is displayed without text, and the image is changed responsive to changes in audio content in the voice application.
    Type: Application
    Filed: August 29, 2012
    Publication date: February 28, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Amit Anil Nanavati, Nitendra Rajput
  • Publication number: 20130054249
    Abstract: Methods and arrangements for visually representing audio content in a voice application. A display is connected to a voice application, and an image is displayed on the display, the image comprising a main portion and at least one subsidiary portion, the main portion representing a contextual entity of the audio content and the at least one subsidiary portion representing at least one participatory entity of the audio content. The at least one subsidiary portion is displayed without text, and the image is changed responsive to changes in audio content in the voice application.
    Type: Application
    Filed: August 24, 2011
    Publication date: February 28, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Amit Anil Nanavati, Nitendra Rajput
  • Publication number: 20130041646
    Abstract: In accordance with embodiments of the present invention, a system and method for enabling preview, editing, and transmission of emergency notification messages are provided. The system includes a controller, a microphone, and a speech-to-text engine for receiving an audio message input to the microphone and converting the audio message to a text message. The resulting text message is displayed on a local display, where a user can edit the message via a text editor. Text and/or audio notification devices are provided for presenting the edited text data as a text message. Other embodiments are disclosed and claimed.
    Type: Application
    Filed: August 10, 2011
    Publication date: February 14, 2013
    Applicant: SIMPLEXGRINNELL LP
    Inventors: Daniel G. Farley, Matthew Farley, John R. Haynes
  • Publication number: 20130024187
    Abstract: A system that incorporates teachings of the present disclosure may include, for example, transmitting a request to initiate a communication session with a member device of a social network, activating a speech capture element, maintaining activation of the speech capture element in accordance with a pattern of prior speech messages, detecting a speech message at the activated speech capture element, and transmitting the detected speech message, or a derivative thereof, to the member device of the social network. Other embodiments are disclosed.
    Type: Application
    Filed: July 18, 2011
    Publication date: January 24, 2013
    Applicant: AT&T Intellectual Property I, LP
    Inventors: HISAO CHANG, David Mornhineway
  • Publication number: 20120330669
    Abstract: The disclosed embodiments relate to communication, and more particularly to picture-based communication systems and methods. The described techniques allow such systems to be created rapidly for a large number of languages. The system also has a number of other benefits for people who are not necessarily disabled. For example, it could be incorporated into software running on PCs and mobile devices as part of a message composition system; this allows language-independent messages to be constructed, which can be de-constructed into any language on the receiver's side. The techniques would also assist people with language difficulties, dyslexia, or illiteracy to communicate effectively.
    Type: Application
    Filed: December 8, 2011
    Publication date: December 27, 2012
    Inventor: Ajit Narayanan
  • Publication number: 20120226499
    Abstract: Methods of adding data identifiers and speech/voice recognition functionality are disclosed. A telnet client runs one or more scripts that add data identifiers to data fields in a telnet session. Input data is inserted into the corresponding fields based on the data identifiers. The scripts run only on the telnet client, without modifications to the server applications. Further disclosed are methods for providing speech recognition and voice functionality to telnet clients. Portions of the input data are converted to voice and played to the user. A user may also provide input to certain fields of the telnet session by voice; scripts running on the telnet client convert the user's voice into text, which is inserted into the corresponding fields.
    Type: Application
    Filed: May 9, 2012
    Publication date: September 6, 2012
    Applicant: WAVELINK CORPORATION
    Inventors: LAMAR JOHN VAN WAGENEN, BRANT DAVID THOMSEN, SCOTT ALLEN CADDES
  • Publication number: 20120158405
    Abstract: A speech recognition device (1) processes speech data (SD) of a dictation and establishes recognized text information (ETI) and link information (LI) of the dictation. In a synchronous playback mode of the speech recognition device (1), during acoustic playback of the dictation a correction device (10) synchronously marks the word of the recognized text information (ETI) that the link information (LI) relates to the speech data (SD) just played back; the currently marked word indicates the position of an audio cursor (AC). When a user of the speech recognition device (1) notices an incorrect word, he positions a text cursor (TC) at the incorrect word and corrects it. Cursor synchronization means (15) make it possible to synchronize the text cursor (TC) with the audio cursor (AC), or the audio cursor (AC) with the text cursor (TC), so that positioning the respective cursor (AC, TC) is simplified considerably.
    Type: Application
    Filed: February 13, 2012
    Publication date: June 21, 2012
    Applicant: Nuance Communications Austria GmbH
    Inventor: Wolfgang Gschwendtner
  • Publication number: 20120116778
    Abstract: A system and method are disclosed that use screen-reader-like functionality to speak information presented on a graphical user interface displayed by a media presentation system, including information that is not navigable by a remote control device. Information can be spoken in an order that follows its relative importance, based on a characteristic of the information or its location within the graphical user interface. A history of previously spoken information is monitored to avoid speaking information more than once for a given graphical user interface. A different pitch can be used to speak information based on a characteristic of the information. Information that is not navigable by the remote control device can be spoken after a time delay. Voice prompts can be provided for a remote-driven virtual keyboard displayed by the media presentation system. The voice prompts can be spoken with different voice pitches.
    Type: Application
    Filed: November 4, 2010
    Publication date: May 10, 2012
    Applicant: APPLE INC.
    Inventors: Christopher B. Fleizach, Reginald Dean Hudson, Eric Taylor Seymour
  • Publication number: 20120078628
    Abstract: The head-mounted text display system for the hearing impaired is a speech-to-text system, in which spoken words are converted into a visual textual display and displayed to the user in passages containing a selected number of words. The system includes a head-mounted visual display, such as eyeglass-type dual liquid crystal displays or the like, and a controller. The controller includes an audio receiver, such as a microphone or the like, for receiving spoken language and converting the spoken language into electrical signals. The controller further includes a speech-to-text module for converting the electrical signals representative of the spoken language to a textual data signal representative of individual words. A transmitter associated with the controller transmits the textual data signal to a receiver associated with the head-mounted display. The textual data is then displayed to the user in passages containing a selected number of individual words.
    Type: Application
    Filed: September 28, 2010
    Publication date: March 29, 2012
    Inventor: MAHMOUD M. GHULMAN
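The final step, displaying the text in passages of a selected number of words, reduces to simple chunking. A minimal sketch, assuming the recognized words arrive as a list:

```python
def passages(words, size):
    """Group recognized words into passages of at most `size` words
    for delivery to the head-mounted display."""
    for i in range(0, len(words), size):
        yield " ".join(words[i:i + size])
```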
  • Publication number: 20120029907
    Abstract: A digital pen designed to assist users in spelling words as they write. The invention is an electronic pen with a speaker located near the top of the device. A microphone may be located directly under the speaker in the form of a small screened concave or convex aperture. A switch on the back of the pen allows the user to choose between three settings: Medical Dictionary (D), Off (O), and Prescription Drug List (P). The device works by the user speaking the desired word into the microphone. The word then appears on the digital display screen, which lights up. The pen asks the user to confirm or deny the displayed word, and the user says “yes” or “no” into the microphone. If denied, the pen displays another word until the correct word is located. Once confirmed, the pen audibly and visibly spells the word one letter at a time as the user writes. The pen may be switched to the prescription drug list mode as needed.
    Type: Application
    Filed: December 30, 2010
    Publication date: February 2, 2012
    Inventors: Angela Loggins, Tamara S. Loggins
  • Publication number: 20120016666
    Abstract: According to one embodiment, an AV device comprises a receiving section, a processing section, a storage section, and a control section. The receiving section receives a digital voice signal. The processing section applies a predetermined signal processing operation to the digital voice signal received by the receiving section. The storage section stores information indicating the time required for the signal processing operation at the processing section; when the voice has been set to a mute state, this stored time information is rewritten to a value that cannot normally occur. The control section outputs the information stored in the storage section upon an external request. Other embodiments are also described.
    Type: Application
    Filed: September 23, 2011
    Publication date: January 19, 2012
    Inventors: Takanobu Mukaide, Masahiko Mawatari
  • Publication number: 20110304774
    Abstract: Embodiments are disclosed that relate to the automatic tagging of recorded content. For example, one disclosed embodiment provides a computing device comprising a processor and memory having instructions executable by the processor to receive input data comprising one or more of a depth data, video data, and directional audio data, identify a content-based input signal in the input data, and apply one or more filters to the input signal to determine whether the input signal comprises a recognized input. Further, if the input signal comprises a recognized input, then the instructions are executable to tag the input data with the contextual tag associated with the recognized input and record the contextual tag with the input data.
    Type: Application
    Filed: June 11, 2010
    Publication date: December 15, 2011
    Applicant: MICROSOFT CORPORATION
    Inventors: Stephen Latta, Christopher Vuchetich, Matthew Eric Haigh, JR., Andrew Robert Campbell, Darren Bennett, Relja Markovic, Oscar Omar Garza Santos, Kevin Geisner, Kudo Tsunoda
  • Publication number: 20110301937
    Abstract: The present invention provides an electronic reading device. At the device, a voice is captured by a capturing unit, and then the reference information stored in a storing unit is received by a processing unit for converting the voice to a visual image signal based on the reference information. Afterwards, the visual image corresponding to the visual image signal is shown on a display unit. Therefore, the device provides the function of speech recognition anywhere and anytime and is suitable for prolonged use due to the features of power saving and easy reading.
    Type: Application
    Filed: February 24, 2011
    Publication date: December 8, 2011
    Applicant: E INK HOLDINGS INC.
    Inventors: TZU-MING WANG, KAI-CHENG CHUANG
  • Publication number: 20110300840
    Abstract: A mobile or in-vehicle communication system and method facilitate communication among groups. The system and method also facilitate the creation of such groups. The system and method may convert speech from one member of the group to text for distribution to other members of the group, for whom the text is converted to audible speech.
    Type: Application
    Filed: June 7, 2011
    Publication date: December 8, 2011
    Inventor: Otman A. Basir
  • Publication number: 20110237301
    Abstract: Various methods and systems are provided that allow a user to perform a free-form action, such as making a mark on a device, speaking into a device, and/or moving the device, to cause a step to be performed that conventionally was performed by the user having to locate and select a button or link on the device.
    Type: Application
    Filed: March 23, 2010
    Publication date: September 29, 2011
    Applicant: eBay INC.
    Inventors: AMOL BHASKER PATEL, SURAJ SATHEESAN MENON
  • Publication number: 20110231194
    Abstract: In an embodiment, a method of interactive speech preparation is disclosed. The method may include or comprise displaying an interactive speech application on a display device, wherein the interactive speech application has a text display window. The method may also include or comprise accessing text stored in an external storage device over a communication network, and displaying the text within the text display window while capturing video and audio data with video and audio data capturing devices, respectively.
    Type: Application
    Filed: December 16, 2010
    Publication date: September 22, 2011
    Inventor: Steven Lewis
  • Publication number: 20110205149
    Abstract: A system and method for providing voice prompts that identify task selections from a list of task selections in a vehicle, where the user employs an input device, such as a scroll wheel, to activate a particular task and where the speed of the voice prompt increases and decreases depending on how fast the user rotates the scroll wheel.
    Type: Application
    Filed: February 24, 2010
    Publication date: August 25, 2011
    Applicant: GM GLOBAL TECHNOLOGY OPERATIONS, INC.
    Inventor: Alfred C. Tom
  • Publication number: 20110208524
    Abstract: This is directed to processing voice inputs received by an electronic device. In particular, this is directed to receiving a voice input and identifying the user providing the voice input. The voice input can be processed using a subset of words from a library used to identify the words or phrases of the voice input. The particular subset can be selected such that voice inputs provided by the user are more likely to include words from the subset. The subset of the library can be selected using any suitable approach, including for example based on the user's interests and words that relate to those interests. For example, the subset can include one or more words related to media items selected by the user for storage on the electronic device, names of the user's contacts, applications or processes used by the user, or any other words relating to the user's interactions with the device.
    Type: Application
    Filed: February 25, 2010
    Publication date: August 25, 2011
    Applicant: Apple Inc.
    Inventor: Allen P. Haughay
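The subset-selection idea can be sketched as follows; the library contents, core command words, and matching step are all illustrative stand-ins, not a real recognizer:

```python
# Hypothetical full recognition library.
FULL_LIBRARY = {"call", "play", "alice", "bob", "jazz", "quantum", "photosynthesis"}

def build_user_subset(contacts, media_titles, app_names):
    """Select the library subset a given user is likely to say:
    assumed core commands plus words drawn from the user's own data."""
    subset = {"call", "play"}  # assumed core command words
    subset |= {w.lower() for w in contacts + media_titles + app_names}
    return subset & FULL_LIBRARY  # keep only words the library knows

def match_words(utterance, subset):
    # Trivial stand-in for recognition: keep words found in the subset.
    return [w for w in utterance.lower().split() if w in subset]
```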
  • Publication number: 20110195659
    Abstract: A vehicle-based computing apparatus includes a computer processor in communication with persistent and non-persistent memory. The apparatus also includes a local wireless transceiver in communication with the computer processor and configured to communicate wirelessly with a wireless device located at the vehicle. The processor is operable to receive, through the wireless transceiver, a connection request sent from a nomadic wireless device, the connection request including at least a name of an application seeking to communicate with the processor. The processor is further operable to receive at least one secondary communication from the nomadic device, once the connection request has been processed. The secondary communication is at least one of a speak alert command, a display text command, a create phrase command, and a prompt and listen command.
    Type: Application
    Filed: February 5, 2010
    Publication date: August 11, 2011
    Applicant: FORD GLOBAL TECHNOLOGIES, LLC
    Inventors: David P. Boll, Nello Joseph Santori, Joseph N. Ross, Mark Shaker, Micah J. Kaiser, Brian Woogeun Joh, Mark Schunder
  • Publication number: 20110184740
    Abstract: A method, computer program product, and system are provided for performing a voice command on a client device. The method can include translating, using a first speech recognizer located on the client device, an audio stream of a voice command to a first machine-readable voice command and generating a first query result using the first machine-readable voice command to query a client database. In addition, the audio stream can be transmitted to a remote server device that translates the audio stream to a second machine-readable voice command using a second speech recognizer. Further, the method can include receiving a second query result from the remote server device, where the second query result is generated by the remote server device using the second machine-readable voice command and displaying the first query result and the second query result on the client device.
    Type: Application
    Filed: June 7, 2010
    Publication date: July 28, 2011
    Applicant: Google Inc.
    Inventors: Alexander GRUENSTEIN, William J. Byrne
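The two-path flow (a fast local recognizer queries the client database while the same audio goes to a server-side recognizer, and both query results are displayed) might be sketched like this; every function below is a stand-in:

```python
def local_recognize(audio):
    # Stand-in for the first, on-device speech recognizer.
    return "call bob"

def remote_recognize(audio):
    # Stand-in for the second, server-side speech recognizer.
    return "call bob mobile"

def query(database, command):
    # Stand-in query: rows whose text contains the command.
    return [row for row in database if command in row]

def handle_voice_command(audio):
    client_db = ["call bob", "call bob mobile"]
    first_result = query(client_db, local_recognize(audio))    # fast, local
    second_result = query(client_db, remote_recognize(audio))  # arrives later
    # Both query results are displayed on the client device.
    return {"local": first_result, "remote": second_result}
```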
  • Publication number: 20110125495
    Abstract: Disclosed are a quantizer, an encoder, and methods thereof that reduce the computational load of quantizing values related to the transform coefficients of a principal component analysis transform when such a transform is applied to stereo coding.
    Type: Application
    Filed: June 18, 2009
    Publication date: May 26, 2011
    Applicant: PANASONIC CORPORATION
    Inventors: Toshiyuki Morii, Hiroyuki Ehara, Koji Yoshida
  • Publication number: 20110098544
    Abstract: There is provided a system and method for integrating voice with a medical device. More specifically, in one embodiment, there is provided a medical device comprising a speech recognition system configured to receive a processed voice, compare the processed voice to a speech database, identify a command for the medical device corresponding to the processed voice based on the comparison, and execute the identified medical device command.
    Type: Application
    Filed: December 30, 2010
    Publication date: April 28, 2011
    Applicant: NELLCOR PURITAN BENNETT LLC
    Inventors: Jayesh Shah, Scott Amundson
  • Publication number: 20110099020
    Abstract: A method for dynamically arranging DSP tasks. The method comprises receiving an audio bit stream; checking the remaining execution time as the DSP transforms the audio information into spectral information; simplifying the transformation step when the DSP detects that the remaining execution time is shorter than a predetermined interval; and skipping one section of the audio information and decoding the remaining section when the execution time is less than a predetermined interval.
    Type: Application
    Filed: January 4, 2011
    Publication date: April 28, 2011
    Applicant: MEDIATEK INC.
    Inventors: Chih-Chiang Chuang, Pei-Yun Kuo
  • Publication number: 20110093274
    Abstract: Disclosed are an apparatus and method of manufacturing an article using sound, which render sound waveforms of living things (including the human voice) in various shapes and manufacture articles corresponding to those shapes. The apparatus generates a sampling waveform based on the sound waveform. Next, the sampling waveform is converted into a two-dimensional image file, and the two-dimensional image is in turn converted into a three-dimensional image file. Thereafter, an article is manufactured based on the two-dimensional or three-dimensional image file. Because the article is manufactured from the sampling waveform generated by sampling the sound waveform, a simplified article is obtained.
    Type: Application
    Filed: May 16, 2008
    Publication date: April 21, 2011
    Inventor: Kwanyoung Lee
  • Publication number: 20110087493
    Abstract: The invention relates to a communication system having a display unit (2) and a virtual being (3) that can be visually represented on the display unit (2) and that is designed for communication by means of natural speech with a natural person. At least one interaction symbol (6, 7) can be represented on the display unit (2), by means of which the natural speech dialog between the virtual being (3) and the natural person is supported such that an achieved dialog state can be indicated and/or additional information depending on the achieved dialog state and/or information can be redundantly invoked. The invention further relates to a method for representing information of a communication between a virtual being and a natural person.
    Type: Application
    Filed: May 15, 2009
    Publication date: April 14, 2011
    Inventors: Stefan Sellschopp, Valentin Nicolescu, Helmut Krcmar
  • Publication number: 20110054885
    Abstract: For a bandwidth extension of an audio signal, a signal spreader temporally spreads the audio signal by a spread factor greater than 1. The temporally spread audio signal is then supplied to a decimator, which decimates the temporally spread version by a decimation factor matched to the spread factor. The band generated by this decimation operation is extracted and distorted, and finally combined with the audio signal to obtain a bandwidth-extended audio signal. A phase vocoder in a filterbank or transform implementation may be used for the signal spreading.
    Type: Application
    Filed: January 20, 2009
    Publication date: March 3, 2011
    Inventors: Frederik Nagel, Sascha Disch, Max Neuendorf
  • Publication number: 20110043832
    Abstract: A printed audio format includes a printed encoding of an audio signal, and a plurality of spaced-apart and parallel rails. The printed encoding of the audio signal is located between the plurality of rails and each rail comprises at least one marker. The printed encoding comprises a first portion and a second portion, each portion comprises a plurality of code frames, and each frame represents a time segment of an audio signal. The first portion encodes a first time period of the audio signal and the second portion encodes a second time period of the audio signal. The second portion is encoded in reverse order with respect to the first portion so that the joining part is on the same end of both portions.
    Type: Application
    Filed: October 29, 2010
    Publication date: February 24, 2011
    Applicant: Creative Technology Ltd
    Inventors: Wong Hoo Sim, Desmond Toh Onn Hii, Tur We Chan, Chin Fang Lim, Willie Png, Morgun Phay
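The reversed second portion can be illustrated with a simple frame list; this is a minimal sketch of the layout idea only, not the actual printed code-frame format:

```python
def layout_frames(frames):
    """Split a time-ordered frame sequence into two portions, writing
    the second portion in reverse so the joining part falls at the
    same end of both portions."""
    mid = len(frames) // 2
    first = frames[:mid]          # first time period, forward order
    second = frames[mid:][::-1]   # second time period, reversed
    return first, second
```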
  • Publication number: 20110035212
    Abstract: In a method of perceptual transform coding of audio signals in a telecommunication system, performing the steps of determining transform coefficients representative of a time to frequency transformation of a time segmented input audio signal; determining a spectrum of perceptual sub-bands for said input audio signal based on said determined transform coefficients; determining masking thresholds for each said sub-band based on said determined spectrum; computing scale factors for each said sub-band based on said determined masking thresholds, and finally adapting said computed scale factors for each said sub-band to prevent energy loss for perceptually relevant sub-bands.
    Type: Application
    Filed: August 26, 2008
    Publication date: February 10, 2011
    Applicant: Telefonaktiebolaget L M Ericsson (publ)
    Inventors: Manuel Briand, Anisse Taleb
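The listed steps (band energies from transform coefficients, per-band masking thresholds, scale factors adapted with a floor) can be sketched numerically; the 12 dB offset and the floor value below are illustrative placeholders, not the patented rules:

```python
import math

def band_energies(coeffs, bands):
    """Sum of squared transform coefficients per sub-band."""
    return [sum(c * c for c in coeffs[lo:hi]) for lo, hi in bands]

def masking_thresholds(energies, offset_db=12.0):
    """Toy masking model: threshold a fixed number of dB below the
    band energy."""
    return [e / (10 ** (offset_db / 10)) for e in energies]

def scale_factors(thresholds, floor=1e-9):
    """Scale factor per band; the floor keeps every factor positive,
    preventing total energy loss in perceptually relevant bands."""
    return [math.sqrt(max(t, floor)) for t in thresholds]
```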
  • Publication number: 20110001878
    Abstract: A TV uses optical character recognition (OCR) to extract text from a TV image and/or voice recognition to extract text from the TV audio and if a geographic place name is recognized, displays a relevant map in a picture-in-picture window on the TV. The user may be given the option of turning the map feature on and off, defining how long the map is displayed, and defining the scale of the map to be displayed.
    Type: Application
    Filed: July 2, 2009
    Publication date: January 6, 2011
    Inventors: Libiao Jiang, Yang Yu
  • Publication number: 20110004477
    Abstract: A method, a system and a computer program product for using speech/voice recognition technology to update digital video recorder (DVR) program recording patterns, based on program viewer/listener feedback. A speech controlled pattern modification (SCPM) utility utilizes a DVR recording sub-system integrated with speech processing functionality to compare control phrases with phrases uttered by a viewer. If a control phrase matches a phrase uttered by the viewer, the SCPM utility modifies the DVR recording patterns, according to a set of pre-programmed governing rules. For example, the SCPM utility may avoid modifying the recording patterns for programs within a list of “favorite” programs but may modify the recording patterns for programs excluded from the list. The SCPM utility determines priority of the uttered phrases by identifying users and retrieving a preset priority level of the identified users. The priority level is then used to control changes to the recording patterns.
    Type: Application
    Filed: July 2, 2009
    Publication date: January 6, 2011
    Applicant: International Business Machines Corporation
    Inventors: Ravi P. Bansal, Mike V. Macias, Saidas T. Kottawar, Salil P. Gandhi, Sandip D. Mahajan
  • Publication number: 20100333163
    Abstract: Various embodiments facilitate voice control of a receiving device, such as a set-top box. In one embodiment, a voice enabled media presentation system (“VEMPS”) includes a receiving device and a remote-control device having an audio input device. The VEMPS is configured to obtain audio data via the audio input device, the audio data received from a user and representing a spoken command to control the receiving device. The VEMPS is further configured to determine the spoken command by performing speech recognition on the obtained audio data, and to control the receiving device based on the determined command. This abstract is provided to comply with rules requiring an abstract, and it is submitted with the intention that it will not be used to interpret or limit the scope or meaning of the claims.
    Type: Application
    Filed: June 25, 2009
    Publication date: December 30, 2010
    Applicant: ECHOSTAR TECHNOLOGIES L.L.C.
    Inventor: Curtis N. Daly
  • Publication number: 20100312559
    Abstract: A method of playing pictures comprises the steps of: receiving (11) a voice message; extracting (12) a key feature from the voice message; selecting (13) pictures by matching the key feature with pre-stored picture information; generating (14) a picture-voice sequence by integrating the selected pictures and the voice message; and playing (15) the picture-voice sequence. An electronic apparatus comprises a processing unit for implementing the different steps of the method.
    Type: Application
    Filed: December 11, 2008
    Publication date: December 9, 2010
    Applicant: Koninklijke Philips Electronics N.V.
    Inventors: Sheng Jin, Xin Chen, Yang Peng, Ningjiang Chen, Yunji Xia
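One way the matching step in the abstract above might work is to extract keywords from a (transcribed) voice message and select pictures whose stored tags overlap. The tag names, picture names, and the transcription step are illustrative assumptions, not details from the publication:

```python
# Hypothetical pre-stored picture information: each picture carries a tag set.
PICTURE_INFO = {
    "beach.jpg": {"beach", "sea", "summer"},
    "party.jpg": {"birthday", "cake", "friends"},
}

def select_pictures(transcript: str):
    """Select pictures whose tags intersect the words of the voice message."""
    words = set(transcript.lower().split())
    return [name for name, tags in PICTURE_INFO.items() if words & tags]
```

The selected pictures would then be interleaved with the voice message to form the picture-voice sequence that the method plays back.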
  • Publication number: 20100250253
    Abstract: A speech-directed user interface system includes at least one speaker for delivering an audio signal to a user and at least one microphone for capturing speech utterances of a user. An interface device interfaces with the speaker and microphone and provides a plurality of audio signals to the speaker to be heard by the user. A control circuit is operably coupled with the interface device and is configured for selecting at least one of the plurality of audio signals as a foreground audio signal for delivery to the user through the speaker. The control circuit is operable for recognizing speech utterances of a user and using the recognized speech utterances to control the selection of the foreground audio signal.
    Type: Application
    Filed: March 27, 2009
    Publication date: September 30, 2010
    Inventor: Yangmin Shen
  • Publication number: 20100211397
    Abstract: An avatar facial expression representation technology is provided. The technology estimates changes in emotion and emphasis in a user's voice from vocal information, and changes in the user's mouth shape from pronunciation information of the voice. It also tracks the user's facial movements and changes in facial expression from image information, and may represent avatar facial expressions based on the results of these operations. Accordingly, avatar facial expressions can be obtained that are similar to the actual facial expressions of the user.
    Type: Application
    Filed: January 28, 2010
    Publication date: August 19, 2010
    Inventors: Chi-youn PARK, Young-Kyoo HWANG, Jung-bae KIM
  • Publication number: 20100198583
    Abstract: The present invention relates to an indicating method for a speech recognition system comprising a multimedia electronic product and a speech recognition device. In this method, a user enters voice commands through a voice input unit, which converts them into speech signals; the signals are acquired and stored by a recording unit, converted by a microprocessor into a volume-indicating oscillogram, and then displayed by a display module. Compliance with the speech recognition conditions is determined during this process.
    Type: Application
    Filed: February 4, 2009
    Publication date: August 5, 2010
    Applicant: AIBELIVE CO., LTD.
    Inventors: Chen-Wei Su, Chun-Ping Fang, Min-Ching Wu
  • Publication number: 20100145710
    Abstract: A method for developing a voice user interface for a statistical semantic system is described. A set of semantic meanings is defined that reflect semantic classification of a user input dialog. Then, a set of speech dialog prompts is automatically developed from an annotated transcription corpus for directing user inputs to corresponding final semantic meanings. The statistical semantic system may be a call routing application where the semantic meanings are call routing destinations.
    Type: Application
    Filed: December 8, 2008
    Publication date: June 10, 2010
    Applicant: NUANCE COMMUNICATIONS, INC.
    Inventor: Real Tremblay
  • Publication number: 20100063822
    Abstract: A communication system that is specifically designed for the needs of speech impaired individuals, particularly aphasia victims, makes use of a speech generating mobile terminal communication device (SGMTD) (12) that is designed to be hand held and operated by a speech disabled individual. The SGMTD includes a database of audio files that are accessed to generate full sentences in response to single word or short phrase entries selected from a plurality of menus by the disabled user. A second, companion mobile terminal device (COMTD) (14) enables a caregiver to communicate with the speech disabled individual's SGMTD to assist the individual in communicating with the caregiver by causing the SGMTD to switch to a particular menu or list from which the caregiver wants the disabled individual to make a selection. The SGMTD also includes software that enables the device to communicate with other SGMTDs via wireless communications and thereby simulate a verbal conversation between speech impaired individuals.
    Type: Application
    Filed: April 21, 2008
    Publication date: March 11, 2010
    Inventors: Daniel C. O'Brien, Edward T. Buchholz
  • Publication number: 20100057466
    Abstract: A method and communication device disclosed includes displaying a video on a display, converting voice audio data to textual data by applying voice-to-text conversion, and displaying the textual data as scrolling text displayed along with the video on the display and either above, below or across the video. The method may further include receiving a voice call indication from a network, providing the voice call indication to a user interface where the voice call indication corresponds to an incoming voice call; and receiving a user input for receiving the voice call and displaying the voice call as scrolling text. In another embodiment, a method includes displaying application related data on a display; converting voice audio data to textual data by applying voice-to-text conversion; converting the textual data to a video format; and displaying the textual data as scrolling text over the application related data on said display.
    Type: Application
    Filed: September 17, 2008
    Publication date: March 4, 2010
    Applicant: ATI Technologies ULC
    Inventors: Dinesh Kumar Garg, Manish Poddar
  • Publication number: 20100049528
    Abstract: A method for providing an audible prompt to a user within a vehicle. The method includes retrieving one or more data files from a memory device. The data files define certain characteristics of an audio prompt. The method also includes creating the audio prompt from the data files and outputting the audio prompt as an audio signal.
    Type: Application
    Filed: January 4, 2008
    Publication date: February 25, 2010
    Inventors: Mark Zeinstra, Richard J. Chutorash, Jeffrey Golden, Jon M. Skekloff
  • Publication number: 20100049512
    Abstract: Disclosed are an encoding device and related methods capable of suppressing quantization distortion while limiting the increase in bit rate when encoding audio or the like. In the device, a dynamic range calculation unit (12) calculates the dynamic range of an input spectrum as an index indicating a peak of the input spectrum, a pulse quantity decision unit (13) decides the number of pulses of a vector candidate output from a shape codebook (14), and the shape codebook (14), under control of the search unit (17), outputs a vector candidate having the number of pulses decided by the pulse quantity decision unit (13), using vector candidate elements {−1, 0, +1}.
    Type: Application
    Filed: December 14, 2007
    Publication date: February 25, 2010
    Applicant: PANASONIC CORPORATION
    Inventors: Masahiro Oshikiri, Tomofumi Yamanashi
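A heavily simplified sketch of the idea in the abstract above: measure the dynamic range of an input spectrum and use it to decide how many nonzero pulses (each drawn from {−1, 0, +1}) a shape-codebook vector may carry. The dB measure and the pulse-count rule here are illustrative assumptions, not the patent's actual quantization scheme:

```python
import math

def dynamic_range_db(spectrum):
    """Dynamic range of a spectrum in dB: ratio of largest to smallest magnitude."""
    mags = [abs(x) for x in spectrum if abs(x) > 0]
    return 20 * math.log10(max(mags) / min(mags))

def decide_pulse_count(dr_db, max_pulses=8):
    # A peaky spectrum (large dynamic range) can be approximated with few
    # pulses; a flatter one needs more. Clamp to at least one pulse.
    return max(1, max_pulses - int(dr_db // 10))
```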
  • Publication number: 20100049502
    Abstract: Methods and systems for performing user input recognition are disclosed. A digital directory comprising listings is accessed, and metadata information describing the individual listings is associated with them. The metadata information is modified to generate transformed metadata information as a function of context information relating to a typical user interaction with the listings. Information for aiding an automated user input recognition process is then generated based on the transformed metadata information.
    Type: Application
    Filed: November 2, 2009
    Publication date: February 25, 2010
    Applicant: Microsoft Corporation
    Inventors: Kyle Oppenheim, David Mitby, Nick Kibre
  • Publication number: 20090313012
    Abstract: A teleconference terminal apparatus (200) including: an input unit (201) which receives a speech signal; an analyzing unit (202) which calculates a target size on a predetermined segment basis of a speech signal; a coding unit (203) which codes the speech signal to generate a data stream, so that the coded data size on a predetermined segment basis becomes the target size corresponding to each of the predetermined segments; a stream transmitting unit (204) which transmits the generated data stream to a network; a receiving unit (205) which receives the data stream transmitted from another terminal apparatus; a filtering unit (206) which determines, on the basis of the data size of each predetermined segment in the received data stream, whether or not segment data included in the data stream is to be decoded; a decoding unit (207) which decodes segment data determined to be decoded to generate a speech signal; and an output unit (209) which outputs the speech signal generated by the decoding unit.
    Type: Application
    Filed: October 24, 2008
    Publication date: December 17, 2009
    Inventor: Kojiro Ono
  • Publication number: 20090313011
    Abstract: A method for identifying a frame type is disclosed. The method includes receiving current frame type information, obtaining previously received frame type information, generating frame identification information of a current frame using the current and previous frame type information, and identifying the current frame using the frame identification information. Also disclosed is a method that includes receiving a backward type bit corresponding to the current frame type information, obtaining a forward type bit corresponding to the previous frame type information, and generating frame identification information of the current frame by placing the backward type bit at a first position and the forward type bit at a second position.
    Type: Application
    Filed: May 8, 2009
    Publication date: December 17, 2009
    Applicant: LG Electronics INC.
    Inventors: Sang Bae CHON, Lae Hoon Kim, Koeng Mo Sung
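The bit-placement step in the abstract above can be sketched as a two-bit identifier, with the backward type bit in the first (high) position and the forward type bit in the second (low) position. The function name and bit widths are assumptions; the actual bitstream layout is defined by the claims:

```python
def frame_id(backward_type_bit: int, forward_type_bit: int) -> int:
    """Combine the current frame's backward type bit (first position) with
    the previous frame's forward type bit (second position) into one ID."""
    return (backward_type_bit << 1) | forward_type_bit
```

A decoder could then distinguish the four possible combinations (0 through 3) with a single comparison, without re-parsing either frame.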
  • Publication number: 20090306978
    Abstract: A method of encoding and decoding languages for international communication. A set of core words may be encoded, although the full vocabulary of the language might also be covered. The result is particularly suitable for use by people in relation to the keypad of a mobile phone, but may also be implemented in translation or communication software to create a language database for example. The encoding includes assigning digital symbols to selected words in the language, assigning alphanumeric representations to the digital symbols, and assigning pronounceable elements to the alphanumeric representations.
    Type: Application
    Filed: November 2, 2006
    Publication date: December 10, 2009
    Applicant: LISTED VENTURES PTY LTD
    Inventor: Robert Andrew McMahon McNeilly
  • Publication number: 20090281810
    Abstract: A method of visually presenting audio signals includes receiving an audio signal to be presented; generating a predetermined number of discrete frequency components from the audio signal; assigning a graphical object to each of the frequency components, each graphical object being specified by a geometrical shape, position information and size information; and displaying all of the graphical objects associated with all of the frequency components simultaneously on a graphic display. The system includes a microphone for generating audio signals; an audio interface unit for sampling the audio signals and transforming them into digital signals; a processing unit for translating the digital signals into a predetermined number of discrete frequency components and assigning a graphical object to each; a video interface unit for generating a video signal; and a graphic display for displaying a sonogram based on the video signal.
    Type: Application
    Filed: June 25, 2007
    Publication date: November 12, 2009
    Applicant: Ave-Fon Kft.
    Inventors: Istvan Sziklai, Istvan Hazman, Jozsef Imrek
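Read as a signal-processing pipeline, the abstract above maps naturally onto an FFT followed by band grouping. The sketch below uses NumPy's real FFT to derive a fixed number of discrete frequency components and assigns each one a simple graphical object (shape, position, size); all names and the band-averaging step are illustrative assumptions, not details from the publication:

```python
import numpy as np

def frequency_components(samples, n_components):
    """Split the magnitude spectrum into n_components bands and average each."""
    spectrum = np.abs(np.fft.rfft(samples))
    return [float(band.mean()) for band in np.array_split(spectrum, n_components)]

def to_graphical_objects(components):
    """Each component becomes a circle: x encodes the band, size the magnitude."""
    return [{"shape": "circle", "x": i, "size": mag}
            for i, mag in enumerate(components)]

# Example: a pure 440 Hz tone sampled at 8 kHz for one second.
t = np.arange(8000) / 8000.0
comps = frequency_components(np.sin(2 * np.pi * 440 * t), 16)
objs = to_graphical_objects(comps)  # 16 objects; the band containing 440 Hz dominates
```

Rendering all sixteen objects at once, frame after frame, yields the simultaneous sonogram-style display the claims describe.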