Speech Recognition (epo) Patents (Class 704/E15.001)

E Subclasses

Assessment or evaluation of speech recognition systems (epo) (Class 704/E15.002)

Language recognition (epo) (Class 704/E15.003)

Feature extraction for speech recognition; selection of recognition unit (epo) (Class 704/E15.004)

Segmentation or word limit detection (epo) (Class 704/E15.005)

Word boundary detection (epo) (Class 704/E15.006)

Creation of reference templates; training of speech recognition systems, e.g., adaption to the characteristics of the speaker's voice, etc. (epo) (Class 704/E15.007)

Speech classification or search (epo) (Class 704/E15.014)

Speech recognition techniques for robustness in adverse environments, e.g., in noise, of stress induced speech, etc. (epo) (Class 704/E15.039)

Procedures used during a speech recognition process, e.g., man-machine dialogue, etc. (epo) (Class 704/E15.04)

Speech recognition using nonacoustical features, e.g., position of the lips, etc. (epo) (Class 704/E15.041)

Using position of the lips, movement of the lips, or face analysis (epo) (Class 704/E15.042)

Speech to text systems (epo) (Class 704/E15.043)

Constructional details of speech recognition systems (epo) (Class 704/E15.046)

Device and Method for Superimposing Patterns on Images in Real-Time, Particularly for Guiding by Localisation

Publication number: 20110060347

Abstract: The invention relates to a device for superimposing known patterns, characteristic of a region, on (real) images of said region. The device comprises a memory in which patterns are stored, which are representative of a selected region, of known position and orientation with relation to a common reference, and processing means for determining a pattern representative of the selected portion in the memory, on receipt of the designation of at least one portion of an observed image of the selected region, taken at a selected angle, and at least one representative attribute of said region, taking account of the attribute selected, then superimposing the determined pattern on the selected portion of the image taking account of the selected angle.

Type: Application

Filed: October 20, 2010

Publication date: March 10, 2011

Applicant: Intuitive Surgical Operations, Inc.

Inventors: Eve Coste-Maniere, Thierry Vieville, Fabien Mourgues

Personal calendar Updates and hand held communication devices

Publication number: 20110054788

Abstract: Methods and systems are provided for updating calendar and appointments based on a conversation on the hand held device. The software application and/or device shall sense the end of the call and prompt the user to decide if the call resulted in an appointment. If he answers in the affirmative, he is taken to the calendar application within the device to update his calendar with the appointment.

Type: Application

Filed: September 1, 2009

Publication date: March 3, 2011

Inventor: Vic Iyer

ELECTRONIC SHOPPING ASSISTANT WITH SUBVOCAL CAPABILITY

Publication number: 20110054904

Abstract: A mobile device suitable for use by a user in a store includes a subvocal message (SVM) module to detect an SVM from the user. The SVM includes data that indicates an item in the store. A transmitter transmits a request after detecting the SVM. The request includes information indicating the item. A receiver receives a reply. The reply includes information responsive to the request. An output device provides the responsive information to the user. The request may include a request for item position information, item price information, or item inventory information. The mobile device may detect the SVM via a subvocal sensor coupled to the user. The subvocal sensor may be in contact with the user in proximity to a vocal cord of the user. The subvocal sensor may be connected to the mobile device wirelessly or via a wire.

Type: Application

Filed: August 28, 2009

Publication date: March 3, 2011

Applicant: STERLING COMMERCE, INC.

Inventor: Charles Stanley Fenton

APPARATUS AND METHOD FOR AUDIO MAPPING

Publication number: 20110054890

Abstract: A mobile phone, and corresponding method, which is arranged to detect sounds of different types and to indicate to a user the direction from which those sounds are coming. The mobile phone includes a microphone for recording sound and a display for providing feedback to the user. The phone also includes a sound mapping program which is arranged to interpret the sound recorded by the microphone and to provide an audio map of detected sounds. This is presented to the user on the display.

Type: Application

Filed: August 25, 2009

Publication date: March 3, 2011

Applicant: Nokia Corporation

Inventors: Aarne Vesa Pekka Ketola, Panu Marten Jesper Johansson

GESTURE-BASED INFORMATION AND COMMAND ENTRY FOR MOTOR VEHICLE

Publication number: 20110050589

Abstract: A method of receiving input from a user includes providing a surface within reach of a hand of the user. A plurality of locations on the surface that are touched by the user are sensed. An alphanumeric character having a shape most similar to the plurality of touched locations on the surface is determined. The user is audibly or visually informed of the alphanumeric character and/or a word in which the alphanumeric character is included. Feedback is received from the user regarding whether the alphanumeric character and/or word is an alphanumeric character and/or word that the user intended to be determined in the determining step.

Type: Application

Filed: August 28, 2009

Publication date: March 3, 2011

Applicant: ROBERT BOSCH GMBH

Inventors: BAOSHI YAN, Fuliang Weng, Liu Ren, You-Chi Cheng, Zhongnan Shen

Voice activated finding device

Publication number: 20110050412

Abstract: Selected objects may be located by pushing a button on a keypad of a base unit, or by giving an oral command thereto. A receiver microprocessor, loaded with a unique electronic address, is attachable to each object. A base unit PROM is loaded with a library of digitized voice command templates. A user's command to find a lost object is received by a microphone, and digitized. The digitized command is compared with the templates using pattern recognition algorithms, which may utilize a Hidden Markov Model. When matched, the base unit processor causes radio transmission of RF interrogation packets targeted at the unique address corresponding to the lost object. A receiver chip detects the interrogation packets, and compares the transmitted unique address with the address stored in its microprocessor. Where matched, the microprocessor modulates a sounding device to direct the user to the lost object.

Type: Application

Filed: August 4, 2010

Publication date: March 3, 2011

Inventors: Cynthia Wittman, Gabe Neiser, Michael Keating

VOICE INTERACTIVE SERVICE SYSTEM AND METHOD FOR PROVIDING DIFFERENT SPEECH-BASED SERVICES

Publication number: 20110054905

Abstract: A voice interactive service system provides different speech-based services to a plurality of users. Using a communication terminal, the services are accessed via a telecommunication network through service-specific connectivity ports. The system comprises processing cores which have different configurations of speech processing resources for performing different services. For performing a requested service, a connection module establishes a connection between the respective connectivity port and a processing core having a configuration of speech processing resources suitable for performing the requested service. Because of the service-specific resourcing of cores, there is no need for requesting and allocating processing resources from external resource servers. Moreover, the port-dedicated resourcing of the cores ensures that a successful access to a connectivity port leads to a successful provision of the requested service.

Type: Application

Filed: August 26, 2010

Publication date: March 3, 2011

Applicant: me2me AG

Inventors: Roger LAGADEC, Patrik ESTERMANN, Luciano BUTERA

AUTOMATIC SOUND RECOGNITION BASED ON BINARY TIME FREQUENCY UNITS

Publication number: 20110046948

Abstract: The invention relates to a method of automatic sound recognition. The object of the present invention is to provide an alternative scheme for automatically recognizing sounds, e.g. human speech. The problem is solved by providing a training database comprising a number of models, each model representing a sound element in the form of a binary mask comprising binary time frequency (TF) units which indicate the energetic areas in time and frequency of the sound element in question, or of characteristic features or statistics extracted from the binary mask; providing an input signal comprising an input sound element; estimating the input sound element based on the models of the training database to provide an output sound element. The method has the advantage of being relatively simple and adaptable to the application in question. The invention may e.g. be used in devices comprising automatic sound recognition, e.g. for sound, e.g. voice control of a device, or in listening devices, e.g.

Type: Application

Filed: August 4, 2010

Publication date: February 24, 2011

Inventor: Michael Syskind PEDERSEN

DEVICE, METHOD AND SYSTEM FOR DETECTING UNWANTED CONVERSATIONAL MEDIA SESSION

Publication number: 20110046949

Abstract: Some embodiments of the invention relate to a method and a system for detecting unwanted conversational media session data. In accordance with one aspect of the invention, a method of detecting unwanted conversation media session data according to some embodiments of the invention may include calculating two or more progressive similarity scores each with respect to a different instant during a progress of a real-time conversational media session, wherein each of said scores is associated with a similarity between the conversational media session's media data that was available at the associated instant and a reference data item corresponding to media data of a previous conversational media session, and evaluating progressive similarity between the real-time conversational media session and the reference data item based upon the two or more progressive similarity scores.

Type: Application

Filed: November 2, 2010

Publication date: February 24, 2011

Applicant: Commtouch Software Ltd.

Inventors: Aharon Satt, Amir Lev

VOICE TRIGGERING CONTROL DEVICE AND METHOD THEREOF

Publication number: 20110046962

Abstract: A voice triggering control device for enabling a data collection host which is assembled on it comprises a processing unit, a speaker, a control module, a power supply module and a housing containing the elements disclosed above. The control device controls the processing unit to output a high-frequency audio signal which corresponds to an act command. The speaker then broadcasts a high-frequency audio generated from the high-frequency audio signal, and the data collection host is enabled to perform the act command upon receiving and decoding the high-frequency audio. Having the triggering control device enable the data collection host to perform a functional action by means of the high-frequency audio can thereby solve the contact fault problem in the prior art.

Type: Application

Filed: September 16, 2009

Publication date: February 24, 2011

Applicant: ASKEY COMPUTER CORP.

Inventors: Ting-Lin Chang, Ching-Feng Hsieh

TREND DISCOVERY IN AUDIO SIGNALS

Publication number: 20110044447

Abstract: Techniques for processing data representative of text associated with one or more content sources to generate a specification of a set of keyphrases of interest; processing a first set of audio signals collected during a first time period to generate first data characterizing putative occurrences of one or more keyphrases of the set in the first set of audio signals; evaluating the first data to generate keyphrase-specific comparison values for the first set of audio signals; deriving first trending data between the first set of audio signals and a second set of audio signals based in part on an analysis of the keyphrase-specific comparison values for the first set of audio signals relative to stored keyphrase-specific baseline values; and generating a visual representation of at least some of the first trending data and causing the visual representation of the first trending data to be presented on a display terminal.

Type: Application

Filed: August 21, 2009

Publication date: February 24, 2011

Applicant: Nexidia Inc.

Inventors: Robert W. Morris, Marsal Gavalda, Peter S. Cardillo, Jon A. Arrowood

TECHNIQUES FOR PERSONAL SECURITY VIA MOBILE DEVICES

Publication number: 20110039514

Abstract: Techniques for achieving personal security via mobile devices are presented. A portable mobile communication device, such as a phone or a personal digital assistant (PDA), is equipped with geographic positioning capabilities and is equipped with audio and visual devices. A panic mode of operation can be automatically detected in which real time audio and video for an environment surrounding the portable communication device are captured along with a geographic location for the portable communication device. This information is streamed over the Internet to a secure site where it can be viewed in real time and/or later inspected.

Type: Application

Filed: August 13, 2009

Publication date: February 17, 2011

Inventors: Sandeep Patnaik, Saheednanda Singh, Anilkumar Bolleni

Voice Control Device and Voice Control Method and Display Device

Publication number: 20110040563

Abstract: A voice control device for a display device includes a voice receiver for receiving a voice signal, a voice recognition unit coupled to the voice receiver for recognizing the voice signal to generate a recognition result, a function decision unit coupled to the voice recognition unit for selecting an operating function from a plurality of operating functions according to the recognition result, and an execution unit coupled to the function decision unit for controlling the display device to perform the operating function.

Type: Application

Filed: February 5, 2010

Publication date: February 17, 2011

Inventors: Xie-Ren Hsu, Kuang-Feng Sung

METHOD AND MEANS FOR DECODING BACKGROUND NOISE INFORMATION

Publication number: 20110040560

Abstract: A basic idea of the invention is to ascertain information on the course of the bit rate switching during an active speech phase. According to the invention, during the speech phase, information on the percentage proportion of broadband active speech frames in comparison to narrowband active speech frames is compiled on the part of the decoder. A high percentage proportion of broadband active speech frames indicates that a broadband use is preferred on the part of the codec and therefore a need exists for synthesizing noise information in broadband form during a DTX phase.

Type: Application

Filed: February 2, 2009

Publication date: February 17, 2011

Inventors: Panji Setiawan, Stefan Schandl, Herve Taddei
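
The decision described in this abstract can be illustrated with a minimal sketch: count the share of broadband frames among active speech frames and use it to choose the bandwidth of the noise synthesized during the DTX phase. The function name `choose_noise_bandwidth`, the `"wb"`/`"nb"` labels, the 0.5 threshold, and the narrowband fallback are all assumptions made for illustration; the abstract specifies none of them.

```python
def choose_noise_bandwidth(frame_modes, threshold=0.5):
    """Pick the bandwidth for comfort noise synthesized during a DTX phase.

    `frame_modes` is a hypothetical sequence of "wb"/"nb" labels, one per
    active speech frame seen by the decoder. `threshold` is an assumed
    cut-off; the abstract does not give a value.
    """
    active = [m for m in frame_modes if m in ("wb", "nb")]
    if not active:
        return "nb"  # assumed fallback when no active speech was observed
    wb_share = sum(1 for m in active if m == "wb") / len(active)
    return "wb" if wb_share >= threshold else "nb"
```

With two broadband frames out of three, the broadband share is about 0.67, so this sketch would select broadband noise synthesis.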

Monitoring An Audience Participation Distribution

Publication number: 20110035221

Abstract: Apparatus for monitoring an audience participation distribution at an event comprising a speech activity module operable to generate speech data representing speech detected at the event, a speaker identification module operable to determine, using the speech data, a first speaker who has contributed to the detected speech, and a processing unit operable to generate speaker data representing a value for the time that the first speaker has contributed to the detected speech and to output distribution data based on the speaker data representing a measure of the participation for the first speaker at the event.

Type: Application

Filed: August 7, 2009

Publication date: February 10, 2011

Inventors: Tong Zhang, Hui Chao, Xuemei Zhang
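
One plausible reading of the participation measure in this abstract is each speaker's share of the total detected speech time. The sketch below assumes the speaker identification step has already produced labeled time segments; the tuple format and the name `participation_distribution` are hypothetical and do not come from the application.

```python
from collections import defaultdict


def participation_distribution(segments):
    """Return each speaker's share of total detected speech time.

    `segments` is a hypothetical list of (speaker_id, start_sec, end_sec)
    tuples, standing in for the output of the speaker identification module.
    """
    totals = defaultdict(float)
    for speaker, start, end in segments:
        if end > start:  # ignore empty or malformed segments
            totals[speaker] += end - start
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {speaker: t / grand_total for speaker, t in totals.items()}
```

For example, a speaker with 40 seconds of speech in a 50-second total would be assigned a share of 0.8.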

Multimodal Teleconferencing

Publication number: 20110032845

Abstract: Multimodal teleconferencing including receiving, by a multimodal teleconferencing module, a speech utterance from one of a plurality of participants in the multimodal teleconference; identifying the participant making the speech utterance as a current speaker; retrieving, by the multimodal teleconferencing module from accounts for the current speaker, content for display to the current speaker; retrieving, by the multimodal teleconferencing module from accounts for the current speaker, content for display to one or more other participants in the multimodal teleconference; providing, by the multimodal teleconferencing module to a multimodal teleconferencing client for display to the current speaker, an identification of the speaker and the content retrieved for the speaker; and providing, by the multimodal teleconferencing module to one or more of multimodal teleconferencing clients for display to the other participants, an identification of the current speaker with the content retrieved for the one or more ot

Type: Application

Filed: August 5, 2009

Publication date: February 10, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Ciprian Agapi, William K. Bodin, Charles W. Cross, JR.

SYSTEM AND METHOD FOR ADDRESS RECOGNITION AND CORRECTION

Publication number: 20110035224

Abstract: A system, method, and computer-readable medium for parcel address recognition. A method includes receiving an address input and producing candidate address results corresponding to the address input. The method includes receiving operational scheme knowledge describing the mode of operation of a parcel processing system, and receiving at least one operational rule corresponding to the operational scheme knowledge. The method includes applying the at least one operational rule to the candidate address results and producing and storing a finalized result according to the operational rule and the candidate address results.

Type: Application

Filed: July 30, 2010

Publication date: February 10, 2011

Inventor: Stanley W. Sipe

METHOD, DEVICE AND SYSTEM FOR SPEECH RECOGNITION

Publication number: 20110035215

Abstract: Disclosed is a method and apparatus for signal processing and signal pattern recognition. According to some embodiments of the present invention, events in the signal to be processed/recognized may be used to pace or clock the operation of one or more processing elements. The detected events may be based on signal energy level measurements. The processing/recognition elements may be neuron models. The signal to be processed/recognized may be a speech signal.

Type: Application

Filed: August 28, 2008

Publication date: February 10, 2011

Inventors: Haim Sompolinsky, Robert Guetig

AUTOMATED COMMUNICATION INTEGRATOR

Publication number: 20110035220

Abstract: An apparatus includes a plurality of applications and an integrator having a voice recognition module configured to identify at least one voice command from a user. The integrator is configured to integrate information from a remote source into at least one of the plurality of applications based on the identified voice command. A method includes analyzing speech from a first user of a first mobile device having a plurality of applications, identifying a voice command based on the analyzed speech using a voice recognition module, and incorporating information from the remote source into at least one of a plurality of applications based on the identified voice command.

Type: Application

Filed: August 5, 2009

Publication date: February 10, 2011

Applicant: Verizon Patent and Licensing Inc.

Inventor: Robert E. Opaluch

Speech & Music Discriminator for Multi-Media Application

Publication number: 20110029308

Abstract: The present invention relates to means and methods of classifying speech and music signals in voice communication systems, devices, telephones, and methods, and more specifically, to systems, devices, and methods that automate control when either speech or music is detected over communication links. The present invention provides a novel system and method for monitoring the audio signal, analyzing selected audio signal components, comparing the results of analysis with a pre-determined threshold value, and classifying the audio signal either as speech or music.

Type: Application

Filed: June 10, 2010

Publication date: February 3, 2011

Inventors: Alon Konchitsky, Alberto D. Berstein, Sandeep Kulakcherla, William Martin Ribble, Kevin Fitzgerald, Don Seferovich

METHOD AND SYSTEM FOR AUTHENTICATING TELEPHONE CALLERS AND AVOIDING UNWANTED CALLS

Publication number: 20110026699

Abstract: A service that handles incoming telephone calls without bothering the telephone subscriber is disclosed. The service permits a call to go through to a subscriber if the service determines that the call is not unwanted and the caller has been authenticated. The authentication is based on challenging the caller to prove its identity rather than relying on caller ID displays. Prospective callers pre-register with the service providing caller account information. When a caller is issued a challenge, the caller may prove its authenticity by supplying the challenge back to the service along with its registered information.

Type: Application

Filed: July 30, 2009

Publication date: February 3, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Arnon Amir, Nimrod Megiddo

SYSTEM AND METHOD FOR MOBILE AUTOMATIC SPEECH RECOGNITION

Publication number: 20110029307

Abstract: A system and method of updating automatic speech recognition parameters on a mobile device are disclosed. The method comprises storing user account-specific adaptation data associated with ASR on a computing device associated with a wireless network, generating new ASR adaptation parameters based on transmitted information from the mobile device when a communication channel between the computing device and the mobile device becomes available and transmitting the new ASR adaptation data to the mobile device when a communication channel between the computing device and the mobile device becomes available. The new ASR adaptation data on the mobile device more accurately recognizes user utterances.

Type: Application

Filed: October 8, 2010

Publication date: February 3, 2011

Applicant: AT&T Intellectual Property II, L.P. via transfer from AT&T Corp.

Inventors: Sarangarajan Parthasarathy, Richard Cameron Rose

SPEECH DEVICE, SPEECH CONTROL PROGRAM, AND SPEECH CONTROL METHOD

Publication number: 20110022390

Abstract: In order to speak numerals in a manner readily comprehensible to a user, a speech device includes a voice synthesis portion 55 which, when a given character string includes a numeral made up of a plurality of digits, speaks the numeral in either a first speech method in which the numeral is read aloud as individual digits or a second speech method in which the numeral is read aloud as a full number, a user definition table 81, an association table 83, a region table 84, and a digit number table 87 which associate a type of a character string with either the first speech method or the second speech method, a process executing portion 53 which executes a process to thereby output data, and a speech control portion 51 which generates a character string on the basis of the output data and causes the voice synthesis portion 55 to speak the generated character string in one of the first and second speech methods that is associated with the type of the output data.

Type: Application

Filed: February 4, 2009

Publication date: January 27, 2011

Applicant: SANYO ELECTRIC CO., LTD.

Inventors: Kinya Otani, Naoki Hirose

METHODS AND SYSTEMS FOR SEARCHING AUDIO RECORDS

Publication number: 20110019805

Abstract: Methods and systems are provided for searching audio records. Certain embodiments of the invention may be applied to search audio records containing a user's voice for instances where a specific sound, such as a word or phrase, is vocalized by the user. An audio sample is provided by recording the user vocalizing the sound. The audio sample is compared with the audio records to locate matches to the audio sample. In some embodiments, the audio records comprise recordings of calls between a near-end caller and a far-end caller, and the audio sample is a recording of a sound spoken by the near-end caller. The same input device may be used to record both the audio sample and the audio records.

Type: Application

Filed: January 14, 2009

Publication date: January 27, 2011

Applicant: ALGO COMMUNICATION PRODUCTS LTD.

Inventor: Paul William Zoehner

WIND TURBINE CONTROL SYSTEM AND METHOD FOR INPUTTING COMMANDS TO A WIND TURBINE CONTROLLER

Publication number: 20110022384

Abstract: A method and a control system are provided for inputting commands to a wind turbine controller during a service or maintenance procedure. A command orally input by a user is transformed into an electrical signal representing the orally input command. The electrical signal is transformed into an input command signal which is further transformed into a reproduction signal. A user is provided the reproduction signal along with a confirmation request in a form recognized by a user, such as a visual or speech representation. After the user confirms the request, a signal based on the input command is sent to the wind turbine controller.

Type: Application

Filed: August 22, 2008

Publication date: January 27, 2011

Inventor: Michael Jensen

METHOD AND SYSTEM FOR SPEECH RECOGNITION USING SOCIAL NETWORKS

Publication number: 20110022388

Abstract: In an example embodiment, there is disclosed an apparatus comprising an audio interface configured to receive an audio signal, a data interface configured to communicate with at least one social graph, and logic coupled to the audio interface and the data interface. The logic is configured to identify a calling party. The logic is further configured to acquire data representative of a called party from the audio signal. The logic is configured to initiate a search of the at least one social graph for the data representative of the called party to identify the called party responsive to acquiring the data representative of the called party.

Type: Application

Filed: July 27, 2009

Publication date: January 27, 2011

Inventors: Sung Fong Solomon WU, Aaron TONG, Sam C. LEE

METHOD AND EQUIPMENT OF PATTERN RECOGNITION, ITS PROGRAM AND ITS RECORDING MEDIUM

Publication number: 20110022385

Abstract: The present invention provides a method and equipment of pattern recognition capable of efficiently pruning partial hypotheses without lowering recognition accuracy, its pattern recognition program, and its recording medium. In a second search unit, a likelihood calculation unit calculates an acoustic likelihood by matching time series data of acoustic feature parameters against a lexical tree stored in a second database and an acoustic model stored in a third database to determine an accumulated likelihood by accumulating the acoustic likelihood in a time direction. A self-transition unit causes each partial hypothesis to make a self-transition in a search process. An LR transition unit causes each partial hypothesis to make an LR transition. A reward attachment unit adds a reward R(x) in accordance with the number of reachable words to each partial hypothesis to raise the accumulated likelihood. A pruning unit excludes partial hypotheses with less likelihood from search targets.

Type: Application

Filed: July 22, 2010

Publication date: January 27, 2011

Applicant: KDDI CORPORATION

Inventor: Tsuneo Kato
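
The reward-then-prune step in this abstract can be sketched as ordinary beam pruning applied after a bonus is added to each partial hypothesis. This is a simplified illustration, not the applicant's method: the tuple format, the name `prune_with_reward`, and the beam-width criterion are assumptions, and the form of R(x) is left to the caller because the abstract does not define it.

```python
import math


def prune_with_reward(hypotheses, beam_width, reward):
    """Beam-prune partial hypotheses after adding a reachable-word reward.

    `hypotheses` is a hypothetical list of (label, log_likelihood,
    reachable_words) tuples. `reward` maps a reachable-word count to a
    bonus; its actual form R(x) is not given in the abstract.
    """
    scored = [(label, ll + reward(n)) for label, ll, n in hypotheses]
    if not scored:
        return []
    best = max(score for _, score in scored)
    # Keep hypotheses whose rewarded score is within beam_width of the best.
    return [(label, score) for label, score in scored if score >= best - beam_width]
```

A caller might pass, say, `reward=lambda n: 0.5 * math.log(n)` so that hypotheses from which many words are still reachable are less likely to be pruned early; that particular function is only an example.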

METHOD AND SYSTEM FOR IMPROVING SPEECH RECOGNITION ACCURACY BY USE OF GEOGRAPHIC INFORMATION

Publication number: 20110022292

Abstract: A method for speech recognition includes providing a source of geographical information within a vehicle. The geographical information pertains to a current location of the vehicle, a planned travel route of the vehicle, a map displayed within the vehicle, and/or a gesture marked by a user on a map. Words spoken within the vehicle are recognized by use of a speech recognition module. The recognizing is dependent upon the geographical information.

Type: Application

Filed: July 27, 2009

Publication date: January 27, 2011

Applicant: Robert Bosch GmbH

Inventors: Zhongnan Shen, Fuliang Weng, Zhe Feng

INFORMATION PROCESSING SYSTEM AND INFORMATION PROCESSING METHOD

Publication number: 20110022392

Abstract: A framework is provided which performs location-based analysis using an individual feature such as a stress level obtained based on biological information.

Type: Application

Filed: December 18, 2009

Publication date: January 27, 2011

Inventor: Takashi IWAMOTO

MULTIMODE USER INTERFACE OF A DRIVER ASSISTANCE SYSTEM FOR INPUTTING AND PRESENTATION OF INFORMATION

Publication number: 20110022393

Abstract: In a method for multimode information input and/or adaptation of the display of a display and control device, input signals of different modality are detected which are supplied via the device to a voice recognition unit, thus initiating a desired function and/or display as an output signal, which are displayed on the device and/or output by voice output. Touch and/or gesture input signals are provided on or to the device for selection of an object intended for interaction and activation of the voice recognition unit and for the vocabulary which is provided for interaction to be restricted with the selection of the object and/or activation of the voice recognition unit as a function of the selected object, on the basis of which a voice command from the restricted vocabulary is added to the selected object as an information input and/or for adaptation of the display, via the voice recognition unit.

Type: Application

Filed: November 12, 2008

Publication date: January 27, 2011

Inventors: Christoph Wäller, Moritz Neugebauer, Thomas Fabian, Ulrike Wehling, Günter Horna, Markus Missall

ACOUSTIC SOURCE SEPARATION

Publication number: 20110015924

Abstract: A method of separating a mixture of acoustic signals from a plurality of sources comprises: providing pressure signals indicative of time-varying acoustic pressure in the mixture; defining a series of time windows; and for each time window: a) providing from the pressure signals a series of sample values of measured directional pressure gradient; b) identifying different frequency components of the pressure signals; c) for each frequency component defining an associated direction; and d) from the frequency components and their associated directions generating a separated signal for one of the sources.

Type: Application

Filed: October 17, 2008

Publication date: January 20, 2011

Inventors: Banu Gunel Hacihabiboglu, Huseyin Hacihabiboglu, Ahmet Kondoz

TEXT PROCESSING METHOD FOR A DIGITAL CAMERA

Publication number: 20110014944

Abstract: Embodiments disclose a technique to recognize text in a current frame of an image in a view finder of a digital camera. In accordance with the technique, text at a marker (e.g. a cursor or cross hairs) associated with the view finder is recognized and a lookup is performed based on the recognized text. Advantageously, the lookup yields useful information e.g. a translation of a recognized word that is displayed in the viewfinder adjacent to the text. The current frame is not captured by a user. As the user moves the camera to position a new word at the marker, the view finder is updated to provide lookup results associated with the new word. Lookups may be performed of a bilingual dictionary, a monolingual dictionary, a reference book, a travel guide, etc. Embodiments of the invention also cover digital cameras or mobile devices that implement the aforementioned technique.

Type: Application

Filed: July 13, 2010

Publication date: January 20, 2011

Applicant: Abbyy Software Ltd.

Inventor: BORIS SAMOYLOV
METHOD FOR SONG SEARCHING BY VOICE

Publication number: 20110015932

Abstract: The present invention relates to a method for song searching by voice, especially the method with which users can complete settings and then start searching, so that the users' voices of search conditions will be acquired to make voice recognition, and the recognition results will be compared with the instruction data and song attribute data in the voice recognition database to obtain comparison data. If the comparison data do not correspond with the preset conditions, the next search condition generated from the comparison data will be broadcast with voice, and the users are allowed to speak out the next search condition to make comparisons of search conditions in the next process. If the comparison data correspond with the preset conditions, one or more song files will be read according to the comparison data and will be given a preview.

Type: Application

Filed: September 4, 2009

Publication date: January 20, 2011

Inventors: Chen-Wei SU, Tsung-Han Tsai, Chun-Ping Fang
USE OF MULTIPLE SPEECH RECOGNITION SOFTWARE INSTANCES

Publication number: 20110010170

Abstract: A wireless communication device is disclosed that accepts recorded audio data from an end-user. The audio data can be in the form of a command requesting user action. Likewise, the audio data can be converted into a text file. The audio data is reduced to a digital file in a format that is supported by the device hardware, such as a .wav, .mp3, .vnf file, or the like. The digital file is sent via secured or unsecured wireless communication to one or more server computers for further processing. In accordance with an important aspect of the invention, the system evaluates the confidence level of the speech recognition process. If the confidence level is high, the system automatically builds the application command or creates the text file for transmission to the communication device.

Type: Application

Filed: September 17, 2010

Publication date: January 13, 2011

Inventors: Stephen S. Burns, Mickey W. Kowitz
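
The confidence check in the abstract for publication 20110010170 can be illustrated with a minimal sketch. The abstract only says that a high confidence level leads to automatic handling; the numeric threshold, the Recognition type and the "escalate" branch for low confidence are assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical cut-off: the abstract says only that the confidence is "high".
CONFIDENCE_THRESHOLD = 0.85


@dataclass
class Recognition:
    text: str
    confidence: float  # assumed to lie in the range 0.0 to 1.0


def route(result: Recognition, threshold: float = CONFIDENCE_THRESHOLD) -> str:
    """Return "auto" when the result can be used directly, else "escalate"."""
    if result.confidence >= threshold:
        return "auto"      # build the command or text file automatically
    return "escalate"      # assumed fallback, e.g. further processing
```
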
QUESTION AND ANSWER DATABASE EXPANSION APPARATUS AND QUESTION AND ANSWER DATABASE EXPANSION METHOD

Publication number: 20110010177

Abstract: A question and answer database expansion apparatus includes: a question and answer database in which questions and answers corresponding to the questions are registered in association with each other, a first speech recognition unit which carries out speech recognition for an input sound signal by using a language model based on the question and answer database, and outputs a first speech recognition result as the recognition result, a second speech recognition unit which carries out speech recognition for the input sound signal by using a language model based on a large vocabulary database, and outputs a second speech recognition result as the recognition result, and a question detection unit which detects an unregistered utterance, which is not registered in the question and answer database, from the input sound based on the first speech recognition result and the second speech recognition result, and outputs the detected unregistered utterance.

Type: Application

Filed: July 8, 2010

Publication date: January 13, 2011

Applicant: HONDA MOTOR CO., LTD.

Inventors: Mikio NAKANO, Kotaro FUNAKOSHI, Hiromi NARIMATSU
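
One way to picture the question detection unit in the abstract for publication 20110010177 is to compare the two recognition results. The sketch below assumes that a low word overlap between the result from the question-and-answer language model and the result from the large-vocabulary language model suggests an unregistered utterance. The overlap measure and the 0.5 threshold are illustrative assumptions; the abstract does not say how the two results are combined.

```python
def word_overlap(a: str, b: str) -> float:
    """Jaccard overlap between the word sets of two recognition results."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


def is_unregistered(qa_result: str, large_vocab_result: str,
                    min_overlap: float = 0.5) -> bool:
    """Flag the utterance when the two recognizers disagree strongly."""
    return word_overlap(qa_result, large_vocab_result) < min_overlap
```
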
Panoramic Attention For Humanoid Robots

Publication number: 20110004341

Abstract: A robot using less storage and computational resources to embody panoramic attention. The robot includes a panoramic attention module with multiple levels that are hierarchically structured to process different levels of information. The top-level of the panoramic attention module receives information about entities detected from the environment of the robot and maps the entities to a panoramic map maintained by the robot. By mapping and storing high-level entity information instead of low-level sensory information in the panoramic map, the amount of storage and computation resources for panoramic attention can be reduced significantly. Further, the mapping and storing of high-level entity information in the panoramic map also facilitates consistent and logical processing of different conceptual levels of information.

Type: Application

Filed: June 18, 2010

Publication date: January 6, 2011

Applicant: HONDA MOTOR CO., LTD.

Inventors: Ravi Kiran Sarvadevabhatla, Victor Ng-Thow-Hing
Audience Measurement System Utilizing Voice Recognition Technology

Publication number: 20110004474

Abstract: A method, a system, and a computer program product for determining a total count of audience members within a sensory receiving environment during the presentation of a program. A voice recognition unit is enabled when a signal for a program/subject/event, such as a broadcast program, is received. The voice recognition unit receives one or more sounds in the sensory receiving environment and analyzes the characteristics of the sounds. When one or more unique human voices are identified during the program, a count of the number of unique human voices is determined. The count of unique human voices is transmitted to a server, whereby the count of unique human voices is equal to a count of audience members. The total count of audience members is calculated for all sensory receiving environment associated with the program. An audience analysis graphical user interface is generated to display the total count of audience members.

Type: Application

Filed: July 2, 2009

Publication date: January 6, 2011

Applicant: International Business Machines Corporation

Inventors: Ravi P. Bansal, Mike V. Macias, Saidas T. Kottawar, Salil P. Gandhi, Sandip D. Mahajan
Battery Management System And Method

Publication number: 20100332233

Abstract: A battery-management method is performed by a battery-operated device. The method includes allocating a first portion of a battery capacity to a first function and a second portion of the battery capacity to a second function. The method further includes simultaneously displaying a first indicator relating to the first portion of the battery capacity and a second indicator relating to the second portion of the battery capacity.

Type: Application

Filed: September 13, 2010

Publication date: December 30, 2010

Applicant: RESEARCH IN MOTION LIMITED

Inventors: Joseph C. Chen, Jonathan Malton
METHOD AND APPARATUS FOR CONVERTING TEXT TO AUDIO AND TACTILE OUTPUT

Publication number: 20100332224

Abstract: In accordance with an example embodiment of the present invention, an apparatus comprises a controller configured to process punctuated text data, and to identify punctuation in said punctuated text data; and an output unit configured to generate audio output corresponding to said punctuated text data, and to generate tactile output corresponding to said identified punctuation.

Type: Application

Filed: June 30, 2009

Publication date: December 30, 2010

Applicant: NOKIA CORPORATION

Inventors: Jakke Sakari Mäkelä, Jukka Pekka Naula, Niko Santeri Porjo
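
The split between audio and tactile output in the abstract for publication 20100332224 can be sketched as follows. The sketch assumes a small fixed set of punctuation marks and assumes that the spoken text omits them; both choices, and the function name, are illustrative and not taken from the application.

```python
# Hypothetical set of marks that trigger tactile output.
TACTILE_MARKS = set(".,;:!?")


def plan_outputs(text: str):
    """Return (spoken_text, tactile_events) for a punctuated string.

    tactile_events is a list of (character_index, mark) pairs that an
    output unit could map to vibration patterns.
    """
    spoken_chars = []
    tactile_events = []
    for index, ch in enumerate(text):
        if ch in TACTILE_MARKS:
            tactile_events.append((index, ch))
        else:
            spoken_chars.append(ch)
    # Collapse whitespace left behind by the removed marks.
    spoken_text = " ".join("".join(spoken_chars).split())
    return spoken_text, tactile_events
```
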
INTELLIGENT HOME AUTOMATION

Publication number: 20100332235

Abstract: An intelligent home automation system answers questions of a user speaking “natural language” located in a home. The system is connected to, and may carry out the user's commands to control, any circuit, object, or system in the home. The system can answer questions by accessing the Internet. Using a transducer that “hears” human pulses, the system may be able to identify, announce and keep track of anyone entering or staying in the home or participating in a conversation, including announcing their identity in advance. The system may interrupt a conversation to implement specific commands and resume the conversation after implementation. The system may have extensible memory structures for term, phrase, relation and knowledge, question answering routines and a parser analyzer that uses transformational grammar and a modified three hypothesis analysis. The parser analyzer can be dormant unless spoken to. The system has emergency modes for prioritization of commands.

Type: Application

Filed: June 29, 2009

Publication date: December 30, 2010

Inventor: ABRAHAM BEN DAVID
Security, Safety, Augmentation Systems, And Associated Methods

Publication number: 20100323615

Abstract: A mobile device has a datalog module that captures multimedia data at the mobile device and transmits the multimedia data through cell networks to a control center. The mobile device may also include a GPS sensor wherein location information is included within the multimedia data. A mobile device has a motion module that, when activated at the mobile device or through a cell network, disables communications through the mobile device when in motion. A system disables operation of a mobile device by a vehicle operator and includes a transmitter within the vehicle that generates a disabling signal that, when received by a safety receiver within the mobile device, disables operation of the mobile device. A mobile device has a microphone, and a voice augmentation module which is selectively activated to augment voice data spoken into the mobile device, by removing background noise and/or replacing or changing voice data.

Type: Application

Filed: June 17, 2010

Publication date: December 23, 2010

Inventors: Curtis A. Vock, Perry Youngs
VOICE RECOGNITION SYSTEM, VOICE RECOGNITION METHOD, AND VOICE RECOGNITION PROCESSING PROGRAM

Publication number: 20100324899

Abstract: A speech recognition system for rapidly performing recognition processing while maintaining quality of speech recognition in a speech recognition device is provided. A speech recognition system includes a speech input device which inputs speech and displays a recognition result, and a speech recognition device which receives the speech from the speech input device, performs recognition processing, and sends back the speech to the speech input device. The speech input device includes a user dictionary section which stores words used for recognizing the input speech, and a reduced user dictionary creation unit which extracts words corresponding to the input speech from the user dictionary and creates a reduced user dictionary. The speech recognition device has a speech recognition unit which inputs the input speech and the reduced user dictionary from the speech input/output device and recognizes the input speech based on the reduced user dictionary and a system dictionary provided beforehand.

Type: Application

Filed: March 14, 2008

Publication date: December 23, 2010

Inventor: Kiyoshi Yamabana
AUDIO RECOGNITION DEVICE AND AUDIO RECOGNITION METHOD

Publication number: 20100324897

Abstract: Acoustic models and language models are learned according to a speaking length which indicates a length of a speaking section in speech data, and speech recognition process is implemented by using the learned acoustic models and language models. A speech recognition apparatus includes means (103) for detecting a speaking section in speech data (101) and for generating a section information which indicates the detected speaking section, means (104) for recognizing a data part corresponding to a section information in the speech data as well as text data (102) written from the speech data and for classifying the data part based on a speaking length thereof, and means (106) for learning acoustic models and language models (107) by using the classified data part (105).

Type: Application

Filed: December 7, 2007

Publication date: December 23, 2010

Inventors: Tadashi Emori, Yoshifumi Onishi
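
The classification by speaking length in the abstract for publication 20100324897 can be sketched as a bucketing step applied to detected speaking sections before model training. The boundary values and bucket labels below are assumptions for illustration; the abstract does not specify them.

```python
from bisect import bisect_right

# Hypothetical duration boundaries in seconds and matching bucket labels.
BOUNDARIES = (1.0, 3.0)
LABELS = ("short", "medium", "long")


def classify_by_length(sections, boundaries=BOUNDARIES, labels=LABELS):
    """Group (start, end) speaking sections by their duration.

    Returns a dict mapping each label to the sections in that bucket, so
    that separate acoustic and language models could be trained per bucket.
    """
    groups = {label: [] for label in labels}
    for start, end in sections:
        duration = end - start
        label = labels[bisect_right(boundaries, duration)]
        groups[label].append((start, end))
    return groups
```
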
TECHNIQUES TO PROVIDE A STANDARD INTERFACE TO A SPEECH RECOGNITION PLATFORM

Publication number: 20100324910

Abstract: Techniques and systems to provide speech recognition services over a network using a standard interface are described. In an embodiment, a technique includes accepting a speech recognition request that includes at least audio input, via an application program interface (API). The speech recognition request may also include additional parameters. The technique further includes performing speech recognition on the audio according to the request and any specified parameters; and returning a speech recognition result as a hypertext transfer protocol (HTTP) response. Other embodiments are described and claimed.

Type: Application

Filed: June 19, 2009

Publication date: December 23, 2010

Applicant: MICROSOFT CORPORATION

Inventors: Robert L. Chambers, Michael Bodell, Daphne Luong, Annie Wong, Faustinus K. Gozali, Andrew Ho, Rod Philander, Corby Anderson
VOICE CONTROL OF MULTIMEDIA CONTENT

Publication number: 20100318357

Abstract: Techniques are described for managing various types of content in various ways, such as based on voice commands or other voice-based control instructions provided by a user. In some situations, at least some of the content being managed includes content of a variety of types, such as music and other audio information, photos, images, non-television video information, videogames, Internet Web pages and other data, etc., which may be managed via the voice controls in a variety of ways, such as to allow a user to locate and identify content of potential interest, to schedule recordings of selected content, to manage previously recorded content (e.g., to play or delete the content), to control live television, etc. This abstract is provided to comply with rules requiring it, and is submitted with the intention that it will not be used to interpret or limit the scope or meaning of the claims.

Type: Application

Filed: October 22, 2009

Publication date: December 16, 2010

Applicant: Vulcan Inc.

Inventors: Anthony F. Istvan, Korina J.B. Stark, Robin Budd
COMPRESSOR AUGMENTED ARRAY PROCESSING

Publication number: 20100318353

Abstract: The present invention relates generally to the use of compressors, with an optional noise extractor, to improve audio sensing performance of one or more microphones. The audio sensing performance of a single element microphone array with dynamic range compression can be improved by the use of a noise extractor, to modify the operation of the compressor, typically to avoid noise floor amplification. Dynamic range compression can be applied to the output of two or more element microphone array processing with the optional use of a noise extractor. Dynamic range compression can precede the microphone array processing with the optional use of a noise extractor. Syllabic dynamic range compression may be used in one or more element microphone arrays, with the optional use of a noise extractor, which increases speech recognition accuracy.

Type: Application

Filed: June 16, 2010

Publication date: December 16, 2010

Inventor: Karl M. Bizjak
APPARATUS AND METHOD OF EXTENDING PRONUNCIATION DICTIONARY USED FOR SPEECH RECOGNITION

Publication number: 20100312550

Abstract: An apparatus and method for extending a pronunciation dictionary for speech recognition are provided. The apparatus and the method may segment speech information of an input utterance into at least one phoneme, collect segmentation information of the at least one segmented phoneme, analyze a pronunciation variation of the at least one segmented phoneme based on the collected segmentation information, and select a substitutable phoneme group for the at least one phoneme where the pronunciation variation occurs, and extend the pronunciation dictionary.

Type: Application

Filed: February 23, 2010

Publication date: December 9, 2010

Inventor: Gil Ho LEE
LOCAL AND REMOTE AGGREGATION OF FEEDBACK DATA FOR SPEECH RECOGNITION

Publication number: 20100312555

Abstract: A local feedback mechanism for customizing training models based on user data and directed user feedback is provided in speech recognition applications. The feedback data is filtered at different levels to address privacy concerns for local storage and for submittal to a system developer for enhancement of generic training models.

Type: Application

Filed: June 9, 2009

Publication date: December 9, 2010

Applicant: Microsoft Corporation

Inventors: Michael D. Plumpe, Julian Odell, Jon Hamaker, Rob Chambers, Christopher Le, Onur Domanic
VOICE XML NETWORK GATEWAY

Publication number: 20100312558

Abstract: A system (10) for controlling telecommunications calls includes a voice XML network gateway (12) including a voice interpreter module (20) and a call center server module (28) in association with a telecommunications switch (58). The voice interpreter module (20) receives voice telecommunications signals, and the call center server module (28) receives call center telecommunications data signals. Interpreting circuitry (22, 24) interprets the voice telecommunications signals using the voice interpreter module in association with speech recognition application(s) (16). Call center service providing (18) means provides call center service in response to the call center telecommunications data signals in association with call center application program(s).

Type: Application

Filed: May 10, 2010

Publication date: December 9, 2010

Applicant: SOLEO COMMUNICATIONS, INC.

Inventors: Daniel Gallagher, Richard W. Ibbotson, Michael G. Thorpe, Bruce VanGelder, Luther Wright
PROGRESSIVE APPLICATION OF KNOWLEDGE SOURCES IN MULTISTAGE SPEECH RECOGNITION

Publication number: 20100312557

Abstract: A speech recognition system is provided with iteratively refined multiple passes through the received data to enhance the accuracy of the results by introducing constraints and adaptation from initial passes into subsequent recognition operations. The multiple passes are performed on an initial utterance received from a user. The iteratively enhanced subsequent passes are also performed on following utterances received from the user, increasing overall system efficiency and accuracy.

Type: Application

Filed: June 8, 2009

Publication date: December 9, 2010

Applicant: Microsoft Corporation

Inventors: Nikko Strom, Julian Odell, Jon Hamaker
