Speech Recognition (epo) Patents (Class 704/E15.001)

  • Publication number: 20110060347
    Abstract: The invention relates to a device for superimposing known patterns, characteristic of a region, on (real) images of said region. The device comprises a memory storing patterns that are representative of a selected region and have a known position and orientation relative to a common reference, and processing means which, on receipt of the designation of at least one portion of an observed image of the selected region, taken at a selected angle, and of at least one representative attribute of said region, determine in the memory a pattern representative of the selected portion, taking account of the selected attribute, and then superimpose the determined pattern on the selected portion of the image, taking account of the selected angle.
    Type: Application
    Filed: October 20, 2010
    Publication date: March 10, 2011
    Applicant: Intuitive Surgical Operations, Inc.
    Inventors: Eve Coste-Maniere, Thierry Vieville, Fabien Mourgues
  • Publication number: 20110054788
    Abstract: Methods and systems are provided for updating calendar and appointments based on a conversation on the hand held device. The software application and/or device shall sense the end of the call and prompt the user to decide if the call resulted in an appointment. If he answers in the affirmative, he is taken to the calendar application within the device to update his calendar with the appointment.
    Type: Application
    Filed: September 1, 2009
    Publication date: March 3, 2011
    Inventor: Vic Iyer
  • Publication number: 20110054904
    Abstract: A mobile device suitable for use by a user in a store includes a subvocal message (SVM) module to detect an SVM from the user. The SVM includes data that indicates an item in the store. A transmitter transmits a request after detecting the SVM. The request includes information indicating the item. A receiver receives a reply. The reply includes information responsive to the request. An output device provides the responsive information to the user. The request may include a request for item position information, item price information, or item inventory information. The mobile device may detect the SVM via a subvocal sensor coupled to the user. The subvocal sensor may be in contact with the user in proximity to a vocal cord of the user. The subvocal sensor may be connected to the mobile device wirelessly or via a wire.
    Type: Application
    Filed: August 28, 2009
    Publication date: March 3, 2011
    Applicant: STERLING COMMERCE, INC.
    Inventor: Charles Stanley Fenton
  • Publication number: 20110054890
    Abstract: A mobile phone, and corresponding method, which is arranged to detect sounds of different types and to indicate to a user the direction from which those sounds are coming. The mobile phone includes a microphone for recording sound and a display for providing feedback to the user. The phone also includes a sound mapping program which is arranged to interpret the sound recorded by the microphone and to provide an audio map of detected sounds. This is presented to the user on the display.
    Type: Application
    Filed: August 25, 2009
    Publication date: March 3, 2011
    Applicant: Nokia Corporation
    Inventors: Aarne Vesa Pekka Ketola, Panu Marten Jesper Johansson
  • Publication number: 20110050589
    Abstract: A method of receiving input from a user includes providing a surface within reach of a hand of the user. A plurality of locations on the surface that are touched by the user are sensed. An alphanumeric character having a shape most similar to the plurality of touched locations on the surface is determined. The user is audibly or visually informed of the alphanumeric character and/or a word in which the alphanumeric character is included. Feedback is received from the user regarding whether the alphanumeric character and/or word is an alphanumeric character and/or word that the user intended to be determined in the determining step.
    Type: Application
    Filed: August 28, 2009
    Publication date: March 3, 2011
    Applicant: ROBERT BOSCH GMBH
    Inventors: BAOSHI YAN, Fuliang Weng, Liu Ren, You-Chi Cheng, Zhongnan Shen
  • Publication number: 20110050412
    Abstract: Selected objects may be located by pushing a button on a keypad of a base unit, or by giving an oral command thereto. A receiver microprocessor, loaded with a unique electronic address, is attachable to each object. A base unit PROM is loaded with a library of digitized voice command templates. A user's command to find a lost object is received by a microphone, and digitized. The digitized command is compared with the templates using pattern recognition algorithms, which may utilize a Hidden Markov Model. When matched, the base unit processor causes radio transmission of RF interrogation packets targeted at the unique address corresponding to the lost object. A receiver chip detects the interrogation packets, and compares the transmitted unique address with the address stored in its microprocessor. Where matched, the microprocessor modulates a sounding device to direct the user to the lost object.
    Type: Application
    Filed: August 4, 2010
    Publication date: March 3, 2011
    Inventors: Cynthia Wittman, Gabe Neiser, Michael Keating
  • Publication number: 20110054905
    Abstract: A voice interactive service system provides different speech-based services to a plurality of users. Using a communication terminal, the services are accessed via a telecommunication network through service-specific connectivity ports. The system comprises processing cores which have different configurations of speech processing resources for performing different services. For performing a requested service, a connection module establishes a connection between the respective connectivity port and a processing core having a configuration of speech processing resources suitable for performing the requested service. Because of the service-specific resourcing of cores, there is no need for requesting and allocating processing resources from external resource servers. Moreover, the port-dedicated resourcing of the cores ensures that a successful access to a connectivity port leads to a successful provision of the requested service.
    Type: Application
    Filed: August 26, 2010
    Publication date: March 3, 2011
    Applicant: me2me AG
    Inventors: Roger LAGADEC, Patrik ESTERMANN, Luciano BUTERA
  • Publication number: 20110046948
    Abstract: The invention relates to a method of automatic sound recognition. The object of the present invention is to provide an alternative scheme for automatically recognizing sounds, e.g. human speech. The problem is solved by providing a training database comprising a number of models, each model representing a sound element in the form of a binary mask comprising binary time frequency (TF) units which indicate the energetic areas in time and frequency of the sound element in question, or of characteristic features or statistics extracted from the binary mask; providing an input signal comprising an input sound element; estimating the input sound element based on the models of the training database to provide an output sound element. The method has the advantage of being relatively simple and adaptable to the application in question. The invention may e.g. be used in devices comprising automatic sound recognition, e.g. for sound, e.g. voice control of a device, or in listening devices, e.g.
    Type: Application
    Filed: August 4, 2010
    Publication date: February 24, 2011
    Inventor: Michael Syskind PEDERSEN
  • Publication number: 20110046949
    Abstract: Some embodiments of the invention relate to a method and a system for detecting unwanted conversational media session data. In accordance with one aspect of the invention, a method of detecting unwanted conversational media session data according to some embodiments of the invention may include calculating two or more progressive similarity scores each with respect to a different instant during a progress of a real-time conversational media session, wherein each of said scores is associated with a similarity between the conversational media session's media data that was available at the associated instant and a reference data item corresponding to media data of a previous conversational media session, and evaluating progressive similarity between the real-time conversational media session and the reference data item based upon the two or more progressive similarity scores.
    Type: Application
    Filed: November 2, 2010
    Publication date: February 24, 2011
    Applicant: Commtouch Software Ltd.
    Inventors: Aharon Satt, Amir Lev
  • Publication number: 20110046962
    Abstract: A voice triggering control device for enabling a data collection host which is assembled on it comprises a processing unit, a speaker, a control module, a power supply module and a housing containing these elements. The control device controls the processing unit to output a high-frequency audio signal which corresponds to an act command. High-frequency audio generated from the high-frequency audio signal is then broadcast through the speaker, and the data collection host is enabled to perform the act command upon receiving and decoding the high-frequency audio. Because the triggering control device enables the data collection host to perform a functional action by means of the high-frequency audio, the contact fault problem in the prior art can be solved.
    Type: Application
    Filed: September 16, 2009
    Publication date: February 24, 2011
    Applicant: ASKEY COMPUTER CORP.
    Inventors: Ting-Lin Chang, Ching-Feng Hsieh
  • Publication number: 20110044447
    Abstract: Techniques for processing data representative of text associated with one or more content sources to generate a specification of a set of keyphrases of interest; processing a first set of audio signals collected during a first time period to generate first data characterizing putative occurrences of one or more keyphrases of the set in the first set of audio signals; evaluating the first data to generate keyphrase-specific comparison values for the first set of audio signals; deriving first trending data between the first set of audio signals and a second set of audio signals based in part on an analysis of the keyphrase-specific comparison values for the first set of audio signals relative to stored keyphrase-specific baseline values; and generating a visual representation of at least some of the first trending data and causing the visual representation of the first trending data to be presented on a display terminal.
    Type: Application
    Filed: August 21, 2009
    Publication date: February 24, 2011
    Applicant: Nexidia Inc.
    Inventors: Robert W. Morris, Marsal Gavalda, Peter S. Cardillo, Jon A. Arrowood
  • Publication number: 20110039514
    Abstract: Techniques for achieving personal security via mobile devices are presented. A portable mobile communication device, such as a phone or a personal digital assistant (PDA), is equipped with geographic positioning capabilities and is equipped with audio and visual devices. A panic mode of operation can be automatically detected in which real time audio and video for an environment surrounding the portable communication device are captured along with a geographic location for the portable communication device. This information is streamed over the Internet to a secure site where it can be viewed in real time and/or later inspected.
    Type: Application
    Filed: August 13, 2009
    Publication date: February 17, 2011
    Inventors: Sandeep Patnaik, Saheednanda Singh, Anilkumar Bolleni
  • Publication number: 20110040563
    Abstract: A voice control device for a display device includes a voice receiver for receiving a voice signal, a voice recognition unit coupled to the voice receiver for recognizing the voice signal to generate a recognition result, a function decision unit coupled to the voice recognition unit for selecting an operating function from a plurality of operating functions according to the recognition result, and an execution unit coupled to the function decision unit for controlling the display device to perform the operating function.
    Type: Application
    Filed: February 5, 2010
    Publication date: February 17, 2011
    Inventors: Xie-Ren Hsu, Kuang-Feng Sung
  • Publication number: 20110040560
    Abstract: A basic idea of the invention is to ascertain information on the course of the bit rate switching during an active speech phase. According to the invention, during the speech phase, information on the percentage proportion of broadband active speech frames in comparison to narrowband active speech frames is compiled on the part of the decoder. A high percentage proportion of broadband active speech frames indicates that a broadband use is preferred on the part of the codec and therefore a need exists for synthesizing noise information in broadband form during a DTX phase.
    Type: Application
    Filed: February 2, 2009
    Publication date: February 17, 2011
    Inventors: Panji Setiawan, Stefan Schandl, Herve Taddei
  • Publication number: 20110035221
    Abstract: Apparatus for monitoring an audience participation distribution at an event comprising a speech activity module operable to generate speech data representing speech detected at the event, a speaker identification module operable to determine, using the speech data, a first speaker who has contributed to the detected speech, and a processing unit operable to generate speaker data representing a value for the time that the first speaker has contributed to the detected speech and to output distribution data based on the speaker data representing a measure of the participation for the first speaker at the event.
    Type: Application
    Filed: August 7, 2009
    Publication date: February 10, 2011
    Inventors: Tong Zhang, Hui Chao, Xuemei Zhang
  • Publication number: 20110032845
    Abstract: Multimodal teleconferencing including receiving, by a multimodal teleconferencing module, a speech utterance from one of a plurality of participants in the multimodal teleconference; identifying the participant making the speech utterance as a current speaker; retrieving, by the multimodal teleconferencing module from accounts for the current speaker, content for display to the current speaker; retrieving, by the multimodal teleconferencing module from accounts for the current speaker, content for display to one or more other participants in the multimodal teleconference; providing, by the multimodal teleconferencing module to a multimodal teleconferencing client for display to the current speaker, an identification of the speaker and the content retrieved for the speaker; and providing, by the multimodal teleconferencing module to one or more of multimodal teleconferencing clients for display to the other participants, an identification of the current speaker with the content retrieved for the one or more ot
    Type: Application
    Filed: August 5, 2009
    Publication date: February 10, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Ciprian Agapi, William K. Bodin, Charles W. Cross, JR.
  • Publication number: 20110035224
    Abstract: A system, method, and computer-readable medium for parcel address recognition. A method includes receiving an address input and producing candidate address results corresponding to the address input. The method includes receiving operational scheme knowledge describing the mode of operation of a parcel processing system, and receiving at least one operational rule corresponding to the operational scheme knowledge. The method includes applying the at least one operational rule to the candidate address results and producing and storing a finalized result according to the operational rule and the candidate address results.
    Type: Application
    Filed: July 30, 2010
    Publication date: February 10, 2011
    Inventor: Stanley W. Sipe
  • Publication number: 20110035215
    Abstract: Disclosed is a method and apparatus for signal processing and signal pattern recognition. According to some embodiments of the present invention, events in the signal to be processed/recognized may be used to pace or clock the operation of one or more processing elements. The detected events may be based on signal energy level measurements. The processing/recognition elements may be neuron models. The signal to be processed/recognized may be a speech signal.
    Type: Application
    Filed: August 28, 2008
    Publication date: February 10, 2011
    Inventors: Haim Sompolinsky, Robert Guetig
  • Publication number: 20110035220
    Abstract: An apparatus includes a plurality of applications and an integrator having a voice recognition module configured to identify at least one voice command from a user. The integrator is configured to integrate information from a remote source into at least one of the plurality of applications based on the identified voice command. A method includes analyzing speech from a first user of a first mobile device having a plurality of applications, identifying a voice command based on the analyzed speech using a voice recognition module, and incorporating information from the remote source into at least one of a plurality of applications based on the identified voice command.
    Type: Application
    Filed: August 5, 2009
    Publication date: February 10, 2011
    Applicant: Verizon Patent and Licensing Inc.
    Inventor: Robert E. Opaluch
  • Publication number: 20110029308
    Abstract: The present invention relates to means and methods of classifying speech and music signals in voice communication systems, devices, telephones, and methods, and more specifically, to systems, devices, and methods that automate control when either speech or music is detected over communication links. The present invention provides a novel system and method for monitoring the audio signal, analyzing selected audio signal components, comparing the results of analysis with a pre-determined threshold value, and classifying the audio signal either as speech or music.
    Type: Application
    Filed: June 10, 2010
    Publication date: February 3, 2011
    Inventors: Alon Konchitsky, Alberto D. Berstein, Sandeep Kulakcherla, William Martin Ribble, Kevin Fitzgerald, Don Seferovich
  • Publication number: 20110026699
    Abstract: A service that handles incoming telephone calls without bothering the telephone subscriber is disclosed. The service permits a call to go through to a subscriber if the service determines that the call is not unwanted and the caller has been authenticated. The authentication is based on challenging the caller to prove its identity rather than relying on caller ID displays. Prospective callers pre-register with the service providing caller account information. When a caller is issued a challenge, the caller may prove its authenticity by supplying the challenge back to the service along with its registered information.
    Type: Application
    Filed: July 30, 2009
    Publication date: February 3, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Arnon Amir, Nimrod Megiddo
  • Publication number: 20110029307
    Abstract: A system and method of updating automatic speech recognition parameters on a mobile device are disclosed. The method comprises storing user account-specific adaptation data associated with ASR on a computing device associated with a wireless network, generating new ASR adaptation parameters based on transmitted information from the mobile device when a communication channel between the computing device and the mobile device becomes available and transmitting the new ASR adaptation data to the mobile device when a communication channel between the computing device and the mobile device becomes available. With the new ASR adaptation data, the mobile device recognizes user utterances more accurately.
    Type: Application
    Filed: October 8, 2010
    Publication date: February 3, 2011
    Applicant: AT&T Intellectual Property II, L.P. via transfer from AT&T Corp.
    Inventors: Sarangarajan Parthasarathy, Richard Cameron Rose
  • Publication number: 20110022390
    Abstract: In order to speak numerals in a manner readily comprehensible to a user, a speech device includes a voice synthesis portion 55 which, when a given character string includes a numeral made up of a plurality of digits, speaks the numeral in either a first speech method in which the numeral is read aloud as individual digits or a second speech method in which the numeral is read aloud as a full number, a user definition table 81, an association table 83, a region table 84, and a digit number table 87 which associate a type of a character string with either the first speech method or the second speech method, a process executing portion 53 which executes a process to thereby output data, and a speech control portion 51 which generates a character string on the basis of the output data and causes the voice synthesis portion 55 to speak the generated character string in one of the first and second speech methods that is associated with the type of the output data.
    Type: Application
    Filed: February 4, 2009
    Publication date: January 27, 2011
    Applicant: SANYO ELECTRIC CO., LTD.
    Inventors: Kinya Otani, Naoki Hirose
  • Publication number: 20110019805
    Abstract: Methods and systems are provided for searching audio records. Certain embodiments of the invention may be applied to search audio records containing a user's voice for instances where a specific sound, such as a word or phrase, is vocalized by the user. An audio sample is provided by recording the user vocalizing the sound. The audio sample is compared with the audio records to locate matches to the audio sample. In some embodiments, the audio records comprise recordings of calls between a near-end caller and a far-end caller, and the audio sample is a recording of a sound spoken by the near-end caller. The same input device may be used to record both the audio sample and the audio records.
    Type: Application
    Filed: January 14, 2009
    Publication date: January 27, 2011
    Applicant: ALGO COMMUNICATION PRODUCTS LTD.
    Inventor: Paul William Zoehner
  • Publication number: 20110022384
    Abstract: A method and a control system are provided for inputting commands to a wind turbine controller during a service or maintenance procedure. A command orally input by a user is transformed into an electrical signal representing the orally input command. The electrical signal is transformed into an input command signal which is further transformed into a reproduction signal. A user is provided the reproduction signal along with a confirmation request in a form recognized by a user, such as a visual or speech representation. After the user confirms the request, a signal based on the input command is sent to the wind turbine controller.
    Type: Application
    Filed: August 22, 2008
    Publication date: January 27, 2011
    Inventor: Michael Jensen
  • Publication number: 20110022388
    Abstract: In an example embodiment, there is disclosed an apparatus comprising an audio interface configured to receive an audio signal, a data interface configured to communicate with at least one social graph, and logic coupled to the audio interface and the data interface. The logic is configured to identify a calling party. The logic is further configured to acquire data representative of a called party from the audio signal. The logic is configured to initiate a search of the at least one social graph for the data representative of the called party to identify the called party responsive to acquiring the data representative of the called party.
    Type: Application
    Filed: July 27, 2009
    Publication date: January 27, 2011
    Inventors: Sung Fong Solomon WU, Aaron TONG, Sam C. LEE
  • Publication number: 20110022385
    Abstract: The present invention provides a method and equipment of pattern recognition capable of efficiently pruning partial hypotheses without lowering recognition accuracy, its pattern recognition program, and its recording medium. In a second search unit, a likelihood calculation unit calculates an acoustic likelihood by matching time series data of acoustic feature parameters against a lexical tree stored in a second database and an acoustic model stored in a third database to determine an accumulated likelihood by accumulating the acoustic likelihood in a time direction. A self-transition unit causes each partial hypothesis to make a self-transition in a search process. An LR transition unit causes each partial hypothesis to make an LR transition. A reward attachment unit adds a reward R(x) in accordance with the number of reachable words to each partial hypothesis to raise the accumulated likelihood. A pruning unit excludes partial hypotheses with less likelihood from search targets.
    Type: Application
    Filed: July 22, 2010
    Publication date: January 27, 2011
    Applicant: KDDI CORPORATION
    Inventor: Tsuneo Kato
  • Publication number: 20110022292
    Abstract: A method for speech recognition includes providing a source of geographical information within a vehicle. The geographical information pertains to a current location of the vehicle, a planned travel route of the vehicle, a map displayed within the vehicle, and/or a gesture marked by a user on a map. Words spoken within the vehicle are recognized by use of a speech recognition module. The recognizing is dependent upon the geographical information.
    Type: Application
    Filed: July 27, 2009
    Publication date: January 27, 2011
    Applicant: Robert Bosch GmbH
    Inventors: Zhongnan Shen, Fuliang Weng, Zhe Feng
  • Publication number: 20110022392
    Abstract: A framework is provided which performs location-based analysis using an individual feature such as a stress level obtained based on biological information.
    Type: Application
    Filed: December 18, 2009
    Publication date: January 27, 2011
    Inventor: Takashi IWAMOTO
  • Publication number: 20110022393
    Abstract: In a method for multimode information input and/or adaptation of the display of a display and control device, input signals of different modality are detected which are supplied via the device to a voice recognition unit, thus initiating a desired function and/or display as an output signal, which are displayed on the device and/or output by voice output. Touch and/or gesture input signals are provided on or to the device for selection of an object intended for interaction and activation of the voice recognition unit and for the vocabulary which is provided for interaction to be restricted with the selection of the object and/or activation of the voice recognition unit as a function of the selected object, on the basis of which a voice command from the restricted vocabulary is added to the selected object as an information input and/or for adaptation of the display, via the voice recognition unit.
    Type: Application
    Filed: November 12, 2008
    Publication date: January 27, 2011
    Inventors: Christoph Wäller, Moritz Neugebauer, Thomas Fabian, Ulrike Wehling, Günter Horna, Markus Missall
  • Publication number: 20110015924
    Abstract: A method of separating a mixture of acoustic signals from a plurality of sources comprises: providing pressure signals indicative of time-varying acoustic pressure in the mixture; defining a series of time windows; and for each time window: a) providing from the pressure signals a series of sample values of measured directional pressure gradient; b) identifying different frequency components of the pressure signals; c) for each frequency component defining an associated direction; and d) from the frequency components and their associated directions generating a separated signal for one of the sources.
    Type: Application
    Filed: October 17, 2008
    Publication date: January 20, 2011
    Inventors: Banu Gunel Hacihabiboglu, Huseyin Hacihabiboglu, Ahmet Kondoz
  • Publication number: 20110014944
    Abstract: Embodiments disclose a technique to recognize text in a current frame of an image in a view finder of a digital camera. In accordance with the technique, text at a marker (e.g. a cursor or cross hairs) associated with the view finder is recognized and a lookup is performed based on the recognized text. Advantageously, the lookup yields useful information e.g. a translation of a recognized word that is displayed in the viewfinder adjacent to the text. The current frame is not captured by a user. As the user moves the camera to position a new word at the marker, the view finder is updated to provide lookup results associated with the new word. Lookups may be performed of a bilingual dictionary, a monolingual dictionary, a reference book, a travel guide, etc. Embodiments of the invention also cover digital cameras or mobile devices that implement the aforementioned technique.
    Type: Application
    Filed: July 13, 2010
    Publication date: January 20, 2011
    Applicant: Abbyy Software Ltd.
    Inventor: BORIS SAMOYLOV
  • Publication number: 20110015932
    Abstract: The present invention relates to a method for song searching by voice, especially a method with which users can complete settings and then start searching, so that the search conditions spoken by the users are acquired for voice recognition, and the recognition results are compared with the instruction data and song attribute data in the voice recognition database to obtain comparison data. If the comparison data do not correspond with the preset conditions, the next search condition generated from the comparison data is announced by voice, and the users are allowed to speak the next search condition so that search conditions can be compared in the next process. If the comparison data correspond with the preset conditions, one or more song files are read according to the comparison data and are given a preview.
    Type: Application
    Filed: September 4, 2009
    Publication date: January 20, 2011
    Inventors: Chen-Wei SU, Tsung-Han Tsai, Chun-Ping Fang
  • Publication number: 20110010170
    Abstract: A wireless communication device is disclosed that accepts recorded audio data from an end-user. The audio data can be in the form of a command requesting user action. Likewise, the audio data can be converted into a text file. The audio data is reduced to a digital file in a format that is supported by the device hardware, such as a .wav, .mp3, .vnf file, or the like. The digital file is sent via secured or unsecured wireless communication to one or more server computers for further processing. In accordance with an important aspect of the invention, the system evaluates the confidence level of the speech recognition process. If the confidence level is high, the system automatically builds the application command or creates the text file for transmission to the communication device.
    Type: Application
    Filed: September 17, 2010
    Publication date: January 13, 2011
    Inventors: Stephen S. Burns, Mickey W. Kowitz
  • Publication number: 20110010177
    Abstract: A question and answer database expansion apparatus includes: a question and answer database in which questions and answers corresponding to the questions are registered in association with each other, a first speech recognition unit which carries out speech recognition for an input sound signal by using a language model based on the question and answer database, and outputs a first speech recognition result as the recognition result, a second speech recognition unit which carries out speech recognition for the input sound signal by using a language model based on a large vocabulary database, and outputs a second speech recognition result as the recognition result, and a question detection unit which detects an unregistered utterance, which is not registered in the question and answer database, from the input sound based on the first speech recognition result and the second speech recognition result, and outputs the detected unregistered utterance.
    Type: Application
    Filed: July 8, 2010
    Publication date: January 13, 2011
    Applicant: HONDA MOTOR CO., LTD.
    Inventors: Mikio NAKANO, Kotaro FUNAKOSHI, Hiromi NARIMATSU
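One way to picture the question detection unit above is as a comparison between the two recognition results. The sketch below assumes that a large disagreement between the database-constrained result and the large-vocabulary result signals an unregistered utterance. The word-level similarity measure, the 0.8 threshold, and the function name are assumptions for illustration; the abstract only states that detection is based on both results.

```python
from difflib import SequenceMatcher


def is_unregistered(first_result: str, second_result: str,
                    min_similarity: float = 0.8) -> bool:
    """Flag an utterance when the two recognizers disagree strongly.

    first_result:  output of the recognizer using the Q&A-database language model
    second_result: output of the recognizer using the large-vocabulary language model
    min_similarity is a hypothetical threshold, not taken from the application.
    """
    a = first_result.lower().split()
    b = second_result.lower().split()
    # Word-level similarity in [0.0, 1.0]; 1.0 means identical word sequences.
    similarity = SequenceMatcher(None, a, b).ratio()
    return similarity < min_similarity
```

The intuition is that the database-constrained recognizer is forced toward registered questions, so when the unconstrained recognizer hears something quite different, the utterance is probably not in the database.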
  • Publication number: 20110004341
    Abstract: A robot using less storage and computational resources to embody panoramic attention. The robot includes a panoramic attention module with multiple levels that are hierarchically structured to process different levels of information. The top-level of the panoramic attention module receives information about entities detected from the environment of the robot and maps the entities to a panoramic map maintained by the robot. By mapping and storing high-level entity information instead of low-level sensory information in the panoramic map, the amount of storage and computation resources for panoramic attention can be reduced significantly. Further, the mapping and storing of high-level entity information in the panoramic map also facilitates consistent and logical processing of different conceptual levels of information.
    Type: Application
    Filed: June 18, 2010
    Publication date: January 6, 2011
    Applicant: HONDA MOTOR CO., LTD.
    Inventors: Ravi Kiran Sarvadevabhatla, Victor Ng-Thow-Hing
  • Publication number: 20110004474
    Abstract: A method, a system, and a computer program product for determining a total count of audience members within a sensory receiving environment during the presentation of a program. A voice recognition unit is enabled when a signal for a program/subject/event, such as a broadcast program, is received. The voice recognition unit receives one or more sounds in the sensory receiving environment and analyzes the characteristics of the sounds. When one or more unique human voices are identified during the program, a count of the number of unique human voices is determined. The count of unique human voices is transmitted to a server, whereby the count of unique human voices is equal to a count of audience members. The total count of audience members is calculated for all sensory receiving environments associated with the program. An audience analysis graphical user interface is generated to display the total count of audience members.
    Type: Application
    Filed: July 2, 2009
    Publication date: January 6, 2011
    Applicant: International Business Machines Corporation
    Inventors: Ravi P. Bansal, Mike V. Macias, Saidas T. Kottawar, Salil P. Gandhi, Sandip D. Mahajan
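The aggregation step in the abstract above (one count of unique voices per environment, summed on the server) can be sketched as follows. The voice identifiers are assumed to come from some upstream voice recognition unit that is not modeled here, and the function and variable names are hypothetical.

```python
def count_unique_voices(voice_ids):
    """Count distinct voices heard in one environment during a program."""
    return len(set(voice_ids))


def total_audience(environments):
    """Sum the per-environment counts, as a server might.

    environments maps a hypothetical environment name to the list of
    voice identifiers detected there (repeats allowed).
    """
    return sum(count_unique_voices(ids) for ids in environments.values())
```

For example, an environment reporting the identifiers `["v1", "v2", "v1"]` contributes two audience members, because the repeated voice is counted once.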
  • Publication number: 20100332233
    Abstract: A battery-management method is performed by a battery-operated device. The method includes allocating a first portion of a battery capacity to a first function and a second portion of the battery capacity to a second function. The method further includes simultaneously displaying a first indicator relating to the first portion of the battery capacity and a second indicator relating to the second portion of the battery capacity.
    Type: Application
    Filed: September 13, 2010
    Publication date: December 30, 2010
    Applicant: RESEARCH IN MOTION LIMITED
    Inventors: Joseph C. Chen, Jonathan Malton
  • Publication number: 20100332224
    Abstract: In accordance with an example embodiment of the present invention, an apparatus comprises a controller configured to process punctuated text data, and to identify punctuation in said punctuated text data; and an output unit configured to generate audio output corresponding to said punctuated text data, and to generate tactile output corresponding to said identified punctuation.
    Type: Application
    Filed: June 30, 2009
    Publication date: December 30, 2010
    Applicant: NOKIA CORPORATION
    Inventors: Jakke Sakari Mäkelä, Jukka Pekka Naula, Niko Santeri Porjo
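As a rough sketch of the controller and output unit described above, the function below walks through punctuated text and emits spoken segments for the audio channel and punctuation marks for the tactile channel. The event format and the function name are assumptions, and treating every character in `string.punctuation` as a tactile event is a simplification (an apostrophe inside a word such as "don't" would also be split out).

```python
import string


def split_outputs(text):
    """Return a list of ("audio", segment) and ("tactile", mark) events."""
    events = []
    buffer = []

    def flush():
        # Emit any pending spoken text as one audio event.
        spoken = "".join(buffer).strip()
        if spoken:
            events.append(("audio", spoken))
        buffer.clear()

    for ch in text:
        if ch in string.punctuation:
            flush()
            events.append(("tactile", ch))
        else:
            buffer.append(ch)
    flush()
    return events
```

Calling `split_outputs("Hello, world!")` yields audio events for "Hello" and "world" interleaved with tactile events for the comma and the exclamation mark.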
  • Publication number: 20100332235
    Abstract: An intelligent home automation system answers questions from a user located in a home who speaks “natural language”. The system is connected to, and may carry out the user's commands to control, any circuit, object, or system in the home. The system can answer questions by accessing the Internet. Using a transducer that “hears” human pulses, the system may be able to identify, announce and keep track of anyone entering or staying in the home or participating in a conversation, including announcing their identity in advance. The system may interrupt a conversation to implement specific commands and resume the conversation after implementation. The system may have extensible memory structures for term, phrase, relation and knowledge, question answering routines and a parser analyzer that uses transformational grammar and a modified three hypothesis analysis. The parser analyzer can be dormant unless spoken to. The system has emergency modes for prioritization of commands.
    Type: Application
    Filed: June 29, 2009
    Publication date: December 30, 2010
    Inventor: ABRAHAM BEN DAVID
  • Publication number: 20100323615
    Abstract: A mobile device has a datalog module that captures multimedia data at the mobile device and transmits the multimedia data through cell networks to a control center. The mobile device may also include a GPS sensor wherein location information is included within the multimedia data. A mobile device has a motion module that, when activated at the mobile device or through a cell network, disables communications through the mobile device when in motion. A system disables operation of a mobile device by a vehicle operator and includes a transmitter within the vehicle that generates a disabling signal that, when received by a safety receiver within the mobile device, disables operation of the mobile device. A mobile device has a microphone, and a voice augmentation module which is selectively activated to augment voice data spoken into the mobile device, by removing background noise and/or replacing or changing voice data.
    Type: Application
    Filed: June 17, 2010
    Publication date: December 23, 2010
    Inventors: Curtis A. Vock, Perry Youngs
  • Publication number: 20100324899
    Abstract: A speech recognition system for rapidly performing recognition processing while maintaining quality of speech recognition in a speech recognition device is provided. A speech recognition system includes a speech input device which inputs speech and displays a recognition result, and a speech recognition device which receives the speech from the speech input device, performs recognition processing, and sends back the speech to the speech input device. The speech input device includes a user dictionary section which stores words used for recognizing the input speech, and a reduced user dictionary creation unit which extracts words corresponding to the input speech from the user dictionary and creates a reduced user dictionary. The speech recognition device has a speech recognition unit which inputs the input speech and the reduced user dictionary from the speech input/output device and recognizes the input speech based on the reduced user dictionary and a system dictionary provided beforehand.
    Type: Application
    Filed: March 14, 2008
    Publication date: December 23, 2010
    Inventor: Kiyoshi Yamabana
  • Publication number: 20100324897
    Abstract: Acoustic models and language models are learned according to a speaking length which indicates a length of a speaking section in speech data, and speech recognition process is implemented by using the learned acoustic models and language models. A speech recognition apparatus includes means (103) for detecting a speaking section in speech data (101) and for generating a section information which indicates the detected speaking section, means (104) for recognizing a data part corresponding to a section information in the speech data as well as text data (102) written from the speech data and for classifying the data part based on a speaking length thereof, and means (106) for learning acoustic models and language models (107) by using the classified data part (105).
    Type: Application
    Filed: December 7, 2007
    Publication date: December 23, 2010
    Inventors: Tadashi Emori, Yoshifumi Onishi
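The classification by speaking length mentioned above can be illustrated with a simple bucketing step. The sketch assumes that speaking sections arrive as (start, end) times in seconds and that three length classes are used. The class names and the 1.0 and 5.0 second boundaries are hypothetical and do not come from the application.

```python
def classify_by_length(sections, boundaries=(1.0, 5.0)):
    """Group speaking sections into length classes.

    sections:   iterable of (start_seconds, end_seconds) tuples
    boundaries: hypothetical upper limits for the "short" and "medium" classes
    """
    short_max, medium_max = boundaries
    groups = {"short": [], "medium": [], "long": []}
    for start, end in sections:
        length = end - start
        if length < short_max:
            key = "short"
        elif length < medium_max:
            key = "medium"
        else:
            key = "long"
        groups[key].append((start, end))
    return groups
```

Each group could then be used to train its own acoustic and language models, which is the role the abstract assigns to the classified data parts.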
  • Publication number: 20100324910
    Abstract: Techniques and systems to provide speech recognition services over a network using a standard interface are described. In an embodiment, a technique includes accepting a speech recognition request that includes at least audio input, via an application program interface (API). The speech recognition request may also include additional parameters. The technique further includes performing speech recognition on the audio according to the request and any specified parameters; and returning a speech recognition result as a hypertext transfer protocol (HTTP) response. Other embodiments are described and claimed.
    Type: Application
    Filed: June 19, 2009
    Publication date: December 23, 2010
    Applicant: MICROSOFT CORPORATION
    Inventors: Robert L. Chambers, Michael Bodell, Daphne Luong, Annie Wong, Faustinus K. Gozali, Andrew Ho, Rod Philander, Corby Anderson
  • Publication number: 20100318357
    Abstract: Techniques are described for managing various types of content in various ways, such as based on voice commands or other voice-based control instructions provided by a user. In some situations, at least some of the content being managed includes content of a variety of types, such as music and other audio information, photos, images, non-television video information, videogames, Internet Web pages and other data, etc., which may be managed via the voice controls in a variety of ways, such as to allow a user to locate and identify content of potential interest, to schedule recordings of selected content, to manage previously recorded content (e.g., to play or delete the content), to control live television, etc. This abstract is provided to comply with rules requiring it, and is submitted with the intention that it will not be used to interpret or limit the scope or meaning of the claims.
    Type: Application
    Filed: October 22, 2009
    Publication date: December 16, 2010
    Applicant: Vulcan Inc.
    Inventors: Anthony F. Istvan, Korina J.B. Stark, Robin Budd
  • Publication number: 20100318353
    Abstract: The present invention relates generally to the use of compressors, with an optional noise extractor, to improve audio sensing performance of one or more microphones. The audio sensing performance of a single element microphone array with dynamic range compression can be improved by the use of a noise extractor, to modify the operation of the compressor, typically to avoid noise floor amplification. Dynamic range compression can be applied to the output of two or more element microphone array processing with the optional use of a noise extractor. Dynamic range compression can precede the microphone array processing with the optional use of a noise extractor. Syllabic dynamic range compression may be used in one or more element microphone arrays, with the optional use of a noise extractor, which increases speech recognition accuracy.
    Type: Application
    Filed: June 16, 2010
    Publication date: December 16, 2010
    Inventor: Karl M. Bizjak
  • Publication number: 20100312550
    Abstract: An apparatus and method for extending a pronunciation dictionary for speech recognition are provided. The apparatus and the method may segment speech information of an input utterance into at least one phoneme, collect segmentation information of the at least one segmented phoneme, analyze a pronunciation variation of the at least one segmented phoneme based on the collected segmentation information, and select a substitutable phoneme group for the at least one phoneme where the pronunciation variation occurs, and extend the pronunciation dictionary.
    Type: Application
    Filed: February 23, 2010
    Publication date: December 9, 2010
    Inventor: Gil Ho LEE
  • Publication number: 20100312555
    Abstract: A local feedback mechanism for customizing training models based on user data and directed user feedback is provided in speech recognition applications. The feedback data is filtered at different levels to address privacy concerns for local storage and for submittal to a system developer for enhancement of generic training models.
    Type: Application
    Filed: June 9, 2009
    Publication date: December 9, 2010
    Applicant: Microsoft Corporation
    Inventors: Michael D. Plumpe, Julian Odell, Jon Hamaker, Rob Chambers, Christopher Le, Onur Domanic
  • Publication number: 20100312558
    Abstract: A system (10) for controlling telecommunications calls includes a voice XML network gateway (12) including a voice interpreter module (20) and a call center server module (28) in association with a telecommunications switch (58). The voice interpreter module (20) receives voice telecommunications signals, and the call center server module (28) receives call center telecommunications data signals. Interpreting circuitry (22, 24) interprets the voice telecommunications signals using the voice interpreter module in association with speech recognition application(s) (16). Call center service providing means (18) provides call center service in response to the call center telecommunications data signals in association with call center application program(s).
    Type: Application
    Filed: May 10, 2010
    Publication date: December 9, 2010
    Applicant: SOLEO COMMUNICATIONS, INC.
    Inventors: Daniel Gallagher, Richard W. Ibbotson, Michael G. Thorpe, Bruce VanGelder, Luther Wright
  • Publication number: 20100312557
    Abstract: A speech recognition system is provided with iteratively refined multiple passes through the received data to enhance the accuracy of the results by introducing constraints and adaptation from initial passes into subsequent recognition operations. The multiple passes are performed on an initial utterance received from a user. The iteratively enhanced subsequent passes are also performed on following utterances received from the user increasing an overall system efficiency and accuracy.
    Type: Application
    Filed: June 8, 2009
    Publication date: December 9, 2010
    Applicant: Microsoft Corporation
    Inventors: Nikko Strom, Julian Odell, Jon Hamaker