Speech Recognition (epo) Patents (Class 704/E15.001)

  • Publication number: 20100305948
    Abstract: A sub-phoneme model is built given acoustic data which corresponds to a phoneme. The acoustic data is generated by sampling an analog speech signal, producing a sampled speech signal. The sampled speech signal is windowed and transformed into the frequency domain, producing Mel frequency cepstral coefficients of the phoneme. The sub-phoneme model is used in a speech recognition system. The acoustic data of the phoneme is divided into either two or three sub-phonemes. A parameterized model of the sub-phonemes is built, where the model includes Gaussian parameters based on Gaussian mixtures and a length dependency according to a Poisson distribution. A probability score is calculated while adjusting the length dependency of the Poisson distribution. The probability score is a likelihood that the parameterized model represents the phoneme. The phoneme is subsequently recognized using the parameterized model.
    Type: Application
    Filed: June 1, 2009
    Publication date: December 2, 2010
    Inventors: Adam Simone, Roman Budnovich, Avraham Entelis
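
A minimal sketch (not the patented implementation) of the scoring idea in 20100305948: a sub-phoneme is modeled by a Gaussian mixture over MFCC frames plus a Poisson length dependency, and the probability score combines both terms. The mixture parameters and the `score_sub_phoneme` helper are illustrative assumptions.

```python
import numpy as np
from scipy.stats import multivariate_normal, poisson

def gmm_log_likelihood(frames, weights, means, covs):
    """Sum over frames of log p(frame | Gaussian mixture)."""
    per_frame = np.zeros(len(frames))
    for w, mu, cov in zip(weights, means, covs):
        per_frame += w * multivariate_normal.pdf(frames, mean=mu, cov=cov)
    return np.sum(np.log(per_frame + 1e-300))

def score_sub_phoneme(frames, weights, means, covs, expected_len):
    """Probability score = acoustic GMM term + Poisson duration term.

    `expected_len` is the Poisson rate (mean duration in frames); adjusting it
    corresponds to tuning the length dependency mentioned in the abstract.
    """
    acoustic = gmm_log_likelihood(frames, weights, means, covs)
    duration = poisson.logpmf(len(frames), mu=expected_len)
    return acoustic + duration

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim = 13                                 # e.g. 13 MFCCs per frame
    frames = rng.normal(size=(12, dim))      # 12 frames of a hypothetical sub-phoneme
    # A toy 2-component mixture (illustrative, not trained parameters).
    weights = [0.6, 0.4]
    means = [np.zeros(dim), 0.5 * np.ones(dim)]
    covs = [np.eye(dim), 2.0 * np.eye(dim)]
    print(score_sub_phoneme(frames, weights, means, covs, expected_len=10))
```
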
  • Publication number: 20100303224
    Abstract: A method of receiving a call from a first caller that is requesting assistance with a product. Once the call center receives the call, a call-processing switch routes the first caller to a first agent. Once the caller is routed to the first agent, a first message is transmitted to both the first caller's terminal and the first agent's terminal. After the first message is presented to the first caller and the first agent, the call-processing switch monitors the communications stream for distress. During this monitoring, the call-processing switch estimates whether a level of distress is present in the communications stream. If the call-processing switch estimates that distress is present in the communications stream, it transmits a second message to the first caller's terminal and the first agent's terminal.
    Type: Application
    Filed: May 29, 2009
    Publication date: December 2, 2010
    Applicant: AVAYA INC.
    Inventors: George William Erhart, Valentine C. Matula, David Joseph Skiba, Lawrence O'Gorman
  • Publication number: 20100305947
    Abstract: The invention provides a speech recognition method for selecting a combination of list elements via a speech input, wherein a first list element of the combination is part of a first set of list elements and a second list element of the combination is part of a second set of list elements, the method comprising the steps of receiving the speech input, comparing each list element of the first set with the speech input to obtain a first candidate list of best matching list elements, processing the second set using the first candidate list to obtain a subset of the second set, comparing each list element of the subset of the second set with the speech input to obtain a second candidate list of best matching list elements, and selecting a combination of list elements using the first and the second candidate list.
    Type: Application
    Filed: June 2, 2010
    Publication date: December 2, 2010
    Applicant: NUANCE COMMUNICATIONS, INC.
    Inventors: Markus Schwarz, Matthias Schulz, Marc Biedert, Christian Hillebrecht, Franz Gerl, Udo Haiber
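
A rough sketch of the two-pass list matching in 20100305947, e.g. selecting a city and then a street from one spoken input. A real system would score list entries acoustically; here a simple string-similarity score (difflib) stands in for the recognizer, and the lists are invented examples.

```python
from difflib import SequenceMatcher

def score(entry, spoken):
    """Stand-in for an acoustic match score between a list entry and the speech input."""
    return SequenceMatcher(None, entry.lower(), spoken.lower()).ratio()

def recognize_combination(spoken, first_list, second_list_by_first, n_best=3):
    # Pass 1: compare every element of the first set with the speech input.
    first_candidates = sorted(first_list, key=lambda e: score(e, spoken), reverse=True)[:n_best]
    # Restrict the second set to elements belonging to the first candidate list.
    subset = [(c, s) for c in first_candidates for s in second_list_by_first.get(c, [])]
    # Pass 2: compare the subset with the same speech input and pick the best combination.
    return max(subset, key=lambda cs: score(cs[0], spoken) + score(cs[1], spoken))

if __name__ == "__main__":
    cities = ["Springfield", "Springdale", "Shelbyville"]
    streets = {
        "Springfield": ["Evergreen Terrace", "Main Street"],
        "Springdale": ["Oak Avenue"],
        "Shelbyville": ["Elm Street"],
    }
    print(recognize_combination("springfield main street", cities, streets))
```
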
  • Publication number: 20100305951
    Abstract: A system for monitoring hands-free accessibility of media items for play at a vehicle includes a vehicle entertainment computing system (VECS) configured to receive predetermined rules for voice-activated access of the media items. Violations of the rules are detected based on media item metadata. If a violation is detected, a prompt is outputted. Media items are retrieved and played based on voice-activated requests. One embodiment includes a method for monitoring hands-free accessibility of media items for play at a vehicle. A system for formatting media items for accessibility at a VECS includes a media item incompatibility resolution system (MIIRS) configured to resolve violations of the predetermined rules by receiving additional rules relating to formatting violating media items. The media items are searched and the violations addressed by reformatting the media items for voice-activated access. The media items are outputted to the MIIRS.
    Type: Application
    Filed: June 2, 2009
    Publication date: December 2, 2010
    Applicant: Ford Global Technologies, LLC
    Inventors: Jeffrey Raymond Ostrowski, Julius Marchwicki, Darren Peter Shelcusky, Matthew Scott Bourdua
  • Publication number: 20100298010
    Abstract: A method of operating a mobile communication device having a set of one or more applications, each with its own associated user-configurable customization, the method comprising detecting whether the user-configurable customization of any of the applications has changed since an earlier time, and for all applications for which the user-configurable customization has changed since said earlier time, wirelessly transmitting those changes to a remote server. The method further comprises maintaining a set of flags indicating whether changes have occurred to the user-configurable customization, wherein detecting whether the user-configurable customization of any of the applications has changed since said earlier time includes reading the set of flags. The remote server is one of a carrier server and a third party provider server.
    Type: Application
    Filed: December 22, 2009
    Publication date: November 25, 2010
    Applicant: NUANCE COMMUNICATIONS, INC.
    Inventors: Daniel L. ROTH, Laurence S. GILLICK, Jordan COHEN
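
A minimal sketch of the flag-based change tracking described in 20100298010; the server upload is stubbed out with a callable, and all names are illustrative assumptions.

```python
class CustomizationSync:
    """Tracks per-application user customization and flags changes for upload."""

    def __init__(self, upload):
        self._settings = {}    # application name -> user-configurable customization
        self._dirty = set()    # flags: applications changed since the last sync
        self._upload = upload  # callable standing in for the wireless transmit to the server

    def set_customization(self, app, settings):
        if self._settings.get(app) != settings:
            self._settings[app] = settings
            self._dirty.add(app)            # set the flag instead of syncing immediately

    def sync(self):
        """Read the flags and transmit only the changed customizations to the remote server."""
        for app in sorted(self._dirty):
            self._upload(app, self._settings[app])
        self._dirty.clear()

if __name__ == "__main__":
    sync = CustomizationSync(upload=lambda app, s: print("uploading", app, s))
    sync.set_customization("dialer", {"voice_shortcuts": True})
    sync.set_customization("dialer", {"voice_shortcuts": True})   # no change, no flag set
    sync.sync()
```
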
  • Publication number: 20100299137
    Abstract: A game apparatus includes a CPU, and the CPU evaluates a pronunciation of a user with respect to an original sentence (ES). First, envelopes of the volume of the voice of the original sentence (ES) and the volume of the voice of the user are taken, and the average values of the volumes are made uniform. When the volumes are made uniform to each other, a degree of similarity (scoreA) of distributions of local solutions when the volumes are equal to or more than the average values, a degree of similarity (scoreB) of distributions (timing of concaves/convexes of the waveform) of values of the high or low level indicating whether or not the volume is equal to or more than a value obtained by multiplying the average value by a predetermined value, and a degree of similarity (scoreC) of dispersion values (dispersion of concaves/convexes of the waveform) of the envelopes are evaluated by utilizing the respective envelopes.
    Type: Application
    Filed: January 21, 2010
    Publication date: November 25, 2010
    Applicant: NINTENDO CO., LTD.
    Inventor: Tomokazu ABE
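
A simplified sketch of the envelope comparison in 20100299137: frame-level volume envelopes of the original sentence and the user's utterance are mean-normalized, and a scoreB-style similarity is computed from the timing of above-threshold segments. The exact scoreA/scoreB/scoreC definitions are not reproduced; this only illustrates the general idea under assumed parameters.

```python
import numpy as np

def volume_envelope(signal, frame=256):
    """Frame-wise RMS volume envelope."""
    n = len(signal) // frame
    frames = signal[: n * frame].reshape(n, frame)
    return np.sqrt((frames ** 2).mean(axis=1))

def timing_similarity(env_ref, env_user, factor=1.0):
    """Compare where each envelope exceeds `factor` times its own average.

    Roughly corresponds to the scoreB idea (timing of concaves/convexes of the waveform);
    both envelopes are resampled to a common length first.
    """
    n = min(len(env_ref), len(env_user))
    ref = np.interp(np.linspace(0, 1, n), np.linspace(0, 1, len(env_ref)), env_ref)
    usr = np.interp(np.linspace(0, 1, n), np.linspace(0, 1, len(env_user)), env_user)
    # Make the average volumes uniform before comparing.
    ref, usr = ref / ref.mean(), usr / usr.mean()
    high_ref = ref >= factor
    high_usr = usr >= factor
    return float((high_ref == high_usr).mean())   # fraction of frames with matching level

if __name__ == "__main__":
    t = np.linspace(0, 1, 16000)
    original = np.sin(2 * np.pi * 5 * t) * np.sin(2 * np.pi * 200 * t)
    user = 0.5 * np.sin(2 * np.pi * 5 * t + 0.2) * np.sin(2 * np.pi * 180 * t)
    print(timing_similarity(volume_envelope(original), volume_envelope(user)))
```
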
  • Publication number: 20100299142
    Abstract: A system and method for selecting and presenting advertisements based on natural language processing of voice-based inputs is provided. A user utterance may be received at an input device, and a conversational, natural language processor may identify a request from the utterance. At least one advertisement may be selected and presented to the user based on the identified request. The advertisement may be presented as a natural language response, thereby creating a conversational feel to the presentation of advertisements. The request and the user's subsequent interaction with the advertisement may be tracked to build user statistical profiles, thus enhancing subsequent selection and presentation of advertisements.
    Type: Application
    Filed: July 30, 2010
    Publication date: November 25, 2010
    Applicant: VoiceBox Technologies, Inc.
    Inventors: Tom Freeman, Mike Kennewick
  • Publication number: 20100299136
    Abstract: A method for executing a fully mixed initiative dialogue (FMID) interaction between a human and a machine, a dialogue system for a FMID interaction between a human and a machine and a computer readable data storage medium having stored thereon computer code for instructing a computer processor to execute a method for executing a FMID interaction between a human and a machine are provided. The method includes retrieving a predefined grammar setting out parameters for the interaction; receiving a voice input; analysing the grammar to dynamically derive one or more semantic combinations based on the parameters; obtaining semantic content by performing voice recognition on the voice input; and assigning the semantic content as fulfilling the one or more semantic combinations.
    Type: Application
    Filed: October 9, 2008
    Publication date: November 25, 2010
    Applicant: AGENCY FOR SCIENCE, TECHNOLOGY AND RESEARCH
    Inventors: Rong Tong, Shuanhu Bai, Haizhou Li
  • Publication number: 20100291972
    Abstract: Systems and methods for automatically setting reminders. A method for automatically setting reminders includes receiving utterances, determining whether the utterances match a stored phrase, and in response to determining that there is a match, automatically setting a reminder in a mobile communication device. Various filters can be applied to determine whether or not to set a reminder. Examples of suitable filters include location, date/time, callee's phone number, etc.
    Type: Application
    Filed: May 14, 2009
    Publication date: November 18, 2010
    Applicant: International Business Machines Corporation
    Inventors: Salil P. Gandhi, Saidas T. Kottawar, Mike V. Macias, Sandip D. Mahajan
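
A small sketch of the matching-plus-filter flow in 20100291972; the stored phrases, the filters, and the `set_reminder` stub are invented for illustration.

```python
from datetime import datetime

STORED_PHRASES = {"call me back", "don't forget", "remind me"}

def passes_filters(context, allowed_numbers=None, earliest_hour=8, latest_hour=22):
    """Example filters from the abstract: callee's phone number and date/time."""
    if allowed_numbers is not None and context["callee"] not in allowed_numbers:
        return False
    return earliest_hour <= context["time"].hour < latest_hour

def maybe_set_reminder(utterance, context, set_reminder):
    """Set a reminder when the utterance matches a stored phrase and the filters allow it."""
    if any(phrase in utterance.lower() for phrase in STORED_PHRASES):
        if passes_filters(context):
            set_reminder(utterance, context)
            return True
    return False

if __name__ == "__main__":
    ctx = {"callee": "+1-555-0100", "time": datetime(2010, 11, 18, 14, 30)}
    maybe_set_reminder("Please don't forget the report",
                       ctx,
                       set_reminder=lambda u, c: print("reminder set:", u, c["time"]))
```
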
  • Publication number: 20100289661
    Abstract: An apparatus and methods for implementing a garage door monitoring system coupled to a garage door opener. The door monitoring system allows a user to actuate the door under control via a network connection. In at least one embodiment, the door monitoring system is controlled by a cell phone or networked appliance capable of transmitting information and data via a cellular telephone network. The door monitoring system provides the added advantage of allowing a remote user to view the areas or regions near the door under control prior to actuating the door. At least one embodiment comprises a method to validate reception of the pictures or video clips of the areas or regions near the door prior to enabling the system to actuate the door. In another embodiment, a pass code is embedded into the pictures or video provided to the remote user.
    Type: Application
    Filed: August 6, 2010
    Publication date: November 18, 2010
    Inventors: Justin R. Styers, Ryan H. McDowell
  • Publication number: 20100292991
    Abstract: Embodiments of the present invention provide a method for controlling a game system by speech and a game system thereof. The method includes collecting a speech command, storing the speech command in association with a game command; receiving a speech command from a user during a game, searching for a game command associated with the speech command, and controlling a game system using the game command found. The game system includes a speech collecting module, an associated storage module, a speech command recognizing module and a game controlling module. The present invention can implement control of a game system using speech.
    Type: Application
    Filed: July 27, 2010
    Publication date: November 18, 2010
    Applicant: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventor: Jing LV
  • Publication number: 20100286987
    Abstract: An apparatus and method for generating an avatar based video message are provided. The apparatus and method are capable of generating an avatar based video message based on speech of a user. The avatar based video message apparatus and method display information that corresponds to input user speech, edit the input user speech according to a user input signal with reference to the displayed information, generate avatar animation according to the edited speech, and generate an avatar based video message based on the edited speech and the avatar animation.
    Type: Application
    Filed: April 5, 2010
    Publication date: November 11, 2010
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Ick-sang HAN, Jeong-mi Cho
  • Publication number: 20100286984
    Abstract: A method for the voice recognition of a spoken expression to be recognized, comprising a plurality of expression parts that are to be recognized. Partial voice recognition takes place on a first selected expression part, and depending on a selection of hits for the first expression part detected by the partial voice recognition, voice recognition on the first and further expression parts is executed.
    Type: Application
    Filed: June 18, 2008
    Publication date: November 11, 2010
    Inventors: Michael Wandinger, Jesus Fernando Guitarte Perez, Bernhard Littel
  • Publication number: 20100280820
    Abstract: Methods and systems for testing and analyzing integrated voice response systems are provided. Computer devices are used to simulate caller responses or inputs to components of the integrated voice response systems. The computer devices receive responses from the components. The responses may be in the form of VXML and grammar files that are used to implement call flow logic. The responses may be analyzed to evaluate the performance of the components and/or call flow logic.
    Type: Application
    Filed: June 8, 2010
    Publication date: November 4, 2010
    Inventor: Vijay Chandar NATESAN
  • Publication number: 20100273505
    Abstract: A method may include connecting to another user device, identifying a geographic location of the other user device, identifying a geographic location of the user device, mapping a sound source associated with the other user device, based on the geographic location of the other user device with respect to the geographic location of the user device, to a location of an auditory space associated with a user of the user device, placing the sound source in the location of the auditory space, and emitting, based on the placing, the sound source so that the sound source is capable of being perceived by the user in the location of the auditory space.
    Type: Application
    Filed: April 24, 2009
    Publication date: October 28, 2010
    Applicant: Sony Ericsson Mobile Communications AB
    Inventors: Ted MOLLER, Ian RATTIGAN
  • Publication number: 20100268538
    Abstract: Disclosed are an electronic apparatus and a voice recognition method for the same. The voice recognition method for the electronic apparatus includes: receiving an input voice of a user; determining characteristics of the user; and recognizing the input voice based on the determined characteristics of the user.
    Type: Application
    Filed: January 7, 2010
    Publication date: October 21, 2010
    Applicant: Samsung Electronics Co., Ltd.
    Inventors: Hee-seob RYU, Seung-kwon PARK, Jong-ho LEA, Jong-hyuk JANG
  • Publication number: 20100260273
    Abstract: A method of maintaining signal convergence of a transmitter and a receiver of an adaptive differential pulse code modulated (ADPCM) communication system during discontinuous transmission, by generating synchronized comfort noise frame in the transmitter and in the receiver during quiet periods and mutually updating the receiver's decoder and encoder on their states.
    Type: Application
    Filed: April 13, 2009
    Publication date: October 14, 2010
    Applicant: DSP Group Limited
    Inventors: Mark Raifel, Yaakov Chen, Eli Fogel
  • Publication number: 20100256972
    Abstract: An interpretation system that includes an optical or audio acquisition device for acquiring a sentence written or spoken in a source language and an audio restoration device for generating, from an input signal acquired by the acquisition device, a source sentence that is a transcription of the sentence in the source language. The interpretation system further includes a translation device for generating, from the source sentence, a target sentence that is a translation of the source sentence in a target language, and a speech synthesis device for generating, from the target sentence, an output audio signal reproduced by the audio restoration device. The interpretation system includes a smoothing device for calling the recognition, translation and speech synthesis devices in order to produce in real time an interpretation in the target language of the sentence in the source language.
    Type: Application
    Filed: November 18, 2008
    Publication date: October 7, 2010
    Inventor: Jean Grenier
  • Publication number: 20100256979
    Abstract: Services and performance characteristics are more and more frequently defined according to standard descriptions and formats; this is also the case for announcement services and dialogue services required especially in network services, for example. The associated descriptions are also provided in a standard form, e.g. by means of VoiceXML. When a service is introduced in the network, said descriptions are inserted into the network nodes, application, and/or media server. A browser functionality which reads and interprets the VoiceXML pages is required for processing the VoiceXML description on a media server platform, such that the necessary basic functions of the media server can be allocated to the desired service and can be controlled. Prior art uses media servers comprising a single VoiceXML browser. The problem with such commercial products is the resulting suboptimal resource utilization and expense, particularly for simple applications.
    Type: Application
    Filed: January 12, 2007
    Publication date: October 7, 2010
    Applicant: NOKIA SIEMENS NETWORKS GMBH & CO KG
    Inventors: Detlev Freund, Norbert Lobig
  • Publication number: 20100256977
    Abstract: Described is a technology by which a maximum entropy (MaxEnt) model, such as used as a classifier or in a conditional random field or hidden conditional random field that embed the maximum entropy model, uses continuous features with continuous weights that are continuous functions of the feature values (instead of single-valued weights). The continuous weights may be approximated by a spline-based solution. In general, this converts the optimization problem into a standard log-linear optimization problem without continuous weights at a higher-dimensional space.
    Type: Application
    Filed: April 1, 2009
    Publication date: October 7, 2010
    Applicant: Microsoft Corporation
    Inventors: Dong Yu, Li Deng, Alejandro Acero
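
A toy sketch of the idea in 20100256977: instead of a single weight per continuous feature, the feature value is expanded onto a piecewise-linear (spline) basis, so the effective weight becomes a continuous function of the value while training remains a standard log-linear optimization in the higher-dimensional space. The knot placement and the logistic-regression stand-in are assumptions, not the patented method.

```python
import numpy as np

def linear_spline_basis(x, knots):
    """Map a scalar feature value to piecewise-linear 'hat' basis activations.

    A weight vector over these activations defines a continuous, piecewise-linear
    weight function of the raw feature value.
    """
    phi = np.zeros(len(knots))
    x = np.clip(x, knots[0], knots[-1])
    i = np.searchsorted(knots, x, side="right") - 1
    i = min(i, len(knots) - 2)
    t = (x - knots[i]) / (knots[i + 1] - knots[i])
    phi[i], phi[i + 1] = 1 - t, t
    return phi

if __name__ == "__main__":
    knots = np.linspace(-3.0, 3.0, 7)
    rng = np.random.default_rng(1)
    # Toy binary task whose true decision depends nonlinearly on one continuous feature.
    x = rng.normal(size=500)
    y = (np.abs(x) > 1.0).astype(float)
    X = np.array([linear_spline_basis(v, knots) for v in x])
    # Plain log-linear (logistic) training by gradient ascent in the expanded space.
    w = np.zeros(len(knots))
    for _ in range(500):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w += 0.1 * X.T @ (y - p) / len(y)
    print("learned weight function at the knots:", np.round(w, 2))
```
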
  • Publication number: 20100256976
    Abstract: The interactive authentication system allows a consumer to interact with a base station, such as broadcast media (e.g., television and radio) or PC, to receive coupons, special sales offers, and other information with an electronic card. The electronic card can also be used to transmit a signal that can be received by the base station to perform a wide variety of tasks. These tasks can include launching an application, authenticating a user at a website, and completing a sales transaction at a website (e.g., by filling out a form automatically). The interaction between the base station and the electronic card is accomplished by using the conventional sound system in the base station so that a special reader hardware need not be installed to interact with the electronic card. The user is equipped with an electronic card that can receive and transmit data via sound waves. In the various embodiments, the sound waves can be audible or ultrasonic (which can be slightly audible to some groups of people).
    Type: Application
    Filed: April 1, 2010
    Publication date: October 7, 2010
    Applicant: BeepCard Ltd.
    Inventors: Alon ATSMON, Amit Antebi, Tsvi Lev, Moshe Cohen, Gavriel Speyer, Alan Sege, Nathan Altman, Rami Anati
  • Publication number: 20100246784
    Abstract: A method may include receiving, at a first device associated with a first party, communications from a second device. The communications may be associated with a communication session that includes an audio conversation, a text-based conversation or a multimedia conversation. The method may also include identifying a word or phrase from the communication session and retrieving, from a memory included in the first device, information associated with the word or phrase. The method may further include outputting, to a display associated with the first device, the retrieved information.
    Type: Application
    Filed: March 27, 2009
    Publication date: September 30, 2010
    Applicant: VERIZON PATENT AND LICENSING INC.
    Inventors: Kristopher T. Frazier, Heath Stallings
  • Publication number: 20100250341
    Abstract: A system and method for predicting what content a user wants to view based on such user's previous behavior and actions, comprising: receiving a cookie for every content page template in a web site; receiving a request for service of a content page; sending the content requested to a requester; for each content page sent, retrieving the cookie from the user; assigning a unique identifier (ID) to each new requester and storing the ID in the cookie; recording each ID, IP address, referrer, and time of request from the server; and storing the data recorded in a buffer for a period of time before storing it more permanently in a client-specific database. The system can be monetized by receiving fees from end users for presenting the content preferences or by receiving fees from content providers that include advertising related to the content preferences.
    Type: Application
    Filed: June 7, 2010
    Publication date: September 30, 2010
    Applicant: DAILYME, INC.
    Inventor: Eduardo A. Hauser
  • Publication number: 20100245585
    Abstract: A hands-free wireless wearable GPS enabled video camera and audio-video communications headset, mobile phone and personal media player, capable of real-time two-way and multi-feed wireless voice, data and audio-video streaming, telecommunications, and teleconferencing, coordinated applications, and shared functionality between one or more wirelessly networked headsets or other paired or networked wired or wireless devices and optimized device and data management over multiple wired and wireless network connections. The headset can operate in concert with one or more wired or wireless devices as a paired accessory, as an autonomous hands-free wide area, metro or local area and personal area wireless audio-video communications and multimedia device and/or as a wearable docking station, hot spot and wireless router supporting direct connect multi-device ad-hoc virtual private networking (VPN).
    Type: Application
    Filed: March 1, 2010
    Publication date: September 30, 2010
    Inventors: Ronald Eugene FISHER, Bryan Jonathan Davis, Bradley Brian Bushard, Mark Joseph Meyer, James Fisher, Nitin Patil, Ben Young, Daniel Johnson
  • Publication number: 20100241418
    Abstract: A speech recognition device includes one or more intention-extracting language models in which an intention of a focused specific task is inherent, an absorbing language model in which no intention of the task is inherent, a language score calculating section that calculates a language score indicating a linguistic similarity between the content of an utterance and each of the intention-extracting language models and the absorbing language model, and a decoder that estimates an intention in the content of an utterance based on a language score of each of the language models calculated by the language score calculating section.
    Type: Application
    Filed: March 11, 2010
    Publication date: September 23, 2010
    Applicant: Sony Corporation
    Inventors: Yoshinori Maeda, Hitoshi Honda, Katsuki Minamino
  • Publication number: 20100235167
    Abstract: One or more embodiments include a speech recognition learning system for improved speech recognition. The learning system may include a speech optimizing system. The optimizing system may receive a first stimulus data package including spoken utterances having at least one phoneme, and contextual information. A number of result data packages may be retrieved which include stored spoken utterances and contextual information. A determination may be made as to whether the first stimulus data package requires improvement. A second stimulus data package may be generated based on the determination. A number of speech recognition implementation rules for implementing the second stimulus data package may be received. The rules may be associated with the contextual information. A determination may be made as to whether the second stimulus data package requires further improvement.
    Type: Application
    Filed: March 13, 2009
    Publication date: September 16, 2010
    Inventor: Francois Bourdon
  • Publication number: 20100235169
    Abstract: Method for differentiation between voices including 1) analyzing perceptually relevant signal properties of the voices, e.g. average pitch and pitch variance, 2) determining sets of parameters representing the signal properties of the voices, and finally 3) extracting voice modification parameters representing modified signal properties of at least some of the voices. Hereby it is possible to increase a mutual parameter distance between the voices, and thereby the perceptual difference between the voices, when the voices have been modified according to the voice modification parameters. Preferably most of or all voices are modified in order to limit the amount of modification of one parameter.
    Type: Application
    Filed: May 15, 2007
    Publication date: September 16, 2010
    Applicant: Koninklijke Philips Electronics N.V.
    Inventor: Aki Sakari Harma
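
An illustrative sketch of the differentiation step in 20100235169: estimate an average-pitch parameter per voice, then push the parameters apart so the mutual parameter distance (and hence the perceptual difference after modification) increases, spreading the change over the voices rather than modifying one voice heavily. The spreading rule here is a simple assumption.

```python
import numpy as np

def spread_parameters(values, min_gap, passes=10):
    """Iteratively nudge neighbouring parameter values apart toward a minimum gap.

    Each adjustment is shared between both neighbours, so no single voice is
    modified heavily; a few relaxation passes usually suffice for small sets.
    """
    order = np.argsort(values)
    spread = np.array(values, dtype=float)
    for _ in range(passes):
        for a, b in zip(order[:-1], order[1:]):
            gap = spread[b] - spread[a]
            if gap < min_gap:
                push = (min_gap - gap) / 2.0
                spread[a] -= push      # share the modification between both voices
                spread[b] += push
    return spread

if __name__ == "__main__":
    average_pitches_hz = [118.0, 121.0, 180.0, 182.0]   # two pairs of similar voices
    print(spread_parameters(average_pitches_hz, min_gap=10.0))
```
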
  • Publication number: 20100228683
    Abstract: An exemplary system is adapted for developing, testing, and operating payments and funds transfer systems, such as, for example, issuing systems, acquiring systems, and/or payment networks/systems. A model content repository stores elements for system models. An integrated development environment allows users to access the model content repository and design models. A deployment manager is adapted to compile and test models that have been defined and stored in the model content repository. The compiled code is executed in a platform runtime environment. Information that is collected from an executing system is communicated to the integrated development environment where the information may be presented to a user.
    Type: Application
    Filed: March 6, 2010
    Publication date: September 9, 2010
    Applicant: TxVia, Inc.
    Inventors: Carl Ansley, Anil Datt Aggarwal
  • Publication number: 20100228548
    Abstract: Techniques for enhanced automatic speech recognition are described. An enhanced ASR system may be operative to generate an error correction function. The error correction function may represent a mapping between a supervised set of parameters and an unsupervised training set of parameters generated using a same set of acoustic training data, and apply the error correction function to an unsupervised testing set of parameters to form a corrected set of parameters used to perform speaker adaptation. Other embodiments are described and claimed.
    Type: Application
    Filed: March 9, 2009
    Publication date: September 9, 2010
    Applicant: MICROSOFT CORPORATION
    Inventors: Chaojun Liu, Yifan Gong
  • Publication number: 20100228540
    Abstract: Systems and methods for query-based searching using spoken input are disclosed. In systems and methods according to embodiments of the invention, continuous speech natural language queries are accepted from a user using a client device. Speech processing tasks are divided between the client device and one or more server systems. Once user speech is recognized, the system searches one or more data repositories containing queries for at least one query that matches the recognized speech and returns information related to the query.
    Type: Application
    Filed: May 20, 2010
    Publication date: September 9, 2010
    Applicant: PHOENIX SOLUTIONS, INC.
    Inventor: Ian M. Bennett
  • Publication number: 20100219936
    Abstract: The basic invention uses biometric signals to help identify the name of a family member, acquaintance or newly met individual. The biometric signals include facial and voice recognition. In addition, the invention can interactively produce the name of an individual met for the first time just after that individual shakes your hand and introduces themselves; most people forget this name since they are concentrating on maintaining a conversation. A handheld unit identifies the individual and whispers their name into the ear canal: an earphone plug physically connects the handheld unit to the ear canal of the user via a wire, so the identity of the individual is electronically passed over the wire to the user. By touching the portable unit, which is inserted into the ear canal, the user has the name whispered into the canal.
    Type: Application
    Filed: May 17, 2010
    Publication date: September 2, 2010
    Applicant: LCTank LLC
    Inventor: Constance Gabara
  • Publication number: 20100223060
    Abstract: The present invention relates to a speech interactive system and method. The system comprises a target information receiving module, an interactive mode setting and speech processing module, an interactive information update module, a decision module, and an output response module. It receives target information and sets corresponding target text sentence information. It also receives a user's speech signal, sets an interactive mode, decides the speech's target text sentence information, and generates an assessment for the target text sentence. Under the set interactive mode, the system updates the information in an interactive information recording table according to the assessment and a timing count. According to the interactive mode and the recorded information, an output mode for the target text sentence information is generated. According to the output mode and the recorded information, the response information is generated.
    Type: Application
    Filed: August 14, 2009
    Publication date: September 2, 2010
    Inventors: Yao-Yuan Chang, Sen-Chia Chang, Shih-Chieh Chien, Jia-Jang Tu
  • Publication number: 20100217602
    Abstract: The present invention relates, in general, to a combined mirror and presentation medium, which allows at least one of various presentation bodies to be inserted thereinto, acts as a mirror such that the inside thereof cannot be seen at normal times, and enables the display of an inserted presentation body to the outside at the time of illumination of the inside of the presentation medium.
    Type: Application
    Filed: June 22, 2006
    Publication date: August 26, 2010
    Inventor: Yong-Kun Kim
  • Publication number: 20100217588
    Abstract: Moving information of an object is input, and first sound information around the object is input. A motion status of the object is recognized based on the moving information. Second sound information is selectively extracted from the first sound information, based on the motion status. A first feature quantity is extracted from the second sound information. A plurality of models is stored in a memory. Each model has a second feature quantity and a corresponding specified context. The second feature quantity is previously extracted by the second extraction unit before the first feature quantity is extracted. A present context of the object is decided based on the specified context corresponding to the second feature quantity most similar to the first feature quantity. The present context of the object is output.
    Type: Application
    Filed: February 19, 2010
    Publication date: August 26, 2010
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Kazushige OUCHI, Miwako Doi, Kazunori Imoto, Masaaki Kikuchi, Rika Hosaka
  • Publication number: 20100217596
    Abstract: In one aspect, a method for processing media includes accepting a query. One or more language patterns are identified that are similar to the query. A putative instance of the query is located in the media. The putative instance is associated with a corresponding location in the media. The media in a vicinity of the putative instance is compared to the identified language patterns and data characterizing the putative instance of the query is provided according to the comparing of the media to the language patterns, for example, as a score for the putative instance that is determined according to the comparing of the media to the language patterns.
    Type: Application
    Filed: February 24, 2009
    Publication date: August 26, 2010
    Applicant: Nexidia Inc.
    Inventors: Robert W. Morris, Jon A. Arrowood, Mark A. Clements, Kenneth King Griggs, Peter S. Cardillo, Marsal Gavalda
  • Publication number: 20100217593
    Abstract: An information storage medium stores a program for generating Hidden Markov Models to be used for speech recognition with a given speech recognition system. The program causes a computer to function as a scheduled-to-be-used model group storage section that stores a scheduled-to-be-used model group including a plurality of Hidden Markov Models scheduled to be used by the given speech recognition system, and a filler model generation section that generates Hidden Markov Models to be used as filler models by the given speech recognition system based on all or at least a part of the Hidden Markov Models in the scheduled-to-be-used model group.
    Type: Application
    Filed: February 5, 2010
    Publication date: August 26, 2010
    Applicant: SEIKO EPSON CORPORATION
    Inventors: Paul W. Shields, Matthew E. Dunnachie, Yasutoshi Takizawa
  • Publication number: 20100211396
    Abstract: A digital speech enabled middleware module is disclosed that facilitates interaction between a large number of client devices and network-based automatic speech recognition (ASR) resources. The module buffers feature vectors associated with speech received from the client devices when the number of client devices is greater than the available ASR resources. When an ASR decoder becomes available, the module transmits the feature vectors to the ASR decoder and a recognition result is returned.
    Type: Application
    Filed: May 3, 2010
    Publication date: August 19, 2010
    Applicant: AT&T Intellectual Property II, LP via transfer from AT&T Corp.
    Inventors: Iker Arizmendi, Sarangarajan Parthasarathy, Richard Cameron Rose
  • Publication number: 20100211376
    Abstract: Computer implemented speech processing generates one or more pronunciations of an input word in a first language by a non-native speaker of the first language who is a native speaker of a second language. The input word is converted into one or more pronunciations. Each pronunciation includes one or more phonemes selected from a set of phonemes associated with the second language. Each pronunciation is associated with the input word in an entry in a computer database. Each pronunciation in the database is associated with information identifying a pronunciation language and/or a phoneme language.
    Type: Application
    Filed: February 2, 2010
    Publication date: August 19, 2010
    Applicant: Sony Computer Entertainment Inc.
    Inventors: Ruxin Chen, Gustavo Hernandez-Abrego, Masanori Omote, Xavier Menendez-Pidal
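
A small sketch of the kind of database entry described in 20100211376: an input word in a first language is associated with pronunciations built from the phoneme set of the speaker's native (second) language, each tagged with a pronunciation language and a phoneme language. The phoneme strings and the example mapping are invented, not output of the patented conversion.

```python
from dataclasses import dataclass, field

@dataclass
class PronunciationEntry:
    word: str            # input word in the first language
    word_language: str
    pronunciations: list = field(default_factory=list)

    def add(self, phonemes, pronunciation_language, phoneme_language):
        """Associate one pronunciation with the word, tagged with its languages."""
        self.pronunciations.append({
            "phonemes": phonemes,                      # drawn from the second language's set
            "pronunciation_language": pronunciation_language,
            "phoneme_language": phoneme_language,
        })

if __name__ == "__main__":
    # English word as it might be pronounced by a native Spanish speaker (illustrative).
    entry = PronunciationEntry(word="video", word_language="en")
    entry.add(["b", "i", "d", "e", "o"], pronunciation_language="en", phoneme_language="es")
    entry.add(["v", "I", "d", "i", "o"], pronunciation_language="en", phoneme_language="en")
    print(entry)
```
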
  • Publication number: 20100202596
    Abstract: An electronically authenticated internet voice connection can be initiated on an institution's website. Authentication of the customer's identity can be determined based upon already established credentials, such as a username and password. Upon verifying the identity of the customer, the institution's web server can generate and transmit a unique identifier to the customer's browser. The unique identifier can be an encrypted identifier used to authenticate the customer when establishing a subsequent voice connection.
    Type: Application
    Filed: February 12, 2009
    Publication date: August 12, 2010
    Applicant: International Business Machines Corporation
    Inventors: Scott M. Andrews, Anthony B. Ferguson, David P. Moore, John T. Robertson
  • Publication number: 20100204992
    Abstract: A process for recognizing an acoustic event in an audio signal has two stages. The first stage involves possible candidates being selected, and the second stage involves each of the possible candidates being allocated a confidence value.
    Type: Application
    Filed: August 25, 2008
    Publication date: August 12, 2010
    Inventor: Markus Schlosser
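
A schematic sketch of the two-stage structure in 20100204992: a cheap first stage proposes candidate positions for an acoustic event, and a second stage assigns each candidate a confidence value. Energy peaks and template correlation are placeholder choices for the two stages, not the patented criteria.

```python
import numpy as np

def select_candidates(signal, frame=512, energy_factor=3.0):
    """Stage 1: frames whose energy clearly exceeds the average become candidates."""
    n = len(signal) // frame
    energy = (signal[: n * frame].reshape(n, frame) ** 2).mean(axis=1)
    return [i * frame for i in np.flatnonzero(energy > energy_factor * energy.mean())]

def confidence(signal, start, template):
    """Stage 2: normalized correlation with an event template as a confidence value."""
    segment = signal[start : start + len(template)]
    if len(segment) < len(template):
        return 0.0
    denom = np.linalg.norm(segment) * np.linalg.norm(template)
    return float(abs(np.dot(segment, template)) / denom) if denom else 0.0

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    template = np.hanning(512) * np.sin(2 * np.pi * np.arange(512) / 16)
    signal = 0.01 * rng.normal(size=16000)
    signal[4096:4608] += template          # hide one event in the noise
    for start in select_candidates(signal):
        print(start, round(confidence(signal, start, template), 3))
```
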
  • Publication number: 20100205534
    Abstract: A method of responding to an emergency includes monitoring for an internet protocol based help request message; receiving the help request message; evaluating the help request message and additional information; and dispatching emergency help in response to the help request message and the additional information.
    Type: Application
    Filed: April 27, 2010
    Publication date: August 12, 2010
    Applicant: AT&T Intellectual Property I, L.P. f/k/a BellSouth Intellectual Property Corporation
    Inventors: Samuel N. Zellner, Mark J. Enzmann, Robert T. Moton, JR.
  • Publication number: 20100204982
    Abstract: Embodiments of a dialog system that utilizes a grammar-based labeling scheme to generate labeled sentences for use in training statistical models. During the process of training data development, a grammar is constructed manually based on the application domain or adapted from a general grammar rule. An annotation schema is created accordingly based on the application requirements, such as syntactic and semantic information. Such information is then included in the grammar specification. After the labeled grammar is constructed, a generation algorithm is then used to generate sentences for training various statistical models.
    Type: Application
    Filed: February 6, 2009
    Publication date: August 12, 2010
    Applicant: ROBERT BOSCH GMBH
    Inventors: Fuliang Weng, Zhe Feng, Katrina Li
  • Publication number: 20100204987
    Abstract: A speech recognition device is disclosed. The device obtains sound of speech of a user and an image of a lip shape of the user. The device determines whether a sudden noise is generated while the user is speaking. When it is determined that a sudden noise is not generated, the device recognizes content of the speech based on the sound of the speech. When it is determined that a sudden noise is generated, the device recognizes the content of the speech based on the image of the lip shape of the user.
    Type: Application
    Filed: February 3, 2010
    Publication date: August 12, 2010
    Applicant: DENSO CORPORATION
    Inventor: Hideo Miyauchi
  • Publication number: 20100205120
    Abstract: A method for researching and developing a recognition model in a computing environment, including gathering one or more data samples from one or more users in the computing environment into a training data set used for creating the recognition model, receiving one or more training parameters defining a feature extraction algorithm configured to analyze one or more features of the training data set, a classifier algorithm configured to associate the features to a template set, a selection of a subset of the training data set, a type of the data samples, or combinations thereof, creating the recognition model based on the training parameters, and evaluating the recognition model.
    Type: Application
    Filed: February 6, 2009
    Publication date: August 12, 2010
    Applicant: Microsoft Corporation
    Inventors: Yu Zou, Hao Wei, Gong Cheng, Dongmei Zhang, Jian Wang
  • Publication number: 20100198093
    Abstract: A voice recognition apparatus includes: a voice input element for inputting voice of a user; a voice pattern memory for storing multiple voice patterns respectively corresponding to multiple phrases; a voice recognition element for calculating a similarity degree between the voice and each voice pattern and determining the highest similarity degree so that one voice pattern corresponding to the highest similarity degree is recognized as the voice; a display for displaying a recognition result corresponding to the one voice pattern; an execution determination element for executing a process according to the one voice pattern when a predetermined operation is input by the user; a load estimation element for estimating a load of the user; and a display controller for controlling the display based on a positive correlation between the load and display repetition of the recognition result on the display.
    Type: Application
    Filed: February 2, 2010
    Publication date: August 5, 2010
    Applicant: DENSO CORPORATION
    Inventors: Yuusuke Katayama, Katsushi Asami, Manabu Otsuka
  • Publication number: 20100198597
    Abstract: Methods, speech recognition systems, and computer readable media are provided that recognize speech using dynamic pruning techniques. A search network is expanded based on a frame from a speech signal, a best hypothesis is determined in the search network, a default beam threshold is modified, and the search network is pruned using the modified beam threshold. The search network may be further pruned based on the search depth of the best hypothesis and/or the average number of frames per state for a search path.
    Type: Application
    Filed: January 30, 2009
    Publication date: August 5, 2010
    Inventor: Qifeng ZHU
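
A compact sketch of the beam-pruning step in 20100198597: find the best hypothesis for the current frame, modify the default beam (here as a function of how many hypotheses are active, an assumed heuristic rather than the patented rule), and drop every hypothesis whose score falls below best minus beam.

```python
def prune(hypotheses, default_beam, target_active=1000):
    """Prune a dict of {hypothesis: log score} with a dynamically adjusted beam."""
    best = max(hypotheses.values())
    # Modify the default beam threshold: tighten it when too many hypotheses are active
    # (an illustrative adjustment rule, not the one claimed in the patent).
    beam = default_beam * min(1.0, target_active / max(len(hypotheses), 1))
    return {h: s for h, s in hypotheses.items() if s >= best - beam}

if __name__ == "__main__":
    hyps = {"the cat": -10.0, "the cap": -12.5, "a cat": -25.0, "duh cad": -40.0}
    print(prune(hyps, default_beam=20.0, target_active=2))
```
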
  • Publication number: 20100198582
    Abstract: Nothing like this Verbal Command Laptop Computer and Software exists worldwide to transfer electronic information data of EDI. It can be used by the elderly when they need to scan something: just set the item on a scanner and say, "scan, please." The Verbal Command Laptop Computer and Software can be used to store names and addresses, and also as a fax machine: just say, "fax, please" or "email please." When the user wishes to use email, say, "check email please" or "send email please." When wanting to use the Internet, say, "Internet please" or, for the search engine, say, "search engine." The Verbal Command Laptop Computer and Software can handle all phases of a standard computer and also has its own search engine. One can also use verbal commands hands free with the Cordless Microphone or Verbal Head Set.
    Type: Application
    Filed: February 2, 2009
    Publication date: August 5, 2010
    Inventor: Gregory Walker Johnson
  • Publication number: 20100198591
    Abstract: A portable terminal having an audio pickup means that acquires sound, an absolute position detection unit that detects the absolute position of the portable terminal, a relative position detection unit that detects the relative position of the portable terminal, and a speech recognition and synthesis unit that recognizes the audio acquired by the audio pickup means as speech, is achieved with a simple configuration. A portable terminal (1) that exchanges data with a server (2) has, disposed in the portable terminal, an audio pickup means that acquires sound, an absolute position detection unit (1-1) that detects the absolute position of the portable terminal, a relative position detection unit (1-2) that detects the relative position of the portable terminal, and a speech recognition and synthesis unit (1-3) that recognizes the audio acquired by the audio pickup means as speech.
    Type: Application
    Filed: February 2, 2010
    Publication date: August 5, 2010
    Applicant: SEIKO EPSON CORPORATION
    Inventors: Junichi Yoshizawa, Tetsuo Ozawa, Koji Koseki
  • Publication number: 20100198598
    Abstract: A method for recognizing a speaker of an utterance in a speech recognition system is disclosed. A likelihood score for each of a plurality of speaker models for different speakers is determined. The likelihood score indicating how well the speaker model corresponds to the utterance. For each of the plurality of speaker models, a probability that the utterance originates from that speaker is determined. The probability is determined based on the likelihood score for the speaker model and requires the estimation of a distribution of likelihood scores expected based at least in part on the training state of the speaker.
    Type: Application
    Filed: February 4, 2010
    Publication date: August 5, 2010
    Applicant: NUANCE COMMUNICATIONS, INC.
    Inventors: Tobias Herbig, Franz Gerl
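
A minimal sketch of turning per-speaker-model likelihood scores into posterior probabilities, as in 20100198598. The patent additionally conditions the expected score distribution on each model's training state; here that is reduced to a simple per-model prior, which is an assumption made only to keep the example short.

```python
import math

def speaker_posteriors(log_likelihoods, priors=None):
    """Convert {speaker: log likelihood of the utterance} into {speaker: probability}."""
    speakers = list(log_likelihoods)
    if priors is None:
        priors = {s: 1.0 / len(speakers) for s in speakers}
    # log p(s | utterance) is proportional to log p(utterance | s) + log p(s);
    # normalize with the log-sum-exp trick for numerical stability.
    joint = {s: log_likelihoods[s] + math.log(priors[s]) for s in speakers}
    m = max(joint.values())
    z = sum(math.exp(v - m) for v in joint.values())
    return {s: math.exp(v - m) / z for s, v in joint.items()}

if __name__ == "__main__":
    scores = {"alice": -1210.0, "bob": -1216.0, "unknown": -1213.0}
    # A less-trained model might get a smaller prior (stand-in for the training state).
    print(speaker_posteriors(scores, priors={"alice": 0.45, "bob": 0.45, "unknown": 0.10}))
```
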
  • Publication number: 20100198593
    Abstract: Enhancing speech components of an audio signal composed of speech and noise components includes controlling the gain of the audio signal in ones of its subbands, wherein the gain in a subband is reduced as the level of estimated noise components increases with respect to the level of speech components, wherein the level of estimated noise components is determined at least in part by (1) comparing an estimated noise components level with the level of the audio signal in the subband and increasing the estimated noise components level in the subband by a predetermined amount when the input signal level in the subband exceeds the estimated noise components level in the subband by a limit for more than a defined time, or (2) obtaining and monitoring the signal-to-noise ratio in the subband and increasing the estimated noise components level in the subband by a predetermined amount when the signal-to-noise ratio in the subband exceeds a limit for more than a defined time.
    Type: Application
    Filed: September 10, 2008
    Publication date: August 5, 2010
    Applicant: DOLBY LABORATORIES LICENSING CORPORATION
    Inventor: Rongshan Yu
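
A rough per-subband sketch of the behaviour described in 20100198593: the gain applied to a subband shrinks as the estimated noise level grows relative to the signal, and the noise estimate is increased by a predetermined amount when the subband level stays above the estimate by more than a limit for longer than a defined time (variant (1) in the abstract). The constants and the gain rule are illustrative assumptions.

```python
class SubbandNoiseTracker:
    def __init__(self, noise_init=1e-4, limit_db=12.0, hold_frames=50, bump_db=3.0):
        self.noise = noise_init                     # estimated noise components level (power)
        self.limit = 10 ** (limit_db / 10.0)        # how far the input may exceed the estimate
        self.bump = 10 ** (bump_db / 10.0)          # predetermined increase of the estimate
        self.hold_frames = hold_frames              # the "defined time", in frames
        self.count = 0                              # frames for which the level exceeded the limit

    def update(self, level):
        """Update the noise estimate from one frame's subband power and return a gain."""
        if level > self.limit * self.noise:
            self.count += 1
            if self.count > self.hold_frames:       # exceeded the limit for more than the defined time:
                self.noise *= self.bump             # raise the estimated noise components level
                self.count = 0
        else:
            self.count = 0
            # Slow tracking of the noise floor during ordinary frames.
            self.noise = 0.98 * self.noise + 0.02 * min(level, self.noise)
        snr = level / self.noise
        return snr / (snr + 1.0)                    # gain is reduced as estimated noise dominates

if __name__ == "__main__":
    tracker = SubbandNoiseTracker()
    gain = 1.0
    for level in [2e-4] * 40 + [5e-2] * 120:        # quiet frames, then a sustained louder level
        gain = tracker.update(level)
    print(round(tracker.noise, 6), round(gain, 3))
```
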