Pattern Display Patents (Class 704/276)
  • Publication number: 20120323581
    Abstract: Systems and methods are disclosed for performing voice personalization of video content. The personalized media content may include a composition of a background scene having a character, head model data representing an individualized three-dimensional (3D) head model of a user, audio data simulating the user's voice, and a viseme track containing instructions for causing the individualized 3D head model to lip sync the words contained in the audio data. The audio data simulating the user's voice can be generated using a voice transformation process. In certain examples, the audio data is based on a text input entered or selected by the user (e.g., via a telephone or computer) or on a textual dialogue of a background character.
    Type: Application
    Filed: August 30, 2012
    Publication date: December 20, 2012
    Applicant: IMAGE METRICS, INC.
    Inventors: Jonathan Isaac Strietzel, Jon Hayes Snoddy, Douglas Alexander Fidaleo
  • Patent number: 8335689
    Abstract: A method and system for improving the efficiency of speech transcription by automating the management of a varying pool of human and machine transcribers having diverse qualifications, skills, and reliability for a fluctuating load of speech transcription tasks of diverse requirements such as accuracy, promptness, privacy, and security, from sources of diverse characteristics such as language, dialect, accent, speech style, voice type, vocabulary, audio quality, and duration.
    Type: Grant
    Filed: October 14, 2009
    Date of Patent: December 18, 2012
    Assignee: COGI, Inc.
    Inventors: Andreas Wittenstein, David Brahm, Mark Cromack, Robert Dolan
  • Patent number: 8326626
    Abstract: Apparatus and methods are provided for using automatic speech recognition to analyze a voice interaction and verify compliance of an agent reading a script to a client during the voice interaction. In one aspect of the invention, a communications system includes a user interface, a communications network, and a call center having an automatic speech recognition component. In other aspects of the invention, a script compliance method includes the steps of conducting a voice interaction between an agent and a client and evaluating the voice interaction with an automatic speech recognition component adapted to analyze the voice interaction and determine whether the agent has adequately followed the script.
    Type: Grant
    Filed: December 22, 2011
    Date of Patent: December 4, 2012
    Assignee: West Corporation
    Inventors: Mark J. Pettay, Fonda J. Narke
  • Patent number: 8315366
    Abstract: A system and method for determining a speaker's position and generating a display showing the position of the speaker. In one embodiment, the system comprises a first speakerphone system and a second speakerphone system communicatively coupled to send and receive data. The speakerphone system comprises a display, an input device, a microphone array, a speaker, and a position processing module. The position processing module is coupled to receive acoustic signals from the microphone array and uses these acoustic signals to determine the position of the speaker. The position information is then sent to the other speakerphone system for presentation on its display. In one embodiment, the position processing module comprises an auto-detection module, a position analysis module, a tracking module, and an identity matching module for the detection of sound, the determination of position, and the transmission of position information over the network.
    Type: Grant
    Filed: July 22, 2008
    Date of Patent: November 20, 2012
    Assignee: ShoreTel, Inc.
    Inventors: Edwin J. Basart, David B. Rucinski
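The abstract above does not disclose the position-determination algorithm itself; a common general technique for deriving a talker's bearing from a pair of microphones is time-difference-of-arrival (TDOA) estimation via cross-correlation. The sketch below illustrates that generic technique, not the patented method, and every name in it is hypothetical:

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, approximate value at room temperature


def estimate_bearing(sig_a, sig_b, mic_spacing, sample_rate):
    """Estimate the bearing (radians) of a sound source relative to the
    broadside of a two-microphone array, using cross-correlation TDOA.

    A positive result means channel A lags channel B, i.e. the source
    is closer to microphone B.
    """
    # Cross-correlate the two channels; the lag of the correlation peak
    # is the arrival-time difference between the microphones in samples.
    corr = np.correlate(sig_a, sig_b, mode="full")
    lag = np.argmax(corr) - (len(sig_b) - 1)
    tdoa = lag / sample_rate
    # Far-field assumption: path difference = mic_spacing * sin(angle).
    sin_angle = np.clip(tdoa * SPEED_OF_SOUND / mic_spacing, -1.0, 1.0)
    return np.arcsin(sin_angle)
```

A far-field source and a known microphone spacing are assumed; a real speakerphone array would combine several pairwise estimates and smooth them over time, roles the abstract assigns to the position analysis and tracking modules.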
  • Patent number: 8306824
    Abstract: An apparatus and method of creating a face character that corresponds to a voice of a user are provided. To create various facial expressions with fewer key models, the face character is divided into a plurality of areas, and a voice sample is parameterized according to pronunciation and emotion. When the user's voice is input, a face character image for each divided face area is synthesized using the key models and the parameter data corresponding to the voice sample, and these per-area images are then combined into an overall face character image.
    Type: Grant
    Filed: August 26, 2009
    Date of Patent: November 6, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Bong-cheol Park
  • Patent number: 8255224
    Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving geographical information derived from a non-verbal user action associated with a first computing device. The non-verbal user action implies an interest of a user in a geographic location. The method also includes identifying a grammar associated with the geographic location using the derived geographical information and outputting a grammar indicator for use in selecting the identified grammar for voice recognition processing of vocal input from the user.
    Type: Grant
    Filed: March 7, 2008
    Date of Patent: August 28, 2012
    Assignee: Google Inc.
    Inventors: David Singleton, Debajit Ghosh
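As an illustration of the general idea (not the patented implementation), geographic information derived from a non-verbal action can be reduced to a grammar indicator by a coarse spatial lookup; the region names, bounding boxes, and word lists below are all hypothetical:

```python
# Hypothetical stand-in for region-specific recognition grammars: each
# entry biases the recognizer toward place names for one geographic area.
REGION_GRAMMARS = {
    "london": ["Camden", "Soho", "Heathrow"],
    "new_york": ["SoHo", "Tribeca", "JFK"],
}


def grammar_for_location(lat, lon):
    """Return a grammar indicator for the region covering (lat, lon).

    Toy bounding boxes stand in for a real geographic index.
    """
    if 51.3 <= lat <= 51.7 and -0.5 <= lon <= 0.3:
        return "london"
    if 40.5 <= lat <= 41.0 and -74.3 <= lon <= -73.7:
        return "new_york"
    return None
```

The returned indicator would select which word list the recognizer loads before processing the user's vocal input, mirroring the grammar-indicator output described in the abstract.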
  • Patent number: 8249873
    Abstract: Tonal correction of speech is provided. Received speech is analyzed and compared against a table of commonly mispronounced phrases, which are mapped to the phrase likely intended by the speaker. The phrase determined to be the one the user likely intended can be suggested to the user. If the user approves of the suggestion, tonal correction can be applied to the speech before that speech is delivered to a recipient.
    Type: Grant
    Filed: August 12, 2005
    Date of Patent: August 21, 2012
    Assignee: Avaya Inc.
    Inventors: Colin Blair, Kevin Chan, Christopher R. Gentle, Neil Hepworth, Andrew W. Lang, Paul R. Michaelis
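The lookup-and-confirm flow the abstract describes can be sketched as a table lookup gated by a user confirmation; the table entries and function names below are hypothetical illustrations, not the patented data or API:

```python
# Hypothetical table mapping commonly mispronounced phrases to the
# phrase the speaker likely intended (in a tonal language, the same
# syllable with a wrong tone can mean something entirely different).
CORRECTIONS = {
    "ma_tone3": "ma_tone1",
    "shi_tone2": "shi_tone4",
}


def suggest_correction(phrase, confirm):
    """Look up a likely intended phrase and apply it only if confirmed.

    `confirm` is a callable taking the suggestion and returning True or
    False, standing in for the user prompt described in the abstract.
    """
    suggestion = CORRECTIONS.get(phrase)
    if suggestion is not None and confirm(suggestion):
        return suggestion  # deliver the tonally corrected phrase
    return phrase  # deliver the speech unchanged
```

In the patented system the correction would be applied to the audio itself before delivery; here the string stands in for that audio.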
  • Patent number: 8250484
    Abstract: A computer and a method for generating commands include loading a data exchange format (DXF) image, selecting a measurement tool, and selecting a DXF feature of the DXF image. The method further includes generating an edge detection command for the selected DXF feature according to the measurement tool when the size of the selected DXF feature is not larger than the size of an image area, and generating an edge detection command corresponding to each of the reselected measurement tools when the size of the selected DXF feature is larger than the size of the image area.
    Type: Grant
    Filed: August 10, 2010
    Date of Patent: August 21, 2012
    Assignees: Hong Fu Jin Precision Industry (ShenZhen) Co., Ltd., Hon Hai Precision Industry Co., Ltd.
    Inventors: Chih-Kuang Chang, Zhong-Kui Yuan, Yi-Rong Hong, Xian-Yi Chen, Dong-Hai Li
  • Patent number: 8249870
    Abstract: A semi-automatic speech transcription system of the invention leverages the complementary capabilities of human and machine, building a system which combines automatic and manual approaches. With the invention, collected audio data is automatically distilled into speech segments, using signal processing and pattern recognition algorithms. The detected speech segments are presented to a human transcriber using a transcription tool with a streamlined transcription interface, requiring the transcriber to simply “listen and type”. This eliminates the need to manually navigate the audio, coupling the human effort to the amount of speech, rather than the amount of audio. Errors produced by the automatic system can be quickly identified by the human transcriber, and these corrections are used to improve the automatic system's performance. The automatic system is tuned to maximize the human transcriber's efficiency.
    Type: Grant
    Filed: November 12, 2008
    Date of Patent: August 21, 2012
    Assignee: Massachusetts Institute of Technology
    Inventors: Brandon Cain Roy, Deb Kumar Roy
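The abstract's first stage, distilling audio into speech segments, is commonly done with frame-energy voice activity detection. The sketch below shows that generic technique under stated assumptions (fixed frame size, single energy threshold); it is an illustration, not the patented signal-processing pipeline:

```python
import numpy as np


def detect_speech_segments(audio, sample_rate, frame_ms=30, threshold=0.01):
    """Split audio into (start_sec, end_sec) segments whose frame energy
    exceeds a threshold -- a minimal stand-in for the stage that distills
    raw audio into chunks for the human transcriber."""
    frame_len = int(sample_rate * frame_ms / 1000)
    n_frames = len(audio) // frame_len
    segments, start = [], None
    for i in range(n_frames):
        frame = audio[i * frame_len:(i + 1) * frame_len]
        energetic = np.mean(frame ** 2) > threshold
        if energetic and start is None:
            start = i  # a speech run begins at this frame
        elif not energetic and start is not None:
            segments.append((start * frame_ms / 1000, i * frame_ms / 1000))
            start = None
    if start is not None:  # speech ran to the end of the recording
        segments.append((start * frame_ms / 1000, n_frames * frame_ms / 1000))
    return segments
```

Each returned segment would be queued for the "listen and type" interface, so transcriber effort scales with detected speech rather than total audio duration.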
  • Patent number: 8229751
    Abstract: A system and method of detecting unidentified broadcast electronic media content using a self-similarity technique is presented. The process and system catalogue repeated instances of content that has not been positively identified but is sufficiently similar as to infer repetitive broadcasts. These catalogued instances may be further processed on the basis of different broadcast channels, sources, geographic locations of broadcasts, or formats to further assist their identification.
    Type: Grant
    Filed: December 30, 2005
    Date of Patent: July 24, 2012
    Assignee: Mediaguide, Inc.
    Inventor: Kwan Cheung
  • Patent number: 8229748
    Abstract: Methods and apparatus to present a video program to a visually impaired person are disclosed. An example method comprises receiving a video stream and an associated audio stream of a video program, detecting a portion of the video program that is not readily consumable by a visually impaired person, obtaining text associated with the portion of the video program, converting the text to a second audio stream, and combining the second audio stream with the associated audio stream.
    Type: Grant
    Filed: April 14, 2008
    Date of Patent: July 24, 2012
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Hisao M. Chang, Horst Schroeter
  • Patent number: 8224656
    Abstract: A method, program storage device and mobile device provide speech disambiguation. Audio for speech recognition processing is transmitted by the mobile device. Results representing alternates identified to match the transmitted audio are received. The alternates are displayed in a disambiguation dialog screen for making corrections to the alternates. Corrections are made to the alternates using the disambiguation dialog screen until a correct result is displayed. The correct result is selected. Content associated with the selected correct result is received in parallel with the receiving of the results representing alternates identified to match the transmitted audio.
    Type: Grant
    Filed: March 14, 2008
    Date of Patent: July 17, 2012
    Assignee: Microsoft Corporation
    Inventors: Oliver Scholz, Robert L. Chambers, Julian James Odell
  • Patent number: 8219386
    Abstract: The Arabic poetry meter identification system and method produces coded Al-Khalyli transcriptions of Arabic poetry. The meters (Wazn, plural Awzan, being the forms of the Arabic poem units Bayt, plural Abyat) are identified. A spoken or written poem is accepted as input, and a coded transcription of the poetry pattern forms is produced from input processing. The system identifies and distinguishes between proper and improper spoken poetic meter. Errors in the poem meters (Bahr, plural Buhur) and the ending rhyme pattern (Qafiya) are detected and verified. The system accepts user selection of a desired poem meter and then interactively aids the user in the composition of poetry in the selected meter, suggesting alternative words and word groups that follow the desired poem pattern and dactyl components. The system can be in a stand-alone device or integrated with other computing devices.
    Type: Grant
    Filed: January 21, 2009
    Date of Patent: July 10, 2012
    Assignee: King Fahd University of Petroleum and Minerals
    Inventors: Al-Zahrani Abdul Kareem Saleh, Moustafa Elshafei
  • Patent number: 8214209
    Abstract: Disclosed is a speech recognition system which includes speech input means for receiving speech data, speech recognition means for receiving the input speech data from the speech input means and performing speech recognition, recognition result evaluation means for determining a priority of at least one of a recognition result and each portion forming the recognition result obtained by the speech recognition means, storage means for storing the recognition result and the priority, recognition result formatting means for determining display/non-display of the recognition result and/or each portion forming the recognition result and generating output information according to the priority, and output means for outputting the output information.
    Type: Grant
    Filed: October 11, 2006
    Date of Patent: July 3, 2012
    Assignee: NEC Corporation
    Inventor: Kentarou Nagatomo
  • Patent number: 8208608
    Abstract: A multi-modal system providing for a single point of contact that can allow users to manage their personal contact information and contact lists, and connect to other users and businesses in a personalized, efficient, location-sensitive and organized manner. By accessing the system using any type of telephony-based device, a user can manage all of their personal and business contacts as well as perform generalized searches in public databases, such as white page and/or yellow page listings, or more personalized searches through databases of their business or personal contacts. A user may also, during a generalized search, go to a personalized search, and vice-versa. The system may also provide users with the opportunity to select certain businesses from their contact lists and allow these businesses to provide them with personalized data, either on demand or based on user-controlled permissions or areas of interest through various technologies including presence technologies.
    Type: Grant
    Filed: February 4, 2009
    Date of Patent: June 26, 2012
    Assignee: Call Genie Inc.
    Inventors: Todd Garrett Simpson, Christopher Edward Lugg
  • Patent number: 8209179
    Abstract: This invention realizes a speech communication system and method, and a robot apparatus, capable of significantly improving the entertainment property. A speech communication system with a function to converse with a conversation partner is provided with speech recognition means for recognizing the speech of the conversation partner, conversation control means for controlling the conversation with the partner based on the recognition result of the speech recognition means, image recognition means for recognizing the face of the partner, and tracking control means for tracking the existence of the partner based on one or both of the recognition results of the image recognition means and the speech recognition means. The conversation control means controls the conversation so that it continues depending on the tracking by the tracking control means.
    Type: Grant
    Filed: July 2, 2004
    Date of Patent: June 26, 2012
    Assignee: Sony Corporation
    Inventors: Kazumi Aoyama, Hideki Shimomura
  • Patent number: 8201139
    Abstract: A framework for generating a semantic interpretation of natural language input includes an interpreter, a first set of types, and a second set of types. The interpreter is adapted to mediate between a client application and one or more analysis engines to produce interpretations of the natural language input that are valid for the client application. The first set of types is adapted to define interactions between the interpreter and the one or more analysis engines. The second set of types is adapted to define interactions between the interpreter and the client application.
    Type: Grant
    Filed: September 15, 2004
    Date of Patent: June 12, 2012
    Assignee: Microsoft Corporation
    Inventors: Su Chin Chang, Ravi C. Shahani, Domenic J. Cipollone, Michael V. Calcagno, Mari J. B. Olsen, David J. Parkinson
  • Publication number: 20120130720
    Abstract: An information providing device takes an image of a predetermined area and obtains the taken image in the form of image data, while externally obtaining voice data representing speech. The information providing device obtains text in a preset language corresponding to the speech in the form of text data, based on the obtained voice data, generates a composite image including the taken image and the text in the form of composite image data, based on the image data and the text data, and outputs the composite image data.
    Type: Application
    Filed: November 14, 2011
    Publication date: May 24, 2012
    Applicant: ELMO COMPANY LIMITED
    Inventor: Yasushi Suda
  • Patent number: 8155964
    Abstract: This invention includes: a voice quality feature database (101) holding voice quality features; a speaker attribute database (106) holding, for each voice quality feature, an identifier enabling a user to expect a voice quality of the voice quality feature; a weight setting unit (103) setting a weight for each acoustic feature of a voice quality; a scaling unit (105) calculating display coordinates of each voice quality feature based on the acoustic features in the voice quality feature and the weights set by the weight setting unit (103); a display unit (107) displaying the identifier of each voice quality feature on the calculated display coordinates; a position input unit (108) receiving designated coordinates; and a voice quality mix unit (110) (i) calculating a distance between (1) the received designated coordinates and (2) the display coordinates of each of a part or all of the voice quality features, and (ii) mixing the acoustic features of the part or all of the voice quality features together based
    Type: Grant
    Filed: June 4, 2008
    Date of Patent: April 10, 2012
    Assignee: Panasonic Corporation
    Inventors: Yoshifumi Hirose, Takahiro Kamai
  • Patent number: 8115772
    Abstract: In an embodiment, a method is provided for creating a personal animated entity for delivering a multi-media message from a sender to a recipient. An image file from the sender may be received by a server. The image file may include an image of an entity. The sender may be requested to provide input with respect to facial features of the image of the entity in preparation for animating the image of the entity. After the sender provides the input with respect to the facial features of the image of the entity, the image of the entity may be presented as a personal animated entity to the sender to preview. Upon approval of the preview from the sender, the image of the entity may be presented as a sender-selectable personal animated entity for delivering the multi-media message to the recipient.
    Type: Grant
    Filed: April 8, 2011
    Date of Patent: February 14, 2012
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Joern Ostermann, Mehmet Reha Civanlar, Ana Cristina Andres del Valle, Patrick Haffner
  • Patent number: 8117034
    Abstract: A speech recognition device (1) processes speech data (SD) of a dictation and thus establishes recognized text information (ETI) and link information (LI) of the dictation. In a synchronous playback mode of the speech recognition device (1), during the acoustic playback of the dictation a correction device (10) synchronously marks the word of the recognized text information (ETI) that, according to the link information (LI), relates to the speech data (SD) just played back; the currently marked word indicates the position of an audio cursor (AC). When a user of the speech recognition device (1) recognizes an incorrect word, he positions a text cursor (TC) at the incorrect word and corrects it. Cursor synchronization means (15) now make it possible to synchronize the text cursor (TC) with the audio cursor (AC), or the audio cursor (AC) with the text cursor (TC), so that the positioning of the respective cursor (AC, TC) is simplified considerably.
    Type: Grant
    Filed: March 26, 2002
    Date of Patent: February 14, 2012
    Assignee: Nuance Communications Austria GmbH
    Inventor: Wolfgang Gschwendtner
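The link information the abstract describes can be thought of as a per-word time alignment, which makes cursor synchronization a lookup in either direction. The sketch below is a hypothetical illustration of that data shape, not the patented format; all names are invented:

```python
# Hypothetical "link information": each recognized word paired with the
# (start_sec, end_sec) span of the speech data it was decoded from.
links = [
    ("please", 0.0, 0.4),
    ("send", 0.4, 0.7),
    ("the", 0.7, 0.8),
    ("report", 0.8, 1.3),
]


def word_at_time(links, t):
    """Audio cursor -> text cursor: index of the word playing at time t."""
    for i, (_, start, end) in enumerate(links):
        if start <= t < end:
            return i
    return None  # t falls outside the dictation


def time_of_word(links, i):
    """Text cursor -> audio cursor: playback position where word i begins."""
    return links[i][1]
```

With such an alignment, placing the text cursor on a word immediately yields the playback position to seek to, and vice versa, which is the simplification the abstract claims.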
  • Publication number: 20120029907
    Abstract: A digital pen designed to assist users in spelling words as they write. The invention is an electronic pen with a speaker located near the top of the device; a microphone may be located directly under the speaker in the form of a small screened concave or convex aperture. A switch on the back of the pen allows the user to choose between three settings: Medical Dictionary (D), Off (O), and Prescription Drug List (P). The device works by the user speaking the desired word into the microphone. The word then appears on the illuminated digital display screen, and the pen asks the user to confirm or deny the displayed word by saying “yes” or “no” into the microphone. If denied, the pen displays another word until the correct word is located. Once confirmed, the pen audibly and visibly spells the word one letter at a time as the user writes. The pen may be switched to the prescription drug list mode as needed.
    Type: Application
    Filed: December 30, 2010
    Publication date: February 2, 2012
    Inventors: Angela Loggins, Tamara S. Loggins
  • Patent number: 8108213
    Abstract: Apparatus and methods are provided for using automatic speech recognition to analyze a voice interaction and verify compliance of an agent reading a script to a client during the voice interaction. In one aspect of the invention, a communications system includes a user interface, a communications network, and a call center having an automatic speech recognition component. In other aspects of the invention, a script compliance method includes the steps of conducting a voice interaction between an agent and a client and evaluating the voice interaction with an automatic speech recognition component adapted to analyze the voice interaction and determine whether the agent has adequately followed the script. In still further aspects of the invention, the duration of a given interaction can be analyzed, either apart from or in combination with the script compliance analysis above, to seek to identify instances of agent non-compliance, of fraud, or of quality-analysis issues.
    Type: Grant
    Filed: January 13, 2010
    Date of Patent: January 31, 2012
    Assignee: West Corporation
    Inventors: Mark J Pettay, Fonda J Narke
  • Patent number: 8099286
    Abstract: Situational awareness enhancement is promoted in a radio communication transceiver receiving non-verbal content and verbal content from a remote source. A vocoder communicatively coupled with the transceiver receives the verbal content packets, and a situational awareness encoder/decoder communicatively coupled with the transceiver receives the non-verbal content packets. The encoder/decoder links and synchronizes the verbal content packets with the corresponding non-verbal content packets, and provides the verbal content packets to the vocoder in parallel with the provision of the non-verbal content packets to the encoder/decoder. The non-verbal and verbal content is extracted from the respective packets and are synchronously output for display and playback, respectively.
    Type: Grant
    Filed: May 12, 2008
    Date of Patent: January 17, 2012
    Assignee: Rockwell Collins, Inc.
    Inventors: John Thommana, Lizy Paul
  • Patent number: 8078467
    Abstract: This invention provides a device and method for language model switching and adaptation. The device comprises a notification manager which notifies a language model switching section of the current status information, or the request for a language model, of a destination application when the status of the destination application changes; a language model switching section which selects one or more language models to be switched to from a language model set according to the received current status information or request; an LMB engine which decodes a user input using the one or more selected language models; and a language model adaptation section which receives the decoded result and modifies the one or more selected language models based on it. Therefore, decoding of the user input is more accurate even as the language model switching section performs different switches among different language models, and the performance of the language models is improved by the language model adaptation section.
    Type: Grant
    Filed: March 8, 2007
    Date of Patent: December 13, 2011
    Assignee: NEC (China) Co., Ltd.
    Inventors: Genqing Wu, Liqin Xu
  • Patent number: 8073701
    Abstract: The present disclosure relates to identity verification devices and methods. A system is provided that utilizes a system of tonal and rhythmic visualization of a spoken word to accurately identify the true owner of a credit or other personal card based on their voice.
    Type: Grant
    Filed: April 21, 2008
    Date of Patent: December 6, 2011
    Assignee: Master Key, LLC
    Inventor: Kenneth R. Lemons
  • Patent number: 8068107
    Abstract: In a multimedia presentation having speech and graphic contributions, a list of graphic objects is provided. Each graphic object is associated with a graphic file capable of being executed by a computer to display a corresponding graphic contribution on a screen. A speech file comprising a sequence of phrases is also created, each phrase comprising a speech contribution explaining at least one graphic contribution associated with a respective graphic object. Then, an arrangement string is created, obtained as a sequence of a first graphic object and a respective first phrase, then a second graphic object and a respective second phrase, and so on up to completion of all graphic objects and phrases of said list and of said speech file, respectively. A processing speed for displaying the graphic objects is chosen.
    Type: Grant
    Filed: November 22, 2004
    Date of Patent: November 29, 2011
    Inventor: Mario Pirchio
  • Patent number: 8065155
    Abstract: An adaptive advertising apparatus and associated methods. In one embodiment, the apparatus comprises a computer readable medium having at least one computer program disposed thereon, the at least one program being configured to adaptively present (e.g., display) advertising-related content (e.g., audio, video, images, etc.) that is contextually related to the topic of a conversation between a plurality of parties or individuals. In one variant, the at least one program comprises a speech recognition program that analyzes digitized speech, and identifies one or more words therein in order to determine the topic of conversation or context. Contextually related (or targeted) advertising is then selected based on its relationship to the determined context or topic.
    Type: Grant
    Filed: February 10, 2010
    Date of Patent: November 22, 2011
    Inventor: Robert F. Gazdzinski
  • Patent number: 8032382
    Abstract: An apparatus and method for speech information processing includes detecting a first operation of a speech processing start instruction element, controlling a display so that speech recognition information is displayed in response to the detection of the first operation, detecting a second operation of the speech processing start instruction element, acquiring speech information in response to detection of the second operation, and performing speech recognition processing on the speech information.
    Type: Grant
    Filed: December 21, 2006
    Date of Patent: October 4, 2011
    Assignee: Canon Kabushiki Kaisha
    Inventors: Hiroki Yamamoto, Tsuyoshi Yagisawa
  • Publication number: 20110231194
    Abstract: In an embodiment, a method of interactive speech preparation is disclosed. The method may include or comprise displaying an interactive speech application on a display device, wherein the interactive speech application has a text display window. The method may also include or comprise accessing text stored in an external storage device over a communication network, and displaying the text within the text display window while capturing video and audio data with video and audio data capturing devices, respectively.
    Type: Application
    Filed: December 16, 2010
    Publication date: September 22, 2011
    Inventor: Steven Lewis
  • Patent number: 8010360
    Abstract: A system and method is provided for reducing latency for automatic speech recognition. In one embodiment, intermediate results produced by multiple search passes are used to update a display of transcribed text.
    Type: Grant
    Filed: December 15, 2009
    Date of Patent: August 30, 2011
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Michiel Adriaan Unico Bacchiani, Brian Scott Amento
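The multi-pass idea in the abstract, showing intermediate hypotheses so perceived latency tracks the fastest pass, can be sketched generically; the function names and pass structure below are hypothetical, not the patented system:

```python
# Hypothetical sketch: a fast first decoding pass puts a rough transcript
# on screen immediately; slower, more accurate passes (e.g., rescoring)
# overwrite it as they complete.
def transcribe_with_updates(audio, passes, display):
    """Run recognition passes in order of increasing accuracy and cost,
    pushing each intermediate hypothesis to the display callback."""
    hypothesis = None
    for run_pass in passes:
        hypothesis = run_pass(audio)  # next, more accurate decode
        display(hypothesis)  # user sees text before the final pass ends
    return hypothesis
```

A usage example with stub passes standing in for real decoders:

```python
shown = []
passes = [lambda a: "recognize speech", lambda a: "recognize speech today"]
final = transcribe_with_updates(b"", passes, shown.append)
# `shown` now holds every intermediate hypothesis in display order.
```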
  • Patent number: 8004529
    Abstract: A method for processing an animation file to provide an animated icon to an instant messaging environment is presented. An animation file is reformatted to generate the animated icon to satisfy a pre-defined size requirement of the instant messaging environment. The animated icon is stored for distribution to the instant messaging environment.
    Type: Grant
    Filed: October 1, 2007
    Date of Patent: August 23, 2011
    Assignee: Apple Inc.
    Inventors: Justin Wood, Thomas Goossens
  • Patent number: 7970616
    Abstract: A server may provide information to a processing device for displaying a parser user interface. The displayed parser user interface may include an input portal for inputting text input. The parser user interface may further include controls for selecting a level of compression. Upon selection of one of the controls, the server may process the text input and may produce text output which may include a placeholder symbol to replace specific words from the text input and/or abbreviated representations to replace other specific words from the text input. The server may send information to the processing device to display the produced text output, as well as other information. The server may further provide information to the processing device for displaying a speed reader user interface. The speed reader user interface may include controls for starting, stopping, and pausing a speed reading operation as well as other controls.
    Type: Grant
    Filed: July 23, 2007
    Date of Patent: June 28, 2011
    Inventor: Ronald M. Dapkunas
  • Publication number: 20110115988
    Abstract: A method for remotely outputting audio in a display apparatus includes outputting Audio/Visual (AV) data; if a command to remotely output audio is input while the AV data is being output, stopping output of the audio of the AV data while continuing to output the video of the AV data; and transmitting, to an external apparatus, at least one of compressed audio data separated from the AV data and information regarding the time when transmission of the compressed audio data starts.
    Type: Application
    Filed: November 15, 2010
    Publication date: May 19, 2011
    Inventors: Woo-yong CHANG, Seung-dong Yu, Se-jun Park, Min-jeong Moon
  • Publication number: 20110093274
    Abstract: Disclosed are an apparatus and method of manufacturing an article using sound, which modify the sound waveforms of living things (including the human voice) into various shapes and manufacture articles corresponding to those shapes. An apparatus for manufacturing an article using sound generates a sampling waveform based on the sound waveform. Next, the sampling waveform is converted into a two-dimensional image file, and the two-dimensional image is in turn converted into a three-dimensional image file. Thereafter, an article is manufactured based on the two-dimensional or three-dimensional image file. According to the invention, the apparatus and method manufacture an article based on the sampling waveform generated by sampling the sound waveform, thereby producing a simplified article.
    Type: Application
    Filed: May 16, 2008
    Publication date: April 21, 2011
    Inventor: Kwanyoung Lee
  • Publication number: 20110087493
    Abstract: The invention relates to a communication system having a display unit (2) and a virtual being (3) that can be visually represented on the display unit (2) and that is designed for communication by means of natural speech with a natural person, wherein at least one interaction symbol (6, 7) that can be represented on the display unit (2) supports the natural speech dialog between the virtual being (3) and the natural person, such that an achieved dialog state can be indicated and/or additional information depending on the achieved dialog state can be redundantly invoked. The invention further relates to a method for representing information of a communication between a virtual being and a natural person.
    Type: Application
    Filed: May 15, 2009
    Publication date: April 14, 2011
    Inventors: Stefan Sellschopp, Valentin Nicolescu, Helmut Krcmar
  • Publication number: 20110043832
    Abstract: A printed audio format includes a printed encoding of an audio signal, and a plurality of spaced-apart and parallel rails. The printed encoding of the audio signal is located between the plurality of rails and each rail comprises at least one marker. The printed encoding comprises a first portion and a second portion, each portion comprises a plurality of code frames, and each frame represents a time segment of an audio signal. The first portion encodes a first time period of the audio signal and the second portion encodes a second time period of the audio signal. The second portion is encoded in reverse order with respect to the first portion so that the joining part is on the same end of both portions.
    Type: Application
    Filed: October 29, 2010
    Publication date: February 24, 2011
    Applicant: Creative Technology Ltd
    Inventors: Wong Hoo Sim, Desmond Toh Onn Hii, Tur We Chan, Chin Fang Lim, Willie Png, Morgun Phay
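    The key layout trick in the printed audio format above is that the second portion is encoded in reverse, so the join between the two portions sits at the same physical end of both. A small sketch of that encode/decode round trip, with frames represented as plain list elements (an assumption; the patent's actual frame encoding is not specified here):

    ```python
    def encode_portions(frames):
        """Split audio code frames into two portions; the second portion is
        stored reversed so its start coincides with the join point, putting
        the joining part on the same end of both printed portions."""
        mid = len(frames) // 2
        first = frames[:mid]
        second = frames[mid:][::-1]  # reversed with respect to the first portion
        return first, second

    def decode_portions(first, second):
        """Recover the original frame order from the two printed portions."""
        return first + second[::-1]
    ```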
  • Patent number: 7860717
    Abstract: A system and method may be disclosed for facilitating the site-specific customization of automated speech recognition systems by providing a customization client for site-specific individuals to update and modify language model input files and post processor input files. In customizing the input files, the customization client may provide a graphical user interface for facilitating the inclusion of words specific to a particular site. The customization client may also be configured to provide the user with a series of formatting rules for controlling the appearance and format of a document transcribed by an automated speech recognition system.
    Type: Grant
    Filed: September 27, 2004
    Date of Patent: December 28, 2010
    Assignee: Dictaphone Corporation
    Inventors: Amy J. Urhbach, Alan Frankel, Jill Carrier, Ana Santisteban, William F. Cote
  • Patent number: 7860718
    Abstract: Provided are an apparatus and method for speech segment detection, and a system for speech recognition. The apparatus is equipped with a sound receiver and an image receiver and includes: a lip motion signal detector for detecting a motion region from image frames output from the image receiver, applying lip motion image feature information to the detected motion region, and detecting a lip motion signal; and a speech segment detector for detecting a speech segment using sound frames output from the sound receiver and the lip motion signal detected from the lip motion signal detector. Since lip motion image information is checked in a speech segment detection process, it is possible to prevent dynamic noise from being misrecognized as speech.
    Type: Grant
    Filed: December 4, 2006
    Date of Patent: December 28, 2010
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Soo Jong Lee, Sang Hun Kim, Young Jik Lee, Eung Kyeu Kim
  • Publication number: 20100250256
    Abstract: A section 52 corresponding to a given duration is sampled from sound data 50 that indicates the voice of a player collected by a microphone, and a vocal tract cross-sectional area function 54 of the sampled section is calculated. The vertical dimension ly of the mouth is calculated from a throat-side average cross-sectional area d1 of the vocal tract cross-sectional area function 54, and the area dm of the mouth is calculated from a mouth-side average cross-sectional area d2. The transverse dimension of the mouth is calculated from the area dm and the vertical dimension ly of the mouth.
    Type: Application
    Filed: March 26, 2010
    Publication date: September 30, 2010
    Applicant: NAMCO BANDAI GAMES INC.
    Inventor: Hiroyuki HIRAISHI
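    The abstract above derives the mouth's vertical dimension from the throat-side average area, takes the mouth-side average area as the mouth area, and solves for the transverse dimension from those two. The patent does not give the exact formulas, so the proportional mapping (constant `k`) and the elliptical mouth model (area = pi/4 x width x height) below are assumptions chosen only to make the dependency chain concrete:

    ```python
    import math

    def mouth_dimensions(d1, d2, k=1.0):
        """Hypothetical sketch: mouth height scales with the throat-side
        average cross-sectional area d1, the mouth-side average area d2 is
        taken as the mouth area, and an ellipse model is solved for the
        transverse dimension."""
        height = k * d1                           # vertical dimension of the mouth
        area = d2                                 # area of the mouth opening
        width = 4.0 * area / (math.pi * height)   # transverse dimension
        return height, width
    ```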
  • Patent number: 7797146
    Abstract: A method of simulating interactive communication between a user and a human subject. The method comprises: assigning at least one phrase to a stored content sequence, wherein the content sequence comprises a content clip of the subject; parsing the at least one phrase to produce at least one phonetic clone; associating the at least one phonetic clone with the stored content sequence; receiving an utterance from the user; matching the utterance to the at least one phonetic clone; and displaying the stored content sequence associated with the at least one phonetic clone.
    Type: Grant
    Filed: May 13, 2003
    Date of Patent: September 14, 2010
    Assignee: Interactive Drama, Inc.
    Inventors: William G. Harless, Michael G. Harless, Marcia A. Zier
  • Patent number: 7788104
    Abstract: The present invention provides an information processing terminal that can use an alternative means of expression for undesirable emotions that would otherwise be transmitted directly to the other party when a talking person's emotions are expressed in real time, so that the whole picture of a call's status can be reviewed and grasped afterward. An information processing terminal 1 includes: a voice signal output portion 102 for inputting a voice; an emotion estimation portion 201 for generating parameters of emotions from the inputted voice; and a notification portion 30, 40, 50 for giving notice of various kinds of information, wherein the information processing terminal 1 further includes an emotion specifying portion 203 for specifying an emotion expressed by a distinctive parameter among the generated parameters, and the notification portion 30, 40, 50 gives notice of the specified emotion.
    Type: Grant
    Filed: September 9, 2005
    Date of Patent: August 31, 2010
    Assignee: Panasonic Corporation
    Inventors: Hideaki Matsuo, Takaaki Nishi, Tomoko Obama, Yasuki Yamakawa, Tetsurou Sugimoto
  • Publication number: 20100211397
    Abstract: An avatar facial expression representation technology is provided. The avatar facial expression representation technology estimates changes in emotion and emphasis in a user's voice from vocal information, and changes in mouth shape of the user from pronunciation information of the voice. The avatar facial expression technology tracks a user's facial movements and changes in facial expression from image information and may represent avatar facial expressions based on the results of these operations. Accordingly, avatar facial expressions can be obtained which are similar to the actual facial expressions of the user.
    Type: Application
    Filed: January 28, 2010
    Publication date: August 19, 2010
    Inventors: Chi-youn PARK, Young-Kyoo HWANG, Jung-bae KIM
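    The avatar technology above combines two signal paths: pronunciation information selects a mouth shape, while vocal emotion drives how strongly an expression is applied. A toy per-frame sketch of that combination (the phoneme-to-viseme table and the clamped intensity weight are invented for illustration, not taken from the publication):

    ```python
    # Toy phoneme-to-viseme mapping; a real system would cover a full phoneme set.
    VISEMES = {"a": "open", "m": "closed", "o": "round"}

    def avatar_frame(phoneme, emotion_intensity):
        """Return a (mouth_shape, expression_weight) pair for one animation
        frame: the phoneme picks the viseme, and the estimated emotion
        intensity (clamped to [0, 1]) weights the facial expression."""
        shape = VISEMES.get(phoneme, "neutral")
        weight = max(0.0, min(1.0, emotion_intensity))
        return shape, weight
    ```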
  • Publication number: 20100198583
    Abstract: The present invention relates to an indicating method for a speech recognition system comprising a multimedia electronic product and a speech recognition device. The steps of this method include: users enter voice commands into a voice input unit, which converts the commands into speech signals; the signals are acquired and stored by a recording unit, converted by a microprocessor into a volume-indicating oscillogram, and then displayed by a display module. At the same time, compliance with the speech recognition conditions is determined in that process.
    Type: Application
    Filed: February 4, 2009
    Publication date: August 5, 2010
    Applicant: AIBELIVE CO., LTD.
    Inventors: Chen-Wei Su, Chun-Ping Fang, Min-Ching Wu
  • Publication number: 20100153101
    Abstract: A computerized method and system is provided for automatically selecting from a digitized sound sample a segment of the sample that is optimal for the purpose of measuring clinical metrics for voice and speech assessment. A quality measure based on quality parameters of segments of the sound sample is applied to candidate segments to identify the highest quality segment within the sound sample. The invention can optionally provide feedback to the speaker to help the speaker increase the quality of the sound sample provided. The invention also can optionally perform sound pressure level calibration and noise calibration. The invention may optionally compute clinical metrics on the selected segment and may further include a normative database method or system for storing and analyzing clinical measurements.
    Type: Application
    Filed: November 19, 2009
    Publication date: June 17, 2010
    Inventor: David N. Fernandes
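    The segment-selection method above scores candidate segments with a quality measure and keeps the highest-scoring one. A minimal sliding-window sketch, where the callable `quality` stands in for the patent's quality measure over quality parameters (the window mechanics and the toy scoring in the usage note are assumptions):

    ```python
    def best_segment(samples, win, quality):
        """Slide a window of length `win` over the digitized sound sample and
        return (start_index, segment) for the candidate segment with the
        highest quality score."""
        best_start, best_score = 0, float("-inf")
        for start in range(len(samples) - win + 1):
            score = quality(samples[start:start + win])
            if score > best_score:
                best_start, best_score = start, score
        return best_start, samples[best_start:best_start + win]
    ```

    With a toy quality measure such as mean absolute amplitude, `best_segment([0.1, 0.2, 0.9, 0.8, 0.1], 2, q)` picks the loud middle window; a clinical system would substitute measures tied to voice-assessment metrics.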
  • Patent number: 7729921
    Abstract: For design of a speech interface accepting speech control options, speech samples are stored on a computer-readable medium. A similarity calculating unit calculates a certain indication of similarity of first and second sets of ones of the speech samples, the first set of speech samples being associated with a first speech control option and the second set of speech samples being associated with a second speech control option. A display unit displays the similarity indication. In another aspect, word vectors are generated for the respective speech sample sets, indicating frequencies of occurrence of respective words in the respective speech sample sets. The similarity calculating unit calculates the similarity indication responsive to the word vectors of the respective speech sample sets. In another aspect, a perplexity indication is calculated for respective speech sample sets responsive to language models for the respective speech sample sets.
    Type: Grant
    Filed: July 31, 2008
    Date of Patent: June 1, 2010
    Assignee: Nuance Communications, Inc.
    Inventors: Osamu Ichikawa, Gakuto Kurata, Masafumi Nishimura
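    The word-vector aspect above compares speech-control options by the frequencies of words occurring in their associated sample sets. A sketch of that comparison using whitespace tokenization and cosine similarity (the patent names word vectors and a similarity indication but not this exact metric, so cosine similarity is an assumption):

    ```python
    from collections import Counter
    import math

    def word_vector(samples):
        """Build a word-frequency vector over a set of speech sample transcripts."""
        counts = Counter()
        for s in samples:
            counts.update(s.lower().split())
        return counts

    def cosine_similarity(v1, v2):
        """Cosine similarity of two word-frequency vectors: near 1.0 means the
        two speech control options attract very similar wording (likely to be
        confusable), 0.0 means no words in common."""
        dot = sum(v1[w] * v2[w] for w in v1)
        n1 = math.sqrt(sum(c * c for c in v1.values()))
        n2 = math.sqrt(sum(c * c for c in v2.values()))
        return dot / (n1 * n2) if n1 and n2 else 0.0
    ```

    A design tool like the one described could display this value so an interface designer can rename options whose sample sets score too similarly.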
  • Patent number: 7729912
    Abstract: A system and method is provided for reducing latency for automatic speech recognition. In one embodiment, intermediate results produced by multiple search passes are used to update a display of transcribed text.
    Type: Grant
    Filed: December 23, 2003
    Date of Patent: June 1, 2010
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Michiel Adriaan Unico Bacchiani, Brian Scott Amento
  • Patent number: 7693717
    Abstract: An apparatus comprising a session file, session file editor, annotation window, concatenation software and training software. The session file includes one or more audio files and text associated with each audio file segment. The session file editor displays text and provides text selection capability and plays back audio. The annotation window operably associated with the session file editor supports user modification of the selected text, the annotation window saves modified text corresponding to the selected text from the session file editor and audio associated with the modified text. The concatenation software concatenates modified text and audio associated therewith for two or more instances of the selected text. The training software trains a speech user profile using a concatenated file formed by the concatenating software.
    Type: Grant
    Filed: April 12, 2006
    Date of Patent: April 6, 2010
    Assignee: Custom Speech USA, Inc.
    Inventors: Jonathan Kahn, Michael C. Huttinger
  • Patent number: RE42000
    Abstract: A method of formatting and normalizing continuous lip motions to events in a moving picture besides text in a Text-To-Speech converter is provided. A synthesized speech is synchronized with a moving picture by using the method wherein the real speech data and the shape of a lip in the moving picture are analyzed, and information on the estimated lip shape and text information are directly used in generating the synthesized speech.
    Type: Grant
    Filed: October 19, 2001
    Date of Patent: December 14, 2010
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Jae Woo Yang, Jung Chul Lee, Min Soo Hahn, Hang Seop Lee, Youngjik Lee
  • Patent number: RE42647
    Abstract: The present invention provides a text-to-speech conversion system (TTS) for synchronizing with multimedia and a method for organizing input data of the TTS which can enhance the naturalness of synthesized speech and accomplish the synchronization of multimedia with TTS by defining additional prosody information, the information required to synchronize TTS with multimedia, and the interface between this information and TTS for use in the production of the synthesized speech.
    Type: Grant
    Filed: September 30, 2002
    Date of Patent: August 23, 2011
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Jung Chul Lee, Min Soo Hahn, Hang Seop Lee, Jae Woo Yang, Youngjik Lee