Pattern Display Patents (Class 704/276)
  • Publication number: 20120323581
    Abstract: Systems and methods are disclosed for performing voice personalization of video content. The personalized media content may include a composition of a background scene having a character, head model data representing an individualized three-dimensional (3D) head model of a user, audio data simulating the user's voice, and a viseme track containing instructions for causing the individualized 3D head model to lip sync the words contained in the audio data. The audio data simulating the user's voice can be generated using a voice transformation process. In certain examples, the audio data is based on a text input entered or selected by the user (e.g., via a telephone or computer) or on a textual dialogue of a background character.
    Type: Application
    Filed: August 30, 2012
    Publication date: December 20, 2012
    Applicant: IMAGE METRICS, INC.
    Inventors: Jonathan Isaac Strietzel, Jon Hayes Snoddy, Douglas Alexander Fidaleo
  • Patent number: 8335689
    Abstract: A method and system for improving the efficiency of speech transcription by automating the management of a varying pool of human and machine transcribers having diverse qualifications, skills, and reliability for a fluctuating load of speech transcription tasks of diverse requirements such as accuracy, promptness, privacy, and security, from sources of diverse characteristics such as language, dialect, accent, speech style, voice type, vocabulary, audio quality, and duration.
    Type: Grant
    Filed: October 14, 2009
    Date of Patent: December 18, 2012
    Assignee: COGI, Inc.
    Inventors: Andreas Wittenstein, David Brahm, Mark Cromack, Robert Dolan
  • Patent number: 8326626
    Abstract: Apparatus and methods are provided for using automatic speech recognition to analyze a voice interaction and verify compliance of an agent reading a script to a client during the voice interaction. In one aspect of the invention, a communications system includes a user interface, a communications network, and a call center having an automatic speech recognition component. In other aspects of the invention, a script compliance method includes the steps of conducting a voice interaction between an agent and a client and evaluating the voice interaction with an automatic speech recognition component adapted to analyze the voice interaction and determine whether the agent has adequately followed the script.
    Type: Grant
    Filed: December 22, 2011
    Date of Patent: December 4, 2012
    Assignee: West Corporation
    Inventors: Mark J. Pettay, Fonda J. Narke
  • Patent number: 8315366
    Abstract: A system and method for determining a speaker's position and generating a display showing the position of the speaker. In one embodiment, the system comprises a first speakerphone system and a second speakerphone system communicatively coupled to send and receive data. The speakerphone system comprises a display, an input device, a microphone array, a speaker, and a position processing module. The position processing module is coupled to receive acoustic signals from the microphone array and uses these acoustic signals to determine the position of the speaker. The position information is then sent to the other speakerphone system for presentation on its display. In one embodiment, the position processing module comprises an auto-detection module, a position analysis module, a tracking module, and an identity matching module for the detection of sound, the determination of position, and the transmission of position information over the network.
    Type: Grant
    Filed: July 22, 2008
    Date of Patent: November 20, 2012
    Assignee: ShoreTel, Inc.
    Inventors: Edwin J. Basart, David B. Rucinski
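The abstract above does not disclose the position-determination algorithm itself; a common general technique for deriving a talker's bearing from a pair of microphones is time-difference-of-arrival (TDOA) estimation via cross-correlation. The sketch below illustrates that generic technique, not the patented method, and every name in it is hypothetical:

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, approximate value at room temperature


def estimate_bearing(sig_a, sig_b, mic_spacing, sample_rate):
    """Estimate the bearing (radians) of a sound source relative to the
    broadside of a two-microphone array, using cross-correlation TDOA.

    A positive result means channel A lags channel B, i.e. the source
    is closer to microphone B.
    """
    # Cross-correlate the two channels; the lag of the correlation peak
    # is the arrival-time difference between the microphones in samples.
    corr = np.correlate(sig_a, sig_b, mode="full")
    lag = np.argmax(corr) - (len(sig_b) - 1)
    tdoa = lag / sample_rate
    # Far-field assumption: path difference = mic_spacing * sin(angle).
    sin_angle = np.clip(tdoa * SPEED_OF_SOUND / mic_spacing, -1.0, 1.0)
    return np.arcsin(sin_angle)
```

A far-field source and a known microphone spacing are assumed; a real speakerphone array would combine several pairwise estimates and smooth them over time, roles the abstract assigns to the position analysis and tracking modules.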
  • Patent number: 8306824
    Abstract: An apparatus and method of creating a face character that corresponds to a voice of a user are provided. To create various facial expressions with fewer key models, the face character is divided into a plurality of areas, and a voice sample is parameterized according to pronunciation and emotion. When the user's voice is input, a face character image for each divided face area is synthesized using the key models and the parameter data corresponding to the voice sample, and these per-area images are then combined into an overall face character image.
    Type: Grant
    Filed: August 26, 2009
    Date of Patent: November 6, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Bong-cheol Park
  • Patent number: 8255224
    Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving geographical information derived from a non-verbal user action associated with a first computing device. The non-verbal user action implies an interest of a user in a geographic location. The method also includes identifying a grammar associated with the geographic location using the derived geographical information and outputting a grammar indicator for use in selecting the identified grammar for voice recognition processing of vocal input from the user.
    Type: Grant
    Filed: March 7, 2008
    Date of Patent: August 28, 2012
    Assignee: Google Inc.
    Inventors: David Singleton, Debajit Ghosh
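As an illustration of the general idea (not the patented implementation), geographic information derived from a non-verbal action can be reduced to a grammar indicator by a coarse spatial lookup; the region names, bounding boxes, and word lists below are all hypothetical:

```python
# Hypothetical stand-in for region-specific recognition grammars: each
# entry biases the recognizer toward place names for one geographic area.
REGION_GRAMMARS = {
    "london": ["Camden", "Soho", "Heathrow"],
    "new_york": ["SoHo", "Tribeca", "JFK"],
}


def grammar_for_location(lat, lon):
    """Return a grammar indicator for the region covering (lat, lon).

    Toy bounding boxes stand in for a real geographic index.
    """
    if 51.3 <= lat <= 51.7 and -0.5 <= lon <= 0.3:
        return "london"
    if 40.5 <= lat <= 41.0 and -74.3 <= lon <= -73.7:
        return "new_york"
    return None
```

The returned indicator would select which word list the recognizer loads before processing the user's vocal input, mirroring the grammar-indicator output described in the abstract.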
  • Patent number: 8249873
    Abstract: Tonal correction of speech is provided. Received speech is analyzed and compared against a table of commonly mispronounced phrases, which are mapped to the phrase likely intended by the speaker. The phrase determined to be the one the user likely intended can be suggested to the user. If the user approves of the suggestion, tonal correction can be applied to the speech before that speech is delivered to a recipient.
    Type: Grant
    Filed: August 12, 2005
    Date of Patent: August 21, 2012
    Assignee: Avaya Inc.
    Inventors: Colin Blair, Kevin Chan, Christopher R. Gentle, Neil Hepworth, Andrew W. Lang, Paul R. Michaelis
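The lookup-and-confirm flow the abstract describes can be sketched as a table lookup gated by a user confirmation; the table entries and function names below are hypothetical illustrations, not the patented data or API:

```python
# Hypothetical table mapping commonly mispronounced phrases to the
# phrase the speaker likely intended (in a tonal language, the same
# syllable with a wrong tone can mean something entirely different).
CORRECTIONS = {
    "ma_tone3": "ma_tone1",
    "shi_tone2": "shi_tone4",
}


def suggest_correction(phrase, confirm):
    """Look up a likely intended phrase and apply it only if confirmed.

    `confirm` is a callable taking the suggestion and returning True or
    False, standing in for the user prompt described in the abstract.
    """
    suggestion = CORRECTIONS.get(phrase)
    if suggestion is not None and confirm(suggestion):
        return suggestion  # deliver the tonally corrected phrase
    return phrase  # deliver the speech unchanged
```

In the patented system the correction would be applied to the audio itself before delivery; here the string stands in for that audio.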
  • Patent number: 8250484
    Abstract: A computer and a method for generating commands include loading a data exchange format (DXF) image, selecting a measurement tool, and selecting a DXF feature of the DXF image. The method further includes generating an edge detection command for the selected DXF feature according to the measurement tool when the size of the selected DXF feature is not larger than the size of an image area, and generating an edge detection command corresponding to each of the reselected measurement tools when the size of the selected DXF feature is larger than the size of the image area.
    Type: Grant
    Filed: August 10, 2010
    Date of Patent: August 21, 2012
    Assignees: Hong Fu Jin Precision Industry (ShenZhen) Co., Ltd., Hon Hai Precision Industry Co., Ltd.
    Inventors: Chih-Kuang Chang, Zhong-Kui Yuan, Yi-Rong Hong, Xian-Yi Chen, Dong-Hai Li
  • Patent number: 8249870
    Abstract: A semi-automatic speech transcription system of the invention leverages the complementary capabilities of human and machine, building a system which combines automatic and manual approaches. With the invention, collected audio data is automatically distilled into speech segments, using signal processing and pattern recognition algorithms. The detected speech segments are presented to a human transcriber using a transcription tool with a streamlined transcription interface, requiring the transcriber to simply “listen and type”. This eliminates the need to manually navigate the audio, coupling the human effort to the amount of speech, rather than the amount of audio. Errors produced by the automatic system can be quickly identified by the human transcriber, and these corrections are used to improve the automatic system's performance. The automatic system is tuned to maximize the human transcriber's efficiency.
    Type: Grant
    Filed: November 12, 2008
    Date of Patent: August 21, 2012
    Assignee: Massachusetts Institute of Technology
    Inventors: Brandon Cain Roy, Deb Kumar Roy
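The abstract's first stage, distilling audio into speech segments, is commonly done with frame-energy voice activity detection. The sketch below shows that generic technique under stated assumptions (fixed frame size, single energy threshold); it is an illustration, not the patented signal-processing pipeline:

```python
import numpy as np


def detect_speech_segments(audio, sample_rate, frame_ms=30, threshold=0.01):
    """Split audio into (start_sec, end_sec) segments whose frame energy
    exceeds a threshold -- a minimal stand-in for the stage that distills
    raw audio into chunks for the human transcriber."""
    frame_len = int(sample_rate * frame_ms / 1000)
    n_frames = len(audio) // frame_len
    segments, start = [], None
    for i in range(n_frames):
        frame = audio[i * frame_len:(i + 1) * frame_len]
        energetic = np.mean(frame ** 2) > threshold
        if energetic and start is None:
            start = i  # a speech run begins at this frame
        elif not energetic and start is not None:
            segments.append((start * frame_ms / 1000, i * frame_ms / 1000))
            start = None
    if start is not None:  # speech ran to the end of the recording
        segments.append((start * frame_ms / 1000, n_frames * frame_ms / 1000))
    return segments
```

Each returned segment would be queued for the "listen and type" interface, so transcriber effort scales with detected speech rather than total audio duration.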
  • Patent number: 8229751
    Abstract: A system and method of detecting unidentified broadcast electronic media content using a self-similarity technique is presented. The process and system catalogue repeated instances of content that has not been positively identified but is sufficiently similar as to infer repetitive broadcasts. These catalogued instances may be further processed on the basis of different broadcast channels, sources, geographic locations of broadcasts, or formats to further assist their identification.
    Type: Grant
    Filed: December 30, 2005
    Date of Patent: July 24, 2012
    Assignee: Mediaguide, Inc.
    Inventor: Kwan Cheung
  • Patent number: 8229748
    Abstract: Methods and apparatus to present a video program to a visually impaired person are disclosed. An example method comprises receiving a video stream and an associated audio stream of a video program, detecting a portion of the video program that is not readily consumable by a visually impaired person, obtaining text associated with the portion of the video program, converting the text to a second audio stream, and combining the second audio stream with the associated audio stream.
    Type: Grant
    Filed: April 14, 2008
    Date of Patent: July 24, 2012
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Hisao M. Chang, Horst Schroeter
  • Patent number: 8224656
    Abstract: A method, program storage device and mobile device provide speech disambiguation. Audio for speech recognition processing is transmitted by the mobile device. Results representing alternates identified to match the transmitted audio are received. The alternates are displayed in a disambiguation dialog screen for making corrections to the alternates. Corrections are made to the alternates using the disambiguation dialog screen until a correct result is displayed. The correct result is selected. Content associated with the selected correct result is received in parallel with the receiving of the results representing alternates identified to match the transmitted audio.
    Type: Grant
    Filed: March 14, 2008
    Date of Patent: July 17, 2012
    Assignee: Microsoft Corporation
    Inventors: Oliver Scholz, Robert L. Chambers, Julian James Odell
  • Patent number: 8219386
    Abstract: The Arabic poetry meter identification system and method produces coded Al-Khalyli transcriptions of Arabic poetry. The meters (Wazn, plural Awzan, being the forms of the Arabic poem units Bayt, plural Abyat) are identified. A spoken or written poem is accepted as input, and a coded transcription of the poetry pattern forms is produced from input processing. The system identifies and distinguishes between proper and improper spoken poetic meter. Errors in the poem meters (Bahr, plural Buhur) and the ending rhyme pattern (Qafiya) are detected and verified. The system accepts user selection of a desired poem meter and then interactively aids the user in the composition of poetry in the selected meter, suggesting alternative words and word groups that follow the desired poem pattern and dactyl components. The system can be in a stand-alone device or integrated with other computing devices.
    Type: Grant
    Filed: January 21, 2009
    Date of Patent: July 10, 2012
    Assignee: King Fahd University of Petroleum and Minerals
    Inventors: Al-Zahrani Abdul Kareem Saleh, Moustafa Elshafei
  • Patent number: 8214209
    Abstract: Disclosed is a speech recognition system which includes speech input means for receiving speech data, speech recognition means for receiving the input speech data from the speech input means and performing speech recognition, recognition result evaluation means for determining a priority of at least one of a recognition result and each portion forming the recognition result obtained by the speech recognition means, storage means for storing the recognition result and the priority, recognition result formatting means for determining display/non-display of the recognition result and/or each portion forming the recognition result and generating output information according to the priority, and output means for outputting the output information.
    Type: Grant
    Filed: October 11, 2006
    Date of Patent: July 3, 2012
    Assignee: NEC Corporation
    Inventor: Kentarou Nagatomo
  • Patent number: 8208608
    Abstract: A multi-modal system providing for a single point of contact that can allow users to manage their personal contact information and contact lists, and connect to other users and businesses in a personalized, efficient, location-sensitive and organized manner. By accessing the system using any type of telephony-based device, a user can manage all of their personal and business contacts as well as perform generalized searches in public databases, such as white page and/or yellow page listings, or more personalized searches through databases of their business or personal contacts. A user may also, during a generalized search, go to a personalized search, and vice-versa. The system may also provide users with the opportunity to select certain businesses from their contact lists and allow these businesses to provide them with personalized data, either on demand or based on user-controlled permissions or areas of interest through various technologies including presence technologies.
    Type: Grant
    Filed: February 4, 2009
    Date of Patent: June 26, 2012
    Assignee: Call Genie Inc.
    Inventors: Todd Garrett Simpson, Christopher Edward Lugg
  • Patent number: 8209179
    Abstract: This invention realizes a speech communication system and method, and a robot apparatus, capable of significantly improving the entertainment property. A speech communication system with a function to converse with a conversation partner is provided with speech recognition means for recognizing the speech of the conversation partner, conversation control means for controlling the conversation with the partner based on the recognition result of the speech recognition means, image recognition means for recognizing the face of the partner, and tracking control means for tracking the existence of the partner based on one or both of the recognition results of the image recognition means and the speech recognition means. The conversation control means controls the conversation so that it continues depending on the tracking by the tracking control means.
    Type: Grant
    Filed: July 2, 2004
    Date of Patent: June 26, 2012
    Assignee: Sony Corporation
    Inventors: Kazumi Aoyama, Hideki Shimomura
  • Patent number: 8201139
    Abstract: A framework for generating a semantic interpretation of natural language input includes an interpreter, a first set of types, and a second set of types. The interpreter is adapted to mediate between a client application and one or more analysis engines to produce interpretations of the natural language input that are valid for the client application. The first set of types is adapted to define interactions between the interpreter and the one or more analysis engines. The second set of types is adapted to define interactions between the interpreter and the client application.
    Type: Grant
    Filed: September 15, 2004
    Date of Patent: June 12, 2012
    Assignee: Microsoft Corporation
    Inventors: Su Chin Chang, Ravi C. Shahani, Domenic J. Cipollone, Michael V. Calcagno, Mari J. B. Olsen, David J. Parkinson
  • Publication number: 20120130720
    Abstract: An information providing device takes an image of a predetermined area and obtains the taken image in the form of image data, while externally obtaining voice data representing speech. The information providing device obtains text in a preset language corresponding to the speech in the form of text data, based on the obtained voice data, generates a composite image including the taken image and the text in the form of composite image data, based on the image data and the text data, and outputs the composite image data.
    Type: Application
    Filed: November 14, 2011
    Publication date: May 24, 2012
    Applicant: ELMO COMPANY LIMITED
    Inventor: Yasushi Suda
  • Patent number: 8155964
    Abstract: This invention includes: a voice quality feature database (101) holding voice quality features; a speaker attribute database (106) holding, for each voice quality feature, an identifier enabling a user to expect a voice quality of the voice quality feature; a weight setting unit (103) setting a weight for each acoustic feature of a voice quality; a scaling unit (105) calculating display coordinates of each voice quality feature based on the acoustic features in the voice quality feature and the weights set by the weight setting unit (103); a display unit (107) displaying the identifier of each voice quality feature on the calculated display coordinates; a position input unit (108) receiving designated coordinates; and a voice quality mix unit (110) (i) calculating a distance between (1) the received designated coordinates and (2) the display coordinates of each of a part or all of the voice quality features, and (ii) mixing the acoustic features of the part or all of the voice quality features together based
    Type: Grant
    Filed: June 4, 2008
    Date of Patent: April 10, 2012
    Assignee: Panasonic Corporation
    Inventors: Yoshifumi Hirose, Takahiro Kamai
  • Patent number: 8115772
    Abstract: In an embodiment, a method is provided for creating a personal animated entity for delivering a multi-media message from a sender to a recipient. An image file from the sender may be received by a server. The image file may include an image of an entity. The sender may be requested to provide input with respect to facial features of the image of the entity in preparation for animating the image of the entity. After the sender provides the input with respect to the facial features of the image of the entity, the image of the entity may be presented as a personal animated entity to the sender to preview. Upon approval of the preview from the sender, the image of the entity may be presented as a sender-selectable personal animated entity for delivering the multi-media message to the recipient.
    Type: Grant
    Filed: April 8, 2011
    Date of Patent: February 14, 2012
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Joern Ostermann, Mehmet Reha Civanlar, Ana Cristina Andres del Valle, Patrick Haffner
  • Patent number: 8117034
    Abstract: A speech recognition device (1) processes speech data (SD) of a dictation and thus establishes recognized text information (ETI) and link information (LI) of the dictation. In a synchronous playback mode of the speech recognition device (1), during the acoustic playback of the dictation a correction device (10) synchronously marks the word of the recognized text information (ETI) that, according to the link information (LI), relates to the speech data (SD) just played back; the currently marked word indicates the position of an audio cursor (AC). When a user of the speech recognition device (1) recognizes an incorrect word, he positions a text cursor (TC) at the incorrect word and corrects it. Cursor synchronization means (15) now make it possible to synchronize the text cursor (TC) with the audio cursor (AC), or the audio cursor (AC) with the text cursor (TC), so that the positioning of the respective cursor (AC, TC) is simplified considerably.
    Type: Grant
    Filed: March 26, 2002
    Date of Patent: February 14, 2012
    Assignee: Nuance Communications Austria GmbH
    Inventor: Wolfgang Gschwendtner
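The link information the abstract describes can be thought of as a per-word time alignment, which makes cursor synchronization a lookup in either direction. The sketch below is a hypothetical illustration of that data shape, not the patented format; all names are invented:

```python
# Hypothetical "link information": each recognized word paired with the
# (start_sec, end_sec) span of the speech data it was decoded from.
links = [
    ("please", 0.0, 0.4),
    ("send", 0.4, 0.7),
    ("the", 0.7, 0.8),
    ("report", 0.8, 1.3),
]


def word_at_time(links, t):
    """Audio cursor -> text cursor: index of the word playing at time t."""
    for i, (_, start, end) in enumerate(links):
        if start <= t < end:
            return i
    return None  # t falls outside the dictation


def time_of_word(links, i):
    """Text cursor -> audio cursor: playback position where word i begins."""
    return links[i][1]
```

With such an alignment, placing the text cursor on a word immediately yields the playback position to seek to, and vice versa, which is the simplification the abstract claims.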
  • Publication number: 20120029907
    Abstract: A digital pen designed to assist users in spelling words as they write. The invention is an electronic pen with a speaker located near the top of the device; a microphone may be located directly under the speaker in the form of a small screened concave or convex aperture. A switch on the back of the pen allows the user to choose between three settings: Medical Dictionary (D), Off (O), and Prescription Drug List (P). The device works by the user speaking the desired word into the microphone. The word then appears on the illuminated digital display screen, and the pen asks the user to confirm or deny the displayed word by saying “yes” or “no” into the microphone. If denied, the pen displays another word until the correct word is located. Once confirmed, the pen audibly and visibly spells the word one letter at a time as the user writes. The pen may be switched to the prescription drug list mode as needed.
    Type: Application
    Filed: December 30, 2010
    Publication date: February 2, 2012
    Inventors: Angela Loggins, Tamara S. Loggins
  • Patent number: 8108213
    Abstract: Apparatus and methods are provided for using automatic speech recognition to analyze a voice interaction and verify compliance of an agent reading a script to a client during the voice interaction. In one aspect of the invention, a communications system includes a user interface, a communications network, and a call center having an automatic speech recognition component. In other aspects of the invention, a script compliance method includes the steps of conducting a voice interaction between an agent and a client and evaluating the voice interaction with an automatic speech recognition component adapted to analyze the voice interaction and determine whether the agent has adequately followed the script. In still further aspects of the invention, the duration of a given interaction can be analyzed, either apart from or in combination with the script compliance analysis above, to seek to identify instances of agent non-compliance, of fraud, or of quality-analysis issues.
    Type: Grant
    Filed: January 13, 2010
    Date of Patent: January 31, 2012
    Assignee: West Corporation
    Inventors: Mark J Pettay, Fonda J Narke
  • Patent number: 8099286
    Abstract: Situational awareness enhancement is promoted in a radio communication transceiver receiving non-verbal content and verbal content from a remote source. A vocoder communicatively coupled with the transceiver receives the verbal content packets, and a situational awareness encoder/decoder communicatively coupled with the transceiver receives the non-verbal content packets. The encoder/decoder links and synchronizes the verbal content packets with the corresponding non-verbal content packets, and provides the verbal content packets to the vocoder in parallel with the provision of the non-verbal content packets to the encoder/decoder. The non-verbal and verbal content is extracted from the respective packets and are synchronously output for display and playback, respectively.
    Type: Grant
    Filed: May 12, 2008
    Date of Patent: January 17, 2012
    Assignee: Rockwell Collins, Inc.
    Inventors: John Thommana, Lizy Paul
  • Patent number: 8078467
    Abstract: This invention provides a device and method for language model switching and adaptation. The device comprises a notification manager which notifies a language model switching section of the current status information, or the request for a language model, of a destination application when the status of the destination application changes; a language model switching section which selects one or more language models to be switched to from a language model set according to the received current status information or request; an LMB engine which decodes a user input using the one or more selected language models; and a language model adaptation section which receives the decoded result and modifies the one or more selected language models based on it. Therefore, decoding of the user input is more accurate even as the language model switching section performs different switches among different language models, and the performance of the language models is improved by the language model adaptation section.
    Type: Grant
    Filed: March 8, 2007
    Date of Patent: December 13, 2011
    Assignee: NEC (China) Co., Ltd.
    Inventors: Genqing Wu, Liqin Xu
  • Patent number: 8073701
    Abstract: The present disclosure relates to identity verification devices and methods. A system is provided that utilizes a system of tonal and rhythmic visualization of a spoken word to accurately identify the true owner of a credit or other personal card based on their voice.
    Type: Grant
    Filed: April 21, 2008
    Date of Patent: December 6, 2011
    Assignee: Master Key, LLC
    Inventor: Kenneth R. Lemons
  • Patent number: 8068107
    Abstract: In a multimedia presentation having speech and graphic contributions, a list of graphic objects is provided. Each graphic object is associated with a graphic file capable of being executed by a computer to display a corresponding graphic contribution on a screen. A speech file comprising a sequence of phrases is also created, each phrase comprising a speech contribution explaining at least one graphic contribution associated with a respective graphic object. Then, an arrangement string is created, obtained as a sequence of a first graphic object and a respective first phrase, then a second graphic object and a respective second phrase, and so on up to completion of all graphic objects and phrases of said list and of said speech file, respectively. A processing speed for displaying the graphic objects is chosen.
    Type: Grant
    Filed: November 22, 2004
    Date of Patent: November 29, 2011
    Inventor: Mario Pirchio
  • Patent number: 8065155
    Abstract: An adaptive advertising apparatus and associated methods. In one embodiment, the apparatus comprises a computer readable medium having at least one computer program disposed thereon, the at least one program being configured to adaptively present (e.g., display) advertising-related content (e.g., audio, video, images, etc.) that is contextually related to the topic of a conversation between a plurality of parties or individuals. In one variant, the at least one program comprises a speech recognition program that analyzes digitized speech, and identifies one or more words therein in order to determine the topic of conversation or context. Contextually related (or targeted) advertising is then selected based on its relationship to the determined context or topic.
    Type: Grant
    Filed: February 10, 2010
    Date of Patent: November 22, 2011
    Inventor: Robert F. Gazdzinski
  • Patent number: 8032382
    Abstract: An apparatus and method for speech information processing includes detecting a first operation of a speech processing start instruction element, controlling a display so that speech recognition information is displayed in response to the detection of the first operation, detecting a second operation of the speech processing start instruction element, acquiring speech information in response to detection of the second operation, and performing speech recognition processing on the speech information.
    Type: Grant
    Filed: December 21, 2006
    Date of Patent: October 4, 2011
    Assignee: Canon Kabushiki Kaisha
    Inventors: Hiroki Yamamoto, Tsuyoshi Yagisawa
  • Publication number: 20110231194
    Abstract: In an embodiment, a method of interactive speech preparation is disclosed. The method may include or comprise displaying an interactive speech application on a display device, wherein the interactive speech application has a text display window. The method may also include or comprise accessing text stored in an external storage device over a communication network, and displaying the text within the text display window while capturing video and audio data with video and audio data capturing devices, respectively.
    Type: Application
    Filed: December 16, 2010
    Publication date: September 22, 2011
    Inventor: Steven Lewis
  • Patent number: 8010360
    Abstract: A system and method is provided for reducing latency for automatic speech recognition. In one embodiment, intermediate results produced by multiple search passes are used to update a display of transcribed text.
    Type: Grant
    Filed: December 15, 2009
    Date of Patent: August 30, 2011
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Michiel Adriaan Unico Bacchiani, Brian Scott Amento
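The multi-pass idea in the abstract, showing intermediate hypotheses so perceived latency tracks the fastest pass, can be sketched generically; the function names and pass structure below are hypothetical, not the patented system:

```python
# Hypothetical sketch: a fast first decoding pass puts a rough transcript
# on screen immediately; slower, more accurate passes (e.g., rescoring)
# overwrite it as they complete.
def transcribe_with_updates(audio, passes, display):
    """Run recognition passes in order of increasing accuracy and cost,
    pushing each intermediate hypothesis to the display callback."""
    hypothesis = None
    for run_pass in passes:
        hypothesis = run_pass(audio)  # next, more accurate decode
        display(hypothesis)  # user sees text before the final pass ends
    return hypothesis
```

A usage example with stub passes standing in for real decoders:

```python
shown = []
passes = [lambda a: "recognize speech", lambda a: "recognize speech today"]
final = transcribe_with_updates(b"", passes, shown.append)
# `shown` now holds every intermediate hypothesis in display order.
```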
  • Patent number: 8004529
    Abstract: A method for processing an animation file to provide an animated icon to an instant messaging environment is presented. An animation file is reformatted to generate the animated icon to satisfy a pre-defined size requirement of the instant messaging environment. The animated icon is stored for distribution to the instant messaging environment.
    Type: Grant
    Filed: October 1, 2007
    Date of Patent: August 23, 2011
    Assignee: Apple Inc.
    Inventors: Justin Wood, Thomas Goossens
  • Patent number: 7970616
    Abstract: A server may provide information to a processing device for displaying a parser user interface. The displayed parser user interface may include an input portal for inputting text input. The parser user interface may further include controls for selecting a level of compression. Upon selection of one of the controls, the server may process the text input and may produce text output which may include a placeholder symbol to replace specific words from the text input and/or abbreviated representations to replace other specific words from the text input. The server may send information to the processing device to display the produced text output, as well as other information. The server may further provide information to the processing device for displaying a speed reader user interface. The speed reader user interface may include controls for starting, stopping, and pausing a speed reading operation as well as other controls.
    Type: Grant
    Filed: July 23, 2007
    Date of Patent: June 28, 2011
    Inventor: Ronald M. Dapkunas
  • Publication number: 20110115988
    Abstract: A method for remotely outputting audio in a display apparatus includes outputting Audio/Visual (AV) data; if a command to remotely output audio is input while the AV data is being output, stopping output of the audio of the AV data while continuing to output the video of the AV data; and transmitting, to an external apparatus, at least one of compressed audio data separated from the AV data and information regarding the time when transmission of the compressed audio data starts.
    Type: Application
    Filed: November 15, 2010
    Publication date: May 19, 2011
    Inventors: Woo-yong CHANG, Seung-dong Yu, Se-jun Park, Min-jeong Moon
  • Publication number: 20110093274
    Abstract: Disclosed are an apparatus and method of manufacturing an article using sound, which modify the sound waveforms of living things (including the human voice) into various shapes and manufacture articles corresponding to those shapes. An apparatus for manufacturing an article using sound generates a sampling waveform based on the sound waveform. Next, the sampling waveform is converted into a two-dimensional image file, and the two-dimensional image is in turn converted into a three-dimensional image file. Thereafter, an article is manufactured based on the two-dimensional or three-dimensional image file. According to the invention, the apparatus and method manufacture an article based on the sampling waveform generated by sampling the sound waveform, thereby producing a simplified article.
    Type: Application
    Filed: May 16, 2008
    Publication date: April 21, 2011
    Inventor: Kwanyoung Lee
  • Publication number: 20110087493
    Abstract: The invention relates to a communication system having a display unit (2) and a virtual being (3) that can be visually represented on the display unit (2) and that is designed for communication by means of natural speech with a natural person, wherein at least one interaction symbol (6, 7) that can be represented on the display unit (2) supports the natural speech dialog between the virtual being (3) and the natural person, such that an achieved dialog state can be indicated and/or additional information depending on the achieved dialog state can be redundantly invoked. The invention further relates to a method for representing information of a communication between a virtual being and a natural person.
    Type: Application
    Filed: May 15, 2009
    Publication date: April 14, 2011
    Inventors: Stefan Sellschopp, Valentin Nicolescu, Helmut Krcmar
  • Publication number: 20110043832
    Abstract: A printed audio format includes a printed encoding of an audio signal, and a plurality of spaced-apart and parallel rails. The printed encoding of the audio signal is located between the plurality of rails and each rail comprises at least one marker. The printed encoding comprises a first portion and a second portion, each portion comprises a plurality of code frames, and each frame represents a time segment of an audio signal. The first portion encodes a first time period of the audio signal and the second portion encodes a second time period of the audio signal. The second portion is encoded in reverse order with respect to the first portion so that the joining part is on the same end of both portions.
    Type: Application
    Filed: October 29, 2010
    Publication date: February 24, 2011
    Applicant: Creative Technology Ltd
    Inventors: Wong Hoo Sim, Desmond Toh Onn Hii, Tur We Chan, Chin Fang Lim, Willie Png, Morgun Phay
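    The key layout trick in the printed audio format above is that the second portion is encoded in reverse, so the join between the two portions sits at the same physical end of both. A small sketch of that encode/decode round trip, with frames represented as plain list elements (an assumption; the patent's actual frame encoding is not specified here):

    ```python
    def encode_portions(frames):
        """Split audio code frames into two portions; the second portion is
        stored reversed so its start coincides with the join point, putting
        the joining part on the same end of both printed portions."""
        mid = len(frames) // 2
        first = frames[:mid]
        second = frames[mid:][::-1]  # reversed with respect to the first portion
        return first, second

    def decode_portions(first, second):
        """Recover the original frame order from the two printed portions."""
        return first + second[::-1]
    ```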
  • Patent number: 7860717
    Abstract: A system and method may be disclosed for facilitating the site-specific customization of automated speech recognition systems by providing a customization client for site-specific individuals to update and modify language model input files and post processor input files. In customizing the input files, the customization client may provide a graphical user interface for facilitating the inclusion of words specific to a particular site. The customization client may also be configured to provide the user with a series of formatting rules for controlling the appearance and format of a document transcribed by an automated speech recognition system.
    Type: Grant
    Filed: September 27, 2004
    Date of Patent: December 28, 2010
    Assignee: Dictaphone Corporation
    Inventors: Amy J. Urhbach, Alan Frankel, Jill Carrier, Ana Santisteban, William F. Cote
  • Patent number: 7860718
    Abstract: Provided are an apparatus and method for speech segment detection, and a system for speech recognition. The apparatus is equipped with a sound receiver and an image receiver and includes: a lip motion signal detector for detecting a motion region from image frames output from the image receiver, applying lip motion image feature information to the detected motion region, and detecting a lip motion signal; and a speech segment detector for detecting a speech segment using sound frames output from the sound receiver and the lip motion signal detected from the lip motion signal detector. Since lip motion image information is checked in a speech segment detection process, it is possible to prevent dynamic noise from being misrecognized as speech.
    Type: Grant
    Filed: December 4, 2006
    Date of Patent: December 28, 2010
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Soo Jong Lee, Sang Hun Kim, Young Jik Lee, Eung Kyeu Kim
  • Publication number: 20100250256
    Abstract: A section 52 corresponding to a given duration is sampled from sound data 50 that indicates the voice of a player collected by a microphone, and a vocal tract cross-sectional area function 54 of the sampled section is calculated. The vertical dimension ly of the mouth is calculated from a throat-side average cross-sectional area d1 of the vocal tract cross-sectional area function 54, and the area dm of the mouth is calculated from a mouth-side average cross-sectional area d2. The transverse dimension of the mouth is calculated from the area dm and the vertical dimension ly of the mouth.
    Type: Application
    Filed: March 26, 2010
    Publication date: September 30, 2010
    Applicant: NAMCO BANDAI GAMES INC.
    Inventor: Hiroyuki HIRAISHI
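    The abstract above derives the mouth's vertical dimension from the throat-side average area, takes the mouth-side average area as the mouth area, and solves for the transverse dimension from those two. The patent does not give the exact formulas, so the proportional mapping (constant `k`) and the elliptical mouth model (area = pi/4 x width x height) below are assumptions chosen only to make the dependency chain concrete:

    ```python
    import math

    def mouth_dimensions(d1, d2, k=1.0):
        """Hypothetical sketch: mouth height scales with the throat-side
        average cross-sectional area d1, the mouth-side average area d2 is
        taken as the mouth area, and an ellipse model is solved for the
        transverse dimension."""
        height = k * d1                           # vertical dimension of the mouth
        area = d2                                 # area of the mouth opening
        width = 4.0 * area / (math.pi * height)   # transverse dimension
        return height, width
    ```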
  • Patent number: 7797146
    Abstract: A method of simulating interactive communication between a user and a human subject. The method comprises: assigning at least one phrase to a stored content sequence, wherein the content sequence comprises a content clip of the subject; parsing the at least one phrase to produce at least one phonetic clone; associating the at least one phonetic clone with the stored content sequence; receiving an utterance from the user; matching the utterance to the at least one phonetic clone; and displaying the stored content sequence associated with the at least one phonetic clone.
    Type: Grant
    Filed: May 13, 2003
    Date of Patent: September 14, 2010
    Assignee: Interactive Drama, Inc.
    Inventors: William G. Harless, Michael G. Harless, Marcia A. Zier
  • Patent number: 7788104
    Abstract: The present invention provides an information processing terminal that can use an alternative means of expression for undesirable emotions that would otherwise be transmitted directly to the other party when a talking person's emotions are expressed in real time, so that the whole picture of a call's status can be reviewed and grasped afterward. An information processing terminal 1 includes: a voice signal output portion 102 for inputting a voice; an emotion estimation portion 201 for generating parameters of emotions from the inputted voice; and a notification portion 30, 40, 50 for giving notice of various kinds of information, wherein the information processing terminal 1 further includes an emotion specifying portion 203 for specifying an emotion expressed by a distinctive parameter among the generated parameters, and the notification portion 30, 40, 50 gives notice of the specified emotion.
    Type: Grant
    Filed: September 9, 2005
    Date of Patent: August 31, 2010
    Assignee: Panasonic Corporation
    Inventors: Hideaki Matsuo, Takaaki Nishi, Tomoko Obama, Yasuki Yamakawa, Tetsurou Sugimoto
  • Publication number: 20100211397
    Abstract: An avatar facial expression representation technology is provided. The avatar facial expression representation technology estimates changes in emotion and emphasis in a user's voice from vocal information, and changes in mouth shape of the user from pronunciation information of the voice. The avatar facial expression technology tracks a user's facial movements and changes in facial expression from image information and may represent avatar facial expressions based on the results of these operations. Accordingly, avatar facial expressions can be obtained which are similar to the actual facial expressions of the user.
    Type: Application
    Filed: January 28, 2010
    Publication date: August 19, 2010
    Inventors: Chi-youn PARK, Young-Kyoo HWANG, Jung-bae KIM
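    The avatar technology above combines two signal paths: pronunciation information selects a mouth shape, while vocal emotion drives how strongly an expression is applied. A toy per-frame sketch of that combination (the phoneme-to-viseme table and the clamped intensity weight are invented for illustration, not taken from the publication):

    ```python
    # Toy phoneme-to-viseme mapping; a real system would cover a full phoneme set.
    VISEMES = {"a": "open", "m": "closed", "o": "round"}

    def avatar_frame(phoneme, emotion_intensity):
        """Return a (mouth_shape, expression_weight) pair for one animation
        frame: the phoneme picks the viseme, and the estimated emotion
        intensity (clamped to [0, 1]) weights the facial expression."""
        shape = VISEMES.get(phoneme, "neutral")
        weight = max(0.0, min(1.0, emotion_intensity))
        return shape, weight
    ```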
  • Publication number: 20100198583
    Abstract: The present invention relates to an indicating method for a speech recognition system comprising a multimedia electronic product and a speech recognition device. The steps of this method include: users enter voice commands into a voice input unit, which converts the commands into speech signals; the signals are acquired and stored by a recording unit, converted by a microprocessor into a volume-indicating oscillogram, and then displayed by a display module. At the same time, compliance with the speech recognition conditions is determined in that process.
    Type: Application
    Filed: February 4, 2009
    Publication date: August 5, 2010
    Applicant: AIBELIVE CO., LTD.
    Inventors: Chen-Wei Su, Chun-Ping Fang, Min-Ching Wu
  • Publication number: 20100153101
    Abstract: A computerized method and system is provided for automatically selecting from a digitized sound sample a segment of the sample that is optimal for the purpose of measuring clinical metrics for voice and speech assessment. A quality measure based on quality parameters of segments of the sound sample is applied to candidate segments to identify the highest quality segment within the sound sample. The invention can optionally provide feedback to the speaker to help the speaker increase the quality of the sound sample provided. The invention also can optionally perform sound pressure level calibration and noise calibration. The invention may optionally compute clinical metrics on the selected segment and may further include a normative database method or system for storing and analyzing clinical measurements.
    Type: Application
    Filed: November 19, 2009
    Publication date: June 17, 2010
    Inventor: David N. Fernandes
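    The segment-selection method above scores candidate segments with a quality measure and keeps the highest-scoring one. A minimal sliding-window sketch, where the callable `quality` stands in for the patent's quality measure over quality parameters (the window mechanics and the toy scoring in the usage note are assumptions):

    ```python
    def best_segment(samples, win, quality):
        """Slide a window of length `win` over the digitized sound sample and
        return (start_index, segment) for the candidate segment with the
        highest quality score."""
        best_start, best_score = 0, float("-inf")
        for start in range(len(samples) - win + 1):
            score = quality(samples[start:start + win])
            if score > best_score:
                best_start, best_score = start, score
        return best_start, samples[best_start:best_start + win]
    ```

    With a toy quality measure such as mean absolute amplitude, `best_segment([0.1, 0.2, 0.9, 0.8, 0.1], 2, q)` picks the loud middle window; a clinical system would substitute measures tied to voice-assessment metrics.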
  • Patent number: 7729921
    Abstract: For design of a speech interface accepting speech control options, speech samples are stored on a computer-readable medium. A similarity calculating unit calculates a certain indication of similarity of first and second sets of ones of the speech samples, the first set of speech samples being associated with a first speech control option and the second set of speech samples being associated with a second speech control option. A display unit displays the similarity indication. In another aspect, word vectors are generated for the respective speech sample sets, indicating frequencies of occurrence of respective words in the respective speech sample sets. The similarity calculating unit calculates the similarity indication responsive to the word vectors of the respective speech sample sets. In another aspect, a perplexity indication is calculated for respective speech sample sets responsive to language models for the respective speech sample sets.
    Type: Grant
    Filed: July 31, 2008
    Date of Patent: June 1, 2010
    Assignee: Nuance Communications, Inc.
    Inventors: Osamu Ichikawa, Gakuto Kurata, Masafumi Nishimura
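    The word-vector aspect above compares speech-control options by the frequencies of words occurring in their associated sample sets. A sketch of that comparison using whitespace tokenization and cosine similarity (the patent names word vectors and a similarity indication but not this exact metric, so cosine similarity is an assumption):

    ```python
    from collections import Counter
    import math

    def word_vector(samples):
        """Build a word-frequency vector over a set of speech sample transcripts."""
        counts = Counter()
        for s in samples:
            counts.update(s.lower().split())
        return counts

    def cosine_similarity(v1, v2):
        """Cosine similarity of two word-frequency vectors: near 1.0 means the
        two speech control options attract very similar wording (likely to be
        confusable), 0.0 means no words in common."""
        dot = sum(v1[w] * v2[w] for w in v1)
        n1 = math.sqrt(sum(c * c for c in v1.values()))
        n2 = math.sqrt(sum(c * c for c in v2.values()))
        return dot / (n1 * n2) if n1 and n2 else 0.0
    ```

    A design tool like the one described could display this value so an interface designer can rename options whose sample sets score too similarly.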
  • Patent number: 7729912
    Abstract: A system and method is provided for reducing latency for automatic speech recognition. In one embodiment, intermediate results produced by multiple search passes are used to update a display of transcribed text.
    Type: Grant
    Filed: December 23, 2003
    Date of Patent: June 1, 2010
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Michiel Adriaan Unico Bacchiani, Brian Scott Amento
  • Patent number: 7693717
    Abstract: An apparatus comprising a session file, session file editor, annotation window, concatenation software and training software. The session file includes one or more audio files and text associated with each audio file segment. The session file editor displays text and provides text selection capability and plays back audio. The annotation window operably associated with the session file editor supports user modification of the selected text, the annotation window saves modified text corresponding to the selected text from the session file editor and audio associated with the modified text. The concatenation software concatenates modified text and audio associated therewith for two or more instances of the selected text. The training software trains a speech user profile using a concatenated file formed by the concatenating software.
    Type: Grant
    Filed: April 12, 2006
    Date of Patent: April 6, 2010
    Assignee: Custom Speech USA, Inc.
    Inventors: Jonathan Kahn, Michael C. Huttinger
  • Patent number: RE42000
    Abstract: A method of formatting and normalizing continuous lip motions to events in a moving picture besides text in a Text-To-Speech converter is provided. A synthesized speech is synchronized with a moving picture by using the method wherein the real speech data and the shape of a lip in the moving picture are analyzed, and information on the estimated lip shape and text information are directly used in generating the synthesized speech.
    Type: Grant
    Filed: October 19, 2001
    Date of Patent: December 14, 2010
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Jae Woo Yang, Jung Chul Lee, Min Soo Hahn, Hang Seop Lee, Youngjik Lee
  • Patent number: RE42647
    Abstract: The present invention provides a text-to-speech conversion system (TTS) for synchronizing with multimedia and a method for organizing input data of the TTS which can enhance the naturalness of synthesized speech and accomplish the synchronization of multimedia with TTS by defining additional prosody information, the information required to synchronize TTS with multimedia, and the interface between this information and TTS for use in the production of the synthesized speech.
    Type: Grant
    Filed: September 30, 2002
    Date of Patent: August 23, 2011
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Jung Chul Lee, Min Soo Hahn, Hang Seop Lee, Jae Woo Yang, Youngjik Lee