Speaker Identification Or Verification (EPO) Patents (Class 704/E17.001)
- Preprocessing operations, e.g., segment selection, etc., pattern representation or modeling, e.g., based on linear discriminant analysis (LDA), principal components, etc.; feature selection or extraction (EPO) (Class 704/E17.005)
- Training, model building, enrollment (EPO) (Class 704/E17.006)
- Decision making techniques, pattern matching strategies (EPO) (Class 704/E17.007)
- Hidden Markov Models (HMMs) (EPO) (Class 704/E17.012)
- Artificial neural networks, connectionist approaches (EPO) (Class 704/E17.013)
- Pattern transformations and operations aimed at increasing system robustness, e.g., against channel noise, different working conditions, etc. (EPO) (Class 704/E17.014)
- Interactive procedures, man-machine interface (EPO) (Class 704/E17.015)
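The "principal components" preprocessing named in the subclass list above can be illustrated with a minimal sketch. This is synthetic and not tied to any particular patent: it projects cepstral-like frame features onto the directions of largest variance via an eigendecomposition of the covariance matrix.

```python
import numpy as np

# Toy feature matrix: 100 frames x 13 cepstral-like coefficients (synthetic).
rng = np.random.default_rng(0)
frames = rng.normal(size=(100, 13))

# PCA via eigendecomposition of the covariance matrix.
centered = frames - frames.mean(axis=0)
cov = centered.T @ centered / (len(frames) - 1)
eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns ascending order
order = np.argsort(eigvals)[::-1]           # re-sort by descending variance
components = eigvecs[:, order[:5]]          # keep the top 5 components

# Project each frame onto the reduced basis.
reduced = centered @ components
```

The per-column variance of `reduced` equals the top eigenvalues, so the first projected coordinate always carries the most variance.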
- Publication number: 20120278077
  Abstract: Systems and methods for detecting people or speakers in an automated fashion are disclosed. A pool of features including more than one type of input (like audio input and video input) may be identified and used with a learning algorithm to generate a classifier that identifies people or speakers. The resulting classifier may be evaluated to detect people or speakers.
  Type: Application
  Filed: July 11, 2012
  Publication date: November 1, 2012
  Applicant: MICROSOFT CORPORATION
  Inventors: Cha Zhang, Paul A. Viola, Pei Yin, Ross G. Cutler, Xinding Sun, Yong Rui
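The abstract above describes training a classifier over a pooled set of audio and video features. A minimal sketch of that general idea (not the patent's actual algorithm) is AdaBoost over decision stumps on synthetic pooled features, where columns stand in for audio and video cues:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic pooled features: columns 0-1 play the role of audio cues,
# columns 2-3 of video cues. Label +1 means "speaker present".
X = rng.normal(size=(200, 4))
y = np.where(X[:, 0] + X[:, 2] > 0, 1, -1)   # speaker iff joint audio+video cue

def best_stump(X, y, w):
    """Pick the single-feature threshold that minimises weighted error."""
    best = (0, 0.0, 1, np.inf)               # (feature, threshold, sign, error)
    for f in range(X.shape[1]):
        for t in np.unique(X[:, f]):
            for s in (1, -1):
                pred = np.where(X[:, f] > t, s, -s)
                err = w[pred != y].sum()
                if err < best[3]:
                    best = (f, t, s, err)
    return best

# AdaBoost: reweight misclassified examples, accumulate weighted stumps.
w = np.full(len(y), 1 / len(y))
stumps = []
for _ in range(10):
    f, t, s, err = best_stump(X, y, w)
    alpha = 0.5 * np.log((1 - err) / max(err, 1e-10))
    pred = np.where(X[:, f] > t, s, -s)
    w *= np.exp(-alpha * y * pred)
    w /= w.sum()
    stumps.append((f, t, s, alpha))

# Final classifier: sign of the weighted stump votes.
score = sum(a * np.where(X[:, f] > t, s, -s) for f, t, s, a in stumps)
accuracy = (np.sign(score) == y).mean()
```

Because the label depends on an audio column plus a video column, the booster must draw stumps from both input types, which is the point of pooling heterogeneous features.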
- Publication number: 20120259635
  Abstract: A system for the storing of client information in an independent repository is disclosed. Client data may be uploaded by the client or those authorized by the client, or collected and stored by the repository. Data about the client file, such as, for example, the time of upload and modifications, are stored in a metadata file associated with the client file.
  Type: Application
  Filed: April 5, 2012
  Publication date: October 11, 2012
  Inventors: Gregory J. Ekchian, Jack A. Ekchian
- Publication number: 20120253811
  Abstract: A method for identifying a plurality of speakers in audio data and for decoding the speech spoken by said speakers; the method comprising: receiving speech; dividing the speech into segments as it is received; processing the received speech segment by segment in the order received to identify the speaker and to decode the speech, processing comprising: performing primary decoding of the segment using an acoustic model and a language model; obtaining segment parameters indicating the differences between the speaker of the segment and a base speaker during the primary decoding; comparing the segment parameters with a plurality of stored speaker profiles to determine the identity of the speaker, and selecting a speaker profile for said speaker; updating the selected speaker profile; performing a further decoding of the segment using a speaker independent acoustic model, adapted using the updated speaker profile; outputting the decoded speech for the identified speaker, wherein the speaker profiles are upd…
  Type: Application
  Filed: August 23, 2011
  Publication date: October 4, 2012
  Applicant: Kabushiki Kaisha Toshiba
  Inventors: Catherine Breslin, Mark John Francis Gales, Kean Kheong Chin, Katherine Mary Knill
- Publication number: 20120253808
  Abstract: According to an embodiment, a voice recognition device includes a voice inputting unit, a voice recognition processing unit, a vibration movement pattern model holding unit, and a vibration movement unit. The voice recognition processing unit performs voice recognition processing using a digital signal output from the voice inputting unit to output a voice recognition result and outputs voice reliability of the received voice signal. The vibration movement pattern model holding unit stores models prepared according to a number of patterns of the voice reliability output from the voice recognition processing unit and holds vibration movements corresponding to the models. The vibration movement unit detects whether or not the voice reliability output from the voice recognition processing unit matches any one of the models in the vibration movement pattern model holding unit and performs vibration movement predetermined for a matched model.
  Type: Application
  Filed: October 17, 2011
  Publication date: October 4, 2012
  Inventors: Motonobu Sugiura, Hiroshi Fujimura
- Publication number: 20120239399
  Abstract: Disclosed is a voice recognition device which creates a recognition dictionary (statically-created dictionary) in advance for a vocabulary having words to be recognized whose number is equal to or larger than a threshold, and creates a recognition dictionary (dynamically-created dictionary) for a vocabulary having words to be recognized whose number is smaller than the threshold in an interactive situation.
  Type: Application
  Filed: March 30, 2010
  Publication date: September 20, 2012
  Inventors: Michihiro Yamazaki, Yuzo Maruta
- Publication number: 20120221335
  Abstract: According to one embodiment, the method may include constructing a first voice tag for registration speech based on a Hidden Markov acoustic model (HMM), constructing a second voice tag for the registration speech based on template matching, and combining the first voice tag and the second voice tag to construct the voice tag of the registration speech.
  Type: Application
  Filed: February 24, 2012
  Publication date: August 30, 2012
  Inventors: Rui Zhao, Lei He
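The abstract above combines an HMM-based voice tag with a template-matching voice tag. A common template-matching primitive is dynamic time warping (DTW); the sketch below is a hypothetical illustration, not the patent's method, and the HMM score is a stubbed placeholder value rather than a real model likelihood:

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic-time-warping distance between two feature sequences."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

rng = np.random.default_rng(2)
template = rng.normal(size=(20, 3))                  # enrolled voice-tag template
same = template + 0.05 * rng.normal(size=(20, 3))    # repeat of the same tag
other = rng.normal(size=(20, 3))                     # a different utterance

# Hypothetical fusion: turn the DTW distance into a similarity and average it
# with an (assumed, stubbed) HMM score in [0, 1].
def fused_score(utt, hmm_score):
    dtw_sim = 1.0 / (1.0 + dtw_distance(utt, template))
    return 0.5 * dtw_sim + 0.5 * hmm_score

s_same = fused_score(same, hmm_score=0.9)    # stub values for the sketch
s_other = fused_score(other, hmm_score=0.2)
```

A repeat of the enrolled tag warps onto the template cheaply, so its fused score exceeds that of an unrelated utterance.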
- Publication number: 20120213428
  Abstract: A training device comprises a first regenerating unit that regenerates at least one of an image and a voice for training during the training courses which lead the user to train the operation of an input device; an operation accepting unit that accepts the user operation for at least one of the image and the voice for training from a simulated user interface which simulates a user interface of the input device during training; a second regenerating unit that regenerates at least one of the image and the voice for training when the training is ended; and a normal operation instructing unit that instructs a normal operation to the user by outputting at least one of the image and the voice of the normal operation of the user, which show at least one of the image and the voice for training, synchronous with the regeneration of the second regenerating unit.
  Type: Application
  Filed: February 7, 2012
  Publication date: August 23, 2012
  Applicant: TOSHIBA TEC KABUSHIKI KAISHA
  Inventors: Daigo Kudou, Masanori Sambe, Takesi Kawaguti
- Publication number: 20120191454
  Abstract: A system is described to monitor various parameters of a conversation, for example distinguishing voices in a conversation and reporting who in the group is violating the proper etiquette rules of conversation. These results would indicate any disruptive individuals in a conversation, so they can be identified, monitored, and trained, and their etiquette improved, to prevent further disturbances. Some of the functions the system can perform include: report the identity of the voices, report how long one has spoken, report how often one interrupts, report how often one raises their voice, count the occurrences of obscenities, and determine the length of silences.
  Type: Application
  Filed: January 26, 2011
  Publication date: July 26, 2012
  Inventors: Quinton Andrew Gabara, Constance Marie Gabara, Helen Mary Gabara, Simone Marie Gabara, Cassandra Marlene Gabara, Asher Thomas Gabara, Thaddeus John Gabara
- Publication number: 20120173239
  Abstract: The invention refers to a method of verifying the identity of a speaker based on the speaker's voice comprising the steps of: receiving (1, 5) a first and a second voice utterance; using biometric voice data to verify (2, 6) that the speaker's voice corresponds to the speaker the identity of which is to be verified based on the received first and/or second voice utterance; and determining (8) the similarity of the two received voice utterances, characterized in that the similarity is determined using biometric voice characteristics of the two voice utterances or data derived from such biometric voice characteristics.
  Type: Application
  Filed: June 26, 2009
  Publication date: July 5, 2012
  Inventors: Marta Sánchez Asenjo, Marta Garcia Gomar
- Publication number: 20120162470
  Abstract: A moving image photographing apparatus that recognizes the shape of a speaker's mouth, and/or recognizes the speaker's voice to detect a speaker area, and selectively performs image signal processing with respect to the detected speaker area, and a moving image photographing method using the moving image photographing apparatus. The moving image photographing apparatus may selectively reproduce a moving image by generating a still image including the speaker area and using the still image as a bookmark.
  Type: Application
  Filed: September 22, 2011
  Publication date: June 28, 2012
  Applicant: Samsung Electronics Co., Ltd.
  Inventors: Eun-young Kim, Seung-a Yi
- Publication number: 20120143608
  Abstract: An audio signal source verification system is presented that, in certain embodiments, receives a first template for an audio signal and compares it to templates from different sound sources to determine a correlation between them. A question and response format may be used to eliminate false verifications and to increase the probability that an audio signal is from the purported source of the signal. Moreover, mobile devices may be operated to provide audio signals generated by users of those phones, and the audio signals and templates derived from those signals may be compared to known templates to determine a confidence level or other indication that the mobile device user is who they purport to be. Moreover, comparisons can be made using templates of different richness to achieve confidence levels, and confidence levels may be represented based on the results of the comparisons.
  Type: Application
  Filed: June 10, 2011
  Publication date: June 7, 2012
  Inventor: John D. Kaufman
- Publication number: 20120123786
  Abstract: A method for identifying and authenticating a user and protecting information. The identification process is enabled by using a mobile device such as a smartphone, laptop, or thin client device. A user speaks a phrase to create an audio voiceprint while a camera streams video images and creates a video print. The video data is converted to a color band calculated pattern of numbers. The audio voiceprint, video print, and color band are registered in a database as a digital fingerprint. Processing of all audio and video input occurs on a human key system server, so there is no processing by the thin client systems used by the user to access the human key server for authentication and verification. When a user registers, an audio and video fingerprint is created and stored in the database as a reference to identify that individual for the purpose of verification.
  Type: Application
  Filed: December 20, 2011
  Publication date: May 17, 2012
  Inventors: David Valin, Alex Socolof
- Publication number: 20120095763
  Abstract: Digital method for authentication of a person by comparing a current voice profile with a previously stored initial voice profile, wherein to determine the relevant voice profile the person speaks at least one speech sample into the system, this speech sample is conveyed to a voice-profile calculation unit and thereby, on the basis of a prespecified voice-profile algorithm, the voice profile is calculated, such that the overall size of the speech sample and/or parameters of its evaluation to determine the relevant voice profile are established dynamically and automatically as the sample is spoken, in response to the result of an evaluation of a first partial speech sample.
  Type: Application
  Filed: February 19, 2008
  Publication date: April 19, 2012
  Applicant: VOICE.TRUST AG
  Inventors: Raja Kuppuswamy, Christian Pilz
- Publication number: 20120084078
  Abstract: A scalable voice signature authentication capability is provided herein. The scalable voice signature authentication capability enables authentication of varied services such as speaker identification (e.g. private banking and access to healthcare account records), voice signature as a password (e.g. secure access for remote services and document retrieval) and the Internet and its various services (e.g. …
  Type: Application
  Filed: September 30, 2010
  Publication date: April 5, 2012
  Applicant: Alcatel-Lucent USA Inc.
  Inventors: Madhav Moganti, Anish Sankalia
- Publication number: 20120078638
  Abstract: A communications system includes a receiver and at least one transmitter. The receiver receives, from different intermediate systems, biometric samples from parties attempting to obtain services from the intermediate systems and information characterizing the expected identities of the parties. The at least one transmitter transmits, to the intermediate systems, verification that the biometric samples match pre-registered biometric information obtained from a storage device such that the expected identities of the parties are verified as the identities of the parties.
  Type: Application
  Filed: November 10, 2011
  Publication date: March 29, 2012
  Applicant: AT&T INTELLECTUAL PROPERTY I, L.P.
  Inventors: Brian M. Novack, Daniel Larry Madsen, Timothy R. Thompson
- Publication number: 20120078624
  Abstract: The present invention relates to a method for detecting a voice section in time-space by using audio and video information. According to an embodiment of the present invention, a method for detecting a voice section in time-space by using audio and video information comprises the steps of: detecting a voice section in an audio signal which is inputted into a microphone array; verifying a speaker from the detected voice section; sensing the face of the speaker by using a video signal which is inputted into a camera if the speaker is successfully verified, and then estimating the direction of the face of the speaker; and determining the detected voice section as the voice section of the speaker if the estimated face direction corresponds to a reference direction which is previously stored.
  Type: Application
  Filed: February 10, 2010
  Publication date: March 29, 2012
  Applicant: Korea University-Industrial & Academic Collaboration Foundation
  Inventors: Dongsuk Yook, Hyeowoo Lee
- Publication number: 20120065973
  Abstract: A method and apparatus for performing microphone beamforming. The method includes recognizing a speech of a speaker, searching for a previously stored image associated with the speaker, searching for the speaker through a camera based on the image, recognizing a position of the speaker, and performing microphone beamforming according to the position of the speaker.
  Type: Application
  Filed: September 13, 2011
  Publication date: March 15, 2012
  Applicant: Samsung Electronics Co., Ltd.
  Inventors: Sung-Jae Cho, Hyun-Soo Kim
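The simplest form of the microphone beamforming mentioned above is delay-and-sum: once the speaker's direction is known, each microphone's signal is delayed to compensate for the wavefront's arrival times and the channels are averaged. A minimal sketch under stated assumptions (far-field plane wave, uniform linear array, integer-sample alignment; the array geometry and tone are invented for illustration):

```python
import numpy as np

SPEED_OF_SOUND = 343.0   # m/s
FS = 16_000              # sample rate (Hz)

# Uniform linear array of 4 mics, 5 cm apart; speaker at 30 degrees off broadside.
mic_x = np.arange(4) * 0.05
angle = np.deg2rad(30.0)

# Per-mic arrival delays (in samples) for a far-field plane wave from that angle.
delays = mic_x * np.sin(angle) / SPEED_OF_SOUND * FS

# Simulate the same 1 kHz tone arriving with those delays, plus sensor noise.
rng = np.random.default_rng(3)
t = np.arange(1024) / FS
channels = np.stack([
    np.sin(2 * np.pi * 1000 * (t - d / FS)) + 0.5 * rng.normal(size=t.size)
    for d in delays
])

# Delay-and-sum: undo each delay (integer-sample approximation), then average.
aligned = np.stack([np.roll(ch, -int(round(d))) for ch, d in zip(channels, delays)])
beamformed = aligned.mean(axis=0)
```

Averaging the aligned channels keeps the speech coherent while the independent noise averages down, so the beamformed output is closer to the clean tone than any single microphone.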
- Publication number: 20120053941
  Abstract: A wireless voice activation surgical laser system utilizing wireless transmitter receivers. The present invention integrates wireless communication between a surgical laser, a voice recognition device, and a microphone to allow surgeons to verbally activate or deactivate a surgical laser. The voice recognition device is able to recognize a surgeon's commands and relay the commands directly to the surgical laser.
  Type: Application
  Filed: August 27, 2011
  Publication date: March 1, 2012
  Inventor: Michael D. Swick
- Publication number: 20120035929
  Abstract: The messaging system (100) comprises a message data engine (101), a message generation engine (102), and an inference analysis engine (107). The message data engine (101) is operable to identify the desired recipient of the message, analyse and deconstruct the desired message content into syllables, words or phrases as desired or as appropriate. The message or part message may then be passed to the inference engine (107) for review of the message content and context. The message may then be referred back to the requestor (111) through the interface (103) with details of the problem for remediation, or passed to the message generation engine (102) for transcription. The message generation engine (102) may apply a range of speech samples and/or speech parameters as appropriate to the input message in order to compile a representation of this message with the speaker characteristics that were requested.
  Type: Application
  Filed: February 9, 2010
  Publication date: February 9, 2012
  Inventors: Allan Gauld, Sarah Elizabeth Roberts
- Publication number: 20120016673
  Abstract: A speaker recognition system generates a codebook store with codebooks representing voice samples of speakers, referred to as trainers. The speaker recognition system may use multiple classifiers and generate a codebook store for each classifier. Each classifier uses a different set of features of a voice sample as its features. A classifier inputs a voice sample of a person and tries to authenticate or identify the person. A classifier generates a sequence of feature vectors for the input voice sample and then a code vector for that sequence. The classifier uses its codebook store to recognize the person. The speaker recognition system then combines the scores of the classifiers to generate an overall score. If the score satisfies a recognition criterion, then the speaker recognition system indicates that the voice sample is from that speaker.
  Type: Application
  Filed: September 27, 2011
  Publication date: January 19, 2012
  Applicant: Microsoft Corporation
  Inventor: Amitava Das
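The codebook idea above is classically implemented with vector quantization: cluster each enrolled speaker's training frames into a codebook, then identify a probe by whichever codebook quantizes it with the least distortion. A single-classifier sketch of that general technique (synthetic stand-in features, a hand-rolled k-means; not the patent's actual system, which fuses several such classifiers):

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Tiny k-means to build a codebook (code vectors) from training frames."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return centers

def distortion(X, codebook):
    """Average distance from each frame to its nearest code vector."""
    return np.sqrt(((X[:, None] - codebook) ** 2).sum(-1)).min(axis=1).mean()

rng = np.random.default_rng(4)
# Two "speakers" with different feature distributions (synthetic stand-ins).
alice_train = rng.normal(loc=0.0, size=(300, 8))
bob_train = rng.normal(loc=2.0, size=(300, 8))
codebooks = {"alice": kmeans(alice_train, 16), "bob": kmeans(bob_train, 16)}

# Identify an unseen sample: lower distortion against a codebook = better match.
probe = rng.normal(loc=0.0, size=(50, 8))           # actually Alice's distribution
scores = {spk: -distortion(probe, cb) for spk, cb in codebooks.items()}
identified = max(scores, key=scores.get)
```

In a multi-classifier setup as described, each classifier would compute such a score over its own feature set and the per-classifier scores would be combined into the overall score.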
- Publication number: 20120010886
  Abstract: A language identification system suitable for use with voice data transmitted through either telephonic or computer network systems is presented. Embodiments that automatically select the language to be used based upon the content of the audio data stream are presented. In one embodiment, the content of the data stream is supplemented with the context of the audio stream. In another embodiment, the language determination is supplemented with preferences set in the communication devices, and in yet another embodiment, global position data for each user of the system is used to supplement the automated language determination.
  Type: Application
  Filed: July 6, 2011
  Publication date: January 12, 2012
  Inventor: Javad Razavilar
- Publication number: 20120004914
  Abstract: A system generates an audio challenge that includes a first voice and one or more second voices, the first voice being audibly distinguishable, by a human, from the one or more second voices. The first voice conveys first information and the second voice conveys second information. The system provides the audio challenge to a user and verifies that the user is human based on whether the user can identify the first information in the audio challenge.
  Type: Application
  Filed: September 12, 2011
  Publication date: January 5, 2012
  Applicant: Tell Me Networks c/o Microsoft Corporation
  Inventors: Nikko Strom, Dylan F. Salisbury
- Publication number: 20110320200
  Abstract: One-to-many comparisons of callers' voice prints with known voice prints to identify any matches between them. When a customer communicates with a particular entity, such as a customer service center, the system makes a recording of the real-time call including both the customer's and agent's voices. The system segments the recording to extract at least a portion of the customer's voice to create a customer voice print, and it formats the segmented voice print for network transmission to a server. The server compares the customer's voice print with multiple known voice prints to determine any matches, meaning that the customer's voice print and one of the known voice prints are likely from the same person. The identification of any matches can be used for a variety of purposes, such as determining whether to authorize a transaction requested by the customer.
  Type: Application
  Filed: September 7, 2011
  Publication date: December 29, 2011
  Applicant: American Express Travel Related Services Company, Inc.
  Inventors: Vicki Broman, Vernon Marshall, Seshasayee Bellamkonda, Marcel Leyva, Cynthia Hanson
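The one-to-many comparison described above reduces, in modern practice, to scoring one probe voice print against every enrolled print and thresholding the similarity. A minimal sketch assuming voice prints are fixed-length embedding vectors compared by cosine similarity (the embedding dimension, database, and threshold are all invented for illustration):

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(5)
# Hypothetical database of known voice-print embeddings (64-dim vectors).
known_prints = {f"person_{i}": rng.normal(size=64) for i in range(100)}

# Segmented customer voice print: here, a noisy copy of person_42's print.
customer = known_prints["person_42"] + 0.1 * rng.normal(size=64)

THRESHOLD = 0.8     # assumed decision threshold for "likely the same person"
matches = [name for name, vp in known_prints.items()
           if cosine(customer, vp) >= THRESHOLD]
```

Unrelated high-dimensional prints have near-zero cosine similarity, so only the true enrollee clears the threshold; the match list then drives downstream decisions such as transaction authorization.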
- Publication number: 20110313776
  Abstract: A system, method and computer-readable medium for controlling devices connected to a network. The method includes receiving an utterance from a user for remotely controlling a device in a network; converting the received utterance to text using an automatic speech recognition module; accessing a user profile in the network that governs access to a plurality of devices on the network and identifiers which control a conversion of the text to a device specific control language; identifying based on the text a device to be controlled; converting at least a portion of the text to the device control language; and transmitting the device control language to the identified device, wherein the identified device implements a function based on the transmitted device control language.
  Type: Application
  Filed: August 27, 2011
  Publication date: December 22, 2011
  Applicant: AT&T Intellectual Property II, L.P.
  Inventors: Joseph A. Alfred, Joseph M. Sommer
- Publication number: 20110313766
  Abstract: Systems and methods for detecting people or speakers in an automated fashion are disclosed. A pool of features including more than one type of input (like audio input and video input) may be identified and used with a learning algorithm to generate a classifier that identifies people or speakers. The resulting classifier may be evaluated to detect people or speakers.
  Type: Application
  Filed: August 30, 2011
  Publication date: December 22, 2011
  Applicant: MICROSOFT CORPORATION
  Inventors: Cha Zhang, Paul A. Viola, Pei Yin, Ross G. Cutler, Xinding Sun, Yong Rui
- Publication number: 20110313765
  Abstract: A method for assessing quality of conversational speech between nodes of a communication network (1), comprising establishing a voice communication session via the communication network (1) between a user at a user terminal (2) and a virtual subject system (4), the virtual subject system (4) and user terminal (2) being connected to the communication network (1), the user terminal enabling the user to communicate by voice with the virtual subject system (4), during the session, acting as a conversation partner in a voice conversation with the virtual subject system (4), the virtual subject system being equipped with a speech generation module (42) to enable speaking during the session and a voice recognition module (41) to enable interpreting speech of the user during the session, and assessing the quality of speech over the communication network based on the voice conversation during the session, the assessing being performed by the user.
  Type: Application
  Filed: November 24, 2009
  Publication date: December 22, 2011
  Applicant: ALCATEL LUCENT
  Inventor: Nicolas Tranquart
- Publication number: 20110295603
  Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition. In one aspect, a computer-based method includes receiving a speech corpus at a speech management server system that includes multiple speech recognition engines tuned to different speaker types; using the speech recognition engines to associate the received speech corpus with a selected one of multiple different speaker types; and sending a speaker category identification code that corresponds to the associated speaker type from the speech management server system over a network. The speaker category identification code can be used by any one of the speech-interactive applications coupled to the network to select an appropriate one of multiple application-accessible speech recognition engines tuned to the different speaker types in response to an indication that a user accessing the application is associated with a particular one of the speaker category identification codes.
  Type: Application
  Filed: April 28, 2011
  Publication date: December 1, 2011
  Inventor: William S. Meisel
- Publication number: 20110295604
  Abstract: Disclosed herein are systems, methods, and computer-readable storage media for processing a message received from a user to determine whether an estimate of intelligibility is below an intelligibility threshold. The method includes recognizing a portion of a user's message that contains the one or more expected utterances from a critical information list, calculating an estimate of intelligibility for the recognized portion of the user's message that contains the one or more expected utterances, and prompting the user to repeat at least the recognized portion of the user's message if the calculated estimate of intelligibility for the recognized portion of the user's message is below an intelligibility threshold. In one aspect, the method further includes prompting the user to repeat at least a portion of the message if any of a measured speech level and a measured signal-to-noise ratio of the user's message are determined to be below their respective thresholds.
  Type: Application
  Filed: August 8, 2011
  Publication date: December 1, 2011
  Applicant: AT&T Intellectual Property II, L.P.
  Inventors: Harvey S. Cohen, Randy G. Goldberg, Kenneth H. Rosen
- Publication number: 20110285807
  Abstract: A videoconferencing apparatus automatically tracks speakers in a room and dynamically switches between a controlled, people-view camera and a fixed, room-view camera. When no one is speaking, the apparatus shows the room view to the far-end. When there is a dominant speaker in the room, the apparatus directs the people-view camera at the dominant speaker and switches from the room-view camera to the people-view camera. When there is a new speaker in the room, the apparatus switches to the room-view camera first, directs the people-view camera at the new speaker, and then switches to the people-view camera directed at the new speaker. When there are two near-end speakers engaged in a conversation, the apparatus tracks and zooms in the people-view camera so that both speakers are in view.
  Type: Application
  Filed: May 18, 2010
  Publication date: November 24, 2011
  Applicant: POLYCOM, INC.
  Inventor: Jinwei Feng
- Publication number: 20110282665
  Abstract: Provided is a method for measuring environmental parameters for multi-modal fusion. The method for measuring environmental parameters for multi-modal fusion includes: preparing at least one enrolled modality; receiving at least one input modality; calculating image-related environmental parameters of input images in at least one input modality based on illumination of the enrolled image in at least one enrolled modality; and comparing the image-related environmental parameters with a predetermined reference value and discarding the input image or outputting it as recognition data according to the comparison result.
  Type: Application
  Filed: January 31, 2011
  Publication date: November 17, 2011
  Applicant: Electronics and Telecommunications Research Institute
  Inventors: Hye Jin Kim, Do Hyung Kim, Su Young Chi, Jae Yeon Lee
- Publication number: 20110276331
  Abstract: A voice recognition system includes: a voice input unit 11 for inputting a voice uttered a plurality of times; a registering voice data storage unit 12 for storing voice data uttered the plurality of times and input into the voice input unit 11; an utterance stability verification unit 13 for determining a similarity between the voice data uttered the plurality of times that are read from the registering voice data storage unit 12, and determining that registration of the voice data is acceptable when the similarity is greater than a threshold T1; and a standard pattern creation unit 14 for creating a standard pattern by using the voice data where the utterance stability verification unit 13 determines that registration is acceptable.
  Type: Application
  Filed: October 8, 2009
  Publication date: November 10, 2011
  Inventors: Michihiro Yamazaki, Jun Ishii, Hiroki Sakashita, Kazuyuki Nogi
- Publication number: 20110270611
  Abstract: A system for inspecting an oil level in each part of a railroad car truck includes: an imaging unit that obtains an image of an oil level gauge; an oil level inspection unit that inspects whether or not the oil level in the each part of the railroad car truck is within a predetermined range based on the image of the oil level gauge obtained by the imaging unit; a voice input unit adapted for an inspector to input, via voice, an inspection result; a voice processing unit that determines whether or not the inspection result inputted via the voice input unit is good based on the inputted inspection result, and converts a determination result into displayable data; a display unit that displays an oil level inspection result and the determination result; and a storage unit that stores, as data, the oil level inspection result and the determination result.
  Type: Application
  Filed: December 3, 2008
  Publication date: November 3, 2011
  Applicant: CENTRAL JAPAN RAILWAY COMPANY
  Inventors: Kyouichi Nishimura, Nozomu Nakamura, Kazuhiro Okada, Yoshitaka Tanaka
- Publication number: 20110267531
  Abstract: An image capturing apparatus and method for selective real-time focus/parameter adjustment. The image capturing apparatus includes a display unit, an interface unit, an adjustment unit, and a generation unit. The display unit is configured to display an image. The interface unit is configured to enable a user to select a plurality of regions of the image displayed on the display unit. The adjustment unit is configured to enable the user to adjust at least one focus/parameter of at least one selected region of the image displayed on the display unit. The generation unit is configured to convert the image including at least one adjusted selected region into image data, where at least one focus/parameter of the at least one adjusted selected region has been adjusted by the adjustment unit prior to conversion.
  Type: Application
  Filed: May 3, 2010
  Publication date: November 3, 2011
  Applicant: CANON KABUSHIKI KAISHA
  Inventor: Francisco Imai
- Publication number: 20110246198
  Abstract: The present invention refers to a method for verifying the identity of a speaker based on the speaker's voice comprising the steps of: a) receiving a voice utterance; b) using biometric voice data to verify (10) that the speaker's voice corresponds to the speaker the identity of which is to be verified based on the received voice utterance; and c) verifying (12, 13) that the received voice utterance is not falsified, preferably after having verified the speaker's voice; d) accepting (16) the speaker's identity to be verified in case that both verification steps give a positive result and not accepting (15) the speaker's identity to be verified if any of the verification steps gives a negative result. The invention further refers to a corresponding computer-readable medium and a computer.
  Type: Application
  Filed: December 10, 2008
  Publication date: October 6, 2011
  Inventors: Marta Sánchez Asenjo, Alfredo Gutiérrez Navarro, Alberto Martin De Los Santos De Las Heras, Marta Garcia Gomar
- Publication number: 20110231182
  Abstract: A mobile system is provided that includes speech-based and non-speech-based interfaces for telematics applications. The mobile system identifies and uses context, prior information, domain knowledge, and user-specific profile data to achieve a natural environment for users that submit requests and/or commands in multiple domains. The invention creates, stores and uses extensive personal profile information for each user, thereby improving the reliability of determining the context and presenting the expected results for a particular question or command. The invention may organize domain-specific behavior and information into agents that are distributable or updateable over a wide area network.
  Type: Application
  Filed: April 11, 2011
  Publication date: September 22, 2011
  Applicant: VoiceBox Technologies, Inc.
  Inventors: Chris Weider, Richard Kennewick, Mike Kennewick, Philippe Di Cristo, Robert A. Kennewick, Samuel Menaker, Lynn Elise Armstrong
- Publication number: 20110224986
  Abstract: A method for configuring a voice authentication system employing at least one authentication engine comprises utilising the at least one authentication engine to systematically compare a plurality of impostor voice samples against a voice sample of a legitimate person to derive respective authentication scores. The resultant authentication scores are analysed to determine a measure of confidence for the voice authentication system.
  Type: Application
  Filed: July 21, 2009
  Publication date: September 15, 2011
  Inventor: Clive Summerfield
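A standard way to turn impostor-versus-genuine authentication scores into a confidence measure, as the abstract above contemplates, is to sweep a decision threshold and locate the equal-error rate (EER), where the false-accept and false-reject rates cross. A sketch on synthetic score distributions (the Gaussian score model is an assumption for illustration, not the patent's analysis):

```python
import numpy as np

rng = np.random.default_rng(6)
# Synthetic authentication scores: genuine attempts score higher on average.
genuine = rng.normal(loc=2.0, scale=1.0, size=1000)
impostor = rng.normal(loc=0.0, scale=1.0, size=1000)

# Sweep thresholds; the equal-error rate is where false accepts = false rejects.
thresholds = np.linspace(-3, 5, 801)
far = np.array([(impostor >= t).mean() for t in thresholds])  # false-accept rate
frr = np.array([(genuine < t).mean() for t in thresholds])    # false-reject rate
i = np.argmin(np.abs(far - frr))
eer = (far[i] + frr[i]) / 2
```

For two unit-variance Gaussians whose means are 2 apart, the theoretical EER is about 15.9%; a lower EER means better separation between impostor and legitimate scores, i.e. higher system confidence.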
-
Publication number: 20110218798Abstract: Techniques implemented as systems, methods, and apparatuses, including computer program products, for obfuscating sensitive content in an audio source representative of an interaction between a contact center caller and a contact center agent. The techniques include performing, by an analysis engine of a contact center system, a context-sensitive content analysis of the audio source to identify each audio source segment that includes content determined by the analysis engine to be sensitive content based on its context; and processing, by an obfuscation engine of the contact center system, one or more identified audio source segments to generate corresponding altered audio source segments each including obfuscated sensitive content.Type: ApplicationFiled: March 5, 2010Publication date: September 8, 2011Applicant: Nexidia Inc.Inventor: Marsal Gavalda
-
Publication number: 20110196677Abstract: According to one illustrative embodiment, a method is provided for analyzing an audio interaction. At least one change in an emotion of a speaker in an audio interaction and at least one aspect of the audio interaction are identified. The at least one change in an emotion is analyzed in conjunction with the at least one aspect to determine a relationship between the at least one change in an emotion and the at least one aspect, and a result of the analysis is provided.Type: ApplicationFiled: February 11, 2010Publication date: August 11, 2011Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Om D. Deshmukh, Chitra Dorai, Shailesh Joshi, Maureen E. Rzasa, Ashish Verma, Karthik Visweswariah, Gary J. Wright, Sai Zeng
-
Publication number: 20110166859Abstract: A voice recognition unit is constructed to create, for each language, a voice label string for a voice uttered by a user, on the basis of a feature vector time series of the inputted voice and data from a sound standard model, and to register the voice label string into a voice label memory 2, while automatically switching among the languages of the sound standard model memory 1 used to create the voice label string, and among the languages of the voice label memory 2 holding the created voice label string, by using a first language switching unit SW1 and a second language switching unit SW2.Type: ApplicationFiled: October 20, 2009Publication date: July 7, 2011Inventors: Tadashi Suzuki, Yasushi Ishikawa, Yuzo Maruta
-
Publication number: 20110137635Abstract: The present disclosure describes a system and method of transliterating Semitic languages with support for diacritics. An input module receives and pre-processes Romanized characters and forwards the pre-processed Romanized characters to a transliteration engine. The transliteration engine selects candidate transliteration rules, applies the rules, and scores and ranks the results for output. To optimize the search for candidate transliteration rules, the transliteration engine may apply word-stemming strategies to process inflections indicated by affixes. The present disclosure further describes optimizations such as pre-processing emphasis text, caching, dynamic transliteration-rule pruning, and buffering/throttling of input. The system and methods are suitable for multiple applications including, but not limited to, web applications, Windows applications, client-server applications, and input method editors such as those built on the Microsoft Text Services Framework (TSF™).Type: ApplicationFiled: December 8, 2009Publication date: June 9, 2011Applicant: MICROSOFT CORPORATIONInventors: Achraf Chalabi, Hany Grees, Mostafa Ashour, Roaa Mohammed
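Longest-match selection of candidate transliteration rules can be sketched as follows. The rule table is invented for illustration; the engine's actual rules, stemming strategies and scoring are not public:

```python
# Hypothetical rule table mapping Romanized sequences to Arabic letters.
RULES = {"sh": "ش", "th": "ث", "kh": "خ", "a": "ا", "b": "ب",
         "k": "ك", "t": "ت", "s": "س", "h": "ه", "l": "ل", "m": "م"}

def transliterate(word):
    out, i = [], 0
    while i < len(word):
        for length in (2, 1):          # prefer the longest matching rule
            chunk = word[i:i + length]
            if len(chunk) == length and chunk in RULES:
                out.append(RULES[chunk])
                i += length
                break
        else:
            out.append(word[i])        # pass unmatched characters through
            i += 1
    return "".join(out)

print(transliterate("shams"))  # "sh" matches as one digraph rule, not "s"+"h"
```

A production engine would keep multiple candidate rule applications and rank them by score; this sketch greedily takes the longest match.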
-
PHONE CONVERSATION RECORDING SYSTEM USING CALL CONTROL AND FUNCTIONS OF PHONE CONVERSATION RECORDING
Publication number: 20110135069Abstract: New functions are added to the existing telephone network to provide telecommunications-carrier services intended to deter frauds and crimes committed using telephony. The telephonic circumstances during the commission of a fraud or crime are also preserved to help prevent its recommission. A voice announcement indicating that the telephone conversation now starting will be recorded is issued to the sender in advance, a function that deters frauds and crimes by creating psychological resistance. A warning is issued to the recipient after a voiceprint check is performed. The contents of telephone conversations during the commission of a fraud or crime can be played back to provide information necessary for countermeasures against frauds and crimes.Type: ApplicationFiled: July 27, 2010Publication date: June 9, 2011Inventor: Kazuki Yoshida
-
Publication number: 20110131044Abstract: An apparatus, program product and method are provided for separating a target voice from a plurality of other voices having different directions of arrival. The method comprises the steps of disposing a first and a second voice input device at a predetermined distance from one another and, upon receipt of voice signals at said devices, calculating discrete Fourier transforms of the signals, calculating a CSP (cross-power spectrum phase) coefficient by superpositioning multiple frequency-bin components based on correlation of the spectra of the two received signals, and then calculating a weighted CSP coefficient from said two discrete Fourier-transformed speech signals. A target voice received by said devices is then separated from the other voice signals in the spectrum by using the calculated weighted CSP coefficient.Type: ApplicationFiled: November 29, 2010Publication date: June 2, 2011Applicant: International Business Machines CorporationInventors: Takashi Fukuda, Osamu Ichikawa, Masafumi Nishimura
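The core CSP computation (a phase-normalized cross-power spectrum whose peak locates the inter-channel delay of the dominant source) can be sketched as below. The weighting here is the plain phase transform, not the patent's specific weighted-superposition scheme:

```python
import numpy as np

def csp_coefficient(x1, x2):
    """CSP (cross-power spectrum phase) coefficient of two microphone
    signals: the inverse FFT of the phase-normalized cross-power
    spectrum. Its peak index estimates the inter-channel delay, i.e.
    the direction of arrival of the dominant source."""
    X1 = np.fft.rfft(x1)
    X2 = np.fft.rfft(x2)
    cross = np.conj(X1) * X2
    cross /= np.abs(cross) + 1e-12   # keep phase only (phase transform)
    return np.fft.irfft(cross, n=len(x1))

# Two channels: white noise, channel 2 circularly delayed by 5 samples
rng = np.random.default_rng(0)
x1 = rng.standard_normal(1024)
x2 = np.roll(x1, 5)
csp = csp_coefficient(x1, x2)
delay = int(np.argmax(csp))
print(delay)  # the CSP coefficient peaks at the true delay, 5
```

A real front end would compute this per frame and steer the separation filter using the estimated delay for each talker.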
-
Publication number: 20110125498Abstract: One embodiment of the invention provides a computer-implemented method of handling a telephone call. The method comprises monitoring a conversation between an agent and a customer on a telephone line as part of the telephone call to extract the audio signal therefrom. Real-time voice analytics are performed on the extracted audio signal while the telephone call is in progress. The results from the voice analytics are then passed to a computer-telephony integration system responsible for the call for use by the computer-telephony integration system for determining future handling of the call.Type: ApplicationFiled: June 19, 2009Publication date: May 26, 2011Applicant: NEWVOICEMEDIA LTDInventors: Richard Pickering, Joseph Moussalli, Ashley Unitt
-
Publication number: 20110119060Abstract: A method and system for speaker diarization are provided. Pre-trained acoustic models of individual speakers and/or groups of speakers are obtained. Speech data with multiple speakers is received and divided into frames. For each frame, an acoustic feature vector is determined and extended to include log-likelihood ratios of the pre-trained models relative to a background population model. The extended acoustic feature vector is used in segmentation and clustering algorithms.Type: ApplicationFiled: November 15, 2009Publication date: May 19, 2011Applicant: International Business Machines CorporationInventor: Hagai Aronowitz
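The feature extension can be sketched with simple diagonal Gaussians standing in for the pre-trained acoustic models; the model parameters and data below are invented for illustration:

```python
import numpy as np

def diag_gauss_loglik(frames, mean, var):
    """Frame-wise log-likelihood under a diagonal-covariance Gaussian."""
    d = frames - mean
    return -0.5 * (np.log(2 * np.pi * var).sum() + (d * d / var).sum(axis=1))

def extend_features(frames, speaker_models, background):
    """Append one log-likelihood ratio per pre-trained speaker model
    (model vs. background population model) to each frame's base
    acoustic feature vector."""
    bg = diag_gauss_loglik(frames, *background)
    llrs = np.stack([diag_gauss_loglik(frames, m, v) - bg
                     for m, v in speaker_models], axis=1)
    return np.hstack([frames, llrs])

# Toy example: 2-dim "acoustic" frames, two speaker models, broad background
rng = np.random.default_rng(0)
frames = rng.normal(loc=[1.0, 1.0], size=(5, 2))
models = [(np.array([1.0, 1.0]), np.ones(2)),     # matches the data
          (np.array([-4.0, -4.0]), np.ones(2))]   # mismatched speaker
background = (np.zeros(2), 4.0 * np.ones(2))
ext = extend_features(frames, models, background)
print(ext.shape)  # (5, 4): 2 base dims + 2 log-likelihood-ratio dims
```

The matched model yields consistently higher ratios than the mismatched one, which is what makes the extended vectors useful for segmentation and clustering.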
-
Publication number: 20110102142Abstract: A system and associated method verify the attendance and/or identity of viewers of audio/video/data streams transmitted over the internet. The system and method capture various types of interaction with the viewers and either take appropriate action, as configured by a webcast program administrator, or simply log this interaction to a database where audience attention and identity can be validated at a later date.Type: ApplicationFiled: November 4, 2009Publication date: May 5, 2011Inventors: Ian J. Widger, Steven J. Silves, Jeremy M. Knight
-
Publication number: 20110099011Abstract: A method and system for determining and communicating biometrics of a recorded speaker in a voice transcription process. An interactive voice response system receives a request from a user for a transcription of a voice file. A profile associated with the requesting user is obtained, wherein the profile comprises biometric parameters and preferences defined by the user. The requested voice file is analyzed for biometric elements according to the parameters specified in the user's profile. Responsive to detecting biometric elements in the voice file that conform to the parameters specified in the user's profile, a transcription output of the voice file is modified according to the preferences specified in the user's profile for the detected biometric elements to form a modified transcription output file. The modified transcription output file may then be provided to the requesting user.Type: ApplicationFiled: October 26, 2009Publication date: April 28, 2011Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventor: Peeyush Jaiswal
-
Publication number: 20110093267Abstract: A device may be configured to provide a query to a user. Voice data may be received from the user responsive to the query. Voice recognition may be performed on the voice data to identify a query answer. A confidence score associated with the query answer may be calculated, wherein the confidence score represents the likelihood that the query answer has been accurately identified. A likely age range associated with the user may be determined based on the confidence score. The device to calculate the confidence score may be tuned to increase a likelihood of recognition of voice data for a particular age range of callers.Type: ApplicationFiled: December 22, 2010Publication date: April 21, 2011Applicant: VERIZON PATENT AND LICENSING INC.Inventor: Kevin R. Witzman
-
Publication number: 20110093261Abstract: Systems and methods are operable to associate each of a plurality of stored audio patterns with at least one of a plurality of digital tokens, identify a user based on user identification input, access a plurality of stored audio patterns associated with a user based on the user identification input, receive from a user at least one audio input from a custom language made up of custom language elements wherein the elements include at least one monosyllabic representation of a number, letter or word, select one of the plurality of stored audio patterns associated with the identified user, in the case that the audio input received from the identified user corresponds with one of the plurality of stored audio patterns, determine the digital token associated with the selected one of the plurality of stored audio patterns, and generate the output signal for use in a device based on the determined digital token.Type: ApplicationFiled: October 15, 2010Publication date: April 21, 2011Inventor: Paul Angott
-
Publication number: 20110080289Abstract: A device may include a sensor configured to detect when a user is wearing or holding the device. The device may also include a display and a communication interface. The communication interface may be configured to forward an indication to a media playing device when the user is wearing or holding the device and receive content from the media playing device, where the content is received in response to the indication that the user is wearing or holding the device. The communication interface may also output the content to the display.Type: ApplicationFiled: October 22, 2009Publication date: April 7, 2011Applicant: SONY ERICSSON MOBILE COMMUNICATIONS ABInventor: Wayne Christopher Minton
-
Publication number: 20110071831Abstract: The present invention refers to a method for localizing a person, comprising the following steps carried out in a computing system (1): determining (20) the localization of a telecommunication means (3, 6, 8), or determining a telecommunication means (3, 6, 8) at a specific location (this can be implemented using ANI, i.e. the received calling number, with a database lookup of a fixed telephone's address; for a cellular device, cell-ID or triangulation can be used); receiving (21) a voice utterance of a person via the telecommunication means; and verifying (22) the identity of that person based on the received voice utterance using biometric voice data (speech/speaker recognition). The invention further relates to a corresponding system and computer-readable medium.Type: ApplicationFiled: May 9, 2008Publication date: March 24, 2011Applicant: AGNITIO, S.L.Inventors: Marta Garcia Gomar, Marta Sanchez Asenjo