Patents (Class 704/E17.004)
  • Patent number: 11417344
    Abstract: The information processing method in the present disclosure is performed as below. At least one speech segment is detected from speech input to a speech input unit. A first feature quantity is extracted from each speech segment detected, the first feature quantity identifying a speaker whose voice is contained in the speech segment. The first feature quantity extracted is compared with each of the second feature quantities stored in storage, which identify the respective voices of registered speakers who are target speakers in speaker recognition. The comparison is performed for each of the consecutive speech segments, and under a predetermined condition, among the second feature quantities stored in the storage, at least one second feature quantity whose similarity with the first feature quantity is less than or equal to a threshold is deleted, thereby removing the at least one registered speaker identified by the at least one second feature quantity.
    Type: Grant
    Filed: October 21, 2019
    Date of Patent: August 16, 2022
    Assignee: PANASONIC INTELLECTUAL PROPERTY CORPORATION OF AMERICA
    Inventor: Misaki Doi
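A minimal Python sketch of the deletion step described in the abstract above, assuming the feature quantities are fixed-length embedding vectors compared by cosine similarity; the threshold value, function names, and dictionary registry are illustrative assumptions, not taken from the filing.

```python
# Sketch of threshold-based pruning of registered speakers.
# Assumes embeddings are fixed-length vectors compared by cosine similarity;
# names and the deletion rule are illustrative, not taken from the patent text.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def prune_registered_speakers(first_feature: np.ndarray,
                              registry: dict[str, np.ndarray],
                              threshold: float = 0.3) -> dict[str, np.ndarray]:
    """Keep only registered speakers whose stored (second) feature quantity is
    sufficiently similar to the feature extracted from the current segment."""
    return {speaker_id: second_feature
            for speaker_id, second_feature in registry.items()
            if cosine_similarity(first_feature, second_feature) > threshold}

# Toy usage: the similar enrolled speaker survives, the dissimilar one is removed.
registry = {"alice": np.array([1.0, 0.0, 0.2]), "bob": np.array([-0.9, 0.1, 0.0])}
segment_feature = np.array([0.95, 0.05, 0.15])
print(list(prune_registered_speakers(segment_feature, registry)))  # ['alice']
```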
  • Publication number: 20140022184
    Abstract: The recognition of user input to a computing device is enhanced. The user input is speech, handwriting data input by the user making screen-contacting gestures, a combination of one or more prescribed words spoken by the user and one or more prescribed screen-contacting gestures made by the user, or a combination of one or more prescribed words spoken by the user and one or more prescribed non-screen-contacting gestures made by the user.
    Type: Application
    Filed: July 20, 2012
    Publication date: January 23, 2014
    Applicant: MICROSOFT CORPORATION
    Inventors: Steven Bathiche, Anoop Gupta
  • Publication number: 20130317827
    Abstract: A computer-implemented system includes one or more application devices and a voice-controlled storage device. Multiple voice commands may be issued to multiple application devices simultaneously or separately, or to the same application device separately. The voice-controlled storage device is configured to perform content identification and voiceprint recognition on the voice commands. Therefore, each requestor may be allowed to operate the voice-controlled storage device in a corresponding operation mode according to the requestor's authorization level.
    Type: Application
    Filed: May 23, 2012
    Publication date: November 28, 2013
    Inventors: Tsung-Chun Fu, I-Ming Lo
  • Publication number: 20130268273
    Abstract: A method of recognizing gender or age of a speaker according to speech emotion or arousal includes the following steps of A) segmenting speech signals into a plurality of speech segments; B) fetching the first speech segment from the plural speech segments to further acquire at least one of emotional features or arousal degree in the speech segment; C) determining whether at least one of the emotional feature and the arousal degree conforms to a predetermined condition; if so, proceeding to step D); if not, returning to step B) to fetch the next speech segment; D) fetching the feature indicative of gender or age from the speech segment to further acquire at least one feature parameter; and E) recognizing the at least one feature parameter to further determine the gender or age of the speaker at the currently-processed speech segment.
    Type: Application
    Filed: July 27, 2012
    Publication date: October 10, 2013
    Inventors: Oscal Tzyh-Chiang Chen, Ping-Tsung Lu, Jia-You Ke
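A rough Python sketch of the gated pipeline in steps A) through E) above, assuming placeholder callables for the arousal estimate, the gender/age feature extractor, and the classifier; none of these stand-ins come from the filing.

```python
# Sketch of the gating loop in steps A-E: only segments whose arousal/emotion
# pass a condition are forwarded to gender/age feature extraction.
# The feature extractors and classifier below are illustrative stand-ins.
from typing import Callable, Iterable, Optional

def recognize_gated(segments: Iterable[list[float]],
                    arousal_of: Callable[[list[float]], float],
                    gender_age_features: Callable[[list[float]], list[float]],
                    classify: Callable[[list[float]], str],
                    arousal_threshold: float = 0.5) -> list[Optional[str]]:
    results = []
    for segment in segments:                           # step A: already segmented
        if arousal_of(segment) < arousal_threshold:    # step C: condition check
            results.append(None)                       # condition not met: next segment
            continue
        features = gender_age_features(segment)        # step D: feature extraction
        results.append(classify(features))             # step E: recognition
    return results

# Toy usage with trivial stand-ins.
segments = [[0.1, 0.2], [0.9, 0.8]]
out = recognize_gated(
    segments,
    arousal_of=lambda s: sum(abs(x) for x in s) / len(s),   # crude "arousal"
    gender_age_features=lambda s: s,
    classify=lambda f: "adult female" if f[0] > 0.5 else "unknown",
)
print(out)  # [None, 'adult female']
```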
  • Publication number: 20130173266
    Abstract: A voice analyzer includes a first voice acquisition unit provided in a place where a distance of a sound wave propagation path from a mouth of a user is a first distance, plural second voice acquisition units provided in places where distances of sound wave propagation paths from the mouth of the user are smaller than the first distance, and an identification unit that identifies whether the voices acquired by the first and second voice acquisition units are voices of the user or voices of others excluding the user on the basis of a result of comparison between first sound pressure of a voice signal of the voice acquired by the first voice acquisition unit and second sound pressure calculated from sound pressure of a voice signal of the voice acquired by each of the plural second voice acquisition units.
    Type: Application
    Filed: May 7, 2012
    Publication date: July 4, 2013
    Applicant: FUJI XEROX CO., LTD.
    Inventors: Yohei NISHINO, Haruo HARADA, Kei SHIMOTANI, Hirohito YONEYAMA, Kiyoshi IIDA, Akira FUJII
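A simplified Python sketch of the sound-pressure comparison described above, assuming RMS amplitude as the pressure estimate and a fixed near-to-far ratio threshold for attributing a voice to the wearer; both are illustrative assumptions.

```python
# Sketch of the own-voice decision from sound-pressure comparison: a voice
# whose pressure at the near (second) microphones exceeds the pressure at the
# far (first) microphone by a sufficient ratio is attributed to the wearer.
# The RMS estimate and the ratio threshold are illustrative assumptions.
import numpy as np

def rms_pressure(signal: np.ndarray) -> float:
    return float(np.sqrt(np.mean(np.square(signal))))

def is_wearers_voice(far_mic: np.ndarray,
                     near_mics: list[np.ndarray],
                     ratio_threshold: float = 2.0) -> bool:
    first_pressure = rms_pressure(far_mic)                          # first distance (farther from mouth)
    second_pressure = np.mean([rms_pressure(m) for m in near_mics]) # closer to the mouth
    return second_pressure / max(first_pressure, 1e-12) >= ratio_threshold

# Toy usage: the wearer's own voice is much louder at the near microphones.
near = [np.array([0.8, -0.7, 0.9]), np.array([0.7, -0.8, 0.85])]
far = np.array([0.2, -0.15, 0.25])
print(is_wearers_voice(far, near))  # True
```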
  • Publication number: 20130144619
    Abstract: Techniques for ability enhancement are described. Some embodiments provide an ability enhancement facilitator system (“AEFS”) configured to enhance voice conferencing among multiple speakers. In one embodiment, the AEFS receives data that represents utterances of multiple speakers who are engaging in a voice conference with one another. The AEFS then determines speaker-related information, such as by identifying a current speaker, locating an information item (e.g., an email message, document) associated with the speaker, or the like. The AEFS then informs a user of the speaker-related information, such as by presenting the speaker-related information on a display of a conferencing device associated with the user.
    Type: Application
    Filed: January 23, 2012
    Publication date: June 6, 2013
    Inventors: Richard T. Lord, Robert W. Lord, Nathan P. Myhrvold, Clarence T. Tegreene, Roderick A. Hyde, Lowell L. Wood, JR., Muriel Y. Ishikawa, Victoria Y.H. Wood, Charles Whitmer, Paramvir Bahl, Douglas C. Burger, Ranveer Chandra, William H. Gates, III, Paul Holman, Jordin T. Kare, Craig J. Mundie, Tim Paek, Desney S. Tan, Lin Zhong, Matthew G. Dyor
  • Publication number: 20130046538
    Abstract: A method implemented in a computer infrastructure having computer executable code having programming instructions tangibly embodied on a computer readable storage medium. The programming instructions are operable to receive a current waveform of a communication between a plurality of participants. Additionally, the programming instructions are operable to create a voiceprint from the current waveform if the current waveform is of a human voice. Furthermore, the programming instructions are operable to determine one of whether a match exists between the voiceprint and one library waveform of one or more library waveforms, whether a correlation exists between the voiceprint and a number of library waveforms of the one or more library waveforms and whether the voiceprint is unique.
    Type: Application
    Filed: October 19, 2012
    Publication date: February 21, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: International Business Machines Corporation
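A hedged Python sketch of the three-way decision in the abstract above (a match with one library waveform, a correlation with several, or a unique voiceprint), assuming vector voiceprints, cosine similarity, and two illustrative thresholds not taken from the filing.

```python
# Sketch of the three-way decision: after a voiceprint is built from the
# current waveform, it either matches one library waveform, correlates with
# several, or is unique. The similarity measure and thresholds are
# illustrative assumptions.
import numpy as np

def classify_voiceprint(voiceprint: np.ndarray,
                        library: dict[str, np.ndarray],
                        match_threshold: float = 0.9,
                        correlation_threshold: float = 0.6) -> tuple[str, list[str]]:
    def similarity(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    scores = {name: similarity(voiceprint, wf) for name, wf in library.items()}
    matches = [n for n, s in scores.items() if s >= match_threshold]
    correlated = [n for n, s in scores.items()
                  if correlation_threshold <= s < match_threshold]

    if len(matches) == 1 and not correlated:
        return "match", matches
    if matches or correlated:
        return "correlated", matches + correlated
    return "unique", []

# Toy usage with two library waveforms represented as vectors.
library = {"agent_a": np.array([0.9, 0.1, 0.3]), "agent_b": np.array([0.0, 1.0, 0.1])}
print(classify_voiceprint(np.array([0.88, 0.12, 0.28]), library))  # ('match', ['agent_a'])
```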
  • Publication number: 20130030809
    Abstract: One aspect includes determining validity of an identity asserted by a speaker using a voice print associated with a user whose identity the speaker is asserting, the voice print obtained from characteristic features of at least one first voice signal obtained from the user uttering at least one enrollment utterance including at least one enrollment word by obtaining a second voice signal of the speaker uttering at least one challenge utterance that includes at least one word not in the at least one enrollment utterance, obtaining at least one characteristic feature from the second voice signal, comparing the at least one characteristic feature with at least a portion of the voice print to determine a similarity between the at least one characteristic feature and the at least a portion of the voice print, and determining whether the speaker is the user based, at least in part, on the similarity.
    Type: Application
    Filed: September 14, 2012
    Publication date: January 31, 2013
    Applicant: Nuance Communications, Inc.
    Inventors: Kevin R. Farrell, David A. James, William F. Ganong, III, Jerry K. Carter
  • Publication number: 20120239400
    Abstract: A speaker or a set of speakers can be recognized with high accuracy even when multiple speakers and the relationships between speakers change over time. A device comprises a speaker model derivation means for deriving a speaker model that defines a voice property per speaker from speech data made of multiple utterances to which speaker labels, as information for identifying a speaker, are given; a speaker co-occurrence model derivation means for, by use of the speaker model derived by the speaker model derivation means, deriving a speaker co-occurrence model indicating the strength of a co-occurrence relationship between speakers from session data, which is speech data divided into units of a series of conversation; and a model structure update means for detecting predefined events with reference to a session of newly-added speech data and, when a predefined event is detected, updating the structure of at least one of the speaker model and the speaker co-occurrence model.
    Type: Application
    Filed: October 21, 2010
    Publication date: September 20, 2012
    Applicant: NEC Corporation
    Inventor: Takafumi Koshinaka
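A toy Python sketch of a speaker co-occurrence model, assuming co-occurrence strength is estimated from pair counts over sessions and that the "predefined event" triggering a structure update is the appearance of an unseen speaker label; both are illustrative simplifications, not the estimation method of the filing.

```python
# Sketch of a speaker co-occurrence model maintained alongside per-speaker
# models: co-occurrence strength comes from how often labelled speakers appear
# together in a session, and the structure is extended when a previously
# unseen speaker label shows up in newly added session data.
from collections import defaultdict
from itertools import combinations

class SpeakerCooccurrenceModel:
    def __init__(self):
        self.speakers: set[str] = set()
        self.pair_counts: dict[frozenset, int] = defaultdict(int)

    def update_with_session(self, session_labels: list[str]) -> None:
        present = set(session_labels)
        new_speakers = present - self.speakers     # predefined event: unseen speaker
        if new_speakers:
            self.speakers |= new_speakers          # structure update: add speakers
        for pair in combinations(sorted(present), 2):
            self.pair_counts[frozenset(pair)] += 1

    def cooccurrence_strength(self, a: str, b: str) -> float:
        total = sum(self.pair_counts.values()) or 1
        return self.pair_counts[frozenset((a, b))] / total

# Toy usage over two sessions of labelled speech data.
model = SpeakerCooccurrenceModel()
model.update_with_session(["alice", "bob", "alice"])
model.update_with_session(["alice", "carol"])
print(model.cooccurrence_strength("alice", "bob"))  # 0.5
```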
  • Publication number: 20120239398
    Abstract: In one aspect, a method for determining a validity of an identity asserted by a speaker using a voice print is provided. The method comprises acts of performing a first verification stage comprising comparing a first voice signal from the speaker uttering at least one first challenge utterance with at least a portion of the voice print and performing a second verification stage if it is concluded in the first verification stage that the first voice signal was obtained from an utterance by the user. The second verification stage comprises adapting at least one parameter of the voice print based, at least in part, on the first voice signal to obtain an adapted voice print, and comparing a second voice signal from the speaker uttering at least one second challenge utterance with at least a portion of the adapted voice print.
    Type: Application
    Filed: April 9, 2012
    Publication date: September 20, 2012
    Applicant: Nuance Communications, Inc.
    Inventors: Kevin R. Farrell, David A. James, William F. Ganong, III, Jerry K. Carter
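A compact Python sketch of the two-stage verification with adaptation described above, assuming vector voice prints, cosine similarity, and a weighted-mean adaptation step; the threshold and adaptation rate are illustrative assumptions.

```python
# Sketch of the two-stage check: the first challenge is scored against the
# enrolled voice print; only if it passes is the print adapted toward the
# first signal (here a simple weighted mean, an illustrative choice) and the
# second challenge scored against the adapted print.
import numpy as np

def similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def two_stage_verify(voice_print: np.ndarray,
                     first_signal_features: np.ndarray,
                     second_signal_features: np.ndarray,
                     threshold: float = 0.8,
                     adaptation_rate: float = 0.2) -> bool:
    # First verification stage against the original voice print.
    if similarity(first_signal_features, voice_print) < threshold:
        return False
    # Adapt at least one parameter of the print using the first signal.
    adapted_print = (1 - adaptation_rate) * voice_print + adaptation_rate * first_signal_features
    # Second verification stage against the adapted voice print.
    return similarity(second_signal_features, adapted_print) >= threshold

# Toy usage: both challenge utterances resemble the enrolled print.
enrolled = np.array([1.0, 0.5, 0.0])
print(two_stage_verify(enrolled, np.array([0.9, 0.55, 0.05]), np.array([0.95, 0.5, 0.1])))  # True
```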
  • Publication number: 20120084087
    Abstract: A method, device, and system for speaker recognition are provided. The method includes: receiving a Speaker Verification instruction sent from a Media Gateway Controller (MGC) (101); executing a speaker verification operation according to the Speaker Verification instruction, and obtaining a result of the speaker verification operation (102); and reporting the result of the speaker verification operation to the MGC (103).
    Type: Application
    Filed: December 12, 2011
    Publication date: April 5, 2012
    Applicant: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Weiwei YANG, Ning ZHU
  • Publication number: 20120078639
    Abstract: Systems, computer-implemented methods, and tangible computer-readable media are provided for voice authentication. The method includes receiving a speech sample from a user through an Internet browser for authentication as part of a request for a restricted-access resource, performing a comparison of the received speech sample to a previously established speech profile associated with the user, transmitting an authentication to the network client if the comparison is equal to or greater than a certainty threshold, and transmitting a denial to the network client if the comparison is less than the certainty threshold.
    Type: Application
    Filed: December 5, 2011
    Publication date: March 29, 2012
    Applicant: AT&T Intellectual Property I, L.P.
    Inventor: Saurabh KUMAR
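A Python sketch of the request flow in the abstract above, assuming a placeholder scoring function and a simple response record; the names and the certainty threshold are illustrative, not taken from the filing.

```python
# Sketch of the decision flow: a speech sample arrives with a request for a
# restricted-access resource, is scored against the user's stored speech
# profile, and an authentication or denial is returned depending on a
# certainty threshold. The scoring function is a placeholder assumption.
from dataclasses import dataclass
from typing import Callable

@dataclass
class AuthResponse:
    user_id: str
    granted: bool
    certainty: float

def authenticate_request(user_id: str,
                         speech_sample: bytes,
                         profiles: dict[str, bytes],
                         score: Callable[[bytes, bytes], float],
                         certainty_threshold: float = 0.75) -> AuthResponse:
    profile = profiles[user_id]                   # previously established speech profile
    certainty = score(speech_sample, profile)     # comparison step
    return AuthResponse(user_id, certainty >= certainty_threshold, certainty)

# Toy usage with a byte-overlap "score" standing in for real voice comparison.
toy_score = lambda sample, profile: len(set(sample) & set(profile)) / max(len(set(profile)), 1)
profiles = {"u42": b"voiceprint-bytes"}
print(authenticate_request("u42", b"voiceprint-bytes", profiles, toy_score))
```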
  • Publication number: 20110275348
    Abstract: A system for unlocking a mobile device including: an input module configured to receive voice information; an output module configured to output an unlock signal; a processor connected to the input module and output module and configured to: receive the voice information from the input module; analyze the voice information; determine if the analyzed voice information matches with a predetermined voice profile; and if there is a match, send an unlock signal to the mobile device via an output module. A method for unlocking a mobile device including: receiving voice information from the mobile device; analyzing the voice information; determining if the analyzed voice information matches with a predetermined voice profile; and if there is a match, sending an unlock signal to the mobile device.
    Type: Application
    Filed: June 15, 2009
    Publication date: November 10, 2011
    Applicant: BCE INC.
    Inventors: David Clark, Jonathan Arsenault, Stephane Fortier
  • Publication number: 20100217594
    Abstract: In a system in which a business organization authenticates a user by speaker recognition, the system obviates the need for the user to register a speaker model by uttering speech for each business partner. A user device 20 has voice input unit 24, speaker model preparation unit 21, storage unit 22, and communication unit 23 that transfers a speaker model. A business-organization-side management device 40 has speaker model acquisition unit 41, storage unit 42, and communication unit 43. A business-organization-side speaker recognition device 50 has speaker model acquisition unit 51, speaker model registration unit 52, characteristic acquisition unit 53, and speaker recognition unit 54. The user prepares and retains a speaker model used for speaker recognition by use of the user device 20.
    Type: Application
    Filed: May 7, 2010
    Publication date: August 26, 2010
    Applicant: PANASONIC CORPORATION
    Inventors: Yuki Sawada, Kazuhiro Watada, Kota Yasunaga, Hiroyuki Sakate
  • Publication number: 20090055167
    Abstract: Disclosed is a method for providing translation service using a mobile communication terminal. The method includes a button input step of pressing a voice recognition key to use a voice recognition function, a menu screen provision step of selecting a translator menu item, a translation recognition method determination step of selecting a sentence input method or a word input method, a Korean input step of inputting Korean, a confirmation step of confirming whether a completed Korean sentence matches an intended sentence, and a translated sentence output step of providing a relevant translated sentence in a text form and reproducing the relevant translated sentence in a voice form.
    Type: Application
    Filed: March 15, 2006
    Publication date: February 26, 2009
    Inventor: Seok-yong Moon
  • Publication number: 20080177539
    Abstract: A method of processing voice signals suitable for enhancing the speech discrimination ability of a hearing impaired person is disclosed. First, a voice signal is received, and the received voice signal is divided into a plurality of voice frames. A frequency spectrum analysis is conducted on one of the voice frames to estimate the effective bandwidth of the voice frame. Next, a frequency transposition process is performed on the voice signal so as to suit the auditory sensation bandwidth of a hearing impaired person. In addition, an energy compensation process is performed on the voice frame after the frequency transposition process so as to compensate for the energy reduction caused by the frequency transposition.
    Type: Application
    Filed: September 16, 2007
    Publication date: July 24, 2008
    Applicant: INDUSTRIAL TECHNOLOGY RESEARCH INSTITUTE
    Inventors: Tai-Huei Huang, Po-Kai Huang
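A Python sketch of per-frame frequency transposition with energy compensation, assuming a linear compression of spectral bins into the target bandwidth and RMS-based energy restoration; the filing does not specify these particular choices.

```python
# Sketch of per-frame frequency transposition with energy compensation:
# the frame's spectrum is linearly compressed into a narrower target
# bandwidth and the result is rescaled so its energy matches the original
# frame. The linear bin remapping is an illustrative transposition scheme.
import numpy as np

def transpose_frame(frame: np.ndarray, sample_rate: int,
                    effective_bandwidth_hz: float, target_bandwidth_hz: float) -> np.ndarray:
    spectrum = np.fft.rfft(frame)
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    compressed = np.zeros_like(spectrum)
    ratio = target_bandwidth_hz / effective_bandwidth_hz      # e.g. 4 kHz -> 2 kHz
    bin_spacing = freqs[1] if len(freqs) > 1 else 1.0
    for i, f in enumerate(freqs):
        if f <= effective_bandwidth_hz:
            j = int(round(f * ratio / bin_spacing))
            compressed[j] += spectrum[i]                      # move spectral content downward
    out = np.fft.irfft(compressed, n=len(frame))
    # Energy compensation: restore the RMS level lost in the transposition.
    original_rms = np.sqrt(np.mean(frame ** 2))
    out_rms = np.sqrt(np.mean(out ** 2))
    return out * (original_rms / out_rms) if out_rms > 0 else out

# Toy usage: a 1 kHz tone at 16 kHz sampling, compressed into a 2 kHz band.
t = np.arange(512) / 16000
frame = np.sin(2 * np.pi * 1000 * t)
shifted = transpose_frame(frame, 16000, effective_bandwidth_hz=4000, target_bandwidth_hz=2000)
print(round(float(np.sqrt(np.mean(shifted ** 2))), 3), round(float(np.sqrt(np.mean(frame ** 2))), 3))
```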
  • Publication number: 20070244690
    Abstract: The present invention relates to a method, a text segmentation system, and a computer program product for clustering text into text clusters, each representing a distinct semantic meaning. The text clustering method identifies text portions and assigns them to different clusters in such a way that each text cluster refers to one or several semantic topics. The clustering method incorporates an optimization procedure based on a re-clustering procedure that evaluates a target function indicative of the correlation between a text unit and a cluster. The text clustering method makes use of a text emission model and a cluster transition model, and further uses various smoothing techniques.
    Type: Application
    Filed: November 11, 2004
    Publication date: October 18, 2007
    Applicant: KONINKLIJKE PHILIPS ELECTRONICS, N.V.
    Inventor: Jochen Peters
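A schematic Python sketch of the re-clustering procedure described above, assuming a smoothed unigram emission model, a cluster transition model over consecutive text units, and a log-probability target function; all modelling choices here are illustrative, not the method claimed.

```python
# Sketch of the re-clustering idea: each text unit is (re)assigned to the
# cluster that maximizes a target function combining a smoothed text emission
# model (word likelihood under the cluster) and a cluster transition model
# (likelihood of following the previous unit's cluster). Add-one smoothing
# and the unigram emission model are illustrative choices.
import math
from collections import Counter, defaultdict

def recluster(units: list[list[str]], n_clusters: int, n_iterations: int = 10) -> list[int]:
    assignments = [i % n_clusters for i in range(len(units))]   # arbitrary initialisation
    vocabulary = {w for unit in units for w in unit}

    for _ in range(n_iterations):
        # Re-estimate the emission model (word counts per cluster, add-one smoothed).
        word_counts = [Counter() for _ in range(n_clusters)]
        for unit, c in zip(units, assignments):
            word_counts[c].update(unit)
        # Re-estimate the transition model between consecutive units' clusters.
        transitions = defaultdict(lambda: 1)                    # add-one smoothing
        for prev, cur in zip(assignments, assignments[1:]):
            transitions[(prev, cur)] += 1

        def target(unit_index: int, cluster: int) -> float:
            total = sum(word_counts[cluster].values()) + len(vocabulary)
            emission = sum(math.log((word_counts[cluster][w] + 1) / total)
                           for w in units[unit_index])
            if unit_index == 0:
                return emission
            prev_cluster = assignments[unit_index - 1]
            row_total = sum(transitions[(prev_cluster, c)] for c in range(n_clusters))
            return emission + math.log(transitions[(prev_cluster, cluster)] / row_total)

        assignments = [max(range(n_clusters), key=lambda c: target(i, c))
                       for i in range(len(units))]
    return assignments

# Toy usage: prints one cluster index per text unit.
units = [["stock", "market"], ["market", "shares"], ["football", "goal"], ["goal", "match"]]
print(recluster(units, n_clusters=2))
```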