Speaker Identification Or Verification (epo) Patents (Class 704/E17.001)
  • Patent number: 11960584
    Abstract: A system is provided for fraud prevention upscaling with a fraudster voice print watchlist. The system includes a processor and a computer readable medium operably coupled thereto, to perform fraud prevention operations which include receiving a first voice print of a user during a voice authentication request, accessing the fraudster voice print watchlist comprising voice print representatives for a plurality of voice print clusters each having one or more of a plurality of voice prints identified as fraudulent for a voice biometric system, determining that one or more of the voice print representatives in the fraudster voice print watchlist meets or exceeds a first biometric threshold for risk detection of the first voice print during the fraud prevention operations, and determining whether the first voice print matches a first one of the plurality of voice print clusters.
    Type: Grant
    Filed: September 2, 2021
    Date of Patent: April 16, 2024
    Assignee: NICE LTD.
    Inventors: Michael Fainstein, Roman Frenkel
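A minimal sketch of the watchlist-matching step the abstract above (11960584) describes: a query voice print is scored against per-cluster representative prints, and a cluster is flagged when the score meets or exceeds a biometric risk threshold. The use of cosine similarity, the 0.75 threshold, the 256-dimension embeddings, and all names are illustrative assumptions, not details taken from the patent.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two voice-print embeddings."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def match_watchlist(query_print: np.ndarray,
                    cluster_representatives: dict[str, np.ndarray],
                    risk_threshold: float = 0.75) -> list[tuple[str, float]]:
    """Return fraud clusters whose representative print meets or exceeds the threshold."""
    hits = []
    for cluster_id, representative in cluster_representatives.items():
        score = cosine(query_print, representative)
        if score >= risk_threshold:
            hits.append((cluster_id, score))
    return sorted(hits, key=lambda item: -item[1])

# Toy usage with random embeddings standing in for real voice prints.
rng = np.random.default_rng(0)
watchlist = {f"cluster_{i}": rng.normal(size=256) for i in range(5)}
query = watchlist["cluster_2"] + 0.1 * rng.normal(size=256)   # near a known fraud cluster
print(match_watchlist(query, watchlist))
```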
  • Patent number: 11955122
    Abstract: Techniques for determining whether audio is machine-outputted or non-machine-outputted are described. A device may receive audio, may process the audio to determine audio data including audio features corresponding to the audio, and may process the audio data to determine audio embedding data. The device may process the audio embedding data to determine whether the audio is machine-outputted or non-machine-outputted. In response to determining that the audio is machine-outputted, the audio may be discarded or not processed further. Alternatively, in response to determining that the audio is non-machine-outputted (e.g., live speech from a user), the audio may be processed further (e.g., using ASR processing).
    Type: Grant
    Filed: September 28, 2021
    Date of Patent: April 9, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Mansour Ahmadi, Udhgee Murugesan, Roger Hau-Bin Cheng, Roberto Barra Chicote, Kian Jamali Abianeh, Yixiong Meng, Oguz Hasan Elibol, Itay Teller, Kevin Kwanghoon Ha, Andrew Roths
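A rough sketch, under heavy simplification, of the gating flow described above for 11955122: derive an embedding from the audio, score it with a binary classifier, and discard machine-outputted audio instead of passing it on to ASR. The spectral-statistics "embedding", the logistic scorer, and every name here are assumptions for illustration; the patent's actual models are not specified in the abstract.

```python
import numpy as np

def embed_audio(samples: np.ndarray) -> np.ndarray:
    """Stand-in for a learned audio-embedding network: crude spectral statistics."""
    spectrum = np.abs(np.fft.rfft(samples))
    return np.array([spectrum.mean(), spectrum.std(), spectrum.argmax() / len(spectrum)])

def is_machine_outputted(embedding: np.ndarray, weights: np.ndarray,
                         bias: float, threshold: float = 0.5) -> bool:
    """Toy logistic gate; a real system would use a trained classifier."""
    probability = 1.0 / (1.0 + np.exp(-(embedding @ weights + bias)))
    return probability >= threshold

def gate(samples: np.ndarray, weights: np.ndarray, bias: float):
    """Discard machine-outputted audio; pass non-machine audio on for ASR processing."""
    if is_machine_outputted(embed_audio(samples), weights, bias):
        return None                      # discarded, not processed further
    return samples                       # would be handed to ASR here

audio = np.random.default_rng(0).normal(size=16000)
kept = gate(audio, weights=np.array([0.2, 0.1, -0.3]), bias=0.0)
print("discarded" if kept is None else "forwarded to ASR")
```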
  • Patent number: 11842741
    Abstract: A feature vector having high class identification capability is generated. A signal processing system is provided with: a first generation unit for generating a first feature vector on the basis of one of time-series voice data, meteorological data, sensor data, and text data, or on the basis of a feature quantity of one of these; a weight calculation unit for calculating a weight for the first feature vector; a statistical amount calculation unit for calculating a weighted average vector and a weighted high-order statistical vector of second or higher order using the first feature vector and the weight; and a second generation unit for generating a second feature vector using the weighted high-order statistical vector.
    Type: Grant
    Filed: March 13, 2019
    Date of Patent: December 12, 2023
    Assignee: NEC CORPORATION
    Inventors: Koji Okabe, Takafumi Koshinaka
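The NEC abstract above (11842741) is, in essence, weighted statistics pooling: frame-level feature vectors are combined into a weighted average plus a weighted second-or-higher-order statistic, and the result feeds a second, utterance-level feature vector. Below is a minimal sketch of the second-order case, assuming the per-frame weights already come from some weight-calculation module (omitted here); shapes and names are illustrative.

```python
import numpy as np

def weighted_stats_pooling(frames: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """frames: (T, D) first feature vectors; weights: (T,) non-negative frame weights.
    Returns a pooled vector: weighted mean concatenated with weighted standard
    deviation (a second-order weighted statistic)."""
    w = weights / weights.sum()
    mean = (w[:, None] * frames).sum(axis=0)                  # weighted average vector
    var = (w[:, None] * (frames - mean) ** 2).sum(axis=0)     # weighted 2nd-order statistic
    std = np.sqrt(np.maximum(var, 1e-12))
    return np.concatenate([mean, std])

T, D = 200, 40
rng = np.random.default_rng(1)
frames = rng.normal(size=(T, D))
scores = rng.normal(size=T)                       # would come from a weight-calculation unit
weights = np.exp(scores) / np.exp(scores).sum()   # softmax so the weights sum to one
print(weighted_stats_pooling(frames, weights).shape)   # (80,): 40-dim mean + 40-dim std
```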
  • Patent number: 11830502
    Abstract: An electronic device is provided. The electronic device includes a microphone to receive audio, a communicator, a memory configured to store computer-executable instructions, and a processor configured to execute the computer-executable instructions. The processor is configured to determine whether the received audio includes a predetermined trigger word; based on determining that the predetermined trigger word is included in the received audio, activate a speech recognition function of the electronic device; detect a movement of a user while the speech recognition function is activated; and based on detecting the movement of the user, transmit a control signal to a second electronic device to activate a speech recognition function of the second electronic device.
    Type: Grant
    Filed: November 21, 2022
    Date of Patent: November 28, 2023
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Kwangyoun Kim, Kyungmin Lee, Youngho Han, Sungsoo Kim, Sichen Jin, Jisun Park, Yeaseul Song, Jaewon Lee
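A toy control-flow sketch of the hand-off the Samsung abstract above describes: the trigger word activates local recognition, and detected user movement causes a control signal to be sent so that a second device also activates its recognition. The trigger phrase, the data model, and the in-process "control signal" are all assumptions; real devices would use a network transport.

```python
from dataclasses import dataclass

@dataclass
class Device:
    name: str
    speech_recognition_active: bool = False

def send_control_signal(second_device: Device) -> None:
    """Stand-in for the control signal sent to the second electronic device."""
    second_device.speech_recognition_active = True

def on_audio(transcript: str, this_device: Device, trigger_word: str = "hello device") -> None:
    """Activate local speech recognition when the (assumed) trigger word is heard."""
    if trigger_word in transcript.lower():
        this_device.speech_recognition_active = True

def on_user_movement_detected(this_device: Device, second_device: Device) -> None:
    """While recognition is active, hand off to a nearby second device on movement."""
    if this_device.speech_recognition_active:
        send_control_signal(second_device)

phone, tv = Device("phone"), Device("living-room TV")
on_audio("hello device, play some music", phone)
on_user_movement_detected(phone, tv)
print(tv.speech_recognition_active)   # True: the second device was told to start listening
```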
  • Patent number: 11818557
    Abstract: A spatial normalization unit generates a normalized spectrum by normalizing an orientation component of a microphone array for a target direction included in a spectrum of an acoustic signal acquired from each of a plurality of microphones forming the microphone array into an orientation component for a predetermined standard direction. A mask function estimating unit determines a mask function used for extracting a component of a target sound source arriving in the target direction on the basis of the normalized spectrum using a machine learning model. A mask processing unit estimates the component of the target sound source arriving in the target direction by applying the mask function to the acoustic signal.
    Type: Grant
    Filed: February 22, 2022
    Date of Patent: November 14, 2023
    Assignees: HONDA MOTOR CO., LTD., OSAKA UNIVERSITY
    Inventors: Kazuhiro Nakadai, Ryu Takeda
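A compact sketch of the two stages named in the Honda/Osaka abstract above: (1) spatially normalize a multichannel spectrum by replacing the phase pattern of the target direction with that of a standard direction, and (2) apply a time-frequency mask to keep the target source. Free-field steering vectors stand in for the array's true orientation components, and the mask is taken as given rather than produced by a machine-learning model; all names are illustrative.

```python
import numpy as np

def steering_vector(freqs: np.ndarray, mic_positions: np.ndarray,
                    direction: np.ndarray, c: float = 343.0) -> np.ndarray:
    """Far-field steering vector per frequency: phase delays toward a unit direction."""
    delays = mic_positions @ direction / c                               # (M,) seconds
    return np.exp(-2j * np.pi * freqs[:, None] * delays[None, :])        # (F, M)

def spatially_normalize(spectrum: np.ndarray, freqs: np.ndarray, mic_positions: np.ndarray,
                        target_dir: np.ndarray, standard_dir: np.ndarray) -> np.ndarray:
    """Map the orientation component for the target direction onto a standard direction."""
    a_target = steering_vector(freqs, mic_positions, target_dir)
    a_standard = steering_vector(freqs, mic_positions, standard_dir)
    return spectrum * np.conj(a_target) * a_standard      # spectrum: (F, M), one STFT frame

def apply_mask(spectrum: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Keep only the time-frequency components attributed to the target source."""
    return mask[:, None] * spectrum                        # mask: (F,) values in [0, 1]

freqs = np.linspace(0, 8000, 257)
mics = np.array([[0.00, 0.0, 0.0], [0.05, 0.0, 0.0], [0.10, 0.0, 0.0]])   # 3-mic line array
frame = np.random.default_rng(0).normal(size=(257, 3)) + 0j
normalized = spatially_normalize(frame, freqs, mics,
                                 target_dir=np.array([1.0, 0.0, 0.0]),
                                 standard_dir=np.array([0.0, 1.0, 0.0]))
print(apply_mask(normalized, mask=np.ones(257)).shape)    # (257, 3)
```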
  • Patent number: 11798531
    Abstract: A speech recognition method, a speech recognition apparatus, and a method and an apparatus for training a speech recognition model are provided. The speech recognition method includes: recognizing a target word speech from a hybrid speech, and obtaining, as an anchor extraction feature of a target speech, an anchor extraction feature of the target word speech based on the target word speech; obtaining a mask of the target speech according to the anchor extraction feature of the target speech; and recognizing the target speech according to the mask of the target speech.
    Type: Grant
    Filed: October 22, 2020
    Date of Patent: October 24, 2023
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Jun Wang, Dan Su, Dong Yu
  • Patent number: 11783838
    Abstract: An electronic device is provided. The electronic device includes a microphone to receive audio, a communicator, a memory configured to store computer-executable instructions, and a processor configured to execute the computer-executable instructions. The processor is configured to determine whether the received audio includes a predetermined trigger word; based on determining that the predetermined trigger word is included in the received audio, activate a speech recognition function of the electronic device; detect a movement of a user while the speech recognition function is activated; and based on detecting the movement of the user, transmit a control signal to a second electronic device to activate a speech recognition function of the second electronic device.
    Type: Grant
    Filed: November 21, 2022
    Date of Patent: October 10, 2023
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Kwangyoun Kim, Kyungmin Lee, Youngho Han, Sungsoo Kim, Sichen Jin, Jisun Park, Yeaseul Song, Jaewon Lee
  • Patent number: 11769518
    Abstract: A sound analysis system includes: first and second sound pressure acquisition means for respectively acquiring a sound pressure of a voice of a user and disposed in an equipment worn by the user at positions that differ in a distance from a mouth of the user under the state in which the user is wearing the equipment; distance estimation means for estimating a distance between either the first or the second sound pressure acquisition means and the mouth of the user based on the acquired sound pressures; and sound pressure correction means for calculating a difference between a reference value of the distance between the first or the second sound pressure acquisition means and the mouth of the user and the estimated distance, and correcting at least one of the sound pressures acquired by the first and the second sound pressure acquisition means based on the calculated difference.
    Type: Grant
    Filed: August 23, 2021
    Date of Patent: September 26, 2023
    Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA
    Inventors: Eiji Mitsuda, Hikaru Sugata
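The correction in the Toyota abstract above amounts to referring a measured level back to a reference mouth-to-microphone distance once the actual distance has been estimated. Here is a one-line sketch assuming simple free-field 1/r decay (about 6 dB per doubling of distance); the patent itself does not commit to this particular decay law, and the numbers below are made up.

```python
import math

def correct_sound_pressure(measured_level_db: float,
                           estimated_distance_m: float,
                           reference_distance_m: float) -> float:
    """Refer a measured level back to the reference distance, assuming 1/r decay."""
    return measured_level_db + 20.0 * math.log10(estimated_distance_m / reference_distance_m)

# Example: the mic drifted from its 5 cm reference position out to 8 cm.
print(round(correct_sound_pressure(72.0, 0.08, 0.05), 1))   # 76.1 dB referred back to 5 cm
```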
  • Patent number: 11763838
    Abstract: A voice recognition device includes a plurality of mics disposed toward different directions and a processor connected with the plurality of mics, wherein the processor is configured to determine, in a setup mode, a direction of a first sound received through the plurality of mics; set a non-detecting zone, which includes the direction of the first sound; determine, in a normal mode, a direction of a second sound received through the plurality of mics; and skip voice recognition for the second sound or an operation based on the voice recognition depending on whether the direction of the second sound belongs to the non-detecting zone.
    Type: Grant
    Filed: June 14, 2021
    Date of Patent: September 19, 2023
    Assignee: HANWHA TECHWIN CO., LTD.
    Inventor: Kyoungjeon Jeong
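A small sketch of the zone logic in the Hanwha abstract above: a direction learned in the setup mode (for example, the bearing of a television) defines a non-detecting zone, and later sounds arriving from inside that zone simply skip voice recognition. Angle conventions, the zone width, and the string results are illustrative assumptions.

```python
def in_zone(angle_deg: float, zone_start_deg: float, zone_end_deg: float) -> bool:
    """True if an arrival angle falls inside the zone (handles wrap-around past 360)."""
    a = angle_deg % 360.0
    start, end = zone_start_deg % 360.0, zone_end_deg % 360.0
    return start <= a <= end if start <= end else (a >= start or a <= end)

def handle_sound(direction_deg: float, non_detecting_zone: tuple) -> str:
    """Skip recognition for sounds arriving from the non-detecting zone."""
    if in_zone(direction_deg, *non_detecting_zone):
        return "skip voice recognition"
    return "run voice recognition"

zone = (80.0, 120.0)               # bearing of a noise source captured during setup mode
print(handle_sound(95.0, zone))    # skip voice recognition
print(handle_sound(200.0, zone))   # run voice recognition
```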
  • Patent number: 11727953
    Abstract: A method implemented by a computing system comprises generating, by the computing system, a fingerprint comprising a plurality of bin samples associated with audio content. Each bin sample is specified within a frame of the fingerprint and is associated with one of a plurality of non-overlapping frequency ranges and a value indicative of a magnitude of energy associated with a corresponding frequency range. The computing system removes, from the fingerprint, a plurality of bin samples associated with a frequency sweep in the audio content.
    Type: Grant
    Filed: May 10, 2021
    Date of Patent: August 15, 2023
    Assignee: Gracenote, Inc.
    Inventors: Alexander Berrian, Todd J. Hodges, Robert Coover, Matthew James Wilkinson, Zafar Rafii
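A toy version of the sweep-removal idea in the Gracenote abstract above: a fingerprint is treated as a per-frame list of active frequency-bin indices, and bins whose index advances by a constant step across several consecutive frames (the footprint of a frequency sweep) are dropped while stationary tones survive. The run length, the candidate slopes, and the data layout are assumptions; magnitudes are omitted for brevity.

```python
def remove_sweep_bins(fingerprint: list, min_run: int = 5) -> list:
    """fingerprint: per-frame lists of active frequency-bin indices.
    Remove bins that advance by a constant step for at least min_run frames."""
    flagged = [set() for _ in fingerprint]
    for start in range(len(fingerprint)):
        for bin_index in fingerprint[start]:
            for step in (1, 2):                       # assumed sweep slopes, bins per frame
                run = [(start, bin_index)]
                frame, current = start + 1, bin_index + step
                while frame < len(fingerprint) and current in fingerprint[frame]:
                    run.append((frame, current))
                    frame += 1
                    current += step
                if len(run) >= min_run:               # long linear run: treat as a sweep
                    for f, b in run:
                        flagged[f].add(b)
    return [[b for b in frame_bins if b not in flagged[i]]
            for i, frame_bins in enumerate(fingerprint)]

frames = [[10 + t, 40] for t in range(8)]   # a rising sweep plus a steady bin-40 tone
print(remove_sweep_bins(frames))            # sweep bins removed, the tone survives
```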
  • Patent number: 11704397
    Abstract: In order to detect a replay attack in a speaker recognition system, at least one feature is identified in a detected magnetic field. It is then determined whether the at least one identified feature of the detected magnetic field is indicative of playback of speech through a loudspeaker. If so, it is determined that a replay attack may have taken place.
    Type: Grant
    Filed: October 20, 2020
    Date of Patent: July 18, 2023
    Assignee: Cirrus Logic, Inc.
    Inventor: John Paul Lesso
  • Patent number: 11676600
    Abstract: According to some aspects, a method of monitoring an acoustic environment of a mobile device, at least one computer readable medium encoded with instructions that, when executed, perform such a method and/or a mobile device configured to perform such a method is provided. The method comprises receiving acoustic input from the environment of the mobile device while the mobile device is operating in the low power mode, detecting whether the acoustic input includes a voice command based on performing a plurality of processing stages on the acoustic input, wherein at least one of the plurality of processing stages is performed while the mobile device is operating in the low power mode, and using at least one contextual cue to assist in detecting whether the acoustic input includes a voice command.
    Type: Grant
    Filed: August 9, 2021
    Date of Patent: June 13, 2023
    Assignee: CERENCE OPERATING COMPANY
    Inventors: William F. Ganong, III, Paul A. Van Mulbregt, Vladimir Sejnoha, Glen Wilson
  • Patent number: 11659324
    Abstract: A system and method for storing data samples in discrete poses and recalling the stored data samples for updating a sound filter. The system determines that a microphone array at a first time period is in a first discrete pose of a plurality of discrete poses, wherein the plurality of discrete poses discretizes a pose space. The pose space includes at least an orientation component and may further include a translation component. The system retrieves one or more historical data samples associated with the first discrete pose, generated from sound captured by the microphone array before the first time period, and stored in a memory cache (e.g., for memoization). The system updates a sound filter for the first discrete pose using the retrieved one or more historical data samples. The system generates and presents audio content using the updated sound filter.
    Type: Grant
    Filed: October 18, 2021
    Date of Patent: May 23, 2023
    Assignee: META PLATFORMS TECHNOLOGIES, LLC
    Inventors: Jacob Ryan Donley, Nava K. Balsam, Vladimir Tourbabin
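The core data structure in the Meta abstract above is a cache keyed by discrete pose: samples captured at roughly the same pose are stored together and later recalled to update a sound filter for that pose. The sketch below discretizes only yaw and uses a scalar stand-in for both the samples and the filter; a real system would discretize full orientation (and possibly translation) and update an actual filter.

```python
import math
from collections import defaultdict

class PoseCache:
    """Cache historical data samples per discrete pose and reuse them for filter updates."""

    def __init__(self, yaw_bins: int = 36):
        self.yaw_bins = yaw_bins
        self.samples = defaultdict(list)       # discrete pose index -> historical samples

    def discretize(self, yaw_rad: float) -> int:
        return int((yaw_rad % (2 * math.pi)) / (2 * math.pi) * self.yaw_bins)

    def add(self, yaw_rad: float, sample: float) -> None:
        self.samples[self.discretize(yaw_rad)].append(sample)

    def update_filter(self, yaw_rad: float, current_filter: float) -> float:
        """Toy 'filter update': blend the cached history for this pose into the filter."""
        history = self.samples.get(self.discretize(yaw_rad), [])
        if not history:
            return current_filter
        return 0.5 * current_filter + 0.5 * (sum(history) / len(history))

cache = PoseCache()
cache.add(0.10, 1.0)     # samples captured earlier at roughly the same head pose
cache.add(0.12, 3.0)
print(cache.update_filter(0.11, current_filter=0.0))   # 1.0: the cached history is reused
```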
  • Patent number: 11620982
    Abstract: A transcription of a query for content discovery is generated, and a context of the query is identified, as well as a first plurality of candidate entities to which the query refers. A search is performed based on the context of the query and the first plurality of candidate entities, and results are generated for output. A transcription of a second voice query is then generated, and it is determined whether the second transcription includes a trigger term indicating a corrective query. If so, the context of the first query is retrieved. A second term of the second query similar to a term of the first query is identified, and a second plurality of candidate entities to which the second term refers is determined. A second search is performed based on the second plurality of candidates and the context, and new search results are generated for output.
    Type: Grant
    Filed: June 1, 2020
    Date of Patent: April 4, 2023
    Assignee: ROVI GUIDES, INC.
    Inventors: Jeffry Copps Robert Jose, Sindhuja Chonat Sri
  • Patent number: 11508378
    Abstract: An electronic device is provided. The electronic device includes a microphone to receive audio, a communicator, a memory configured to store computer-executable instructions, and a processor configured to execute the computer-executable instructions. The processor is configured to determine whether the received audio includes a predetermined trigger word; based on determining that the predetermined trigger word is included in the received audio; activate a speech recognition function of the electronic device; detect a movement of a user while the speech recognition function is activated; and based on detecting the movement of the user, transmit a control signal, to a second electronic device to activate a speech recognition function of the second electronic device.
    Type: Grant
    Filed: October 23, 2019
    Date of Patent: November 22, 2022
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Kwangyoun Kim, Kyungmin Lee, Youngho Han, Sungsoo Kim, Sichen Jin, Jisun Park, Yeaseul Song, Jaewon Lee
  • Patent number: 11493959
    Abstract: System and methods for processing audio signals are disclosed. In one implementation, a system may include a wearable apparatus including an image sensor to capture images from an environment of a user; an audio sensor to capture an audio signal from the environment of the user; and at least one processor. The processor may be programmed to receive the audio signal captured by the audio sensor; identify at least one segment including speech in the audio signal; receive an image including a representation of a code; analyze the code to determine whether the code is associated with the user and/or the wearable apparatus; and after determining that the code is associated with the user and/or the wearable apparatus, transmit at least one segment of the audio signal, at least one image of the plurality of images, and/or other information to a computing platform.
    Type: Grant
    Filed: August 12, 2021
    Date of Patent: November 8, 2022
    Assignee: ORCAM TECHNOLOGIES LTD.
    Inventors: Yonatan Wexler, Amnon Shashua
  • Patent number: 11416593
    Abstract: The present disclosure provides an electronic device, a control method therefor, and a control program therefor capable of preventing an operation for activating a function protected by user authentication from becoming complicated. An electronic device includes: a keyword management DB for storing identification information of a registrant and a registered keyword in association with each other; a command management DB for storing a command and required authentication scores in association with each other; a data generator for creating grammar data including a registered keyword and a command; an utterance recognizer for matching the grammar data and extracted data extracted from an utterance of a user and acquiring a recognized authentication score and a recognized command; and an authenticator for determining that the command is recognized by comparing the required authentication score associated with the command determined to be the same as the recognized command and the recognized authentication score.
    Type: Grant
    Filed: November 10, 2017
    Date of Patent: August 16, 2022
    Assignee: Asahi Kasei Kabushiki Kaisha
    Inventor: Toshiyuki Miyazaki
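A tiny sketch of the command-management lookup the Asahi Kasei abstract above implies: each command is stored with a required authentication score, and a recognized command is accepted only when the authentication score recognized from the utterance meets that requirement. The commands, scores, and dictionary layout are invented for illustration.

```python
COMMAND_REQUIRED_SCORE = {            # hypothetical command management DB
    "read my messages": 0.9,          # sensitive: needs a strong speaker match
    "what time is it": 0.0,           # public: no speaker authentication required
}

def authorize(recognized_command: str, recognized_auth_score: float) -> bool:
    """Accept the command only if the recognized authentication score is high enough."""
    required = COMMAND_REQUIRED_SCORE.get(recognized_command)
    if required is None:
        return False                  # command not found in the command management DB
    return recognized_auth_score >= required

print(authorize("read my messages", 0.95))   # True
print(authorize("read my messages", 0.40))   # False
print(authorize("what time is it", 0.40))    # True
```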
  • Patent number: 11402461
    Abstract: An object is to make it possible to estimate the position of a sound source even in a case where a wearing displacement occurs in a wearable device.
    Type: Grant
    Filed: August 2, 2017
    Date of Patent: August 2, 2022
    Assignee: Sony Corporation
    Inventors: Yuichiro Koyama, Toshiyuki Sekiya
  • Publication number: 20150106089
    Abstract: A computer-implemented method includes listening for audio name information indicative of a name of a computer, with the computer configured to listen for the audio name information in a first power mode that promotes a conservation of power; detecting the audio name information indicative of the name of the computer; after detection of the audio name information, switching to a second power mode that promotes a performance of speech recognition; receiving audio command information; and performing speech recognition on the audio command information.
    Type: Application
    Filed: December 30, 2010
    Publication date: April 16, 2015
    Inventors: Evan H. Parker, Michal R. Grabowski
  • Publication number: 20140088965
    Abstract: Methods and systems populate a speech signature database with unique speech signatures that are associated with one or more speaker identities and are further associated with one or more mobile stations and/or telephone numbers. Real-time voice signals are compared to the speech signatures in the speech signature database. When a match is found, the mobile station from which the voice signal originated is located in real-time. Further, the associations in the speech signature database are leveraged to find other relevant mobile stations or users and to generate additional associations and to also locate associated users and mobile stations.
    Type: Application
    Filed: September 27, 2012
    Publication date: March 27, 2014
    Applicant: POLARIS WIRELESS, INC.
    Inventor: Narender Goel
  • Publication number: 20140074471
    Abstract: A method is provided and includes estimating an approximate list of potential speakers in a file from one or more applications. The file (e.g., an audio file, video file, or any suitable combination thereof) includes a recording of a plurality of speakers. The method also includes segmenting the file according to the approximate list of potential speakers such that each segment corresponds to at least one speaker; and recognizing particular speakers in the file based on the approximate list of potential speakers.
    Type: Application
    Filed: September 10, 2012
    Publication date: March 13, 2014
    Applicant: CISCO TECHNOLOGY, INC.
    Inventors: Ananth Sankar, Sachin Kajarekar, Satish K. Gannu
  • Publication number: 20140039893
    Abstract: Disclosed embodiments provide for personalizing a voice user interface of a remote multi-user service. A voice user interface for the remote multi-user service can be provided and voice information from an identified user can be received at the multi-user service through the voice user interface. A language model specific to the identified user can be retrieved that models one or more language elements. The retrieved language model can be applied to interpret the received voice information and a response can be generated by the multi-user service in response to the interpreted voice information.
    Type: Application
    Filed: July 31, 2012
    Publication date: February 6, 2014
    Applicant: SRI INTERNATIONAL
    Inventor: Steven Weiner
  • Publication number: 20140006026
    Abstract: A system for generating one or more enhanced audio signals such that one or more sound levels corresponding with sounds received from one or more sources of sound within an environment may be dynamically adjusted based on contextual information is described. The one or more enhanced audio signals may be generated by a head-mounted display device (HMD) worn by an end user within the environment and outputted to earphones associated with the HMD such that the end user may listen to the one or more enhanced audio signals in real-time. In some cases, each of the one or more sources of sound may correspond with a priority level. The priority level may be dynamically assigned depending on whether the end user of the HMD is focusing on a particular source of sound or has specified a predetermined level of importance corresponding with the particular source of sound.
    Type: Application
    Filed: June 29, 2012
    Publication date: January 2, 2014
    Inventors: Mathew J. Lamb, Ben J. Sugden, Robert L. Crocco, JR., Brian E. Keane, Christopher E. Miles, Kathryn Stone Perez, Laura K. Massey, Alex Aben-Athar Kipman
  • Publication number: 20130304478
    Abstract: An embodiment of the invention provides a method of preparing for speaker authentication. The method includes: receiving speech data that represents an utterance made by a user; extracting side information; examining the side information to determine whether to allow speaker model training using the speech data; and generating a feedback message for the user based on the side information if speaker model training using the speech data is not allowed.
    Type: Application
    Filed: August 28, 2012
    Publication date: November 14, 2013
    Inventors: Liang-Che Sun, Yiou-Wen Cheng
  • Publication number: 20130231933
    Abstract: A method and system for addressee identification of speech includes defining several time intervals and utilizing one or more function evaluations to classify each of the several participants as addressing speech to an automated character or not addressing speech to the automated character during each of the several time intervals. A first function evaluation includes computing values for a predetermined set of features for each of the participants during a particular time interval and assigning a first addressing status to each of the several participants in the particular time interval, based on the values of each of the predetermined sets of features determined during the particular time interval. A second function evaluation may assign a second addressing status to each of the several participants in the particular time interval utilizing results of the first function evaluation for the particular time interval and for one or more additional contiguous time intervals.
    Type: Application
    Filed: March 2, 2012
    Publication date: September 5, 2013
    Applicant: DISNEY ENTERPRISES, INC.
    Inventors: Hannaneh Hajishirzi, Jill Fain Lehman
  • Publication number: 20130218573
    Abstract: An electronic device for browsing a document is disclosed. The document being browsed includes a plurality of command-associated text strings. First, a text string selector of the electronic device selects a plurality of candidate text strings from the command-associated text strings. Afterward, an acoustic string provider of the electronic device prepares a candidate acoustic string for each of the candidate text strings. Thereafter, a microphone of the electronic device receives a voice command. Next, a speech recognizer of the electronic device searches the candidate acoustic strings for a target acoustic string that matches the voice command, wherein the target acoustic string corresponds to a target text string of the candidate text strings. Finally, a document browser of the electronic device executes a command associated with the target text string.
    Type: Application
    Filed: February 21, 2012
    Publication date: August 22, 2013
    Inventors: Yiou-Wen Cheng, Liang-Che Sun, Chao-Ling Hsu, Hsi-Kang Tsao, Jyh-Horng Lin
  • Publication number: 20130191127
    Abstract: A voice analyzer includes a plate-shaped body, a plurality of first voice acquisition units that are placed on both surfaces of the plate-shaped body and that acquire a voice of a speaker, a sound pressure comparison unit that compares sound pressure of a voice acquired by the first voice acquisition unit placed on one surface of the plate-shaped body with sound pressure of a voice acquired by the first voice acquisition unit placed on the other surface and determines a larger sound pressure, and a voice signal selection unit that selects information regarding a voice signal which is associated with the larger sound pressure and is determined by the sound pressure comparison unit.
    Type: Application
    Filed: July 20, 2012
    Publication date: July 25, 2013
    Applicant: FUJI XEROX CO., LTD.
    Inventors: Kiyoshi IIDA, Haruo HARADA, Hirohito YONEYAMA, Kei SHIMOTANI, Akira FUJII, Yohei NISHINO
  • Publication number: 20130166296
    Abstract: The present invention relates to a method and apparatus for generating speaker-specific spoken passwords. One embodiment of a method for generating a spoken password for use by a speaker of interest includes identifying one or more speech features that best distinguish the speaker of interest from a plurality of impostor speakers and incorporating the speech features in the spoken password.
    Type: Application
    Filed: December 21, 2011
    Publication date: June 27, 2013
    Inventor: NICOLAS SCHEFFER
  • Publication number: 20130166299
    Abstract: A voice analyzer includes an apparatus body, a strap that is connected to the apparatus body and is used to hang the apparatus body from a neck of a user, a first voice acquisition unit provided in the strap or the apparatus body, a second voice acquisition unit provided at a position where a distance of a sound wave propagation path from a mouth of the user is smaller than a distance of a sound wave propagation path from the mouth of the user to the first voice acquisition unit, and an identification unit that identifies a sound, in which first sound pressure acquired by the first voice acquisition unit is larger by a predetermined value or more than second sound pressure acquired by the second voice acquisition unit, on the basis of a result of comparison between the first sound pressure and the second sound pressure.
    Type: Application
    Filed: May 18, 2012
    Publication date: June 27, 2013
    Applicant: FUJI XEROX CO., LTD.
    Inventors: Kei SHIMOTANI, Yohei NISHINO, Hirohito YONEYAMA, Kiyoshi IIDA, Akira FUJII, Haruo HARADA
  • Publication number: 20130166298
    Abstract: A voice analyzer includes an apparatus body, a strap that is connected to the apparatus body to make the apparatus body hung from a neck of a wearer, a first voice acquisition unit that acquires a voice of a speaker and is disposed in either a left or right strap when viewed from the wearer, a second voice acquisition unit that acquires the voice of the speaker and is disposed in the opposite strap in which the first voice acquisition unit is disposed, and an arrangement recognition unit that recognizes arrangements of the first and second voice acquisition units, when viewed from the wearer, by comparing a voice signal of the voice acquired by the first voice acquisition unit with sound pressure of a heart sound of the wearer acquired by the second voice acquisition unit.
    Type: Application
    Filed: April 20, 2012
    Publication date: June 27, 2013
    Applicant: FUJI XEROX CO., LTD.
    Inventors: Haruo HARADA, Hirohito YONEYAMA, Kei SHIMOTANI, Akira FUJII, Yohei NISHINO, Kiyoshi IIDA
  • Publication number: 20130144620
    Abstract: Various embodiments of the present invention for validating the authenticity of a website are provided. An example of a method according to the present invention comprises providing a website having an artifact, receiving a communication from a user, at a service provider, for validating the website associated with a service provider, inquiring from the user a description of the artifact, comparing the artifact on the website with the description of the artifact from the user, and generating an indication to the user based upon the comparing. The communication is over a first communication channel and the website is accessed over a second communication channel. The first communication channel is different from the second. The artifact can be displayed after a user session is identified.
    Type: Application
    Filed: December 6, 2011
    Publication date: June 6, 2013
    Applicant: TELCORDIA TECHNOLOGIES, INC.
    Inventors: Richard J. Lipton, Shoshana K. Loeb, Thimios Panagos
  • Publication number: 20130080167
    Abstract: In one embodiment, a method includes receiving an acoustic input signal at a speech recognizer. A user is identified that is speaking based on the acoustic input signal. The method then determines speaker-specific information previously stored for the user and a set of responses based on the recognized acoustic input signal and the speaker-specific information for the user. It is then determined whether the response should be output, and the response is outputted if so.
    Type: Application
    Filed: December 16, 2011
    Publication date: March 28, 2013
    Applicant: SENSORY, INCORPORATED
    Inventor: Todd F. Mozer
  • Publication number: 20130080168
    Abstract: An audio analysis apparatus includes the following components. A strap has an end portion connected to a main body and is used to hang the main body from a user's neck. A first audio acquisition device is at the end portion or in the main body. Second and third audio acquisition devices are at positions separate from the end portion by substantially the same predetermined distances, on the respective sides of the strap extending from the user's neck. An analysis unit discriminates whether an acquired sound is an uttered voice of the user or another person by comparing audio signals acquired by the first and second or third audio acquisition devices and detects an orientation of the user's face by comparing the audio signals acquired by the second and third audio acquisition devices. A transmission unit transmits the analysis result to an external apparatus.
    Type: Application
    Filed: February 27, 2012
    Publication date: March 28, 2013
    Applicant: FUJI XEROX Co., Ltd.
    Inventors: Kiyoshi Iida, Haruo Harada, Hirohito Yoneyama, Kei Shimotani, Yohei Nishino
  • Publication number: 20130080170
    Abstract: An audio analysis apparatus includes the following components. A main body includes a discrimination unit and a transmission unit. A strap is used for hanging the main body from a user's neck. A first audio acquisition device is provided to the strap or the main body. A second audio acquisition device is provided to the strap at a position where a distance between the second audio acquisition device and the user's mouth is smaller than the distance between the first audio acquisition device and the user's mouth in a state where the strap is worn around the user's neck. The discrimination unit discriminates whether an acquired sound is an uttered voice of the user or of another person by comparing audio signals of the sound acquired by the first and second audio acquisition devices. The transmission unit transmits information including the discrimination result to an external apparatus.
    Type: Application
    Filed: March 5, 2012
    Publication date: March 28, 2013
    Applicant: FUJI XEROX CO., LTD.
    Inventors: Haruo HARADA, Hirohito YONEYAMA, Kei SHIMOTANI, Yohei NISHINO, Kiyoshi IIDA, Takao NAITO
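The Fuji Xerox entries above (20130191127, 20130166299, 20130166298, 20130080168, 20130080170) all exploit the same physical fact: the wearer's mouth is far closer to one microphone than to the other, so the wearer's own speech shows a clearly larger level at the near microphone, while another talker's speech arrives at both at similar levels. Below is a minimal sketch of that discrimination; the 3 dB threshold and the RMS-level comparison are assumptions, not figures from the patents.

```python
import numpy as np

def is_wearers_voice(near_mic: np.ndarray, far_mic: np.ndarray,
                     threshold_db: float = 3.0) -> bool:
    """True when the near-mouth microphone is louder than the far one by the threshold."""
    def rms(x: np.ndarray) -> float:
        return float(np.sqrt(np.mean(np.square(x)) + 1e-12))
    level_difference_db = 20.0 * np.log10(rms(near_mic) / rms(far_mic))
    return level_difference_db >= threshold_db

speech = np.random.default_rng(0).normal(size=16000)
print(is_wearers_voice(near_mic=speech, far_mic=0.5 * speech))    # True: ~6 dB louder nearby
print(is_wearers_voice(near_mic=speech, far_mic=0.98 * speech))   # False: another talker
```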
  • Publication number: 20130043977
    Abstract: A system for confirming that a subject is the source of spoken audio and the identity of the subject providing the spoken audio is described. The system includes at least one motion sensor operable to capture physical motion of at least one articulator that contributes to the production of speech, at least one acoustic signal sensor to receive acoustic signals, and a processing device comprising a memory and communicatively coupled to the at least one motion sensor and the at least one acoustic signal sensor. The processing device is programmed to correlate physical motion data with acoustical signal data to uniquely characterize the subject for purposes of verifying the subject is the source of the acoustical signal data and the identity of the subject.
    Type: Application
    Filed: August 19, 2011
    Publication date: February 21, 2013
    Inventors: George A. Velius, David A. Whelan
  • Publication number: 20130041666
    Abstract: A voice recognition apparatus, a voice recognition server, a voice recognition system, and a voice recognition method are provided, in which a general-purpose voice recognition engine may accurately recognize a limited number of words used in a specific area.
    Type: Application
    Filed: August 8, 2012
    Publication date: February 14, 2013
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventor: Eun-sang BAK
  • Publication number: 20130024196
    Abstract: Systems, methods, and apparatus for using at least one mobile device to receive a representation of at least one audio signal. In some embodiments, the at least one audio signal comprises speech of at least one of a plurality of first participants of a meeting, the plurality of first participants participating in the meeting from a first location, and the at least one audio signal may be audibly rendered to at least one second participant of the meeting at a second location different from the first location. In some embodiments, the at least one mobile device may further receive an indication of an identity of a leading speaker of the speech in the at least one audio signal, the leading speaker being identified from among the plurality of first participants, and may render the identity of the leading speaker to the at least one second participant.
    Type: Application
    Filed: July 21, 2011
    Publication date: January 24, 2013
    Applicant: Nuance Communications, Inc.
    Inventors: William F. Ganong, III, David Mark Krowitz, Tobias Wolff, Markus Buck
  • Publication number: 20130024197
    Abstract: An electronic device and a method for controlling an electronic device are disclosed. The electronic device includes: a display unit; a voice input unit; and a controller displaying a plurality of contents on the display unit, receiving a voice command for controlling any one of the plurality of contents through the voice input unit, and controlling content corresponding to the received voice command. Multitasking performed by the electronic device can be effectively controlled through a voice command.
    Type: Application
    Filed: October 13, 2011
    Publication date: January 24, 2013
    Applicant: LG ELECTRONICS INC.
    Inventors: Seokbok JANG, Jongse PARK, Joonyup LEE, Jungkyu CHOI
  • Publication number: 20130018657
    Abstract: A method (700) and system (900) for authenticating a user are provided. The method can include receiving one or more spoken utterances from a user (702), recognizing a phrase corresponding to one or more spoken utterances (704), identifying a biometric voice print of the user from one or more spoken utterances of the phrase (706), determining a device identifier associated with the device (708), and authenticating the user based on the phrase, the biometric voice print, and the device identifier (710). A location of the handset or the user can be employed as a criterion for granting access to one or more resources (712).
    Type: Application
    Filed: September 13, 2012
    Publication date: January 17, 2013
    Applicant: Porticus Technology, Inc.
    Inventors: Germano Di Mambro, Bernardas Salna
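A minimal sketch of the three-factor check in the Porticus abstract above: the recognized phrase, the biometric voice print, and the device identifier must all agree before the user is authenticated (location, the fourth criterion mentioned, is left out). The placeholder similarity function, the 0.7 threshold, and the tuple "voice prints" are purely illustrative.

```python
from dataclasses import dataclass

@dataclass
class EnrolledUser:
    passphrase: str
    voiceprint: tuple        # enrolled biometric voice print (illustrative placeholder)
    device_id: str

def voiceprint_score(a: tuple, b: tuple) -> float:
    """Placeholder similarity in [0, 1]; a real system compares biometric voice prints."""
    diffs = [abs(x - y) for x, y in zip(a, b)]
    return 1.0 - (sum(diffs) / len(diffs)) if diffs else 0.0

def authenticate(user: EnrolledUser, spoken_phrase: str, live_voiceprint: tuple,
                 device_id: str, biometric_threshold: float = 0.7) -> bool:
    """All three factors must agree: phrase, voice print, and device identifier."""
    return (spoken_phrase.strip().lower() == user.passphrase.lower()
            and voiceprint_score(live_voiceprint, user.voiceprint) >= biometric_threshold
            and device_id == user.device_id)

user = EnrolledUser("open sesame", (0.2, 0.4, 0.9), "IMEI-123")
print(authenticate(user, "Open Sesame", (0.25, 0.38, 0.88), "IMEI-123"))   # True
print(authenticate(user, "Open Sesame", (0.25, 0.38, 0.88), "IMEI-999"))   # False
```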
  • Publication number: 20130006634
    Abstract: Techniques are provided to improve identification of a person using speaker recognition. In one embodiment, a unique social graph may be associated with each of a plurality of defined contexts. The social graph may indicate speakers likely to be present in a particular context. Thus, an audio signal including a speech signal may be collected and processed. A context may be inferred, and a corresponding social graph may be identified. A set of potential speakers may be determined based on the social graph. The processed signal may then be compared to a restricted set of speech models, each speech model being associated with a potential speaker. By limiting the set of potential speakers, speakers may be more accurately identified.
    Type: Application
    Filed: January 6, 2012
    Publication date: January 3, 2013
    Applicant: QUALCOMM Incorporated
    Inventors: Leonard Henry Grokop, Vidya Narayanan
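A compact sketch of the context-restricted lookup in the Qualcomm abstract above: an inferred context selects a social graph of likely speakers, and only those speakers' models are scored against the processed signal. Context names, the toy "models", and the distance-based scoring are all assumptions standing in for real context inference and speech models.

```python
# Hypothetical per-context social graphs: who is likely to be speaking in each context.
CONTEXT_SPEAKERS = {
    "office_meeting": ["alice", "bob"],
    "home_evening":   ["spouse", "child"],
}

# Hypothetical per-speaker models: here just a mean feature vector per speaker.
SPEAKER_MODELS = {
    "alice": [0.9, 0.1], "bob": [0.2, 0.8], "spouse": [0.5, 0.5], "child": [0.1, 0.1],
}

def identify_speaker(features: list, context: str) -> str:
    """Score only the speakers the inferred context makes likely, not every enrolled model."""
    candidates = CONTEXT_SPEAKERS.get(context, list(SPEAKER_MODELS))
    def score(name: str) -> float:
        model = SPEAKER_MODELS[name]
        return -sum((f - m) ** 2 for f, m in zip(features, model))   # toy negative distance
    return max(candidates, key=score)

print(identify_speaker([0.85, 0.15], context="office_meeting"))   # alice
```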
  • Publication number: 20130006642
    Abstract: Methods and systems for providing a voice-based digital signature service are disclosed. The method includes a first user sending a document to a second user for signature, and the first user also sending a PIN to the second user and to a voice verification authority. The second user sends, to the voice verification authority, a voice recording comprising the PIN along with the second user's consent to the PIN. The voice verification authority compares the voice recording with a predefined voice sample of the second user, and the PIN received from the first user with the PIN received from the second user. The voice verification authority then sends a notification to a signing entity based on the comparison. The signing entity signs the PIN of the document with a private key associated with the second user and sends an acknowledgement to the first user and the second user.
    Type: Application
    Filed: September 27, 2011
    Publication date: January 3, 2013
    Applicant: Infosys Limited
    Inventors: Ashutosh Saxena, Vishal Anjaiah Gujjary, Harigopal K.B. Ponnapalli
  • Publication number: 20120330663
    Abstract: An identity authentication method is applied to a system. The system is connected to an external storage device storing a first voice model. The system includes an information server and a terminal. The information server includes a database. The information server executes the following steps. First, receiving the first voice model transmitted by the terminal. Second, determining whether the first voice model matches a second voice model, and transmitting the verification result to the terminal. The terminal executes the following steps. First, generating a prompt to prompt the user to input voice signals. Second, receiving the input voice signals. Third, extracting voice features from the input voice signals. Fourth, determining whether the extracted voice features match the first voice model. Fifth, determining that the verification result is successful when they match, and determining that the identity authentication is successful only when both verification results are successful.
    Type: Application
    Filed: August 11, 2011
    Publication date: December 27, 2012
    Applicant: HON HAI PRECISION INDUSTRY CO., LTD.
    Inventors: YING-CHUAN YU, YING-XIONG HUANG, HSING-CHU WU, SHIH-PIN WU
  • Publication number: 20120323575
    Abstract: Speaker content generated in an audio conference is visually represented in accordance with a method. Speaker content from a plurality of audio conference participants is monitored using a computer with a tangible non-transitory processor and memory. The speaker content from each of the plurality of audio conference participants is analyzed. A visual representation of speaker content for each of the plurality of audio conference participants is generated based on the analysis of the speaker content from each of the plurality of audio conference participants. The visual representation of speaker content is displayed.
    Type: Application
    Filed: June 17, 2011
    Publication date: December 20, 2012
    Applicant: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventors: David C. GIBBON, Andrea BASSO, Lee BEGEJA, Sumit KUMAR, Zhu LIU, Bernard S. RENGER, Behzad SHAHRARAY, Eric ZAVESKY
  • Publication number: 20120323574
    Abstract: Event audio data that is based on verbal utterances associated with a medical event associated with a patient is received. A list of a plurality of candidate text strings that match interpretations of the event audio data is obtained, based on information included in a medical speech repository, information included in a speech accent repository, and a matching function. A selection of at least one of the candidate text strings included in the list is obtained. A population of at least one field of an electronic medical form is initiated, based on the obtained selection.
    Type: Application
    Filed: June 17, 2011
    Publication date: December 20, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Tao Wang, Bin Zhou
  • Publication number: 20120313767
    Abstract: A touch sensor assembly having a selectable sensitivity level allows for a user to activate a touch sensor while wearing hand covers. The touch sensor assembly includes a touch sensor, a comparison unit, and a control unit. The touch sensor has a selectable sensitivity level, and detects a value corresponding to a capacitance of the touch sensor. The comparison unit has a predetermined threshold stored therein, and determines activation of the touch sensor if the detected value exceeds the predetermined threshold. The control unit is operable to select a sensitivity level of the touch sensor by varying the predetermined threshold by an amount unrelated to an environmental effect on the touch sensor.
    Type: Application
    Filed: June 7, 2011
    Publication date: December 13, 2012
    Applicant: Toyota Motor Engineering & Manufacturing North America, Inc.
    Inventor: Nicholas Scott Sitarski
  • Publication number: 20120303369
    Abstract: Functionality is described herein for recognizing speakers in an energy-efficient manner. The functionality employs a heterogeneous architecture that comprises at least a first processing unit and a second processing unit. The first processing unit handles a first set of audio processing tasks (associated with the detection of speech) while the second processing unit handles a second set of audio processing tasks (associated with the identification of speakers), where the first set of tasks consumes less power than the second set of tasks. The functionality also provides unobtrusive techniques for collecting audio segments for training purposes. The functionality also encompasses new applications which may be invoked in response to the recognition of speakers.
    Type: Application
    Filed: May 26, 2011
    Publication date: November 29, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Alice Jane B. Brush, Nissanka Arachchige Bodhi Priyantha, Jie Liu, Amy K. Karlson, Hong Lu
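A sketch of the heterogeneous split in the Microsoft abstract above: an always-on, low-power stage does the cheap work of deciding whether speech is present at all, and the expensive speaker-identification stage runs only on the frames that survive it. The energy gate and the placeholder speaker-ID function are stand-ins; they are not the tasks or thresholds from the patent.

```python
import numpy as np

def cheap_speech_detector(frame: np.ndarray, energy_threshold: float = 0.01) -> bool:
    """First processing unit (low power): a trivial energy gate standing in for detection."""
    return float(np.mean(frame ** 2)) > energy_threshold

def expensive_speaker_id(frame: np.ndarray) -> str:
    """Second processing unit (higher power): placeholder for real speaker identification."""
    return "speaker_A" if float(frame.mean()) > 0 else "speaker_B"

def process(frames: list) -> list:
    """Only frames that pass the cheap stage ever reach the costly identification stage."""
    results = []
    for frame in frames:
        if cheap_speech_detector(frame):
            results.append(expensive_speaker_id(frame))
    return results

rng = np.random.default_rng(0)
frames = [rng.normal(scale=0.001, size=400), rng.normal(scale=1.0, size=400)]
print(process(frames))   # only the second, louder frame reaches the speaker-ID stage
```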
  • Publication number: 20120296650
    Abstract: Embodiments of the present invention provide a method, system and article of manufacture for adjusting a language model within a voice recognition system, based on text received from an external application. The external application may supply text representing the words of one participant to a text-based conversation. In such a case, changes may be made to a language model by analyzing the external text received from the external application.
    Type: Application
    Filed: August 2, 2012
    Publication date: November 22, 2012
    Applicant: Nuance Communications, Inc.
    Inventors: Cary L. Bates, Brian P. Wallenfelt
  • Publication number: 20120296649
    Abstract: A speaker-verification digital signature system is disclosed that provides greater confidence in communications having digital signatures because a signing party may be prompted to speak a text-phrase that may be different for each digital signature, thus making it difficult for anyone other than the legitimate signing party to provide a valid signature.
    Type: Application
    Filed: July 31, 2012
    Publication date: November 22, 2012
    Applicant: AT&T Intellectual Property II, L.P.
    Inventors: Pradeep K. Bansal, Lee Begeja, Carroll W. Creswell, Jeffrey Farah, Benjamin J. Stern, Jay Wilpon
  • Publication number: 20120284027
    Abstract: An embodiment of the present invention provides a speech recognition engine that utilizes portable voice profiles for converting recorded speech to text. Each portable voice profile includes speaker-dependent data, and is configured to be accessible to a plurality of speech recognition engines through a common interface. A voice profile manager receives the portable voice profiles from other users who have agreed to share their voice profiles. The speech recognition engine includes speaker identification logic to dynamically select a particular portable voice profile, in real-time, from a group of portable voice profiles. The speaker-dependent data included with the portable voice profile enhances the accuracy with which speech recognition engines recognize spoken words in recorded speech from a speaker associated with a portable voice profile.
    Type: Application
    Filed: June 14, 2012
    Publication date: November 8, 2012
    Inventors: Jacqueline Mallett, Sunil Vemuri, N. Rao Machiraju
  • Publication number: 20120284026
    Abstract: In an aspect, in general, a method for computer assisted speaker authentication in a voice communication session includes establishing a voice communication session between a first speaker and an agent, accepting a first voice signal from the first speaker, determining a voice characteristic measure of the first voice signal, including characterizing a similarity of the first voice signal to each of one or more stored characterizations of voice signals previously acquired from one or more known speakers, and providing an interface to the agent during the voice communication session between the agent and the first speaker, including presenting an indicator based on the determined voice characteristic measure to the agent.
    Type: Application
    Filed: May 6, 2011
    Publication date: November 8, 2012
    Applicant: Nexidia Inc.
    Inventors: Peter S. Cardillo, Marsal Gavalda