Abstract: An arrangement is provided for facilitating selection and activation of a voice control system by a vehicle operator. The arrangement may include a switch selectively switchable between a first switch position (P1) and a second switch position (P2) and arranged to emit a signal when switched between the switch positions (P1, P2). The arrangement may further include a processing unit arranged to select, based on an interpretation of the signal, one of the vehicle voice control system or a voice control system of an external communication device and to communicate an activation signal to the selected one of the vehicle voice control system and the external communication device voice control system.
Abstract: Systems and methods are provided for associating a phonetic pronunciation with a name by receiving the name, mapping the name to a plurality of monosyllabic components that are combinable to construct the phonetic pronunciation of the name, receiving a user input to select one or more of the plurality, and combining the selected one or more of the plurality of monosyllabic components to construct the phonetic pronunciation of the name.
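The flow this abstract describes can be sketched briefly: map a name to candidate monosyllabic components, take a user selection, and join the selected components into one pronunciation. This is a minimal illustration, not the patented implementation; the component inventory, the example name, and all function names are invented.

```python
# Hypothetical component inventory: per syllable slot, the candidate
# monosyllabic components a user could choose among.
SYLLABLE_CANDIDATES = {
    "Siobhan": [["shi", "see"], ["vawn", "bahn"]],
}

def candidate_components(name):
    """Return, per syllable slot, the candidate monosyllabic components."""
    return SYLLABLE_CANDIDATES.get(name, [])

def build_pronunciation(name, selections):
    """Combine the user-selected component from each slot into one string."""
    slots = candidate_components(name)
    return "-".join(slots[i][choice] for i, choice in enumerate(selections))

print(build_pronunciation("Siobhan", [0, 0]))  # shi-vawn
```

A real system would source the candidates from a pronunciation lexicon rather than a hard-coded table.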
Abstract: Systems and methods are provided for categorical-perception-based configuration of hearing devices and for hearing treatment using categorical perception.
Abstract: A voice converting apparatus and a voice converting method are provided. The method of converting a voice using a voice converting apparatus includes receiving a voice from a counterpart, analyzing the voice to determine whether the voice is abnormal, converting the voice into a normal voice by adjusting a harmonic signal of the voice in response to determining that the voice is abnormal, and transmitting the normal voice.
Type:
Grant
Filed:
December 27, 2016
Date of Patent:
November 6, 2018
Assignee:
SAMSUNG ELECTRONICS CO., LTD.
Inventors:
Jong-youb Ryu, Yoon-jae Lee, Seoung-hun Kim, Young-tae Kim
Abstract: A language model for automatic speech processing, such as a finite state transducer (FST), may be configured to incorporate information about how a particular word sequence (N-gram) may be used in a manner similar to another N-gram. A score of a component of the FST (such as an arc or state) relating to the first N-gram may be based on information of the second N-gram. Further, the FST may be configured to have an arc between a state of the first N-gram and a state of the second N-gram to allow for cross-N-gram backoff, rather than backoff from a larger N-gram to a smaller N-gram, during traversal of the FST during speech processing.
Type:
Grant
Filed:
June 30, 2016
Date of Patent:
November 6, 2018
Assignee:
Amazon Technologies, Inc.
Inventors:
Ankur Gandhe, Denis Sergeyevich Filimonov, Ariya Rastrow, Björn Hoffmeister
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for designating certain voice commands as hotwords. The methods, systems, and apparatus include actions of receiving a hotword followed by a voice command. Additional actions include determining that the voice command satisfies one or more predetermined criteria associated with designating the voice command as a hotword, where a voice command that is designated as a hotword is treated as a voice input regardless of whether the voice command is preceded by another hotword. Further actions include, in response to determining that the voice command satisfies one or more predetermined criteria associated with designating the voice command as a hotword, designating the voice command as a hotword.
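The designation logic in this abstract can be sketched as a small registry. This is not Google's implementation; the promotion criteria here (a short command heard often enough after a hotword) and all names are invented for illustration.

```python
from collections import Counter

class HotwordRegistry:
    """Tracks voice commands and promotes qualifying ones to hotwords."""

    def __init__(self, max_words=3, min_uses=3):
        self.hotwords = {"ok assistant"}   # seed hotword (assumed)
        self.usage = Counter()
        self.max_words = max_words
        self.min_uses = min_uses

    def observe(self, preceded_by_hotword, command):
        """Record an utterance; return True if it is treated as voice input."""
        if not preceded_by_hotword and command not in self.hotwords:
            return False                    # ignored: no hotword prefix
        self.usage[command] += 1
        # Predetermined criteria (invented): short and frequently repeated.
        if (len(command.split()) <= self.max_words
                and self.usage[command] >= self.min_uses):
            self.hotwords.add(command)      # designate command as a hotword
        return True

reg = HotwordRegistry()
for _ in range(3):
    reg.observe(True, "stop the alarm")     # heard after a hotword, 3 times
print(reg.observe(False, "stop the alarm"))  # True: now a hotword itself
```

Once promoted, the command is accepted even without a preceding hotword, matching the abstract's "regardless of whether the voice command is preceded by another hotword".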
Abstract: Provided are an automatic interpretation system and method for generating a synthetic sound having characteristics similar to those of an original speaker's voice. The automatic interpretation system for generating a synthetic sound having characteristics similar to those of an original speaker's voice includes a speech recognition module configured to generate text data by performing speech recognition for an original speech signal of an original speaker and extract at least one piece of characteristic information among pitch information, vocal intensity information, speech speed information, and vocal tract characteristic information of the original speech, an automatic translation module configured to generate a synthesis-target translation by translating the text data, and a speech synthesis module configured to generate a synthetic sound of the synthesis-target translation.
Type:
Grant
Filed:
July 19, 2016
Date of Patent:
October 23, 2018
Assignee:
ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
Inventors:
Seung Yun, Ki Hyun Kim, Sang Hun Kim, Yun Young Kim, Jeong Se Kim, Min Kyu Lee, Soo Jong Lee, Young Jik Lee, Mu Yeol Choi
Abstract: A relation detection model training solution. The relation detection model training solution mines freely available resources from the World Wide Web to train a relationship detection model for use during linguistic processing. The relation detection model training system searches the web for pairs of entities extracted from a knowledge graph that are connected by a specific relation. Performance is enhanced by clipping search snippets to extract patterns that connect the two entities in a dependency tree and refining the annotations of the relations according to other related entities in the knowledge graph. The relation detection model training solution scales to other domains and languages, pushing the burden from natural language semantic parsing to knowledge base population. The relation detection model training solution exhibits performance comparable to supervised solutions, which require design, collection, and manual labeling of natural language data.
Type:
Grant
Filed:
December 20, 2013
Date of Patent:
September 11, 2018
Assignee:
Microsoft Technology Licensing, LLC
Inventors:
Dilek Z. Hakkani-Tur, Gokhan Tur, Larry Paul Heck
Abstract: Embodiments are directed to a computer implemented counterproductive interaction identification system. The system includes an electronic tool configured to hold data of a user, and an analyzer circuit configured to derive a cognitive trait of the user based at least in part on the data of the user. The system further includes a decision engine configured to determine, based at least in part on the derived cognitive trait of the user, that the user is a target or a source of an actual or an impending counterproductive interaction.
Type:
Grant
Filed:
December 1, 2015
Date of Patent:
September 4, 2018
Assignee:
INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors:
Aaron K. Baughman, James R. Kozloski, Timothy M. Lynar, Suraj Pandey, John M. Wagner
Abstract: A double-sided display simultaneous translation method includes: receiving a voice input; translating the inputted voice according to a preset rule; and outputting translated contents. Different from the prior art, in the double-sided display simultaneous translation method provided in embodiments of the present invention, by means of a bidirectional voice input mode corresponding to a double-channel system, the same device allows both parties in communication to speak at the same time and can accurately differentiate the voices of both parties to provide the correct translation.
Abstract: One embodiment provides a method, including: accessing, using a processor, a form comprising at least one fillable field; receiving, from an audio input device, audio input from a user; identifying, using a processor, a fillable field associated with the audio input; and providing input, based on the audio input, to the fillable field associated with the audio input. Other aspects are described and claimed.
Abstract: A method of morphing speech from an original speaker into the speech of a second, target speaker without decomposing either speech into source and filter, and without the need to determine the formant positions, by warping spectral envelopes.
Abstract: An alignment of attributes is extracted from a pair of mentions in an annotated corpus. An annotated document is read from the annotated corpus. Three character string arrays (A, B, and C) are stored into a first TRIE. The character string arrays include character string representations of attributes of one or more first tokens, second tokens, and neighborhood tokens of mentions of the annotated document, respectively. The storing into the first TRIE obtains partial integer arrays (K, L, and M) that include identifiers corresponding to the attributes of the mentions. The partial integer arrays are stored into a second TRIE. The storing into the second TRIE obtains an identifier array (X) that includes three identifiers. The identifier array is stored into a third TRIE. The storing into the third TRIE obtains one identifier of the alignment of attributes.
Type:
Grant
Filed:
November 11, 2015
Date of Patent:
May 29, 2018
Assignee:
International Business Machines Corporation
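The layered-TRIE interning described in the abstract above can be sketched compactly: each TRIE maps a sequence to a stable integer identifier, so attribute arrays collapse step by step into a single id for the whole alignment. This simplified two-level sketch stands in for the patent's three-TRIE scheme; the attribute values and class names are invented.

```python
class SequenceTrie:
    """A trie that assigns a stable integer id to each stored sequence."""

    def __init__(self):
        self.root = {}
        self.next_id = 0

    def intern(self, seq):
        """Store a sequence; return the identifier assigned to it."""
        node = self.root
        for item in seq:
            node = node.setdefault(item, {})
        if "#id" not in node:           # first time this sequence is seen
            node["#id"] = self.next_id
            self.next_id += 1
        return node["#id"]

first, second = SequenceTrie(), SequenceTrie()
# Invented attribute arrays for first tokens, second tokens, neighborhood:
k = first.intern(["PERSON", "NNP"])      # partial integer array K
l = first.intern(["ORG", "NNP"])         # partial integer array L
m = first.intern(["works", "for"])       # partial integer array M
alignment_id = second.intern([k, l, m])  # identifier array -> one id
print(alignment_id)  # 0
```

Re-interning the same sequence returns the same id, which is what makes the final identifier a stable key for the alignment of attributes.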
Abstract: An audio signal encoding method is provided. The method includes: dividing a frequency band of an audio signal into a plurality of sub-bands, and quantizing a sub-band normalization factor of each sub-band; determining signal bandwidth of bit allocation according to the quantized sub-band normalization factor, or according to the quantized sub-band normalization factor and bit rate information; allocating bits for a sub-band within the determined signal bandwidth; and coding a spectrum coefficient of the audio signal according to the bits allocated for each sub-band. According to embodiments of the present invention, during coding and decoding, the signal bandwidth of bit allocation is determined according to the quantized sub-band normalization factor and bit rate information. In this manner, the determined signal bandwidth is effectively coded and decoded by centralizing the bits, and audio quality is improved.
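A hedged sketch of the allocation step just described: sub-bands with larger quantized normalization factors (more energy) receive more of the bit budget, and bands beyond the determined signal bandwidth get none. The proportional rule and all numbers are invented for illustration, not taken from the patent.

```python
def allocate_bits(norm_factors, total_bits, bandwidth_subbands):
    """Split total_bits across the first bandwidth_subbands sub-bands
    proportionally to their quantized normalization factors."""
    active = norm_factors[:bandwidth_subbands]
    weight_sum = sum(active) or 1
    bits = [total_bits * f // weight_sum for f in active]
    bits += [0] * (len(norm_factors) - bandwidth_subbands)
    # Hand any integer-rounding remainder to the strongest sub-band:
    bits[active.index(max(active))] += total_bits - sum(bits)
    return bits

alloc = allocate_bits([8, 4, 2, 1, 1], total_bits=60, bandwidth_subbands=3)
print(alloc)  # [35, 17, 8, 0, 0]
```

Note how the two out-of-bandwidth sub-bands are allocated zero bits, concentrating ("centralizing") the budget inside the determined signal bandwidth.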
Abstract: Method and apparatus are provided for reconstructing a noise component of a speech/audio signal. A bitstream is received and decoded to obtain a speech/audio signal. A first speech/audio signal is determined according to the speech/audio signal. A symbol of each sample value in the first speech/audio signal and an amplitude value of each sample value in the first speech/audio signal are determined. An adaptive normalization length is determined, and an adjusted amplitude value of each sample value is determined according to the adaptive normalization length and the amplitude value of each sample value. A second speech/audio signal is determined according to the symbol of each sample value and the adjusted amplitude value of each sample value.
Abstract: In one implementation, a computer-implemented method includes receiving, at a mobile computing device, ambiguous user input that indicates more than one of a plurality of commands; and determining a current context associated with the mobile computing device that indicates where the mobile computing device is currently located. The method can further include disambiguating the ambiguous user input by selecting a command from the plurality of commands based on the current context associated with the mobile computing device; and causing output associated with performance of the selected command to be provided by the mobile computing device.
Type:
Grant
Filed:
July 1, 2016
Date of Patent:
May 8, 2018
Assignee:
Google LLC
Inventors:
John Nicholas Jitkoff, Michael J. LeBeau
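The disambiguation step in the abstract above can be sketched as a simple context lookup: when the input matches several commands, pick the one whose context set contains the device's current context. The command table, contexts, and fallback rule are invented for illustration.

```python
# Hypothetical command table: which device contexts each command suits.
COMMANDS = {
    "directions": {"contexts": {"driving", "walking"}},
    "call":       {"contexts": {"home", "office"}},
}

def disambiguate(matching_commands, current_context):
    """Pick the command whose context set contains the current context."""
    for cmd in matching_commands:
        if current_context in COMMANDS[cmd]["contexts"]:
            return cmd
    return matching_commands[0]  # fall back to the first candidate

# "go to Mike's" could mean navigate there or phone Mike:
print(disambiguate(["directions", "call"], "driving"))  # directions
print(disambiguate(["directions", "call"], "office"))   # call
```

A production system would score candidates probabilistically against richer context (location, motion, time) instead of a first-match rule.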
Abstract: A method and a system for identity authentication are presented. In one example embodiment, audio data (e.g., a sound wave) may be received from a user. The audio data may be used to establish an identity of an entity to the user. The audio data may be stored at a storage location and presented to the user to establish the identity of the entity when the entity participates in an electronic communication with the user. In another example embodiment, a server (e.g., a web client or client application server) may present a plurality of audio files to a user and receive a user selection of selected audio data from the plurality of audio files; responsive to the user selection, the server may communicate, via a network, the selected audio data to another server. The selected audio data may be used for identity authentication.
Abstract: Embodiments are directed to a computer implemented counterproductive interaction identification system. The system includes an electronic tool configured to hold data of a user, and an analyzer circuit configured to derive a cognitive trait of the user based at least in part on the data of the user. The system further includes a decision engine configured to determine, based at least in part on the derived cognitive trait of the user, that the user is a target or a source of an actual or an impending counterproductive interaction.
Type:
Grant
Filed:
November 5, 2015
Date of Patent:
April 24, 2018
Assignee:
INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors:
Aaron K. Baughman, James R. Kozloski, Timothy M. Lynar, Suraj Pandey, John M. Wagner
Abstract: A hearing aid apparatus includes a frequency analysis device configured to determine an instantaneous fundamental frequency value of a speech signal for a time portion of the speech signal. A statistical evaluation device is configured to determine an average fundamental frequency value of the speech signal over several time portions. The hearing aid apparatus further includes a fundamental frequency modifier that is configured to modify the instantaneous fundamental frequency value to a modified fundamental frequency value such that a difference or a quotient of the instantaneous fundamental frequency value and the average fundamental frequency value is changed according to a specific function. Thereby, a frequency range may be modified within which the fundamental frequency value varies. The hearing aid apparatus further includes a speech signal generator that is configured to generate, on the basis of the modified fundamental frequency value, a speech signal modified with regard to the fundamental frequency.
Type:
Grant
Filed:
May 16, 2016
Date of Patent:
April 3, 2018
Assignee:
Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
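The range modification in the hearing aid abstract above can be sketched with the difference variant: scale each instantaneous F0's deviation from the running average F0 by a factor, compressing (or, for a factor above one, expanding) the range in which F0 varies. The factor value and frame numbers are invented examples.

```python
def modify_f0(inst_f0, avg_f0, factor=0.5):
    """Map inst_f0 so its difference from avg_f0 shrinks by `factor`."""
    return avg_f0 + factor * (inst_f0 - avg_f0)

frames = [180.0, 220.0, 200.0]       # instantaneous F0 per time portion, Hz
avg = sum(frames) / len(frames)      # statistical evaluation: 200 Hz
print([modify_f0(f, avg) for f in frames])  # [190.0, 210.0, 200.0]
```

With factor 0.5 the 40 Hz swing shrinks to 20 Hz around the same average, which is the "frequency range may be modified" effect the abstract names.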
Abstract: Method, system and product for automatic performance of user interaction operations on a computing device. A method comprising: obtaining an identifier of an operations sequence; obtaining the operations sequence by searching a repository of operations sequences using the identifier, wherein the repository of operation sequences comprises operations sequences defined based on a previous execution of one or more operations by another computing device other than the computing device on behalf of another user other than the user; and automatically executing the operations sequence or portion thereof on the computing device. Another method comprises: identifying elements in a layout of a GUI, displaying in visible proximity to each of the elements an assigned unique label; recognizing speech by a user vocally indicating a selected element by referring to the assigned label; and, automatically performing a user interaction operation on the selected element.
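The second method in the abstract above (labeled GUI elements selected by voice) can be sketched in a few lines: assign each element a short unique visible label, then resolve recognized speech naming a label back to its element. Element names and the label scheme are invented.

```python
def assign_labels(elements):
    """Give each GUI element a short unique label: e1, e2, ..."""
    return {f"e{i + 1}": el for i, el in enumerate(elements)}

def resolve(spoken_text, labels):
    """Return the element whose label the user spoke, if any."""
    for word in spoken_text.lower().split():
        if word in labels:
            return labels[word]
    return None

labels = assign_labels(["Search box", "Submit button", "Help link"])
print(resolve("tap e2", labels))  # Submit button
```

The resolved element would then receive the user interaction operation (click, focus, etc.) automatically, as the abstract describes.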