Patents Examined by Thuykhanh Le
  • Patent number: 10460727
    Abstract: Various systems and methods for multi-talker speech separation and recognition are disclosed herein. In one example, a system includes a memory and a processor to process mixed speech audio received from a microphone. In an example, the processor can also separate the mixed speech audio using permutation invariant training, wherein a criterion of the permutation invariant training is defined on an utterance of the mixed speech audio. In an example, the processor can also generate a plurality of separated streams for submission to a speech decoder.
    Type: Grant
    Filed: May 23, 2017
    Date of Patent: October 29, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: James Droppo, Xuedong Huang, Dong Yu
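    Illustrative sketch (not from the patent): a minimal numpy example of an utterance-level permutation-invariant training (PIT) criterion for two separated streams. The arrays, the MSE pairwise loss, and the toy data are assumptions for illustration, not the patent's actual model or criterion.
      # Minimal utterance-level permutation-invariant (PIT) criterion sketch.
      # Assumes two estimated streams and two reference streams, each an
      # (utterance_frames x feature_dim) array; uses MSE as the pairwise loss.
      import itertools
      import numpy as np

      def pit_loss(estimates, references):
          """Return the lowest utterance-level MSE over all speaker permutations."""
          n = len(estimates)
          best = float("inf")
          for perm in itertools.permutations(range(n)):
              # Score the whole utterance for this speaker-to-stream assignment.
              loss = sum(np.mean((estimates[i] - references[p]) ** 2)
                         for i, p in zip(range(n), perm))
              best = min(best, loss)
          return best / n

      # Toy usage: two hypothetical separated streams vs. two references.
      rng = np.random.default_rng(0)
      refs = [rng.normal(size=(100, 40)) for _ in range(2)]
      ests = [refs[1] + 0.1 * rng.normal(size=(100, 40)),   # streams come out swapped
              refs[0] + 0.1 * rng.normal(size=(100, 40))]
      print(pit_loss(ests, refs))   # small, because the best permutation is used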
  • Patent number: 10452786
    Abstract: In a flow of computer actions, a computer system (110) receives a request involving a machine translation. In performing the translation (160, 238), or in using the translation in subsequent computer operations (242, 1110), the computer system takes into account known statistical relationships (310), obtained from previously accumulated click-through data (180), between a machine translation performed in a flow, the flow's portions preceding the translation, and success indicators pertaining to the flow's portion following the translation. The statistical relationships are derived by data mining of the click-through data. Further, normal actions can be suspended to use a random option to accumulate the click-through data and/or perform statistical AB testing. Other features are also provided.
    Type: Grant
    Filed: December 29, 2014
    Date of Patent: October 22, 2019
    Assignee: PayPal, Inc.
    Inventor: Hassan Sawaf
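    Illustrative sketch (not from the patent): a hedged example of using accumulated click-through success counts to pick a translation option, with an occasional random choice so fresh click-through data keeps accruing (an epsilon-greedy style stand-in). The engine names, counters, and exploration rate are assumptions, not the patent's statistical method.
      # Sketch: choose a translation option from click-through "success" statistics,
      # occasionally suspending normal action to take a random option for data collection.
      import random

      clickthrough = {
          "engine_a": {"attempts": 120, "successes": 90},
          "engine_b": {"attempts": 80,  "successes": 70},
      }

      def choose_engine(stats, explore_rate=0.10):
          if random.random() < explore_rate:          # random option for A/B-style data
              return random.choice(list(stats))
          # Otherwise pick the option with the best observed success rate.
          return max(stats, key=lambda k: stats[k]["successes"] / stats[k]["attempts"])

      def record_outcome(stats, engine, success):
          stats[engine]["attempts"] += 1
          stats[engine]["successes"] += int(success)

      engine = choose_engine(clickthrough)
      record_outcome(clickthrough, engine, success=True)
      print(engine, clickthrough[engine])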
  • Patent number: 10438603
    Abstract: Provided are methods of decoding speech from the brain of a subject. The methods include detecting speech production signals from electrodes operably coupled to the speech motor cortex of a subject while the subject produces or imagines producing a speech sound. The methods further include deriving a speech production signal pattern from the detected speech production signals, and correlating the speech production signal pattern with a reference speech production signal pattern to decode speech from the brain of the subject. Speech communication systems and devices for practicing the subject methods are also provided.
    Type: Grant
    Filed: February 26, 2018
    Date of Patent: October 8, 2019
    Assignee: The Regents of the University of California
    Inventors: Edward F. Chang, Kristofer E. Bouchard
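    Illustrative sketch (not from the patent): a minimal example of correlating a detected signal pattern against stored reference patterns and returning the best-matching speech sound. The feature vectors, labels, and the use of Pearson correlation are assumptions for illustration only.
      # Sketch: match a detected speech-production signal pattern to reference patterns
      # by correlation and return the best-scoring speech-sound label.
      import numpy as np

      def decode_sound(pattern, references):
          """references: dict mapping speech-sound label -> reference feature vector."""
          scores = {label: np.corrcoef(pattern, ref)[0, 1]
                    for label, ref in references.items()}
          best = max(scores, key=scores.get)
          return best, scores[best]

      rng = np.random.default_rng(1)
      references = {"/a/": rng.normal(size=64),
                    "/i/": rng.normal(size=64),
                    "/u/": rng.normal(size=64)}
      detected = references["/i/"] + 0.2 * rng.normal(size=64)   # noisy version of /i/
      print(decode_sound(detected, references))                   # expected to pick "/i/"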
  • Patent number: 10424301
    Abstract: A dictation device includes: an audio input device configured to receive a voice utterance including a plurality of words; a video input device configured to receive video of lip motion during the voice utterance; a memory portion; a controller configured according to instructions in the memory portion to generate first data packets including an audio stream representative of the voice utterance and a video stream representative of the lip motion; and a transceiver for sending the first data packets to a server end device and receiving second data packets including combined dictation based upon the audio stream and the video stream from the server end device. In the combined dictation, first dictation generated based upon the audio stream has been corrected by second dictation generated based upon the video stream.
    Type: Grant
    Filed: September 10, 2018
    Date of Patent: September 24, 2019
    Assignee: Panasonic Intellectual Property Corporation of America
    Inventors: Yuichiro Takayanagi, Masashi Kusaka
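    Illustrative sketch (not from the patent): a small example of the kind of data packet the abstract describes, pairing an audio chunk with a lip-motion video chunk, plus a toy "combined dictation" step in which video-derived text corrects low-confidence audio-derived words. Field names and the word-level correction rule are assumptions.
      # Sketch: first data packets carry paired audio and lip-motion video chunks;
      # combined dictation lets the video-based result correct the audio-based result.
      from dataclasses import dataclass

      @dataclass
      class DictationPacket:
          sequence: int
          audio_chunk: bytes     # utterance audio
          video_chunk: bytes     # lip-motion video for the same interval

      def combine_dictation(audio_words, video_words, low_confidence):
          """Replace low-confidence audio words with the video (lip-reading) result."""
          combined = []
          for i, word in enumerate(audio_words):
              if i in low_confidence and i < len(video_words):
                  combined.append(video_words[i])
              else:
                  combined.append(word)
          return " ".join(combined)

      packet = DictationPacket(sequence=1, audio_chunk=b"...", video_chunk=b"...")
      print(combine_dictation(["please", "say", "the", "file"],
                              ["please", "save", "the", "file"],
                              low_confidence={1}))   # word 1 corrected from video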
  • Patent number: 10409550
    Abstract: A method and apparatus for providing voice command functionality to an interactive whiteboard appliance is provided. An interactive whiteboard appliance comprises: one or more processors; a non-transitory computer-readable medium having instructions embodied thereon, the instructions when executed by the one or more processors cause performance of: detecting, during execution of an annotation window on the interactive whiteboard appliance, a voice input received from a user; storing, in an audio packet, a recording of the voice input; transmitting the audio packet to a speech-to-text service; receiving, from the speech-to-text service, a command string comprising a transcription of the recording of the voice input; using voice mode command processing in a command processor, identifying, from the command string, an executable command that is executable by the interactive whiteboard appliance; causing the application of the interactive whiteboard appliance to execute the executable command.
    Type: Grant
    Filed: March 4, 2016
    Date of Patent: September 10, 2019
    Assignee: RICOH COMPANY, LTD.
    Inventors: Rathnakara Malatesha, Lana Wong, Hiroshi Kitada
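    Illustrative sketch (not from the patent): a hedged example of the control flow described above: record voice input, send it to a speech-to-text service, and map the returned command string to an executable command. The command table and the stubbed transcription call are assumptions, not the actual product API.
      # Sketch of the voice-command flow: voice input -> audio packet -> speech-to-text
      # -> command string -> executable whiteboard command. The transcription call is a
      # stub and the command table is illustrative only.
      COMMANDS = {
          "new page": lambda: print("whiteboard: created a new page"),
          "save": lambda: print("whiteboard: saved the annotation window"),
          "undo": lambda: print("whiteboard: undid the last stroke"),
      }

      def speech_to_text(audio_packet: bytes) -> str:
          # Stand-in for a call to a remote speech-to-text service.
          return "new page"

      def handle_voice_input(audio_packet: bytes):
          command_string = speech_to_text(audio_packet).strip().lower()
          action = COMMANDS.get(command_string)
          if action is None:
              print(f"no executable command for: {command_string!r}")
          else:
              action()

      handle_voice_input(b"...recorded voice input...")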
  • Patent number: 10403273
    Abstract: The present teaching relates to facilitating a guided dialog with a user. In one example, an input utterance is obtained from the user. One or more task sets are estimated based on the input utterance. Each of the one or more task sets includes a plurality of tasks estimated to be requested by the user via the input utterance and is associated with a confidence score computed based on statistics with respect to the plurality of tasks in the task set. At least one of the one or more task sets is selected based on their respective confidence scores. A response is generated based on the tasks in the selected at least one task set. The response is provided to the user.
    Type: Grant
    Filed: September 9, 2016
    Date of Patent: September 3, 2019
    Assignee: Oath Inc.
    Inventors: Sungjin Lee, Amanda Stent
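    Illustrative sketch (not from the patent): a toy example of scoring candidate task sets and selecting the highest-confidence one. The per-task probabilities and the product-based confidence score are assumptions standing in for the statistics the abstract mentions.
      # Sketch: estimate candidate task sets for an utterance, score each set from
      # simple per-task statistics, and respond with the tasks in the best set.
      from math import prod

      # Hypothetical probabilities that each task is requested by the utterance.
      task_probability = {"find_restaurant": 0.9, "book_table": 0.7, "get_directions": 0.4}

      candidate_task_sets = [
          {"find_restaurant"},
          {"find_restaurant", "book_table"},
          {"find_restaurant", "get_directions"},
      ]

      def confidence(task_set):
          return prod(task_probability[t] for t in task_set) * len(task_set)

      best = max(candidate_task_sets, key=confidence)
      print("selected tasks:", sorted(best), "confidence:", round(confidence(best), 2))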
  • Patent number: 10395639
    Abstract: A method for providing a context awareness service is provided. The method includes defining a control command for the context awareness service depending on a user input, triggering a playback mode and the context awareness service in response to a user selection, receiving external audio through a microphone in the playback mode, determining whether the received audio corresponds to the control command, and executing a particular action assigned to the control command when the received audio corresponds to the control command.
    Type: Grant
    Filed: February 21, 2018
    Date of Patent: August 27, 2019
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Jin Park, Jiyeon Jung
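    Illustrative sketch (not from the patent): a small example of the check described above: during playback, compare (hypothetically transcribed) external audio against user-defined control commands and run the assigned action on a match. The commands, actions, and the simple containment match are assumptions.
      # Sketch: user-defined control commands map to actions; external audio heard in
      # playback mode triggers the assigned action when it matches a control command.
      actions = {
          "pause music": lambda: print("action: playback paused"),
          "my name": lambda: print("action: raise volume and show notification"),
      }

      def on_external_audio(transcribed: str):
          text = transcribed.strip().lower()
          for command, action in actions.items():
              if command in text:       # simple containment check stands in for matching
                  action()
                  return True
          return False

      on_external_audio("Hey, can you pause music for a second?")  # triggers action
      on_external_audio("Unrelated chatter")                        # no action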
  • Patent number: 10379808
    Abstract: Methods, systems and apparatus for associating electronic devices together based on received audio commands are described. Methods for associating an audio-controlled device with a physically separate display screen device such that information responses can then be provided in both audio and graphic formats using the two devices in conjunction with each other are described. The audio-controlled device can receive audio commands that can be analyzed to determine the author, which can then be used to further streamline the association operation.
    Type: Grant
    Filed: September 29, 2015
    Date of Patent: August 13, 2019
    Assignee: Amazon Technologies, Inc.
    Inventors: Justin-Josef Angel, Eric Alan Breitbard, Sean Robert Ryan, Robert Steven Murdock, Michael Douglas McQueen, Ryan Charles Chase, Colin Neil Swann
  • Patent number: 10380262
    Abstract: Automatic semantic analysis for characterizing and correlating literary elements within a digital work of literature is accomplished by employing natural language processing and deep semantic analysis of text to create annotations for the literary elements found in a segment or in the entirety of the literature and to assign a weight to each literary element and its associated annotations, wherein the weight indicates an importance or relevance of a literary element to at least the segment of the work of literature; correlating and matching the literary elements to each other to establish one or more interrelationships; and producing an overall weight for the correlated matches.
    Type: Grant
    Filed: August 6, 2018
    Date of Patent: August 13, 2019
    Assignee: International Business Machines Corporation
    Inventors: Corville Orain Allen, Scott Robert Carrier, Eric Woods
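    Illustrative sketch (not from the patent): a toy example of weighting annotated literary elements, matching elements that share annotations, and producing an overall weight for each correlated match. The element list, annotation tags, and averaging rule are assumptions.
      # Sketch: annotate literary elements with weights (importance to a segment),
      # match elements that share annotations, and give each match an overall weight.
      elements = {
          "protagonist": {"weight": 0.9, "annotations": {"conflict", "journey"}},
          "storm":       {"weight": 0.6, "annotations": {"conflict", "setting"}},
          "mentor":      {"weight": 0.5, "annotations": {"journey"}},
      }

      def correlated_matches(elems):
          names = sorted(elems)
          for i, a in enumerate(names):
              for b in names[i + 1:]:
                  shared = elems[a]["annotations"] & elems[b]["annotations"]
                  if shared:
                      overall = (elems[a]["weight"] + elems[b]["weight"]) / 2
                      yield (a, b, sorted(shared), round(overall, 2))

      for match in correlated_matches(elements):
          print(match)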
  • Patent number: 10366419
    Abstract: A method includes, through a digital platform, encoding a digital media file related to a message from a publisher with decodable data, generating a modified digital media file therefrom, capturing, through a client application of a mobile device of a client user, the modified digital media file playing on a broadcasting device to generate capture data therefrom, and generating a response action of the client user based on analyzing the capture data. The method also includes associating the response action to the message of the publisher, automatically interpreting, through the client application, the modified digital media file to decode the decodable data therein, enabling initiation of the response action without interrupting an experience of concurrent sensing of media content through the broadcasting device by the client user, and providing a capability to the client user to control data thereof generated through the initiated response action.
    Type: Grant
    Filed: April 4, 2018
    Date of Patent: July 30, 2019
    Inventor: Roland Storti
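    Illustrative sketch (not from the patent): a hedged example of the client side described above: decode the data embedded in captured broadcast audio and map it to a response action tied to the publisher's message, without interrupting playback. The payload format and the action table are assumptions.
      # Sketch of the client-side flow: capture data -> decode embedded payload ->
      # initiate the response action associated with the publisher's message.
      response_actions = {
          "coupon-42": lambda: print("response action: open the publisher's coupon page"),
          "survey-7":  lambda: print("response action: open a one-tap survey"),
      }

      def decode_payload(capture_data: bytes) -> str:
          # Stand-in for decoding the data embedded in the modified media file.
          return capture_data.decode("utf-8", errors="ignore").removeprefix("msg:")

      def on_capture(capture_data: bytes):
          message_id = decode_payload(capture_data)
          action = response_actions.get(message_id)
          if action:
              action()          # initiated without interrupting the broadcast playback
          else:
              print("no response action associated with", message_id)

      on_capture(b"msg:coupon-42")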
  • Patent number: 10366707
    Abstract: Mechanisms in a natural language processing (NLP) system are provided. The NLP system receives a plurality of communications associated with a communication system, over a predetermined time period, from a plurality of end user devices. The NLP system identifies, for each communication in the plurality of communications, a user submitting the communication to thereby generate a set of users comprising a plurality of users associated with the plurality of communications. The NLP system retrieves a user model for each user in the set of users, which specifies at least one attribute of a corresponding user. The NLP system generates an aggregate user model that aggregates the at least one attribute of each user in the set of users together to generate an aggregate representation of the attributes of the plurality of users in the set of users. The NLP system performs a cognitive operation based on the aggregate user model.
    Type: Grant
    Filed: September 28, 2018
    Date of Patent: July 30, 2019
    Assignee: International Business Machines Corporation
    Inventors: Corville O. Allen, Laura J. Rodriguez
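    Illustrative sketch (not from the patent): a toy example of pooling one attribute from the user models of everyone who communicated in the time window into an aggregate user model, then keying a downstream operation off the dominant attribute. The attribute names and frequency-count aggregation are assumptions.
      # Sketch: build an aggregate user model by pooling an attribute ("tone") across
      # the users who sent communications during the window.
      from collections import Counter

      user_models = {
          "alice": {"tone": "formal"},
          "bob":   {"tone": "casual"},
          "carol": {"tone": "formal"},
      }
      communications = [("alice", "msg1"), ("bob", "msg2"), ("carol", "msg3"), ("alice", "msg4")]

      users_in_window = {user for user, _ in communications}
      aggregate_model = Counter(user_models[u]["tone"] for u in users_in_window)

      # A downstream "cognitive operation" could key off the dominant attribute.
      dominant_tone, _ = aggregate_model.most_common(1)[0]
      print(aggregate_model, "->", dominant_tone)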
  • Patent number: 10366163
    Abstract: Systems and methods for determining knowledge-guided information for a recurrent neural network (RNN) to guide the RNN in semantic tagging of an input phrase are presented. A knowledge encoding module of a Knowledge-Guided Structural Attention Process (K-SAP) receives an input phrase and, in conjunction with additional sub-components or cooperative components, generates a knowledge-guided vector that is provided with the input phrase to the RNN for linguistic semantic tagging. Generating the knowledge-guided vector comprises at least parsing the input phrase and generating a corresponding hierarchical linguistic structure comprising one or more discrete sub-structures. The sub-structures may be encoded into vectors along with attention weighting identifying those sub-structures that have greater importance in determining the semantic meaning of the input phrase.
    Type: Grant
    Filed: September 7, 2016
    Date of Patent: July 30, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yun-Nung Chen, Dilek Z. Hakkani-Tur, Gokhan Tur, Asli Celikyilmaz, Jianfeng Gao, Li Deng
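    Illustrative sketch (not from the patent): a numpy example of combining sub-structure vectors with attention weights into a single knowledge-guided vector that could accompany the input phrase into an RNN tagger. The toy parse, random encodings, and softmax weighting are assumptions, not the K-SAP implementation.
      # Sketch: encode parsed sub-structures of the input phrase as vectors, weight them
      # with softmax attention scores, and sum them into one knowledge-guided vector.
      import numpy as np

      def softmax(x):
          e = np.exp(x - np.max(x))
          return e / e.sum()

      # Hypothetical sub-structures from a dependency parse of the input phrase.
      substructures = ["show -> flights", "flights -> cheapest", "flights -> to boston"]
      rng = np.random.default_rng(2)
      substructure_vectors = rng.normal(size=(len(substructures), 8))   # toy encodings

      attention_scores = np.array([0.5, 2.0, 1.0])   # importance of each sub-structure
      attention_weights = softmax(attention_scores)

      knowledge_guided_vector = attention_weights @ substructure_vectors
      print(attention_weights.round(2), knowledge_guided_vector.shape)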
  • Patent number: 10360898
    Abstract: A system and method are presented for predicting speech recognition performance using accuracy scores in speech recognition systems within the speech analytics field. A keyword set is selected. Figure of Merit (FOM) is computed for the keyword set. Relevant features that describe the word individually and in relation to other words in the language are computed. A mapping from these features to FOM is learned. This mapping can be generalized via a suitable machine learning algorithm and be used to predict FOM for a new keyword. In at least one embodiment, the predicted FOM may be used to adjust internals of the speech recognition engine to achieve a consistent behavior for all inputs for various settings of confidence values.
    Type: Grant
    Filed: June 5, 2018
    Date of Patent: July 23, 2019
    Inventors: Aravind Ganapathiraju, Yingyi Tan, Felix Immanuel Wyss, Scott Allen Randal
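    Illustrative sketch (not from the patent): a small example of learning a mapping from keyword features to Figure of Merit and predicting FOM for an unseen keyword. A least-squares linear fit stands in for "a suitable machine learning algorithm"; the feature set and values are toy assumptions.
      # Sketch: learn a mapping from per-keyword features to Figure of Merit (FOM)
      # and predict FOM for a new keyword.
      import numpy as np

      # Features per training keyword: [phone_count, syllable_count, unigram_frequency]
      X = np.array([[6, 2, 0.002], [9, 3, 0.0005], [4, 1, 0.01], [11, 4, 0.0001]], float)
      fom = np.array([0.72, 0.85, 0.55, 0.91])   # measured FOM for those keywords

      # Fit FOM ~ w . features + b via least squares.
      A = np.hstack([X, np.ones((len(X), 1))])
      coeffs, *_ = np.linalg.lstsq(A, fom, rcond=None)

      new_keyword_features = np.array([8, 3, 0.001, 1.0])  # trailing 1.0 pairs with the intercept
      print("predicted FOM:", float(new_keyword_features @ coeffs))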
  • Patent number: 10347273
    Abstract: A speech processing apparatus includes: an expectation value calculation unit configured to calculate, using an input signal spectrum and a speech model that models a feature quantity of speech, a spectrum expectation value which is an expectation value of a spectrum of an acoustic component included in the input signal spectrum; and an acoustic power estimation unit configured to estimate an acoustic power of the acoustic component of the input signal spectrum based on the input signal spectrum and the spectrum expectation value.
    Type: Grant
    Filed: December 8, 2015
    Date of Patent: July 9, 2019
    Assignee: NEC CORPORATION
    Inventors: Shuji Komeiji, Masanori Tsujikawa, Ryosuke Isotani
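    Illustrative sketch (not from the patent): a numpy example of the two stages named in the abstract: form an expectation of the speech component of an input spectrum under a simple speech model, then estimate the acoustic power from the input spectrum and that expectation. The template-based "speech model" and the residual-power rule are assumptions.
      # Sketch: (1) use a toy speech model (template spectra with priors) to form the
      # expected speech spectrum, then (2) estimate acoustic power from the residual.
      import numpy as np

      def expected_speech_spectrum(input_spec, templates, priors):
          # Weight each template by prior x similarity to the input, then average.
          sims = np.array([float(np.dot(input_spec, t) /
                                 (np.linalg.norm(input_spec) * np.linalg.norm(t)))
                           for t in templates])
          weights = priors * sims
          weights = weights / weights.sum()
          return weights @ templates

      def acoustic_power(input_spec, expected_spec):
          residual = np.clip(input_spec - expected_spec, 0.0, None)
          return float(residual.sum())

      rng = np.random.default_rng(3)
      templates = np.abs(rng.normal(size=(4, 16)))     # toy speech-model spectra
      priors = np.full(4, 0.25)
      input_spec = templates[0] + 0.3                   # speech plus a flat acoustic floor

      expectation = expected_speech_spectrum(input_spec, templates, priors)
      print("estimated acoustic power:", round(acoustic_power(input_spec, expectation), 3))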
  • Patent number: 10339958
    Abstract: Detecting and monitoring legacy devices (such as appliances in a home) using audio sensing is disclosed. Methods and systems are provided for transforming audio data captured by the sensor to afford privacy when speech is overheard by the sensor. Because these transformations may negatively impact the ability to detect/monitor devices, an effective transformation is determined based on both privacy and detectability concerns.
    Type: Grant
    Filed: September 9, 2016
    Date of Patent: July 2, 2019
    Assignee: ARRIS Enterprises LLC
    Inventors: Anthony J. Braskich, Venugopal Vasudevan
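    Illustrative sketch (not from the patent): a toy example of selecting an audio transformation by trading off a privacy score (how much overheard speech is obscured) against a detectability score (how well appliance signatures survive). The candidate transforms, scores, and weighting are assumptions.
      # Sketch: choose the audio-feature transformation that best balances privacy
      # against device detectability.
      candidates = [
          {"name": "raw_audio",         "privacy": 0.05, "detectability": 0.99},
          {"name": "band_energy_1s",    "privacy": 0.80, "detectability": 0.90},
          {"name": "band_energy_10s",   "privacy": 0.95, "detectability": 0.70},
          {"name": "single_energy_60s", "privacy": 0.99, "detectability": 0.35},
      ]

      def score(t, privacy_weight=0.5):
          return privacy_weight * t["privacy"] + (1 - privacy_weight) * t["detectability"]

      best = max(candidates, key=score)
      print("selected transformation:", best["name"], "score:", round(score(best), 3))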
  • Patent number: 10339936
    Abstract: Disclosed are a method, a device, and a system for generating a modified digital media file based on an encoding of a digital media file with decodable data such that the decodable data is indistinguishable by the human ear from a primary audio stream.
    Type: Grant
    Filed: August 4, 2017
    Date of Patent: July 2, 2019
    Inventor: Roland Storti
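    Illustrative sketch (not from the patent): a numpy example of one way to embed decodable bits into an audio stream at a level far below the primary audio, here by on-off keying a very quiet high-frequency carrier. The carrier frequency, bit rate, and amplitude are assumptions, not the patent's encoding.
      # Sketch: embed decodable bits by adding a low-amplitude high-frequency carrier
      # keyed by the bit values, so the added data sits far below the primary audio.
      import numpy as np

      def embed(audio, bits, rate=44100, carrier_hz=18000, bit_len=2048, amp=0.002):
          out = audio.copy()
          t = np.arange(bit_len) / rate
          tone = np.sin(2 * np.pi * carrier_hz * t)
          for i, bit in enumerate(bits):
              start = i * bit_len
              if start + bit_len > len(out):
                  break
              if bit:                      # only "1" bits carry the tone (on-off keying)
                  out[start:start + bit_len] += amp * tone
          return out

      rng = np.random.default_rng(6)
      primary_audio = 0.3 * rng.normal(size=44100)          # one second of toy audio
      modified = embed(primary_audio, bits=[1, 0, 1, 1, 0])
      print("max added amplitude:", float(np.max(np.abs(modified - primary_audio))))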
  • Patent number: 10339941
    Abstract: The invention provides a decoder configured for processing an encoded audio bitstream, wherein the decoder includes: a bitstream decoder configured to derive a decoded audio signal from the bitstream, wherein the decoded audio signal includes at least one decoded frame; a noise estimation device configured to produce a noise estimation signal containing an estimation of the level and/or the spectral shape of a noise in the decoded audio signal; a comfort noise generating device configured to derive a comfort noise signal from the noise estimation signal; and a combiner configured to combine the decoded frame of the decoded audio signal and the comfort noise signal in order to obtain an audio output signal.
    Type: Grant
    Filed: August 2, 2018
    Date of Patent: July 2, 2019
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Guillaume Fuchs, Anthony Lombard, Emmanuel Ravelli, Stefan Doehla, Jérémie Lecomte, Martin Dietz
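    Illustrative sketch (not from the patent): a numpy example of the decoder-side chain the abstract describes: estimate the background-noise spectral shape from decoded frames, synthesize comfort noise with that shape, and combine it with a decoded frame. The frame sizes, the quiet-frame estimation rule, and the mixing gain are assumptions.
      # Sketch: estimate the noise spectral shape, generate comfort noise with that
      # shape, and add it to a decoded frame to form the audio output signal.
      import numpy as np

      def estimate_noise_shape(decoded_frames):
          # Use the quietest frames as a crude background-noise estimate (per-bin power).
          spectra = np.abs(np.fft.rfft(decoded_frames, axis=1)) ** 2
          energies = spectra.sum(axis=1)
          quiet = spectra[energies <= np.percentile(energies, 20)]
          return quiet.mean(axis=0)

      def comfort_noise(shape, frame_len, rng):
          phase = rng.uniform(0, 2 * np.pi, size=shape.shape)
          spectrum = np.sqrt(shape) * np.exp(1j * phase)
          return np.fft.irfft(spectrum, n=frame_len)

      rng = np.random.default_rng(4)
      frames = 0.01 * rng.normal(size=(50, 256))           # toy decoded audio frames
      shape = estimate_noise_shape(frames)
      output_frame = frames[-1] + 0.5 * comfort_noise(shape, 256, rng)
      print(output_frame.shape)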
  • Patent number: 10332519
    Abstract: An apparatus including circuitry configured to determine a position of a mouth of a user that is distinguishable among a plurality of people, and control an acquisition condition for collecting a sound based on the determined position of the user's mouth.
    Type: Grant
    Filed: March 9, 2016
    Date of Patent: June 25, 2019
    Assignee: SONY CORPORATION
    Inventors: Kiyoshi Yoshikawa, Atsushi Okubo, Ken Miyashita
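    Illustrative sketch (not from the patent): a numpy example of adjusting a sound-collection condition from the determined mouth position, here by steering a two-microphone delay-and-sum beam toward that position. The microphone spacing, sample rate, and far-field geometry are assumptions.
      # Sketch: steer a two-microphone delay-and-sum beam toward the user's mouth.
      import numpy as np

      SPEED_OF_SOUND = 343.0   # m/s
      MIC_SPACING = 0.10       # m between the two microphones
      SAMPLE_RATE = 16000

      def steering_delay_samples(mouth_angle_rad):
          # Far-field approximation: inter-mic delay depends on the arrival angle.
          delay_seconds = MIC_SPACING * np.sin(mouth_angle_rad) / SPEED_OF_SOUND
          return int(round(delay_seconds * SAMPLE_RATE))

      def delay_and_sum(mic0, mic1, mouth_angle_rad):
          d = steering_delay_samples(mouth_angle_rad)
          aligned = np.roll(mic1, -d)       # time-align mic1 toward the mouth direction
          return 0.5 * (mic0 + aligned)

      rng = np.random.default_rng(7)
      mic0 = rng.normal(size=1600)
      mic1 = np.roll(mic0, steering_delay_samples(np.deg2rad(30)))   # simulated arrival delay
      enhanced = delay_and_sum(mic0, mic1, mouth_angle_rad=np.deg2rad(30))
      print(np.allclose(enhanced, mic0))    # True: signals add coherently when steered right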
  • Patent number: 10332546
    Abstract: There are disclosed devices, systems and methods for desired signal spotting in noisy, flawed environments by identifying a signal to be spotted, identifying a target confidence level, and then passing a pool of cabined arrays through a comparator to detect the identified signal, wherein the cabined arrays are derived from respective distinct environments. The arrays may include plural converted samples, each converted sample including a product of a conversion of a respective original sample, the conversion including filtering noise and transforming the original sample from a first form to a second form. Detecting may include measuring a confidence of the presence of the identified signal in each of plural converted samples using correlation of the identified signal to bodies of known matching samples. If the confidence for a given converted sample satisfies the target confidence level, the given sample is flagged.
    Type: Grant
    Filed: March 22, 2019
    Date of Patent: June 25, 2019
    Assignee: Invoca, Inc.
    Inventors: Sean Michael Storlie, Victor Jara Borda, Michael Kingsley McCourt, Jr., Leland W. Kirchhoff, Colin Denison Kelley, Nicholas James Burwell
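    Illustrative sketch (not from the patent): a numpy example of passing pools of converted samples from distinct environments through a comparator, measuring a correlation-based confidence per sample, and flagging samples that satisfy the target confidence level. The normalized-correlation measure and the threshold are assumptions.
      # Sketch: score confidence that the identified signal is present in each sample
      # of each environment's pool, and flag samples meeting the target confidence.
      import numpy as np

      def confidence(sample, identified_signal):
          n = len(identified_signal)
          sig = identified_signal / np.linalg.norm(identified_signal)
          best = 0.0
          for start in range(len(sample) - n + 1):
              window = sample[start:start + n]
              denom = np.linalg.norm(window) + 1e-12
              best = max(best, abs(float(np.dot(window, sig))) / denom)
          return best

      rng = np.random.default_rng(5)
      identified_signal = np.sin(np.linspace(0, 20, 64))
      pools = {
          "call_center_a": [rng.normal(size=256) for _ in range(3)],
          "call_center_b": [np.concatenate([rng.normal(size=96), identified_signal,
                                            rng.normal(size=96)])],
      }

      target_confidence = 0.6
      for environment, samples in pools.items():
          for i, sample in enumerate(samples):
              if confidence(sample, identified_signal) >= target_confidence:
                  print(f"flagged: {environment} sample {i}")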
  • Patent number: 10325598
    Abstract: Power consumption for a computing device may be managed by one or more keywords. For example, if an audio input obtained by the computing device includes a keyword, a network interface module and/or an application processing module of the computing device may be activated. The audio input may then be transmitted via the network interface module to a remote computing device, such as a speech recognition server. Alternatively, the computing device may be provided with a speech recognition engine configured to process the audio input for on-device speech recognition.
    Type: Grant
    Filed: July 10, 2017
    Date of Patent: June 18, 2019
    Assignee: Amazon Technologies, Inc.
    Inventors: Kenneth John Basye, Hugh Evan Secker-Walker, Tony David, Reinhard Kneser, Jeffrey Penrod Adams, Stan Weidner Salvador, Mahesh Krishnamoorthy
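    Illustrative sketch (not from the patent): a hedged example of the power-management flow described above: idle with the network interface off, activate it only when the keyword appears in the audio input, then forward the audio to a remote recognizer or fall back to on-device recognition. The class, module names, and string-based keyword check are assumptions.
      # Sketch of keyword-gated power management for a computing device.
      class Device:
          def __init__(self, keyword="alexa", has_local_engine=False):
              self.keyword = keyword
              self.has_local_engine = has_local_engine
              self.network_active = False

          def detect_keyword(self, transcribed_audio: str) -> bool:
              return self.keyword in transcribed_audio.lower()

          def on_audio(self, transcribed_audio: str, raw_audio: bytes):
              if not self.detect_keyword(transcribed_audio):
                  return "stay in low-power state"
              self.network_active = True                  # wake the network interface module
              if self.has_local_engine:
                  return "recognize on device"
              return f"transmit {len(raw_audio)} bytes to remote speech recognition server"

      device = Device()
      print(device.on_audio("background chatter", b"\x00" * 320))
      print(device.on_audio("Alexa, what's the weather?", b"\x00" * 320))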