Patents Examined by Farzad Kazeminezhad
  • Patent number: 11393454
    Abstract: A dialog generator receives data corresponding to desired dialog, such as application programming interface (API) information and sample dialog. A first model corresponding to an agent simulator and a second model corresponding to a user simulator take turns creating a plurality of dialog outlines of the desired dialog. The dialog generator may determine that one or more additional APIs are relevant to the dialog and may create further dialog outlines related thereto. The dialog outlines are converted to natural dialog to generate the dialog.
    Type: Grant
    Filed: December 13, 2018
    Date of Patent: July 19, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Anish Acharya, Angeliki Metallinou, Tagyoung Chung, Shachi Paul, Shubhra Chandra, Chien-wei Lin, Dilek Hakkani-Tur, Arindam Mandal
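As a rough picture of the outline-generation loop described in this abstract, the sketch below alternates a toy agent simulator and user simulator until the slots of a hypothetical API are filled; the API_INFO structure, dialog acts, and simulator heuristics are all invented for illustration, and the final outline-to-natural-dialog step is omitted.

```python
# Hypothetical API metadata of the kind the dialog generator might receive.
API_INFO = {"name": "BookFlight", "slots": ["origin", "destination", "date"]}


def user_turn(state):
    """User simulator: supply a value for the next unfilled slot."""
    for slot in API_INFO["slots"]:
        if slot not in state:
            return {"speaker": "user", "act": "inform", "slot": slot}
    return {"speaker": "user", "act": "confirm"}


def agent_turn(state):
    """Agent simulator: request missing slots, then issue the API call."""
    missing = [s for s in API_INFO["slots"] if s not in state]
    if missing:
        return {"speaker": "agent", "act": "request", "slot": missing[0]}
    return {"speaker": "agent", "act": "api_call", "api": API_INFO["name"]}


def generate_outline(max_turns=10):
    """The two simulators take turns to produce one dialog outline."""
    state, outline = {}, []
    for _ in range(max_turns):
        agent = agent_turn(state)
        outline.append(agent)
        if agent["act"] == "api_call":
            break                       # goal reached; outline is complete
        user = user_turn(state)
        outline.append(user)
        if user["act"] == "inform":
            state[user["slot"]] = f"<{user['slot']}>"  # placeholder value
    return outline                      # later converted to natural dialog


if __name__ == "__main__":
    for turn in generate_outline():
        print(turn)
```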
  • Patent number: 11386906
    Abstract: There is provided an error concealment unit, method, and computer program, for providing an error concealment audio information for concealing a loss of an audio frame in an encoded audio information. In one embodiment, the error concealment unit provides an error concealment audio information for a lost audio frame on the basis of a properly decoded audio frame preceding the lost audio frame. The error concealment unit derives a damping factor on the basis of characteristics of a decoded representation of the properly decoded audio frame preceding the lost audio frame. The error concealment unit performs a fade out using the damping factor.
    Type: Grant
    Filed: August 28, 2020
    Date of Patent: July 12, 2022
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Jérémie Lecomte, Adrian Tomasek
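A minimal sketch of the damping-and-fade-out idea, assuming an RMS-energy heuristic for deriving the damping factor (the patent derives it from richer characteristics of the decoded frame):

```python
import numpy as np


def conceal_lost_frame(last_good_frame, base_damping=0.8):
    """Build a concealment frame from the last properly decoded frame.

    The damping factor is derived here from a single characteristic of that
    frame (its RMS energy): quieter frames fade out faster in this toy rule.
    """
    rms = np.sqrt(np.mean(last_good_frame ** 2) + 1e-12)
    damping = base_damping * min(1.0, rms / 0.1)
    # Fade the repeated frame from full level down to the damping factor.
    fade = np.linspace(1.0, damping, num=len(last_good_frame))
    return last_good_frame * fade


if __name__ == "__main__":
    frame = 0.2 * np.sin(2 * np.pi * 440 * np.arange(480) / 16000)
    print(conceal_lost_frame(frame)[:5])
```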
  • Patent number: 11380315
    Abstract: One embodiment of the present invention sets forth a technique for analyzing a transcription of a recording. The technique includes generating features representing transcriptions produced by multiple automatic speech recognition (ASR) engines from voice activity in the recording and a best transcription of the recording produced by an ensemble model from the transcriptions. The technique also includes applying a machine learning model to the features to produce a score representing an accuracy of the best transcription. The technique further includes storing the score in association with the best transcription.
    Type: Grant
    Filed: March 9, 2019
    Date of Patent: July 5, 2022
    Assignee: CISCO TECHNOLOGY, INC.
    Inventors: Ahmad Abdulkader, Mohamed Gamal Mohamed Mahmoud
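To make the scoring step concrete, the toy code below derives agreement features from the per-engine transcriptions and the ensemble's best transcription and maps them to a 0-1 accuracy score; the feature set and the linear weights stand in for the trained machine learning model and are not from the patent.

```python
from difflib import SequenceMatcher


def features(engine_transcripts, best_transcript):
    """Simple agreement features between each ASR output and the best one."""
    sims = [SequenceMatcher(None, t, best_transcript).ratio()
            for t in engine_transcripts]
    return {
        "mean_agreement": sum(sims) / len(sims),
        "min_agreement": min(sims),
        "length": len(best_transcript.split()),
    }


def score(feats, weights=(0.7, 0.25, 0.005), bias=0.0):
    """Placeholder linear scorer standing in for the trained ML model."""
    raw = (weights[0] * feats["mean_agreement"]
           + weights[1] * feats["min_agreement"]
           + weights[2] * feats["length"] + bias)
    return max(0.0, min(1.0, raw))


if __name__ == "__main__":
    engines = ["please call the meeting to order",
               "please call the meeting to order",
               "police call the meeting to order"]
    best = "please call the meeting to order"
    print(score(features(engines, best)))
```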
  • Patent number: 11373646
    Abstract: A household appliance control method, device and system, and an intelligent air conditioner are provided. The method includes: determining a sound source location by means of a camera; picking up voice information of a user according to the sound source location; and performing at least one control operation on a household appliance according to the voice information. The voice information matches at least one corresponding control instruction. With this method, the location of at least one user can be detected by the camera, and the audio signal picked up at that user's location can thereby be enhanced.
    Type: Grant
    Filed: February 14, 2017
    Date of Patent: June 28, 2022
    Assignee: GREE ELECTRIC APPLIANCES, INC. OF ZHUHAI
    Inventors: Guangyou Liu, Wencheng Zheng, Yuehui Mao, Zi Wang, Yao Chen, Bo Liang
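As an illustration of camera-guided pickup, the sketch below steers a delay-and-sum beamformer toward the direction reported by the camera; the linear-array geometry, integer-sample delays, and parameter values are simplifying assumptions, not the patented method.

```python
import numpy as np


def steer_and_sum(mic_signals, angle_deg, mic_spacing=0.05, fs=16000, c=343.0):
    """Delay-and-sum beamforming toward a camera-derived direction.

    mic_signals: array of shape (num_mics, num_samples) from a linear array.
    angle_deg: direction of the user reported by the camera (0 = broadside).
    Integer-sample delays via np.roll keep the sketch short; a real system
    would use fractional delays.
    """
    num_mics = mic_signals.shape[0]
    delays = (np.arange(num_mics) * mic_spacing
              * np.sin(np.deg2rad(angle_deg)) / c)           # seconds
    shifts = np.round(delays * fs).astype(int)               # samples
    aligned = [np.roll(sig, -shift) for sig, shift in zip(mic_signals, shifts)]
    return np.mean(aligned, axis=0)


if __name__ == "__main__":
    t = np.arange(1600) / 16000
    mics = np.stack([np.sin(2 * np.pi * 300 * t) for _ in range(4)])
    enhanced = steer_and_sum(mics, angle_deg=30)
    print(enhanced.shape)
```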
  • Patent number: 11355106
    Abstract: An information processing apparatus includes circuitry configured to acquire audio information to be used for operating a target apparatus, recognize the audio information, obtain specific instruction information indicating specific information processing to be instructed to the target apparatus based on a recognition result of the acquired audio information, convert the specific instruction information into specific operation execution information described in an information format interpretable by the target apparatus, and output the specific operation execution information to the target apparatus.
    Type: Grant
    Filed: March 13, 2019
    Date of Patent: June 7, 2022
    Assignee: RICOH COMPANY, LTD.
    Inventor: Yutaka Nakamura
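The pipeline from recognized audio to an apparatus-interpretable operation can be sketched as a short chain of conversions; the intent keywords and the JSON output format below are illustrative assumptions.

```python
import json

# Hypothetical mapping from recognized phrases to instruction information.
INTENTS = {
    "print": {"action": "PRINT", "params": {"copies": 1}},
    "scan":  {"action": "SCAN",  "params": {"color": True}},
}


def recognize(audio_text):
    """Stand-in for the speech recognizer: here the 'audio' is already text."""
    return audio_text.lower()


def to_instruction(text):
    """Obtain specific instruction information from the recognition result."""
    for keyword, instruction in INTENTS.items():
        if keyword in text:
            return instruction
    return None


def to_operation_execution_info(instruction):
    """Convert the instruction into a format the target apparatus can parse
    (JSON is an illustrative choice, not mandated by the patent)."""
    return json.dumps({"command": instruction["action"],
                       "arguments": instruction["params"]})


if __name__ == "__main__":
    text = recognize("Please print this document")
    op = to_operation_execution_info(to_instruction(text))
    print(op)  # output sent to the target apparatus
```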
  • Patent number: 11334719
    Abstract: Systems, methods and computer program products are presented for a named entity recognition (NER) engine. The NER Engine initiates extraction of named entities from a document and identifies one or more required parameters that correspond to a document outline type classification of the document. The NER Engine applies a named entity recognition model to the extracted named entities to predict respective mappings between the extracted named entities and the one or more required parameters. The mapping depends on a Previous Number of Words model, based on a predefined number of words that appear before a named entity; a model based on the named entity being included in a document sentence; and a model that depends on the position of the named entity in the document. The NER Engine generates a user interface for display of the predicted respective mappings.
    Type: Grant
    Filed: November 9, 2020
    Date of Patent: May 17, 2022
    Assignee: The Abstract Operations Company
    Inventor: Bhavesh Kakadiya
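A toy rendering of how the three model signals named in the abstract might be combined to map extracted entities to required parameters; the cue lists, stub scorers, and combination weights are invented for illustration.

```python
def previous_words_score(entity, parameter, text, n=3):
    """Score from the words immediately preceding the entity (toy heuristic)."""
    idx = text.find(entity["text"])
    window = text[:idx].split()[-n:]
    cues = {"invoice_date": {"dated", "date"}, "vendor": {"from", "by"}}
    return 1.0 if cues.get(parameter, set()) & {w.lower() for w in window} else 0.0


def sentence_score(entity, parameter):
    """Score based on the sentence containing the entity (stubbed)."""
    return 0.5  # placeholder


def position_score(entity, parameter, doc_length):
    """Entities near the top of the document score higher (toy heuristic)."""
    return 1.0 - entity["offset"] / max(doc_length, 1)


def map_entities(entities, required_parameters, text):
    """Predict entity -> required-parameter mappings by combining the models."""
    mapping = {}
    for param in required_parameters:
        scored = [(0.5 * previous_words_score(e, param, text)
                   + 0.3 * sentence_score(e, param)
                   + 0.2 * position_score(e, param, len(text)), e)
                  for e in entities]
        mapping[param] = max(scored, key=lambda pair: pair[0])[1]["text"]
    return mapping


if __name__ == "__main__":
    text = "Invoice from Acme Corp, dated 2020-11-09."
    entities = [{"text": "Acme Corp", "offset": 13},
                {"text": "2020-11-09", "offset": 30}]
    print(map_entities(entities, ["vendor", "invoice_date"], text))
```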
  • Patent number: 11322172
    Abstract: Computer-generated feedback on whether user speech input meets subjective criteria is provided through the evaluation of multiple speaking traits. Initially, discrete instances of various speaking traits are detected within the user speech input; such traits include vocal fry, tag questions, uptalk, filler sounds and hedge words. Audio constructs indicative of individual instances of speaking traits are isolated and identified from appropriate samples. Speaking trait detectors then utilize such audio constructs to identify individual instances of speaking traits within the spoken input. The resulting quantities are scored against predetermined threshold quantities. The individual speaking trait scores are then amalgamated using a weighting derived from empirical relationships between those speaking traits and the criteria against which the user's speech input is being evaluated.
    Type: Grant
    Filed: June 1, 2017
    Date of Patent: May 3, 2022
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Oscar Roberto Morales Garrido, Paul Thackray, Kristen Kennedy
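The counting, thresholding, and weighted-amalgamation steps can be illustrated on a transcript (the patent detects the traits from audio constructs); the lexicons, thresholds, and weights below are placeholders.

```python
# Illustrative lexicons and thresholds; the patent works on audio constructs,
# whereas this sketch simply counts phrases in a transcript.
FILLERS = {"um", "uh", "like", "you know"}
HEDGES = {"maybe", "sort of", "kind of", "i think", "probably"}
THRESHOLDS = {"filler": 3, "hedge": 2}      # counts above this score poorly
WEIGHTS = {"filler": 0.6, "hedge": 0.4}     # empirical weighting (placeholder)


def count_instances(transcript, phrases):
    text = " " + transcript.lower() + " "
    return sum(text.count(" " + p + " ") for p in phrases)


def trait_score(count, threshold):
    """1.0 when at or under the threshold, decaying toward 0 above it."""
    return 1.0 if count <= threshold else threshold / count


def speech_feedback_score(transcript):
    counts = {"filler": count_instances(transcript, FILLERS),
              "hedge": count_instances(transcript, HEDGES)}
    return sum(WEIGHTS[t] * trait_score(c, THRESHOLDS[t])
               for t, c in counts.items())


if __name__ == "__main__":
    sample = "Um I think the results are kind of promising you know"
    print(round(speech_feedback_score(sample), 3))
```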
  • Patent number: 11315556
    Abstract: Systems and methods for distributed voice processing are disclosed herein. In one example, the method includes detecting sound via a microphone array of a first playback device and analyzing, via a first wake-word engine of the first playback device, the detected sound. The first playback device may transmit data associated with the detected sound to a second playback device over a local area network. A second wake-word engine of the second playback device may analyze the transmitted data associated with the detected sound. The method may further include identifying that the detected sound contains either a first wake word or a second wake word based on the analysis via the first and second wake-word engines, respectively. Based on the identification, sound data corresponding to the detected sound may be transmitted over a wide area network to a remote computing device associated with a particular voice assistant service.
    Type: Grant
    Filed: February 8, 2019
    Date of Patent: April 26, 2022
    Assignee: Sonos, Inc.
    Inventors: Connor Kristopher Smith, John Tolomei, Betty Lee
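A schematic of the routing logic: one engine per wake word, and whichever fires determines the voice assistant service that receives the sound data. The wake words, service names, and text-based "detector" are stand-ins for the audio processing described in the patent.

```python
# Hypothetical wake words and the voice services they route to.
WAKE_WORD_SERVICES = {"hey sonos": "VAS_A", "alexa": "VAS_B"}


def wake_word_engine(sound_text, wake_word):
    """Stand-in detector: real engines analyze audio, not text."""
    return wake_word in sound_text.lower()


def handle_detected_sound(sound_text):
    """First device checks one wake word locally; the second device checks the
    other over the LAN (simulated here as a plain function call)."""
    first_device_word, second_device_word = "hey sonos", "alexa"

    if wake_word_engine(sound_text, first_device_word):
        detected = first_device_word
    elif wake_word_engine(sound_text, second_device_word):   # "second device"
        detected = second_device_word
    else:
        return None

    service = WAKE_WORD_SERVICES[detected]
    # In the patent, the sound data would now be sent over a WAN to the
    # remote computing device associated with this voice assistant service.
    return service


if __name__ == "__main__":
    print(handle_detected_sound("Alexa, what's the weather?"))  # -> VAS_B
```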
  • Patent number: 11308949
    Abstract: A method, computer system, and a computer program product for voice assistant responses is provided. The present invention may include configuring a behavioral rule. The present invention may include receiving a verbal request from a user and comparing the received verbal request to the behavioral rule. The present invention may then include determining that the received verbal request does not comply with the behavioral rule. The present invention may lastly include providing a response to the user based on determining that the received verbal request does not comply with the behavioral rule.
    Type: Grant
    Filed: March 12, 2019
    Date of Patent: April 19, 2022
    Assignee: International Business Machines Corporation
    Inventors: Lisa Seacat DeLuca, Jeremy R. Fox, Kelley Anders
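The rule-checking flow reduces to a few lines; the politeness rule requiring "please" below is only an example of a configurable behavioral rule, and the canned responses are invented.

```python
def complies(request, rule):
    """Check a verbal request (already transcribed) against a behavioral rule."""
    return rule["required_phrase"] in request.lower()


def respond(request, rule):
    if complies(request, rule):
        return "OK, turning on the lights."          # normal fulfillment
    return rule["reminder"]                          # corrective response


if __name__ == "__main__":
    politeness_rule = {"required_phrase": "please",
                       "reminder": "Remember to say please when you ask."}
    print(respond("Turn on the lights", politeness_rule))
    print(respond("Please turn on the lights", politeness_rule))
```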
  • Patent number: 11302326
    Abstract: Example techniques involve systems with multiple acoustic echo cancellers. An example implementation captures first audio within an acoustic environment and detects, within the captured first audio, a wake-word. In response to the wake-word and before playing an acknowledgement tone, the implementation activates (a) a first sound canceller when one or more speakers are playing back audio content or (b) a second sound canceller when the one or more speakers are idle. In response to the wake-word and after activating either (a) the first sound canceller or (b) the second sound canceller, the implementation outputs the acknowledgement tone via the one or more speakers. The implementation then captures second audio within the acoustic environment and cancels the acoustic echo of the acknowledgement tone from the captured second audio using the activated sound canceller.
    Type: Grant
    Filed: April 10, 2020
    Date of Patent: April 12, 2022
    Assignee: Sonos, Inc.
    Inventor: Saeed Bagheri Sereshki
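The canceller-selection logic amounts to checking the playback state before the acknowledgement tone is emitted. A schematic sketch with stubbed canceller objects; the actual echo-cancellation filtering is omitted.

```python
class SoundCanceller:
    def __init__(self, name):
        self.name = name

    def cancel(self, captured_audio, reference):
        # Real AEC would subtract an adaptive-filter estimate of the echo.
        return f"{captured_audio} minus echo of {reference} via {self.name}"


FULL_AEC = SoundCanceller("multi-channel AEC")   # for active playback
IDLE_AEC = SoundCanceller("lightweight AEC")     # for idle speakers


def on_wake_word(speakers_playing, play_tone, capture_audio):
    # 1. Activate the appropriate canceller *before* playing the ack tone.
    canceller = FULL_AEC if speakers_playing else IDLE_AEC
    # 2. Output the acknowledgement tone.
    play_tone("ack")
    # 3. Capture the follow-up audio and cancel the tone's echo from it.
    return canceller.cancel(capture_audio(), reference="ack")


if __name__ == "__main__":
    print(on_wake_word(speakers_playing=False,
                       play_tone=lambda tone: None,
                       capture_audio=lambda: "voice command"))
```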
  • Patent number: 11289066
    Abstract: A voice synthesis method includes: sequentially acquiring voice units each comprising at least one of a diphone or a triphone in accordance with synthesis information for synthesizing voices; generating statistical spectral envelopes using a statistical model built by machine learning in accordance with the synthesis information; and concatenating the sequentially acquired voice units and modifying the frequency spectral envelope of each voice unit in accordance with the generated statistical spectral envelopes, thereby synthesizing a voice signal based on the concatenated voice units having the modified frequency spectral envelopes.
    Type: Grant
    Filed: December 27, 2018
    Date of Patent: March 29, 2022
    Assignee: YAMAHA CORPORATION
    Inventors: Yuji Hisaminato, Ryunosuke Daido, Keijiro Saino, Jordi Bonada, Merlijn Blaauw
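One way to picture the envelope-modification step: blend each unit's spectral envelope toward the statistically generated envelope and rescale the unit's spectrum accordingly. The blend weight and the toy spectra below are assumptions, not the patented procedure.

```python
import numpy as np


def modify_unit_spectrum(unit_spectrum, unit_envelope, statistical_envelope,
                         blend=0.7):
    """Move a voice unit's spectral envelope toward the statistical envelope.

    All inputs are magnitude values per frequency bin; blend=1.0 would impose
    the statistical envelope completely.
    """
    target_env = (1.0 - blend) * unit_envelope + blend * statistical_envelope
    return unit_spectrum * target_env / np.maximum(unit_envelope, 1e-9)


if __name__ == "__main__":
    bins = np.arange(1, 257, dtype=float)
    unit_spec = 1.0 / bins                 # toy unit spectrum
    unit_env = 1.0 / bins                  # its (smoothed) envelope
    stat_env = 1.0 / np.sqrt(bins)         # envelope from the statistical model
    print(modify_unit_spectrum(unit_spec, unit_env, stat_env)[:3])
```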
  • Patent number: 11289069
    Abstract: A statistical parameter modeling method is performed by a server. After obtaining model training data, the model training data including a text feature sequence and a corresponding original speech sample sequence, the server inputs an original vector matrix, formed by matching each text feature sample point in the text feature sequence with a speech sample point in the original speech sample sequence, into a statistical parameter model for training, and then performs a non-linear mapping calculation on the original vector matrix in a hidden layer to output a corresponding prediction speech sample point. The server then obtains a model parameter of the statistical parameter model from the prediction speech sample points and the corresponding original speech sample points by using a smallest-difference principle, yielding a corresponding target statistical parameter model.
    Type: Grant
    Filed: March 26, 2019
    Date of Patent: March 29, 2022
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Wei Li, Hangyu Yan, Ke Li, Yongjian Wu, Feiyue Huang
  • Patent number: 11288458
    Abstract: The present application relates to natural language processing and discloses a sequence conversion method. The method includes: obtaining a source sequence from an input signal; converting the source sequence into one or more source context vectors; obtaining a target context vector corresponding to each source context vector; combining the target context vectors to obtain the target sequence; and outputting the target sequence. A weight vector is applied to a source context vector and a reference context vector to obtain a target context vector, wherein the weight of one or more elements associated with notional words in the source context vector, or the weight of a function word in the target context vector, is increased. The source sequence and the target sequence are representations of natural language content. The claimed process improves the faithfulness of converting the source sequence to the target sequence.
    Type: Grant
    Filed: February 15, 2019
    Date of Patent: March 29, 2022
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Zhaopeng Tu, Xiaohua Liu, Zhengdong Lu, Hang Li
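The weighting step can be illustrated with plain vectors: interpolate between the source and reference context vectors, giving a larger weight to positions associated with notional (content) words. The vectors, mask, and weight values are invented for illustration.

```python
import numpy as np


def weighted_target_context(source_ctx, reference_ctx, notional_mask,
                            base_weight=0.5, notional_weight=0.8):
    """Interpolate source and reference context vectors element-wise,
    giving notional-word positions of the source a larger weight."""
    w = np.where(notional_mask, notional_weight, base_weight)
    return w * source_ctx + (1.0 - w) * reference_ctx


if __name__ == "__main__":
    source_ctx = np.array([0.2, 0.9, 0.1, 0.8])
    reference_ctx = np.array([0.5, 0.5, 0.5, 0.5])
    notional_mask = np.array([False, True, False, True])  # content-word positions
    print(weighted_target_context(source_ctx, reference_ctx, notional_mask))
```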
  • Patent number: 11282537
    Abstract: Active speaker detection can include receiving speaker detection signals from a plurality of devices participating in an electronic meeting. Each speaker detection signal specifies a score indicating whether an active speaker is detected by the respective device that generates the signal. Active speaker detection can further include determining, using a processor, which device of the plurality detects an active speaker based upon the speaker detection signals and, in response to the determining, providing video received from the determined device to the plurality of devices during the electronic meeting.
    Type: Grant
    Filed: June 9, 2017
    Date of Patent: March 22, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Al Chakra, Jonathan Dunne, James P. Galvin, Jr., Liam Harpur
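A minimal sketch of the selection-and-routing step: choose the device reporting the highest speaker-detection score and provide its video to every participant. The score format and the send_video callback are assumptions.

```python
def select_active_speaker(speaker_detection_signals):
    """Pick the device whose speaker-detection score is highest.

    speaker_detection_signals: mapping of device id -> score, where each
    score indicates how strongly that device detects an active speaker.
    """
    device, score = max(speaker_detection_signals.items(), key=lambda kv: kv[1])
    return device if score > 0 else None


def route_video(signals, all_devices, send_video):
    """Provide video from the selected device to every meeting participant."""
    active = select_active_speaker(signals)
    if active is not None:
        for device in all_devices:
            send_video(source=active, destination=device)
    return active


if __name__ == "__main__":
    signals = {"laptop-1": 0.1, "room-system": 0.9, "phone-2": 0.0}
    print(route_video(signals, list(signals),
                      send_video=lambda source, destination: None))
```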
  • Patent number: 11276390
    Abstract: An audio interval detection apparatus has a processor and a storage storing instructions that, when executed by the processor, control the processor to: detect, from a target audio signal, a specified audio interval including a specified audio signal representing a state in which a phoneme of the same consonant is produced continuously over a period longer than a specified time; and, by eliminating at least the detected specified audio interval from the target audio signal, detect from the target audio signal an utterance audio interval that includes a speech utterance signal representing a speech utterance uttered by a speaker.
    Type: Grant
    Filed: March 13, 2019
    Date of Patent: March 15, 2022
    Assignee: CASIO COMPUTER CO., LTD.
    Inventor: Hiroki Tomita
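A simplified rendering of the two-stage detection: mark over-long runs of the same consonant label, then collect the remaining energetic frames as utterance intervals. The per-frame labels, toy consonant set, and thresholds are assumptions standing in for the patent's signal analysis.

```python
def detect_utterance_intervals(frame_labels, energies,
                               consonants=frozenset("sfzx"),
                               min_exclude_frames=30, energy_threshold=0.01):
    """Two-stage detection on per-frame labels and energies.

    frame_labels: per-frame phoneme-class labels (assumed to come from some
    upstream classifier); energies: per-frame signal energies.
    Returns (start, end) frame-index pairs of utterance intervals.
    """
    # 1. Exclude frames inside consonant runs at least min_exclude_frames long,
    #    e.g. noise that sounds like a sustained 's'.
    excluded = [False] * len(frame_labels)
    start = 0
    for i in range(1, len(frame_labels) + 1):
        if i == len(frame_labels) or frame_labels[i] != frame_labels[start]:
            if frame_labels[start] in consonants and i - start >= min_exclude_frames:
                for j in range(start, i):
                    excluded[j] = True
            start = i

    # 2. Collect contiguous, non-excluded frames with enough energy.
    intervals, begin = [], None
    for i, (skip, energy) in enumerate(zip(excluded, energies)):
        is_speech = (not skip) and energy >= energy_threshold
        if is_speech and begin is None:
            begin = i
        elif not is_speech and begin is not None:
            intervals.append((begin, i))
            begin = None
    if begin is not None:
        intervals.append((begin, len(frame_labels)))
    return intervals


if __name__ == "__main__":
    labels = ["s"] * 40 + ["a", "t", "a", "k"] * 10   # long 's' run, then speech
    energy = [0.02] * 40 + [0.05] * 40
    print(detect_utterance_intervals(labels, energy))  # -> [(40, 80)]
```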
  • Patent number: 11264042
    Abstract: An apparatus for generating an enhanced signal from an input signal, wherein the enhanced signal has spectral values for an enhancement spectral region, the spectral values for the enhancement spectral regions not being contained in the input signal, includes a mapper for mapping a source spectral region of the input signal to a target region in the enhancement spectral region, the source spectral region including a noise-filling region; and a noise filler configured for generating first noise values for the noise-filling region in the source spectral region of the input signal and for generating second noise values for a noise region in the target region, wherein the second noise values are decorrelated from the first noise values, or for generating second noise values for a noise region in the target region, wherein the second noise values are decorrelated from first noise values in the source region.
    Type: Grant
    Filed: November 21, 2019
    Date of Patent: March 1, 2022
    Inventors: Sascha Disch, Ralf Geiger, Andreas Niedermeier, Matthias Neusinger, Konstantin Schmidt, Stephan Wilde, Benjamin Schubert, Christian Neukam
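The noise-filling idea can be caricatured with a magnitude spectrum: copy a source region into the enhancement region, then overwrite the designated noise bins of source and target with independently drawn (hence decorrelated) noise. Region boundaries, bin indices, and the noise level are illustrative assumptions.

```python
import numpy as np


def noise_fill_and_extend(spectrum, source_start, source_end, target_start,
                          noise_bins, noise_level=0.05, seed=0):
    """Copy a source spectral region into the enhancement (target) region and
    generate noise for its noise bins, decorrelated from the source noise."""
    rng = np.random.default_rng(seed)
    width = source_end - source_start
    extended = np.concatenate([spectrum, np.zeros(width)])
    bins = np.asarray(noise_bins)

    # First noise values: fill the designated noise bins of the source region.
    extended[source_start + bins] = noise_level * rng.standard_normal(len(bins))

    # Map the (noise-filled) source region onto the target region.
    extended[target_start:target_start + width] = extended[source_start:source_end]

    # Second noise values: drawn independently, hence decorrelated from the
    # first noise values; they overwrite the noise bins of the target region.
    extended[target_start + bins] = noise_level * rng.standard_normal(len(bins))
    return extended


if __name__ == "__main__":
    spec = np.abs(np.random.default_rng(1).standard_normal(64))
    out = noise_fill_and_extend(spec, source_start=16, source_end=32,
                                target_start=64, noise_bins=[2, 5, 9])
    print(len(out))   # 80 bins: original 64 plus a 16-bin enhancement region
```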
  • Patent number: 11244692
    Abstract: To convey information using an audio channel, an audio signal is modulated to produce a modulated signal by embedding additional information into the audio signal. Modulating the audio signal includes: processing the audio signal to produce a set of filter responses; creating a delayed version of the filter responses; modifying the delayed version of the filter responses based on the additional information to produce an echo audio signal; and combining the audio signal and the echo audio signal to produce the modulated signal. Modulating the audio signal may involve employing a modulation strength, and a psychoacoustic model may be used to modify the modulation strength based on a comparison of the distortion of the modified audio signal relative to the audio signal against a target distortion.
    Type: Grant
    Filed: October 4, 2018
    Date of Patent: February 8, 2022
    Assignee: Digital Voice Systems, Inc.
    Inventor: Daniel W. Griffin
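The echo-based embedding can be caricatured by adding a faint, delayed copy of each frame whose delay encodes one data bit; the frame length, delays, and echo strength are arbitrary illustration values, and the patent's filter-bank and psychoacoustic machinery are omitted.

```python
import numpy as np


def embed_bits(audio, bits, frame_len=1024, delays=(8, 16), strength=0.05):
    """Embed one bit per frame by adding an echo at delays[bit] samples."""
    out = audio.copy()
    for i, bit in enumerate(bits):
        start, end = i * frame_len, (i + 1) * frame_len
        if end > len(audio):
            break
        frame = audio[start:end]
        echo = np.zeros_like(frame)
        d = delays[bit]
        echo[d:] = frame[:-d]                   # delayed copy of the frame
        out[start:end] = frame + strength * echo
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    audio = rng.standard_normal(4096)
    marked = embed_bits(audio, bits=[1, 0, 1, 1])
    print(np.max(np.abs(marked - audio)))
```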
  • Patent number: 11244694
    Abstract: A method is described that processes an audio signal. A discontinuity between a filtered past frame and a filtered current frame of the audio signal is removed using linear predictive filtering.
    Type: Grant
    Filed: January 23, 2017
    Date of Patent: February 8, 2022
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Emmanuel Ravelli, Manuel Jander, Grzegorz Pietrzyk, Martin Dietz, Marc Gayer
  • Patent number: 11238873
    Abstract: An apparatus for level estimation of an encoded audio signal is provided. The apparatus has a codebook determinator for determining a codebook from a plurality of codebooks as an identified codebook. The audio signal has been encoded by employing the identified codebook. Moreover, the apparatus has an estimation unit configured for deriving a level value associated with the identified codebook as a derived level value and for estimating a level estimate of the audio signal using the derived level value.
    Type: Grant
    Filed: April 4, 2013
    Date of Patent: February 1, 2022
    Assignee: FRAUNHOFER-GESELLSCHAFT ZUR FOERDERUNG DER ANGEWANDTEN FORSCHUNG E.V.
    Inventors: Ralf Geiger, Markus Schnell, Manfred Lutzky, Marco Diatschuk
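The level-estimation idea reduces to a lookup: identify which codebook encoded the frame and read off the level value associated with that codebook, without fully decoding the audio. The codebook names and level values below are invented placeholders.

```python
# Hypothetical per-codebook level values (e.g., average level in dB) that
# would normally be derived offline for each codebook.
CODEBOOK_LEVELS = {"codebook_low": -26.0, "codebook_mid": -20.0,
                   "codebook_high": -16.0}


def estimate_level(identified_codebook, scale_offset_db=0.0):
    """Estimate the level of an encoded frame from its identified codebook."""
    derived_level = CODEBOOK_LEVELS[identified_codebook]
    return derived_level + scale_offset_db


if __name__ == "__main__":
    # The codebook determinator would identify the codebook from the bitstream.
    print(estimate_level("codebook_mid"))   # -> -20.0 dB (placeholder value)
```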
  • Patent number: 11232810
    Abstract: A non-transitory computer-readable recording medium stores a program that causes a computer to execute a process for evaluating a voice. The process includes: analyzing a voice signal to detect a pitch frequency; selecting an evaluation target region to be evaluated in the detected pitch frequency based on a distribution of a detection rate of the detected pitch frequency; and evaluating the voice based on the distribution of the detection rate and the selected evaluation target region. An impression of the voice is evaluated using a correction to the distribution and is determined to be good when the spread of the corrected distribution is larger than or equal to a certain threshold.
    Type: Grant
    Filed: March 15, 2019
    Date of Patent: January 25, 2022
    Assignee: FUJITSU LIMITED
    Inventors: Sayuri Nakayama, Taro Togawa, Takeshi Otani
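The evaluation boils down to measuring how widely the detected pitch values spread once low-detection-rate regions are excluded; the bin width, detection-rate cut-off, and spread threshold here are assumptions.

```python
import numpy as np


def evaluate_voice(pitch_hz, detected_mask, bin_width=10.0,
                   min_rate=0.2, spread_threshold_hz=40.0):
    """Evaluate a voice from its detected pitch values.

    pitch_hz: per-frame pitch estimates; detected_mask: True where a pitch
    was actually detected. Bins with too low a detection rate are dropped
    (the 'evaluation target region'), and the impression is 'good' when the
    spread of the remaining distribution reaches the threshold.
    """
    detected = pitch_hz[detected_mask]
    if detected.size == 0:
        return "no pitch detected"
    edges = np.arange(detected.min(), detected.max() + bin_width, bin_width)
    counts, edges = np.histogram(detected, bins=edges)
    rates = counts / counts.sum()
    kept = edges[:-1][rates >= min_rate * rates.max()]
    spread = kept.max() - kept.min() if kept.size else 0.0
    return "good" if spread >= spread_threshold_hz else "flat"


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    pitch = rng.normal(180, 30, 500)          # lively pitch variation
    mask = rng.random(500) > 0.1              # 90% detection rate
    print(evaluate_voice(pitch, mask))
```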