Patents Examined by Farzad Kazeminezhad
  • Patent number: 11393454
    Abstract: A dialog generator receives data corresponding to desired dialog, such as application programming interface (API) information and sample dialog. A first model corresponding to an agent simulator and a second model corresponding to a user simulator take turns creating a plurality of dialog outlines of the desired dialog. The dialog generator may determine that one or more additional APIs are relevant to the dialog and may create further dialog outlines related thereto. The dialog outlines are converted to natural dialog to generate the dialog.
    Type: Grant
    Filed: December 13, 2018
    Date of Patent: July 19, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Anish Acharya, Angeliki Metallinou, Tagyoung Chung, Shachi Paul, Shubhra Chandra, Chien-wei Lin, Dilek Hakkani-Tur, Arindam Mandal
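As a rough picture of the outline-generation loop described in this abstract, the sketch below alternates a toy agent simulator and user simulator until the slots of a hypothetical API are filled; the API_INFO structure, dialog acts, and simulator heuristics are all invented for illustration, and the final outline-to-natural-dialog step is omitted.

```python
# Hypothetical API metadata of the kind the dialog generator might receive.
API_INFO = {"name": "BookFlight", "slots": ["origin", "destination", "date"]}


def user_turn(state):
    """User simulator: supply a value for the next unfilled slot."""
    for slot in API_INFO["slots"]:
        if slot not in state:
            return {"speaker": "user", "act": "inform", "slot": slot}
    return {"speaker": "user", "act": "confirm"}


def agent_turn(state):
    """Agent simulator: request missing slots, then issue the API call."""
    missing = [s for s in API_INFO["slots"] if s not in state]
    if missing:
        return {"speaker": "agent", "act": "request", "slot": missing[0]}
    return {"speaker": "agent", "act": "api_call", "api": API_INFO["name"]}


def generate_outline(max_turns=10):
    """The two simulators take turns to produce one dialog outline."""
    state, outline = {}, []
    for _ in range(max_turns):
        agent = agent_turn(state)
        outline.append(agent)
        if agent["act"] == "api_call":
            break                       # goal reached; outline is complete
        user = user_turn(state)
        outline.append(user)
        if user["act"] == "inform":
            state[user["slot"]] = f"<{user['slot']}>"  # placeholder value
    return outline                      # later converted to natural dialog


if __name__ == "__main__":
    for turn in generate_outline():
        print(turn)
```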
  • Patent number: 11386906
    Abstract: There is provided an error concealment unit, method, and computer program, for providing an error concealment audio information for concealing a loss of an audio frame in an encoded audio information. In one embodiment, the error concealment unit provides an error concealment audio information for a lost audio frame on the basis of a properly decoded audio frame preceding the lost audio frame. The error concealment unit derives a damping factor on the basis of characteristics of a decoded representation of the properly decoded audio frame preceding the lost audio frame. The error concealment unit performs a fade out using the damping factor.
    Type: Grant
    Filed: August 28, 2020
    Date of Patent: July 12, 2022
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Jérémie Lecomte, Adrian Tomasek
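A minimal sketch of the damping-and-fade-out idea, assuming an RMS-energy heuristic for deriving the damping factor (the patent derives it from richer characteristics of the decoded frame):

```python
import numpy as np


def conceal_lost_frame(last_good_frame, base_damping=0.8):
    """Build a concealment frame from the last properly decoded frame.

    The damping factor is derived here from a single characteristic of that
    frame (its RMS energy): quieter frames fade out faster in this toy rule.
    """
    rms = np.sqrt(np.mean(last_good_frame ** 2) + 1e-12)
    damping = base_damping * min(1.0, rms / 0.1)
    # Fade the repeated frame from full level down to the damping factor.
    fade = np.linspace(1.0, damping, num=len(last_good_frame))
    return last_good_frame * fade


if __name__ == "__main__":
    frame = 0.2 * np.sin(2 * np.pi * 440 * np.arange(480) / 16000)
    print(conceal_lost_frame(frame)[:5])
```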
  • Patent number: 11380315
    Abstract: One embodiment of the present invention sets forth a technique for analyzing a transcription of a recording. The technique includes generating features representing transcriptions produced by multiple automatic speech recognition (ASR) engines from voice activity in the recording and a best transcription of the recording produced by an ensemble model from the transcriptions. The technique also includes applying a machine learning model to the features to produce a score representing an accuracy of the best transcription. The technique further includes storing the score in association with the best transcription.
    Type: Grant
    Filed: March 9, 2019
    Date of Patent: July 5, 2022
    Assignee: CISCO TECHNOLOGY, INC.
    Inventors: Ahmad Abdulkader, Mohamed Gamal Mohamed Mahmoud
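To make the scoring step concrete, the toy code below derives agreement features from the per-engine transcriptions and the ensemble's best transcription and maps them to a 0-1 accuracy score; the feature set and the linear weights stand in for the trained machine learning model and are not from the patent.

```python
from difflib import SequenceMatcher


def features(engine_transcripts, best_transcript):
    """Simple agreement features between each ASR output and the best one."""
    sims = [SequenceMatcher(None, t, best_transcript).ratio()
            for t in engine_transcripts]
    return {
        "mean_agreement": sum(sims) / len(sims),
        "min_agreement": min(sims),
        "length": len(best_transcript.split()),
    }


def score(feats, weights=(0.7, 0.25, 0.005), bias=0.0):
    """Placeholder linear scorer standing in for the trained ML model."""
    raw = (weights[0] * feats["mean_agreement"]
           + weights[1] * feats["min_agreement"]
           + weights[2] * feats["length"] + bias)
    return max(0.0, min(1.0, raw))


if __name__ == "__main__":
    engines = ["please call the meeting to order",
               "please call the meeting to order",
               "police call the meeting to order"]
    best = "please call the meeting to order"
    print(score(features(engines, best)))
```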
  • Patent number: 11373646
    Abstract: A household appliance control method, device and system, and an intelligent air conditioner are provided. The method includes: determining a sound source location by means of a camera; picking up voice information of a user according to the sound source location; and performing at least one control operation on a household appliance according to the voice information. The voice information matches at least one corresponding control instruction. With this method, the location of at least one user can be detected by the camera, and the audio signal picked up at that user's location can thereby be enhanced.
    Type: Grant
    Filed: February 14, 2017
    Date of Patent: June 28, 2022
    Assignee: GREE ELECTRIC APPLIANCES, INC. OF ZHUHAI
    Inventors: Guangyou Liu, Wencheng Zheng, Yuehui Mao, Zi Wang, Yao Chen, Bo Liang
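As an illustration of camera-guided pickup, the sketch below steers a delay-and-sum beamformer toward the direction reported by the camera; the linear-array geometry, integer-sample delays, and parameter values are simplifying assumptions, not the patented method.

```python
import numpy as np


def steer_and_sum(mic_signals, angle_deg, mic_spacing=0.05, fs=16000, c=343.0):
    """Delay-and-sum beamforming toward a camera-derived direction.

    mic_signals: array of shape (num_mics, num_samples) from a linear array.
    angle_deg: direction of the user reported by the camera (0 = broadside).
    Integer-sample delays via np.roll keep the sketch short; a real system
    would use fractional delays.
    """
    num_mics = mic_signals.shape[0]
    delays = (np.arange(num_mics) * mic_spacing
              * np.sin(np.deg2rad(angle_deg)) / c)           # seconds
    shifts = np.round(delays * fs).astype(int)               # samples
    aligned = [np.roll(sig, -shift) for sig, shift in zip(mic_signals, shifts)]
    return np.mean(aligned, axis=0)


if __name__ == "__main__":
    t = np.arange(1600) / 16000
    mics = np.stack([np.sin(2 * np.pi * 300 * t) for _ in range(4)])
    enhanced = steer_and_sum(mics, angle_deg=30)
    print(enhanced.shape)
```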
  • Patent number: 11355106
    Abstract: An information processing apparatus includes circuitry configured to acquire audio information to be used for operating a target apparatus, recognize the audio information, obtain specific instruction information indicating specific information processing to be instructed to the target apparatus based on a recognition result of the acquired audio information, convert the specific instruction information into specific operation execution information described in an information format interpretable by the target apparatus, and output the specific operation execution information to the target apparatus.
    Type: Grant
    Filed: March 13, 2019
    Date of Patent: June 7, 2022
    Assignee: RICOH COMPANY, LTD.
    Inventor: Yutaka Nakamura
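The pipeline from recognized audio to an apparatus-interpretable operation can be sketched as a short chain of conversions; the intent keywords and the JSON output format below are illustrative assumptions.

```python
import json

# Hypothetical mapping from recognized phrases to instruction information.
INTENTS = {
    "print": {"action": "PRINT", "params": {"copies": 1}},
    "scan":  {"action": "SCAN",  "params": {"color": True}},
}


def recognize(audio_text):
    """Stand-in for the speech recognizer: here the 'audio' is already text."""
    return audio_text.lower()


def to_instruction(text):
    """Obtain specific instruction information from the recognition result."""
    for keyword, instruction in INTENTS.items():
        if keyword in text:
            return instruction
    return None


def to_operation_execution_info(instruction):
    """Convert the instruction into a format the target apparatus can parse
    (JSON is an illustrative choice, not mandated by the patent)."""
    return json.dumps({"command": instruction["action"],
                       "arguments": instruction["params"]})


if __name__ == "__main__":
    text = recognize("Please print this document")
    op = to_operation_execution_info(to_instruction(text))
    print(op)  # output sent to the target apparatus
```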
  • Patent number: 11334719
    Abstract: Systems, methods and computer program products are presented for a named entity recognition (NER) engine. The NER Engine initiates extraction of named entities from a document and identifies one or more required parameters that correspond to a document outline type classification of the document. The NER Engine applies a named entity recognition model to the extracted named entities to predict respective mappings between the extracted named entities and the one or more required parameters. The mapping depends on a Previous Number of Words model, based on a predefined number of words that appear before a named entity; a model based on the named entity being included in a document sentence; and a model that depends on the position of the named entity in the document. The NER Engine generates a user interface for display of the predicted respective mappings.
    Type: Grant
    Filed: November 9, 2020
    Date of Patent: May 17, 2022
    Assignee: The Abstract Operations Company
    Inventor: Bhavesh Kakadiya
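A toy rendering of how the three model signals named in the abstract might be combined to map extracted entities to required parameters; the cue lists, stub scorers, and combination weights are invented for illustration.

```python
def previous_words_score(entity, parameter, text, n=3):
    """Score from the words immediately preceding the entity (toy heuristic)."""
    idx = text.find(entity["text"])
    window = text[:idx].split()[-n:]
    cues = {"invoice_date": {"dated", "date"}, "vendor": {"from", "by"}}
    return 1.0 if cues.get(parameter, set()) & {w.lower() for w in window} else 0.0


def sentence_score(entity, parameter):
    """Score based on the sentence containing the entity (stubbed)."""
    return 0.5  # placeholder


def position_score(entity, parameter, doc_length):
    """Entities near the top of the document score higher (toy heuristic)."""
    return 1.0 - entity["offset"] / max(doc_length, 1)


def map_entities(entities, required_parameters, text):
    """Predict entity -> required-parameter mappings by combining the models."""
    mapping = {}
    for param in required_parameters:
        scored = [(0.5 * previous_words_score(e, param, text)
                   + 0.3 * sentence_score(e, param)
                   + 0.2 * position_score(e, param, len(text)), e)
                  for e in entities]
        mapping[param] = max(scored, key=lambda pair: pair[0])[1]["text"]
    return mapping


if __name__ == "__main__":
    text = "Invoice from Acme Corp, dated 2020-11-09."
    entities = [{"text": "Acme Corp", "offset": 13},
                {"text": "2020-11-09", "offset": 30}]
    print(map_entities(entities, ["vendor", "invoice_date"], text))
```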
  • Patent number: 11322172
    Abstract: Computer-generated feedback on whether user speech input meets subjective criteria is provided through the evaluation of multiple speaking traits. Initially, discrete instances of various speaking traits are detected within the user speech input; such traits include vocal fry, tag questions, uptalk, filler sounds and hedge words. Audio constructs indicative of individual instances of speaking traits are isolated and identified from appropriate samples. Speaking trait detectors then utilize such audio constructs to identify individual instances of speaking traits within the spoken input. The resulting quantities are scored against predetermined threshold quantities. The individual speaking trait scores are then amalgamated using a weighting derived from empirical relationships between those speaking traits and the criteria against which the user's speech input is being evaluated.
    Type: Grant
    Filed: June 1, 2017
    Date of Patent: May 3, 2022
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Oscar Roberto Morales Garrido, Paul Thackray, Kristen Kennedy
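The counting, thresholding, and weighted-amalgamation steps can be illustrated on a transcript (the patent detects the traits from audio constructs); the lexicons, thresholds, and weights below are placeholders.

```python
# Illustrative lexicons and thresholds; the patent works on audio constructs,
# whereas this sketch simply counts phrases in a transcript.
FILLERS = {"um", "uh", "like", "you know"}
HEDGES = {"maybe", "sort of", "kind of", "i think", "probably"}
THRESHOLDS = {"filler": 3, "hedge": 2}      # counts above this score poorly
WEIGHTS = {"filler": 0.6, "hedge": 0.4}     # empirical weighting (placeholder)


def count_instances(transcript, phrases):
    text = " " + transcript.lower() + " "
    return sum(text.count(" " + p + " ") for p in phrases)


def trait_score(count, threshold):
    """1.0 when at or under the threshold, decaying toward 0 above it."""
    return 1.0 if count <= threshold else threshold / count


def speech_feedback_score(transcript):
    counts = {"filler": count_instances(transcript, FILLERS),
              "hedge": count_instances(transcript, HEDGES)}
    return sum(WEIGHTS[t] * trait_score(c, THRESHOLDS[t])
               for t, c in counts.items())


if __name__ == "__main__":
    sample = "Um I think the results are kind of promising you know"
    print(round(speech_feedback_score(sample), 3))
```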
  • Patent number: 11315556
    Abstract: Systems and methods for distributed voice processing are disclosed herein. In one example, the method includes detecting sound via a microphone array of a first playback device and analyzing, via a first wake-word engine of the first playback device, the detected sound. The first playback device may transmit data associated with the detected sound to a second playback device over a local area network. A second wake-word engine of the second playback device may analyze the transmitted data associated with the detected sound. The method may further include identifying that the detected sound contains either a first wake word or a second wake word based on the analysis via the first and second wake-word engines, respectively. Based on the identification, sound data corresponding to the detected sound may be transmitted over a wide area network to a remote computing device associated with a particular voice assistant service.
    Type: Grant
    Filed: February 8, 2019
    Date of Patent: April 26, 2022
    Assignee: Sonos, Inc.
    Inventors: Connor Kristopher Smith, John Tolomei, Betty Lee
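A schematic of the routing logic: one engine per wake word, and whichever fires determines the voice assistant service that receives the sound data. The wake words, service names, and text-based "detector" are stand-ins for the audio processing described in the patent.

```python
# Hypothetical wake words and the voice services they route to.
WAKE_WORD_SERVICES = {"hey sonos": "VAS_A", "alexa": "VAS_B"}


def wake_word_engine(sound_text, wake_word):
    """Stand-in detector: real engines analyze audio, not text."""
    return wake_word in sound_text.lower()


def handle_detected_sound(sound_text):
    """First device checks one wake word locally; the second device checks the
    other over the LAN (simulated here as a plain function call)."""
    first_device_word, second_device_word = "hey sonos", "alexa"

    if wake_word_engine(sound_text, first_device_word):
        detected = first_device_word
    elif wake_word_engine(sound_text, second_device_word):   # "second device"
        detected = second_device_word
    else:
        return None

    service = WAKE_WORD_SERVICES[detected]
    # In the patent, the sound data would now be sent over a WAN to the
    # remote computing device associated with this voice assistant service.
    return service


if __name__ == "__main__":
    print(handle_detected_sound("Alexa, what's the weather?"))  # -> VAS_B
```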
  • Patent number: 11308949
    Abstract: A method, computer system, and a computer program product for voice assistant responses is provided. The present invention may include configuring a behavioral rule. The present invention may include receiving a verbal request from a user and comparing the received verbal request to the behavioral rule. The present invention may then include determining that the received verbal request does not comply with the behavioral rule. The present invention may lastly include providing a response to the user based on determining that the received verbal request does not comply with the behavioral rule.
    Type: Grant
    Filed: March 12, 2019
    Date of Patent: April 19, 2022
    Assignee: International Business Machines Corporation
    Inventors: Lisa Seacat DeLuca, Jeremy R. Fox, Kelley Anders
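The rule-checking flow reduces to a few lines; the politeness rule requiring "please" below is only an example of a configurable behavioral rule, and the canned responses are invented.

```python
def complies(request, rule):
    """Check a verbal request (already transcribed) against a behavioral rule."""
    return rule["required_phrase"] in request.lower()


def respond(request, rule):
    if complies(request, rule):
        return "OK, turning on the lights."          # normal fulfillment
    return rule["reminder"]                          # corrective response


if __name__ == "__main__":
    politeness_rule = {"required_phrase": "please",
                       "reminder": "Remember to say please when you ask."}
    print(respond("Turn on the lights", politeness_rule))
    print(respond("Please turn on the lights", politeness_rule))
```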
  • Patent number: 11302326
    Abstract: Example techniques involve systems with multiple acoustic echo cancellers. An example implementation captures first audio within an acoustic environment and detects, within the captured first audio, a wake-word. In response to the wake-word and before playing an acknowledgement tone, the implementation activates (a) a first sound canceller when one or more speakers are playing back audio content or (b) a second sound canceller when the one or more speakers are idle. In response to the wake-word and after activating either (a) the first sound canceller or (b) the second sound canceller, the implementation outputs the acknowledgement tone via the one or more speakers. The implementation then captures second audio within the acoustic environment and cancels the acoustic echo of the acknowledgement tone from the captured second audio using the activated sound canceller.
    Type: Grant
    Filed: April 10, 2020
    Date of Patent: April 12, 2022
    Assignee: Sonos, Inc.
    Inventor: Saeed Bagheri Sereshki
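The canceller-selection logic amounts to checking the playback state before the acknowledgement tone is emitted. A schematic sketch with stubbed canceller objects; the actual echo-cancellation filtering is omitted.

```python
class SoundCanceller:
    def __init__(self, name):
        self.name = name

    def cancel(self, captured_audio, reference):
        # Real AEC would subtract an adaptive-filter estimate of the echo.
        return f"{captured_audio} minus echo of {reference} via {self.name}"


FULL_AEC = SoundCanceller("multi-channel AEC")   # for active playback
IDLE_AEC = SoundCanceller("lightweight AEC")     # for idle speakers


def on_wake_word(speakers_playing, play_tone, capture_audio):
    # 1. Activate the appropriate canceller *before* playing the ack tone.
    canceller = FULL_AEC if speakers_playing else IDLE_AEC
    # 2. Output the acknowledgement tone.
    play_tone("ack")
    # 3. Capture the follow-up audio and cancel the tone's echo from it.
    return canceller.cancel(capture_audio(), reference="ack")


if __name__ == "__main__":
    print(on_wake_word(speakers_playing=False,
                       play_tone=lambda tone: None,
                       capture_audio=lambda: "voice command"))
```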
  • Patent number: 11289066
    Abstract: A voice synthesis method includes: sequentially acquiring voice units each comprising at least one of a diphone or a triphone in accordance with synthesis information for synthesizing voices; generating statistical spectral envelopes using a statistical model built by machine learning in accordance with the synthesis information; and concatenating the sequentially acquired voice units and modifying the frequency spectral envelope of each voice unit in accordance with the generated statistical spectral envelopes, thereby synthesizing a voice signal based on the concatenated voice units having the modified frequency spectral envelopes.
    Type: Grant
    Filed: December 27, 2018
    Date of Patent: March 29, 2022
    Assignee: YAMAHA CORPORATION
    Inventors: Yuji Hisaminato, Ryunosuke Daido, Keijiro Saino, Jordi Bonada, Merlijn Blaauw
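One way to picture the envelope-modification step: blend each unit's spectral envelope toward the statistically generated envelope and rescale the unit's spectrum accordingly. The blend weight and the toy spectra below are assumptions, not the patented procedure.

```python
import numpy as np


def modify_unit_spectrum(unit_spectrum, unit_envelope, statistical_envelope,
                         blend=0.7):
    """Move a voice unit's spectral envelope toward the statistical envelope.

    All inputs are magnitude values per frequency bin; blend=1.0 would impose
    the statistical envelope completely.
    """
    target_env = (1.0 - blend) * unit_envelope + blend * statistical_envelope
    return unit_spectrum * target_env / np.maximum(unit_envelope, 1e-9)


if __name__ == "__main__":
    bins = np.arange(1, 257, dtype=float)
    unit_spec = 1.0 / bins                 # toy unit spectrum
    unit_env = 1.0 / bins                  # its (smoothed) envelope
    stat_env = 1.0 / np.sqrt(bins)         # envelope from the statistical model
    print(modify_unit_spectrum(unit_spec, unit_env, stat_env)[:3])
```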
  • Patent number: 11289069
    Abstract: A statistical parameter modeling method is performed by a server. After obtaining model training data, the model training data including a text feature sequence and a corresponding original speech sample sequence, the server inputs an original vector matrix, formed by matching each text feature sample point in the text feature sequence with a speech sample point in the original speech sample sequence, into a statistical parameter model for training, and then performs a non-linear mapping calculation on the original vector matrix in a hidden layer to output a corresponding prediction speech sample point. The server then obtains a model parameter of the statistical parameter model from the prediction speech sample points and the corresponding original speech sample points by using a smallest-difference principle, yielding a corresponding target statistical parameter model.
    Type: Grant
    Filed: March 26, 2019
    Date of Patent: March 29, 2022
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Wei Li, Hangyu Yan, Ke Li, Yongjian Wu, Feiyue Huang
  • Patent number: 11288458
    Abstract: The present application relates to natural language processing and discloses a sequence conversion method. The method includes: obtaining a source sequence from an input signal; converting the source sequence into one or more source context vectors; obtaining a target context vector corresponding to each source context vector; combining the target context vectors to obtain the target sequence; and outputting the target sequence. A weight vector is applied to a source context vector and a reference context vector to obtain a target context vector, wherein the weight of one or more elements associated with notional words in the source context vector, or the weight of a function word in the target context vector, is increased. The source sequence and the target sequence are representations of natural language content. The claimed process improves the faithfulness of converting the source sequence to the target sequence.
    Type: Grant
    Filed: February 15, 2019
    Date of Patent: March 29, 2022
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Zhaopeng Tu, Xiaohua Liu, Zhengdong Lu, Hang Li
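The weighting step can be illustrated with plain vectors: interpolate between the source and reference context vectors, giving a larger weight to positions associated with notional (content) words. The vectors, mask, and weight values are invented for illustration.

```python
import numpy as np


def weighted_target_context(source_ctx, reference_ctx, notional_mask,
                            base_weight=0.5, notional_weight=0.8):
    """Interpolate source and reference context vectors element-wise,
    giving notional-word positions of the source a larger weight."""
    w = np.where(notional_mask, notional_weight, base_weight)
    return w * source_ctx + (1.0 - w) * reference_ctx


if __name__ == "__main__":
    source_ctx = np.array([0.2, 0.9, 0.1, 0.8])
    reference_ctx = np.array([0.5, 0.5, 0.5, 0.5])
    notional_mask = np.array([False, True, False, True])  # content-word positions
    print(weighted_target_context(source_ctx, reference_ctx, notional_mask))
```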
  • Patent number: 11282537
    Abstract: Active speaker detection can include receiving speaker detection signals from a plurality of devices participating in an electronic meeting. Each speaker detection signal specifies a score indicating whether an active speaker is detected by the respective device that generates the signal. Active speaker detection can further include determining, using a processor, which device of the plurality detects an active speaker based upon the speaker detection signals and, in response to the determining, providing video received from the determined device to the plurality of devices during the electronic meeting.
    Type: Grant
    Filed: June 9, 2017
    Date of Patent: March 22, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Al Chakra, Jonathan Dunne, James P. Galvin, Jr., Liam Harpur
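A minimal sketch of the selection-and-routing step: choose the device reporting the highest speaker-detection score and provide its video to every participant. The score format and the send_video callback are assumptions.

```python
def select_active_speaker(speaker_detection_signals):
    """Pick the device whose speaker-detection score is highest.

    speaker_detection_signals: mapping of device id -> score, where each
    score indicates how strongly that device detects an active speaker.
    """
    device, score = max(speaker_detection_signals.items(), key=lambda kv: kv[1])
    return device if score > 0 else None


def route_video(signals, all_devices, send_video):
    """Provide video from the selected device to every meeting participant."""
    active = select_active_speaker(signals)
    if active is not None:
        for device in all_devices:
            send_video(source=active, destination=device)
    return active


if __name__ == "__main__":
    signals = {"laptop-1": 0.1, "room-system": 0.9, "phone-2": 0.0}
    print(route_video(signals, list(signals),
                      send_video=lambda source, destination: None))
```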
  • Patent number: 11276390
    Abstract: An audio interval detection apparatus has a processor and a storage storing instructions that, when executed by the processor, control the processor to: detect, from a target audio signal, a specified audio interval including a specified audio signal representing a state in which a phoneme of the same consonant is produced continuously over a period longer than a specified time; and, by eliminating at least the detected specified audio interval from the target audio signal, detect from the target audio signal an utterance audio interval that includes a speech utterance signal representing a speech utterance uttered by a speaker.
    Type: Grant
    Filed: March 13, 2019
    Date of Patent: March 15, 2022
    Assignee: CASIO COMPUTER CO., LTD.
    Inventor: Hiroki Tomita
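A simplified rendering of the two-stage detection: mark over-long runs of the same consonant label, then collect the remaining energetic frames as utterance intervals. The per-frame labels, toy consonant set, and thresholds are assumptions standing in for the patent's signal analysis.

```python
def detect_utterance_intervals(frame_labels, energies,
                               consonants=frozenset("sfzx"),
                               min_exclude_frames=30, energy_threshold=0.01):
    """Two-stage detection on per-frame labels and energies.

    frame_labels: per-frame phoneme-class labels (assumed to come from some
    upstream classifier); energies: per-frame signal energies.
    Returns (start, end) frame-index pairs of utterance intervals.
    """
    # 1. Exclude frames inside consonant runs at least min_exclude_frames long,
    #    e.g. noise that sounds like a sustained 's'.
    excluded = [False] * len(frame_labels)
    start = 0
    for i in range(1, len(frame_labels) + 1):
        if i == len(frame_labels) or frame_labels[i] != frame_labels[start]:
            if frame_labels[start] in consonants and i - start >= min_exclude_frames:
                for j in range(start, i):
                    excluded[j] = True
            start = i

    # 2. Collect contiguous, non-excluded frames with enough energy.
    intervals, begin = [], None
    for i, (skip, energy) in enumerate(zip(excluded, energies)):
        is_speech = (not skip) and energy >= energy_threshold
        if is_speech and begin is None:
            begin = i
        elif not is_speech and begin is not None:
            intervals.append((begin, i))
            begin = None
    if begin is not None:
        intervals.append((begin, len(frame_labels)))
    return intervals


if __name__ == "__main__":
    labels = ["s"] * 40 + ["a", "t", "a", "k"] * 10   # long 's' run, then speech
    energy = [0.02] * 40 + [0.05] * 40
    print(detect_utterance_intervals(labels, energy))  # -> [(40, 80)]
```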
  • Patent number: 11264042
    Abstract: An apparatus for generating an enhanced signal from an input signal, wherein the enhanced signal has spectral values for an enhancement spectral region, the spectral values for the enhancement spectral regions not being contained in the input signal, includes a mapper for mapping a source spectral region of the input signal to a target region in the enhancement spectral region, the source spectral region including a noise-filling region; and a noise filler configured for generating first noise values for the noise-filling region in the source spectral region of the input signal and for generating second noise values for a noise region in the target region, wherein the second noise values are decorrelated from the first noise values, or for generating second noise values for a noise region in the target region, wherein the second noise values are decorrelated from first noise values in the source region.
    Type: Grant
    Filed: November 21, 2019
    Date of Patent: March 1, 2022
    Inventors: Sascha Disch, Ralf Geiger, Andreas Niedermeier, Matthias Neusinger, Konstantin Schmidt, Stephan Wilde, Benjamin Schubert, Christian Neukam
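The noise-filling idea can be caricatured with a magnitude spectrum: copy a source region into the enhancement region, then overwrite the designated noise bins of source and target with independently drawn (hence decorrelated) noise. Region boundaries, bin indices, and the noise level are illustrative assumptions.

```python
import numpy as np


def noise_fill_and_extend(spectrum, source_start, source_end, target_start,
                          noise_bins, noise_level=0.05, seed=0):
    """Copy a source spectral region into the enhancement (target) region and
    generate noise for its noise bins, decorrelated from the source noise."""
    rng = np.random.default_rng(seed)
    width = source_end - source_start
    extended = np.concatenate([spectrum, np.zeros(width)])
    bins = np.asarray(noise_bins)

    # First noise values: fill the designated noise bins of the source region.
    extended[source_start + bins] = noise_level * rng.standard_normal(len(bins))

    # Map the (noise-filled) source region onto the target region.
    extended[target_start:target_start + width] = extended[source_start:source_end]

    # Second noise values: drawn independently, hence decorrelated from the
    # first noise values; they overwrite the noise bins of the target region.
    extended[target_start + bins] = noise_level * rng.standard_normal(len(bins))
    return extended


if __name__ == "__main__":
    spec = np.abs(np.random.default_rng(1).standard_normal(64))
    out = noise_fill_and_extend(spec, source_start=16, source_end=32,
                                target_start=64, noise_bins=[2, 5, 9])
    print(len(out))   # 80 bins: original 64 plus a 16-bin enhancement region
```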
  • Patent number: 11244692
    Abstract: To convey information using an audio channel, an audio signal is modulated to produce a modulated signal by embedding additional information into the audio signal. Modulating the audio signal includes: processing the audio signal to produce a set of filter responses; creating a delayed version of the filter responses; modifying the delayed version of the filter responses based on the additional information to produce an echo audio signal; and combining the audio signal and the echo audio signal to produce the modulated signal. Modulating the audio signal may involve employing a modulation strength, and a psychoacoustic model may be used to modify the modulation strength based on a comparison of the distortion of the modified audio signal relative to the audio signal against a target distortion.
    Type: Grant
    Filed: October 4, 2018
    Date of Patent: February 8, 2022
    Assignee: Digital Voice Systems, Inc.
    Inventor: Daniel W. Griffin
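The echo-based embedding can be caricatured by adding a faint, delayed copy of each frame whose delay encodes one data bit; the frame length, delays, and echo strength are arbitrary illustration values, and the patent's filter-bank and psychoacoustic machinery are omitted.

```python
import numpy as np


def embed_bits(audio, bits, frame_len=1024, delays=(8, 16), strength=0.05):
    """Embed one bit per frame by adding an echo at delays[bit] samples."""
    out = audio.copy()
    for i, bit in enumerate(bits):
        start, end = i * frame_len, (i + 1) * frame_len
        if end > len(audio):
            break
        frame = audio[start:end]
        echo = np.zeros_like(frame)
        d = delays[bit]
        echo[d:] = frame[:-d]                   # delayed copy of the frame
        out[start:end] = frame + strength * echo
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    audio = rng.standard_normal(4096)
    marked = embed_bits(audio, bits=[1, 0, 1, 1])
    print(np.max(np.abs(marked - audio)))
```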
  • Patent number: 11244694
    Abstract: A method is described that processes an audio signal. A discontinuity between a filtered past frame and a filtered current frame of the audio signal is removed using linear predictive filtering.
    Type: Grant
    Filed: January 23, 2017
    Date of Patent: February 8, 2022
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Emmanuel Ravelli, Manuel Jander, Grzegorz Pietrzyk, Martin Dietz, Marc Gayer
  • Patent number: 11238873
    Abstract: An apparatus for level estimation of an encoded audio signal is provided. The apparatus has a codebook determinator for determining a codebook from a plurality of codebooks as an identified codebook. The audio signal has been encoded by employing the identified codebook. Moreover, the apparatus has an estimation unit configured for deriving a level value associated with the identified codebook as a derived level value and for estimating a level estimate of the audio signal using the derived level value.
    Type: Grant
    Filed: April 4, 2013
    Date of Patent: February 1, 2022
    Assignee: FRAUNHOFER-GESELLSCHAFT ZUR FOERDERUNG DER ANGEWANDTEN FORSCHUNG E.V.
    Inventors: Ralf Geiger, Markus Schnell, Manfred Lutzky, Marco Diatschuk
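The level-estimation idea reduces to a lookup: identify which codebook encoded the frame and read off the level value associated with that codebook, without fully decoding the audio. The codebook names and level values below are invented placeholders.

```python
# Hypothetical per-codebook level values (e.g., average level in dB) that
# would normally be derived offline for each codebook.
CODEBOOK_LEVELS = {"codebook_low": -26.0, "codebook_mid": -20.0,
                   "codebook_high": -16.0}


def estimate_level(identified_codebook, scale_offset_db=0.0):
    """Estimate the level of an encoded frame from its identified codebook."""
    derived_level = CODEBOOK_LEVELS[identified_codebook]
    return derived_level + scale_offset_db


if __name__ == "__main__":
    # The codebook determinator would identify the codebook from the bitstream.
    print(estimate_level("codebook_mid"))   # -> -20.0 dB (placeholder value)
```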
  • Patent number: 11232810
    Abstract: A non-transitory computer-readable recording medium stores a program that causes a computer to execute a process for evaluating a voice. The process includes: analyzing a voice signal to detect a pitch frequency; selecting an evaluation target region to be evaluated in the detected pitch frequency based on a distribution of a detection rate of the detected pitch frequency; and evaluating the voice based on the distribution of the detection rate and the selected evaluation target region. An impression of the voice is evaluated using a correction to the distribution and is determined to be good when the spread of the corrected distribution is larger than or equal to a certain threshold.
    Type: Grant
    Filed: March 15, 2019
    Date of Patent: January 25, 2022
    Assignee: FUJITSU LIMITED
    Inventors: Sayuri Nakayama, Taro Togawa, Takeshi Otani
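The evaluation boils down to measuring how widely the detected pitch values spread once low-detection-rate regions are excluded; the bin width, detection-rate cut-off, and spread threshold here are assumptions.

```python
import numpy as np


def evaluate_voice(pitch_hz, detected_mask, bin_width=10.0,
                   min_rate=0.2, spread_threshold_hz=40.0):
    """Evaluate a voice from its detected pitch values.

    pitch_hz: per-frame pitch estimates; detected_mask: True where a pitch
    was actually detected. Bins with too low a detection rate are dropped
    (the 'evaluation target region'), and the impression is 'good' when the
    spread of the remaining distribution reaches the threshold.
    """
    detected = pitch_hz[detected_mask]
    if detected.size == 0:
        return "no pitch detected"
    edges = np.arange(detected.min(), detected.max() + bin_width, bin_width)
    counts, edges = np.histogram(detected, bins=edges)
    rates = counts / counts.sum()
    kept = edges[:-1][rates >= min_rate * rates.max()]
    spread = kept.max() - kept.min() if kept.size else 0.0
    return "good" if spread >= spread_threshold_hz else "flat"


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    pitch = rng.normal(180, 30, 500)          # lively pitch variation
    mask = rng.random(500) > 0.1              # 90% detection rate
    print(evaluate_voice(pitch, mask))
```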