Image To Speech Patents (Class 704/260)
  • Patent number: 11282498
    Abstract: A speech synthesis method and a speech synthesis apparatus to synthesize speeches of different emotional intensities in the field of artificial intelligence, where the method includes obtaining a target emotional type and a target emotional intensity parameter that correspond to an input text, determining a corresponding target emotional acoustic model based on the target emotional type and the target emotional intensity parameter, inputting a text feature of the input text into the target emotional acoustic model to obtain an acoustic feature of the input text, and synthesizing a target emotional speech based on the acoustic feature of the input text.
    Type: Grant
    Filed: July 31, 2020
    Date of Patent: March 22, 2022
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Liqun Deng, Yuezhi Hu, Zhanlei Yang, Wenhua Sun
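The selection step this abstract describes (mapping an emotion type and an intensity parameter to a specific acoustic model before synthesis) can be sketched roughly as a keyed lookup. The bucket boundaries, model names, and function names below are illustrative assumptions, not details from the patent:

```python
# Hypothetical sketch: pick an emotional acoustic model keyed by
# (emotion type, discretized intensity). All names are invented.

def bucket_intensity(intensity: float) -> str:
    """Map a continuous intensity parameter in [0, 1] to a discrete bucket."""
    if intensity < 0.34:
        return "low"
    if intensity < 0.67:
        return "medium"
    return "high"

# One acoustic model per (emotion, intensity-bucket) pair.
EMOTION_MODELS = {
    ("happy", "low"): "model_happy_low",
    ("happy", "high"): "model_happy_high",
    ("sad", "medium"): "model_sad_medium",
}

def select_model(emotion: str, intensity: float) -> str:
    """Return the acoustic model for the target emotion and intensity."""
    key = (emotion, bucket_intensity(intensity))
    if key not in EMOTION_MODELS:
        raise KeyError(f"no acoustic model for {key}")
    return EMOTION_MODELS[key]
```

A real system would then feed the input text's features into the selected model to obtain acoustic features for the vocoder.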
  • Patent number: 11283845
    Abstract: A system, method, and computer program product are provided for managing conference calls between a plurality of conference call systems. In operation, a conference management system monitors the plurality of conference call systems to determine whether at least one first conference system is attempting to connect to at least one second conference system. The conference management system connects the at least one first conference system with the at least one second conference system such that communication between the at least one first conference system and the at least one second conference system is managed by the conference management system. Additionally, the conference management system provides one suite of services to users of the at least one first conference system and the at least one second conference system.
    Type: Grant
    Filed: May 20, 2020
    Date of Patent: March 22, 2022
    Assignee: AMDOCS DEVELOPMENT LIMITED
    Inventors: Diego Moskovits, Golan Nuri, Ben Menashe, Aran Azarzar
  • Patent number: 11282496
    Abstract: A device may identify a plurality of sources for outputs that the device is configured to provide. The plurality of sources may include at least one of a particular application in the device, an operating system of the device, a particular area within a display of the device, or a particular graphical user interface object. The device may also assign a set of distinct voices to respective sources of the plurality of sources. The device may also receive a request for speech output. The device may also select a particular source that is associated with the requested speech output. The device may also generate speech having particular voice characteristics of a particular voice assigned to the particular source.
    Type: Grant
    Filed: June 12, 2020
    Date of Patent: March 22, 2022
    Assignee: Google LLC
    Inventors: Ioannis Agiomyrgiannakis, Fergus James Henderson
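The per-source voice assignment described above can be illustrated with a small router: each output source (an app, the OS, a screen region, a UI object) is bound to a distinct voice, and a speech request is rendered with the voice of the source it came from. The `Voice` type, voice pool, and method names are assumptions for illustration, not the patented API:

```python
# Illustrative sketch of assigning distinct voices to output sources.

from dataclasses import dataclass

@dataclass(frozen=True)
class Voice:
    name: str
    pitch: float  # relative pitch multiplier
    rate: float   # relative speaking rate

VOICE_POOL = [
    Voice("voice_a", 1.0, 1.0),
    Voice("voice_b", 1.2, 0.9),
    Voice("voice_c", 0.8, 1.1),
]

class VoiceRouter:
    def __init__(self):
        self._assignments: dict[str, Voice] = {}

    def assign(self, source: str) -> Voice:
        """Bind the next unused voice to a source; reuse an existing binding."""
        if source not in self._assignments:
            idx = len(self._assignments) % len(VOICE_POOL)
            self._assignments[source] = VOICE_POOL[idx]
        return self._assignments[source]

    def speak(self, source: str, text: str) -> str:
        v = self.assign(source)
        # A real system would call a TTS engine with v's characteristics;
        # here we just tag the text with the assigned voice.
        return f"[{v.name}] {text}"
```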
  • Patent number: 11276392
    Abstract: A method may include obtaining audio originating at a remote device during a communication session conducted between a first device and the remote device and obtaining a transcription of the audio. The method may also include processing the audio to generate processed audio. In some embodiments, the audio may be processed by a neural network that is trained with respect to an analog voice network and the processed audio may be formatted with respect to communication over the analog voice network. The method may further include processing the transcription to generate a processed transcription that is formatted with respect to communication over the analog voice network and multiplexing the processed audio with the processed transcription to obtain combined data. The method may also include communicating, to the first device during the communication session, the combined data over a same communication channel of the analog voice network.
    Type: Grant
    Filed: December 12, 2019
    Date of Patent: March 15, 2022
    Inventor: David Thomson
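The multiplexing step in this abstract, interleaving processed audio with a processed transcription over one channel, can be sketched with typed length-prefixed frames. The frame layout (1-byte type, 4-byte length, payload) is an invented assumption; the patent does not specify it:

```python
# Sketch: interleave audio chunks and transcription fragments into one
# byte stream of [type:1][len:4][payload] frames, plus the demultiplexer.

import struct
from itertools import zip_longest

AUDIO, TEXT = 0, 1

def mux(audio_chunks: list, text_chunks: list) -> bytes:
    """Interleave audio and text chunks as typed, length-prefixed frames."""
    out = bytearray()
    for a, t in zip_longest(audio_chunks, text_chunks):
        if a is not None:
            out += struct.pack(">BI", AUDIO, len(a)) + a
        if t is not None:
            out += struct.pack(">BI", TEXT, len(t)) + t
    return bytes(out)

def demux(stream: bytes):
    """Split a multiplexed stream back into audio and text chunk lists."""
    audio, text = [], []
    i = 0
    while i < len(stream):
        kind, length = struct.unpack_from(">BI", stream, i)
        i += 5
        payload = stream[i:i + length]
        i += length
        (audio if kind == AUDIO else text).append(payload)
    return audio, text
```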
  • Patent number: 11269591
    Abstract: Aspects of the present invention disclose a method for delivering an artificial intelligence-based response to a voice command to a user. The method includes one or more processors identifying an audio command received by a computing device. The method further includes determining a first engagement level of a user, wherein an engagement level corresponds to an attentiveness level of the user in relation to the computing device based at least in part on indications of activities of the user. The method further includes identifying a first set of conditions within an immediate operating environment of the computing device, wherein the first set of conditions indicate whether to deliver a voice response to the identified audio command. The method further includes determining whether to deliver the voice response to the identified audio command to the user based at least in part on the first engagement level and first set of conditions.
    Type: Grant
    Filed: June 19, 2019
    Date of Patent: March 8, 2022
    Assignee: International Business Machines Corporation
    Inventors: Shilpa Shetty, Mithun Das, Amitabha Chanda, Sarbajit K. Rakshit
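The delivery decision this abstract describes, combining an engagement level with environmental conditions, can be reduced to a toy predicate. The thresholds and condition names here are entirely invented for illustration:

```python
# Toy version of the voice-response delivery decision: deliver only if the
# user's engagement level and the operating environment allow it.

def should_deliver_voice_response(engagement: float,
                                  conditions: dict) -> bool:
    """engagement in [0, 1]; conditions flags summarize the environment."""
    if conditions.get("do_not_disturb", False):
        return False
    # With other people present, require a higher attentiveness level.
    if conditions.get("other_people_present", False) and engagement < 0.8:
        return False
    return engagement >= 0.5
```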
  • Patent number: 11263304
    Abstract: A dyschromatopsia deciding method and apparatus are provided. The apparatus includes an I/O interface configured to receive an input for a program, a memory configured to store the input for the program and a processing result of the input, and a processor configured to execute the program, wherein the processor is configured to provide first CAPTCHA information for distinguishing between a person and a machine together with second CAPTCHA information for deciding dyschromatopsia, receive first CAPTCHA input information corresponding to the first CAPTCHA information and second CAPTCHA input information corresponding to the second CAPTCHA information together with authentication information, authenticate a user based on the first CAPTCHA input information, decide dyschromatopsia of the user based on the second CAPTCHA input information, and store a decision result of the dyschromatopsia in response to a decision that the user has dyschromatopsia.
    Type: Grant
    Filed: October 8, 2019
    Date of Patent: March 1, 2022
    Assignee: NETMARBLE CORPORATION
    Inventors: Il Hwan Seo, Hye Jeung Jeung, Min Jae Jeon
  • Patent number: 11250844
    Abstract: Agents engage and disengage with users intelligently. Users can tell agents to remain engaged without requiring a wakeword. Engaged states can support modal dialogs and barge-in. Users can cause disengagement explicitly. Disengagement can be conditional based on timeout, change of user, or environmental conditions. Engagement can be one-time or recurrent. Recurrent states can be attentive or locked. Locked states can be unconditional or conditional, including being reserved to support user continuity. User continuity can be tested by matching parameters or by tracking the user with many modalities, including microphone arrays, cameras, and other sensors.
    Type: Grant
    Filed: January 26, 2018
    Date of Patent: February 15, 2022
    Assignee: SoundHound, Inc.
    Inventors: Bernard Mont-Reynaud, Scott Halstvedt, Keyvan Mohajer
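The engagement states this abstract names (one-time engaged, recurrent attentive, locked, with timeout-based disengagement) map naturally onto a small state machine. The transition names and timeout handling below are illustrative assumptions, not the patented design:

```python
# Toy state machine for agent engagement: disengaged -> engaged (one-time),
# attentive (recurrent), or locked; timeouts disengage non-locked states.

DISENGAGED, ENGAGED, ATTENTIVE, LOCKED = "disengaged", "engaged", "attentive", "locked"

class EngagementAgent:
    def __init__(self, timeout_s: float = 30.0):
        self.state = DISENGAGED
        self.timeout_s = timeout_s
        self._idle = 0.0

    def wake(self, recurrent: bool = False, locked: bool = False):
        """Engage once, or enter a recurrent (attentive/locked) state."""
        self._idle = 0.0
        if locked:
            self.state = LOCKED
        elif recurrent:
            self.state = ATTENTIVE
        else:
            self.state = ENGAGED

    def tick(self, elapsed_s: float):
        """Conditional disengagement on timeout; locked states ignore it."""
        self._idle += elapsed_s
        if self.state in (ENGAGED, ATTENTIVE) and self._idle > self.timeout_s:
            self.state = DISENGAGED

    def release(self):
        """Explicit, user-driven disengagement."""
        self.state = DISENGAGED
```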
  • Patent number: 11238885
    Abstract: A computer-implemented technique for animating a visual representation of a face based on spoken words of a speaker is described herein. A computing device receives an audio sequence comprising content features reflective of spoken words uttered by a speaker. The computing device generates latent content variables and latent style variables based upon the audio sequence. The latent content variables are used to synchronize movement of lips on the visual representation to the spoken words uttered by the speaker. The latent style variables are derived from an expected appearance of facial features of the speaker as the speaker utters the spoken words and are used to synchronize movement of full facial features of the visual representation to the spoken words uttered by the speaker. The computing device causes the visual representation of the face to be animated on a display based upon the latent content variables and the latent style variables.
    Type: Grant
    Filed: October 29, 2018
    Date of Patent: February 1, 2022
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Gaurav Mittal, Baoyuan Wang
  • Patent number: 11232530
    Abstract: Provided is an inspection assistance device which includes: a first acquisition unit that acquires first data which is image data used to acquire results of inspection on a to-be-inspected object; a second acquisition unit that acquires second data that is different in type from the first data and is used to acquire results of inspection on the to-be-inspected object; and a display control unit that causes a display unit to display a result of comparison between first inspection result information pertaining to the inspection results based on the acquired first data and second inspection result information pertaining to the inspection results based on the acquired second data in such a manner as to be superimposed on an image in which the inspected object is displayed.
    Type: Grant
    Filed: February 28, 2017
    Date of Patent: January 25, 2022
    Assignee: NEC CORPORATION
    Inventors: Takami Sato, Kota Iwamoto, Yoshinori Saida, Shin Norieda
  • Patent number: 11232789
    Abstract: The present invention keeps a dialogue continuing for a long time without causing an uncomfortable feeling for the user. A dialogue system 10 includes at least an input part 1 that receives a user utterance, which is an utterance from a user, and a presentation part 5 that presents utterances. The input part 1 receives a user utterance performed by the user. A presentation part 5-1 presents a dialogue-establishing utterance which does not include any content words. A presentation part 5-2 presents, after the dialogue-establishing utterance, a second utterance associated with a generation target utterance, which is one or more preceding utterances that include at least the user utterance.
    Type: Grant
    Filed: May 19, 2017
    Date of Patent: January 25, 2022
    Assignees: NIPPON TELEGRAPH AND TELEPHONE CORPORATION, OSAKA UNIVERSITY
    Inventors: Hiroaki Sugiyama, Toyomi Meguro, Junji Yamato, Yuichiro Yoshikawa, Hiroshi Ishiguro, Takamasa Iio, Tsunehiro Arimoto
  • Patent number: 11222650
    Abstract: A device and a method for generating a synchronous corpus are disclosed. First, script data and a dysarthria voice signal having a dysarthria consonant signal are received, and the position of the dysarthria consonant signal is detected, wherein the script data have text corresponding to the dysarthria voice signal. Then, normal phoneme data corresponding to the text are searched and the text is converted into a normal voice signal based on the normal phoneme data corresponding to the text. The dysarthria consonant signal is replaced with the normal consonant signal based on the positions of the normal consonant signal and the dysarthria consonant signal, thereby synchronously converting the dysarthria voice signal into a synthesized voice signal. The synthesized voice signal and the dysarthria voice signal are provided to train a voice conversion model, retaining the timbre of the dysarthric voice and improving communication.
    Type: Grant
    Filed: March 18, 2020
    Date of Patent: January 11, 2022
    Assignee: NATIONAL CHUNG CHENG UNIVERSITY
    Inventors: Tay Jyi Lin, Ching Wei Yeh, Shun Pu Yang, Chen Zong Liao
  • Patent number: 11222622
    Abstract: Generally discussed herein are devices, systems, and methods for custom wake word selection assistance. A method can include receiving, at a device, data indicating a custom wake word provided by a user, determining one or more characteristics of the custom wake word, determining that use of the custom wake word will cause more than a threshold rate of false detections based on the characteristics, rejecting the custom wake word as the wake word for accessing a personal assistant in response to determining that use of the custom wake word will cause more than a threshold rate of false detections, and setting the custom wake word as the wake word in response to determining that use of the custom wake word will not cause more than the threshold rate of false detections.
    Type: Grant
    Filed: July 25, 2019
    Date of Patent: January 11, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Emilian Stoimenov, Khuram Shahid, Guoli Ye, Hosam Adel Khalil, Yifan Gong
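The accept/reject decision in this abstract (estimate a false-detection rate from wake-word characteristics, then compare against a threshold) can be caricatured with a crude scoring function. The heuristic below (length and vowel count as a syllable proxy) is entirely invented for illustration; the patent determines characteristics quite differently:

```python
# Invented heuristic: shorter wake words with fewer syllables are assumed
# to trigger false detections more often, so they score a higher rate.

VOWELS = set("aeiouy")

def estimate_false_rate(wake_word: str) -> float:
    """Return a rough false-detection rate estimate in [0, 1]."""
    word = wake_word.lower().strip()
    syllables = max(1, sum(1 for ch in word if ch in VOWELS))
    score = 1.0 / (len(word) * syllables)
    return min(1.0, score * 10)

def accept_wake_word(wake_word: str, max_rate: float = 0.2) -> bool:
    """Accept the custom wake word only if its estimated rate is tolerable."""
    return estimate_false_rate(wake_word) <= max_rate
```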
  • Patent number: 11216901
    Abstract: A system, a method, and computer-readable media for opportunistically authenticating a taxpayer. Specifically, embodiments of the invention leverage the fact that the user has possession of particular documents or access to certain information as evidence that the user is the person referred to in those documents or information. If the user provides sufficient evidence to authenticate themselves while providing the information required for the financial transaction, no separate authentication step may be required. At a high level, documents or other data imported in a first context (e.g., during the process of preparing a tax return for a user) are used as evidence of the user's authenticity in a second context.
    Type: Grant
    Filed: December 19, 2014
    Date of Patent: January 4, 2022
    Assignee: HRB Innovations, Inc.
    Inventor: Eric Roebuck
  • Patent number: 11211056
    Abstract: Systems and techniques for generating natural language understanding (NLU) models are described. A developer of an NLU model may provide data representing runtime NLU functionality. For example, a developer may provide one or more sample natural language user inputs. The NLU model generation system may expand data, provided by the developer, to result in a more robust NLU model for use at runtime. For example, the NLU model generation system may expand sample natural language user inputs, may translate sample natural language user inputs into other languages, etc. The present disclosure also provides a mechanism for transitioning between using NLU models of a first NLU model generation system and NLU models of a second NLU model generation system.
    Type: Grant
    Filed: April 19, 2019
    Date of Patent: December 28, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Anthony Bissell, Pragati Verma
  • Patent number: 11212244
    Abstract: A method for using an in-message application. The method includes: receiving a broadcast message; identifying, in the broadcast message, a reference to an external data provider; obtaining an identifier of the in-message application from the external data provider; using the identifier to identify a set of components of the in-message application, where placement of the set of components is defined by a visual structure of the in-message application, and where each of the set of components is a user interface (UI) element; associating data obtained from the external data provider with a component of the set of components; and serving the broadcast message and the data to a consumer client, where the consumer client renders the in-message application based on the visual structure.
    Type: Grant
    Filed: September 17, 2019
    Date of Patent: December 28, 2021
    Assignee: Twitter, Inc.
    Inventors: William Morgan, Jeremy Gordon, Grant Monroe, Buster Benson, Russell D'Sa, Adam Singer, Ian Chan, Brian Ellin, Reeve Thompson, Lucas Alonso-Martinez
  • Patent number: 11206182
    Abstract: A computing system including a processor; and a memory communicatively coupled to the processor. The processor is configured to: analyze input received through an input interface of a computing device; determine a context based on the input; and reconfigure the input interface to comprise a key based on a domain associated with the context.
    Type: Grant
    Filed: October 19, 2010
    Date of Patent: December 21, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Feng-wei Chen, Joseph B. Hall, Samuel R. McHan, Jr.
  • Patent number: 11205057
    Abstract: A cognitive communication assistant receives a message transmitted over a communication network from a sender to a recipient. A sender's industry identified with the sender and a recipient's industry identified with the recipient are determined. One or more terms associated with the sender's industry are extracted from the message. A definition associated with the one or more terms is searched for in an on-line reference text. The message is updated based on the definition. The message is transmitted over the communication network to the recipient.
    Type: Grant
    Filed: December 20, 2019
    Date of Patent: December 21, 2021
    Assignee: International Business Machines Corporation
    Inventors: Tara Astigarraga, Itzhack Goldberg, Jose R. Mosqueda Mejia, Daniel J. Winarski
  • Patent number: 11195516
    Abstract: A system that allows non-engineer administrators, without programming, machine language, or artificial intelligence system knowledge, to expand the capabilities of a dialogue system. The dialogue system may have a knowledge system, user interface, and learning model. A user interface allows non-engineers to utilize the knowledge system, defined by a small set of primitives and a simple language, to annotate a user utterance. The annotation may include selecting actions to take based on the utterance and subsequent actions and configuring associations. A dialogue state is continuously updated and provided to the user as the actions and associations take place. Rules are generated based on the actions, associations, and dialogue state that allow for computing a wide range of results.
    Type: Grant
    Filed: February 26, 2020
    Date of Patent: December 7, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Percy Shuo Liang, David Leo Wright Hall, Joshua James Clausman
  • Patent number: 11195511
    Abstract: Described herein is a method for creating object-based audio content from a text input for use in audio books and/or audio play, the method including the steps of: a) receiving the text input; b) performing a semantic analysis of the received text input; c) synthesizing speech and effects based on one or more results of the semantic analysis to generate one or more audio objects; d) generating metadata for the one or more audio objects; and e) creating the object-based audio content including the one or more audio objects and the metadata. Described herein are further a computer-based system including one or more processors configured to perform said method and a computer program product comprising a computer-readable storage medium with instructions adapted to carry out said method when executed by a device having processing capability.
    Type: Grant
    Filed: July 17, 2019
    Date of Patent: December 7, 2021
    Assignees: Dolby Laboratories Licensing Corporation, Dolby International AB
    Inventors: Toni Hirvonen, Daniel Arteaga, Eduard Aylon Pla, Alex Cabrer Manning, Lie Lu, Karl Jonas Roeden
  • Patent number: 11195513
    Abstract: A technique for estimating phonemes for a word written in a different language is disclosed. A sequence of graphemes of a given word in a source language is received. The sequence of the graphemes in the source language is converted into a sequence of phonemes in the source language. One or more sequences of phonemes in a target language are generated from the sequence of the phonemes in the source language by using a neural network model. One sequence of phonemes in the target language is determined for the given word. Also, a technique for estimating graphemes of a word from phonemes in a different language is disclosed.
    Type: Grant
    Filed: September 27, 2017
    Date of Patent: December 7, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Gakuto Kurata, Toru Nagano, Yuta Tsuboi
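The two-stage pipeline in this abstract (graphemes to source-language phonemes, then source phonemes to target-language phonemes) can be illustrated with lookup tables. The patent uses a neural network for the second stage; a table stands in for it here, and every mapping below is a toy example:

```python
# Stage 1: source-language grapheme-to-phoneme rules (toy English subset).
G2P_SOURCE = {"sh": "SH", "ee": "IY", "t": "T"}

# Stage 2: source-phoneme -> target-phoneme mapping (toy stand-in for the
# neural model; target phonemes are an invented Japanese-like inventory).
P2P_TARGET = {"SH": "sh", "IY": "ii", "T": "to"}

def graphemes_to_source_phonemes(word: str) -> list:
    """Greedy longest-match conversion of graphemes to source phonemes."""
    phonemes, i = [], 0
    while i < len(word):
        for size in (2, 1):
            chunk = word[i:i + size]
            if chunk in G2P_SOURCE:
                phonemes.append(G2P_SOURCE[chunk])
                i += size
                break
        else:
            raise ValueError(f"no rule for {word[i]!r}")
    return phonemes

def to_target_phonemes(word: str) -> list:
    """Chain both stages: graphemes -> source phonemes -> target phonemes."""
    return [P2P_TARGET[p] for p in graphemes_to_source_phonemes(word)]
```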
  • Patent number: 11190855
    Abstract: A system and method are provided for generating a descriptive video service track for a video asset. Different scenes and/or scene transitions are detected in a predetermined version of the video asset via automated media analysis. Gaps in dialogue are detected in the at least one scene via automated media analysis. Objects appearing in the at least one scene are recognized via automated media analysis, and text descriptive of at least one of the objects appearing in the at least one scene is automatically generated. An audio file of the text descriptive of the at least one of the objects appearing in the at least one scene of the predetermined version of the video asset is generated and used as part of a descriptive video service track for the video asset.
    Type: Grant
    Filed: August 30, 2017
    Date of Patent: November 30, 2021
    Assignee: ARRIS Enterprises LLC
    Inventor: Michael R. Kahn
  • Patent number: 11184492
    Abstract: An apparatus includes: an operation panel that displays an operation screen and accepts an operation from a user; a hardware processor that accepts an operation from a user by voice, turns off the operation panel in a case where an interval between operations received by the operation panel exceeds a first set time, and resets a setting content stored in the storage in a case where an interval between operations received by the operation panel or the hardware processor exceeds a second set time; a storage that stores a setting content corresponding to an operation received by the operation panel or the hardware processor; and a changer that changes a set time of a timer.
    Type: Grant
    Filed: April 27, 2020
    Date of Patent: November 23, 2021
    Assignee: KONICA MINOLTA, INC.
    Inventor: Takeo Katsuda
  • Patent number: 11183170
    Abstract: The present technology relates to an interaction control apparatus and a method that enable more appropriate interaction control to be performed. The interaction control apparatus includes an interaction progress controller that causes an utterance to be made in one or a plurality of understanding action request positions on the basis of utterance text that has been divided in the one or the plurality of understanding action request positions, the utterance inducing a user to perform an understanding action, and that controls a next utterance on the basis of a result of detecting the understanding action and the utterance text. The present technology is applicable to a speech interaction system.
    Type: Grant
    Filed: August 3, 2017
    Date of Patent: November 23, 2021
    Assignee: SONY CORPORATION
    Inventors: Hiro Iwase, Mari Saito, Shinichi Kawano
  • Patent number: 11183169
    Abstract: A technique to enhance the quality of Text-to-Speech (TTS) based Singing Voice generation is disclosed. The present invention efficiently preserves the speaker identity and improves sound quality by incorporating speaker-independent natural singing information into TTS-based Speech-to-Singing (STS). The Template-based Text-to-Singing (TTTS) system merges qualities of a singing voice generated from a TTS system with qualities of a singing voice generated from an actual voice singing the song. The qualities are represented in terms of Mel-generalized cepstrum (MGC) coefficients. In particular, low-order MGC coefficients from the TTS-based singing voice are combined with high-order MGC coefficients from the voice of an actual singer.
    Type: Grant
    Filed: November 8, 2019
    Date of Patent: November 23, 2021
    Assignee: OBEN, INC.
    Inventors: Kantapon Kaewtip, Fernando Villavicencio
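The coefficient-merging idea above (low-order MGC coefficients from the TTS singing voice, high-order coefficients from a real singer) amounts to splicing two vectors per frame. The split index and frame sizes below are illustrative assumptions:

```python
# Sketch: per-frame merge of Mel-generalized cepstrum coefficient vectors.

def merge_mgc(tts_frame: list, singer_frame: list, split: int = 4) -> list:
    """Keep tts_frame[:split] (low order, speaker identity) and
    singer_frame[split:] (high order, natural singing detail)."""
    if len(tts_frame) != len(singer_frame):
        raise ValueError("frames must have the same MGC order")
    return tts_frame[:split] + singer_frame[split:]

def merge_sequence(tts_seq, singer_seq, split: int = 4):
    """Apply the merge frame by frame across two aligned sequences."""
    return [merge_mgc(a, b, split) for a, b in zip(tts_seq, singer_seq)]
```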
  • Patent number: 11170757
    Abstract: Systems and methods for sending text messages in audio form over voice calls. When a user receives an incoming voice call, the system can enable a user to type a “text” message to the caller. Rather than being sent as a text message, however, the system can send the text message directly to the microphone of the user's equipment (UE) as a voice-synthesized audio file, or text-to-microphone (TTM) message. The audio file is then sent from the user's UE to the caller's UE, in effect “reading” the text message to the caller. The caller hears the contents of the message, in the form of a voice-synthesized audio file, over the speaker of the caller's UE. The system can mute the microphones on one or both UEs during the TTM process to create a virtually silent process from the user's standpoint.
    Type: Grant
    Filed: September 30, 2016
    Date of Patent: November 9, 2021
    Assignee: T-Mobile USA, Inc.
    Inventor: Hsin-Fu Henry Chiang
  • Patent number: 11170585
    Abstract: A system and method of performing fault diagnosis and analysis for one or more vehicles. The method includes: obtaining design failure mode and effect analysis (DFMEA) data that specifies a plurality of failure modes; receiving diagnostic association data; receiving vehicle operation signals association data; generating augmented DFMEA data that indicates a causal relationship between the diagnostic data and the first set of failure modes, and that indicates a causal relationship between the vehicle operation signals data and the second set of failure modes, wherein the augmented DFMEA data is generated based on the DFMEA data, the diagnostic association data, and the vehicle operation signals association data; and performing fault diagnosis and analysis for the one or more vehicles using the augmented DFMEA data.
    Type: Grant
    Filed: June 17, 2019
    Date of Patent: November 9, 2021
    Assignee: GM GLOBAL TECHNOLOGY OPERATIONS LLC
    Inventors: Chaitanya Sankavaram, Dnyanesh G. Rajpathak, Azeem Sarwar, Xiangxing Lu, Dean G. Sorrell, Layne K. Wiggins
  • Patent number: 11170755
    Abstract: The present disclosure relates to a speech synthesis apparatus and method that can remove discontinuity between phoneme units when generating a synthesized sound from the phoneme units, thereby implementing natural utterances and producing a high-quality synthesized sound having stable prosody.
    Type: Grant
    Filed: April 30, 2020
    Date of Patent: November 9, 2021
    Assignee: SK TELECOM CO., LTD.
    Inventors: Changheon Lee, Jongjin Kim, Jihoon Park
  • Patent number: 11170051
    Abstract: An apparatus generates property data with a first context relation set between text display areas contained in a display image of a first webpage, and generates, based on the property data, dialog control data with a second context relation set between pieces of text extracted from structural elements of text display areas contained in a second webpage.
    Type: Grant
    Filed: August 28, 2018
    Date of Patent: November 9, 2021
    Assignee: FUJITSU LIMITED
    Inventors: Takumi Baba, Takashi Imai, Kei Taira, Miwa Okabayashi, Tatsuro Matsumoto
  • Patent number: 11159261
    Abstract: A server system accesses a profile of a user of the media-providing service. The profile indicates a demographic group of the user. For each track of a plurality of tracks, the server system determines a year associated with the track. The server system selects content for the user based at least in part on an affinity of members of the demographic group, as compared to members of other demographic groups, for music from the year associated with the track. The server system provides the selected content to a client device associated with the user.
    Type: Grant
    Filed: June 3, 2020
    Date of Patent: October 26, 2021
    Assignee: Spotify AB
    Inventors: Clay Gibson, Santiago Gil, Ian Anderson, Oguz Semerci, Scott Wolf, Margreth Mpossi
  • Patent number: 11157075
    Abstract: Systems and methods for providing gaze-activated voice services for interactive workspaces. In some embodiments, an Information Handling System (IHS), may include a processor and a memory coupled to the processor, the memory having program instructions stored thereon that, upon execution, cause the IHS to: transmit a voice command to a voice service provider; receive a textual instruction in response to the voice command; identify a gaze focus of the user; and execute the textual instruction using the gaze focus.
    Type: Grant
    Filed: May 1, 2018
    Date of Patent: October 26, 2021
    Assignee: Dell Products, L.P.
    Inventors: Tyler Ryan Cox, Todd Erick Swierk, Marc Randall Hammons
  • Patent number: 11145289
    Abstract: A system and method for providing an audible explanation of documents upon request is disclosed. The system and method use an intelligent voice assistant that can receive audible requests for document explanations. The intelligent voice assistant can retrieve document summary information and provide an audible response explaining key points of the document.
    Type: Grant
    Filed: May 7, 2019
    Date of Patent: October 12, 2021
    Assignee: United Services Automobile Association (USAA)
    Inventors: Richard Daniel Graham, Ruthie D. Lyle
  • Patent number: 11145288
    Abstract: A computing system and related techniques for selecting content to be automatically converted to speech and provided as an audio signal are provided. A text-to-speech request associated with a first document can be received that includes data associated with a playback position of a selector associated with a text-to-speech interface overlaid on the first document. First content associated with the first document can be determined based at least in part on the playback position, the first content including content that is displayed in the user interface at the playback position. The first document can be analyzed to identify one or more structural features associated with the first content. Speech data can be generated based on the first content and the one or more structural features.
    Type: Grant
    Filed: May 21, 2019
    Date of Patent: October 12, 2021
    Assignee: Google LLC
    Inventors: Benedict Davies, Guillaume Boniface, Jack Whyte, Jakub Adamek, Simon Tokumine, Alessio Macri, Matthias Quasthoff
  • Patent number: 11138965
    Abstract: A technique for estimating phonemes for a word written in a different language is disclosed. A sequence of graphemes of a given word in a source language is received. The sequence of the graphemes in the source language is converted into a sequence of phonemes in the source language. One or more sequences of phonemes in a target language are generated from the sequence of the phonemes in the source language by using a neural network model. One sequence of phonemes in the target language is determined for the given word. Also, a technique for estimating graphemes of a word from phonemes in a different language is disclosed.
    Type: Grant
    Filed: November 2, 2017
    Date of Patent: October 5, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Gakuto Kurata, Toru Nagano, Yuta Tsuboi
  • Patent number: 11138963
    Abstract: A processor-implemented text-to-speech method includes determining, using a sub-encoder, a first feature vector indicating an utterance characteristic of a speaker from feature vectors of a plurality of frames extracted from a partial section of a first speech signal of the speaker, and determining, using an autoregressive decoder into which the first feature vector is input as an initial value, a second feature vector of a second speech signal in which a text is uttered according to the utterance characteristic, from context information of the text.
    Type: Grant
    Filed: May 7, 2019
    Date of Patent: October 5, 2021
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Hoshik Lee
  • Patent number: 11128591
    Abstract: In one example, a trigger is obtained for a dynamic ideogram to dynamically interact with the electronic messaging environment. In response to the trigger, it is determined how the dynamic ideogram is to dynamically interact with the electronic messaging environment including performing an analysis of the electronic messaging environment. Based on the analysis of the electronic messaging environment, instructions to render the dynamic ideogram to dynamically interact with the electronic messaging environment are generated for a first user device configured to communicate with a second user device via the electronic messaging environment.
    Type: Grant
    Filed: August 27, 2020
    Date of Patent: September 21, 2021
    Assignee: CISCO TECHNOLOGY, INC.
    Inventors: Christopher Deering, Colin Olivier Louis Vidal, Jimmy Coyne
  • Patent number: 11106314
    Abstract: Visual images projected on a projection surface by a projector provide an interactive user interface having end user inputs detected by a detection device, such as a depth camera. The detection device monitors projected images initiated in response to user inputs to determine calibration deviations, such as by comparing the distance between where a user makes an input and where the input is projected. Calibration is performed to align the projected outputs and detected inputs. The calibration may include a coordinate system anchored by its origin to a physical reference point of the projection surface, such as a display mat or desktop edge.
    Type: Grant
    Filed: April 21, 2015
    Date of Patent: August 31, 2021
    Assignee: DELL PRODUCTS L.P.
    Inventors: Karthik Krishnakumar, Michiel Sebastiaan Emanuel Petrus Knoppert, Rocco Ancona, Abu S. Sanaullah, Mark R. Ligameri
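The calibration check described above — comparing where the user makes an input with where the corresponding image was projected — can be sketched as below. The translation-only correction model and the coordinate layout are assumptions; a real system would typically fit a full homography between camera and projector coordinates.

```python
# Sketch: estimate the average offset between detected touch points (from
# the depth camera) and the projected targets, then correct new inputs.

def calibration_offset(detected, projected):
    """Average (dx, dy) deviation between detected inputs and projected outputs."""
    n = len(detected)
    dx = sum(d[0] - p[0] for d, p in zip(detected, projected)) / n
    dy = sum(d[1] - p[1] for d, p in zip(detected, projected)) / n
    return dx, dy

def apply_calibration(point, offset):
    """Map a detected input back into the projector's coordinate system."""
    return point[0] - offset[0], point[1] - offset[1]

# Two sample input/target pairs, both off by (2, 2):
offset = calibration_offset([(12, 7), (22, 17)], [(10, 5), (20, 15)])
print(offset)  # (2.0, 2.0)
```

The origin of the corrected coordinate system would be anchored to a physical reference point (e.g. a display-mat corner), as the abstract notes.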
  • Patent number: 11107458
    Abstract: An example embodiment may involve receiving, from a client device, a selection of text-based articles from newsfeeds. The selection may specify that the text-based articles have been flagged for audible playout. The example embodiment may also involve, possibly in response to receiving the selection of the text-based articles, retrieving text-based articles from the newsfeeds. The example embodiment may also involve causing the text-based articles to be converted into audio files. The example embodiment may also involve receiving a request to stream the audio files to the client device or another device. The example embodiment may also involve causing the audio files to be streamed to the client device or the other device.
    Type: Grant
    Filed: December 30, 2019
    Date of Patent: August 31, 2021
    Assignee: Gracenote Digital Ventures, LLC
    Inventor: Venkatarama Anilkumar Panguluri
  • Patent number: 11108721
    Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods for communication using multiple media content items stored on both a sending device and a receiving device. In particular, in one or more embodiments, the disclosed systems receive an application package. The application matches a portion of the input text to an audio content item using mapping data and generates a message including the input text and an identifier for the audio content item. A receiving system receives an application package. The application receives the message, locates the audio content item in the application package using the identifier, and presents the message, including the text and the audio content item.
    Type: Grant
    Filed: April 21, 2020
    Date of Patent: August 31, 2021
    Inventors: David Roberts, Glenn Sugden
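Because both devices ship with the same mapping data, only a short identifier needs to travel with the message. A minimal sketch of the sender/receiver flow, with hypothetical phrases and IDs:

```python
# Shared mapping data, assumed present in the application package on
# both the sending and receiving devices (contents are illustrative).
MAPPING = {"congratulations": "audio_017", "happy birthday": "audio_042"}

def build_message(text: str) -> dict:
    """Sender: attach the ID of the first audio item matching the input text."""
    lower = text.lower()
    for phrase, audio_id in MAPPING.items():
        if phrase in lower:
            return {"text": text, "audio_id": audio_id}
    return {"text": text, "audio_id": None}

def present_message(message: dict) -> str:
    """Receiver: resolve the identifier against the local copy of the package."""
    if message["audio_id"] is not None:
        return f"{message['text']} [plays {message['audio_id']}]"
    return message["text"]

msg = build_message("Happy birthday!")
print(present_message(msg))  # Happy birthday! [plays audio_042]
```

Shipping an identifier instead of the audio itself keeps messages small, at the cost of requiring both sides to hold the same content version.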
  • Patent number: 11107457
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating speech from text. One of the systems includes one or more computers and one or more storage devices storing instructions that when executed by one or more computers cause the one or more computers to implement: a sequence-to-sequence recurrent neural network configured to: receive a sequence of characters in a particular natural language, and process the sequence of characters to generate a spectrogram of a verbal utterance of the sequence of characters in the particular natural language; and a subsystem configured to: receive the sequence of characters in the particular natural language, and provide the sequence of characters as input to the sequence-to-sequence recurrent neural network to obtain as output the spectrogram of the verbal utterance of the sequence of characters in the particular natural language.
    Type: Grant
    Filed: November 26, 2019
    Date of Patent: August 31, 2021
    Assignee: Google LLC
    Inventors: Samuel Bengio, Yuxuan Wang, Zongheng Yang, Zhifeng Chen, Yonghui Wu, Ioannis Agiomyrgiannakis, Ron J. Weiss, Navdeep Jaitly, Ryan M. Rifkin, Robert Andrew James Clark, Quoc V. Le, Russell J. Ryan, Ying Xiao
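The interface described above (a character sequence in, a spectrogram of the utterance out) can be illustrated with a toy stand-in. A real system learns this mapping with a sequence-to-sequence recurrent network and predicts variable frame counts per character; the deterministic expansion below exists only to show the shapes involved.

```python
import numpy as np

N_MELS = 8           # mel channels per frame (illustrative)
FRAMES_PER_CHAR = 3  # a trained model would predict durations instead

def characters_to_spectrogram(text: str) -> np.ndarray:
    """Toy character -> spectrogram mapping; output shape (frames, N_MELS)."""
    frames = []
    for ch in text:
        base = (ord(ch) % 32) / 32.0           # pseudo "embedding" of the char
        for t in range(FRAMES_PER_CHAR):
            frames.append(np.full(N_MELS, base) * (1.0 - 0.1 * t))
    return np.stack(frames)

spec = characters_to_spectrogram("hello")
print(spec.shape)  # (15, 8)
```

In the patented system the spectrogram is then inverted to a waveform by a separate component; only the text-to-spectrogram stage is sketched here.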
  • Patent number: 11093691
    Abstract: A system and method of establishing a communication session is disclosed herein. A computing system receives, from a client device, a content item comprising text-based content. The computing system generates a mark-up version of the content item by identifying one or more characters in the text-based content and a relative location of the one or more characters in the content item. The computing system receives, from the client device, an interrogatory related to the content item. The computing system analyzes the mark-up version of the content item to identify an answer to the interrogatory. The computing system generates a response message comprising the identified answer to the interrogatory. The computing system transmits the response message to the client device.
    Type: Grant
    Filed: February 14, 2020
    Date of Patent: August 17, 2021
    Assignee: Capital One Services, LLC
    Inventors: Michael Mossoba, Abdelkader M'Hamed Benkreira, Joshua Edwards
  • Patent number: 11087379
    Abstract: A user registers for an account with an account management system, configures account settings to permit the account management system to receive user computing device data from a user computing device associated with the user, and logs into the account via the user computing device. The account management system receives a user voice purchase command and determines a purchase command context based on the received user computing device data. The account management system identifies a product that the user desires to purchase based on the purchase command context and directs the user computing device web browser to a merchant website to set up a transaction for the identified product.
    Type: Grant
    Filed: February 12, 2015
    Date of Patent: August 10, 2021
    Assignee: GOOGLE LLC
    Inventors: Filip Verley, IV, Stuart Ross Hobbie
  • Patent number: 11087091
    Abstract: Disclosed herein is a method and response generation system for providing contextual responses to user interaction. In an embodiment, input data related to user interaction, which may be received from a plurality of input channels in real-time, may be processed using processing models corresponding to each of the input channels for extracting interaction parameters. Thereafter, the interaction parameters may be combined for computing a contextual variable, which in turn may be analyzed to determine a context of the user interaction. Finally, responses corresponding to the context of the user interaction may be generated and provided to the user for completing the user interaction. In some embodiments, the method of the present disclosure accurately detects the context of the user interaction and provides meaningful contextual responses to the user interaction.
    Type: Grant
    Filed: February 19, 2019
    Date of Patent: August 10, 2021
    Assignee: Wipro Limited
    Inventors: Gopichand Agnihotram, Rajesh Kumar, Pandurang Naik
  • Patent number: 11080474
    Abstract: Described herein is a system and method for associating audio files with one or more cells in a spreadsheet application. As described, one or more audio files may be associated with a single cell in a spreadsheet application, or may be associated with a range of cells in the spreadsheet application. Information about the audio file, such as playback properties and other parameters, may be retrieved from the audio file. Once retrieved, a calculation engine of the spreadsheet application may perform one or more calculations on the information in order to change the content of the audio file, the playback of the audio files, and so on.
    Type: Grant
    Filed: November 1, 2016
    Date of Patent: August 3, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Samuel C. Radakovitz, Christian M. Canton, Carlos A. Ortero, John Campbell, Allison Rutherford, Benjamin E. Rampson
  • Patent number: 11081111
    Abstract: Methods, systems, and related products that provide emotion-sensitive responses to a user's commands and other utterances received at an utterance-based user interface. Acknowledgements of the user's utterances are adapted to the user and/or the user device, and to emotions detected in the user's utterance, which are mapped from one or more emotion features extracted from the utterance. In some examples, the user's changing emotion, extracted during a sequence of interactions, is used to generate a response to the user's uttered command. In some examples, emotion processing and command processing of natural utterances are performed asynchronously.
    Type: Grant
    Filed: March 17, 2020
    Date of Patent: August 3, 2021
    Assignee: Spotify AB
    Inventors: Daniel Bromand, David Gustafsson, Richard Mitic, Sarah Mennicken
  • Patent number: 11082559
    Abstract: A virtual assistant server receives a web request, such as an HTTP request, with one or more call parameters corresponding to a call redirected from an interactive voice response server. The virtual assistant server inputs the received one or more call parameters to a predictive model, which identifies, based on the one or more call parameters, an intelligent communication mode to route the redirected call to. Subsequently, the virtual assistant server routes the redirected call to the intelligent communication mode.
    Type: Grant
    Filed: July 31, 2020
    Date of Patent: August 3, 2021
    Assignee: KORE.AI, INC.
    Inventors: Rajkumar Koneru, Prasanna Kumar Arikala Gunalan, Rajavardhan Nalluri
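The routing step above can be sketched as follows. The rule-based function stands in for the patent's predictive model, and the parameter names and communication modes are hypothetical.

```python
# Sketch: call parameters from the redirected IVR call are fed to a
# "predictive model" (a stand-in rule here) that selects a mode.

def predict_mode(call_params: dict) -> str:
    """Stand-in predictive model: choose an intelligent communication mode."""
    if call_params.get("intent") == "billing":
        return "voice_bot"
    if call_params.get("wait_time_s", 0) > 300:
        return "sms"
    return "chat"

def route_call(call_params: dict) -> str:
    """Route the redirected call to the predicted communication mode."""
    mode = predict_mode(call_params)
    return f"routed to {mode}"

print(route_call({"intent": "billing"}))  # routed to voice_bot
```

In the described system the parameters arrive as part of a web request (e.g. HTTP) rather than a local dictionary; the prediction-then-route structure is the same.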
  • Patent number: 11074904
    Abstract: A speech synthesis method and apparatus based on emotion information are disclosed. The method extracts speech synthesis target text from received data and determines whether the received data includes situation explanation information. First metadata corresponding to first emotion information is generated on the basis of the situation explanation information. When the received data does not include situation explanation information, second metadata corresponding to second emotion information, generated on the basis of semantic analysis and context analysis, is generated. One of the first metadata and the second metadata is added to the speech synthesis target text to synthesize speech corresponding to the received data.
    Type: Grant
    Filed: October 4, 2019
    Date of Patent: July 27, 2021
    Assignee: LG Electronics Inc.
    Inventors: Siyoung Yang, Minook Kim, Sangki Kim, Yongchul Park, Juyeong Jang, Sungmin Han
  • Patent number: 11074907
    Abstract: Techniques for generating a prompt coverage score, which measures an extent to which data output to a user during a dialog is repetitive and monotonous, are described. User input data and system output, corresponding to a dialog exchange between a user and a skill, may be determined. A portion of the system output data, corresponding to a system prompt representing default output data, may be determined. A first number, representing possible variants of the prompt, may be determined along with a second number, representing variants of the prompt output during the dialog exchange. A prompt coverage score may be determined based on the first and second numbers.
    Type: Grant
    Filed: May 29, 2019
    Date of Patent: July 27, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Ravi Chikkanayakanahalli Mallikarjuniah, Priya Rao Chagaleti, Shiladitya Roy, Christopher Forbes Will, Cole Ira Brendel, Wei Huang, Sarthak Anand
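The score described above can be illustrated with a simple ratio: distinct prompt variants actually output during the dialog (the second number) over the possible variants of the prompt (the first number). The exact formula is an assumption; the abstract only says the score is "based on" the two numbers.

```python
# Sketch: a higher coverage score means output was less repetitive,
# because more of the available prompt variants were actually used.

def prompt_coverage(possible_variants: set, used_variants: list) -> float:
    """Ratio of distinct variants output to possible variants."""
    first = len(possible_variants)                          # possible variants
    second = len(set(used_variants) & possible_variants)    # variants output
    return second / first if first else 0.0

score = prompt_coverage(
    {"How can I help?", "What can I do for you?", "Need anything else?"},
    ["How can I help?", "How can I help?", "Need anything else?"],
)
print(round(score, 3))  # 0.667
```

Here two of three possible variants were used, so the dialog scores 2/3 despite one variant repeating.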
  • Patent number: 11068526
    Abstract: Methods, systems, and computer program products are provided for obtaining enhanced metadata for media content searches. In one embodiment, computer program logic embodies a metadata receiver and a media content metadata matcher and combiner. The metadata receiver receives program metadata for a plurality of programs from a plurality of metadata sources. The media content metadata matcher and combiner is configured to perform a matching process whereby metadata associated with each of the plurality of programs is compared to metadata of each of the other programs to determine whether the compared programs are the same program and, if so, to combine the metadata from each into a single program record with enhanced metadata and store it in a database. A subsequent search for a program corresponding to the stored program returns at least some of the metadata associated with the program, which enables accessing the program.
    Type: Grant
    Filed: January 25, 2019
    Date of Patent: July 20, 2021
    Assignee: Caavo Inc
    Inventors: Amrit P. Singh, Sravan K. Andavarapu, Jayanth Manklu, Anu Godara, Vinu Joseph, Vinod K. Gopinath, Ashish D. Aggarwal
  • Patent number: 11064000
    Abstract: Techniques and systems are described for accessible audio switching options during an online conference. For example, a conferencing system receives presentation content and audio content as part of the online conference from a client device. The conferencing system generates voice-over content from the presentation content by converting text of the presentation content to audio. The conferencing system then divides the presentation content into presentation segments. The conferencing system also divides the audio content into audio segments that correspond to respective presentation segments, and the voice-over content into voice-over segments that correspond to respective presentation segments. As the online conference is output, the conferencing system enables switching between a corresponding audio segment and voice-over segment during output of a respective presentation segment.
    Type: Grant
    Filed: November 29, 2017
    Date of Patent: July 13, 2021
    Assignee: Adobe Inc.
    Inventors: Ajay Jain, Sachin Soni, Amit Srivastava
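The per-segment alignment above can be sketched as three parallel tracks that share segment boundaries, with playback free to switch tracks at each boundary. The data layout is an assumption made for illustration.

```python
# Sketch: presentation segments paired with both the recorded audio and
# the generated voice-over, switchable per segment.

def build_segments(slides, audio_parts, voiceover_parts):
    """Pair each presentation segment with both audio options."""
    return [
        {"slide": s, "audio": a, "voiceover": v}
        for s, a, v in zip(slides, audio_parts, voiceover_parts)
    ]

def play(segments, choices):
    """choices[i] selects 'audio' or 'voiceover' for segment i."""
    return [seg[choice] for seg, choice in zip(segments, choices)]

segments = build_segments(["s1", "s2"], ["a1", "a2"], ["v1", "v2"])
print(play(segments, ["audio", "voiceover"]))  # ['a1', 'v2']
```

Keeping the three tracks segment-aligned is what makes mid-presentation switching seamless: the listener never loses their place in the slides.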
  • Patent number: 11062694
    Abstract: Systems and methods for generating output audio with emphasized portions are described. Spoken audio is obtained and undergoes speech processing (e.g., ASR and optionally NLU) to create text. It may be determined that the resulting text includes a portion that should be emphasized (e.g., an interjection) using at least one of knowledge of an application run on a device that captured the spoken audio, prosodic analysis, and/or linguistic analysis. The portion of text to be emphasized may be tagged (e.g., using a Speech Synthesis Markup Language (SSML) tag). TTS processing is then performed on the tagged text to create output audio including an emphasized portion corresponding to the tagged portion of the text.
    Type: Grant
    Filed: June 7, 2019
    Date of Patent: July 13, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Marco Nicolis, Adam Franciszek Nadolski
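The tagging step above — wrapping the portion of text to be emphasized in an SSML tag before TTS — can be sketched as follows. The interjection list used to pick the emphasized portion is a toy heuristic; the patent's approach also draws on application knowledge, prosodic analysis, and linguistic analysis.

```python
# Sketch: mark interjections with the SSML <emphasis> element so the
# TTS engine renders them with extra stress.

INTERJECTIONS = {"wow", "hooray", "ouch", "yay"}

def tag_emphasis(text: str) -> str:
    """Wrap interjections in <emphasis> tags inside a <speak> document."""
    words = []
    for word in text.split():
        if word.strip("!,.").lower() in INTERJECTIONS:
            words.append(f'<emphasis level="strong">{word}</emphasis>')
        else:
            words.append(word)
    return "<speak>" + " ".join(words) + "</speak>"

print(tag_emphasis("Wow, that was fast"))
# <speak><emphasis level="strong">Wow,</emphasis> that was fast</speak>
```

The resulting SSML document is then passed to TTS processing, which produces output audio with the tagged portion emphasized.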