Image To Speech Patents (Class 704/260)
  • Patent number: 10318236
    Abstract: Approaches provide for using a voice communications device to control, refine, or otherwise manage the playback of media content in response to a spoken instruction. For example, the voice communications device can receive a request to refine and/or initiate the playback of media content, such as music, news, audio books, audio broadcasts, and other such content. Audio input data that includes the request can be received by the voice communications device and an application executing on the voice communications device or otherwise in communication with the voice communications device can analyze the audio input data to determine how to carry out the request. The application can determine whether there is an active play queue of media content configured to play using the voice communications device. In the situation where there is no media content being played using the voice communications device, the application can determine media content using information in the request.
    Type: Grant
    Filed: May 5, 2016
    Date of Patent: June 11, 2019
    Assignee: Amazon Technologies, Inc.
    Inventors: Rickesh Pal, Kintan Dilipkumar Brahmbhatt, Brandon Scott Durham, Jonathan Barnett Feinstein, Yun Suk Paik, Daniel Paul Ryan
  • Patent number: 10319365
    Abstract: Systems and methods for generating output audio with emphasized portions are described. Spoken audio is obtained and undergoes speech processing (e.g., ASR and optionally NLU) to create text. It may be determined that the resulting text includes a portion that should be emphasized (e.g., an interjection) using at least one of knowledge of an application run on a device that captured the spoken audio, prosodic analysis, and/or linguistic analysis. The portion of text to be emphasized may be tagged (e.g., using a Speech Synthesis Markup Language (SSML) tag). TTS processing is then performed on the tagged text to create output audio including an emphasized portion corresponding to the tagged portion of the text.
    Type: Grant
    Filed: June 27, 2016
    Date of Patent: June 11, 2019
    Assignee: Amazon Technologies, Inc.
    Inventors: Marco Nicolis, Adam Franciszek Nadolski
  • Patent number: 10311437
    Abstract: Provided is a method and a telephone-based system with voice-verification capabilities that enable a user to safely and securely conduct transactions with his or her online financial transaction program account over the phone in a convenient and user-friendly fashion, without having to depend on an internet connection.
    Type: Grant
    Filed: November 13, 2017
    Date of Patent: June 4, 2019
    Assignee: PAYPAL, INC.
    Inventor: Will Tonini
  • Patent number: 10304430
    Abstract: An electronic musical instrument includes: a plurality of keys, each of the plurality of keys specifying a pitch; a memory storing musical piece data representing a musical piece; and a processor, wherein the processor executes the following: retrieving the musical piece data of a musical piece from the memory and determining whether the musical piece data contains data of a lyric; and when the musical piece data contains the data of the lyric, and if a note specified by an operation of a key by a user is accompanied by a part of the lyric in the musical piece, causing data of a singing voice sound having the pitch specified by said operated key to be generated in accordance with the part of the lyric in response to the operation of the key, and causing the singing voice sound to be audibly output.
    Type: Grant
    Filed: March 16, 2018
    Date of Patent: May 28, 2019
    Assignee: CASIO COMPUTER CO., LTD.
    Inventor: Atsushi Nakamura
  • Patent number: 10275420
    Abstract: The disclosure includes a system and method for summarizing social interactions between users.
    Type: Grant
    Filed: May 11, 2017
    Date of Patent: April 30, 2019
    Assignee: Google LLC
    Inventors: Nadav Aharony, Alan Lee Gardner, III, George Cody Sumter
  • Patent number: 10276148
    Abstract: Some examples of assisted media representation can be implemented as a system and method that uses screen-reader-like functionality to speak information presented on a graphical user interface displayed by a media presentation system, including information that is not navigable by a remote control device. Information can be spoken in an order that follows a relative importance of the information based on a characteristic of the information or the location of the information within the graphical user interface. A history of previously spoken information is monitored to avoid speaking information more than once for a given graphical user interface. A different pitch can be used to speak information based on a characteristic of the information. Information that is not navigable by the remote control device can be spoken after a time delay. Voice prompts can be provided for a remote-driven virtual keyboard displayed by the media presentation system. The voice prompts can be spoken with different voice pitches.
    Type: Grant
    Filed: November 4, 2010
    Date of Patent: April 30, 2019
    Assignee: APPLE INC.
    Inventors: Christopher B. Fleizach, Reginald Dean Hudson, Eric Taylor Seymour
  • Patent number: 10257340
    Abstract: Methods, computer program products, and systems are presented. The methods include, for instance: a voice delivery application, running on a mobile device of a user, receives a text message from a user; by use of sensor inputs of the mobile device, the mobile device stores data regarding environment of the mobile device including external audio equipment, speed of the user, and bystanders within a hearing range of the environment; various data describing a sender of the text message and the bystanders are analyzed for respective relationships with the user and with each other to determine a confidentiality group dictating whether or not the text message may be heard by the bystander; the text message may be scanned for content screening, then according to configuration of the voice delivery application, the text message is securely delivered to the user by voice.
    Type: Grant
    Filed: December 15, 2017
    Date of Patent: April 9, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Darryl M. Adderly, Jonathan W. Jackson, Ajit Jariwala, Eric B. Libow
  • Patent number: 10255904
    Abstract: According to an embodiment, a reading-aloud information editing device includes an acquirer, an analyzer, a first generator, a second generator, and an extractor. The acquirer is configured to acquire an edit region including a text added with reading-aloud information from a document. The analyzer is configured to analyze a document structure of the edit region. The first generator is configured to generate one or more condition patterns by abstracting the edit region on the basis of the document structure. The second generator is configured to generate an extraction condition that is for extracting a text from the document and includes at least one of the condition patterns. The extractor is configured to extract a text suitable for the extraction condition from the document.
    Type: Grant
    Filed: February 9, 2017
    Date of Patent: April 9, 2019
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Kosei Fume, Masahiro Morita, Taira Ashikawa
  • Patent number: 10255905
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating word pronunciations. One of the methods includes determining, by one or more computers, spelling data that indicates the spelling of a word, providing the spelling data as input to a trained recurrent neural network, the trained recurrent neural network being trained to indicate characteristics of word pronunciations based at least on data indicating the spelling of words, receiving output indicating a stress pattern for pronunciation of the word generated by the trained recurrent neural network in response to providing the spelling data as input, using the output of the trained recurrent neural network to generate pronunciation data indicating the stress pattern for a pronunciation of the word, and providing, by the one or more computers, the pronunciation data to a text-to-speech system or an automatic speech recognition system.
    Type: Grant
    Filed: June 10, 2016
    Date of Patent: April 9, 2019
    Assignee: Google LLC
    Inventors: Mason Vijay Chua, Kanury Kanishka Rao, Daniel Jacobus Josef van Esch
  • Patent number: 10249174
    Abstract: An approach is described for sending alert messages and notifications to individual devices and in a format understandable by a user of the communication device.
    Type: Grant
    Filed: July 31, 2015
    Date of Patent: April 2, 2019
    Assignee: SIEMENS INDUSTRY, INC.
    Inventors: Emad El-Mankabady, Daniel S. Iasso, Robert Limlaw, Lester K. Perlak, George E. Baker
  • Patent number: 10242410
    Abstract: An image processing apparatus which acquires character string information and position information of the character string in a receipt image by an Optical Character Recognition (OCR), acquires value information corresponding to a keyword corresponding to predetermined item information from the acquired character string information, associates and stores the acquired value information, position information corresponding to the acquired value information and the predetermined item information, displays the receipt image and an object at a position of the receipt image corresponding to the stored position information, selects, in a case where it is determined that same type value information which is a plurality of the value information associated with a same type of the item information is stored, processing to be executed in accordance with a type of the item information related to the same type value information, and executes the selected processing.
    Type: Grant
    Filed: December 28, 2015
    Date of Patent: March 26, 2019
    Assignee: BROTHER KOGYO KABUSHIKI KAISHA
    Inventor: Tomofumi Nakayama
  • Patent number: 10242660
    Abstract: The present invention provides a method and a device for optimizing a speech synthesis system. The method comprises: receiving speech synthesis requests containing text messages; determining the load level of the speech synthesis system when the speech synthesis requests are received; and selecting speech synthesis paths corresponding to the load level and synthesizing the text into speech according to the speech synthesis paths.
    Type: Grant
    Filed: October 27, 2016
    Date of Patent: March 26, 2019
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (Beijing) CO., LTD.
    Inventors: Qingchang Hao, Xiulin Li, Jie Bai, Haiyuan Tang
  • Patent number: 10243912
    Abstract: A system that incorporates teachings of the present disclosure may include, for example, a server including a controller to receive audio signals and content identification information from a media processor, generate text representing a voice message based on the audio signals, determine an identity of media content based on the content identification information, generate an enhanced message having text and additional content where the additional content is obtained by the controller based on the identity of the media content, and transmit the enhanced message to the media processor for presentation on the display device, where the enhanced message is accessible by one or more communication devices that are associated with a social network and remote from the media processor. Other embodiments are disclosed.
    Type: Grant
    Filed: January 19, 2016
    Date of Patent: March 26, 2019
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Hisao Chang, Bernard S. Renger
  • Patent number: 10225621
    Abstract: Disclosed herein are systems and methods for converting audio-video content into audio-only content. Audio-video content is readily accessible, but for various reasons users often cannot consume content visually. In those circumstances, for example, when a user is interrupted during a movie to drive to pick up a spouse or child, the user may not want to forego consuming the audio-video content. The audio-video content can be converted into audio-only content for the user to aurally consume, allowing the user to consume the content despite interruptions or other reasons for which the audio-video content cannot be consumed visually.
    Type: Grant
    Filed: December 20, 2017
    Date of Patent: March 5, 2019
    Assignee: DISH Network L.L.C.
    Inventors: Nicholas B. Newell, Sheshank Kodam
  • Patent number: 10224023
    Abstract: A speech recognition system and method thereof, a vocabulary establishing method and a computer program product are provided.
    Type: Grant
    Filed: March 15, 2017
    Date of Patent: March 5, 2019
    Assignee: Industrial Technology Research Institute
    Inventors: Shih-Chieh Chien, Chih-Chung Kuo
  • Patent number: 10216723
    Abstract: Method, system and apparatus for assembling a recording plan and data driven dialogs for automated communications are provided. At a computing device comprising a memory, a communication interface and a processor, the memory storing a database of statements comprising one or more of first names, last names, greeting statements, sentiment statements, influence statements, call to action statements, and legal statements: one or more statements from the database are automatically assembled, via the processor, into one or more phrases to be recorded; instructions for applying linguistic rules are associated with the one or more phrases, via the processor, including where to insert pauses in the one or more phrases; and, the recording plan, comprising the one or more phrases in association with the instructions, is stored at the memory.
    Type: Grant
    Filed: January 6, 2017
    Date of Patent: February 26, 2019
    Assignee: SPLICE SOFTWARE INC.
    Inventors: Tara Kelly, Andrew Hamill, Ken Hackl
  • Patent number: 10217351
    Abstract: There is provided an information processing apparatus including a matter extracting unit extracting a predetermined matter from text information, an action pattern specifying unit specifying one or multiple action patterns associated with the predetermined matter, an action extracting unit extracting each of the action patterns associated with the predetermined matter, from sensor information, and a state analyzing unit generating state information indicating a state related to the matter, based on each of the action patterns extracted from the sensor information, using a contribution level indicating a degree of contribution of each of the action patterns to the predetermined matter, for a combination of the predetermined matter and each of the action patterns associated with the predetermined matter.
    Type: Grant
    Filed: March 7, 2018
    Date of Patent: February 26, 2019
    Assignee: SONY CORPORATION
    Inventors: Seiichi Takamura, Yasuharu Asano
  • Patent number: 10210153
    Abstract: A language processing system for text normalization of an input string of a semiotic class. In an aspect, a method includes receiving an input string; accessing, for a semiotic class of non-standard words, a language universal covering grammar for a plurality of languages that generates, for each language of the plurality of languages, one or more sequences of word-level components for each instance of the semiotic class in the language; for each of the plurality of languages, accessing a lexical map specific to the language and that maps each sequence of word-level components for each instance of the semiotic class in the language to verbalizations in the language; generating, from the language universal grammar and the lexical maps, a lattice of possible verbalizations of the input string; and selecting one of the possible verbalizations as a selected verbalization for the input string.
    Type: Grant
    Filed: December 5, 2017
    Date of Patent: February 19, 2019
    Assignee: Google LLC
    Inventors: Richard Sproat, Ke Wu, Kyle Gorman
  • Patent number: 10210769
    Abstract: A non-transitory processor-readable medium stores code representing instructions to be executed by a processor. The code causes the processor to receive a request from a user of a client device to initiate a speech recognition engine for a web page displayed at the client device. In response to the request, the code causes the processor to (1) download, from a server associated with a first party, the speech recognition engine into the client device; and then (2) analyze, using the speech recognition engine, content of the web page including text in an identified language to produce analyzed content based on the identified language, where the content of the web page is received from a server associated with a second party. The code further causes the processor to send a signal to cause the client device to present the analyzed content to the user at the client device.
    Type: Grant
    Filed: August 22, 2016
    Date of Patent: February 19, 2019
    Assignee: ROSETTA STONE LTD.
    Inventors: Aaron M. Simmons, Bryan Pellom, Karl F. Ridgeway
  • Patent number: 10192541
    Abstract: A text-to-speech (TTS) system includes components capable of supporting the generation of speech output in any of multiple styles, and may switch seamlessly from producing speech output in one style to producing speech output in another style. For example, a concatenative TTS system may include a speech base storing speech units associated with multiple speech styles, and a linguistic analysis component to generate a phonetic transcription specifying speech output in any of multiple styles. Text input may include a style indication associated with a particular segment of the input text. The linguistic analysis component may invoke encoded rules and/or components based upon the style indication, and generate a phonetic transcription specifying a speech style, which may be processed to generate output speech.
    Type: Grant
    Filed: June 5, 2014
    Date of Patent: January 29, 2019
    Assignee: Nuance Communications, Inc.
    Inventors: Paolo Mairano, Corinne Bos-Plachez, Sourav Nandy, Johan Wouters, Silvia Maria Antonella Quazza, Dong-Jian Yue
  • Patent number: 10192542
    Abstract: A speaking-rate dependent prosodic model builder and a related method are disclosed. The proposed builder includes a first input terminal for receiving a first information of a first language spoken by a first speaker, a second input terminal for receiving a second information of a second language spoken by a second speaker and a functional information unit having a function, wherein the function includes a first plurality of parameters simultaneously relevant to the first language and the second language or a plurality of sub-parameters in a second plurality of parameters relevant to the second language alone, and the functional information unit under a maximum a posteriori condition and based on the first information, the second information and the first plurality of parameters or the plurality of sub-parameters produces speaking-rate dependent reference information and constructs a speaking-rate dependent prosodic model of the second language.
    Type: Grant
    Filed: October 28, 2016
    Date of Patent: January 29, 2019
    Assignee: NATIONAL TAIPEI UNIVERSITY
    Inventor: Chen-Yu Chiang
  • Patent number: 10187894
    Abstract: Systems and methods are described for improving capacity of voice services for data packet transmission through a wireless network. Application requirements including a data rate for a wireless device may be determined. An access node may determine available resources to transmit data as indicated by the application requirements. The wireless device and the access node may communicate data transmissions wirelessly for use by the wireless device application. Data transmission may be in a first mode or in a second mode depending on whether there are sufficient available network resources for the determined data rate. The first and second transmission modes may be generated from a common input such as a wireless device user's voice; however, the second mode of data transmission may be converted in order to consume less network resources.
    Type: Grant
    Filed: November 12, 2014
    Date of Patent: January 22, 2019
    Assignee: Sprint Spectrum L.P.
    Inventors: Muhammad Ahsan Naim, Yu Zhou
  • Patent number: 10176798
    Abstract: A mechanism is described for facilitating dynamic and intelligent conversion of text into real user speech according to one embodiment. A method of embodiments, as described herein, includes receiving a textual message from a first user, and accessing a voice profile associated with the first user, where the voice profile includes a real voice of the first user and at least one of emotional patterns relating to the first user, context distinctions relating to the first user, and speech characteristics relating to the first user, where accessing further includes extracting the real voice and at least one of an emotional pattern, a context distinction, and a speech characteristic based on subject matter of the textual message. The method may further include converting the textual message into a real speech of the first user based on the voice profile including the real voice and at least one of the emotional pattern, the context distinction, and the speech characteristic.
    Type: Grant
    Filed: August 28, 2015
    Date of Patent: January 8, 2019
    Assignee: INTEL CORPORATION
    Inventors: Ofer Gueta, Sefi Kraemer
  • Patent number: 10178488
    Abstract: An audio metadata providing apparatus and method and a multichannel audio data playback apparatus and method to support a dynamic format conversion are provided. Dynamic format conversion information may include information about a plurality of format conversion schemes that are used to convert a first format set by an author of multichannel audio data into a second format that is based on a playback environment of the multichannel audio data and that are each set for corresponding playback periods of the multichannel audio data. The audio metadata providing apparatus may provide audio metadata including the dynamic format conversion information. The multichannel audio data playback apparatus may identify the dynamic format conversion information from the audio metadata, may convert the first format of the multichannel audio data into the second format based on the identified dynamic format conversion information, and may play back the multichannel audio data in the second format.
    Type: Grant
    Filed: September 25, 2017
    Date of Patent: January 8, 2019
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Jae Hyoun Yoo, Tae Jin Lee, Seok Jin Lee
  • Patent number: 10170111
    Abstract: A vehicular infotainment system, a vehicle and a method of controlling interaction between a vehicle driver and an infotainment system. A multimedia device, a human-machine interface and sensors are used to collect in-vehicle driver characteristic data and extra-vehicle driving conditions. The system additionally includes—or is otherwise coupled to—a computer to convert one or both of traffic pattern data and vehicular positional data into a driver elevated cognitive load profile. In addition, the computer converts the driver characteristic data into a driver mood profile. The system can process these profiles to selectively adjust one or both of the amount of time needed to accept audio commands from a driver and the amount of time needed to provide an audio response to the driver in situations where the system determines the presence of at least one of the elevated cognitive load and a driver mood.
    Type: Grant
    Filed: February 1, 2017
    Date of Patent: January 1, 2019
    Assignee: TOYOTA MOTOR ENGINEERING & MANUFACTURING NORTH AMERICA, INC.
    Inventor: Nishikant Narayan Puranik
  • Patent number: 10169337
    Abstract: Converting technical data from field oriented electronic data sources into natural language form is disclosed. An approach includes obtaining document data from an input document, wherein the document data is in a non-natural language form. The approach includes determining a data type of the document data from one of a plurality of data types defined in a detection and conversion database. The approach includes translating the document data to a natural language form based on the determined data type. The approach additionally includes outputting the translated document data in natural language form to an output data stream.
    Type: Grant
    Filed: November 20, 2017
    Date of Patent: January 1, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: John J. Bird, Doyle J. McCoy
  • Patent number: 10164919
    Abstract: A method and system for sharing content in an instant messaging application are disclosed. According to one embodiment a computer-implemented method comprises logging content accessed by a first client, and a list of accessible content is updated and provided to the first client. A request is received from the first client to share first content of the list of accessible content with a second client, and a message is delivered to the second client, the message containing a link to the first content.
    Type: Grant
    Filed: October 25, 2016
    Date of Patent: December 25, 2018
    Assignee: Google LLC
    Inventor: Christopher Szeto
  • Patent number: 10155166
    Abstract: A system is provided, including the following: a computing device that executes a video game and renders a primary video feed of the video game to a display device, the primary video feed providing a first view into a virtual space; a robot, including a camera that captures images of a user, a projector, and a controller that processes the images of the user to identify a gaze direction of the user; wherein when the gaze direction of the user changes from a first gaze direction that is directed towards the display device, to a second gaze direction that is directed away from the display device, the computing device generates a secondary video feed providing a second view into the virtual space; wherein the controller of the robot activates the projector to project the secondary video feed onto a projection surface in the local environment.
    Type: Grant
    Filed: September 8, 2017
    Date of Patent: December 18, 2018
    Assignee: Sony Interactive Entertainment Inc.
    Inventors: Michael Taylor, Jeffrey Roger Stafford
  • Patent number: 10147415
    Abstract: Content is received at a receiving equipment from a transmitting user terminal over a network in a communication session between a transmitting user and a receiving user. The received content comprises audio data representing speech spoken by a voice of the transmitting user, and further comprises text data generated from speech spoken by the voice of the transmitting user during the communication session. At the receiving equipment, at least a portion of the received text data is converted to artificially-generated audible speech based on a model of the transmitting user's voice stored at the receiving equipment (and in embodiments in dependence on the received audio quality). The received audio data and the artificially-generated speech are supplied to be played out to the receiving user through one or more speakers.
    Type: Grant
    Filed: February 2, 2017
    Date of Patent: December 4, 2018
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Ross G. Cutler, Sriram Srinivasan, Ramin Mehran, Karlton David Sequeira, Jayant Ajit Gupchup, Senthil K. Velayutham
  • Patent number: 10137902
    Abstract: In one embodiment, an apparatus for adaptively interacting with a driver via voice interaction is provided. The apparatus includes a computational models block and an adaptive interactive voice system. The computational models block is configured to receive driver related parameters, vehicle related parameters, and vehicle environment parameters from a plurality of sensors. The computational models block is further configured to generate a driver state model based on the driver related parameters and to generate a vehicle state model based on the vehicle related parameters. The computational models block is further configured to generate a vehicle environment state model based on the vehicle environment parameters. The adaptive interactive voice system is configured to generate a voice output based on a driver's situation and context as indicated by information included within at least one of the driver state model, the vehicle state model, and the vehicle environment state model.
    Type: Grant
    Filed: February 12, 2015
    Date of Patent: November 27, 2018
    Assignee: HARMAN INTERNATIONAL INDUSTRIES, INCORPORATED
    Inventors: Ajay Juneja, Stefan Marti, Davide Di Censo
  • Patent number: 10141006
    Abstract: Described are techniques for automatically improving the accessibility of webpages and other content using machine learning and artificial intelligence systems. Webpage data may include visual data used to render visible elements and audio data used to render audible elements, such as digitized speech representative of at least a portion of the visible elements. In some cases, text data may be generated based on the audio data. The audio data may be modified based on target text strings, patterns, and characteristics determined in the text data, or the audio data may be analyzed directly. Additionally, user interactions with particular visible elements and corresponding audible elements may be compared. If the user interactions for a visible element exceed the user interactions for a corresponding audible element, the audio data associated with the audible element may be modified.
    Type: Grant
    Filed: June 27, 2016
    Date of Patent: November 27, 2018
    Assignee: AMAZON TECHNOLOGIES, INC.
    Inventor: Alexandru Burciu
  • Patent number: 10120866
    Abstract: Examples of the present disclosure describe systems and methods relating to conversational system user behavior identification. A user of the conversational system may be evaluated based on one or more factors. The one or more factors may be compared to an aggregated measure for a larger group of conversational system users, such that “anomalous” behavior (e.g., behavior that deviates from a normal behavior) may be identified. When a user is identified as exhibiting anomalous behavior, the conversational system may adapt its interactions with the user in order to encourage, discourage, or further observe the identified behavior. As a result, the conversational system may be able to verify a user's anomalous behavior, discourage the anomalous behavior, or take other action while interacting with the user.
    Type: Grant
    Filed: April 28, 2017
    Date of Patent: November 6, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Joseph Edwin Johnson, Jr., Emmanouil Koukoumidis, Donald Brinkman, Matthew Schuerman
  • Patent number: 10109270
    Abstract: In some implementations, a language proficiency of a user of a client device is determined by one or more computers. The one or more computers then determine a text segment for output by a text-to-speech module based on the determined language proficiency of the user. After determining the text segment for output, the one or more computers generate audio data including a synthesized utterance of the text segment. The audio data including the synthesized utterance of the text segment is then provided to the client device for output.
    Type: Grant
    Filed: July 19, 2017
    Date of Patent: October 23, 2018
    Assignee: Google LLC
    Inventors: Matthew Sharifi, Jakob Nicolaus Foerster
  • Patent number: 10108606
    Abstract: Provided are an automatic interpretation system and method for generating a synthetic sound having characteristics similar to those of an original speaker's voice. The automatic interpretation system for generating a synthetic sound having characteristics similar to those of an original speaker's voice includes a speech recognition module configured to generate text data by performing speech recognition for an original speech signal of an original speaker and extract at least one piece of characteristic information among pitch information, vocal intensity information, speech speed information, and vocal tract characteristic information of the original speech, an automatic translation module configured to generate a synthesis-target translation by translating the text data, and a speech synthesis module configured to generate a synthetic sound of the synthesis-target translation.
    Type: Grant
    Filed: July 19, 2016
    Date of Patent: October 23, 2018
    Assignee: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Seung Yun, Ki Hyun Kim, Sang Hun Kim, Yun Young Kim, Jeong Se Kim, Min Kyu Lee, Soo Jong Lee, Young Jik Lee, Mu Yeol Choi
  • Patent number: 10108603
    Abstract: Aspects of the disclosure are directed to natural language processing. An input interface of a computing device receives input (e.g., speech input) and generates a digital signal corresponding to that input. Text corresponding to the digital signal is obtained, and the text is processed using each of a context-free and a context-specific linguistic model to generate linguistic processing results for that text. The text and linguistic processing results may be processed using an NLU model to generate an NLU recognition result corresponding to the input received at the input interface. The text and the linguistic processing results may also be annotated and used to train an NLU model. The linguistic processing results may relate to, e.g., the tokenization of portions of the text, the normalization of portions of the text, sequences of normalizations for portions of the text, and rankings and prioritization of the linguistic processing results.
    Type: Grant
    Filed: June 29, 2015
    Date of Patent: October 23, 2018
    Assignee: Nuance Communications, Inc.
    Inventors: Jean-Francois Lavallee, Kenneth W. D. Smith
  • Patent number: 10109278
    Abstract: A content alignment service is described that may generate content synchronization information to facilitate the synchronous presentation of corresponding audio content and textual content. In some embodiments, portions of body text (as opposed to front matter, such as a table of contents; or back matter, such as an index) in the textual content are identified and synchronized with corresponding audio content. In one example application, an audiobook may be synchronized with an electronic book. As the body text portions of the electronic book are consumed, corresponding words of the audiobook may be audibly presented.
    Type: Grant
    Filed: September 5, 2012
    Date of Patent: October 23, 2018
    Assignee: Audible, Inc.
    Inventors: Steven C. Dzik, Guy A. Story, Jr.
  • Patent number: 10110725
    Abstract: Provided is a computer implemented method and system for delivering text messages, emails, and messages from a messenger application to a user while the user is engaged in an activity, such as driving, exercising, or working. Typically, the emails and other messages are announced to the user and read aloud without any user input. In Drive Mode, while the user is driving, a clean interface is shown to the user, and the user can hear announcements and messages/emails aloud without looking at the screen of the phone, and use gestures to operate the phone. After a determination is made that a new text message and/or email has arrived, the user is informed aloud of the text message/email/messenger message and, in most instances, if the user takes no further action, the body and/or subject of the text message/email/messenger message is read aloud to the user. All messages can be placed in a single queue, and read to the user in order of receipt.
    Type: Grant
    Filed: January 31, 2017
    Date of Patent: October 23, 2018
    Assignee: MessageLoud LLC
    Inventor: Garin Toren
  • Patent number: 10102852
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for presenting notifications in an enterprise system. In one aspect, a method includes actions of obtaining a template that defines (i) trigger criteria for presenting a notification type and (ii) content rules for determining content to include in a notification of the notification type. Additional actions include accessing enterprise resources of an enterprise, the enterprise resources including data describing entities related to the enterprise and relationships among the entities. Further actions include accessing user information specific to a user and determining that the trigger criteria are satisfied by the enterprise resources and the user information. Additional actions include generating a particular notification of the notification type based at least on the content rules and providing the particular notification to the user.
    Type: Grant
    Filed: April 14, 2015
    Date of Patent: October 16, 2018
    Assignee: Google LLC
    Inventors: Fuchun Peng, Jakob Nicolaus Foerster, Diego Melendo Casado, Fei Huang, Francoise Beaufays
  • Patent number: 10102880
    Abstract: An information processing device is provided with: an image meaning judgment section classifying and judging an inputted image as having a particular meaning by classifying characteristics of the image itself and referring to a database; an audio meaning judgment section classifying and judging an inputted audio as having a particular meaning by classifying characteristics of the audio itself and referring to a database; and an association control section outputting the inputted image and the inputted audio acquired at different timings mutually in association with each other on the basis of each of judgment results of the image meaning judgment section and the audio meaning judgment section; and the information processing device is capable of, even if an image without a corresponding audio or an audio without a corresponding image is inputted, outputting the image and the audio in association with each other.
    Type: Grant
    Filed: July 11, 2014
    Date of Patent: October 16, 2018
    Assignee: OLYMPUS CORPORATION
    Inventors: Takao Takasu, Kei Matsuoka, Yukari Okamoto
  • Patent number: 10083684
    Abstract: An approach is provided that assists visually impaired users. The approach analyzes a document that is being utilized by the visually impaired user. The analysis derives a sensitivity of the document. A vocal characteristic corresponding to the derived sensitivity is retrieved. Text from the document is audibly read to the visually impaired user with a text to speech process that utilizes the retrieved vocal characteristic. The retrieved vocal characteristic conveys the derived sensitivity of the document to the visually impaired user.
    Type: Grant
    Filed: August 22, 2016
    Date of Patent: September 25, 2018
    Assignee: International Business Machines Corporation
    Inventors: Maureen E. Kraft, Fang Lu, Azadeh Salehi, Weisong Wang
  • Patent number: 10079011
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for speech synthesis. A system practicing the method receives a set of ordered lists of speech units, for each respective speech unit in each ordered list in the set of ordered lists, constructs a sublist of speech units from a next ordered list which are suitable for concatenation, performs a cost analysis of paths through the set of ordered lists of speech units based on the sublist of speech units for each respective speech unit, and synthesizes speech using a lowest cost path of speech units through the set of ordered lists based on the cost analysis. The ordered lists can be ordered based on the respective pitch of each speech unit. In one embodiment, speech units which do not have an assigned pitch can be assigned a pitch.
    Type: Grant
    Filed: May 20, 2014
    Date of Patent: September 18, 2018
    Assignee: NUANCE COMMUNICATIONS, INC.
    Inventor: Alistair D. Conkie
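
The lowest-cost-path search over pitch-ordered lists described in this abstract can be sketched with dynamic programming. This is an illustrative sketch, not the patented method: the tuple layout `(unit_id, pitch, target_cost)`, the `max_pitch_gap` window used to decide which units of the next list are "suitable for concatenation", and the use of the pitch difference as the join cost are all assumptions made for the example.

```python
from bisect import bisect_left, bisect_right

INF = float("inf")


def lowest_cost_path(lists, max_pitch_gap=20.0):
    """Return (unit ids, total cost) of the cheapest path through the lists.

    lists: non-empty list of non-empty lists of (unit_id, pitch, target_cost)
    tuples, each inner list sorted by pitch (hypothetical layout).
    """
    n = len(lists)
    # best[i][j] = (cheapest cost from unit j of list i to the end, next index)
    best = [[(unit[2], None) for unit in lst] for lst in lists]
    for i in range(n - 2, -1, -1):
        nxt = lists[i + 1]
        nxt_pitches = [unit[1] for unit in nxt]
        for j, (_, pitch, target) in enumerate(lists[i]):
            # Sublist of the next list whose pitch is close enough to join.
            lo = bisect_left(nxt_pitches, pitch - max_pitch_gap)
            hi = bisect_right(nxt_pitches, pitch + max_pitch_gap)
            choice = (INF, None)
            for k in range(lo, hi):
                join_cost = abs(nxt[k][1] - pitch)  # assumed join cost
                cost = target + join_cost + best[i + 1][k][0]
                if cost < choice[0]:
                    choice = (cost, k)
            best[i][j] = choice
    start = min(range(len(lists[0])), key=lambda j: best[0][j][0])
    total = best[0][start][0]
    if total == INF:
        return None, INF  # no path satisfies the pitch window
    path, j = [], start
    for i in range(n):
        path.append(lists[i][j][0])
        j = best[i][j][1]
    return path, total


units = [
    [("a1", 100, 1.0), ("a2", 150, 0.5)],
    [("b1", 105, 1.0), ("b2", 160, 2.0)],
    [("c1", 110, 0.2), ("c2", 200, 0.1)],
]
path, total = lowest_cost_path(units)
```

Because each list is ordered by pitch, the candidate sublist for a unit is a contiguous slice found by binary search, which is one plausible reason to order the lists that way.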
  • Patent number: 10075546
    Abstract: Techniques to automatically syndicate content over a network are described. An apparatus may comprise a client computer having a processing system with a processor and computer-readable medium. The computer readable medium may store program instructions for a syndication manager component communicatively coupled to a content producing component arranged to be executed by the processor. The syndication manager component may be operative to receive syndication content from the content producing component, and provide a syndication dialog through the content producing component to syndicate the syndication content using a content delivery platform. The syndication manager component may also syndicate the syndication content to form a syndication resource accessible from the content delivery platform over a network using a syndication referent. Other embodiments are described and claimed.
    Type: Grant
    Filed: October 10, 2016
    Date of Patent: September 11, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Christian E. Stich, Gareth Howell, Tristan Davis, Dan Parish, Eran Megiddo, Sherman Der, Jeff Rambharack
  • Patent number: 10068666
    Abstract: Systems and methods are provided for data driven analysis, modeling, and semi-supervised machine learning for qualitative and quantitative determinations. The systems and methods include obtaining data associated with individuals, and determining features associated with the individuals based on the data and similarities among the individuals based on the features. The systems and methods can label some individuals as exemplary, generate a graph wherein nodes of the graph represent individuals, edges of the graph represent similarity among the individuals, and nodes associated with labeled individuals are weighted. The disclosed systems and methods can apply a weight to unweighted nodes of the graph based on propagating the labels through the graph where the propagation is based on influence exerted by the weighted nodes on the unweighted nodes. The disclosed systems and methods can provide output associated with the individuals represented on the graph and the associated weights.
    Type: Grant
    Filed: June 1, 2016
    Date of Patent: September 4, 2018
    Assignee: GRAND ROUNDS, INC.
    Inventors: Seiji James Yamamoto, Ranjit Chacko
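
The propagation step described in this abstract resembles standard graph-based label propagation, which can be sketched as follows. This is a generic illustration rather than the patent's procedure: the adjacency-dict representation, the similarity-weighted averaging rule, the clamping of labeled nodes, and the fixed iteration count are all assumptions.

```python
def propagate_labels(similarity, seed_weights, iterations=50):
    """Spread weights from labeled nodes to unlabeled ones.

    similarity: dict node -> dict neighbour -> similarity (edge weight).
    seed_weights: dict node -> weight for the labeled individuals.
    Both structures are hypothetical; the abstract does not define them.
    """
    weights = {node: seed_weights.get(node, 0.0) for node in similarity}
    for _ in range(iterations):
        updated = {}
        for node, neighbours in similarity.items():
            if node in seed_weights:
                # Labeled nodes keep their weight and only exert influence.
                updated[node] = seed_weights[node]
                continue
            total = sum(neighbours.values())
            if total == 0:
                updated[node] = 0.0
                continue
            # Similarity-weighted average of the neighbours' current weights.
            influence = sum(s * weights[n] for n, s in neighbours.items())
            updated[node] = influence / total
        weights = updated
    return weights


# Chain a - b - c - d with "a" labeled exemplary (1.0) and "d" labeled 0.0.
graph = {
    "a": {"b": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0, "d": 1.0},
    "d": {"c": 1.0},
}
result = propagate_labels(graph, {"a": 1.0, "d": 0.0})
```

In this toy chain the unlabeled node nearer the exemplary individual ends up with the higher weight, which is the kind of influence the abstract describes.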
  • Patent number: 10043510
    Abstract: A method and a computing device for building a reference system for determining a stress position of a new word form, the method comprising: sorting, in a reverse lexicographic order, a plurality of word forms being marked with a particular stress position; clustering the plurality of sorted word forms into a plurality of clusters comprising a plurality of terminal clusters, each terminal cluster comprising word forms having both: (i) a same ending being a terminal common ending, and (ii) a same stress position, the combination of the terminal common ending and said same stress position being unique; building, using the plurality of terminal clusters, the reference system having a reference to at least one terminal cluster of the plurality of terminal clusters, the at least one terminal cluster comprising an indication of the particular stress position proper to word forms which are included in that respective terminal cluster.
    Type: Grant
    Filed: December 1, 2016
    Date of Patent: August 7, 2018
    Assignee: Yandex Europe AG
    Inventor: Yury Grigorievich Zelenkov
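
The idea of clustering word forms by ending until each ending maps to a single stress position can be sketched as below. This is a simplified illustration under stated assumptions, not the claimed method: stress is encoded as the stressed syllable counted from the end of the word, endings are limited to `max_ending` characters, and the function names and sample words are hypothetical.

```python
from collections import defaultdict


def build_reference(stressed_forms, max_ending=4):
    """Map the shortest unambiguous endings to a stress position.

    stressed_forms: list of (word, stress) pairs, where stress is the
    stressed syllable counted from the end (an assumed encoding).
    """
    # Sorting by reversed spelling puts forms with a shared ending next to
    # each other; the grouping below does not strictly need it, but it
    # mirrors the reverse-lexicographic step in the abstract.
    ordered = sorted(stressed_forms, key=lambda pair: pair[0][::-1])
    reference = {}
    for length in range(1, max_ending + 1):
        groups = defaultdict(set)
        for word, stress in ordered:
            if len(word) >= length:
                groups[word[-length:]].add(stress)
        for ending, stresses in groups.items():
            if any(ending.endswith(known) for known in reference):
                continue  # a shorter ending already settles these forms
            if len(stresses) == 1:
                reference[ending] = next(iter(stresses))
    return reference


def predict_stress(word, reference):
    """Look up a new word form by its ending; None if no cluster matches."""
    for length in range(len(word), 0, -1):
        ending = word[-length:]
        if ending in reference:
            return reference[ending]
    return None


forms = [("nation", 2), ("station", 2), ("relation", 2),
         ("begin", 1), ("cabin", 2), ("machine", 1)]
reference = build_reference(forms)
guess = predict_stress("ration", reference)
```

An ending that still carries more than one stress position at a given length (here `in`) is split further at the next length, which loosely corresponds to descending toward the terminal clusters in the abstract.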
  • Patent number: 10033474
    Abstract: A server system accesses a listening history of a user of the media-providing service, where the user is in a demographic group. For each track of a plurality of tracks in the listening history of the user, the server system calculates a first metric based at least in part on an affinity of members of the demographic group, as compared to members of other demographic groups, for the track. The server system averages the first metrics for the plurality of tracks in the listening history of the user to determine a second metric. In accordance with a determination that the second metric satisfies a threshold, the server system selects content for the user and provides the selected content to a client device associated with the user.
    Type: Grant
    Filed: July 28, 2017
    Date of Patent: July 24, 2018
    Assignee: SPOTIFY AB
    Inventors: Clay Gibson, Santiago Gil, Ian Anderson, Oguz Semerci, Scott Wolf, Margreth Mpossi
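
The two-metric calculation described in this abstract (a per-track affinity, then an average over the listening history compared with a threshold) can be sketched as follows. This is a hedged illustration: the abstract does not define affinity, so the ratio of the user's group play rate to the mean rate of other groups, the `play_rates` layout, and the default `threshold` are assumptions.

```python
from statistics import mean


def track_affinity(play_rates, group):
    """First metric: the group's play rate relative to other groups.

    play_rates: dict demographic group -> play rate for one track
    (hypothetical input format).
    """
    others = [rate for g, rate in play_rates.items() if g != group]
    baseline = mean(others) if others else 0.0
    if baseline == 0:
        return 0.0  # no comparison group data; treat as no measured affinity
    return play_rates[group] / baseline


def history_affinity(history, rates_by_track, group, threshold=1.5):
    """Second metric: the average of the first metrics over the history.

    Returns (second_metric, meets_threshold).
    """
    if not history:
        return 0.0, False
    second = mean(track_affinity(rates_by_track[t], group) for t in history)
    return second, second >= threshold


rates = {
    "t1": {"g1": 0.4, "g2": 0.1, "g3": 0.1},
    "t2": {"g1": 0.2, "g2": 0.2, "g3": 0.2},
}
score, select = history_affinity(["t1", "t2"], rates, "g1")
```

When `select` is true, a system along the lines of the abstract would choose content for the user and send it to the client device.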
  • Patent number: 10026393
    Abstract: Systems and methods may provide non-lexical cues in synthesized speech. A system may generate response text and a response intent based on user input. Non-lexical cue insertion points are determined based on the characteristics of the text and/or the intent. One or more non-lexical cues are inserted at insertion points to generate augmented text. The augmented text is synthesized into speech using speech units associated with the response text and the inserted response intent.
    Type: Grant
    Filed: December 19, 2016
    Date of Patent: July 17, 2018
    Assignee: INTEL CORPORATION
    Inventors: Jessica M. Christian, Peter Graff, Crystal A. Nakatsu, Beth Ann Hockey
  • Patent number: 10019688
    Abstract: A method for providing incentive to mentors of at-risk mentees is described. The method comprises the steps of determining an at-risk mentee's behavior and progress in a period of time, determining the mentee's income and income tax payments during the same period of time, and calculating a financial incentive to the mentee's mentor, wherein the amount of the financial incentive is calculated based on the mentee's behavior and/or income tax payment during the period of time.
    Type: Grant
    Filed: October 28, 2016
    Date of Patent: July 10, 2018
    Inventor: David A. Dill
  • Patent number: 9992545
    Abstract: A method, devices, and a non-transitory recording medium for providing contents are provided. The method includes reproducing at least one content, at a first device; transmitting information corresponding to the at least one content from the first device to a second device to reproduce content corresponding to the at least one content in the second device; receiving a message from an external device while reproducing the at least one content at the first device; providing the received message while reproducing the at least one content at the first device; and transmitting information corresponding to the received message from the first device to the second device, the information corresponding to the received message being used to enable the second device to provide a message corresponding to the received message while reproducing the content corresponding to the at least one content.
    Type: Grant
    Filed: September 27, 2016
    Date of Patent: June 5, 2018
    Assignee: Samsung Electronics Co., Ltd
    Inventors: Keum-koo Lee, Hee-jeong Choo, Ju-yun Sung, Hyun-joo Oh, Min-jeong Moon, Ji-young Kwahk
  • Patent number: 9990915
    Abstract: Techniques for performing multi-style speech synthesis. The techniques include using at least one computer hardware processor to perform: obtaining input comprising text and an identification of a desired speaking style to use in rendering the text as speech; identifying a plurality of speech segments for use in synthesizing the text as speech, the identifying comprising identifying a first speech segment recorded and/or synthesized in a first speaking style that is different from the desired speaking style based at least in part on a measure of similarity between the desired speaking style and the first speaking style; synthesizing speech from the text in the desired speaking style at least in part by using the first speech segment; and outputting the synthesized speech.
    Type: Grant
    Filed: December 28, 2016
    Date of Patent: June 5, 2018
    Assignee: Nuance Communications, Inc.
    Inventor: Vincent Pollet
  • Patent number: 9984377
    Abstract: A method and system of advertising. In response to a request received from an advertiser, an audio advertisement is generated based on visual advertisement information. The audio advertisement is provided for presentation on behalf of the advertiser. In one embodiment, the audio advertisement is an abbreviated form of the visual advertisement information. A determination is made as to whether a call from a customer has been connected to the advertiser via the audio advertisement. The advertiser is charged a predefined fee if it is determined that a call from a customer has been connected to the advertiser via the audio advertisement. In one embodiment, a text for a first advertisement presentable in a first media type is received to generate an abbreviated text for a second advertisement presentable in a second media type.
    Type: Grant
    Filed: January 18, 2007
    Date of Patent: May 29, 2018
    Assignee: YELLOWPAGES.COM LLC
    Inventors: Ebbe Altberg, Scott Faber, Ron Hirson, Sean Van Der Linden, Ben Harris Lyon, Paul G. Manca