Image To Speech Patents (Class 704/260)
  • Patent number: 11145289
    Abstract: A system and method for providing an audible explanation of documents upon request is disclosed. The system and method use an intelligent voice assistant that can receive audible requests for document explanations. The intelligent voice assistant can retrieve document summary information and provide an audible response explaining key points of the document.
    Type: Grant
    Filed: May 7, 2019
    Date of Patent: October 12, 2021
    Assignee: United Services Automobile Association (USAA)
    Inventors: Richard Daniel Graham, Ruthie D. Lyle
  • Patent number: 11145288
    Abstract: A computing system and related techniques for selecting content to be automatically converted to speech and provided as an audio signal are described. A text-to-speech request associated with a first document can be received that includes data associated with a playback position of a selector associated with a text-to-speech interface overlaid on the first document. First content associated with the first document can be determined based at least in part on the playback position, the first content including content that is displayed in the user interface at the playback position. The first document can be analyzed to identify one or more structural features associated with the first content. Speech data can be generated based on the first content and the one or more structural features.
    Type: Grant
    Filed: May 21, 2019
    Date of Patent: October 12, 2021
    Assignee: Google LLC
    Inventors: Benedict Davies, Guillaume Boniface, Jack Whyte, Jakub Adamek, Simon Tokumine, Alessio Macri, Matthias Quasthoff
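    Illustrative sketch (not from the patent): a minimal Python rendering of the idea in 11145288, where the content block at the playback position is selected and its structural role (heading vs. body) shapes the generated speech markup. The Block class, its fields, and the SSML choices are assumptions made for the example.
      from dataclasses import dataclass

      @dataclass
      class Block:
          text: str
          role: str   # structural feature, e.g. "heading", "body", "list-item"
          top: int    # vertical position of the block in the rendered document

      def speech_for_position(blocks: list[Block], playback_position: int) -> str:
          """Build simple SSML for the first block at/after the playback position,
          using its structural role to adjust delivery (illustrative only)."""
          visible = [b for b in blocks if b.top >= playback_position]
          if not visible:
              return "<speak/>"
          block = min(visible, key=lambda b: b.top)
          if block.role == "heading":
              # Headings are announced more slowly and followed by a pause.
              return f'<speak><prosody rate="slow">{block.text}</prosody><break time="500ms"/></speak>'
          return f"<speak>{block.text}</speak>"

      if __name__ == "__main__":
          doc = [Block("Quarterly Report", "heading", 0), Block("Revenue grew 4%.", "body", 40)]
          print(speech_for_position(doc, 0))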
  • Patent number: 11138965
    Abstract: A technique for estimating phonemes for a word written in a different language is disclosed. A sequence of graphemes of a given word in a source language is received. The sequence of the graphemes in the source language is converted into a sequence of phonemes in the source language. One or more sequences of phonemes in a target language are generated from the sequence of the phonemes in the source language by using a neural network model. One sequence of phonemes in the target language is determined for the given word. Also, a technique for estimating graphemes of a word from phonemes in a different language is disclosed.
    Type: Grant
    Filed: November 2, 2017
    Date of Patent: October 5, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Gakuto Kurata, Toru Nagano, Yuta Tsuboi
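    A minimal Python sketch of the staged conversion described in 11138965: source-language graphemes become source-language phonemes, which are then mapped to target-language phoneme candidates. The lookup tables stand in for the trained grapheme-to-phoneme model and the neural phoneme-mapping model; they are assumptions for illustration only.
      # Toy pipeline mirroring the staged conversion in the abstract:
      # source graphemes -> source phonemes -> candidate target phonemes.
      SOURCE_G2P = {"tokyo": ["t", "o", "k", "y", "o"]}   # source-language G2P (placeholder)
      SOURCE_TO_TARGET_PHONEMES = {                        # stands in for the neural model
          "t": ["t"], "o": ["oʊ"], "k": ["k"], "y": ["j"],
      }

      def estimate_target_phonemes(word: str) -> list[str]:
          source_phonemes = SOURCE_G2P[word.lower()]
          target = []
          for p in source_phonemes:
              # In the patent this mapping can emit several candidate sequences;
              # here we simply take the first candidate for each phoneme.
              target.extend(SOURCE_TO_TARGET_PHONEMES.get(p, [p]))
          return target

      print(estimate_target_phonemes("Tokyo"))   # ['t', 'oʊ', 'k', 'j', 'oʊ']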
  • Patent number: 11138963
    Abstract: A processor-implemented text-to-speech method includes determining, using a sub-encoder, a first feature vector indicating an utterance characteristic of a speaker from feature vectors of a plurality of frames extracted from a partial section of a first speech signal of the speaker, and determining, using an autoregressive decoder, into which the first feature vector is input as an initial value, from context information of the text, a second feature vector of a second speech signal in which a text is uttered according to the utterance characteristic.
    Type: Grant
    Filed: May 7, 2019
    Date of Patent: October 5, 2021
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Hoshik Lee
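    A toy Python sketch of the structure described in 11138963, assuming mean pooling for the sub-encoder and a one-line recurrence for the autoregressive decoder. The patented system uses trained networks; the functions below only illustrate the data flow (frame features in, speaker vector out, decoder seeded with that vector as its initial value).
      import numpy as np

      def sub_encoder(frame_features: np.ndarray) -> np.ndarray:
          """Collapse per-frame features from a partial section of the speaker's
          speech into a single utterance-characteristic vector (mean pooling,
          a stand-in for the trained sub-encoder)."""
          return frame_features.mean(axis=0)

      def autoregressive_decoder(speaker_vec: np.ndarray, context: np.ndarray,
                                 steps: int = 5) -> np.ndarray:
          """Generate decoder outputs one step at a time, seeded with the speaker
          vector as the initial state and conditioned on text context (toy step)."""
          state, outputs = speaker_vec, []
          for t in range(steps):
              state = np.tanh(0.5 * state + context[t % len(context)])
              outputs.append(state)
          return np.stack(outputs)

      frames = np.random.randn(40, 8)       # features from a partial speech section
      text_context = np.random.randn(3, 8)  # toy context vectors for the input text
      print(autoregressive_decoder(sub_encoder(frames), text_context).shape)  # (5, 8)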
  • Patent number: 11128591
    Abstract: In one example, a trigger is obtained for a dynamic ideogram to dynamically interact with the electronic messaging environment. In response to the trigger, it is determined how the dynamic ideogram is to dynamically interact with the electronic messaging environment including performing an analysis of the electronic messaging environment. Based on the analysis of the electronic messaging environment, instructions to render the dynamic ideogram to dynamically interact with the electronic messaging environment are generated for a first user device configured to communicate with a second user device via the electronic messaging environment.
    Type: Grant
    Filed: August 27, 2020
    Date of Patent: September 21, 2021
    Assignee: CISCO TECHNOLOGY, INC.
    Inventors: Christopher Deering, Colin Olivier Louis Vidal, Jimmy Coyne
  • Patent number: 11107458
    Abstract: An example embodiment may involve receiving, from a client device, a selection of text-based articles from newsfeeds. The selection may specify that the text-based articles have been flagged for audible playout. The example embodiment may also involve, possibly in response to receiving the selection of the text-based articles, retrieving text-based articles from the newsfeeds. The example embodiment may also involve causing the text-based articles to be converted into audio files. The example embodiment may also involve receiving a request to stream the audio files to the client device or another device. The example embodiment may also involve causing the audio files to be streamed to the client device or the other device.
    Type: Grant
    Filed: December 30, 2019
    Date of Patent: August 31, 2021
    Assignee: Gracenote Digital Ventures, LLC
    Inventor: Venkatarama Anilkumar Panguluri
  • Patent number: 11108721
    Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods for communication using multiple media content items stored on both a sending device and a receiving device. In particular, in one or more embodiments, the disclosed systems receive an application package. The application matches a portion of the user's text input to an audio content item using mapping data and generates a message including the text input and an identifier for the audio content item. A receiving system receives an application package. The application receives the message, locates the audio content item in the application package using the identifier, and presents the message, including the text and the audio content item.
    Type: Grant
    Filed: April 21, 2020
    Date of Patent: August 31, 2021
    Inventors: David Roberts, Glenn Sugden
  • Patent number: 11107457
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating speech from text. One of the systems includes one or more computers and one or more storage devices storing instructions that when executed by one or more computers cause the one or more computers to implement: a sequence-to-sequence recurrent neural network configured to: receive a sequence of characters in a particular natural language, and process the sequence of characters to generate a spectrogram of a verbal utterance of the sequence of characters in the particular natural language; and a subsystem configured to: receive the sequence of characters in the particular natural language, and provide the sequence of characters as input to the sequence-to-sequence recurrent neural network to obtain as output the spectrogram of the verbal utterance of the sequence of characters in the particular natural language.
    Type: Grant
    Filed: November 26, 2019
    Date of Patent: August 31, 2021
    Assignee: Google LLC
    Inventors: Samuel Bengio, Yuxuan Wang, Zongheng Yang, Zhifeng Chen, Yonghui Wu, Ioannis Agiomyrgiannakis, Ron J. Weiss, Navdeep Jaitly, Ryan M. Rifkin, Robert Andrew James Clark, Quoc V. Le, Russell J. Ryan, Ying Xiao
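    A shape-level Python sketch of the interface described in 11107457: a character sequence goes in, a spectrogram-like array (time steps by mel bins) comes out. The embedding table and projection matrix are random placeholders for the trained sequence-to-sequence recurrent network.
      import numpy as np

      rng = np.random.default_rng(0)
      CHAR_EMBED = {c: rng.standard_normal(16) for c in "abcdefghijklmnopqrstuvwxyz "}
      PROJECTION = rng.standard_normal((16, 80))   # stand-in for the trained decoder

      def characters_to_spectrogram(text: str) -> np.ndarray:
          """Map a character sequence to one mel-style spectrogram frame per character.
          In the patent this is a trained sequence-to-sequence recurrent network; the
          projection here is an untrained placeholder with the same interface shape."""
          embeddings = np.stack([CHAR_EMBED[c] for c in text.lower() if c in CHAR_EMBED])
          return embeddings @ PROJECTION          # (time, 80) "spectrogram"

      print(characters_to_spectrogram("hello world").shape)   # (11, 80)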
  • Patent number: 11106314
    Abstract: Visual images projected on a projection surface by a projector provide an interactive user interface having end user inputs detected by a detection device, such as a depth camera. The detection device monitors projected images initiated in response to user inputs to determine calibration deviations, such as by comparing the distance between where a user makes an input and where the input is projected. Calibration is performed to align the projected outputs and detected inputs. The calibration may include a coordinate system anchored by its origin to a physical reference point of the projection surface, such as a display mat or desktop edge.
    Type: Grant
    Filed: April 21, 2015
    Date of Patent: August 31, 2021
    Assignee: DELL PRODUCTS L.P.
    Inventors: Karthik Krishnakumar, Michiel Sebastiaan Emanuel Petrus Knoppert, Rocco Ancona, Abu S. Sanaullah, Mark R. Ligameri
  • Patent number: 11093691
    Abstract: A system and method of establishing a communication session is disclosed herein. A computing system receives, from a client device, a content item comprising text-based content. The computing system generates a mark-up version of the content item by identifying one or more characters in the text-based content and a relative location of the one or more characters in the content item. The computing system receives, from the client device, an interrogatory related to the content item. The computing system analyzes the mark-up version of the content item to identify an answer to the interrogatory. The computing system generates a response message comprising the identified answer to the interrogatory. The computing system transmits the response message to the client device.
    Type: Grant
    Filed: February 14, 2020
    Date of Patent: August 17, 2021
    Assignee: Capital One Services, LLC
    Inventors: Michael Mossoba, Abdelkader M'Hamed Benkreira, Joshua Edwards
  • Patent number: 11087379
    Abstract: A user registers for an account with an account management system, configures account settings to permit the account management system to receive user computing device data from a user computing device associated with the user, and logs into the account via the user computing device. The account management system receives a user voice purchase command and determines a purchase command context based on the received user computing device data. The account management system identifies a product that the user desires to purchase based on the purchase command context and directs the user computing device web browser to a merchant website to set up a transaction for the identified product.
    Type: Grant
    Filed: February 12, 2015
    Date of Patent: August 10, 2021
    Assignee: GOOGLE LLC
    Inventors: Filip Verley, IV, Stuart Ross Hobbie
  • Patent number: 11087091
    Abstract: Disclosed herein is a method and response generation system for providing contextual responses to user interaction. In an embodiment, input data related to user interaction, which may be received from a plurality of input channels in real-time, may be processed using processing models corresponding to each of the input channels for extracting interaction parameters. Thereafter, the interaction parameters may be combined for computing a contextual variable, which in turn may be analyzed to determine a context of the user interaction. Finally, responses corresponding to the context of the user interaction may be generated and provided to the user for completing the user interaction. In some embodiments, the method of the present disclosure accurately detects the context of the user interaction and provides meaningful contextual responses to the user interaction.
    Type: Grant
    Filed: February 19, 2019
    Date of Patent: August 10, 2021
    Assignee: Wipro Limited
    Inventors: Gopichand Agnihotram, Rajesh Kumar, Pandurang Naik
  • Patent number: 11082559
    Abstract: A virtual assistant server receives a web request such as a HTTP request with one or more call parameters corresponding to a call redirected from an interactive voice response server. The virtual assistant server inputs the received one or more call parameters to a predictive model, which identifies, based on the one or more call parameters, an intelligent communication mode to route the redirected call to. Subsequently, the virtual assistant server routes the redirected call to the intelligent communication mode.
    Type: Grant
    Filed: July 31, 2020
    Date of Patent: August 3, 2021
    Assignee: KORE.AI, INC.
    Inventors: Rajkumar Koneru, Prasanna Kumar Arikala Gunalan, Rajavardhan Nalluri
  • Patent number: 11080474
    Abstract: Described herein is a system and method for associating audio files with one or more cells in a spreadsheet application. As described, one or more audio files may be associated with a single cell in a spreadsheet application or with a range of cells in the spreadsheet application. Information about the audio file, such as playback properties and other parameters, may be retrieved from the audio file. Once retrieved, a calculation engine of the spreadsheet application may perform one or more calculations on the information in order to change the content of the audio file, the playback of the audio file, and so on.
    Type: Grant
    Filed: November 1, 2016
    Date of Patent: August 3, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Samuel C. Radakovitz, Christian M. Canton, Carlos A. Ortero, John Campbell, Allison Rutherford, Benjamin E. Rampson
  • Patent number: 11081111
    Abstract: Methods, systems, and related products that provide emotion-sensitive responses to a user's commands and other utterances received at an utterance-based user interface. Acknowledgements of the user's utterances are adapted to the user and/or the user device, and to emotions detected in the user's utterance that have been mapped from one or more emotion features extracted from the utterance. In some examples, extraction of a user's changing emotion during a sequence of interactions is used to generate a response to a user's uttered command. In some examples, emotion processing and command processing of natural utterances are performed asynchronously.
    Type: Grant
    Filed: March 17, 2020
    Date of Patent: August 3, 2021
    Assignee: Spotify AB
    Inventors: Daniel Bromand, David Gustafsson, Richard Mitic, Sarah Mennicken
  • Patent number: 11074907
    Abstract: Techniques for generating a prompt coverage score, which measures an extent to which data output to a user during a dialog is repetitive and monotonous, are described. User input data and system output, corresponding to a dialog exchange between a user and a skill, may be determined. A portion of the system output data, corresponding to a system prompt representing default output data, may be determined. A first number, representing possible variants of the prompt, may be determined along with a second number, representing variants of the prompt output during the dialog exchange. A prompt coverage score may be determined based on the first and second numbers.
    Type: Grant
    Filed: May 29, 2019
    Date of Patent: July 27, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Ravi Chikkanayakanahalli Mallikarjuniah, Priya Rao Chagaleti, Shiladitya Roy, Christopher Forbes Will, Cole Ira Brendel, Wei Huang, Sarthak Anand
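    One plausible reading of the prompt coverage score, sketched in Python: the ratio of distinct prompt variants actually output during the dialog exchange to the variants available. The exact formula is an assumption for illustration; the abstract only states that the score is based on the two numbers.
      def prompt_coverage_score(possible_variants: set[str], spoken: list[str]) -> float:
          """Ratio of distinct prompt variants actually spoken during the dialog to
          the variants that were available (illustrative reading of the abstract)."""
          used = {utterance for utterance in spoken if utterance in possible_variants}
          return len(used) / len(possible_variants) if possible_variants else 0.0

      variants = {"Sorry, I didn't get that.", "Could you repeat that?", "One more time?"}
      dialog_output = ["Sorry, I didn't get that.", "Sorry, I didn't get that."]
      print(prompt_coverage_score(variants, dialog_output))   # 0.333...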
  • Patent number: 11074904
    Abstract: A speech synthesis method and apparatus based on emotion information are disclosed. A speech synthesis method based on emotion information extracts speech synthesis target text from received data and determines whether the received data includes situation explanation information. First metadata corresponding to first emotion information is generated on the basis of the situation explanation information. When the extracted data does not include situation explanation information, second metadata corresponding to second emotion information generated on the basis of semantic analysis and context analysis is generated. One of the first metadata and the second metadata is added to the speech synthesis target text to synthesize speech corresponding to the extracted data.
    Type: Grant
    Filed: October 4, 2019
    Date of Patent: July 27, 2021
    Assignee: LG Electronics Inc.
    Inventors: Siyoung Yang, Minook Kim, Sangki Kim, Yongchul Park, Juyeong Jang, Sungmin Han
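    A small Python sketch of the branch described in 11074904: if situation explanation information is present, emotion metadata is derived from it; otherwise it is derived from semantic/context analysis of the text itself. The guess_emotion helper and the metadata fields are hypothetical stand-ins for the patented analysis.
      def build_tts_request(text: str, situation_explanation: str | None) -> dict:
          """Choose which emotion metadata to attach before synthesis, following the
          branch described in the abstract (placeholder analysis functions)."""
          if situation_explanation is not None:
              metadata = {"source": "situation", "emotion": guess_emotion(situation_explanation)}
          else:
              # Fall back to semantic/context analysis of the text itself.
              metadata = {"source": "semantic", "emotion": guess_emotion(text)}
          return {"text": text, "emotion_metadata": metadata}

      def guess_emotion(s: str) -> str:
          return "sad" if "sorry" in s.lower() else "neutral"

      print(build_tts_request("We're sorry for the delay.", None))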
  • Patent number: 11068526
    Abstract: Methods, systems, and computer program products are provided for obtaining enhanced metadata for media content searches. In one embodiment, computer program logic embodies a metadata receiver and a media content metadata matcher and combiner. The metadata receiver receives program metadata for a plurality of programs from a plurality of metadata sources. The media content metadata matcher and combiner is configured to perform a matching process whereby metadata associated with each of the plurality of programs is compared to metadata of each of the other programs to determine whether the compared programs are the same program and, if so, to combine the metadata from the programs into a single program entry with enhanced metadata and store it in a database. A subsequent search for a program corresponding to the stored program returns at least some of the metadata associated with the program, which enables accessing the program.
    Type: Grant
    Filed: January 25, 2019
    Date of Patent: July 20, 2021
    Assignee: Caavo Inc
    Inventors: Amrit P. Singh, Sravan K. Andavarapu, Jayanth Manklu, Anu Godara, Vinu Joseph, Vinod K. Gopinath, Ashish D. Aggarwal
  • Patent number: 11062497
    Abstract: A method and system for creation of an audiovisual message that is personalized to a recipient. Information is received that is associated with the recipient. At least one representation of a visual media segment, including an animation component, and at least one representation of an audio media segment for use in creation of the audiovisual message is identified in memory storage. The information is added to at least one of the visual media segment and the audio media segment. The audio media segment is generated as an audio file. The audio file is synchronized to at least one transition in the animation component. The audio file is associated with the visual media segment.
    Type: Grant
    Filed: July 17, 2017
    Date of Patent: July 13, 2021
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Karthiksundar Sankaran, Mauricio Lopez
  • Patent number: 11062694
    Abstract: Systems and methods for generating output audio with emphasized portions are described. Spoken audio is obtained and undergoes speech processing (e.g., ASR and optionally NLU) to create text. It may be determined that the resulting text includes a portion that should be emphasized (e.g., an interjection) using at least one of knowledge of an application run on a device that captured the spoken audio, prosodic analysis, and/or linguistic analysis. The portion of text to be emphasized may be tagged (e.g., using a Speech Synthesis Markup Language (SSML) tag). TTS processing is then performed on the tagged text to create output audio including an emphasized portion corresponding to the tagged portion of the text.
    Type: Grant
    Filed: June 7, 2019
    Date of Patent: July 13, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Marco Nicolis, Adam Franciszek Nadolski
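    A minimal Python sketch of the tagging step described in 11062694, assuming a fixed interjection list in place of the application-knowledge, prosodic, and linguistic analysis; matched words are wrapped in SSML <emphasis> tags before TTS processing.
      INTERJECTIONS = {"wow", "ouch", "hooray", "oops"}

      def tag_emphasis(transcript: str) -> str:
          """Wrap words judged to need emphasis (here: a small interjection list)
          in SSML <emphasis> tags before the text is sent to TTS."""
          words = []
          for word in transcript.split():
              if word.strip(",.!?").lower() in INTERJECTIONS:
                  words.append(f'<emphasis level="strong">{word}</emphasis>')
              else:
                  words.append(word)
          return "<speak>" + " ".join(words) + "</speak>"

      print(tag_emphasis("Wow, that was fast!"))
      # <speak><emphasis level="strong">Wow,</emphasis> that was fast!</speak>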
  • Patent number: 11064000
    Abstract: Techniques and systems are described for accessible audio switching options during an online conference. For example, a conferencing system receives presentation content and audio content as part of the online conference from a client device. The conferencing system generates voice-over content from the presentation content by converting text of the presentation content to audio. The conferencing system then divides the presentation content into presentation segments. The conferencing system also divides the audio content into audio segments that correspond to respective presentation segments, and the voice-over content into voice-over segments that correspond to respective presentation segments. As the online conference is output, the conferencing system enables switching between a corresponding audio segment and voice-over segment during output of a respective presentation segment.
    Type: Grant
    Filed: November 29, 2017
    Date of Patent: July 13, 2021
    Assignee: Adobe Inc.
    Inventors: Ajay Jain, Sachin Soni, Amit Srivastava
  • Patent number: 11055047
    Abstract: The waveform display device of the present invention is provided with a waveform pattern storage unit configured to store, in an associated manner, a control command and a waveform pattern of time-series data measured when the manufacturing machine is controlled by the control command, a waveform analysis unit configured to extract a characteristic waveform from the time-series data and identify the control command corresponding to the characteristic waveform with reference to the waveform pattern storage unit, a correspondence analysis unit configured to identify the correspondence between the characteristic waveform and a command included in the control program, based on the control program and the control command corresponding to the characteristic waveform, and a display unit configured to perform display such that the correspondence between the characteristic waveform and the command included in the control program is ascertainable.
    Type: Grant
    Filed: April 1, 2019
    Date of Patent: July 6, 2021
    Assignee: FANUC CORPORATION
    Inventor: Junichi Tezuka
  • Patent number: 11049491
    Abstract: Systems, methods, and computer-readable storage devices to improve the quality of synthetic speech generation. A system selects speech units from a speech unit database, the speech units corresponding to text to be converted to speech. The system identifies a desired prosodic curve of speech produced from the selected speech units, and also identifies an actual prosodic curve of the speech units. The selected speech units are modified such that a new prosodic curve of the modified speech units matches the desired prosodic curve. The system stores the modified speech units into the speech unit database for use in generating future speech, thereby increasing the prosodic coverage of the database with the expectation of improving the output quality.
    Type: Grant
    Filed: March 24, 2020
    Date of Patent: June 29, 2021
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Alistair D. Conkie, Ladan Golipour, Ann K. Syrdal
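    A rough Python sketch of the prosody-matching idea in 11049491: the pitch contour of selected speech units is rescaled toward a desired contour before the modified units are stored back in the database. The simple linear adjustment below is a stand-in for the actual signal modification.
      import numpy as np

      def match_prosody(actual_f0: np.ndarray, desired_f0: np.ndarray) -> np.ndarray:
          """Shift and scale the pitch contour of selected speech units toward the
          desired prosodic curve (crude illustration, not the patented method)."""
          centered = actual_f0 - actual_f0.mean()
          scale = desired_f0.std() / (centered.std() + 1e-8)
          return centered * scale + desired_f0.mean()

      actual = np.array([110.0, 112.0, 115.0, 113.0])    # Hz, from the unit database
      desired = np.array([120.0, 135.0, 150.0, 140.0])   # Hz, target prosodic curve
      print(match_prosody(actual, desired).round(1))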
  • Patent number: 11047881
    Abstract: A measuring system for measuring signals with multiple measurement probes comprises a multi probe measurement device comprising at least two probe interfaces that each couple the multi probe measurement device with at least one of the measurement probes, a data interface that couples the multi probe measurement device to a measurement data receiver, and a processing unit coupled to the at least two probe interfaces that records measurement values via the at least two probe interfaces from the measurement probes, wherein the processing unit is further coupled to the data interface and provides the recorded measurement values to the measurement data receiver, and a measurement data receiver comprising a data interface, wherein the data interface of the measurement data receiver is coupled to the data interface of the multi probe measurement device.
    Type: Grant
    Filed: January 15, 2018
    Date of Patent: June 29, 2021
    Inventors: Gerd Bresser, Friedrich Reich
  • Patent number: 11044368
    Abstract: An application processor is provided. The application processor includes a system bus, a host processor, a voice trigger system and an audio subsystem that are electrically connected to the system bus. The voice trigger system performs a voice trigger operation and issues a trigger event based on a trigger input signal that is provided through a trigger interface. The audio subsystem processes audio streams through an audio interface. While an audio replay is performed through the audio interface, the application processor performs an echo cancellation with respect to microphone data received from a microphone to generate compensated data and the voice trigger system performs the voice trigger operation based on the compensated data.
    Type: Grant
    Filed: November 15, 2018
    Date of Patent: June 22, 2021
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventor: Sun-Kyu Kim
  • Patent number: 11030209
    Abstract: Methods and systems for generating and evaluating fused query lists. A query on a corpus of documents is evaluated using a plurality of retrieval methods and a ranked list for each of the plurality of retrieval methods is obtained. A plurality of fused ranked lists is sampled, each fusing said ranked lists for said plurality of retrieval methods, and the sampled fused ranked lists are sorted. In an unsupervised manner, an objective comprising a likelihood that a fused ranked list, fusing said ranked lists for each of said plurality of retrieval methods, is relevant to a query and a relevance event, is optimized to optimize the sampling, until convergence is achieved. Documents of the fused ranked list are determined based on the optimization.
    Type: Grant
    Filed: December 28, 2018
    Date of Patent: June 8, 2021
    Assignee: International Business Machines Corporation
    Inventors: Haggai Roitman, Bar Weiner, Shai Erera
  • Patent number: 11023913
    Abstract: A system and method for receiving and executing emoji based commands in messaging applications. The system and method may include processes such as identifying emojis in a message, determining one or more actions based on the emoji, and completing the determined actions.
    Type: Grant
    Filed: October 8, 2019
    Date of Patent: June 1, 2021
    Assignee: PayPal, Inc.
    Inventor: Kent Griffin
  • Patent number: 11017763
    Abstract: During text-to-speech processing, a sequence-to-sequence neural network model may process text data and determine corresponding spectrogram data. A normalizing flow component may then process this spectrogram data to predict corresponding phase data. An inverse Fourier transform may then be performed on the spectrogram and phase data to create an audio waveform that includes speech corresponding to the text.
    Type: Grant
    Filed: December 12, 2019
    Date of Patent: May 25, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Vatsal Aggarwal, Nishant Prateek, Roberto Barra Chicote, Andrew Paul Breen
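    A runnable Python sketch of the signal path in 11017763, using SciPy: a magnitude spectrogram (which the sequence-to-sequence model would predict from text) is combined with phase data (which the normalizing flow would predict) and inverted back to a waveform. To keep the example self-contained, both are simply taken from a toy sine wave.
      import numpy as np
      from scipy.signal import stft, istft

      fs = 16000
      t = np.arange(fs) / fs
      reference = np.sin(2 * np.pi * 220 * t)            # toy "speech" signal

      _, _, Z = stft(reference, fs=fs, nperseg=512)
      magnitude = np.abs(Z)                              # what the acoustic model would predict
      phase = np.angle(Z)                                # what the normalizing flow would predict

      # Recombine magnitude and phase, then apply the inverse transform.
      _, waveform = istft(magnitude * np.exp(1j * phase), fs=fs, nperseg=512)
      print(waveform.shape)                              # reconstructed audio samples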
  • Patent number: 11016719
    Abstract: A method for producing an audio representation of aggregated content includes selecting preferred content from a number of sources, wherein the sources are emotion-tagged, aggregating the emotion-tagged preferred content sources, and creating an audio representation of the emotion-tagged aggregated content. The aggregation of emotion-tagged content sources and/or the creation of the audio representation may be performed by a mobile device. The emotion-tagged content includes text with HTML tags that specify how text-to-speech conversion should be performed.
    Type: Grant
    Filed: February 28, 2017
    Date of Patent: May 25, 2021
    Assignee: Dish Technologies L.L.C.
    Inventor: John C. Calef, III
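    A small Python sketch of how emotion-tagged HTML might drive TTS as described in 11016719. The data-emotion attribute and the emotion-to-prosody table are hypothetical conventions, not defined by the patent.
      import re

      def html_emotion_to_ssml(html: str) -> str:
          """Translate a hypothetical emotion-tagging convention (a data-emotion
          attribute on <span>) into SSML prosody hints before playback."""
          prosody = {"excited": 'rate="fast" pitch="+2st"', "calm": 'rate="slow"'}
          def repl(m):
              settings = prosody.get(m.group(1), "")
              return f"<prosody {settings}>{m.group(2)}</prosody>"
          body = re.sub(r'<span data-emotion="(\w+)">(.*?)</span>', repl, html)
          return f"<speak>{body}</speak>"

      print(html_emotion_to_ssml('Markets <span data-emotion="excited">surged</span> today.'))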
  • Patent number: 11017017
    Abstract: Systems, methods, and computer program products for a cognitive personalized channel on a computer network or telecommunications network, such as a 5G mobile communication network, which can be used for medical purposes to assist color-blind users and people afflicted with achromatopsia. The personalized channel can be a bidirectional channel capable of identifying color and serving as an enhanced medical service. The service operates by collecting inputs and streaming data, creating situation-based tags, and embedding the tags on human-readable displays to help users understand additional context of the streaming data that might otherwise not be understood due to the user's medical condition. The systems, methods and program products use the embedded tags to create a manifestation of the colors in images, videos, text and other collected visual streams by taking advantage of end-to-end service orchestration provided by 5G networks.
    Type: Grant
    Filed: June 4, 2019
    Date of Patent: May 25, 2021
    Assignee: International Business Machines Corporation
    Inventors: Craig M. Trim, Lakisha R. S. Hall, Gandhi Sivakumar, Kushal Patel, Sarvesh S. Patel
  • Patent number: 11011163
    Abstract: Embodiments of the present disclosure disclose a method and apparatus for recognizing voice. A specific implementation of the method comprises: receiving voice information sent by a user through a terminal, and acquiring simultaneously a user identifier of the user; recognizing the voice information to obtain a first recognized text; determining a word information set stored in association with the user identifier of the user based on the user identifier of the user; and processing the first recognized text based on word information in the determined word information set to obtain a second recognized text, and sending the second recognized text to the terminal. The implementation improves the accuracy of voice recognition and meets a personalized need of a user.
    Type: Grant
    Filed: July 30, 2018
    Date of Patent: May 18, 2021
    Assignee: Baidu Online Network Technology (Beijing) Co., Ltd.
    Inventors: Niandong Du, Yan Xie
  • Patent number: 11004451
    Abstract: A system, a user terminal, a method of the system, a method of the user terminal, and a computer program product are provided. The system includes a communication interface, at least one processor operatively coupled to the communication interface, and at least one piece of memory operatively coupled to the at least one processor, wherein the at least one piece of memory is configured to store instructions configured for the at least one processor to receive sound data from a first external device through the communication interface, obtain a voice signal and a noise signal from the sound data using at least some of an automatic voice recognition module, change the voice signal into text data, determine a noise pattern based on at least some of the noise signal, and determine a domain using the text data and the noise pattern when the memory operates.
    Type: Grant
    Filed: March 12, 2019
    Date of Patent: May 11, 2021
    Inventors: Taegu Kim, Sangyong Park, Jungwook Park, Dale Noh, Dongho Jang
  • Patent number: 11003418
    Abstract: An information processing apparatus includes a detection unit and a controller. The detection unit detects a specific operation, which is an operation to output audio related to a setting screen and is not an operation performed for each setting item displayed on the setting screen. The controller performs control so as to output first audio information related to setting items satisfying a predetermined standard by audio, among setting items included in the setting contents, in a case where the specific operation is detected by the detection unit at a first stage before the setting contents are determined.
    Type: Grant
    Filed: April 3, 2018
    Date of Patent: May 11, 2021
    Assignee: FUJI XEROX CO., LTD.
    Inventor: Satoshi Kawamura
  • Patent number: 10991360
    Abstract: A system and method are disclosed for generating customized text-to-speech voices for a particular application. The method comprises generating a custom text-to-speech voice by selecting a voice associated with a domain, collecting text data associated with the domain from a pre-existing text data source, and, using the collected text data, generating an in-domain inventory of synthesis speech units by selecting speech units appropriate to the domain via a search of a pre-existing inventory of synthesis speech units, or by recording the minimal inventory for a selected level of synthesis quality. The text-to-speech custom voice for the domain is generated utilizing the in-domain inventory of synthesis speech units. Active learning techniques may also be employed to identify problem phrases, wherein only a few minutes of recorded data are necessary to deliver a high-quality custom TTS voice.
    Type: Grant
    Filed: July 31, 2017
    Date of Patent: April 27, 2021
    Assignee: Cerence Operating Company
    Inventors: Srinivas Bangalore, Junlan Feng, Mazin Gilbert, Juergen Schroeter, Ann K. Syrdal, David Schulz
  • Patent number: 10977442
    Abstract: Methods and apparatus, including computer program products, are provided for a contextualized bot framework.
    Type: Grant
    Filed: December 13, 2018
    Date of Patent: April 13, 2021
    Assignee: SAP SE
    Inventors: Natesan Sivagnanam, Jayananda A. Kotri
  • Patent number: 10978043
    Abstract: An approach is provided in which an information handling system converts a first set of text to synthesized speech using a text-to-speech converter. The information handling system then converts the synthesized speech to a second set of text using a speech-to-text converter. In response to converting the synthesized speech to the second set of text, the information handling system analyzes the second set of text against a filtering criterion and prevents usage of the synthesized speech based on the analysis.
    Type: Grant
    Filed: October 1, 2018
    Date of Patent: April 13, 2021
    Assignee: International Business Machines Corporation
    Inventors: Kyle M. Brake, Stanley J. Vernier, Stephen A. Boxwell, Keith G. Frost
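    A compact Python sketch of the round-trip check in 10978043: synthesize the text, transcribe the synthesized speech, and block playback if the transcription trips a filtering criterion. The tts and stt callables are placeholders for whatever engines are in use, so the toy identity functions below exist only to make the sketch runnable.
      def safe_to_speak(text: str, tts, stt, banned: set[str]) -> bool:
          """Synthesize the text, transcribe the result, and allow playback only
          if the round-trip transcription passes the filter."""
          audio = tts(text)
          round_trip_text = stt(audio)
          return not any(word in round_trip_text.lower() for word in banned)

      # Toy engines so the sketch runs: identity "synthesis" and "recognition".
      fake_tts = lambda s: s.encode()
      fake_stt = lambda b: b.decode()
      print(safe_to_speak("have a nice day", fake_tts, fake_stt, banned={"darn"}))  # True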
  • Patent number: 10977271
    Abstract: A method of normalizing security log data can include receiving one or more security logs including unstructured data from a plurality of devices and reviewing unstructured data of the one or more security logs. The method also can include automatically applying a probabilistic model of one or more engines to identify one or more attributes or features of the unstructured data, and determine whether the identified attributes or features are indicative of identifiable entities, and tagging one or more identifiable entities of the identifiable entities, as well as organizing tagged entities into one or more normalized logs having a readable format with a prescribed schema. In addition, the method can include reviewing the one or more normalized logs for potential security events.
    Type: Grant
    Filed: March 23, 2020
    Date of Patent: April 13, 2021
    Assignee: Secureworks Corp.
    Inventor: Lewis McLean
  • Patent number: 10978184
    Abstract: A medical information server receives a signal from a client device over a network, representing a first user interaction of a user with respect to first medical information displayed to a user. A user interaction analyzer invokes a first set of ECCD rules associated with the user based on the first user interaction to determine medical data categories that the user is likely interested in. The first set of ECCD rules was generated by an ECCD engine based on prior user interactions of the user. A data retrieval module accesses medical data servers corresponding to the medical data categories to retrieve medical data of the medical data categories. A view generator integrates the retrieved medical data to generate one or more views of second medical information and transmits the views of second medical information to a client device to be displayed on a display of the client device.
    Type: Grant
    Filed: October 31, 2014
    Date of Patent: April 13, 2021
    Assignee: TeraRecon, Inc.
    Inventor: Jeffrey Sorenson
  • Patent number: 10971034
    Abstract: A method of automatically partitioning a refreshable braille display based on presence of pertinent ancillary alphanumeric content. In an unpartitioned configuration, every braille cell of the refreshable braille display is used to output the primary alphanumeric content. When the refreshable braille display outputs a segment of the primary alphanumeric content having associated ancillary alphanumeric content, such as a footnote or a comment, the braille display is automatically partitioned into a first partition and a second partition. The braille cells of the first partition are allocated for outputting the primary alphanumeric content, while the braille cells of the second partition are allocated for outputting the ancillary alphanumeric content.
    Type: Grant
    Filed: November 4, 2020
    Date of Patent: April 6, 2021
    Assignee: Freedom Scientific, Inc.
    Inventors: James T. Datray, Joseph Kelton Stephen, Glen Gordon
  • Patent number: 10970909
    Abstract: Embodiments of the present disclosure provide a method and an apparatus for eye movement synthesis, the method including: obtaining eye movement feature data and speech feature data, wherein the eye movement feature data reflects an eye movement behavior, and the speech feature data reflects a voice feature; obtaining a driving model according to the eye movement feature data and the speech feature data, wherein the driving model is configured to indicate an association between the eye movement feature data and the speech feature data; synthesizing an eye movement of a virtual human according to speech input data and the driving model; and controlling the virtual human to exhibit the synthesized eye movement. The embodiment enables the virtual human to exhibit an eye movement corresponding to the voice data according to the eye movement feature data and the speech feature data, thereby improving the authenticity of the interaction.
    Type: Grant
    Filed: December 23, 2019
    Date of Patent: April 6, 2021
    Assignee: BEIHANG UNIVERSITY
    Inventors: Feng Lu, Qinping Zhao
  • Patent number: 10967223
    Abstract: Tracking and monitoring athletic activity offers individuals additional motivation to continue such behavior. An individual may track his or her athletic activity by completing goals. These goals may be represented by real-world objects such as food items, landmarks, buildings, statues, other physical structures, toys and the like. Each object may correspond to an athletic activity goal and require an amount of athletic activity to complete the goal. For example, a donut goal object may correspond to an athletic activity goal of burning 350 calories. The user may progress from goal object to goal object. Goal objects may increase in difficulty (e.g., amount of athletic activity required) and might only be available for selection upon completing an immediately previous goal object, a number of goal objects, an amount of athletic activity and the like.
    Type: Grant
    Filed: March 16, 2020
    Date of Patent: April 6, 2021
    Assignee: NIKE, Inc.
    Inventors: Michael T. Hoffman, Kwamina Crankson, Jason Nims
  • Patent number: 10963068
    Abstract: An input device is provided having at least a plurality of touch-sensitive back surfaces, with the ability to split into several independent units, with adjustable vertical and horizontal angles between those units; a method of dividing the keys of a keyboard into interface groups, each group having a home key and associated with a finger; a method of dynamically mapping and remapping the home keys of each interface group to the coordinates of the associated fingers at their resting position, and mapping non-home keys around their associated home keys on the touch-sensitive back surfaces; a method of calculating and remapping home keys and non-home keys when the coordinates of the fingers at the resting position shift during operation; a method of reading each activated key immediately and automatically; and a method of detecting typos and grammatical errors and notifying operators using human speech or other communication methods.
    Type: Grant
    Filed: October 11, 2019
    Date of Patent: March 30, 2021
    Inventor: Hovsep Giragossian
  • Patent number: 10942701
    Abstract: A method for performing voice dictation with an earpiece worn by a user includes receiving as input to the earpiece voice sound information from the user at one or more microphones of the earpiece, receiving as input to the earpiece user control information from one or more sensors within the earpiece independent from the one or more microphones of the earpiece, inserting a machine-generated transcription of the voice sound information from the user into a user input area associated with an application executing on a computing device and manipulating the application executing on the computing device based on the user control information.
    Type: Grant
    Filed: October 23, 2017
    Date of Patent: March 9, 2021
    Assignee: BRAGI GmbH
    Inventors: Peter Vincent Boesen, Luigi Belverato, Martin Steiner
  • Patent number: 10943060
    Abstract: A collaborative content management system allows multiple users to access and modify collaborative documents. When audio data is recorded by or uploaded to the system, the audio data may be transcribed or summarized to improve accessibility and user efficiency. Text transcriptions are associated with portions of the audio data representative of the text, and users can search the text transcription and access the portions of the audio data corresponding to search queries for playback. An outline can be automatically generated based on a text transcription of audio data and embedded as a modifiable object within a collaborative document. The system associates hot words with actions to modify the collaborative document upon identifying the hot words in the audio data. Collaborative content management systems can also generate custom lexicons for users based on documents associated with the user for use in transcribing audio data, ensuring that text transcription is more accurate.
    Type: Grant
    Filed: October 18, 2019
    Date of Patent: March 9, 2021
    Assignee: Dropbox, Inc.
    Inventors: Timo Mertens, Bradley Neuberg
  • Patent number: 10936360
    Abstract: Embodiments of the present disclosure provide a method and device for processing a batch process including a plurality of content management service operations. The method comprises: determining, at a client, a batch process template associated with the batch process, the batch process template including shareable information and at least one variable field of the plurality of content management service operations; determining a value of the at least one variable field; generating, based on the determined batch process template and the value, a first request for performing the batch process template; and sending the first request to a server. Embodiments of the present disclosure further provide a corresponding method performed at a server side, and a corresponding device.
    Type: Grant
    Filed: September 20, 2017
    Date of Patent: March 2, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Wei Ruan, Jason Chen, Wei Zhou, Chen Wang, Zed Minhong
  • Patent number: 10937412
    Abstract: According to an embodiment of the present invention, there is provided a terminal including a memory which stores a prosody correction model; a processor which corrects a first prosody prediction result of a text sentence to a second prosody prediction result based on the prosody correction model and generates a synthetic speech corresponding to the text sentence having a prosody according to the second prosody prediction result; and an audio output unit which outputs the generated synthetic speech.
    Type: Grant
    Filed: February 5, 2019
    Date of Patent: March 2, 2021
    Assignee: LG ELECTRONICS INC.
    Inventors: Jonghoon Chae, Sungmin Han, Yongchul Park, Siyoung Yang, Juyeong Jang
  • Patent number: 10922446
    Abstract: A computational accelerator for determination of linkages across disparate works in a model-based system engineering (MBSE) regime accesses textual content of MBSE works and performs preprocessing of each MBSE work to produce preprocessed data structures representing the MBSE works. The preprocessing gathers significant terms from each MBSE work and delineates the textual content of each MBSE work into segments corresponding to separately identifiable textual statements. Segment-wise comparison between segment pairings of the preprocessed data structures corresponding to different MBSE works is performed to produce a set of segment-wise comparison results based on terms common to each segment pairing, and statement-wise linkages between statements of the MBSE works are determined based on the set of segment-wise comparison results.
    Type: Grant
    Filed: December 18, 2017
    Date of Patent: February 16, 2021
    Assignee: Raytheon Company
    Inventors: Susan N. Gottschlich, Gregory S. Schrecke, Patrick M. Killian
  • Patent number: 10923100
    Abstract: In some implementations, a language proficiency of a user of a client device is determined by one or more computers. The one or more computers then determines a text segment for output by a text-to-speech module based on the determined language proficiency of the user. After determining the text segment for output, the one or more computers generates audio data including a synthesized utterance of the text segment. The audio data including the synthesized utterance of the text segment is then provided to the client device for output.
    Type: Grant
    Filed: September 17, 2019
    Date of Patent: February 16, 2021
    Assignee: Google LLC
    Inventors: Matthew Sharifi, Jakob Nicolaus Foerster
  • Patent number: 10916236
    Abstract: An output device includes a memory and a processor coupled to the memory. The processor obtains an utterance command and an action command, analyzes an utterance content of the utterance command inputted after an action performed in response to the action command, modifies the action command based on a result of the analysis, and outputs the modified action command and the utterance command.
    Type: Grant
    Filed: March 11, 2019
    Date of Patent: February 9, 2021
    Assignee: FUJITSU LIMITED
    Inventors: Kaoru Kinoshita, Masayoshi Shimizu, Shinji Kanda
  • Patent number: 10902323
    Abstract: Methods and apparatus, including computer program products, are provided for a bot framework. In some implementations, there may be provided a method which may include receiving a request comprising a text string, the request corresponding to a request for handling by a bot; generating, from the request, at least one token; determining whether the at least one token matches at least one stored token mapped to an address; selecting the address in response to the match between the at least one token and the at least one stored token; and presenting, at a client interface associated with the bot, data obtained at the selected address in order to form a response to the request. Related systems, methods, and articles of manufacture are also disclosed.
    Type: Grant
    Filed: August 11, 2017
    Date of Patent: January 26, 2021
    Assignee: SAP SE
    Inventors: Natesan Sivagnanam, Abhishek Jain
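    A minimal Python sketch of the routing step in 10902323: tokenize the request text, match tokens against a stored token-to-address map, and use the matched address to fetch data for the bot's reply. The map contents and URLs below are illustrative assumptions.
      TOKEN_TO_ADDRESS = {"invoice": "https://example.com/api/invoices",
                          "leave":   "https://example.com/api/leave-balance"}

      def route_request(text: str) -> str | None:
          """Return the address whose data would form the bot's response, or None
          if no stored token matches the request."""
          for token in text.lower().split():
              if token in TOKEN_TO_ADDRESS:
                  return TOKEN_TO_ADDRESS[token]
          return None

      print(route_request("show my latest invoice"))   # https://example.com/api/invoices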