Image To Speech Patents (Class 704/260)
  • Patent number: 11064000
    Abstract: Techniques and systems are described for accessible audio switching options during the online conference. For example, a conferencing system receives presentation content and audio content as part of the online conference from a client device. The conferencing system generates voice-over content from the presentation content by converting text of the presentation content to audio. The conferencing system then divides the presentation content into presentation segments. The conferencing system also divides the audio content into audio segments that correspond to respective presentation segments, and the voice-over content into voice-over segments that correspond to respective presentation segments. As the online conference is output, the conferencing system enables switching between a corresponding audio segment and voice-over segment during output of a respective presentation segment.
    Type: Grant
    Filed: November 29, 2017
    Date of Patent: July 13, 2021
    Assignee: Adobe Inc.
    Inventors: Ajay Jain, Sachin Soni, Amit Srivastava
  • Patent number: 11062694
    Abstract: Systems and methods for generating output audio with emphasized portions are described. Spoken audio is obtained and undergoes speech processing (e.g., ASR and optionally NLU) to create text. It may be determined that the resulting text includes a portion that should be emphasized (e.g., an interjection) using at least one of knowledge of an application run on a device that captured the spoken audio, prosodic analysis, and/or linguistic analysis. The portion of text to be emphasized may be tagged (e.g., using a Speech Synthesis Markup Language (SSML) tag). TTS processing is then performed on the tagged text to create output audio including an emphasized portion corresponding to the tagged portion of the text.
    Type: Grant
    Filed: June 7, 2019
    Date of Patent: July 13, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Marco Nicolis, Adam Franciszek Nadolski
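    A minimal sketch of the emphasis-tagging step this abstract describes, assuming SSML output; the interjection list and the tag_emphasis helper are illustrative and not taken from the patent (ASR, NLU, and the TTS engine itself are omitted).
```python
# Hypothetical illustration: wrap interjections in SSML <emphasis> tags before
# handing the text to a TTS engine (the engine call is not shown).

INTERJECTIONS = {"wow", "ouch", "hooray", "yikes"}   # invented example set

def tag_emphasis(text: str) -> str:
    """Return an SSML document with interjections marked for emphasis."""
    tagged_words = []
    for word in text.split():
        if word.strip(",.!?").lower() in INTERJECTIONS:
            tagged_words.append(f'<emphasis level="strong">{word}</emphasis>')
        else:
            tagged_words.append(word)
    return "<speak>" + " ".join(tagged_words) + "</speak>"

print(tag_emphasis("Wow, that package arrived a day early!"))
# <speak><emphasis level="strong">Wow,</emphasis> that package arrived a day early!</speak>
```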
  • Patent number: 11062497
    Abstract: A method and system for creation of an audiovisual message that is personalized to a recipient. Information is received that is associated with the recipient. At least one representation of a visual media segment, including an animation component, and at least one representation of an audio media segment for use in creation of the audiovisual message is identified in memory storage. The information is added to at least one of the visual media segment and the audio media segment. The audio media segment is generated as an audio file. The audio file is synchronized to at least one transition in the animation component. The audio file is associated with the visual media segment.
    Type: Grant
    Filed: July 17, 2017
    Date of Patent: July 13, 2021
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Karthiksundar Sankaran, Mauricio Lopez
  • Patent number: 11055047
    Abstract: The waveform display device of the present invention is provided with a waveform pattern storage unit configured to store, in an associated manner, a control command and a waveform pattern of time-series data measured when the manufacturing machine is controlled by the control command, a waveform analysis unit configured to extract a characteristic waveform from the time-series data and identify the control command corresponding to the characteristic waveform with reference to the waveform pattern storage unit, a correspondence analysis unit configured to identify the correspondence between the characteristic waveform and a command included in the control program, based on the control program and the control command corresponding to the characteristic waveform, and a display unit configured to perform display such that the correspondence between the characteristic waveform and the command included in the control program is ascertainable.
    Type: Grant
    Filed: April 1, 2019
    Date of Patent: July 6, 2021
    Assignee: FANUC CORPORATION
    Inventor: Junichi Tezuka
  • Patent number: 11049491
    Abstract: Systems, methods, and computer-readable storage devices to improve the quality of synthetic speech generation. A system selects speech units from a speech unit database, the speech units corresponding to text to be converted to speech. The system identifies a desired prosodic curve of speech produced from the selected speech units, and also identifies an actual prosodic curve of the speech units. The selected speech units are modified such that a new prosodic curve of the modified speech units matches the desired prosodic curve. The system stores the modified speech units into the speech unit database for use in generating future speech, thereby increasing the prosodic coverage of the database with the expectation of improving the output quality.
    Type: Grant
    Filed: March 24, 2020
    Date of Patent: June 29, 2021
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Alistair D. Conkie, Ladan Golipour, Ann K. Syrdal
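    A rough sketch of the prosody-matching idea, assuming the prosodic curve is a per-frame f0 contour; a real unit-selection system would resynthesize the selected units (e.g., with PSOLA) rather than just compute scaling factors, and the numbers below are invented.
```python
import numpy as np

def match_prosody(actual_f0: np.ndarray, desired_f0: np.ndarray) -> np.ndarray:
    """Per-frame pitch-scaling factors that map the actual f0 contour of the
    selected units onto the desired contour (unvoiced frames are left alone)."""
    # Resample the desired contour onto the actual contour's frame grid.
    desired = np.interp(np.linspace(0, 1, len(actual_f0)),
                        np.linspace(0, 1, len(desired_f0)), desired_f0)
    factors = np.ones_like(actual_f0, dtype=float)
    voiced = actual_f0 > 0
    factors[voiced] = desired[voiced] / actual_f0[voiced]
    return factors

actual = np.array([0.0, 110, 115, 120, 0, 118, 116])   # Hz per frame, 0 = unvoiced
desired = np.array([0.0, 120, 130, 140, 0, 125, 110])
print(match_prosody(actual, desired).round(2))
# [1.   1.09 1.13 1.17 1.   1.06 0.95]
```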
  • Patent number: 11047881
    Abstract: A measuring system for measuring signals with multiple measurement probes comprises a multi probe measurement device comprising at least two probe interfaces that each couple the multi probe measurement device with at least one of the measurement probes, a data interface that couples the multi probe measurement device to a measurement data receiver, and a processing unit coupled to the at least two probe interfaces that records measurement values via the at least two probe interfaces from the measurement probes, wherein the processing unit is further coupled to the data interface and provides the recorded measurement values to the measurement data receiver, and a measurement data receiver comprising a data interface, wherein the data interface of the measurement data receiver is coupled to the data interface of the multi probe measurement device.
    Type: Grant
    Filed: January 15, 2018
    Date of Patent: June 29, 2021
    Inventors: Gerd Bresser, Friedrich Reich
  • Patent number: 11044368
    Abstract: An application processor is provided. The application processor includes a system bus, a host processor, a voice trigger system and an audio subsystem that are electrically connected to the system bus. The voice trigger system performs a voice trigger operation and issues a trigger event based on a trigger input signal that is provided through a trigger interface. The audio subsystem processes audio streams through an audio interface. While an audio replay is performed through the audio interface, the application processor performs an echo cancellation with respect to microphone data received from a microphone to generate compensated data and the voice trigger system performs the voice trigger operation based on the compensated data.
    Type: Grant
    Filed: November 15, 2018
    Date of Patent: June 22, 2021
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventor: Sun-Kyu Kim
  • Patent number: 11030209
    Abstract: Methods and systems for generating and evaluating fused query lists. A query on a corpus of documents is evaluated using a plurality of retrieval methods and a ranked list for each of the plurality of retrieval methods is obtained. A plurality of fused ranked lists is sampled, each fusing said ranked lists for said plurality of retrieval methods, and the sampled fused ranked lists are sorted. In an unsupervised manner, an objective comprising a likelihood that a fused ranked list, fusing said ranked lists for each of said plurality of retrieval methods, is relevant to a query and a relevance event, is optimized to optimize the sampling, until convergence is achieved. Documents of the fused ranked list are determined based on the optimization.
    Type: Grant
    Filed: December 28, 2018
    Date of Patent: June 8, 2021
    Assignee: International Business Machines Corporation
    Inventors: Haggai Roitman, Bar Weiner, Shai Erera
  • Patent number: 11023913
    Abstract: A system and method for receiving and executing emoji-based commands in messaging applications. The system and method may include processes such as identifying emojis in a message, determining one or more actions based on the emojis, and completing the determined actions.
    Type: Grant
    Filed: October 8, 2019
    Date of Patent: June 1, 2021
    Assignee: PayPal, Inc.
    Inventor: Kent Griffin
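    A toy sketch of emoji-to-action dispatch along the lines of the abstract above; the emoji set and action names are invented for illustration.
```python
# Map each recognized emoji to an action; scan a message and collect the
# actions it triggers (execution of the actions is not shown).

EMOJI_ACTIONS = {
    "💸": "send_money",
    "🤝": "request_money",
    "📄": "show_invoice",
}

def actions_for_message(message: str) -> list:
    """Identify emojis in a message and return the actions they map to."""
    return [EMOJI_ACTIONS[ch] for ch in message if ch in EMOJI_ACTIONS]

print(actions_for_message("Dinner was great 💸 20 dollars?"))  # ['send_money']
```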
  • Patent number: 11017017
    Abstract: Systems, methods, and computer program products for a cognitive personalized channel on a computer network or telecommunications network, such as a 5G mobile communication network, which can be used for medical purposes to assist color-blind users and people afflicted with achromatopsia. The personalized channel can be a bidirectional channel capable of identifying color and serving as an enhanced medical service. The service operates by collecting inputs and streaming data, creating situation-based tags, and embedding the tags on human-readable displays to help users understand additional context of the streaming data that might otherwise not be understood due to the user's medical condition. The systems, methods and program products use the embedded tags to create a manifestation of the colors in images, videos, text and other collected visual streams by taking advantage of end-to-end service orchestration provided by 5G networks.
    Type: Grant
    Filed: June 4, 2019
    Date of Patent: May 25, 2021
    Assignee: International Business Machines Corporation
    Inventors: Craig M. Trim, Lakisha R. S. Hall, Gandhi Sivakumar, Kushal Patel, Sarvesh S. Patel
  • Patent number: 11017763
    Abstract: During text-to-speech processing, a sequence-to-sequence neural network model may process text data and determine corresponding spectrogram data. A normalizing flow component may then process this spectrogram data to predict corresponding phase data. An inverse Fourier transform may then be performed on the spectrogram and phase data to create an audio waveform that includes speech corresponding to the text.
    Type: Grant
    Filed: December 12, 2019
    Date of Patent: May 25, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Vatsal Aggarwal, Nishant Prateek, Roberto Barra Chicote, Andrew Paul Breen
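    A sketch of the final step only, assuming the two neural models have already produced a magnitude spectrogram and a phase matrix; the overlap-add inverse STFT below uses a Hann window and invented parameters, not the patent's implementation.
```python
import numpy as np

def istft(magnitude: np.ndarray, phase: np.ndarray,
          n_fft: int = 512, hop: int = 128) -> np.ndarray:
    """magnitude, phase: (frames, n_fft // 2 + 1) arrays from the two models."""
    spectrum = magnitude * np.exp(1j * phase)
    frames = np.fft.irfft(spectrum, n=n_fft, axis=1)        # (frames, n_fft)
    window = np.hanning(n_fft)
    out = np.zeros(hop * (len(frames) - 1) + n_fft)
    norm = np.zeros_like(out)
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + n_fft] += frame * window       # overlap-add
        norm[i * hop:i * hop + n_fft] += window ** 2
    return out / np.maximum(norm, 1e-8)

mag = np.abs(np.random.randn(40, 257))                       # stand-in model outputs
phi = np.random.uniform(-np.pi, np.pi, (40, 257))
print(istft(mag, phi).shape)                                  # (5504,)
```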
  • Patent number: 11016719
    Abstract: A method for producing an audio representation of aggregated content includes selecting preferred content from a number of sources, wherein the sources are emotion-tagged, aggregating the emotion-tagged preferred content sources, and creating an audio representation of the emotion-tagged aggregated content. The aggregation of emotion-tagged content sources and/or the creation of the audio representation may be performed by a mobile device. The emotion-tagged content includes text with HTML tags that specify how text-to-speech conversion should be performed.
    Type: Grant
    Filed: February 28, 2017
    Date of Patent: May 25, 2021
    Assignee: Dish Technologies L.L.C.
    Inventor: John C. Calef, III
  • Patent number: 11011163
    Abstract: Embodiments of the present disclosure disclose a method and apparatus for recognizing voice. A specific implementation of the method comprises: receiving voice information sent by a user through a terminal, and acquiring simultaneously a user identifier of the user; recognizing the voice information to obtain a first recognized text; determining a word information set stored in association with the user identifier of the user based on the user identifier of the user; and processing the first recognized text based on word information in the determined word information set to obtain a second recognized text, and sending the second recognized text to the terminal. The implementation improves the accuracy of voice recognition and meets a personalized need of a user.
    Type: Grant
    Filed: July 30, 2018
    Date of Patent: May 18, 2021
    Assignee: Baidu Online Network Technology (Beijing) Co., Ltd.
    Inventors: Niandong Du, Yan Xie
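    A rough illustration of the post-processing step, with an invented per-user word store (USER_WORDS) and simple fuzzy matching standing in for the patent's word-information lookup; the ASR step that produces the first recognized text is omitted.
```python
import difflib

USER_WORDS = {"user-42": ["Xiaoyan", "Haidian", "Baidu"]}    # invented store

def personalize(first_text: str, user_id: str) -> str:
    """Rewrite words of the first recognized text using the user's word set."""
    words = USER_WORDS.get(user_id, [])
    out = []
    for token in first_text.split():
        match = difflib.get_close_matches(token, words, n=1, cutoff=0.8)
        out.append(match[0] if match else token)
    return " ".join(out)

print(personalize("call Xiaoyen about the Haidan office", "user-42"))
# call Xiaoyan about the Haidian office
```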
  • Patent number: 11003418
    Abstract: An information processing apparatus includes a detection unit and a controller. The detection unit detects a specific operation, which is an operation to output audio related to a setting screen and not an operation performed for each setting item displayed on the setting screen. The controller performs control so as to output, by audio, first audio information related to setting items satisfying a predetermined standard, among the setting items included in the setting contents, in a case where the specific operation is detected by the detection unit at a first stage before the setting contents are determined.
    Type: Grant
    Filed: April 3, 2018
    Date of Patent: May 11, 2021
    Assignee: FUJI XEROX CO., LTD.
    Inventor: Satoshi Kawamura
  • Patent number: 11004451
    Abstract: A system, a user terminal, a method of the system, a method of the user terminal, and a computer program product are provided. The system includes a communication interface, at least one processor operatively coupled to the communication interface, and at least one piece of memory operatively coupled to the at least one processor, wherein the at least one piece of memory is configured to store instructions configured for the at least one processor to receive sound data from a first external device through the communication interface, obtain a voice signal and a noise signal from the sound data using at least some of an automatic voice recognition module, change the voice signal into text data, determine a noise pattern based on at least some of the noise signal, and determine a domain using the text data and the noise pattern when the memory operates.
    Type: Grant
    Filed: March 12, 2019
    Date of Patent: May 11, 2021
    Inventors: Taegu Kim, Sangyong Park, Jungwook Park, Dale Noh, Dongho Jang
  • Patent number: 10991360
    Abstract: A system and method are disclosed for generating customized text-to-speech voices for a particular application. The method comprises generating a custom text-to-speech voice by selecting a voice for generating a custom text-to-speech voice associated with a domain, collecting text data associated with the domain from a pre-existing text data source and using the collected text data, generating an in-domain inventory of synthesis speech units by selecting speech units appropriate to the domain via a search of a pre-existing inventory of synthesis speech units, or by recording the minimal inventory for a selected level of synthesis quality. The text-to-speech custom voice for the domain is generated utilizing the in-domain inventory of synthesis speech units. Active learning techniques may also be employed to identify problem phrases wherein only a few minutes of recorded data is necessary to deliver a high quality TTS custom voice.
    Type: Grant
    Filed: July 31, 2017
    Date of Patent: April 27, 2021
    Assignee: Cerence Operating Company
    Inventors: Srinivas Bangalore, Junlan Feng, Mazin Gilbert, Juergen Schroeter, Ann K. Syrdal, David Schulz
  • Patent number: 10977271
    Abstract: A method of normalizing security log data can include receiving one or more security logs including unstructured data from a plurality of devices and reviewing unstructured data of the one or more security logs. The method also can include automatically applying a probabilistic model of one or more engines to identify one or more attributes or features of the unstructured data, and determine whether the identified attributes or features are indicative of identifiable entities, and tagging one or more identifiable entities of the identifiable entities, as well as organizing tagged entities into one or more normalized logs having a readable format with a prescribed schema. In addition, the method can include reviewing the one or more normalized logs for potential security events.
    Type: Grant
    Filed: March 23, 2020
    Date of Patent: April 13, 2021
    Assignee: Secureworks Corp.
    Inventor: Lewis McLean
  • Patent number: 10978043
    Abstract: An approach is provided in which an information handling system converts a first set of text to synthesized speech using a text-to-speech converter. The information handling system then converts the synthesized speech to a second set of text using a speech-to-text converter. In response to converting the synthesized speech to the second set of text, the information handling system analyzes the second set of text against a filtering criterion and prevents usage of the synthesized speech based on the analysis.
    Type: Grant
    Filed: October 1, 2018
    Date of Patent: April 13, 2021
    Assignee: International Business Machines Corporation
    Inventors: Kyle M. Brake, Stanley J. Vernier, Stephen A. Boxwell, Keith G. Frost
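    A minimal sketch of the round-trip check; the synthesize/recognize functions are stand-ins for real TTS and STT engines, and the blocklist is an invented filtering criterion.
```python
BLOCKLIST = {"password", "ssn"}                 # invented filtering criterion

def synthesize(text: str) -> bytes:             # stand-in for a real TTS engine
    return text.encode()

def recognize(audio: bytes) -> str:             # stand-in for a real STT engine
    return audio.decode()

def safe_to_use(first_text: str) -> bool:
    """Synthesize, re-recognize, and block usage of the synthesized speech
    when the round-trip text matches the filtering criterion."""
    second_text = recognize(synthesize(first_text))
    return not any(word in second_text.lower().split() for word in BLOCKLIST)

print(safe_to_use("Your meeting starts at noon"))   # True
print(safe_to_use("Read me the password aloud"))    # False
```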
  • Patent number: 10978184
    Abstract: A medical information server receives a signal from a client device over a network, representing a first user interaction of a user with respect to first medical information displayed to a user. A user interaction analyzer invokes a first set of ECCD rules associated with the user based on the first user interaction to determine medical data categories that the user is likely interested in. The first set of ECCD rules was generated by an ECCD engine based on prior user interactions of the user. A data retrieval module accesses medical data servers corresponding to the medical data categories to retrieve medical data of the medical data categories. A view generator integrates the retrieved medical data to generate one or more views of second medical information and transmits the views of second medical information to a client device to be displayed on a display of the client device.
    Type: Grant
    Filed: October 31, 2014
    Date of Patent: April 13, 2021
    Assignee: TeraRecon, Inc.
    Inventor: Jeffrey Sorenson
  • Patent number: 10977442
    Abstract: Methods and apparatus, including computer program products, are provided for a contextualized bot framework.
    Type: Grant
    Filed: December 13, 2018
    Date of Patent: April 13, 2021
    Assignee: SAP SE
    Inventors: Natesan Sivagnanam, Jayananda A. Kotri
  • Patent number: 10971034
    Abstract: A method of automatically partitioning a refreshable braille display based on presence of pertinent ancillary alphanumeric content. In an unpartitioned configuration, every braille cell of the refreshable braille display is used to output the primary alphanumeric content. When the refreshable braille display outputs a segment of the primary alphanumeric content having associated ancillary alphanumeric content, such as a footnote or a comment, the braille display is automatically partitioned into a first partition and a second partition. The braille cells of the first partition are allocated for outputting the primary alphanumeric content, while the braille cells of the second partition are allocated for outputting the ancillary alphanumeric content.
    Type: Grant
    Filed: November 4, 2020
    Date of Patent: April 6, 2021
    Assignee: Freedom Scientific, Inc.
    Inventors: James T. Datray, Joseph Kelton Stephen, Glen Gordon
  • Patent number: 10970909
    Abstract: Embodiments of the present disclosure provide a method and an apparatus for eye movement synthesis, the method including: obtaining eye movement feature data and speech feature data, wherein the eye movement feature data reflects an eye movement behavior, and the speech feature data reflects a voice feature; obtaining a driving model according to the eye movement feature data and the speech feature data, wherein the driving model is configured to indicate an association between the eye movement feature data and the speech feature data; synthesizing an eye movement of a virtual human according to speech input data and the driving model; and controlling the virtual human to exhibit the synthesized eye movement. The embodiment makes the virtual human exhibit an eye movement corresponding to the voice data according to the eye movement feature data and the speech feature data, thereby improving the authenticity of the interaction.
    Type: Grant
    Filed: December 23, 2019
    Date of Patent: April 6, 2021
    Assignee: BEIHANG UNIVERSITY
    Inventors: Feng Lu, Qinping Zhao
  • Patent number: 10967223
    Abstract: Tracking and monitoring athletic activity offers individuals additional motivation to continue such behavior. An individual may track his or her athletic activity by completing goals. These goals may be represented by real-world objects such as food items, landmarks, buildings, statues, other physical structures, toys and the like. Each object may correspond to an athletic activity goal and require an amount of athletic activity to complete the goal. For example, a donut goal object may correspond to an athletic activity goal of burning 350 calories. The user may progress from goal object to goal object. Goal objects may increase in difficulty (e.g., amount of athletic activity required) and might only be available for selection upon completing an immediately previous goal object, a number of goal objects, an amount of athletic activity and the like.
    Type: Grant
    Filed: March 16, 2020
    Date of Patent: April 6, 2021
    Assignee: NIKE, Inc.
    Inventors: Michael T. Hoffman, Kwamina Crankson, Jason Nims
  • Patent number: 10963068
    Abstract: An input device is provided having at least a plurality of touch-sensitive back surfaces, with the ability to split into several independent units, with adjustable vertical and horizontal angles between those units; a method of dividing the keys of a keyboard into interface groups, each group having a home key and associated with a finger; a method of dynamically mapping and remapping the home keys of each interface group to the coordinates of the associated fingers at their resting position, and mapping non-home keys around their associated home keys on the touch-sensitive back surfaces; a method of calculating and remapping home keys and non-home keys when the coordinates of the fingers at the resting position shift during operation; a method of reading each activated key immediately and automatically; and a method of detecting typos and grammatical errors and notifying operators using human speech or other communication methods.
    Type: Grant
    Filed: October 11, 2019
    Date of Patent: March 30, 2021
    Inventor: Hovsep Giragossian
  • Patent number: 10943060
    Abstract: A collaborative content management system allows multiple users to access and modify collaborative documents. When audio data is recorded by or uploaded to the system, the audio data may be transcribed or summarized to improve accessibility and user efficiency. Text transcriptions are associated with portions of the audio data representative of the text, and users can search the text transcription and access the portions of the audio data corresponding to search queries for playback. An outline can be automatically generated based on a text transcription of audio data and embedded as a modifiable object within a collaborative document. The system associates hot words with actions to modify the collaborative document upon identifying the hot words in the audio data. Collaborative content management systems can also generate custom lexicons for users based on documents associated with the user for use in transcribing audio data, ensuring that text transcription is more accurate.
    Type: Grant
    Filed: October 18, 2019
    Date of Patent: March 9, 2021
    Assignee: Dropbox, Inc.
    Inventors: Timo Mertens, Bradley Neuberg
  • Patent number: 10942701
    Abstract: A method for performing voice dictation with an earpiece worn by a user includes receiving as input to the earpiece voice sound information from the user at one or more microphones of the earpiece, receiving as input to the earpiece user control information from one or more sensors within the earpiece independent from the one or more microphones of the earpiece, inserting a machine-generated transcription of the voice sound information from the user into a user input area associated with an application executing on a computing device and manipulating the application executing on the computing device based on the user control information.
    Type: Grant
    Filed: October 23, 2017
    Date of Patent: March 9, 2021
    Assignee: BRAGI GmbH
    Inventors: Peter Vincent Boesen, Luigi Belverato, Martin Steiner
  • Patent number: 10937412
    Abstract: According to an embodiment of the present invention, there is provided a terminal including a memory which stores a prosody correction model; a processor which corrects a first prosody prediction result of a text sentence to a second prosody prediction result based on the prosody correction model and generates a synthetic speech corresponding to the text sentence having a prosody according to the second prosody prediction result; and an audio output unit which outputs the generated synthetic speech.
    Type: Grant
    Filed: February 5, 2019
    Date of Patent: March 2, 2021
    Assignee: LG ELECTRONICS INC.
    Inventors: Jonghoon Chae, Sungmin Han, Yongchul Park, Siyoung Yang, Juyeong Jang
  • Patent number: 10936360
    Abstract: Embodiments of the present disclosure provide a method and device for processing a batch process including a plurality of content management service operations. The method comprises: determining, at a client, a batch process template associated with the batch process, the batch process template including shareable information and at least one variable field of the plurality of content management service operations; determining a value of the at least one variable field; generating, based on the determined batch process template and the value, a first request for performing the batch process template; and sending the first request to a server. Embodiments of the present disclosure further provide a corresponding method performed at a server side, and a corresponding device.
    Type: Grant
    Filed: September 20, 2017
    Date of Patent: March 2, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Wei Ruan, Jason Chen, Wei Zhou, Chen Wang, Zed Minhong
  • Patent number: 10923100
    Abstract: In some implementations, a language proficiency of a user of a client device is determined by one or more computers. The one or more computers then determines a text segment for output by a text-to-speech module based on the determined language proficiency of the user. After determining the text segment for output, the one or more computers generates audio data including a synthesized utterance of the text segment. The audio data including the synthesized utterance of the text segment is then provided to the client device for output.
    Type: Grant
    Filed: September 17, 2019
    Date of Patent: February 16, 2021
    Assignee: Google LLC
    Inventors: Matthew Sharifi, Jakob Nicolaus Foerster
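    A small sketch of the selection step, with invented proficiency levels and candidate phrasings; speech synthesis and delivery to the client device are omitted.
```python
# Invented example data: several candidate phrasings of the same content,
# keyed by the language-proficiency level they target.

VARIANTS = {
    "beginner":     "Your bus comes at 3 PM.",
    "intermediate": "Your bus is scheduled to arrive at 3 PM.",
    "advanced":     "The bus you usually take is scheduled to arrive at 3 PM sharp.",
}

def select_text(proficiency: str) -> str:
    """Choose the text segment to hand to the TTS module for this user."""
    return VARIANTS.get(proficiency, VARIANTS["intermediate"])

print(select_text("beginner"))   # Your bus comes at 3 PM.
```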
  • Patent number: 10922446
    Abstract: A computational accelerator for determination of linkages across disparate works in a model-based system engineering (MBSE) regime accesses textual content of MBSE works and performs preprocessing of each MBSE work to produce preprocessed data structures representing the MBSE works. The preprocessing gathers significant terms from each MBSE work and delineates the textual content of each MBSE work into segments corresponding to separately identifiable textual statements. Segment-wise comparison between segment pairings of the preprocessed data structures corresponding to different MBSE works is performed to produce a set of segment-wise comparison results based on terms common to each segment pairing, and statement-wise linkages between statements of the MBSE works are determined based on the set of segment-wise comparison results.
    Type: Grant
    Filed: December 18, 2017
    Date of Patent: February 16, 2021
    Assignee: Raytheon Company
    Inventors: Susan N. Gottschlich, Gregory S. Schrecke, Patrick M. Killian
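    A hypothetical sketch of segment-wise comparison using term overlap (Jaccard similarity); the stopword list, threshold, and example statements are invented, and the patent's actual scoring is not reproduced.
```python
# Gather significant terms per statement and link statement pairs from two
# works whose term overlap clears a threshold.

STOPWORDS = {"the", "a", "shall", "to", "of", "and", "be"}

def terms(segment: str) -> set:
    return {w.lower().strip(".,") for w in segment.split()} - STOPWORDS

def linkages(work_a: list, work_b: list, threshold: float = 0.3) -> list:
    links = []
    for i, seg_a in enumerate(work_a):
        for j, seg_b in enumerate(work_b):
            a, b = terms(seg_a), terms(seg_b)
            score = len(a & b) / len(a | b) if a | b else 0.0
            if score >= threshold:
                links.append((i, j, round(score, 2)))
    return links

reqs = ["The radar shall report target range and bearing."]
design = ["Range and bearing of each target are written to the track file."]
print(linkages(reqs, design))   # [(0, 0, 0.3)]
```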
  • Patent number: 10916236
    Abstract: An output device includes a memory and a processor coupled to the memory. The processor obtains an utterance command and an action command, analyzes an utterance content of the utterance command inputted after an action performed in response to the action command, modifies the action command based on a result of the analysis, and outputs the modified action command and the utterance command.
    Type: Grant
    Filed: March 11, 2019
    Date of Patent: February 9, 2021
    Assignee: FUJITSU LIMITED
    Inventors: Kaoru Kinoshita, Masayoshi Shimizu, Shinji Kanda
  • Patent number: 10902323
    Abstract: Methods and apparatus, including computer program products, are provided for a bot framework. In some implementations, there may be provided a method which may include receiving a request comprising a text string, the request corresponding to a request for handling by a bot; generating, from the request, at least one token; determining whether the at least one token matches at least one stored token mapped to an address; selecting the address in response to the match between the at least one token and the at least one stored token; and presenting, at a client interface associated with the bot, data obtained at the selected address in order to form a response to the request. Related systems, methods, and articles of manufacture are also disclosed.
    Type: Grant
    Filed: August 11, 2017
    Date of Patent: January 26, 2021
    Assignee: SAP SE
    Inventors: Natesan Sivagnanam, Abhishek Jain
  • Patent number: 10885904
    Abstract: A natural language processing system and method includes a computing device that applies a phonetic code algorithm to a received proper name uttered by a user and determines from a phonetic name database whether multiple different spellings of the name exist. The computing device recognizes an utterance of the user providing a natural language cue regarding the correct spelling of the name or provides a voice prompt to the user including a natural language cue regarding the correct spelling of the name, and converts the name to text including the correct spelling.
    Type: Grant
    Filed: November 21, 2018
    Date of Patent: January 5, 2021
    Assignee: MASTERCARD INTERNATIONAL INCORPORATED
    Inventor: Robert Collins
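    A simplified Soundex-style phonetic code as one possible instantiation (the abstract does not name a specific algorithm), plus a lookup of stored spellings that share the uttered name's code; the name list is invented.
```python
CODES = {**dict.fromkeys("bfpv", "1"), **dict.fromkeys("cgjkqsxz", "2"),
         **dict.fromkeys("dt", "3"), "l": "4", **dict.fromkeys("mn", "5"), "r": "6"}

def phonetic_code(name: str) -> str:
    """Simplified Soundex-like code: first letter plus up to three digits."""
    name = name.lower()
    out, prev = [], CODES.get(name[0], "")
    for ch in name[1:]:
        d = CODES.get(ch, "")
        if d and d != prev:
            out.append(d)
        prev = d
    return (name[0].upper() + "".join(out) + "000")[:4]

KNOWN_NAMES = ["Smith", "Smyth", "Jon", "John", "Robert"]   # invented database

def candidate_spellings(uttered: str) -> list:
    """Return stored spellings whose phonetic code matches the uttered name."""
    code = phonetic_code(uttered)
    return [n for n in KNOWN_NAMES if phonetic_code(n) == code]

print(candidate_spellings("Smith"))   # ['Smith', 'Smyth']
```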
  • Patent number: 10884771
    Abstract: The present disclosure provides a method and a device for displaying multi-language typesetting, a browser, a terminal and a computer readable storage medium. The method includes: obtaining a text to be typeset; identifying embedded language content in a principal language text of the text to be typeset, wherein the embedded language content comprises at least one non-principal language content embedded in the principal language text; determining replacement content of the embedded language content, wherein the replacement content comprises a principal language text corresponding to the embedded language content or an abbreviation of a non-principal language text in the embedded language content; and replacing the embedded language content with the replacement content.
    Type: Grant
    Filed: December 31, 2018
    Date of Patent: January 5, 2021
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventor: Qiugen Xiao
  • Patent number: 10885908
    Abstract: An embodiment of the present application discloses a method and apparatus for processing information. A specific implementation of the method comprises: receiving a weather-related voice request sent by a user; identifying the voice request, and obtaining weather information corresponding to the voice request; extracting key information based on the weather information to generate a weather data set; and feeding weather data in the weather data set back to the user. The implementation can help enrich the ways in which the weather information is broadcast and/or the content of the broadcasts.
    Type: Grant
    Filed: December 27, 2017
    Date of Patent: January 5, 2021
    Assignee: Baidu Online Network Technology (Beijing) Co., Ltd.
    Inventors: Hualong Zu, Haiguang Yuan, Ran Xu, Fang Duan, Chen Chen, Lei Shi
  • Patent number: 10878803
    Abstract: A method, device, and storage medium for converting text to speech are described. The method includes obtaining target text; synthesizing first machine speech corresponding to the target text; and selecting an asynchronous machine speech whose prosodic feature matches a prosodic feature of the first machine speech from an asynchronous machine speech library. The method also includes searching a synchronous machine speech library for a first synchronous machine speech corresponding to the asynchronous machine speech; synthesizing, based on a prosodic feature of the first synchronous machine speech, second machine speech corresponding to the target text; and selecting a second synchronous machine speech matching an acoustic feature of the second machine speech from the synchronous machine speech library. The method further includes splicing speaker speech units corresponding to the synchronous machine speech unit in a speaker speech library, to obtain a target speaker speech.
    Type: Grant
    Filed: March 22, 2019
    Date of Patent: December 29, 2020
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Haolei Yuan, Xiao Mei
  • Patent number: 10872692
    Abstract: Systems and methods are provided for data driven analysis, modeling, and semi-supervised machine learning for qualitative and quantitative determinations. The systems and methods include obtaining data associated with individuals, and determining features associated with the individuals based on the data and similarities among the individuals based on the features. The systems and methods can label some individuals as exemplary and generate a graph wherein nodes of the graph represent individuals, edges of the graph represent similarity among the individuals, and nodes associated with labeled individuals are weighted. The disclosed systems and methods can apply a weight to unweighted nodes of the graph based on propagating the labels through the graph, where the propagation is based on influence exerted by the weighted nodes on the unweighted nodes. The disclosed systems and methods can provide output associated with the individuals represented on the graph and the associated weights.
    Type: Grant
    Filed: August 31, 2018
    Date of Patent: December 22, 2020
    Assignee: GRAND ROUNDS, INC.
    Inventors: Seiji James Yamamoto, Ranjit Chacko
  • Patent number: 10872116
    Abstract: Disclosed herein are systems, devices, and methods for contextualizing media. In some variations, a method of organizing audio may comprise generating first graph data nodes from structured text data comprising a predetermined audio data model and generating second graph data nodes from unstructured data. The first and second graph data nodes may be associated with the audio. The one or more first graph data nodes may be linked to the one or more corresponding second graph data nodes using a natural language processing model.
    Type: Grant
    Filed: September 24, 2019
    Date of Patent: December 22, 2020
    Assignee: TIMECODE ARCHIVE CORP.
    Inventors: Ji Yim, Baik Hoh
  • Patent number: 10856067
    Abstract: A hearing protection device configured to measure the noise exposure of a user and to provide a time remaining estimate until a maximum noise exposure level is achieved. Generally, the hearing protection device may comprise a sound sealing section, a microphone configured to generate a detected sound signal indicative of the measured sound level at the user's ear canal, a memory/storage configured to record the detected sound signal as sound exposure history, a timer configured to measure the amount of time the hearing protection device is exposed to sound, and a processor configured to receive a timer signal from the timer and the detected sound signal to calculate a time remaining estimate until reaching the maximum noise exposure level. Additionally, the maximum noise exposure level may be based on industry regulations or personalized hearing thresholds.
    Type: Grant
    Filed: October 4, 2019
    Date of Patent: December 1, 2020
    Assignee: Honeywell International Inc.
    Inventors: John Jenkins, Neal Muggleton, Viggo Henriksen, May Wilson, Trym Holter, Claes Haglund
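    An arithmetic sketch of the time-remaining estimate, assuming an 85 dBA criterion level with a 3 dB exchange rate as one common regulatory choice; the patent leaves the actual limit to regulations or personalized thresholds.
```python
def allowed_hours(level_db: float) -> float:
    """Permissible exposure time at a constant level (3 dB exchange rate)."""
    return 8.0 / 2 ** ((level_db - 85.0) / 3.0)

def time_remaining(history, current_db: float) -> float:
    """history: (level_dB, hours) pairs already logged; returns the hours left
    at the current level before the accumulated dose reaches 100%."""
    dose = sum(hours / allowed_hours(level) for level, hours in history)
    return max(0.0, (1.0 - dose) * allowed_hours(current_db))

# One hour already spent at 91 dBA uses half the daily dose, so roughly
# two hours remain at a steady 88 dBA.
print(time_remaining([(91.0, 1.0)], current_db=88.0))   # 2.0
```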
  • Patent number: 10845909
    Abstract: Method, apparatus, and computer-readable media for touch and speech interface, with audio location, includes structure and/or function whereby at least one processor: (i) receives a touch input from a touch device; (ii) establishes a touch-speech time window; (iii) receives a speech input from a speech device; (iv) determines whether the speech input is present in a global dictionary; (v) determines a location of a sound source from the speech device; (vi) determines whether the touch input and the location of the speech input are both within a same region; (vii) if the speech input is in the dictionary, determines whether the speech input has been received within the window; and (viii) if the speech input has been received within the window, and the touch input and the speech input are both within the same region, activates an action corresponding to both the touch input and the speech input.
    Type: Grant
    Filed: May 30, 2019
    Date of Patent: November 24, 2020
    Inventors: David Popovich, David Douglas Springgay, David Frederick Gurnsey
  • Patent number: 10841702
    Abstract: A BGM signal of sound of BGM is outputted over time, and a sound effect signal of sound of a sound effect is outputted at a timing determined based on information processing. If it is determined that a sound intensity at a predetermined frequency component of the sound effect to be outputted is low or a sound intensity at the predetermined frequency component of the BGM is high, adjustment is performed such that the sound intensity at the predetermined frequency component of the BGM to be outputted at the same timing as the sound effect is decreased. Then, a final sound signal including the sound of the sound effect and the sound of the BGM on which the adjustment has been performed, is synthesized and outputted.
    Type: Grant
    Filed: January 31, 2020
    Date of Patent: November 17, 2020
    Assignee: Nintendo Co., Ltd.
    Inventors: Kaoru Kita, Yoshito Sekigawa
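    A toy sketch of the band-level decision described above; the analysis band, thresholds, and the roughly -6 dB cut are arbitrary choices for illustration, and mixing of the final signal is omitted.
```python
import numpy as np

RATE, BAND = 48_000, (900.0, 1100.0)        # invented analysis band (Hz)

def band_energy(signal: np.ndarray, band=BAND, rate=RATE) -> float:
    """Energy of the signal inside the predetermined frequency band."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), 1.0 / rate)
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return float(np.sum(spectrum[mask] ** 2))

def bgm_band_gain(sfx: np.ndarray, bgm: np.ndarray,
                  sfx_floor: float = 1e3, bgm_ceiling: float = 1e6) -> float:
    """Gain for the BGM's band at the timing the sound effect is output:
    duck when the effect is weak, or the BGM is strong, in that band."""
    if band_energy(sfx) < sfx_floor or band_energy(bgm) > bgm_ceiling:
        return 0.5                           # roughly -6 dB in the band
    return 1.0

t = np.arange(RATE // 10) / RATE
sfx = 0.2 * np.sin(2 * np.pi * 1000 * t)     # effect centered in the band
bgm = 0.8 * np.sin(2 * np.pi * 1000 * t)     # BGM also strong in that band
print(bgm_band_gain(sfx, bgm))               # 0.5
```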
  • Patent number: 10832652
    Abstract: A method is performed by at least one processor, and includes acquiring training speech data by concatenating speech segments having a lowest target cost among candidate concatenation solutions, and extracting training speech segments of a first annotation type from the training speech data, the first annotation type being used for annotating that a speech continuity of a respective one of the training speech segments is superior to a preset condition.
    Type: Grant
    Filed: August 14, 2017
    Date of Patent: November 10, 2020
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Haolei Yuan, Fuzhang Wu, Binghua Qian
  • Patent number: 10825444
    Abstract: The present disclosure provides a speech synthesis method and apparatus, a computer device and a readable medium. The method comprises: when problematic speech appears in speech splicing and synthesis, predicting a time length of a state of each phoneme corresponding to a target text corresponding to the problematic speech and a base frequency of each frame, according to pre-trained time length predicting model and base frequency predicting model; according to the time length of the state of each phoneme corresponding to the target text and the base frequency of each frame, using a pre-trained speech synthesis model to synthesize speech corresponding to the target text; wherein the time length predicting model, the base frequency predicting model and the speech synthesis model are all obtained by training based on a speech library resulting from speech splicing and synthesis.
    Type: Grant
    Filed: December 7, 2018
    Date of Patent: November 3, 2020
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Yu Gu, Xiaohui Sun
  • Patent number: 10817782
    Abstract: A system for textual analysis of task performances. The system includes a receiving module operating on at least a server configured to receive at least a request for a task performance. The system includes a language processing module operating on the at least a server configured to parse the at least a request for a task performance and retrieve at least a task performance datum, categorize the at least a request for a task performance to at least a task performance list, and assign the at least a request for a task performance to a task performance owner. The system includes a task generator module configured to generate at least a task performance data element containing a task performance list label and a priority label.
    Type: Grant
    Filed: July 23, 2019
    Date of Patent: October 27, 2020
    Inventor: Joseph Rando
  • Patent number: 10810590
    Abstract: There is provided a client device, method and system for facilitating a payment from a customer to a merchant. Payment is carried out upon use of voice data for authentication of a user and subsequent transmission of a payment authorization message.
    Type: Grant
    Filed: December 19, 2017
    Date of Patent: October 20, 2020
    Assignee: MASTERCARD ASIA/PACIFIC PTE. LTD.
    Inventors: Frédéric Fortin, Rajat Maheshwari, Benjamin Charles Gilbey
  • Patent number: 10810381
    Abstract: A speech converter for inter-translation between Chinese and the official language of the other party includes a matching card for forming unidirectional matches of a word, a phrase, a word group, a short sentence, a common expression, and a sentence expressing the same meaning. The matching cards include a word matching card, a phrase matching card, a word group matching card, a short sentence matching card, a common expression matching card and a sentence matching card. The present invention enables a user to translate what he/she wants to say into a language that a listener can understand immediately, so that the listener can hear and understand these words and answer immediately, and the answer is sent back in Mandarin.
    Type: Grant
    Filed: July 10, 2018
    Date of Patent: October 20, 2020
    Assignee: SHENZHEN TONGYIKA TECHNOLOGY CO., LTD.
    Inventor: Yong Chen
  • Patent number: 10806388
    Abstract: Methods and apparatus to identify an emotion evoked by media are disclosed. An example apparatus includes a synthesizer to generate a first synthesized sample based on a pre-verbal utterance associated with a first emotion. A feature extractor is to identify a first value of a first feature of the first synthesized sample. The feature extractor to identify a second value of the first feature of first media evoking an unknown emotion. A classification engine is to create a model based on the first feature. The model is to establish a relationship between the first value of the first feature and the first emotion. The classification engine is to identify the first media as evoking the first emotion when the model indicates that the second value corresponds to the first value.
    Type: Grant
    Filed: October 16, 2017
    Date of Patent: October 20, 2020
    Assignee: The Nielsen Company (US), LLC
    Inventors: Robert T. Knight, Ramachandran Gurumoorthy, Alexander Topchy, Ratnakar Dev, Padmanabhan Soundararajan, Anantha Pradeep
  • Patent number: 10795671
    Abstract: Audiovisual documentation of source code in an integrated development environment. A computing device initiates a knowledge transfer session for discussion of source code and generation of audiovisual source code documentation explaining segments of source code from a code base. An audiovisual interface containing a segment of code from the code base is displayed within the integrated development environment. Audio during the knowledge transfer session is recorded with a recording device. Code tracking indicators from an optical tracking device operated by a user are received when the user is reviewing and focused on the segment of code. The computing device determines via the code tracking indicators a module of the segment of code under review. Portions of the recorded audio are associated with the determined module of the segment of code to generate audiovisual source code documentation. The knowledge transfer session is terminated.
    Type: Grant
    Filed: November 21, 2017
    Date of Patent: October 6, 2020
    Assignee: International Business Machines Corporation
    Inventors: Aniya Aggarwal, Danish Contractor, Varun Parashar
  • Patent number: 10783890
    Abstract: In a particular aspect, a speech generator includes a signal input configured to receive a first audio signal. The speech generator also includes at least one speech signal processor configured to generate a second audio signal based on information associated with the first audio signal and based further on automatic speech recognition (ASR) data associated with the first audio signal.
    Type: Grant
    Filed: April 26, 2019
    Date of Patent: September 22, 2020
    Assignee: Moore Intellectual Property Law, PLLC
    Inventors: Erik Visser, Shuhua Zhang, Lae-Hoon Kim, Yinyi Guo, Sunkuk Moon
  • Patent number: 10780350
    Abstract: A robot, including: a controller that communicates with a computing device, wherein the computing device executes a video game and renders a primary video feed of the video game to a display device, the primary video feed providing a first view into a virtual space that is defined by the executing video game; a camera that captures images of a local environment in which the robot is disposed; a projector; wherein the controller processes the images of the local environment to identify a projection surface in the local environment; wherein the computing device generates a secondary video feed of the video game and transmits the secondary video feed of the video game to the robot, the secondary video feed providing a second view into the virtual space; wherein the controller of the robot activates the projector to project the secondary video feed onto the projection surface in the local environment.
    Type: Grant
    Filed: December 18, 2018
    Date of Patent: September 22, 2020
    Assignee: Sony Interactive Entertainment Inc.
    Inventors: Michael Taylor, Jeffrey Roger Stafford