Dynamic Time Warping Patents (Class 704/241)
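The patents below are classified under dynamic time warping (DTW), the classic algorithm for aligning two sequences that may proceed at different rates. As background, here is a minimal DTW sketch (an illustration of the technique, not taken from any listed patent):

```python
def dtw_distance(a, b):
    """Return the cumulative DTW cost of aligning sequences a and b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])          # local distance
            cost[i][j] = d + min(cost[i - 1][j],      # a[i-1] repeated
                                 cost[i][j - 1],      # b[j-1] repeated
                                 cost[i - 1][j - 1])  # one-to-one step
    return cost[n][m]
```

Because the warping path may stretch either sequence, two series with the same shape but different lengths can still align at zero cost.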
  • Patent number: 11902690
    Abstract: Techniques performed by a data processing system for a machine learning driven teleprompter include displaying a teleprompter transcript associated with a presentation on a display of a computing device associated with a presenter; receiving audio content of the presentation including speech of the presenter in which the presenter is reading the teleprompter transcript; analyzing the audio content of the presentation using a first machine learning model to obtain a real-time textual representation of the audio content, the first machine learning model being a natural language processing model trained to receive audio content including speech and to translate the audio content into a textual representation of the speech; analyzing the real-time textual representation and the teleprompter transcript with a second machine learning model to obtain transcript position information; and automatically scrolling the teleprompter transcript on the display of the computing device based on the transcript position information.
    Type: Grant
    Filed: January 19, 2022
    Date of Patent: February 13, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Chakkaradeep Chinnakonda Chandran, Stephanie Lorraine Horn, Michael Jay Gilmore, Tarun Malik, Sarah Zaki, Tiffany Michelle Smith, Shivani Gupta, Pranjal Saxena, Ridhima Gupta
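The second model's task of locating the spoken words within the transcript can be approximated with a simple sliding-window word match. A hypothetical sketch, not Microsoft's actual model:

```python
def transcript_position(recognized_words, transcript_words, window=5):
    """Return the index in transcript_words just past the span that best
    matches the last `window` recognized words: a crude scroll target."""
    tail = recognized_words[-window:]
    best_idx, best_score = 0, -1
    for i in range(len(transcript_words) - len(tail) + 1):
        # count exact word matches between the tail and this window
        score = sum(1 for a, b in zip(tail, transcript_words[i:i + len(tail)])
                    if a == b)
        if score > best_score:
            best_idx, best_score = i, score
    return best_idx + len(tail)
```

A production system would tolerate recognition errors (e.g. with edit distance) rather than require exact word matches.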
  • Patent number: 11849193
    Abstract: Methods, systems, and apparatuses are described to implement voice search in media content: requesting media content of a video clip of a scene contained in the media content streamed to the client device; capturing the voice request for the media content of the video clip to display at the client device, wherein the streamed media content is a selected video streamed from a video source; applying an NLP solution to convert the voice request to text for matching to a set of one or more words contained in at least closed caption text of the selected video; associating matched words of the closed caption text with a start index and an end index of the video clip contained in the selected video; and streaming the video clip to the client device based on the start index and the end index associated with the matched closed caption text.
    Type: Grant
    Filed: October 19, 2022
    Date of Patent: December 19, 2023
    Inventor: Mayank Verma
  • Patent number: 11830473
    Abstract: A system for synthesising expressive speech includes: an interface configured to receive an input text for conversion to speech; a memory; and at least one processor coupled to the memory. The processor is configured to generate, using an expressivity characterisation module, a plurality of expression vectors, wherein each expression vector is a representation of prosodic information in a reference audio style file, and synthesise expressive speech from the input text, using an expressive acoustic model comprising a deep convolutional neural network that is conditioned by at least one of the plurality of expression vectors.
    Type: Grant
    Filed: September 29, 2020
    Date of Patent: November 28, 2023
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Jesus Monge Alvarez, Holly Francois, Hosang Sung, Seungdo Choi, Kihyun Choo, Sangjun Park
  • Patent number: 11785278
    Abstract: Alignment between closed caption and audio/video content may be improved by determining text associated with a portion of the audio or a portion of the video and comparing the determined text to a portion of closed caption text. Based on the comparison, a delay may be determined and the audio/video content may be buffered based on the determined delay.
    Type: Grant
    Filed: March 18, 2022
    Date of Patent: October 10, 2023
    Assignee: Comcast Cable Communications, LLC
    Inventor: Christopher Stone
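The delay between captions and audio can be estimated by time-stamping words recognized from the audio and comparing them against caption cue times. A minimal sketch under assumed `(word, time)` inputs, not Comcast's implementation:

```python
def estimate_caption_delay(asr_words, caption_words):
    """asr_words / caption_words: lists of (word, time_seconds).
    Returns the median offset between matching words, i.e. how far
    the captions lag (positive) or lead (negative) the audio."""
    caption_times = {}
    for word, t in caption_words:
        caption_times.setdefault(word, []).append(t)
    offsets = []
    for word, t in asr_words:
        if caption_times.get(word):
            offsets.append(caption_times[word].pop(0) - t)
    offsets.sort()
    return offsets[len(offsets) // 2] if offsets else 0.0
```

The median makes the estimate robust to a few misrecognized words; the result can then drive the buffering the abstract describes.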
  • Patent number: 11736773
    Abstract: Systems and methods for generating audible pronunciation of a closed captioning word in a content item. For example, a system generates for output on a first device a content item comprising dialogue. The system generates for display on the first device a closed captioning word corresponding to the dialogue where the closed captioning word is selectable via a user interface of the first device. The system receives a selection of the closed captioning word via the user interface of the first device. In response to receiving the selection of the closed captioning word, the system generates for playback on the first device at least a portion of the dialogue corresponding to the selected closed captioning word.
    Type: Grant
    Filed: October 15, 2021
    Date of Patent: August 22, 2023
    Assignee: Rovi Guides, Inc.
    Inventor: Serhad Doken
  • Patent number: 11683558
    Abstract: Methods, apparatus, systems and articles of manufacture are disclosed to determine the speed-up of media programs using speech recognition. An example apparatus disclosed herein is to perform speech recognition on a first audio clip collected by a media meter to recognize a first text string associated with the first audio clip, compare the first text string to a plurality of reference text strings associated with a corresponding plurality of reference audio clips to identify a matched one of the reference text strings, and estimate a presentation rate of the first audio clip based on a first time associated with the first audio clip and a second time associated with a first one of the reference audio clips corresponding to the matched one of the reference text strings.
    Type: Grant
    Filed: December 29, 2021
    Date of Patent: June 20, 2023
    Assignee: THE NIELSEN COMPANY (US), LLC
    Inventor: Morris Lee
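Once the recognized text is matched to a reference clip, the speed-up falls out of the ratio of the two durations. A hypothetical sketch with exact-text matching standing in for the string comparison:

```python
def estimate_speedup(clip_text, clip_duration, references):
    """references: list of (text, duration_seconds) for reference clips.
    Find the reference whose text matches the recognized clip text and
    return the presentation-rate factor (>1.0 means sped-up playback)."""
    for ref_text, ref_duration in references:
        if ref_text == clip_text:
            return ref_duration / clip_duration
    return None  # no matching reference clip
```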
  • Patent number: 11615800
    Abstract: A speaker recognition system for assessing the identity of a speaker through a speech signal based on speech uttered by said speaker is provided. The system includes a framing module that subdivides the speech signal over time into a set of frames, and a filtering module that analyzes the frames of the set to discard frames affected by noise and frames which do not comprise a speech, based on a spectral analysis of the frames. A feature extraction module extracts audio features from frames which have not been discarded, and a classification module processes the audio features extracted from the frames which have not been discarded for assessing the identity of the speaker.
    Type: Grant
    Filed: April 18, 2018
    Date of Patent: March 28, 2023
    Assignee: TELECOM ITALIA S.p.A.
    Inventors: Igor Bisio, Cristina Fra', Chiara Garibotto, Fabio Lavagetto, Andrea Sciarrone, Massimo Valla
  • Patent number: 11509969
    Abstract: Methods, Systems, and Apparatuses are described to implement voice search in media content for requesting media content of a video clip of a scene contained in the media content streamed to the client device; for capturing the voice request for the media content of the video clip to display at the client device wherein the streamed media content is a selected video streamed from a video source; for applying a NLP solution to convert the voice request to text for matching to a set of one or more words contained in at least close caption text of the selected video; for associating matched words to close caption text with a start index and an end index of the video clip contained in the selected video; and for streaming the video clip to the client device based on the start index and the end index associated with matched closed caption text.
    Type: Grant
    Filed: May 25, 2021
    Date of Patent: November 22, 2022
    Inventor: Mayank Verma
  • Patent number: 11445266
    Abstract: Audiovisual content in the form of video clip files, whether streamed or broadcast, may also contain subtitles. Such subtitles carry timing information so that each subtitle is displayed synchronously with the spoken words. At times, however, this synchronization with the audio portion of the audiovisual content has a timing offset, which becomes bothersome when it exceeds a predetermined threshold. The system and method determine time spans in which a human speaks and attempt to synchronize those time spans with the subtitle content. An indication is provided when an incurable synchronization error exists, as well as when the subtitles and audio are well synchronized. When an offset exists, the system further determines the type of offset (constant or dynamic) and provides the adjustment information needed to correct the provided subtitle timing and resolve the synchronization deficiency.
    Type: Grant
    Filed: March 12, 2021
    Date of Patent: September 13, 2022
    Assignee: IChannel.IO Ltd.
    Inventor: Oren Jack Maurice
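The constant-versus-dynamic distinction can be made from per-span offsets (speech-span start minus subtitle start): a constant offset is fixable with a single shift, a drifting one needs a time-varying correction. A simplified sketch with a hypothetical tolerance parameter:

```python
def classify_offset(offsets, tol=0.1):
    """offsets: per-span (speech start - subtitle start) values in seconds.
    Returns 'synchronized', 'constant', or 'dynamic'."""
    mean = sum(offsets) / len(offsets)
    spread = max(offsets) - min(offsets)
    if abs(mean) <= tol and spread <= tol:
        return "synchronized"
    if spread <= tol:
        return "constant"   # one global shift repairs the timing
    return "dynamic"        # offset drifts; needs per-span correction
```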
  • Patent number: 11270123
    Abstract: Embodiments described herein provide a system for localized contextual video annotation. During operation, the system can segment a video into a plurality of segments based on a segmentation unit and parse a respective segment for generating multiple input modalities for the segment. A respective input modality can indicate a form of content in the segment. The system can then classify the segment into a set of semantic classes based on the input modalities and determine an annotation for the segment based on the set of semantic classes.
    Type: Grant
    Filed: September 21, 2020
    Date of Patent: March 8, 2022
    Assignee: Palo Alto Research Center Incorporated
    Inventors: Karunakaran Sureshkumar, Raja Bala
  • Patent number: 11223878
    Abstract: An electronic device is disclosed. The electronic device comprises: a microphone for receiving voice; a memory for storing a plurality of text sets; and a processor for converting the voice, received via the microphone, into text, searching for words common to the converted text with respect to each of the plurality of text sets, and determining at least one text set of the plurality of text sets on the basis of the ratio of the searched common words.
    Type: Grant
    Filed: October 25, 2018
    Date of Patent: January 11, 2022
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventor: Jae Hyun Bae
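The selection step, picking the stored text set with the highest ratio of words in common with the converted speech, can be sketched as follows (an illustration of the idea, not Samsung's implementation):

```python
def best_text_set(spoken_text, text_sets):
    """Return the stored text set sharing the highest ratio of common
    words with the recognized utterance."""
    spoken = set(spoken_text.lower().split())

    def ratio(text_set):
        words = set(text_set.lower().split())
        return len(spoken & words) / len(words) if words else 0.0

    return max(text_sets, key=ratio)
```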
  • Patent number: 11205419
    Abstract: Low energy deep-learning networks for generating auditory features such as mel frequency cepstral coefficients in audio processing pipelines are provided. In various embodiments, a first neural network is trained to output auditory features such as mel-frequency cepstral coefficients, linear predictive coding coefficients, perceptual linear predictive coefficients, spectral coefficients, filter bank coefficients, and/or spectro-temporal receptive fields based on input audio samples. A second neural network is trained to output a classification based on input auditory features such as mel-frequency cepstral coefficients. An input audio sample is provided to the first neural network. Auditory features such as mel-frequency cepstral coefficients are received from the first neural network. The auditory features such as mel-frequency cepstral coefficients are provided to the second neural network. A classification of the input audio sample is received from the second neural network.
    Type: Grant
    Filed: August 28, 2018
    Date of Patent: December 21, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Davis Barch, Andrew S. Cassidy, Myron D. Flickner
  • Patent number: 11195527
    Abstract: Apparatus and method for processing speech recognition may include a speech recognition module that recognizes a voice uttered from a user, and a processing module that calls a user DB where information associated with the user is registered when a voice command of the user is input by the speech recognition module, verifies setting information related to a domain corresponding to the voice command, and processes the voice command through a content provider linked to the associated domain.
    Type: Grant
    Filed: August 1, 2019
    Date of Patent: December 7, 2021
    Assignees: Hyundai Motor Company, Kia Motors Corporation
    Inventor: Jae Min Joh
  • Patent number: 11178463
    Abstract: An electronic device is disclosed. The electronic device comprises: a microphone for receiving voice; a memory for storing a plurality of text sets; and a processor for converting the voice, received via the microphone, into text, searching for words common to the converted text with respect to each of the plurality of text sets, and determining at least one text set of the plurality of text sets on the basis of the ratio of the searched common words.
    Type: Grant
    Filed: October 25, 2018
    Date of Patent: November 16, 2021
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventor: Jae Hyun Bae
  • Patent number: 11032620
    Abstract: Methods, systems, and apparatuses are described to implement voice search in media content: requesting media content of a video clip of a scene contained in the media content streamed to the client device; capturing the voice request for the media content of the video clip to display at the client device, wherein the streamed media content is a selected video streamed from a video source; applying an NLP solution to convert the voice request to text for matching to a set of one or more words contained in at least closed caption text of the selected video; associating matched words of the closed caption text with a start index and an end index of the video clip contained in the selected video; and streaming the video clip to the client device based on the start index and the end index associated with the matched closed caption text.
    Type: Grant
    Filed: February 14, 2020
    Date of Patent: June 8, 2021
    Assignee: SLING MEDIA PVT LTD
    Inventor: Mayank Verma
  • Patent number: 11024318
    Abstract: A method of speaker verification comprises: comparing a test input against a model of a user's speech obtained during a process of enrolling the user; obtaining a first score from comparing the test input against the model of the user's speech; comparing the test input against a first plurality of models of speech obtained from a first plurality of other speakers respectively; obtaining a plurality of cohort scores from comparing the test input against the plurality of models of speech obtained from a plurality of other speakers; obtaining statistics describing the plurality of cohort scores; modifying said statistics to obtain adjusted statistics; normalising the first score using the adjusted statistics to obtain a normalised score; and using the normalised score for speaker verification.
    Type: Grant
    Filed: November 15, 2019
    Date of Patent: June 1, 2021
    Assignee: Cirrus Logic, Inc.
    Inventors: John Paul Lesso, Gordon Richard McLeod
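Cohort-based score normalization of this kind is commonly a z-norm: standardize the raw trial score by statistics of the cohort scores, here with hypothetical `shift`/`scale` parameters standing in for the patent's "adjusted statistics":

```python
import statistics

def normalized_score(raw_score, cohort_scores, shift=0.0, scale=1.0):
    """Normalize a speaker-verification score by cohort statistics.

    The cohort mean/stdev are optionally adjusted (shift, scale) before
    standardizing, so a single threshold works across speakers."""
    mu = statistics.mean(cohort_scores) + shift       # adjusted mean
    sigma = statistics.stdev(cohort_scores) * scale   # adjusted stdev
    return (raw_score - mu) / sigma
```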
  • Patent number: 11011177
    Abstract: A voice identification method comprises: obtaining audio data, and extracting an audio feature of the audio data; determining whether a voice identification feature having a similarity with the audio feature above a preset matching threshold exists in an associated feature library; and in response to determining that the voice identification feature exists in the associated feature library, updating, by using the audio feature, the voice identification feature obtained through matching.
    Type: Grant
    Filed: June 14, 2018
    Date of Patent: May 18, 2021
    Assignee: ALIBABA GROUP HOLDING LIMITED
    Inventors: Gang Liu, Qingen Zhao, Guangxing Liu
  • Patent number: 11010645
    Abstract: A method and system for an AI-based communication training system for individuals and organizations is disclosed. A video analyzer is used to convert a video signal into a plurality of human morphology features with an accompanying audio analyzer converting an audio signal into a plurality of human speech features. A transformation module transforms the morphology features and the speech features into a current multi-dimensional performance vector and combinatorial logic generates an integration of the current multi-dimensional performance vector and one or more prior multi-dimensional performance vectors to generate a multi-session rubric. Backpropagation logic applies a current multi-dimensional performance vector from the combinatorial logic to the video analyzer and the audio analyzer.
    Type: Grant
    Filed: August 26, 2019
    Date of Patent: May 18, 2021
    Assignee: TalkMeUp
    Inventors: JiaoJiao Xu, Yi Xu, Chenchen Zhu, Matthew Thomas Spettel
  • Patent number: 11005620
    Abstract: Various aspects described herein relate to techniques for uplink reference signal sequence design in wireless communications systems. A method, a computer-readable medium, and an apparatus are provided. In an aspect, the method includes identifying a set of sequences to include at least a base sequence, a reverse order sequence of the base sequence, a complex conjugate sequence of the base sequence, or a reverse order complex conjugate sequence of the base sequence, and transmitting an uplink reference signal based on at least one of the sequences in the set. The techniques described herein may apply to different communications technologies, including 5th Generation (5G) New Radio (NR) communications technology.
    Type: Grant
    Filed: June 14, 2018
    Date of Patent: May 11, 2021
    Assignee: QUALCOMM Incorporated
    Inventors: Seyong Park, Renqiu Wang, Yi Huang, Hao Xu, Peter Gaal
  • Patent number: 10884096
    Abstract: An object of the present invention is to facilitate recognition of a voice command of a user in a situation where multiple devices including microphones are connected through a sensor network. A relative location of each device is determined, and the location and direction of the user are tracked through the time differences at which the voice command arrives. The command is interpreted based on the location and the direction of the user. Such a method may be used for sensor networks, Machine to Machine (M2M), Machine Type Communication (MTC), and Internet of Things (IoT) applications, including intelligent services (smart home, smart building, etc.), digital education, security and safety related services, and the like.
    Type: Grant
    Filed: February 13, 2018
    Date of Patent: January 5, 2021
    Assignee: LUXROBO CO., LTD.
    Inventors: Seungmin Baek, Seungbae Son
  • Patent number: 10747957
    Abstract: In some applications, it may be desired to process a message to determine an intent of the message, where the intent indicates the meaning of the message. An intent classifier may be used to determine the meaning of a message by processing the message to compute a message embedding vector that represents the message in a vector space. Each possible intent may be represented by a prototype vector, and the intent of the message may be determined by comparing the message embedding to one or more prototype vectors, such as by selecting an intent whose prototype vector is closest to the message embedding. An intent classifier may be used, for example, (i) to implement an automated communications system with states where each state is associated with a subset of the possible intents or (ii) for processing usage data of a communications system to update the intents of the communications system.
    Type: Grant
    Filed: November 13, 2018
    Date of Patent: August 18, 2020
    Assignee: ASAPP, INC.
    Inventor: Jeremy Elliot Azriel Wohlwend
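The prototype-vector comparison reduces to nearest-neighbor classification in the embedding space. A minimal sketch, assuming precomputed embeddings and Euclidean distance (the patent leaves the metric open):

```python
def classify_intent(message_vec, prototypes):
    """prototypes: dict mapping intent name -> prototype vector.
    Return the intent whose prototype is closest to the message
    embedding (Euclidean distance)."""
    def dist(p):
        return sum((a - b) ** 2 for a, b in zip(message_vec, p)) ** 0.5

    return min(prototypes, key=lambda name: dist(prototypes[name]))
```

Restricting `prototypes` to the subset of intents allowed in the current dialog state gives the state-dependent behavior the abstract describes.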
  • Patent number: 10540991
    Abstract: In various example embodiments, a system and method for determining a crowd response for a crowd are presented. One method is disclosed that includes receiving an audio signal that includes concurrent responses from two or more respondents, determining the concurrent responses from the audio signal without regard to the identity of the respondents, and generating a crowd response based on the concurrent responses.
    Type: Grant
    Filed: August 20, 2015
    Date of Patent: January 21, 2020
    Assignee: eBay Inc.
    Inventor: Sergio Pinzon Gonzales, Jr.
  • Patent number: 10482893
    Abstract: A sound processing method includes a step of applying a nonlinear filter to a temporal sequence of spectral envelope of an acoustic signal, wherein the nonlinear filter smooths a fine temporal perturbation of the spectral envelope without smoothing out a large temporal change. A sound processing apparatus includes a smoothing processor configured to apply a nonlinear filter to a temporal sequence of spectral envelope of an acoustic signal, wherein the nonlinear filter smooths a fine temporal perturbation of the spectral envelope without smoothing out a large temporal change.
    Type: Grant
    Filed: November 1, 2017
    Date of Patent: November 19, 2019
    Assignee: YAMAHA CORPORATION
    Inventors: Ryunosuke Daido, Hiraku Kayama
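A classic nonlinear filter with exactly this property is the running median: unlike a moving average, it flattens brief frame-to-frame jitter while leaving genuine step changes sharp. A sketch of the general technique (the patent does not disclose its specific filter here):

```python
def median_smooth(seq, width=3):
    """Running median over a parameter track (e.g. one spectral-envelope
    coefficient per frame). Removes isolated spikes; preserves steps."""
    half = width // 2
    out = []
    for i in range(len(seq)):
        window = sorted(seq[max(0, i - half):i + half + 1])
        out.append(window[len(window) // 2])
    return out
```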
  • Patent number: 10451710
    Abstract: The present disclosure relates to a user identification method applicable to a vehicle, the vehicle including at least two microphone arrays, the respective microphone arrays being disposed at different positions of the vehicle, respectively. The user identification method includes: receiving a voice of a user within the vehicle through the at least two microphone arrays; determining directions from the user to the microphone arrays, respectively, according to the voice; calculating an angle between any two of the directions; and identifying a type of the user based at least on the angle.
    Type: Grant
    Filed: October 26, 2018
    Date of Patent: October 22, 2019
    Assignee: BOE TECHNOLOGY GROUP CO., LTD.
    Inventors: Hongyang Li, Xin Li, Xiangdong Yang
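Given the bearing from the user to each microphone array, the angle between the two bearings follows from the dot product. A hypothetical sketch with 2-D direction vectors (the patent works with in-cabin directions derived from the arrays):

```python
import math

def bearing_angle(dir1, dir2):
    """Angle in degrees between two direction vectors, e.g. the bearings
    from the user to two microphone arrays; usable to discriminate
    seating positions such as driver vs. rear passenger."""
    dot = sum(a * b for a, b in zip(dir1, dir2))
    n1 = math.hypot(*dir1)
    n2 = math.hypot(*dir2)
    return math.degrees(math.acos(dot / (n1 * n2)))
```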
  • Patent number: 10418028
    Abstract: Technologies for detecting an end of a sentence in automatic speech recognition are disclosed. An automatic speech recognition device may acquire speech data, and identify phonemes and words of the speech data. The automatic speech recognition device may perform a syntactic parse based on the recognized words, and determine an end of a sentence based on the syntactic parse. For example, if the syntactic parse indicates that a certain set of consecutive recognized words form a syntactically complete and correct sentence, the automatic speech recognition device may determine that there is an end of a sentence at the end of that set of words.
    Type: Grant
    Filed: November 15, 2017
    Date of Patent: September 17, 2019
    Assignee: Intel Corporation
    Inventors: Oren Shamir, Oren Pereg, Moshe Wasserblat, Jonathan Mamou, Michel Assayag
  • Patent number: 10402924
    Abstract: A system and method are presented for relationship management workflow processes. At least one embodiment may apply to process automation to health care. More specifically, the system and method may be applied to patient management of healthcare, such as the management of Diabetes or other medical conditions. Other embodiments may apply to process automation in other areas utilizing management workflow software.
    Type: Grant
    Filed: February 25, 2014
    Date of Patent: September 3, 2019
    Inventors: Zachary Hinkle, Jason Andrew Loucks, Logan H. Weilenman, Ryan Collins
  • Patent number: 10403267
    Abstract: A method of updating speech recognition data including a language model used for speech recognition, the method including obtaining language data including at least one word; detecting a word that does not exist in the language model from among the at least one word; obtaining at least one phoneme sequence regarding the detected word; obtaining components constituting the at least one phoneme sequence by dividing the at least one phoneme sequence into predetermined unit components; determining information regarding probabilities that the respective components constituting each of the at least one phoneme sequence appear during speech recognition; and updating the language model based on the determined probability information.
    Type: Grant
    Filed: January 16, 2015
    Date of Patent: September 3, 2019
    Assignee: Samsung Electronics Co., Ltd
    Inventors: Chi-youn Park, Il-hwan Kim, Kyung-min Lee, Nam-hoon Kim, Jae-won Lee
  • Patent number: 10360215
    Abstract: Pattern queries are evaluated in parallel over large N-dimensional datasets to identify features of interest.
    Type: Grant
    Filed: March 30, 2015
    Date of Patent: July 23, 2019
    Assignee: EMC Corporation
    Inventors: Angelo E. M. Ciarlini, Fabio A. M. Porto, Amir H. K. Moghadam, Jonas F. Bias, Paulo de Figueiredo Pires, Fabio A. Perosi, Alex L. Bordignon, Bruno Carlos da Cunha Costa, Wagner dos Santos Vieira
  • Patent number: 10311863
    Abstract: There is provided a system including a microphone configured to receive an input speech, an analog to digital (A/D) converter configured to convert the input speech to a digital form and generate a digitized speech including a plurality of segments having acoustic features, a memory storing an executable code, and a processor executing the executable code to extract a plurality of acoustic feature vectors from a first segment of the digitized speech, determine, based on the plurality of acoustic feature vectors, a plurality of probability distribution vectors corresponding to the probabilities that the first segment includes each of a first keyword, a second keyword, both the first keyword and the second keyword, a background, and a social speech, and assign a first classification label to the first segment based on an analysis of the plurality of probability distribution vectors of one or more segments preceding the first segment and the probability distribution vectors of the first segment.
    Type: Grant
    Filed: September 2, 2016
    Date of Patent: June 4, 2019
    Assignee: Disney Enterprises, Inc.
    Inventors: Jill Fain Lehman, Nikolas Wolfe, Andre Pereira
  • Patent number: 10217460
    Abstract: A speech recognition circuit comprises an input buffer for receiving processed speech parameters. A lexical memory contains lexical data for word recognition. The lexical data comprises a plurality of lexical tree data structures. Each lexical tree data structure comprises a model of words having common prefix components. An initial component of each lexical tree structure is unique. A plurality of lexical tree processors are connected in parallel to the input buffer for processing the speech parameters in parallel to perform parallel lexical tree processing for word recognition by accessing the lexical data in the lexical memory. A results memory is connected to the lexical tree processors for storing processing results from the lexical tree processors and lexical tree identifiers to identify lexical trees to be processed by the lexical tree processors.
    Type: Grant
    Filed: December 28, 2016
    Date of Patent: February 26, 2019
    Assignee: ZENTIAN LIMITED
    Inventor: Mark Catchpole
  • Patent number: 10036818
    Abstract: Method for reducing computational time in inversion of geophysical data to infer a physical property model (91), especially advantageous in full wavefield inversion of seismic data. An approximate Hessian is pre-calculated by computing the product of the exact Hessian and a sampling vector composed of isolated point diffractors (82), and the approximate Hessian is stored in computer hard disk or memory (83). The approximate Hessian is then retrieved when needed (99) for computing its product with the gradient (93) of an objective function or other vector. Since the approximate Hessian is very sparse (diagonally dominant), its product with a vector may therefore be approximated very efficiently with good accuracy. Once the approximate Hessian is computed and stored, computing its product with a vector requires no simulator calls (wavefield propagations) at all. The pre-calculated approximate Hessian can also be reused in the subsequent steps whenever necessary.
    Type: Grant
    Filed: July 14, 2014
    Date of Patent: July 31, 2018
    Assignee: ExxonMobil Upstream Research Company
    Inventors: Yaxun Tang, Sunwoong Lee
  • Patent number: 9886954
    Abstract: One or more context aware processing parameters and an ambient audio stream are received. One or more sound characteristics associated with the ambient audio stream are identified using a machine learning model. One or more actions to perform are determined using the machine learning model and based on the one or more context aware processing parameters and the identified one or more sound characteristics. The one or more actions are performed.
    Type: Grant
    Filed: September 30, 2016
    Date of Patent: February 6, 2018
    Assignee: Doppler Labs, Inc.
    Inventors: Jacob Meacham, Matthew Sills, Richard Fritz Lanman, III, Jeffrey Baker
  • Patent number: 9881604
    Abstract: A system and method for identifying special information is provided. Endpoints are defined within a voice recording. One or more of the endpoints are identified within the voice recording and the voice recording is partitioned into segments based on the identified endpoints. Elements of text are identified by applying speech recognition to each of the segments and a list of prompt list candidates are applied to the text elements. The segments with text elements that match one or more prompt list candidates are identified. Portions of the voice recording following the prompt list candidates that include special information are identified and the special information is rendered unintelligible within the voice recording.
    Type: Grant
    Filed: February 9, 2015
    Date of Patent: January 30, 2018
    Assignee: Intellisist, Inc.
    Inventors: Howard M. Lee, Steven Lutz, Gilad Odinak
  • Patent number: 9858922
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for caching speech recognition scores. In some implementations, one or more values comprising data about an utterance are received. An index value is determined for the one or more values. An acoustic model score for the one or more received values is selected, from a cache of acoustic model scores that were computed before receiving the one or more values, based on the index value. A transcription for the utterance is determined using the selected acoustic model score.
    Type: Grant
    Filed: June 23, 2014
    Date of Patent: January 2, 2018
    Assignee: Google Inc.
    Inventors: Eugene Weinstein, Sanjiv Kumar, Ignacio L. Moreno, Andrew W. Senior, Nikhil Prasad Bhat
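The cache lookup hinges on mapping nearby feature values to the same index so precomputed scores can be reused. A simplified sketch with a hypothetical quantization step standing in for the patent's index-value computation:

```python
def cached_score(features, cache, compute_score, quantize=1.0):
    """Look up an acoustic-model score by a quantized index of the
    feature values, computing and caching it on a miss."""
    index = tuple(round(f / quantize) for f in features)
    if index not in cache:
        cache[index] = compute_score(features)  # cache miss: compute once
    return cache[index]
```

Nearby feature vectors hash to the same index, so repeated similar frames skip the (expensive) acoustic-model evaluation.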
  • Patent number: 9837069
    Abstract: Technologies for detecting an end of a sentence in automatic speech recognition are disclosed. An automatic speech recognition device may acquire speech data, and identify phonemes and words of the speech data. The automatic speech recognition device may perform a syntactic parse based on the recognized words, and determine an end of a sentence based on the syntactic parse. For example, if the syntactic parse indicates that a certain set of consecutive recognized words form a syntactically complete and correct sentence, the automatic speech recognition device may determine that there is an end of a sentence at the end of that set of words.
    Type: Grant
    Filed: December 22, 2015
    Date of Patent: December 5, 2017
    Assignee: Intel Corporation
    Inventors: Oren Shamir, Oren Pereg, Moshe Wasserblat, Jonathan Mamou, Michel Assayag
  • Patent number: 9805111
    Abstract: A pattern analysing device (27) in a pattern processing node (21) of a data collection system (10) comprises a pattern updating unit equipped with a pattern collecting element configured to obtain an existing pattern of historical data according to at least one existing data model, where the existing pattern relates to an entity (11) associated with the data collection system, and obtain a further pattern of newer data according to a further data model, where the further pattern also relates to the entity, and a pattern updating element configured to compare the patterns with each other, determine if the existing data model can be mapped on the further data model, and update the existing pattern with the further pattern in relation to the historical data if the existing data model can be mapped on the further data model.
    Type: Grant
    Filed: October 4, 2010
    Date of Patent: October 31, 2017
    Assignee: TELEFONAKTIEBOLAGET L M ERICSSON
    Inventors: Johan Hjelm, Mattias Lidstrom, Mona Matti
  • Patent number: 9703350
    Abstract: The invention relates to an electronic device that includes a wake-up system that operates at a substantially low power level and is applied to wake up the electronic device from a sleep mode. The wake-up system comprises a sound transducer that converts a received sound signal to an electrical signal and a keyword detection logic that preliminarily identifies a speech energy profile that corresponds to at least one of a plurality of keywords in a part of the electrical signal. In some embodiments, a keyword finder is further activated to identify with an enhanced accuracy whether the at least one keyword exists in the part of the electrical signal, and generates a wake-up control to activate a host of the electronic device from its sleep mode.
    Type: Grant
    Filed: June 24, 2013
    Date of Patent: July 11, 2017
    Assignee: Maxim Integrated Products, Inc.
    Inventors: Vivek Nigam, Yadong Wang, Anthony Stephen Doy, Todd D. Moore
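    The two-stage structure in the abstract above (a cheap always-on gate, then a more accurate detector) can be sketched as follows. The energy-profile matching is a stand-in for a real keyword recognizer, and all thresholds and names are assumptions for the example.

    ```python
    def frame_energy(frames):
        return [sum(s * s for s in f) / len(f) for f in frames]

    def cheap_gate(frames, threshold=0.1):
        """Stage 1: low-power check -- does the energy profile look like speech?"""
        return any(e > threshold for e in frame_energy(frames))

    def accurate_keyword(frames, keyword_profile, tol=0.05):
        """Stage 2: runs only if stage 1 fires; compares the energy envelope
        against a stored keyword profile (stand-in for a real recognizer)."""
        env = frame_energy(frames)
        if len(env) != len(keyword_profile):
            return False
        return all(abs(a - b) <= tol for a, b in zip(env, keyword_profile))

    def wake_up(frames, keyword_profile):
        return cheap_gate(frames) and accurate_keyword(frames, keyword_profile)

    silence = [[0.0] * 4] * 3
    speechy = [[0.5, -0.5, 0.5, -0.5]] * 3       # energy 0.25 per frame
    profile = [0.25, 0.25, 0.25]
    print(wake_up(silence, profile), wake_up(speechy, profile))  # False True
    ```

    The design point is that `accurate_keyword` never runs on silence, so the host can stay asleep at minimal power until the gate fires.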
  • Patent number: 9684872
    Abstract: A method and an apparatus for generating data in a missing segment of a target time data sequence are disclosed. The method includes: determining whether there is a breakpoint in the missing segment; determining candidate values of the data in the missing segment; and generating values of the data in the missing segment by selectively using the candidate values of the data in the missing segment, according to whether there is the breakpoint in the missing segment. With the method and the apparatus, the data in the missing segment of the target time data sequence can be generated more accurately.
    Type: Grant
    Filed: June 4, 2015
    Date of Patent: June 20, 2017
    Assignee: International Business Machines Corporation
    Inventors: Wei S. Dong, Wen Q. Huang, Chang S. Li, Yu Wang, Junchi Yan, Chao Zhang, Xin Zhang, Xiu F. Zhu
  • Patent number: 9454976
    Abstract: A method is disclosed for discriminating voiced and unvoiced sounds in speech. The method detects characteristic waveform features of voiced and unvoiced sounds by applying integral and differential functions to the digitized sound signal in the time domain. Laboratory tests demonstrate extremely high reliability in separating voiced and unvoiced sounds. The method is very fast and computationally efficient. The method enables voice activation in resource-limited and battery-limited devices, including mobile devices, wearable devices, and embedded controllers. The method also enables reliable command identification in applications that recognize only predetermined commands. The method is suitable as a pre-processor for natural language speech interpretation, improving recognition and responsiveness. The method enables real-time coding or compression of speech according to the sound type, improving transmission efficiency.
    Type: Grant
    Filed: April 15, 2014
    Date of Patent: September 27, 2016
    Inventor: David Edward Newman
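    A common textbook way to separate voiced from unvoiced sounds in the time domain uses short-time energy and zero-crossing rate; it is shown below as an illustration of the voiced/unvoiced distinction, not as the integral/differential method the patent claims. Thresholds and signal shapes are assumptions for the example.

    ```python
    import math

    def zero_crossing_rate(frame):
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0)
        return crossings / (len(frame) - 1)

    def short_time_energy(frame):
        return sum(s * s for s in frame) / len(frame)

    def classify(frame, zcr_max=0.3, energy_min=0.01):
        """Voiced sounds: high energy, low ZCR. Unvoiced: high ZCR."""
        if short_time_energy(frame) < energy_min:
            return "silence"
        return "voiced" if zero_crossing_rate(frame) < zcr_max else "unvoiced"

    voiced = [math.sin(2 * math.pi * 5 * t / 100) for t in range(100)]  # slow wave
    unvoiced = [(-1) ** t * 0.2 for t in range(100)]                    # rapid flips
    print(classify(voiced), classify(unvoiced))   # voiced unvoiced
    ```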
  • Patent number: 9202520
    Abstract: This disclosure relates to systems and methods for determining when a user likes a piece of content based, at least in part, on analyzing user responses to the content. In one embodiment, the user's response may be monitored by audio and motion detection devices to determine when the user's vocals or movements are emulating the content. When the user's emulation exceeds a threshold amount the content may be designated as “liked.” In certain instances, a similar piece of content may be selected to play when the current content is finished.
    Type: Grant
    Filed: October 17, 2012
    Date of Patent: December 1, 2015
    Assignee: Amazon Technologies, Inc.
    Inventor: Joshua K. Tang
  • Patent number: 9190063
    Abstract: A speech recognition system includes distributed processing across a client and server for recognizing a spoken query by a user. A number of different speech models for different natural languages are used to support and detect a natural language spoken by a user. In some implementations an interactive electronic agent responds in the user's native language to facilitate a real-time, human-like dialog.
    Type: Grant
    Filed: October 31, 2007
    Date of Patent: November 17, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Ian Bennett, Bandi Ramesh Babu, Kishor Morkhandikar, Pallaki Gururaj
  • Patent number: 9165555
    Abstract: A method and system for training an automatic speech recognition system are provided. The method includes separating training data into speaker specific segments, and for each speaker specific segment, performing the following acts: generating spectral data, selecting a first warping factor and warping the spectral data, and comparing the warped spectral data with a speech model. The method also includes iteratively performing the steps of selecting another warping factor and generating another warped spectral data, comparing the other warped spectral data with the speech model, and if the other warping factor produces a closer match to the speech model, saving the other warping factor as the best warping factor for the speaker specific segment. The system includes modules configured to control a processor in the system to perform the steps of the method.
    Type: Grant
    Filed: November 26, 2014
    Date of Patent: October 20, 2015
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Vincent Goffin, Andrej Ljolje, Murat Saraclar
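    The iterative warping-factor search described above amounts to a grid search: warp the spectrum with each candidate factor, score it against the speech model, and keep the best. The nearest-neighbour warp and squared-error distance below are crude stand-ins chosen for the example.

    ```python
    def warp_spectrum(spectrum, alpha):
        """Linear frequency warp: bin i of the output reads from bin i/alpha
        (nearest neighbour), a crude stand-in for VTLN-style warping."""
        n = len(spectrum)
        return [spectrum[min(n - 1, int(round(i / alpha)))] for i in range(n)]

    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    def best_warping_factor(spectrum, model_spectrum, factors):
        """Grid search: keep the factor whose warped spectrum best matches
        the speech model, as in the training loop described above."""
        best, best_d = None, float("inf")
        for alpha in factors:
            d = distance(warp_spectrum(spectrum, alpha), model_spectrum)
            if d < best_d:
                best, best_d = alpha, d
        return best

    model = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
    # Speaker's spectrum: same peak shifted up to bin 2; compressing the
    # frequency axis with alpha = 0.5 maps it back onto the model.
    speaker = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
    print(best_warping_factor(speaker, model, [0.5, 1.0, 2.0]))  # -> 0.5
    ```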
  • Patent number: 9123350
    Abstract: A method and system for extracting audio features from an encoded bitstream for audio classification. The method comprises partially decoding the encoded bitstream; obtaining uniform window block size spectral coefficients of the encoded bitstream; and extracting audio features based on the uniform window block size spectral coefficients.
    Type: Grant
    Filed: December 14, 2005
    Date of Patent: September 1, 2015
    Assignee: Panasonic Intellectual Property Management Co., Ltd.
    Inventor: Ying Zhao
  • Patent number: 9076448
    Abstract: A real-time system incorporating speech recognition and linguistic processing for recognizing a spoken query by a user and distributed between client and server, is disclosed. The system accepts user's queries in the form of speech at the client where minimal processing extracts a sufficient number of acoustic speech vectors representing the utterance. These vectors are sent via a communications channel to the server where additional acoustic vectors are derived. Using Hidden Markov Models (HMMs), and appropriate grammars and dictionaries conditioned by the selections made by the user, the speech representing the user's query is fully decoded into text (or some other suitable form) at the server. This text corresponding to the user's query is then simultaneously sent to a natural language engine and a database processor where optimized SQL statements are constructed for a full-text search from a database for a recordset of several stored questions that best matches the user's query.
    Type: Grant
    Filed: October 10, 2003
    Date of Patent: July 7, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Ian M. Bennett, Bandi Ramesh Babu, Kishor Morkhandikar, Pallaki Gururaj
  • Patent number: 9066112
    Abstract: A method of designing a code book for super resolution encoding. The method includes, for example, via a processor, creating a first group of entries in the code book that includes a plurality of gray font values for encoding data; via the processor, creating a second group of entries in the code book that includes a set of values for each of the gray font values for decoding data; via the processor, creating a third group of entries in the code book that includes a pattern corresponding to each of the plurality of gray font values; and storing the code book in a database in communication with the processor.
    Type: Grant
    Filed: August 2, 2012
    Date of Patent: June 23, 2015
    Assignee: XEROX CORPORATION
    Inventors: Guo-Yau Lin, Farzin Blurfrushan
  • Patent number: 9025777
    Abstract: An audio signal decoder for providing a decoded multi-channel audio signal representation on the basis of an encoded multi-channel audio signal representation has a time warp decoder configured to selectively use individual audio channel specific time warp contours or a joint multi-channel time warp contour for a reconstruction of a plurality of audio channels represented by the encoded multi-channel audio signal representation.
    Type: Grant
    Filed: July 1, 2009
    Date of Patent: May 5, 2015
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Stefan Bayer, Sascha Disch, Ralf Geiger, Guillaume Fuchs, Max Neuendorf, Gerald Schuller, Bernd Edler
  • Patent number: 8996380
    Abstract: Systems and methods of synchronizing media are provided. A client device may be used to capture a sample of a media stream being rendered by a media rendering source. The client device sends the sample to a position identification module to determine a time offset indicating a position in the media stream corresponding to the sampling time of the sample, and optionally a timescale ratio indicating a speed at which the media stream is being rendered by the media rendering source based on a reference speed of the media stream. The client device calculates a real-time offset using a present time, a timestamp of the media sample, the time offset, and optionally the timescale ratio. The client device then renders a second media stream at a position corresponding to the real-time offset to be in synchrony to the media stream being rendered by the media rendering source.
    Type: Grant
    Filed: May 4, 2011
    Date of Patent: March 31, 2015
    Assignee: Shazam Entertainment Ltd.
    Inventors: Avery Li-Chun Wang, Rahul Powar, William Michael Mills, Christopher Jacques Penrose Barton, Philip Georges Inghelbrecht, Dheeraj Shankar Mukherjee
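    The real-time offset calculation described in the abstract combines the matched position, the elapsed wall-clock time since the sample, and the rendering speed. A minimal sketch, with illustrative variable names and numbers:

    ```python
    def real_time_offset(present_time, sample_timestamp, time_offset,
                         timescale_ratio=1.0):
        """Estimated current position in the reference media stream:
        the position at sampling time plus elapsed wall-clock time
        scaled by the rendering speed."""
        elapsed = present_time - sample_timestamp
        return time_offset + elapsed * timescale_ratio

    # A sample taken at t=100 s matched 42.0 s into the stream; 5 s later,
    # with the source playing at 1.02x speed, the stream should be near:
    pos = real_time_offset(present_time=105.0, sample_timestamp=100.0,
                           time_offset=42.0, timescale_ratio=1.02)
    print(pos)  # ~47.1
    ```

    A second device can then start its own copy of the stream at `pos` to render in synchrony with the source.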
  • Patent number: 8909518
    Abstract: A warping factor estimation system comprises a label information generation unit that outputs voice/non-voice label information, a warp model storage unit in which a probability model representing voice and non-voice occurrence probabilities is stored, and a warp estimation unit that calculates a warping factor in the frequency axis direction using the probability model representing voice and non-voice occurrence probabilities, voice and non-voice labels, and a cepstrum.
    Type: Grant
    Filed: September 22, 2008
    Date of Patent: December 9, 2014
    Assignee: NEC Corporation
    Inventor: Tadashi Emori
  • Patent number: 8909527
    Abstract: A method and system for training an automatic speech recognition system are provided. The method includes separating training data into speaker specific segments, and for each speaker specific segment, performing the following acts: generating spectral data, selecting a first warping factor and warping the spectral data, and comparing the warped spectral data with a speech model. The method also includes iteratively performing the steps of selecting another warping factor and generating another warped spectral data, comparing the other warped spectral data with the speech model, and if the other warping factor produces a closer match to the speech model, saving the other warping factor as the best warping factor for the speaker specific segment. The system includes modules configured to control a processor in the system to perform the steps of the method.
    Type: Grant
    Filed: June 24, 2009
    Date of Patent: December 9, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Vincent Goffin, Andrej Ljolje, Murat Saraclar
  • Patent number: 8909538
    Abstract: Improved methods of presenting speech prompts to a user as part of an automated system that employs speech recognition or other voice input are described. The invention improves the user interface by providing in combination with at least one user prompt seeking a voice response, an enhanced user keyword prompt intended to facilitate the user selecting a keyword to speak in response to the user prompt. The enhanced keyword prompts may be the same words as those a user can speak as a reply to the user prompt but presented using a different audio presentation method, e.g., speech rate, audio level, or speaker voice, than used for the user prompt. In some cases, the user keyword prompts are different words from the expected user response keywords, or portions of words, e.g., truncated versions of keywords.
    Type: Grant
    Filed: November 11, 2013
    Date of Patent: December 9, 2014
    Assignee: Verizon Patent and Licensing Inc.
    Inventor: James Mark Kondziela