Patents Examined by Farzad Kazeminezhad
  • Patent number: 11961507
    Abstract: A transcription of a query for content discovery is generated, and a context of the query is identified, as well as a first plurality of candidate entities to which the query refers. A search is performed based on the context of the query and the first plurality of candidate entities, and results are generated for output. A transcription of a second voice query is then generated, and it is determined whether the second transcription includes a trigger term indicating a corrective query. If so, the context of the first query is retrieved. A second term of the second query similar to a term of the first query is identified, and a second plurality of candidate entities to which the second term refers is determined. A second search is performed based on the second plurality of candidates and the context, and new search results are generated for output.
    Type: Grant
    Filed: March 2, 2023
    Date of Patent: April 16, 2024
    Assignee: Rovi Guides, Inc.
    Inventors: Jeffry Copps Robert Jose, Sindhuja Chonat Sri
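The corrective-query flow this abstract describes (detect a trigger term, retrieve the first query's context, search again) can be sketched in a few lines. This is an illustrative Python sketch, not the patented implementation: `handle_query`, the trigger list, and the toy context identification are all assumptions, and `search` stands in for a real search backend.

```python
def handle_query(transcript, history, search,
                 triggers=("no,", "i meant", "not that")):
    """Return search results, reusing the prior context on a corrective query.

    history holds dicts with the "context" of earlier queries;
    search(terms, context) is a stand-in for the actual search backend.
    """
    lowered = transcript.lower()
    corrective = bool(history) and any(t in lowered for t in triggers)
    if corrective:
        context = history[-1]["context"]          # retrieve first query's context
    else:
        context = {"topic": lowered.split()[0]}   # toy context identification
    history.append({"query": lowered, "context": context})
    return search(lowered.split(), context)

history = []
fake_search = lambda terms, ctx: (terms, ctx)
first = handle_query("Play Dune", history, fake_search)
second = handle_query("No, I meant the 1984 movie", history, fake_search)
```

On the second call the trigger term "no," is detected, so the first query's context is reused rather than recomputed.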
  • Patent number: 11942091
    Abstract: Speech processing techniques are disclosed that enable determining a text representation of alphanumeric sequences in captured audio data. Various implementations include determining a contextual biasing finite state transducer (FST) based on contextual information corresponding to the captured audio data. Additional or alternative implementations include modifying probabilities of one or more candidate recognitions of the alphanumeric sequence using the contextual biasing FST, where the FST further comprises a grammar as well as a speller finite state transducer.
    Type: Grant
    Filed: January 17, 2020
    Date of Patent: March 26, 2024
    Assignee: GOOGLE LLC
    Inventors: Benjamin Haynor, Petar Aleksic
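The core idea of this abstract, biasing a recognizer toward context-consistent hypotheses, can be illustrated without the FST machinery. In the patent the biasing is done by composing a contextual FST (with grammar and speller components); in this hedged sketch a plain log-probability boost stands in for that composition, and all names are illustrative.

```python
def bias_candidates(candidates, contextual_phrases, boost=2.0):
    """Boost the score of candidate recognitions that match the context.

    candidates: list of (text, log_prob) pairs from a speech recognizer.
    contextual_phrases: strings made likely by the context, e.g. a
        tracking-number prefix on a shipping page.
    Returns the highest-scoring candidate after biasing.
    """
    rescored = []
    for text, log_prob in candidates:
        if any(phrase in text for phrase in contextual_phrases):
            log_prob += boost  # favour context-consistent hypotheses
        rescored.append((text, log_prob))
    return max(rescored, key=lambda pair: pair[1])

# The alphanumeric reading starts less probable but wins after biasing.
best = bias_candidates(
    [("1Z 999 AA1", -4.0), ("one zee nine", -3.5)],
    contextual_phrases=["1Z"],
)
```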
  • Patent number: 11942071
    Abstract: An information processing system includes at least one memory storing a program and at least one processor. The at least one processor implements the program to input a piece of sound source data obtained by encoding a first identification data representative of a sound source, a piece of style data obtained by encoding a second identification data representative of a performance style, and synthesis data representative of sounding conditions into a synthesis model generated by machine learning, and to generate, using the synthesis model, feature data representative of acoustic features of a target sound of the sound source to be generated in the performance style and according to the sounding conditions, and to generate an audio signal corresponding to the target sound using the generated feature data.
    Type: Grant
    Filed: May 4, 2021
    Date of Patent: March 26, 2024
    Assignee: YAMAHA CORPORATION
    Inventors: Ryunosuke Daido, Merlijn Blaauw, Jordi Bonada
  • Patent number: 11935549
    Abstract: An apparatus for encoding an audio signal includes: a core encoder for core encoding first audio data in a first spectral band; a parametric coder for parametrically coding second audio data in a second spectral band being different from the first spectral band, wherein the parametric coder includes: an analyzer for analyzing first audio data in the first spectral band to obtain a first analysis result and for analyzing second audio data in the second spectral band to obtain a second analysis result; a compensator for calculating a compensation value using the first analysis result and the second analysis result; and a parameter calculator for calculating a parameter from the second audio data in the second spectral band using the compensation value.
    Type: Grant
    Filed: August 11, 2022
    Date of Patent: March 19, 2024
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Sascha Disch, Franz Reutelhuber, Jan Büthe, Markus Multrus, Bernd Edler
  • Patent number: 11908478
    Abstract: A method for generating speech includes uploading a reference set of features that were extracted from sensed movements of one or more target regions of skin on faces of one or more reference human subjects in response to words articulated by the subjects and without contacting the one or more target regions. A test set of features is extracted from the sensed movements of at least one of the target regions of skin on a face of a test subject in response to words articulated silently by the test subject and without contacting the one or more target regions. The extracted test set of features is compared to the reference set of features, and, based on the comparison, a speech output is generated that includes the articulated words of the test subject.
    Type: Grant
    Filed: March 7, 2023
    Date of Patent: February 20, 2024
    Assignee: Q (Cue) Ltd.
    Inventors: Aviad Maizels, Avi Barliya, Yonatan Wexler
  • Patent number: 11908484
    Abstract: An apparatus for generating an enhanced signal from an input signal, wherein the enhanced signal has spectral values for an enhancement spectral region, the spectral values for the enhancement spectral regions not being contained in the input signal, includes a mapper for mapping a source spectral region of the input signal to a target region in the enhancement spectral region, the source spectral region including a noise-filling region; and a noise filler configured for generating first noise values for the noise-filling region in the source spectral region of the input signal and for generating second noise values for a noise region in the target region, wherein the second noise values are decorrelated from the first noise values or for generating second noise values for a noise region in the target region, wherein the second noise values are decorrelated from first noise values in the source region.
    Type: Grant
    Filed: January 19, 2022
    Date of Patent: February 20, 2024
    Assignee: Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.
    Inventors: Sascha Disch, Ralf Geiger, Andreas Niedermeier, Matthias Neusinger, Konstantin Schmidt, Stephan Wilde, Benjamin Schubert, Christian Neukam
  • Patent number: 11900959
    Abstract: A plurality of pieces of emotional state information corresponding to a plurality of speech frames in a current utterance are obtained based on a first neural network model; a statistical operation is performed on the plurality of pieces of emotional state information to obtain a statistical result, and then the emotional state information corresponding to the current utterance is obtained based on a second neural network model, the statistical result corresponding to the current utterance, and statistical results corresponding to a plurality of utterances before the current utterance.
    Type: Grant
    Filed: October 15, 2021
    Date of Patent: February 13, 2024
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Yang Zhang, Oxana Verkholyak, Alexey Karpov, Li Qian
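The statistical pooling step between the two models in this abstract (per-frame emotion scores condensed into an utterance-level summary) is easy to sketch. This is a hedged illustration, not the patented models: `pool_frame_emotions` and the mean/standard-deviation choice of statistics are assumptions.

```python
import statistics

def pool_frame_emotions(frame_probs):
    """Pool per-frame emotion probabilities into utterance-level statistics.

    frame_probs: list of dicts mapping emotion label -> probability, one
        dict per speech frame (as a first model might produce).
    Returns mean and standard deviation per label: the kind of statistical
    result a second model could consume alongside results from past
    utterances.
    """
    labels = frame_probs[0].keys()
    stats = {}
    for label in labels:
        values = [frame[label] for frame in frame_probs]
        stats[label] = {
            "mean": statistics.fmean(values),
            "stdev": statistics.pstdev(values),
        }
    return stats

pooled = pool_frame_emotions([
    {"happy": 0.8, "sad": 0.2},
    {"happy": 0.6, "sad": 0.4},
])
```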
  • Patent number: 11893359
    Abstract: This application discloses an audio processing method and a terminal. The method may include: collecting, by a first terminal, an original speech of a first user, translating the original speech of the first user into a translated speech of the first user, receiving an original speech of a second user that is sent by a second terminal, and translating the original speech of the second user into a translated speech of the second user; sending at least one of the original speech of the first user, the translated speech of the first user, and the translated speech of the second user to the second terminal based on a first setting; and playing at least one of the original speech of the second user, the translated speech of the second user, and the translated speech of the first user based on a second setting.
    Type: Grant
    Filed: April 14, 2021
    Date of Patent: February 6, 2024
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Xin Zhang, Gan Zhao
  • Patent number: 11881224
    Abstract: The present invention provides a multilingual speech recognition and translation method for a conference. The conference includes at least one attendee, and the method includes: receiving, at a server, at least one piece of audio data and at least one piece of video data generated by at least one terminal apparatus; analyzing the video data to generate a video recognition result related to an attendance status and ethnicity of the attendee, and a body movement and a facial movement of the attendee when talking; generating at least one language family recognition result according to the video recognition result and the audio data, and obtaining a plurality of audio segments corresponding to the attendee; performing speech recognition on and translating the audio segments; and displaying a translation result on the terminal apparatus. The method further determines a quantity of conference attendees according to their respective distances from their device microphones.
    Type: Grant
    Filed: August 5, 2021
    Date of Patent: January 23, 2024
    Assignee: PEGATRON CORPORATION
    Inventors: Yueh-Tung Wu, Jun-Ying Li
  • Patent number: 11869525
    Abstract: A method is described that processes an audio signal. A discontinuity between a filtered past frame and a filtered current frame of the audio signal is removed using linear predictive filtering.
    Type: Grant
    Filed: February 3, 2022
    Date of Patent: January 9, 2024
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e. V.
    Inventors: Emmanuel Ravelli, Manuel Jander, Grzegorz Pietrzyk, Martin Dietz, Marc Gayer
  • Patent number: 11862184
    Abstract: An apparatus for processing an encoded audio signal, which includes a sequence of access units, each access unit including a core signal with a first spectral width and parameters describing a spectrum above the first spectral width, has a demultiplexer generating, from an access unit of the encoded audio signal, the core signal and a set of the parameters, an upsampler upsampling the core signal of the access unit and outputting a first upsampled spectrum and a timely consecutive second upsampled spectrum, the first upsampled spectrum and the second upsampled spectrum, both, having a same content as the core signal and having a second spectral width being greater than the first spectral width of the core spectrum, a parameter converter converting parameters of the set of parameters of the access unit to obtain converted parameters, and a spectral gap filling processor processing the first upsampled spectrum and the second upsampled spectrum using the converted parameters.
    Type: Grant
    Filed: August 19, 2021
    Date of Patent: January 2, 2024
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Andreas Niedermeier, Sascha Disch
  • Patent number: 11853699
    Abstract: A method and system for extracting and labeling Named-Entity Recognition (NER) data in a target language for use in a multi-lingual software module has been developed. First, a textual sentence is translated to the target language using a translation module. A named entity is identified and extracted within the translated sentence. The named entity is identified by either: exact mapping; a semantically similar translated named entity that meets a predetermined minimum threshold of similarity; or utilizing a rule-based library for the target language. Once identified, the named entity is labeled with a pre-determined category and stored in a retrievable electronic database.
    Type: Grant
    Filed: January 29, 2021
    Date of Patent: December 26, 2023
    Inventors: Shubham Mehrotra, Ankit Chadha
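The three-tier entity identification this abstract lists (exact mapping, then semantic similarity above a threshold, then a rule-based library) maps naturally onto a fallback chain. This is a hedged sketch under stated assumptions: `identify_entity`, the 0.85 threshold, and the toy `similarity` and rule library are all illustrative, and real semantic similarity would use embeddings rather than string equality.

```python
def identify_entity(sentence, entity_map, rules, similarity, threshold=0.85):
    """Return (token, category) for the first named entity found, or None.

    sentence: a sentence already translated into the target language.
    entity_map: known entity -> category (the "exact mapping" tier).
    rules: list of (predicate, category) pairs (the rule-based tier).
    similarity: callable scoring two strings in [0, 1].
    """
    tokens = sentence.split()
    for token in tokens:                       # 1. exact mapping
        if token in entity_map:
            return token, entity_map[token]
    for token in tokens:                       # 2. semantic similarity
        for name, category in entity_map.items():
            if similarity(token, name) >= threshold:
                return token, category
    for token in tokens:                       # 3. rule-based library
        for rule, category in rules:
            if rule(token):
                return token, category
    return None

# Toy usage: a digit-pattern rule catches an entity the map has never seen.
rules = [(str.isdigit, "NUMBER")]
found = identify_entity(
    "pedido 12345 enviado", {"madrid": "CITY"},
    rules, similarity=lambda a, b: 1.0 if a == b else 0.0)
```

Once identified, the (token, category) pair would be stored in the retrievable database the abstract mentions.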
  • Patent number: 11842155
    Abstract: Systems and methods for matching entities to target objects using an ensemble model are disclosed. The ensemble model includes a general trained machine learning (ML) model (which is trained using the entirety of a training dataset) and a subarea trained ML model (which is trained using a subset of the training dataset corresponding to a specific, defined subarea) that provides potential matches to a meta-model of the ensemble model to generate a final match. The ensemble model may also include a general trained natural language processing (NLP) model and a subarea trained NLP model that provides potential matches to the meta-model. The meta-model of a quad-ensemble ML model combines the four potential matches (such as probabilities and similarities of matching specific pairs of targets objects and entities) to generate a final match (such as a final probability used to identify the final match).
    Type: Grant
    Filed: November 21, 2022
    Date of Patent: December 12, 2023
    Assignee: Intuit Inc.
    Inventors: Natalie Bar Eliyahu, Noga Noff, Omer Wosner, Yair Horesh
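The meta-model step of the quad-ensemble (combining four per-candidate match probabilities into one final match) can be illustrated with a weighted average. This is a deliberately simplified sketch: the patent's meta-model is trained, whereas here fixed weights and the function and candidate names are all assumptions.

```python
def meta_match(general_ml, subarea_ml, general_nlp, subarea_nlp, weights=None):
    """Combine four base-model match probabilities into a final match.

    Each argument maps candidate target object -> match probability, as
    produced by one of the four base models (general/subarea ML, general/
    subarea NLP). A weighted average stands in for the trained meta-model.
    Returns (best_candidate, combined_probability).
    """
    weights = weights or [0.25, 0.25, 0.25, 0.25]
    sources = [general_ml, subarea_ml, general_nlp, subarea_nlp]
    candidates = set().union(*sources)
    combined = {
        c: sum(w * src.get(c, 0.0) for w, src in zip(weights, sources))
        for c in candidates
    }
    best = max(combined, key=combined.get)
    return best, combined[best]

best, score = meta_match(
    {"acct_a": 0.6, "acct_b": 0.4},   # general ML model
    {"acct_a": 0.9},                  # subarea ML model
    {"acct_b": 0.7},                  # general NLP model
    {"acct_a": 0.8},                  # subarea NLP model
)
```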
  • Patent number: 11842721
    Abstract: The system provides a synthesized speech response to a voice input, based on the prosodic character of the voice input. The system receives the voice input and calculates at least one prosodic metric of the voice input. The at least one prosodic metric can be associated with a word, phrase, grouping thereof, or the entire voice input. The system also determines a response to the voice input, which may include the sequence of words that form the response. The system generates the synthesized speech response, by determining prosodic characteristics based on the response, and on the prosodic character of the voice input. The system outputs the synthesized speech response, which includes a more natural, relevant, or both answer to the call of the voice input. The prosodic character of the voice input and/or response may include pitch, note, duration, prominence, timbre, rate, and rhythm, for example.
    Type: Grant
    Filed: August 5, 2022
    Date of Patent: December 12, 2023
    Assignee: Rovi Guides, Inc.
    Inventors: Ankur Aher, Jeffry Copps Robert Jose
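Two pieces of this abstract are straightforward to sketch: computing a prosodic metric of the voice input, and letting it influence the prosody of the synthesized response. This is an illustrative sketch only; the function names, the speaking-rate metric, and the linear blend are assumptions, and a real system would handle pitch, prominence, timbre, and rhythm similarly.

```python
def prosodic_rate(words, duration_seconds):
    """Words per second of the voice input: one simple prosodic metric."""
    return len(words) / duration_seconds

def plan_response_prosody(input_rate, baseline_rate=2.5, mix=0.5):
    """Blend the synthesizer's baseline speaking rate toward the caller's.

    mix=0 keeps the baseline; mix=1 fully mirrors the input, so the
    synthesized response sounds more natural relative to the voice input.
    """
    return baseline_rate + mix * (input_rate - baseline_rate)

# A fast four-word query nudges the TTS rate above its baseline.
rate = prosodic_rate(["where", "is", "my", "order"], duration_seconds=1.0)
tts_rate = plan_response_prosody(rate)
```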
  • Patent number: 11836457
    Abstract: According to one embodiment, a signal processing apparatus correlates a plurality of communication terminals as a group and enables one-to-many communications in the group. The signal processing apparatus includes processing circuitry. The processing circuitry assigns a transmission right to one of the communication terminals in the group. The processing circuitry generates text data based on voice data from said one of the communication terminals in possession of the transmission right. The processing circuitry gives a texting completion notice indicative of completion of texting processing to the communication terminals in the group. The processing circuitry transmits, after the texting completion notice is given, the generated text data to at least one of the communication terminals in the group.
    Type: Grant
    Filed: December 29, 2022
    Date of Patent: December 5, 2023
    Inventors: Hidekazu Hiraoka, Kazuaki Okimoto, Katsumi Yokomichi
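The ordering the abstract specifies (grant a transmission right, text the holder's voice, notify the group that texting completed, then deliver the text) can be sketched as a small protocol class. This is a hedged illustration, not the patented circuitry: `GroupChannel` and its method names are assumptions, and `speech_to_text` stands in for the actual texting processing.

```python
class GroupChannel:
    """Minimal one-to-many group flow: one terminal holds the transmission
    right; its voice data is texted, the group is notified of completion,
    and only then is the text delivered."""

    def __init__(self, terminals, speech_to_text):
        self.terminals = set(terminals)
        self.speech_to_text = speech_to_text
        self.holder = None

    def grant(self, terminal):
        """Assign the transmission right to one terminal in the group."""
        assert terminal in self.terminals
        self.holder = terminal

    def relay(self, sender, voice_data):
        """Text the holder's voice and return (notices, deliveries)."""
        if sender != self.holder:
            raise PermissionError("sender does not hold the transmission right")
        text = self.speech_to_text(voice_data)
        # Texting-completion notice goes to every terminal first ...
        notices = {t: "texting-complete" for t in self.terminals}
        # ... then the text itself is delivered to the other terminals.
        deliveries = {t: text for t in self.terminals if t != sender}
        return notices, deliveries

channel = GroupChannel({"a", "b", "c"}, speech_to_text=lambda v: v.upper())
channel.grant("a")
notices, deliveries = channel.relay("a", "hello")
```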
  • Patent number: 11804210
    Abstract: The present disclosure provides techniques for graphics translation. A plurality of natural language image descriptions is collected for an image of a product. An overall description for the image is generated using one or more models, based on the plurality of natural language image descriptions, by: identifying a set of shared descriptors used in at least a subset of the plurality of natural language image descriptions, and aggregating the set of shared descriptors to form the overall description. A first request to provide a description of the first image is received, and the overall description is returned in response to the first request, where the overall description is output using one or more text-to-speech techniques.
    Type: Grant
    Filed: July 27, 2021
    Date of Patent: October 31, 2023
    Assignee: Toshiba Global Commerce Solutions Holdings Corporation
    Inventors: Manda Miller, Kirk Goldman, Jon A. Hoffman, John Pistone, Dimple Nanwani, Theodore Clark
  • Patent number: 11798555
    Abstract: A system of reducing transmissions of packetized data in a voice activated data packet based computer network environment is provided. A natural language processor component can parse an input audio signal to identify a request and a trigger keyword. Based on the input audio signal, a direct action application programming interface can generate a first action data structure, and a content selector component can select a content item. An interface management component can identify candidate interfaces and determine whether prior instances of the packetized data were transmitted to the candidate interfaces. The interface management component can prevent the transmission of the packetized data if it is determined to be redundant, such as when a candidate interface has previously received the data, and instead transmit it to a separate client device of a different device type.
    Type: Grant
    Filed: August 3, 2021
    Date of Patent: October 24, 2023
    Assignee: GOOGLE LLC
    Inventors: Gaurav Bhaya, Tarun Jain, Anshul Kothari
  • Patent number: 11776527
    Abstract: Methods and systems for voice-based identification of related products/services are provided. Exemplary systems may include a wireless communication-based tag reader that polls for a wireless transmission-based tag and reads information associated with the wireless transmission-based tag and a processor that executes instructions to identify a product/service associated with the wireless transmission-based tag, identify a plurality of products/services stored in a product/service database identified as related to the product/service associated with the wireless transmission-based tag based on a trend related to prior purchases to identify a related product/service, and generate a voice-based utterance based on the identified set of one or more related products/services.
    Type: Grant
    Filed: September 11, 2020
    Date of Patent: October 3, 2023
    Assignee: DIGIPRINT IP LLC
    Inventor: Avery Levy
  • Patent number: 11769505
    Abstract: Example techniques involve systems with multiple acoustic echo cancellers. An example implementation captures first audio within an acoustic environment and detects, within the captured first audio content, a wake-word. In response to the wake-word and before playing an acknowledgement tone, the implementation activates (a) a first sound canceller when one or more speakers are playing back audio content or (b) a second sound canceller when the one or more speakers are idle. In response to the wake-word and after activating either (a) the first sound canceller or (b) the second sound canceller, the implementation outputs the acknowledgement tone via the one or more speakers. The implementation captures second audio within the acoustic environment and cancels the acoustic echo of the acknowledgement tone from the captured second audio using the activated sound canceller.
    Type: Grant
    Filed: April 11, 2022
    Date of Patent: September 26, 2023
    Assignee: Sonos, Inc.
    Inventor: Saeed Bagheri Sereshki
  • Patent number: 11755840
    Abstract: Extracting data from documents is challenging due to the variation in structure, content, and styles across geographies and functional areas. Further, complex relation types are characterized by one or more of N-ary entity mention arguments, cross-sentence span of entity mentions for a relation mention, missing entity mention arguments, and entity mention arguments being multi-valued. The present disclosure addresses these gaps in the art to extract entity mentions and relation mentions using a joint neural network model including two sequence labelling layers which are trained jointly. The mentions are extracted from documents to facilitate downstream processing. A first RNN layer creates sentence embeddings for each sentence in the document being processed and predicts entity mentions. A second RNN layer predicts labels for each sentence span corresponding to a relation type.
    Type: Grant
    Filed: June 11, 2021
    Date of Patent: September 12, 2023
    Assignee: TATA CONSULTANCY SERVICES LIMITED
    Inventors: Sachin Sharad Pawar, Nitin Ramrakhiyani, Girish Keshav Palshikar, Anindita Sinha Banerjee, Rajiv Srivastava, Devavrat Shailesh Thosar