Patents Issued on April 14, 2022
-
Publication number: 20220114986
Abstract: An afterimage compensation device includes: an afterimage area detector to receive an input image, and detect an afterimage area including an afterimage in the input image; an afterimage area corrector to detect a false detection area, and generate a corrected afterimage area, the false detection area being a part of a general area that is not detected as the afterimage area and surrounded in a plurality of directions by the detected afterimage area; and a compensation data generator to adjust a luminance of the corrected afterimage area to generate compensation data.
Type: Application
Filed: August 16, 2021
Publication date: April 14, 2022
Inventor: Jun Gyu LEE
-
Publication number: 20220114987
Abstract: The present disclosure provides a synchronous display method of a spliced screen, a display system, an electronic device and a computer readable medium. The spliced screen includes display screens spliced together, and the display method is based on wireless communication and includes: sending control information to the spliced screen N times at intervals so as to control the display screens to display simultaneously, where N is an integer not less than 2. The control information sent for the previous N-1 times includes first information and second information, and the control information sent for the Nth time at least includes the second information. The first information is configured for controlling the display screen receiving the control information to turn off a receiving component of the display screen, and the second information is configured for controlling the display screen receiving the control information to display after a preset time duration.
Type: Application
Filed: November 3, 2020
Publication date: April 14, 2022
Inventors: Genyu LIU, Tao LI, Xingqun JIANG, Chao YU, Quanzhong WANG
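The N-round control scheme this abstract describes can be sketched in a few lines. This is a hypothetical illustration only (function and field names are invented, not from the patent): rounds 1 through N-1 carry both the "turn off receiver" flag and the synchronized display delay, while round N carries at least the delay.

```python
def build_control_rounds(n, preset_delay_ms):
    """Build the N control messages sent to every display screen.

    Rounds 1..N-1 carry both the first information (turn off the
    receiving component) and the second information (display after a
    preset delay); round N carries at least the second information.
    """
    if n < 2:
        raise ValueError("N must be an integer >= 2")
    rounds = [
        {"turn_off_receiver": True, "display_after_ms": preset_delay_ms}
        for _ in range(n - 1)
    ]
    rounds.append({"display_after_ms": preset_delay_ms})
    return rounds
```

Sending the receiver-off flag in every round but the last lets screens that have already received the schedule stop listening, while the final round still reaches any screen that missed earlier rounds.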
-
Publication number: 20220114988
Abstract: A User-centric Enhanced Pathway (UEP) system may provide vehicle pathway guidance while driving using an augmented reality display system in the event of sun glare. The system uses a route observer module programmed to predict sun glare, by obtaining real-time information about the interior and exterior vehicle environment, heading, speed, and other data. The route observer module may also use an inward-facing camera to determine when the driver is squinting, which may increase the probability function that predicts when the driver is experiencing sun glare. When sun glare is predicted, the route observer sends the weighted prediction function and input signals to a decision module that uses vehicle speed, location, and user activity to determine appropriate guidance output for an enhanced pathway using an Augmented Reality (AR) module. The AR module may adjust brightness, contrast, and color of the AR information based on observed solar glare.
Type: Application
Filed: October 8, 2020
Publication date: April 14, 2022
Applicant: Ford Global Technologies, LLC
Inventors: Kwaku O. Prakah-Asante, Jian Wan, Prayat Hegde
-
Publication number: 20220114989
Abstract: A vehicle display system includes a vehicle detection device detecting at least one other vehicle present in surroundings of a host vehicle, a vehicle speed detection device detecting a speed of the host vehicle, a display device displaying the host vehicle and the at least one other vehicle respectively as vehicle icons, and a display control part configured to control a content of display of the display device. When the speed of the host vehicle is equal to or less than a predetermined value, the display control part is configured to blank out the display of, or display transparently, at least a part of a vehicle icon of a following vehicle detected by the vehicle detection device and positioned at a rear of the host vehicle in a driving lane of the host vehicle.
Type: Application
Filed: October 1, 2021
Publication date: April 14, 2022
Inventor: Hironori ITO
-
Publication number: 20220114990
Abstract: Some implementations can include a string bender including a body configured to attach to a bridge of a guitar or other stringed instrument. Some implementations can include a bridge with an integrated string bender. The string bender can be constructed to bend one or more strings, for example the B and/or G strings. The string bender can also be constructed to fit a variety of guitar styles, such as Fender Telecaster or Stratocaster-style guitars.
Type: Application
Filed: July 17, 2021
Publication date: April 14, 2022
Inventor: Waylon Dale Baker
-
Publication number: 20220114991
Abstract: A barrel for a musical instrument is disclosed herein. The barrel includes a top section, a bottom section, an adjustment ring and a bore through the center. The barrel and bore length are configured to be adjusted with the adjustment ring. The barrel may include haptic features which provide feedback to a user regarding adjustments to the length of the bore and barrel. A trilobe socket and trilobe plug may be included and are configured to connect to prevent the barrel from slipping when manipulated and provide structural support. A tapered bore and bore choke provide additional structural stability and acoustic flexibility to the barrel.
Type: Application
Filed: October 13, 2020
Publication date: April 14, 2022
Applicant: BBSR Holdings, LLC
Inventor: Bradford Behn
-
Publication number: 20220114992
Abstract: A pedal device for performance (1) has a pedal (4) where a player can perform a depressing operation with his/her foot. A rod (6) is movable in an axial direction in accordance with the depressing operation of the pedal (4). A detecting mechanism detects an amount of movement of the rod (6) and outputs a detection signal corresponding to the amount of movement. The detecting mechanism has an optical sensor (10) that is a non-contact sensor detecting an amount of displacement in the axial direction L of the rod (6) in a non-contact manner.
Type: Application
Filed: December 22, 2021
Publication date: April 14, 2022
Inventor: Yoshiaki MORI
-
Publication number: 20220114993
Abstract: A virtual instrument for real-time musical generation includes a musical rule set unit for defining musical rules, a time constrained pitch generator for synchronizing generated music, and an audio generator for generating audio signals, wherein the rule definitions describe real-time morphable music parameters, and said morphable music parameters are controllable directly by the real-time control signal. With this virtual instrument, the user can create new musical content in a simple and interactive way regardless of the level of musical training obtained before using the instrument.
Type: Application
Filed: September 24, 2019
Publication date: April 14, 2022
Applicant: GESTRUMENT AB
Inventors: Jesper NORDIN, Jonatan LILJEDAHL, Jonas KJELLBERG, Pär GUNNARS RISBERG
-
Publication number: 20220114994
Abstract: The present systems, devices, and methods generally relate to generating families of symbol sequences with controllable degree of correlation within and between them using quantum computers, and particularly to the exploitation of this capability to generate families of symbol sequences representing musical events such as, but not limited to, musical notes, musical chords, musical percussion strikes, musical time intervals, musical note intervals, and musical key changes that comprise a musical composition. Quantum random walks on graphs representing allowed transitions between musical events are also employed in some implementations.
Type: Application
Filed: December 17, 2021
Publication date: April 14, 2022
Inventors: Colin P. Williams, Gregory Gabrenya
-
Publication number: 20220114995
Abstract: Audio signal dereverberation can be carried out in accordance with instructions on a machine readable storage medium, using a processor. In an example, a location of a person in a room can be determined. An audio signal received from the location of the person can be captured using beamforming. Room properties can be determined based in part on a signal sweep of the room. A dereverberation parameter can be determined based in part on the location of the person and the room properties. The dereverberation parameter can be applied to the audio signal.
Type: Application
Filed: July 3, 2019
Publication date: April 14, 2022
Applicant: Hewlett-Packard Development Company, L.P.
Inventors: Srikanth Kuthuru, Sunil Bharitkar, Madhu Sudan Athreya
-
Publication number: 20220114996
Abstract: Rotor noise cancellation through the use of mechanical means for a personal aerial drone vehicle is disclosed. Active noise cancellation is achieved by creating an antiphase amplitude wave by modulation of the propeller blades, utilizing embedded magnets and an electromagnetic coil encircling the propeller blades. A noise level sensor signals the rotor control system to adjust the frequency of the electromagnetic field surrounding the rotor and control the speed of the rotor. An additional method comprises incorporating a phase lock loop within the control system configured to determine the frequencies corresponding to the rotors and generate corrective audio signals to achieve active noise cancellation.
Type: Application
Filed: October 3, 2021
Publication date: April 14, 2022
Inventor: Alan Richard Greenberg
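The core idea behind the antiphase wave in this abstract is destructive interference: a cancelling signal is the sample-by-sample inversion of the noise, so their sum is silence under ideal alignment. A minimal numeric sketch (illustrative only, not the patented mechanism):

```python
import math

def antiphase(samples):
    """Invert a sampled noise waveform to produce its cancelling antiphase."""
    return [-s for s in samples]

# Superposing a tone with its antiphase cancels it under ideal conditions
# (perfect amplitude and phase match; real systems must track both).
tone = [math.sin(2 * math.pi * 100 * t / 8000) for t in range(64)]
residual = [a + b for a, b in zip(tone, antiphase(tone))]
```

In practice the hard part, which the abstract addresses with a phase lock loop, is keeping the cancelling wave locked to the rotor frequency as it varies.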
-
Publication number: 20220114997
Abstract: An information processing apparatus including a noise reduction unit that reduces noise generated from an unmanned aerial vehicle, included in an audio signal picked up by a microphone mounted on the unmanned aerial vehicle, on the basis of state information on a noise source.
Type: Application
Filed: November 7, 2019
Publication date: April 14, 2022
Inventors: NAOYA TAKAHASHI, WEIHSIANG LIAO
-
Publication number: 20220114998
Abstract: First data comprising a first range of audio frequencies is received. The first range of audio frequencies corresponds to a predetermined cochlear region of a listener. Second data comprising a second range of audio frequencies is also received. Third data comprising a first modulated range of audio frequencies is acquired. The third data is acquired by modulating the first range of audio frequencies according to a stimulation protocol that is configured to provide neural stimulation of a brain of the listener. The second data and the third data are arranged to generate an audio composition from the second data and the third data.
Type: Application
Filed: December 20, 2021
Publication date: April 14, 2022
Applicant: BrainFM, Inc.
Inventor: Adam HEWETT
-
Publication number: 20220114999
Abstract: A voice system is provided for a moving machine driven by a driver who is exposed to the outside of the moving machine. The system includes: a noise estimating section which estimates a future noise state based on information related to a noise generation factor; and a voice control section which changes an attribute of the voice to be output to the driver in accordance with the estimated noise state.
Type: Application
Filed: May 30, 2019
Publication date: April 14, 2022
Inventors: Masanori KINUHATA, Daisuke KAWAI, Shohei TERAI, Hisanosuke KAWADA, Hirotoshi SHIMURA
-
Publication number: 20220115000
Abstract: Processor(s) of a client device can: identify a textual segment stored locally at the client device; process the textual segment, using an on-device TTS generator model, to generate synthesized speech audio data that includes synthesized speech of the textual segment; process the synthesized speech, using an on-device ASR model, to generate predicted ASR output; and generate a gradient based on comparing the predicted ASR output to ground truth output corresponding to the textual segment. Processor(s) of the client device can also: process the synthesized speech audio data using an on-device TTS generator model to make a prediction; and generate a gradient based on the prediction. In these implementations, the generated gradient(s) can be used to update weight(s) of the respective on-device model(s) and/or transmitted to a remote system for use in remote updating of respective global model(s). The updated weight(s) and/or the updated model(s) can be transmitted to client device(s).
Type: Application
Filed: October 28, 2020
Publication date: April 14, 2022
Inventors: Françoise Beaufays, Johan Schalkwyk, Khe Chai Sim
-
Publication number: 20220115001
Abstract: A voice-based digital assistant (VDA) uses a conversation intelligence (CI) manager module having a rule-based engine on conversational intelligence to process information from one or more modules to make determinations on both i) understanding the human conversational cues and ii) generating the human conversational cues, including at least understanding and generating a backchannel utterance, in a flow and exchange of human communication in order to at least one of grab or yield a conversational floor between a user and the VDA. The CI manager module uses the rule-based engine to analyze and make a determination on a conversational cue of, at least, prosody in a user's flow of speech to generate the backchannel utterance to signal any of i) an understanding, ii) a correction, iii) a confirmation, and iv) a questioning of verbal communications conveyed by the user in the flow of speech during a time frame when the user still holds the conversational floor.
Type: Application
Filed: May 7, 2020
Publication date: April 14, 2022
Inventors: Harry Bratt, Kristin Precoda, Dimitra Vergyri
-
Publication number: 20220115002
Abstract: Disclosed are a speech recognition method and device, and electronic equipment. In the speech recognition method, when a user performs speech input, images of the user's lips may be captured while audio is collected; then a second lip region of the user in a current frame image is obtained based on the current frame image and at least one first lip region in a historical frame image. Concurrently, a second speech feature of the current frame audio may be obtained based on the current frame audio and at least one first speech feature of historical frame audio. Then, the phoneme probability distribution of the current frame audio may be obtained according to the speech features and the lip regions, and the speech recognition result of the current frame audio may be obtained according to the phoneme probability distribution.
Type: Application
Filed: October 14, 2021
Publication date: April 14, 2022
Applicant: BEIJING HORIZON ROBOTICS TECHNOLOGY RESEARCH AND DEVELOPMENT CO., LTD.
Inventor: Yichen GONG
-
Publication number: 20220115003
Abstract: A method of determining an alignment sequence between a reference sequence of symbols and a hypothesis sequence of symbols includes loading a reference sequence of symbols to a computing system and creating a reference finite state automaton for the reference sequence of symbols. The method further includes loading a hypothesis sequence of symbols to the computing system and creating a hypothesis finite state automaton for the hypothesis sequence of symbols. The method further includes traversing the reference finite state automaton, adding new reference arcs and new reference transforming properties arcs, and traversing the hypothesis finite state automaton, adding new hypothesis arcs and new hypothesis transforming properties arcs. The method further includes composing the hypothesis finite state automaton with the reference finite state automaton, creating alternative paths to form a composed finite state automaton, and tracking a number of the alternative paths created.
Type: Application
Filed: October 13, 2020
Publication date: April 14, 2022
Inventors: Jean-Philippe Robichaud, Miguel Jette, Joshua Ian Dong, Quinten McNamara, Nishchal Bhandari, Michelle Kai Yu Huang
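Reference/hypothesis alignment of the kind this abstract describes with finite state automaton composition is commonly computed, in its simplest form, by dynamic programming over edit operations. The sketch below is a textbook Levenshtein distance, shown only to make the alignment problem concrete; it is not the patented FSA method:

```python
def edit_distance(ref, hyp):
    """Minimum number of insertions, deletions, and substitutions
    needed to turn the hypothesis sequence into the reference sequence."""
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # delete all remaining reference symbols
    for j in range(len(hyp) + 1):
        d[0][j] = j  # insert all remaining hypothesis symbols
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # match/substitution
    return d[len(ref)][len(hyp)]
```

An FSA-composition formulation, as in the abstract, generalizes this by letting the automata carry alternative paths and transforming properties (e.g., normalization variants) that a plain DP table cannot represent.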
-
Publication number: 20220115004
Abstract: System and method for generating disambiguated terms in automatically generated transcriptions including instructions within a knowledge domain, and employing the system, are disclosed.
Type: Application
Filed: December 21, 2021
Publication date: April 14, 2022
Inventor: Ahmad Badary
-
Publication number: 20220115005
Abstract: Methods and apparatuses are provided for performing sequence to sequence (Seq2Seq) speech recognition training performed by at least one processor. The method includes acquiring a training set comprising a plurality of pairs of input data and target data corresponding to the input data, encoding the input data into a sequence of hidden states, performing a connectionist temporal classification (CTC) model training based on the sequence of hidden states, performing an attention model training based on the sequence of hidden states, and decoding the sequence of hidden states to generate target labels by independently performing the CTC model training and the attention model training.
Type: Application
Filed: December 22, 2021
Publication date: April 14, 2022
Applicant: TENCENT AMERICA LLC
Inventors: Jia CUI, Chao WENG, Guangsen WANG, Jun WANG, Chengzhu YU, Dan SU, Dong YU
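The CTC branch mentioned in this abstract labels every encoder frame, then collapses the frame sequence into a target label sequence by merging repeats and dropping blanks. That collapse rule (standard CTC, not specific to this patent) is small enough to show directly:

```python
def ctc_greedy_collapse(frame_labels, blank="_"):
    """Collapse per-frame CTC labels: merge consecutive repeats,
    then remove blank symbols."""
    out = []
    prev = None
    for lab in frame_labels:
        if lab != prev and lab != blank:
            out.append(lab)
        prev = lab
    return "".join(out)
```

For example, the frame sequence `h h _ e _ l l _ l o _` collapses to `hello`; the blank between the two `l` runs is what allows a doubled letter to survive the repeat merge.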
-
Publication number: 20220115006
Abstract: This invention relates generally to speech processing and more particularly to end-to-end automatic speech recognition (ASR) that utilizes long contextual information. Some embodiments of the invention provide a system and a method for end-to-end ASR suitable for recognizing long audio recordings such as lecture and conversational speeches. This disclosure includes a Transformer-based ASR system that utilizes contextual information, wherein the Transformer accepts multiple utterances at the same time and predicts the transcript for the last utterance. This is repeated in a sliding-window fashion with one-utterance shifts to recognize the entire recording. In addition, some embodiments of the present invention may use acoustic and/or text features obtained from only the previous utterances spoken by the same speaker as the last utterance when the long audio recording includes multiple speakers.
Type: Application
Filed: October 13, 2020
Publication date: April 14, 2022
Applicant: Mitsubishi Electric Research Laboratories, Inc.
Inventors: Takaaki Hori, Niko Moritz, Chiori Hori, Jonathan Le Roux
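The sliding-window decoding loop described above is easy to picture as a generator over utterances: each step yields the preceding context utterances plus the one target utterance to transcribe, then shifts by one. A minimal sketch (names are illustrative; the actual system feeds these windows to a Transformer):

```python
def sliding_windows(utterances, context):
    """Yield (context_utterances, target_utterance) pairs with
    one-utterance shifts, as in windowed long-form ASR decoding."""
    for i in range(len(utterances)):
        yield utterances[max(0, i - context):i], utterances[i]
```

Each target is decoded once, so the full recording is covered, while the model still sees up to `context` earlier utterances for long-range consistency.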
-
Publication number: 20220115007
Abstract: A device includes a memory configured to store instructions and one or more processors configured to execute the instructions. The one or more processors are configured to execute the instructions to receive audio data including first audio data corresponding to a first output of a first microphone and second audio data corresponding to a second output of a second microphone. The one or more processors are also configured to execute the instructions to provide the audio data to a dynamic classifier. The dynamic classifier is configured to generate a classification output corresponding to the audio data. The one or more processors are further configured to execute the instructions to determine, at least partially based on the classification output, whether the audio data corresponds to user voice activity.
Type: Application
Filed: May 5, 2021
Publication date: April 14, 2022
Inventors: Taher SHAHBAZI MIRZAHASANLOO, Rogerio Guedes ALVES, Erik VISSER, Lae-Hoon KIM
-
Publication number: 20220115008
Abstract: The method S200 can include: at an aircraft, receiving an audio utterance from air traffic control S210, converting the audio utterance to text, determining commands from the text using a question-and-answer model S240, and optionally controlling the aircraft based on the commands S250. The method functions to automatically interpret flight commands from the air traffic control (ATC) stream.
Type: Application
Filed: October 13, 2021
Publication date: April 14, 2022
Inventors: Michael Pust, Joseph Bondaryk, Matthew George
-
Publication number: 20220115009
Abstract: Techniques are described herein for cross-device data synchronization based on simultaneous hotword triggers.
Type: Application
Filed: December 8, 2020
Publication date: April 14, 2022
Inventors: Matthew Sharifi, Victor Carbune
-
Publication number: 20220115010
Abstract: Embodiments of the present disclosure set forth a computer-implemented method comprising detecting an initial phrase portion included in a first auditory signal generated by a user, identifying, based on the initial phrase portion, a supplemental phrase portion that complements the initial phrase portion to form a complete phrase, and providing a command signal that drives an output device to generate an audio output corresponding to the supplemental phrase portion.
Type: Application
Filed: October 8, 2020
Publication date: April 14, 2022
Inventors: Stefan MARTI, Joseph VERBEKE, Evgeny BURMISTROV, Priya SESHADRI
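At its simplest, the phrase-completion step in this abstract is a lookup from a detected initial portion to its complement. A toy sketch (the phrase table and function are hypothetical; a real system would use a learned model over many phrases):

```python
# Hypothetical table mapping an initial phrase portion to the
# supplemental portion that completes it.
PHRASES = {
    "happy new": "year",
    "thank you very": "much",
}

def complete(initial_portion):
    """Return the supplemental phrase portion for a detected initial
    portion, or None if no complete phrase is known."""
    return PHRASES.get(initial_portion.strip().lower())
```

The returned supplemental portion would then drive the output device (e.g., a speaker) rather than being shown as text.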
-
Publication number: 20220115011
Abstract: Techniques are described herein for identifying a failed hotword attempt. A method includes: receiving first audio data; processing the first audio data to generate a first predicted output; determining that the first predicted output satisfies a secondary threshold but does not satisfy a primary threshold; receiving second audio data; processing the second audio data to generate a second predicted output; determining that the second predicted output satisfies the secondary threshold but does not satisfy the primary threshold; in response to the first predicted output and the second predicted output satisfying the secondary threshold but not satisfying the primary threshold, and in response to the first spoken utterance and the second spoken utterance satisfying one or more temporal criteria relative to one another, identifying a failed hotword attempt; and in response to identifying the failed hotword attempt, providing a hint that is responsive to the failed hotword attempt.
Type: Application
Filed: October 27, 2020
Publication date: April 14, 2022
Inventors: Matthew Sharifi, Victor Carbune
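The two-threshold logic in this abstract reduces to: two consecutive "near miss" scores (above the secondary threshold, below the primary) close together in time indicate a failed hotword attempt. A sketch under assumed threshold values (all names and defaults here are illustrative, not from the patent):

```python
def failed_hotword(first, second, primary=0.8, secondary=0.5, max_gap_s=5.0):
    """Return True if two utterances look like a failed hotword attempt.

    first and second are (score, timestamp_seconds) tuples. Both scores
    must fall in the near-miss band [secondary, primary), and the two
    utterances must satisfy the temporal criterion (close in time).
    """
    def near_miss(score):
        return secondary <= score < primary
    (s1, t1), (s2, t2) = first, second
    return near_miss(s1) and near_miss(s2) and abs(t2 - t1) <= max_gap_s
```

On a True result, the assistant would surface a hint (e.g., how to say the hotword) instead of silently ignoring the user.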
-
Publication number: 20220115012
Abstract: The present application discloses a method and apparatus for processing voices, a device and a computer storage medium, and relates to the technical field of voices. An implementation includes: recognizing a received voice request by a server of a first voice assistant to obtain a text request; sending the recognized text request to a server of a second voice assistant; receiving token information generated and returned by the server of the second voice assistant for the text request; and sending the text request and the token information to a client of the first voice assistant, such that the client of the first voice assistant calls a client of the second voice assistant to respond to the text request based on the token information. Based on the present application, after a user inputs the voice request with the first voice assistant, the first voice assistant may call the second voice assistant to respond to the voice request when the second voice assistant may better respond to the voice request.
Type: Application
Filed: May 7, 2020
Publication date: April 14, 2022
Inventors: Jizhou Huang, Shiqiang Ding, Changshun Hou
-
Publication number: 20220115013
Abstract: In one aspect, a server receives, from a client terminal via a network, a request to initiate a verbal conversation using natural language that is in a spoken or textual format, extracts information during the verbal conversation, determines a context of the verbal conversation, receives an inquiry during the verbal conversation, processes the inquiry, acquires response information based on the determined appropriate response, and transmits the response information to the client terminal.
Type: Application
Filed: July 19, 2021
Publication date: April 14, 2022
Applicant: FIRST ADVANTAGE CORPORATION
Inventors: Arun N. KUMAR, Stefano MALNATI
-
Publication number: 20220115014
Abstract: A vehicle agent device receives utterance information from an on-board unit, analyzes the content of the utterance, detects, as a non-installed function from a database, a function that an occupant intended to utilize but which was not installed and is installable, generates proposal information for furnishing the occupant with information relating to the non-installed function it detected, and sends the proposal information that has been generated to the on-board unit, to thereby send the information relating to the non-installed function to a preregistered mobile device carried by the occupant.
Type: Application
Filed: October 6, 2021
Publication date: April 14, 2022
Applicant: TOYOTA JIDOSHA KABUSHIKI KAISHA
Inventors: Chikage KUBO, Keiko Nakano, Eiichi Maeda, Hiroyuki Nishizawa
-
Publication number: 20220115015
Abstract: Systems and methods presented herein generally include multi-wake phrase detection executed on a single device utilizing multiple voice assistants. Systems and methods presented herein can further include continuously running a Voice Activity Detection (VAD) process which detects presence of human speech. The multi-wake phrase detection can activate when the VAD process detects human speech. Once activated, the multi-wake phrase detection can determine which (if any) of the wake phrases of the multiple voice assistants might be in the detected speech. Operation of the multi-wake phrase detection can have a low miss-rate. In some examples, operation of the multi-wake phrase detection can be granular to accomplish the low miss-rates at low power with a tolerance for false positives on wake phrase detection.
Type: Application
Filed: October 12, 2021
Publication date: April 14, 2022
Inventors: Mouna Elkhatib, Adil Benyassine
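The VAD-gated flow in this abstract can be sketched as a function that only runs wake-phrase matching when speech is present, then checks each assistant's phrase. This is a toy illustration (assistant names, phrases, and substring matching are assumptions; real detectors score acoustic models, not transcripts):

```python
def detect_wake(frame_is_speech, transcript, wake_phrases):
    """Run wake-phrase matching only when the VAD reports speech.

    wake_phrases maps an assistant id to its wake phrase; returns the
    id of the first assistant whose phrase appears, else None.
    """
    if not frame_is_speech:
        return None  # VAD gate: skip detection entirely, saving power
    text = transcript.lower()
    for assistant, phrase in wake_phrases.items():
        if phrase in text:
            return assistant
    return None
```

Gating on VAD is what makes always-on multi-assistant detection cheap: the (more expensive) phrase matchers never run on silence.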
-
Publication number: 20220115016
Abstract: A system may receive audio data that represents a wakeword associated with a first speech-processing system and a command associated with a second speech-processing system. Different indications of handing the audio data off to the second speech-processing system may be determined based on a determined amount of interaction with the second speech-processing system. If the amount of interaction is low, a longer, more detailed indication is generated; if the amount of interaction is high, a brief, less detailed indication is generated. A local device may output audio corresponding to the indication before outputting audio generated by the second speech-processing system in response to the command.
Type: Application
Filed: October 14, 2021
Publication date: April 14, 2022
Inventor: Timothy Whalin
-
Publication number: 20220115017
Abstract: Methods, apparatuses, and computing systems are provided for integrating logic services with a group communication service. In an implementation, a method may include receiving a spoken message from a communication node in a communication group, determining that the spoken message relates to a logic service, and transferring the spoken message to a voice assistant service with an indication that the spoken message relates to the logic service. The method also includes receiving status information from the logic service indicative of a status of a networked device associated with the logic service. The method further includes sending an audible announcement to the communication nodes in the communication group expressive of the status of the networked device.
Type: Application
Filed: December 20, 2021
Publication date: April 14, 2022
Inventors: Greg Albrecht, Ellen Juhlin, Jesse Robbins, Justin Black
-
Publication number: 20220115018
Abstract: A messaging system, which hosts a backend service for an associated messaging client, includes a voice chat system that provides voice chat functionality that enables users to dictate their messages, while delivering the resulting message to the intended recipient as both the associated audio and text content. When a user at a sender client device begins dictating a voice message, the voice chat system starts converting the received audio stream into text and, also, starts communicating the audio content together with the generated text to the recipient client device. The recipient user can listen to the voice message and read the text generated from the audio in real time. It is also possible for the recipient user to consume the voice message in a textual form only, if the sound at the client device is undesirable.
Type: Application
Filed: October 14, 2020
Publication date: April 14, 2022
Inventors: Laurent Desserrey, Jeremy Baker Voss
-
Publication number: 20220115019
Abstract: Methods and systems for enabling an efficient review of meeting content via a metadata-enriched, speaker-attributed and multiuser-editable transcript are disclosed. By incorporating speaker diarization and other metadata, the system can provide a structured and effective way to review and/or edit the transcript by one or more editors. One type of metadata can be image or video data to represent the meeting content. Furthermore, the present subject matter utilizes a multimodal diarization model to identify and label different speakers. The system can synchronize various sources of data, e.g., audio channel data, voice feature vectors, acoustic beamforming, image identification, and extrinsic data, to implement speaker diarization.
Type: Application
Filed: October 11, 2021
Publication date: April 14, 2022
Applicant: SoundHound, Inc.
Inventors: Kiersten L. BRADLEY, Ethan COEYTAUX, Ziming YIN
-
Publication number: 20220115020
Abstract: Methods and systems for enabling an efficient review of meeting content via a metadata-enriched, speaker-attributed transcript are disclosed. By incorporating speaker diarization and other metadata, the system can provide a structured and effective way to review and/or edit the transcript. One type of metadata can be image or video data to represent the meeting content. Furthermore, the present subject matter utilizes a multimodal diarization model to identify and label different speakers. The system can synchronize various sources of data, e.g., audio channel data, voice feature vectors, acoustic beamforming, image identification, and extrinsic data, to implement speaker diarization.
Type: Application
Filed: October 11, 2021
Publication date: April 14, 2022
Applicant: SoundHound, Inc.
Inventors: Kiersten L. BRADLEY, Ethan COEYTAUX, Ziming YIN
-
Publication number: 20220115021
Abstract: A talker prediction method obtains a voice from a plurality of talkers, records a conversation history of the plurality of talkers, identifies a talker of the obtained voice, and predicts a next talker among the plurality of talkers based on the identified talker and the conversation history.
Type: Application
Filed: October 5, 2021
Publication date: April 14, 2022
Inventors: Satoshi UKAI, Ryo Tanaka
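One simple way to realize "predict a next talker from the identified talker and the conversation history" is a first-order transition model: count who tends to speak after whom, then predict the most frequent successor of the current talker. This sketch is an assumed baseline, not necessarily the patented method:

```python
from collections import Counter, defaultdict

class TalkerPredictor:
    """Predict the next talker from observed talker-to-talker transitions."""

    def __init__(self):
        self.transitions = defaultdict(Counter)  # prev talker -> successor counts
        self.prev = None

    def observe(self, talker):
        """Record one identified talker turn in the conversation history."""
        if self.prev is not None:
            self.transitions[self.prev][talker] += 1
        self.prev = talker

    def predict_next(self):
        """Most frequent successor of the current talker, or None."""
        counts = self.transitions.get(self.prev)
        if not counts:
            return None
        return counts.most_common(1)[0][0]
```

For example, after observing the turn sequence A, B, A, B, A the model predicts B next, since B has always followed A so far.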
-
Publication number: 20220115022
Abstract: Implementations relate to automatic generation of speaker features for each of one or more particular text-dependent speaker verifications (TD-SVs) for a user. Implementations can generate speaker features for a particular TD-SV using instances of audio data that each capture a corresponding spoken utterance of the user during normal non-enrollment interactions with an automated assistant via one or more respective assistant devices. For example, a portion of an instance of audio data can be used in response to: (a) determining that recognized term(s) for the spoken utterance captured by that portion correspond to the particular TD-SV; and (b) determining that an authentication measure, for the user and for the spoken utterance, satisfies a threshold. Implementations additionally or alternatively relate to utilization of speaker features, for each of one or more particular TD-SVs for a user, in determining whether to authenticate a spoken utterance for the user.
Type: Application
Filed: October 13, 2020
Publication date: April 14, 2022
Inventors: Matthew Sharifi, Victor Carbune
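The gating step in this abstract, using an utterance to update a speaker profile only when an authentication measure satisfies a threshold, can be sketched as a running mean over utterance embeddings. Names, the (mean, count) profile representation, and the default threshold are all illustrative assumptions:

```python
def update_profile(profile, embedding, auth_score, threshold=0.9):
    """Average a new utterance embedding into the speaker profile only
    when the authentication measure satisfies the threshold.

    profile is (mean_vector, count) or None; returns the updated profile.
    """
    if auth_score < threshold:
        return profile  # unauthenticated utterance: leave profile untouched
    if profile is None:
        return (list(embedding), 1)
    mean, n = profile
    new_mean = [(m * n + e) / (n + 1) for m, e in zip(mean, embedding)]
    return (new_mean, n + 1)
```

Because only high-confidence utterances are averaged in, the profile improves over normal assistant use without a dedicated enrollment session.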
-
Publication number: 20220115023
Abstract: A voice recognition transmission system for preventing unauthorized activation of a vehicle is provided. The system includes a microphone, a voice recognition module, and an automotive gear shifter. The microphone receives a voice command. The voice recognition module analyzes the voice command against a plurality of authorized voice profiles. The voice recognition module determines if the voice command reaches a threshold of similarity in tone and tenor of one authorized voice profile. A voice command that reaches the threshold of similarity will activate the automotive gear shifter to perform the voice command. A voice command that fails to reach the threshold of similarity will activate an interlock. The interlock will prevent an unauthorized individual from operating the vehicle and activate the vehicle alarm system. A notification will be sent to a connected electronic device when a voice command activates the automotive gear shifter or the interlock.
Type: Application
Filed: October 13, 2021
Publication date: April 14, 2022
Inventor: Milia Cora
-
Publication number: 20220115024Abstract: Examples of the disclosure relate to apparatus, methods and computer programs for encoding spatial metadata. The example apparatus includes circuitry configured for obtaining spatial metadata associated with spatial audio content and obtaining a configuration parameter indicative of a source format of the spatial audio content. The circuitry is also configured to use the configuration parameter to select a method of compression of the spatial metadata associated with the spatial audio content.Type: ApplicationFiled: October 28, 2019Publication date: April 14, 2022Inventors: Tapani PIHLAJAKUJA, Lasse LAAKSONEN, Antti ERONEN, Arto LEHTINIEMI
-
Publication number: 20220115025Abstract: Encoding/decoding an audio signal having one or more audio components, wherein each audio component is associated with a spatial location. A first audio signal presentation (z) of the audio components, a first set of transform parameters (w(f)), and signal level data (?2) are encoded and transmitted to the decoder. The decoder uses the first set of transform parameters (w(f)) to form a reconstructed simulation input signal intended for an acoustic environment simulation, and applies a signal level modification (?) to the reconstructed simulation input signal. The signal level modification is based on the signal level data (?2) and data (p2) related to the acoustic environment simulation. The attenuated reconstructed simulation input signal is then processed in an acoustic environment simulator. With this process, the decoder does not need to determine the signal level of the simulation input signal, thereby reducing processing load.Type: ApplicationFiled: October 25, 2021Publication date: April 14, 2022Applicant: Dolby Laboratories Licensing CorporationInventor: Dirk Jeroen BREEBAART
-
Publication number: 20220115026Abstract: An apparatus includes a receiver and a decoder. The receiver is configured to receive a bitstream that includes a first frame and a second frame. The first frame includes a first portion of a mid channel and a first quantized stereo parameter. The second frame includes a second portion of the mid channel and a second quantized stereo parameter. The decoder is configured to generate a first portion of a channel based on the first portion of the mid channel and the first quantized stereo parameter. The decoder is configured to, in response to the second frame being unavailable for decoding operations, estimate the second quantized stereo parameter based on stereo parameters of one or more preceding frames and generate a second portion of the channel based on the estimated second quantized stereo parameter. The second portion of the channel corresponds to a decoded version of the second frame.Type: ApplicationFiled: December 20, 2021Publication date: April 14, 2022Inventors: Venkata Subrahmanyam Chandra Sekhar CHEBIYYAM, Venkatraman Atti
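The abstract leaves the estimator for a lost frame's stereo parameter unspecified beyond "based on stereo parameters of one or more preceding frames". A minimal sketch, assuming simple linear extrapolation from the last two received frames (a common concealment heuristic, not necessarily the patented method):

```python
# Sketch: conceal a lost frame's quantized stereo parameter from the
# history of preceding frames. Linear extrapolation is an assumed
# strategy for illustration only.

def estimate_stereo_parameter(history):
    """Estimate the missing frame's stereo parameter; with fewer than
    two preceding frames, fall back to repeating the last one."""
    if len(history) >= 2:
        return history[-1] + (history[-1] - history[-2])
    return history[-1]
```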
-
Publication number: 20220115027Abstract: Higher Order Ambisonics represents three-dimensional sound independent of a specific loudspeaker set-up. However, transmission of an HOA representation results in a very high bit rate. Therefore, compression with a fixed number of channels is used, in which directional and ambient signal components are processed differently. For coding, portions of the original HOA representation are predicted from the directional signal components. This prediction provides side information which is required for a corresponding decoding. By using some additional specific purpose bits, a known side information coding processing is improved in that the required number of bits for coding that side information is reduced on average.Type: ApplicationFiled: December 21, 2021Publication date: April 14, 2022Applicant: Dolby Laboratories Licensing CorporationInventors: Sven Kordon, Alexander Krueger, Oliver Wuebbolt
-
Publication number: 20220115028Abstract: Information loss in speech-to-text conversion and the inability to preserve vocal emotion information without changing the artificial intelligence model infrastructure are essential drawbacks of conventional speech-to-speech translation systems. Embodiments of the invention provide a direct speech-to-speech translation system. The direct speech-to-speech translation system uses a one-tier approach, creating a unified model for the whole application. The single-model ecosystem takes audio (a mel spectrogram) as input and gives audio (a mel spectrogram) as output. This solves the bottleneck problem by not converting speech directly to text but having text as a byproduct of speech-to-speech translation, preserving phonetic information along the way. This model also uses pre-processing and post-processing scripts, but only for the whole model. This model needs parallel audio samples in two languages.Type: ApplicationFiled: December 24, 2021Publication date: April 14, 2022Inventors: Sandeep Dhawan, Kapil Dhawan, Dennis Reutter, Chris Beckman, Ahsan Memon
-
Publication number: 20220115029Abstract: In one aspect, a method includes detecting a fingerprint match between query fingerprint data representing at least one audio segment within podcast content and reference fingerprint data representing known repetitive content within other podcast content, detecting a feature match among a set of audio features across multiple time-windows of the podcast content, and detecting a text match between at least one query text sentence from a transcript of the podcast content and reference text sentences, the reference text sentences comprising text sentences from the known repetitive content within the other podcast content. The method also includes, responsive to the detections, generating sets of labels identifying potential repetitive content within the podcast content. The method also includes selecting, from the sets of labels, a consolidated set of labels identifying segments of repetitive content within the podcast content, and responsive to selecting the consolidated set of labels, performing an action.Type: ApplicationFiled: December 10, 2020Publication date: April 14, 2022Inventors: Amanmeet Garg, Aneesh Vartakavi
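The consolidation step (merging label sets from the fingerprint, feature, and text detectors into one set) is not detailed in the abstract. One plausible approach, sketched here purely as an assumption, is per-second voting: keep every second flagged by at least two detectors, then stitch consecutive seconds back into segments.

```python
# Sketch: consolidate per-detector label sets, each a list of
# (start, end) second ranges, by majority voting. The voting rule is
# an illustrative assumption, not the patented selection method.

def consolidate(label_sets, min_votes=2):
    """Return merged (start, end) segments covering every second
    flagged by at least `min_votes` of the detectors."""
    votes = {}
    for labels in label_sets:
        for start, end in labels:
            for t in range(start, end):
                votes[t] = votes.get(t, 0) + 1
    flagged = sorted(t for t, v in votes.items() if v >= min_votes)
    segments, run = [], []
    for t in flagged:
        if run and t == run[-1] + 1:
            run.append(t)
        else:
            if run:
                segments.append((run[0], run[-1] + 1))
            run = [t]
    if run:
        segments.append((run[0], run[-1] + 1))
    return segments
```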
-
Publication number: 20220115030Abstract: The disclosed computer-implemented method may include obtaining an audio sample from a content source, inputting the obtained audio sample into a trained machine learning model, obtaining the output of the trained machine learning model, wherein the output is a profile of an environment in which the input audio sample was recorded, obtaining an acoustic impulse response corresponding to the profile of the environment in which the input audio sample was recorded, obtaining a second audio sample, processing the obtained acoustic impulse response with the second audio sample, and inserting a result of processing the obtained acoustic impulse response and the second audio sample into an audio track. Various other methods, systems, and computer-readable media are also disclosed.Type: ApplicationFiled: December 17, 2021Publication date: April 14, 2022Inventors: Yadong Wang, Shilpa Jois Rao, Murthy Parthasarathi, Kyle Tacke
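The final processing step (combining the retrieved acoustic impulse response with the second audio sample) is in essence a convolution. A minimal direct-convolution sketch is shown below; a production system would use FFT-based convolution, and the function name is invented for illustration.

```python
# Sketch: convolve an acoustic impulse response with a second audio
# sample so the result sounds as if recorded in the profiled
# environment. Plain O(n*m) direct convolution, for clarity only.

def apply_room(ir, audio):
    """Return the full convolution of impulse response `ir` with
    `audio` (output length: len(audio) + len(ir) - 1)."""
    out = [0.0] * (len(audio) + len(ir) - 1)
    for i, a in enumerate(audio):
        for j, h in enumerate(ir):
            out[i + j] += a * h
    return out
```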
-
Publication number: 20220115031Abstract: A credit segment identifying device includes an extracting unit which extracts, from a first speech signal, a plurality of first partial speech signals which are each a part of the first speech signal and shifted from each other in the time direction, and an identifying unit which identifies a credit segment in the first speech signal by determining whether each of the first partial speech signals includes a credit according to an association between each of second partial signals extracted from a second speech signal and the presence/absence of a credit, so that credit segments can be identified more efficiently.Type: ApplicationFiled: January 24, 2020Publication date: April 14, 2022Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATIONInventors: Yasunori OISHI, Takahito KAWANISHI, Kunio KASHINO
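The extracting unit's time-shifted partial signals amount to an overlapping sliding window over the speech signal. A minimal sketch, with illustrative window and hop sizes:

```python
# Sketch of the extracting unit: cut overlapping partial signals,
# each `window` samples long and shifted by `hop` samples from the
# previous one. Parameter names and sizes are illustrative.

def extract_partials(signal, window, hop):
    """Return the time-shifted partial signals described in the
    abstract, as a list of slices of the input signal."""
    return [signal[i:i + window]
            for i in range(0, len(signal) - window + 1, hop)]
```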
-
Publication number: 20220115032Abstract: A technique is provided for detecting harmful behavior such as power harassment, sexual harassment, or bullying in a work environment and supporting its handling. A harmful behavior detecting system includes a computer that executes observation and detection regarding harmful behavior, including power harassment, sexual harassment, and bullying, among people in a work environment. The computer obtains voice data into which voice around a target person is inputted; obtains voice information containing words and emotion information from the voice data; and obtains data such as vital data, date and time, or a location of the target person. The computer uses five elements, namely words and an emotion of the other person and words, an emotion, and vital data of the target person, to calculate an index value regarding the harmful behavior; estimate a state of the harmful behavior based on the index value; and output handling data for handling the harmful behavior in accordance with the estimated state.Type: ApplicationFiled: November 12, 2019Publication date: April 14, 2022Inventors: Satoshi IWAGAKI, Atsushi SHIMADA, Masumi SUEHIRO, Hidenori CHIBA, Kouichi HORIUCHI
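The abstract does not say how the five elements combine into an index value. One common way to sketch such a fusion, assumed here for illustration only, is a weighted sum of pre-scored elements followed by threshold-based state estimation; the weights and thresholds are invented.

```python
# Sketch: fuse the abstract's five elements (each pre-scored to
# [0, 1]) into an index, then map the index to a coarse state.
# Weights and thresholds are illustrative assumptions.

def harassment_index(other_words, other_emotion, target_words,
                     target_emotion, target_vitals,
                     weights=(0.25, 0.2, 0.2, 0.2, 0.15)):
    """Weighted sum of the five element scores."""
    elements = (other_words, other_emotion, target_words,
                target_emotion, target_vitals)
    return sum(w * e for w, e in zip(weights, elements))

def estimate_state(index, thresholds=(0.3, 0.6)):
    """Coarse state estimate from the index value."""
    if index >= thresholds[1]:
        return "harmful"
    if index >= thresholds[0]:
        return "watch"
    return "normal"
```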
-
Publication number: 20220115033Abstract: A toxicity moderation system has an input configured to receive speech from a speaker. The system includes a multi-stage toxicity machine learning system having a first stage and a second stage. The first stage is trained to analyze the received speech to determine whether a toxicity level of the speech meets a toxicity threshold. The first stage is also configured to filter through, to the second stage, speech that meets the toxicity threshold, and is further configured to filter out speech that does not meet the toxicity threshold.Type: ApplicationFiled: October 8, 2021Publication date: April 14, 2022Inventors: William Carter Huffman, Michael Pappas, Henry Howie
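The staged design can be sketched as a cheap first-pass filter that only forwards suspect speech to a costlier second model. The function names and toy scorers below are illustrative, not the patented models.

```python
# Sketch of a two-stage toxicity pipeline: stage 1 (cheap_score)
# filters out speech below the toxicity threshold; only speech that
# clears it reaches the costlier stage-2 model (expensive_score).

def moderate(utterances, cheap_score, expensive_score, threshold=0.5):
    """Return (utterance, stage-2 score) pairs for speech that passed
    the stage-1 threshold."""
    flagged = [u for u in utterances if cheap_score(u) >= threshold]
    return [(u, expensive_score(u)) for u in flagged]

# Toy scorers standing in for the two trained stages.
cheap = lambda u: 0.9 if ("fool" in u or "awful" in u) else 0.1
expensive = lambda u: 0.95
results = moderate(["you fool", "hello", "awful person"], cheap, expensive)
```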
-
Publication number: 20220115034Abstract: An audio response system can generate multimodal messages that can be dynamically updated on a viewer's client device based on a type of audio response detected. The audio responses can include keywords or continuum-based signals (e.g., levels of wind noise). A machine learning scheme can be trained to output classification data from the audio response data for content selection and dynamic display updates.Type: ApplicationFiled: December 22, 2021Publication date: April 14, 2022Inventors: Gurunandan Krishnan Gorumkonda, Shree K. Nayar
-
Publication number: 20220115035Abstract: The present disclosure generally relates to a read head assembly having a dual free layer (DFL) structure disposed between a first shield and a second shield at a media facing surface. The read head assembly further comprises a rear hard bias (RHB) structure disposed adjacent to the DFL structure recessed from the media facing surface, where an insulation layer separates the RHB structure from the DFL structure. The insulation layer is disposed perpendicularly between the first shield and the second shield. The DFL structure comprises a first free layer and a second free layer having equal stripe heights from the media facing surface to the insulation layer. The RHB structure comprises a seed layer, a bulk layer, and a capping layer. The capping layer and the insulation layer prevent the bulk layer from contacting the second shield.Type: ApplicationFiled: February 24, 2021Publication date: April 14, 2022Inventors: Ming MAO, Chen-Jung CHIEN, Daniele MAURI, Goncalo Marcos BAIÃO DE ALBUQUERQUE