Patents Issued on April 14, 2022
-
Publication number: 20220114986
Abstract: An afterimage compensation device includes: an afterimage area detector to receive an input image, and detect an afterimage area including an afterimage in the input image; an afterimage area corrector to detect a false detection area, and generate a corrected afterimage area, the false detection area being a part of a general area that is not detected as the afterimage area and surrounded in a plurality of directions by the detected afterimage area; and a compensation data generator to adjust a luminance of the corrected afterimage area to generate compensation data.
Type: Application
Filed: August 16, 2021
Publication date: April 14, 2022
Inventor: Jun Gyu LEE
-
Publication number: 20220114987
Abstract: The present disclosure provides a synchronous display method of a spliced screen, a display system, an electronic device and a computer readable medium. The spliced screen includes display screens spliced together, and the display method is based on wireless communication and includes: sending control information to the spliced screen N times at intervals so as to control the display screens to display simultaneously, where N is an integer not less than 2. The control information sent for the previous N-1 times includes first information and second information, and the control information sent for the Nth time at least includes the second information. The first information is configured for controlling the display screen receiving the control information to turn off a receiving component of the display screen, and the second information is configured for controlling the display screen receiving the control information to display after a preset time duration.
Type: Application
Filed: November 3, 2020
Publication date: April 14, 2022
Inventors: Genyu LIU, Tao LI, Xingqun JIANG, Chao YU, Quanzhong WANG
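The N-round control scheme this abstract describes can be sketched in a few lines. This is a hypothetical illustration only (function and field names are invented, not from the patent): rounds 1 through N-1 carry both the "turn off receiver" flag and the synchronized display delay, while round N carries at least the delay.

```python
def build_control_rounds(n, preset_delay_ms):
    """Build the N control messages sent to every display screen.

    Rounds 1..N-1 carry both the first information (turn off the
    receiving component) and the second information (display after a
    preset delay); round N carries at least the second information.
    """
    if n < 2:
        raise ValueError("N must be an integer >= 2")
    rounds = [
        {"turn_off_receiver": True, "display_after_ms": preset_delay_ms}
        for _ in range(n - 1)
    ]
    rounds.append({"display_after_ms": preset_delay_ms})
    return rounds
```

Sending the receiver-off flag in every round but the last lets screens that have already received the schedule stop listening, while the final round still reaches any screen that missed earlier rounds.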
-
Publication number: 20220114988
Abstract: A User-centric Enhanced Pathway (UEP) system may provide vehicle pathway guidance while driving using an augmented reality display system in the event of sun glare. The system uses a route observer module programmed to predict sun glare, by obtaining real-time information about the interior and exterior vehicle environment, heading, speed, and other data. The route observer module may also use an inward-facing camera to determine when the driver is squinting, which may increase the probability function that predicts when the driver is experiencing sun glare. When sun glare is predicted, the route observer sends the weighted prediction function and input signals to a decision module that uses vehicle speed, location, and user activity to determine appropriate guidance output for an enhanced pathway using an Augmented Reality (AR) module. The AR module may adjust brightness, contrast, and color of the AR information based on observed solar glare.
Type: Application
Filed: October 8, 2020
Publication date: April 14, 2022
Applicant: Ford Global Technologies, LLC
Inventors: Kwaku O. Prakah-Asante, Jian Wan, Prayat Hegde
-
Publication number: 20220114989
Abstract: A vehicle display system includes a vehicle detection device detecting at least one other vehicle present in surroundings of a host vehicle, a vehicle speed detection device detecting a speed of the host vehicle, a display device displaying the host vehicle and the at least one other vehicle respectively as vehicle icons, and a display control part configured to control a content of display of the display device. When the speed of the host vehicle is equal to or less than a predetermined value, the display control part is configured to blank out the display of, or display transparently, at least a part of a vehicle icon of a following vehicle detected by the vehicle detection device and positioned at a rear of the host vehicle in a driving lane of the host vehicle.
Type: Application
Filed: October 1, 2021
Publication date: April 14, 2022
Inventor: Hironori ITO
-
Publication number: 20220114990
Abstract: Some implementations can include a string bender including a body configured to attach to a bridge of a guitar or other stringed instrument. Some implementations can include a bridge with an integrated string bender. The string bender can be constructed to bend one or more strings, for example the B and/or G strings. The string bender can also be constructed to fit a variety of guitar styles, such as Fender Telecaster or Stratocaster-style guitars.
Type: Application
Filed: July 17, 2021
Publication date: April 14, 2022
Inventor: Waylon Dale Baker
-
Publication number: 20220114991
Abstract: A barrel for a musical instrument is disclosed herein. The barrel includes a top section, a bottom section, an adjustment ring and a bore through the center. The barrel and bore length are configured to be adjusted with the adjustment ring. The barrel may include haptic features which provide feedback to a user regarding adjustments to the length of the bore and barrel. A trilobe socket and trilobe plug may be included and are configured to connect to prevent the barrel from slipping when manipulated and provide structural support. A tapered bore and bore choke provide additional structural stability and acoustic flexibility to the barrel.
Type: Application
Filed: October 13, 2020
Publication date: April 14, 2022
Applicant: BBSR Holdings, LLC
Inventor: Bradford Behn
-
Publication number: 20220114992
Abstract: A pedal device for performance (1) has a pedal (4) where a player can perform a depressing operation with his/her foot. A rod (6) is movable in an axial direction in accordance with the depressing operation of the pedal (4). A detecting mechanism detects an amount of movement of the rod (6) and outputs a detection signal corresponding to the amount of movement. The detecting mechanism has an optical sensor (10) that is a non-contact sensor detecting an amount of displacement in the axial direction L of the rod (6) in a non-contact manner.
Type: Application
Filed: December 22, 2021
Publication date: April 14, 2022
Inventor: Yoshiaki MORI
-
Publication number: 20220114993
Abstract: A virtual instrument for real-time musical generation includes a musical rule set unit for defining musical rules, a time constrained pitch generator for synchronizing generated music, and an audio generator for generating audio signals, wherein the rule definitions describe real-time morphable music parameters, and said morphable music parameters are controllable directly by the real-time control signal. With this virtual instrument, the user can create new musical content in a simple and interactive way regardless of the level of musical training obtained before using the instrument.
Type: Application
Filed: September 24, 2019
Publication date: April 14, 2022
Applicant: GESTRUMENT AB
Inventors: Jesper NORDIN, Jonatan LILJEDAHL, Jonas KJELLBERG, Pär GUNNARS RISBERG
-
Publication number: 20220114994
Abstract: The present systems, devices, and methods generally relate to generating families of symbol sequences with controllable degree of correlation within and between them using quantum computers, and particularly to the exploitation of this capability to generate families of symbol sequences representing musical events such as, but not limited to, musical notes, musical chords, musical percussion strikes, musical time intervals, musical note intervals, and musical key changes that comprise a musical composition. Quantum random walks on graphs representing allowed transitions between musical events are also employed in some implementations.
Type: Application
Filed: December 17, 2021
Publication date: April 14, 2022
Inventors: Colin P. Williams, Gregory Gabrenya
-
Publication number: 20220114995
Abstract: Audio signal dereverberation can be carried out in accordance with instructions on a machine readable storage medium, using a processor. In an example, a location of a person in a room can be determined. An audio signal received from the location of the person can be captured using beamforming. Room properties can be determined based in part on a signal sweep of the room. A dereverberation parameter can be determined based in part on the location of the person and the room properties. The dereverberation parameter can be applied to the audio signal.
Type: Application
Filed: July 3, 2019
Publication date: April 14, 2022
Applicant: Hewlett-Packard Development Company, L.P.
Inventors: Srikanth Kuthuru, Sunil Bharitkar, Madhu Sudan Athreya
-
Publication number: 20220114996
Abstract: Rotor noise cancellation through the use of mechanical means for a personal aerial drone vehicle is disclosed. Active noise cancellation is achieved by creating an antiphase amplitude wave by modulation of the propeller blades, utilizing embedded magnets and an electromagnetic coil encircling the propeller blades. A noise level sensor signals the rotor control system to adjust the frequency of the electromagnetic field surrounding the rotor and control the speed of the rotor. An additional method comprises incorporating a phase lock loop within the control system configured to determine the frequencies corresponding to the rotors and generate corrective audio signals to achieve active noise cancellation.
Type: Application
Filed: October 3, 2021
Publication date: April 14, 2022
Inventor: Alan Richard Greenberg
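The core idea behind the antiphase wave in this abstract is destructive interference: a cancelling signal is the sample-by-sample inversion of the noise, so their sum is silence under ideal alignment. A minimal numeric sketch (illustrative only, not the patented mechanism):

```python
import math

def antiphase(samples):
    """Invert a sampled noise waveform to produce its cancelling antiphase."""
    return [-s for s in samples]

# Superposing a tone with its antiphase cancels it under ideal conditions
# (perfect amplitude and phase match; real systems must track both).
tone = [math.sin(2 * math.pi * 100 * t / 8000) for t in range(64)]
residual = [a + b for a, b in zip(tone, antiphase(tone))]
```

In practice the hard part, which the abstract addresses with a phase lock loop, is keeping the cancelling wave locked to the rotor frequency as it varies.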
-
Publication number: 20220114997
Abstract: An information processing apparatus including a noise reduction unit that reduces noise generated from an unmanned aerial vehicle, included in an audio signal picked up by a microphone mounted on the unmanned aerial vehicle, on the basis of state information on a noise source.
Type: Application
Filed: November 7, 2019
Publication date: April 14, 2022
Inventors: NAOYA TAKAHASHI, WEIHSIANG LIAO
-
Publication number: 20220114998
Abstract: First data comprising a first range of audio frequencies is received. The first range of audio frequencies corresponds to a predetermined cochlear region of a listener. Second data comprising a second range of audio frequencies is also received. Third data comprising a first modulated range of audio frequencies is acquired. The third data is acquired by modulating the first range of audio frequencies according to a stimulation protocol that is configured to provide neural stimulation of a brain of the listener. The second data and the third data are arranged to generate an audio composition from the second data and the third data.
Type: Application
Filed: December 20, 2021
Publication date: April 14, 2022
Applicant: BrainFM, Inc.
Inventor: Adam HEWETT
-
Publication number: 20220114999
Abstract: A voice system is provided for a moving machine driven by a driver who is exposed to the outside of the moving machine. The system includes: a noise estimating section which estimates a future noise state based on information related to a noise generation factor; and a voice control section which changes an attribute of the voice to be output to the driver in accordance with the estimated noise state.
Type: Application
Filed: May 30, 2019
Publication date: April 14, 2022
Inventors: Masanori KINUHATA, Daisuke KAWAI, Shohei TERAI, Hisanosuke KAWADA, Hirotoshi SHIMURA
-
Publication number: 20220115000
Abstract: Processor(s) of a client device can: identify a textual segment stored locally at the client device; process the textual segment, using an on-device TTS generator model, to generate synthesized speech audio data that includes synthesized speech of the textual segment; process the synthesized speech, using an on-device ASR model, to generate predicted ASR output; and generate a gradient based on comparing the predicted ASR output to ground truth output corresponding to the textual segment. Processor(s) of the client device can also: process the synthesized speech audio data using an on-device TTS generator model to make a prediction; and generate a gradient based on the prediction. In these implementations, the generated gradient(s) can be used to update weight(s) of the respective on-device model(s) and/or transmitted to a remote system for use in remote updating of respective global model(s). The updated weight(s) and/or the updated model(s) can be transmitted to client device(s).
Type: Application
Filed: October 28, 2020
Publication date: April 14, 2022
Inventors: Françoise Beaufays, Johan Schalkwyk, Khe Chai Sim
-
Publication number: 20220115001
Abstract: A voice-based digital assistant (VDA) uses a conversation intelligence (CI) manager module having a rule-based engine on conversational intelligence to process information from one or more modules to make determinations on both i) understanding the human conversational cues and ii) generating the human conversational cues, including at least understanding and generating a backchannel utterance, in a flow and exchange of human communication in order to at least one of grab or yield a conversational floor between a user and the VDA. The CI manager module uses the rule-based engine to analyze and make a determination on a conversational cue of, at least, prosody in a user's flow of speech to generate the backchannel utterance to signal any of i) an understanding, ii) a correction, iii) a confirmation, and iv) a questioning of verbal communications conveyed by the user in the flow of speech during a time frame when the user still holds the conversational floor.
Type: Application
Filed: May 7, 2020
Publication date: April 14, 2022
Inventors: Harry Bratt, Kristin Precoda, Dimitra Vergyri
-
Publication number: 20220115002
Abstract: Disclosed are a speech recognition method and device, and electronic equipment. In the speech recognition method, when a user performs speech input, images of the user's lips may be captured while audio is collected; then a second lip region of the user in a current frame image is obtained based on the current frame image and at least one first lip region in a historical frame image. Concurrently, a second speech feature of the current frame audio may be obtained based on the current frame audio and at least one first speech feature of historical frame audio. Then, the phoneme probability distribution of the current frame audio may be obtained according to the speech features and the lip regions, and the speech recognition result of the current frame audio may be obtained according to the phoneme probability distribution.
Type: Application
Filed: October 14, 2021
Publication date: April 14, 2022
Applicant: BEIJING HORIZON ROBOTICS TECHNOLOGY RESEARCH AND DEVELOPMENT CO., LTD.
Inventor: Yichen GONG
-
Publication number: 20220115003
Abstract: A method of determining an alignment sequence between a reference sequence of symbols and a hypothesis sequence of symbols includes loading a reference sequence of symbols to a computing system and creating a reference finite state automaton for the reference sequence of symbols. The method further includes loading a hypothesis sequence of symbols to the computing system and creating a hypothesis finite state automaton for the hypothesis sequence of symbols. The method further includes traversing the reference finite state automaton, adding new reference arcs and new reference transforming properties arcs, and traversing the hypothesis finite state automaton, adding new hypothesis arcs and new hypothesis transforming properties arcs. The method further includes composing the hypothesis finite state automaton with the reference finite state automaton, creating alternative paths to form a composed finite state automaton, and tracking a number of the alternative paths created.
Type: Application
Filed: October 13, 2020
Publication date: April 14, 2022
Inventors: Jean-Philippe Robichaud, Miguel Jette, Joshua Ian Dong, Quinten McNamara, Nishchal Bhandari, Michelle Kai Yu Huang
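Reference/hypothesis alignment of the kind this abstract describes with finite state automaton composition is commonly computed, in its simplest form, by dynamic programming over edit operations. The sketch below is a textbook Levenshtein distance, shown only to make the alignment problem concrete; it is not the patented FSA method:

```python
def edit_distance(ref, hyp):
    """Minimum number of insertions, deletions, and substitutions
    needed to turn the hypothesis sequence into the reference sequence."""
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # delete all remaining reference symbols
    for j in range(len(hyp) + 1):
        d[0][j] = j  # insert all remaining hypothesis symbols
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # match/substitution
    return d[len(ref)][len(hyp)]
```

An FSA-composition formulation, as in the abstract, generalizes this by letting the automata carry alternative paths and transforming properties (e.g., normalization variants) that a plain DP table cannot represent.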
-
Publication number: 20220115004
Abstract: System and method for generating disambiguated terms in automatically generated transcriptions including instructions within a knowledge domain, and employing the system, are disclosed.
Type: Application
Filed: December 21, 2021
Publication date: April 14, 2022
Inventor: Ahmad Badary
-
Publication number: 20220115005
Abstract: Methods and apparatuses are provided for performing sequence to sequence (Seq2Seq) speech recognition training performed by at least one processor. The method includes acquiring a training set comprising a plurality of pairs of input data and target data corresponding to the input data, encoding the input data into a sequence of hidden states, performing a connectionist temporal classification (CTC) model training based on the sequence of hidden states, performing an attention model training based on the sequence of hidden states, and decoding the sequence of hidden states to generate target labels by independently performing the CTC model training and the attention model training.
Type: Application
Filed: December 22, 2021
Publication date: April 14, 2022
Applicant: TENCENT AMERICA LLC
Inventors: Jia CUI, Chao WENG, Guangsen WANG, Jun WANG, Chengzhu YU, Dan SU, Dong YU
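The CTC branch mentioned in this abstract labels every encoder frame, then collapses the frame sequence into a target label sequence by merging repeats and dropping blanks. That collapse rule (standard CTC, not specific to this patent) is small enough to show directly:

```python
def ctc_greedy_collapse(frame_labels, blank="_"):
    """Collapse per-frame CTC labels: merge consecutive repeats,
    then remove blank symbols."""
    out = []
    prev = None
    for lab in frame_labels:
        if lab != prev and lab != blank:
            out.append(lab)
        prev = lab
    return "".join(out)
```

For example, the frame sequence `h h _ e _ l l _ l o _` collapses to `hello`; the blank between the two `l` runs is what allows a doubled letter to survive the repeat merge.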
-
Publication number: 20220115006
Abstract: This invention relates generally to speech processing and more particularly to end-to-end automatic speech recognition (ASR) that utilizes long contextual information. Some embodiments of the invention provide a system and a method for end-to-end ASR suitable for recognizing long audio recordings such as lecture and conversational speeches. This disclosure includes a Transformer-based ASR system that utilizes contextual information, wherein the Transformer accepts multiple utterances at the same time and predicts the transcript for the last utterance. This is repeated in a sliding-window fashion with one-utterance shifts to recognize the entire recording. In addition, some embodiments of the present invention may use acoustic and/or text features obtained from only the previous utterances spoken by the same speaker as the last utterance when the long audio recording includes multiple speakers.
Type: Application
Filed: October 13, 2020
Publication date: April 14, 2022
Applicant: Mitsubishi Electric Research Laboratories, Inc.
Inventors: Takaaki Hori, Niko Moritz, Chiori Hori, Jonathan Le Roux
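The sliding-window decoding loop described above is easy to picture as a generator over utterances: each step yields the preceding context utterances plus the one target utterance to transcribe, then shifts by one. A minimal sketch (names are illustrative; the actual system feeds these windows to a Transformer):

```python
def sliding_windows(utterances, context):
    """Yield (context_utterances, target_utterance) pairs with
    one-utterance shifts, as in windowed long-form ASR decoding."""
    for i in range(len(utterances)):
        yield utterances[max(0, i - context):i], utterances[i]
```

Each target is decoded once, so the full recording is covered, while the model still sees up to `context` earlier utterances for long-range consistency.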
-
Publication number: 20220115007
Abstract: A device includes a memory configured to store instructions and one or more processors configured to execute the instructions. The one or more processors are configured to execute the instructions to receive audio data including first audio data corresponding to a first output of a first microphone and second audio data corresponding to a second output of a second microphone. The one or more processors are also configured to execute the instructions to provide the audio data to a dynamic classifier. The dynamic classifier is configured to generate a classification output corresponding to the audio data. The one or more processors are further configured to execute the instructions to determine, at least partially based on the classification output, whether the audio data corresponds to user voice activity.
Type: Application
Filed: May 5, 2021
Publication date: April 14, 2022
Inventors: Taher SHAHBAZI MIRZAHASANLOO, Rogerio Guedes ALVES, Erik VISSER, Lae-Hoon KIM
-
Publication number: 20220115008
Abstract: The method S200 can include: at an aircraft, receiving an audio utterance from air traffic control S210, converting the audio utterance to text, determining commands from the text using a question-and-answer model S240, and optionally controlling the aircraft based on the commands S250. The method functions to automatically interpret flight commands from the air traffic control (ATC) stream.
Type: Application
Filed: October 13, 2021
Publication date: April 14, 2022
Inventors: Michael Pust, Joseph Bondaryk, Matthew George
-
Publication number: 20220115009
Abstract: Techniques are described herein for cross-device data synchronization based on simultaneous hotword triggers.
Type: Application
Filed: December 8, 2020
Publication date: April 14, 2022
Inventors: Matthew Sharifi, Victor Carbune
-
Publication number: 20220115010
Abstract: Embodiments of the present disclosure set forth a computer-implemented method comprising detecting an initial phrase portion included in a first auditory signal generated by a user, identifying, based on the initial phrase portion, a supplemental phrase portion that complements the initial phrase portion to form a complete phrase, and providing a command signal that drives an output device to generate an audio output corresponding to the supplemental phrase portion.
Type: Application
Filed: October 8, 2020
Publication date: April 14, 2022
Inventors: Stefan MARTI, Joseph VERBEKE, Evgeny BURMISTROV, Priya SESHADRI
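At its simplest, the phrase-completion step in this abstract is a lookup from a detected initial portion to its complement. A toy sketch (the phrase table and function are hypothetical; a real system would use a learned model over many phrases):

```python
# Hypothetical table mapping an initial phrase portion to the
# supplemental portion that completes it.
PHRASES = {
    "happy new": "year",
    "thank you very": "much",
}

def complete(initial_portion):
    """Return the supplemental phrase portion for a detected initial
    portion, or None if no complete phrase is known."""
    return PHRASES.get(initial_portion.strip().lower())
```

The returned supplemental portion would then drive the output device (e.g., a speaker) rather than being shown as text.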
-
Publication number: 20220115011
Abstract: Techniques are described herein for identifying a failed hotword attempt. A method includes: receiving first audio data; processing the first audio data to generate a first predicted output; determining that the first predicted output satisfies a secondary threshold but does not satisfy a primary threshold; receiving second audio data; processing the second audio data to generate a second predicted output; determining that the second predicted output satisfies the secondary threshold but does not satisfy the primary threshold; in response to the first predicted output and the second predicted output satisfying the secondary threshold but not satisfying the primary threshold, and in response to the first spoken utterance and the second spoken utterance satisfying one or more temporal criteria relative to one another, identifying a failed hotword attempt; and in response to identifying the failed hotword attempt, providing a hint that is responsive to the failed hotword attempt.
Type: Application
Filed: October 27, 2020
Publication date: April 14, 2022
Inventors: Matthew Sharifi, Victor Carbune
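The two-threshold logic in this abstract reduces to: two consecutive "near miss" scores (above the secondary threshold, below the primary) close together in time indicate a failed hotword attempt. A sketch under assumed threshold values (all names and defaults here are illustrative, not from the patent):

```python
def failed_hotword(first, second, primary=0.8, secondary=0.5, max_gap_s=5.0):
    """Return True if two utterances look like a failed hotword attempt.

    first and second are (score, timestamp_seconds) tuples. Both scores
    must fall in the near-miss band [secondary, primary), and the two
    utterances must satisfy the temporal criterion (close in time).
    """
    def near_miss(score):
        return secondary <= score < primary
    (s1, t1), (s2, t2) = first, second
    return near_miss(s1) and near_miss(s2) and abs(t2 - t1) <= max_gap_s
```

On a True result, the assistant would surface a hint (e.g., how to say the hotword) instead of silently ignoring the user.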
-
Publication number: 20220115012
Abstract: The present application discloses a method and apparatus for processing voices, a device and a computer storage medium, and relates to the technical field of voices. An implementation includes: recognizing a received voice request by a server of a first voice assistant to obtain a text request; sending the recognized text request to a server of a second voice assistant; receiving token information generated and returned by the server of the second voice assistant for the text request; and sending the text request and the token information to a client of the first voice assistant, such that the client of the first voice assistant calls a client of the second voice assistant to respond to the text request based on the token information. Based on the present application, after a user inputs the voice request with the first voice assistant, the first voice assistant may call the second voice assistant to respond to the voice request when the second voice assistant may better respond to the voice request.
Type: Application
Filed: May 7, 2020
Publication date: April 14, 2022
Inventors: Jizhou Huang, Shiqiang Ding, Changshun Hou
-
Publication number: 20220115013
Abstract: In one aspect, a server receives, from a client terminal via a network, a request to initiate a verbal conversation using natural language that is in a spoken or textual format, extracts information during the verbal conversation, determines a context of the verbal conversation, receives an inquiry during the verbal conversation, processes the inquiry, acquires response information based on the determined appropriate response, and transmits the response information to the client terminal.
Type: Application
Filed: July 19, 2021
Publication date: April 14, 2022
Applicant: FIRST ADVANTAGE CORPORATION
Inventors: Arun N. KUMAR, Stefano MALNATI
-
Publication number: 20220115014
Abstract: A vehicle agent device receives utterance information from an on-board unit, analyzes the content of the utterance, detects, as a non-installed function from a database, a function that an occupant intended to utilize but which was not installed and is installable, generates proposal information for furnishing the occupant with information relating to the non-installed function it detected, and sends the proposal information that has been generated to the on-board unit, to thereby send the information relating to the non-installed function to a preregistered mobile device carried by the occupant.
Type: Application
Filed: October 6, 2021
Publication date: April 14, 2022
Applicant: TOYOTA JIDOSHA KABUSHIKI KAISHA
Inventors: Chikage KUBO, Keiko Nakano, Eiichi Maeda, Hiroyuki Nishizawa
-
Publication number: 20220115015
Abstract: Systems and methods presented herein generally include multi-wake phrase detection executed on a single device utilizing multiple voice assistants. Systems and methods presented herein can further include continuously running a Voice Activity Detection (VAD) process which detects presence of human speech. The multi-wake phrase detection can activate when the VAD process detects human speech. Once activated, the multi-wake phrase detection can determine which (if any) of the wake phrases of the multiple voice assistants might be in the detected speech. Operation of the multi-wake phrase detection can have a low miss-rate. In some examples, operation of the multi-wake phrase detection can be granular to accomplish the low miss-rates at low power with a tolerance for false positives on wake phrase detection.
Type: Application
Filed: October 12, 2021
Publication date: April 14, 2022
Inventors: Mouna Elkhatib, Adil Benyassine
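The VAD-gated flow in this abstract can be sketched as a function that only runs wake-phrase matching when speech is present, then checks each assistant's phrase. This is a toy illustration (assistant names, phrases, and substring matching are assumptions; real detectors score acoustic models, not transcripts):

```python
def detect_wake(frame_is_speech, transcript, wake_phrases):
    """Run wake-phrase matching only when the VAD reports speech.

    wake_phrases maps an assistant id to its wake phrase; returns the
    id of the first assistant whose phrase appears, else None.
    """
    if not frame_is_speech:
        return None  # VAD gate: skip detection entirely, saving power
    text = transcript.lower()
    for assistant, phrase in wake_phrases.items():
        if phrase in text:
            return assistant
    return None
```

Gating on VAD is what makes always-on multi-assistant detection cheap: the (more expensive) phrase matchers never run on silence.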
-
Publication number: 20220115016
Abstract: A system may receive audio data that represents a wakeword associated with a first speech-processing system and a command associated with a second speech-processing system. Different indications of handing the audio data off to the second speech-processing system may be determined based on a determined amount of interaction with the second speech-processing system. If the amount of interaction is low, a longer, more detailed indication is generated; if the amount of interaction is high, a brief, less detailed indication is generated. A local device may output audio corresponding to the indication before outputting audio generated by the second speech-processing system in response to the command.
Type: Application
Filed: October 14, 2021
Publication date: April 14, 2022
Inventor: Timothy Whalin
-
Publication number: 20220115017
Abstract: Methods, apparatuses, and computing systems are provided for integrating logic services with a group communication service. In an implementation, a method may include receiving a spoken message from a communication node in a communication group, determining that the spoken message relates to a logic service, and transferring the spoken message to a voice assistant service with an indication that the spoken message relates to the logic service. The method also includes receiving status information from the logic service indicative of a status of a networked device associated with the logic service. The method further includes sending an audible announcement to the communication nodes in the communication group expressive of the status of the networked device.
Type: Application
Filed: December 20, 2021
Publication date: April 14, 2022
Inventors: Greg Albrecht, Ellen Juhlin, Jesse Robbins, Justin Black
-
Publication number: 20220115018
Abstract: A messaging system, which hosts a backend service for an associated messaging client, includes a voice chat system that provides voice chat functionality that enables users to dictate their messages, while delivering the resulting message to the intended recipient as both the associated audio and text content. When a user at a sender client device begins dictating a voice message, the voice chat system starts converting the received audio stream into text and, also, starts communicating the audio content together with the generated text to the recipient client device. The recipient user can listen to the voice message and read the text generated from the audio in real time. It is also possible for the recipient user to consume the voice message in a textual form only, if the sound at the client device is undesirable.
Type: Application
Filed: October 14, 2020
Publication date: April 14, 2022
Inventors: Laurent Desserrey, Jeremy Baker Voss
-
Publication number: 20220115019
Abstract: Methods and systems for enabling an efficient review of meeting content via a metadata-enriched, speaker-attributed and multiuser-editable transcript are disclosed. By incorporating speaker diarization and other metadata, the system can provide a structured and effective way to review and/or edit the transcript by one or more editors. One type of metadata can be image or video data to represent the meeting content. Furthermore, the present subject matter utilizes a multimodal diarization model to identify and label different speakers. The system can synchronize various sources of data, e.g., audio channel data, voice feature vectors, acoustic beamforming, image identification, and extrinsic data, to implement speaker diarization.
Type: Application
Filed: October 11, 2021
Publication date: April 14, 2022
Applicant: SoundHound, Inc.
Inventors: Kiersten L. BRADLEY, Ethan COEYTAUX, Ziming YIN
-
Publication number: 20220115020
Abstract: Methods and systems for enabling an efficient review of meeting content via a metadata-enriched, speaker-attributed transcript are disclosed. By incorporating speaker diarization and other metadata, the system can provide a structured and effective way to review and/or edit the transcript. One type of metadata can be image or video data to represent the meeting content. Furthermore, the present subject matter utilizes a multimodal diarization model to identify and label different speakers. The system can synchronize various sources of data, e.g., audio channel data, voice feature vectors, acoustic beamforming, image identification, and extrinsic data, to implement speaker diarization.
Type: Application
Filed: October 11, 2021
Publication date: April 14, 2022
Applicant: SoundHound, Inc.
Inventors: Kiersten L. BRADLEY, Ethan COEYTAUX, Ziming YIN
-
Publication number: 20220115021
Abstract: A talker prediction method obtains a voice from a plurality of talkers, records a conversation history of the plurality of talkers, identifies a talker of the obtained voice, and predicts a next talker among the plurality of talkers based on the identified talker and the conversation history.
Type: Application
Filed: October 5, 2021
Publication date: April 14, 2022
Inventors: Satoshi UKAI, Ryo Tanaka
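One simple way to realize "predict a next talker from the identified talker and the conversation history" is a first-order transition model: count who tends to speak after whom, then predict the most frequent successor of the current talker. This sketch is an assumed baseline, not necessarily the patented method:

```python
from collections import Counter, defaultdict

class TalkerPredictor:
    """Predict the next talker from observed talker-to-talker transitions."""

    def __init__(self):
        self.transitions = defaultdict(Counter)  # prev talker -> successor counts
        self.prev = None

    def observe(self, talker):
        """Record one identified talker turn in the conversation history."""
        if self.prev is not None:
            self.transitions[self.prev][talker] += 1
        self.prev = talker

    def predict_next(self):
        """Most frequent successor of the current talker, or None."""
        counts = self.transitions.get(self.prev)
        if not counts:
            return None
        return counts.most_common(1)[0][0]
```

For example, after observing the turn sequence A, B, A, B, A the model predicts B next, since B has always followed A so far.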
-
Publication number: 20220115022
Abstract: Implementations relate to automatic generation of speaker features for each of one or more particular text-dependent speaker verifications (TD-SVs) for a user. Implementations can generate speaker features for a particular TD-SV using instances of audio data that each capture a corresponding spoken utterance of the user during normal non-enrollment interactions with an automated assistant via one or more respective assistant devices. For example, a portion of an instance of audio data can be used in response to: (a) determining that recognized term(s) for the spoken utterance captured by that portion correspond to the particular TD-SV; and (b) determining that an authentication measure, for the user and for the spoken utterance, satisfies a threshold. Implementations additionally or alternatively relate to utilization of speaker features, for each of one or more particular TD-SVs for a user, in determining whether to authenticate a spoken utterance for the user.
Type: Application
Filed: October 13, 2020
Publication date: April 14, 2022
Inventors: Matthew Sharifi, Victor Carbune
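The gating step in this abstract, using an utterance to update a speaker profile only when an authentication measure satisfies a threshold, can be sketched as a running mean over utterance embeddings. Names, the (mean, count) profile representation, and the default threshold are all illustrative assumptions:

```python
def update_profile(profile, embedding, auth_score, threshold=0.9):
    """Average a new utterance embedding into the speaker profile only
    when the authentication measure satisfies the threshold.

    profile is (mean_vector, count) or None; returns the updated profile.
    """
    if auth_score < threshold:
        return profile  # unauthenticated utterance: leave profile untouched
    if profile is None:
        return (list(embedding), 1)
    mean, n = profile
    new_mean = [(m * n + e) / (n + 1) for m, e in zip(mean, embedding)]
    return (new_mean, n + 1)
```

Because only high-confidence utterances are averaged in, the profile improves over normal assistant use without a dedicated enrollment session.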
-
Publication number: 20220115023
Abstract: A voice recognition transmission system for preventing unauthorized activation of a vehicle is provided. The system includes a microphone, a voice recognition module, and an automotive gear shifter. The microphone receives a voice command. The voice recognition module analyzes the voice command against a plurality of authorized voice profiles. The voice recognition module determines if the voice command reaches a threshold of similarity in tone and tenor of one authorized voice profile. A voice command that reaches the threshold of similarity will activate the automotive gear shifter to perform the voice command. A voice command that fails to reach the threshold of similarity will activate an interlock. The interlock will prevent an unauthorized individual from operating the vehicle and activate the vehicle alarm system. A notification will be sent to a connected electronic device when a voice command activates the automotive gear shifter or the interlock.
Type: Application
Filed: October 13, 2021
Publication date: April 14, 2022
Inventor: Milia Cora
-
Publication number: 20220115024Abstract: Examples of the disclosure relate to apparatus, methods and computer programs for encoding spatial metadata. The example apparatus includes circuitry configured for obtaining spatial metadata associated with spatial audio content and obtaining a configuration parameter indicative of a source format of the spatial audio content. The circuitry is also configured to use the configuration parameter to select a method of compression of the spatial metadata associated with the spatial audio content.Type: ApplicationFiled: October 28, 2019Publication date: April 14, 2022Inventors: Tapani PIHLAJAKUJA, Lasse LAAKSONEN, Antti ERONEN, Arto LEHTINIEMI
-
Publication number: 20220115025Abstract: Encoding/decoding an audio signal having one or more audio components, wherein each audio component is associated with a spatial location. A first audio signal presentation (z) of the audio components, a first set of transform parameters (w(f)), and signal level data (?2) are encoded and transmitted to the decoder. The decoder uses the first set of transform parameters (w(f)) to form a reconstructed simulation input signal intended for an acoustic environment simulation, and applies a signal level modification (?) to the reconstructed simulation input signal. The signal level modification is based on the signal level data (?2) and data (p2) related to the acoustic environment simulation. The attenuated reconstructed simulation input signal is then processed in an acoustic environment simulator. With this process, the decoder does not need to determine the signal level of the simulation input signal, thereby reducing processing load.Type: ApplicationFiled: October 25, 2021Publication date: April 14, 2022Applicant: Dolby Laboratories Licensing CorporationInventor: Dirk Jeroen BREEBAART
-
Publication number: 20220115026Abstract: An apparatus includes a receiver and a decoder. The receiver is configured to receive a bitstream that includes a first frame and a second frame. The first frame includes a first portion of a mid channel and a first quantized stereo parameter. The second frame includes a second portion of the mid channel and a second quantized stereo parameter. The decoder is configured to generate a first portion of a channel based on the first portion of the mid channel and the first quantized stereo parameter. The decoder is configured to, in response to the second frame being unavailable for decoding operations, estimate the second quantized stereo parameter based on stereo parameters of one or more preceding frames and generate a second portion of the channel based on the estimated second quantized stereo parameter. The second portion of the channel corresponds to a decoded version of the second frame.Type: ApplicationFiled: December 20, 2021Publication date: April 14, 2022Inventors: Venkata Subrahmanyam Chandra Sekhar CHEBIYYAM, Venkatraman Atti
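The abstract leaves the estimator for a lost frame's stereo parameter unspecified beyond "based on stereo parameters of one or more preceding frames". A minimal sketch, assuming simple linear extrapolation from the last two received frames (a common concealment heuristic, not necessarily the patented method):

```python
# Sketch: conceal a lost frame's quantized stereo parameter from the
# history of preceding frames. Linear extrapolation is an assumed
# strategy for illustration only.

def estimate_stereo_parameter(history):
    """Estimate the missing frame's stereo parameter; with fewer than
    two preceding frames, fall back to repeating the last one."""
    if len(history) >= 2:
        return history[-1] + (history[-1] - history[-2])
    return history[-1]
```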
-
Publication number: 20220115027Abstract: Higher Order Ambisonics represents three-dimensional sound independent of a specific loudspeaker set-up. However, transmission of an HOA representation results in a very high bit rate. Therefore, compression with a fixed number of channels is used, in which directional and ambient signal components are processed differently. For coding, portions of the original HOA representation are predicted from the directional signal components. This prediction provides side information which is required for a corresponding decoding. By using some additional specific purpose bits, a known side information coding processing is improved in that the required number of bits for coding that side information is reduced on average.Type: ApplicationFiled: December 21, 2021Publication date: April 14, 2022Applicant: Dolby Laboratories Licensing CorporationInventors: Sven Kordon, Alexander Krueger, Oliver Wuebbolt
-
Publication number: 20220115028Abstract: Information loss in speech-to-text conversion and the inability to preserve vocal emotion information without changing the artificial intelligence model infrastructure are essential drawbacks of conventional speech-to-speech translation systems. Embodiments of the invention provide a direct speech-to-speech translation system. The direct speech-to-speech translation system uses a one-tier approach, creating a unified model for the whole application. The single-model ecosystem takes audio (a mel spectrogram) as input and gives audio (a mel spectrogram) as output. This solves the bottleneck problem by not converting speech directly to text but having text as a byproduct of speech-to-speech translation, preserving phonetic information along the way. This model also uses pre-processing and post-processing scripts, but only for the whole model. This model needs parallel audio samples in two languages.Type: ApplicationFiled: December 24, 2021Publication date: April 14, 2022Inventors: Sandeep Dhawan, Kapil Dhawan, Dennis Reutter, Chris Beckman, Ahsan Memon
-
Publication number: 20220115029Abstract: In one aspect, a method includes detecting a fingerprint match between query fingerprint data representing at least one audio segment within podcast content and reference fingerprint data representing known repetitive content within other podcast content, detecting a feature match among a set of audio features across multiple time-windows of the podcast content, and detecting a text match between at least one query text sentence from a transcript of the podcast content and reference text sentences, the reference text sentences comprising text sentences from the known repetitive content within the other podcast content. The method also includes, responsive to the detections, generating sets of labels identifying potential repetitive content within the podcast content. The method also includes selecting, from the sets of labels, a consolidated set of labels identifying segments of repetitive content within the podcast content, and responsive to selecting the consolidated set of labels, performing an action.Type: ApplicationFiled: December 10, 2020Publication date: April 14, 2022Inventors: Amanmeet Garg, Aneesh Vartakavi
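The consolidation step (merging label sets from the fingerprint, feature, and text detectors into one set) is not detailed in the abstract. One plausible approach, sketched here purely as an assumption, is per-second voting: keep every second flagged by at least two detectors, then stitch consecutive seconds back into segments.

```python
# Sketch: consolidate per-detector label sets, each a list of
# (start, end) second ranges, by majority voting. The voting rule is
# an illustrative assumption, not the patented selection method.

def consolidate(label_sets, min_votes=2):
    """Return merged (start, end) segments covering every second
    flagged by at least `min_votes` of the detectors."""
    votes = {}
    for labels in label_sets:
        for start, end in labels:
            for t in range(start, end):
                votes[t] = votes.get(t, 0) + 1
    flagged = sorted(t for t, v in votes.items() if v >= min_votes)
    segments, run = [], []
    for t in flagged:
        if run and t == run[-1] + 1:
            run.append(t)
        else:
            if run:
                segments.append((run[0], run[-1] + 1))
            run = [t]
    if run:
        segments.append((run[0], run[-1] + 1))
    return segments
```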
-
Publication number: 20220115030Abstract: The disclosed computer-implemented method may include obtaining an audio sample from a content source, inputting the obtained audio sample into a trained machine learning model, obtaining the output of the trained machine learning model, wherein the output is a profile of an environment in which the input audio sample was recorded, obtaining an acoustic impulse response corresponding to the profile of the environment in which the input audio sample was recorded, obtaining a second audio sample, processing the obtained acoustic impulse response with the second audio sample, and inserting a result of processing the obtained acoustic impulse response and the second audio sample into an audio track. Various other methods, systems, and computer-readable media are also disclosed.Type: ApplicationFiled: December 17, 2021Publication date: April 14, 2022Inventors: Yadong Wang, Shilpa Jois Rao, Murthy Parthasarathi, Kyle Tacke
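The final processing step (combining the retrieved acoustic impulse response with the second audio sample) is in essence a convolution. A minimal direct-convolution sketch is shown below; a production system would use FFT-based convolution, and the function name is invented for illustration.

```python
# Sketch: convolve an acoustic impulse response with a second audio
# sample so the result sounds as if recorded in the profiled
# environment. Plain O(n*m) direct convolution, for clarity only.

def apply_room(ir, audio):
    """Return the full convolution of impulse response `ir` with
    `audio` (output length: len(audio) + len(ir) - 1)."""
    out = [0.0] * (len(audio) + len(ir) - 1)
    for i, a in enumerate(audio):
        for j, h in enumerate(ir):
            out[i + j] += a * h
    return out
```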
-
Publication number: 20220115031Abstract: A credit segment identifying device includes an extracting unit which extracts, from a first speech signal, a plurality of first partial speech signals which are each a part of the first speech signal and shifted from each other in the time direction, and an identifying unit which identifies a credit segment in the first speech signal by determining whether each of the first partial speech signals includes a credit according to an association between each of second partial signals extracted from a second speech signal and the presence/absence of a credit, so that credit segments can be identified more efficiently.Type: ApplicationFiled: January 24, 2020Publication date: April 14, 2022Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATIONInventors: Yasunori OISHI, Takahito KAWANISHI, Kunio KASHINO
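The extracting unit's time-shifted partial signals amount to an overlapping sliding window over the speech signal. A minimal sketch, with illustrative window and hop sizes:

```python
# Sketch of the extracting unit: cut overlapping partial signals,
# each `window` samples long and shifted by `hop` samples from the
# previous one. Parameter names and sizes are illustrative.

def extract_partials(signal, window, hop):
    """Return the time-shifted partial signals described in the
    abstract, as a list of slices of the input signal."""
    return [signal[i:i + window]
            for i in range(0, len(signal) - window + 1, hop)]
```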
-
Publication number: 20220115032Abstract: A technique is provided for detecting harmful behavior such as power harassment, sexual harassment, or bullying in a work environment and supporting its handling. A harmful behavior detecting system includes a computer that executes observation and detection regarding harmful behavior, including power harassment, sexual harassment, and bullying, among people in a work environment. The computer obtains voice data into which voice around a target person is inputted; obtains voice information containing words and emotion information from the voice data; and obtains data such as vital data, date and time, or a location of the target person. The computer uses five elements, namely words and an emotion of the other person and words, an emotion, and vital data of the target person, to calculate an index value regarding the harmful behavior; estimate a state of the harmful behavior based on the index value; and output handling data for handling the harmful behavior in accordance with the estimated state.Type: ApplicationFiled: November 12, 2019Publication date: April 14, 2022Inventors: Satoshi IWAGAKI, Atsushi SHIMADA, Masumi SUEHIRO, Hidenori CHIBA, Kouichi HORIUCHI
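The abstract does not say how the five elements combine into an index value. One common way to sketch such a fusion, assumed here for illustration only, is a weighted sum of pre-scored elements followed by threshold-based state estimation; the weights and thresholds are invented.

```python
# Sketch: fuse the abstract's five elements (each pre-scored to
# [0, 1]) into an index, then map the index to a coarse state.
# Weights and thresholds are illustrative assumptions.

def harassment_index(other_words, other_emotion, target_words,
                     target_emotion, target_vitals,
                     weights=(0.25, 0.2, 0.2, 0.2, 0.15)):
    """Weighted sum of the five element scores."""
    elements = (other_words, other_emotion, target_words,
                target_emotion, target_vitals)
    return sum(w * e for w, e in zip(weights, elements))

def estimate_state(index, thresholds=(0.3, 0.6)):
    """Coarse state estimate from the index value."""
    if index >= thresholds[1]:
        return "harmful"
    if index >= thresholds[0]:
        return "watch"
    return "normal"
```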
-
Publication number: 20220115033Abstract: A toxicity moderation system has an input configured to receive speech from a speaker. The system includes a multi-stage toxicity machine learning system having a first stage and a second stage. The first stage is trained to analyze the received speech to determine whether a toxicity level of the speech meets a toxicity threshold. The first stage is also configured to filter through, to the second stage, speech that meets the toxicity threshold, and is further configured to filter out speech that does not meet the toxicity threshold.Type: ApplicationFiled: October 8, 2021Publication date: April 14, 2022Inventors: William Carter Huffman, Michael Pappas, Henry Howie
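The staged design can be sketched as a cheap first-pass filter that only forwards suspect speech to a costlier second model. The function names and toy scorers below are illustrative, not the patented models.

```python
# Sketch of a two-stage toxicity pipeline: stage 1 (cheap_score)
# filters out speech below the toxicity threshold; only speech that
# clears it reaches the costlier stage-2 model (expensive_score).

def moderate(utterances, cheap_score, expensive_score, threshold=0.5):
    """Return (utterance, stage-2 score) pairs for speech that passed
    the stage-1 threshold."""
    flagged = [u for u in utterances if cheap_score(u) >= threshold]
    return [(u, expensive_score(u)) for u in flagged]

# Toy scorers standing in for the two trained stages.
cheap = lambda u: 0.9 if ("fool" in u or "awful" in u) else 0.1
expensive = lambda u: 0.95
results = moderate(["you fool", "hello", "awful person"], cheap, expensive)
```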
-
Publication number: 20220115034Abstract: An audio response system can generate multimodal messages that can be dynamically updated on a viewer's client device based on a type of audio response detected. The audio responses can include keywords or continuum-based signals (e.g., levels of wind noise). A machine learning scheme can be trained to output classification data from the audio response data for content selection and dynamic display updates.Type: ApplicationFiled: December 22, 2021Publication date: April 14, 2022Inventors: Gurunandan Krishnan Gorumkonda, Shree K. Nayar
-
Publication number: 20220115035Abstract: The present disclosure generally relates to a read head assembly having a dual free layer (DFL) structure disposed between a first shield and a second shield at a media facing surface. The read head assembly further comprises a rear hard bias (RHB) structure disposed adjacent to the DFL structure recessed from the media facing surface, where an insulation layer separates the RHB structure from the DFL structure. The insulation layer is disposed perpendicularly between the first shield and the second shield. The DFL structure comprises a first free layer and a second free layer having equal stripe heights from the media facing surface to the insulation layer. The RHB structure comprises a seed layer, a bulk layer, and a capping layer. The capping layer and the insulation layer prevent the bulk layer from contacting the second shield.Type: ApplicationFiled: February 24, 2021Publication date: April 14, 2022Inventors: Ming MAO, Chen-Jung CHIEN, Daniele MAURI, Goncalo Marcos BAIÃO DE ALBUQUERQUE