Patents Issued on April 14, 2022
  • Publication number: 20220114986
    Abstract: An afterimage compensation device includes: an afterimage area detector to receive an input image, and detect an afterimage area including an afterimage in the input image; an afterimage area corrector to detect a false detection area, and generate a corrected afterimage area, the false detection area being a part of a general area that is not detected as the afterimage area and surrounded in a plurality of directions by the detected afterimage area; and a compensation data generator to adjust a luminance of the corrected afterimage area to generate compensation data.
    Type: Application
    Filed: August 16, 2021
    Publication date: April 14, 2022
    Inventor: Jun Gyu LEE
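The false-detection correction described above — a region not flagged as afterimage but surrounded in a plurality of directions by the detected afterimage area is merged back in — resembles binary hole filling. A minimal pure-Python sketch, not from the patent (the function name and 4-connectivity are assumptions):

```python
from collections import deque

def fill_false_detections(mask):
    """Merge enclosed holes into a binary afterimage mask (1 = afterimage).

    A zero region that cannot reach the image border through other zero
    cells is surrounded by the detected afterimage area, so it is treated
    as a false (missed) detection and filled in.
    """
    h, w = len(mask), len(mask[0])
    outside = [[False] * w for _ in range(h)]
    # Seed with all zero cells on the border; they are reachable "outside".
    q = deque((r, c) for r in range(h) for c in range(w)
              if mask[r][c] == 0 and (r in (0, h - 1) or c in (0, w - 1)))
    for r, c in q:
        outside[r][c] = True
    # Flood-fill through zero cells (4-connectivity).
    while q:
        r, c = q.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and mask[nr][nc] == 0 \
                    and not outside[nr][nc]:
                outside[nr][nc] = True
                q.append((nr, nc))
    # Anything not reachable from outside becomes part of the corrected area.
    return [[1 if mask[r][c] == 1 or not outside[r][c] else 0
             for c in range(w)] for r in range(h)]
```

Luminance compensation would then be applied only where the corrected mask is 1.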
  • Publication number: 20220114987
    Abstract: The present disclosure provides a synchronous display method of a spliced screen, a display system, an electronic device and a computer readable medium. The spliced screen includes display screens spliced together, and the display method is based on wireless communication and includes: sending control information to the spliced screen N times at intervals so as to control the display screens to display simultaneously, where N is an integer not less than 2. The control information sent for the previous N−1 times includes first information and second information, and the control information sent for the Nth time at least includes the second information. The first information is configured for controlling the display screen receiving the control information to turn off a receiving component of the display screen, and the second information is configured for controlling the display screen receiving the control information to display after a preset time duration.
    Type: Application
    Filed: November 3, 2020
    Publication date: April 14, 2022
    Inventors: Genyu LIU, Tao LI, Xingqun JIANG, Chao YU, Quanzhong WANG
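The N-shot protocol above reduces to a simple message schedule: the first N−1 transmissions carry both the receiver-shutdown information and the delayed-display information, and the final transmission carries only the latter. A hypothetical sketch (the function and payload names are illustrative, not from the patent):

```python
def build_control_schedule(n, first_info, second_info):
    """Build the N-shot control schedule for a spliced screen.

    The first N-1 messages carry (first_info, second_info); the Nth
    message carries only second_info. first_info tells a screen to turn
    off its receiver, second_info tells it to display after a preset delay.
    """
    if n < 2:
        raise ValueError("N must be an integer not less than 2")
    return [(first_info, second_info)] * (n - 1) + [(None, second_info)]
```

Repeating the shutdown instruction in the early messages makes the protocol robust to any single screen missing a transmission.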
  • Publication number: 20220114988
    Abstract: A User-centric Enhanced Pathway (UEP) system may provide vehicle pathway guidance while driving using an augmented reality display system in the event of sun glare. The system uses a route observer module programmed to predict sun glare, by obtaining real-time information about the interior and exterior vehicle environment, heading, speed, and other data. The route observer module may also use an inward-facing camera to determine when the driver is squinting, which may increase the probability function that predicts when the driver is experiencing sun glare. When sun glare is predicted, the route observer sends the weighted prediction function and input signals to a decision module that uses vehicle speed, location, and user activity to determine appropriate guidance output for an enhanced pathway using an Augmented Reality (AR) module. The AR module may adjust brightness, contrast, and color of the AR information based on observed solar glare.
    Type: Application
    Filed: October 8, 2020
    Publication date: April 14, 2022
    Applicant: Ford Global Technologies, LLC
    Inventors: Kwaku O. Prakah-Asante, Jian Wan, Prayat Hegde
  • Publication number: 20220114989
    Abstract: A vehicle display system includes a vehicle detection device detecting at least one other vehicle present in surroundings of a host vehicle, a vehicle speed detection device detecting a speed of the host vehicle, a display device displaying the host vehicle and the at least one other vehicle respectively as vehicle icons, and a display control part configured to control a content of display of the display device. When the speed of the host vehicle is equal to or less than a predetermined value, the display control part is configured to blank out the display of or display transparently at least a part of a vehicle icon of a following vehicle detected by the vehicle detection device and positioned at a rear of the host vehicle in a driving lane of the host vehicle.
    Type: Application
    Filed: October 1, 2021
    Publication date: April 14, 2022
    Inventor: Hironori ITO
  • Publication number: 20220114990
    Abstract: Some implementations can include a string bender including a body configured to attach to a bridge of a guitar or other stringed instrument. Some implementations can include a bridge with an integrated string bender. The string bender can be constructed to bend one or more strings, for example the B and/or G strings. The string bender can also be constructed to fit a variety of guitar styles, such as Fender Telecaster or Stratocaster-style guitars.
    Type: Application
    Filed: July 17, 2021
    Publication date: April 14, 2022
    Inventor: Waylon Dale Baker
  • Publication number: 20220114991
    Abstract: A barrel for a musical instrument is disclosed herein. The barrel includes a top section, a bottom section, an adjustment ring and a bore through the center. The barrel and bore length are configured to be adjusted with the adjustment ring. The barrel may include haptic features which provide feedback to a user regarding adjustments to the length of the bore and barrel. A trilobe socket and trilobe plug may be included and are configured to connect to prevent the barrel from slipping when manipulated and provide structural support. A tapered bore and bore choke provide additional structural stability and acoustic flexibility to the barrel.
    Type: Application
    Filed: October 13, 2020
    Publication date: April 14, 2022
    Applicant: BBSR Holdings, LLC
    Inventor: Bradford Behn
  • Publication number: 20220114992
    Abstract: A pedal device for performance (1) has a pedal (4) where a player can perform a depressing operation with his/her foot. A rod (6) is movable in an axial direction in accordance with the depressing operation of the pedal (4). A detecting mechanism detects an amount of movement of the rod (6) and outputs a detection signal corresponding to the amount of movement. The detecting mechanism has an optical sensor (10) that is a non-contact sensor detecting an amount of displacement in the axial direction L of the rod (6) in a non-contact manner.
    Type: Application
    Filed: December 22, 2021
    Publication date: April 14, 2022
    Inventor: Yoshiaki MORI
  • Publication number: 20220114993
    Abstract: A virtual instrument for real-time musical generation includes a musical rule set unit for defining musical rules, a time constrained pitch generator for synchronizing generated music, an audio generator for generating audio signals, wherein the rule definitions describe real-time morphable music parameters, and said morphable music parameters are controllable directly by the real-time control signal. With this virtual instrument, the user can create new musical content in a simple and interactive way regardless of the level of musical training obtained before using the instrument.
    Type: Application
    Filed: September 24, 2019
    Publication date: April 14, 2022
    Applicant: GESTRUMENT AB
    Inventors: Jesper NORDIN, Jonatan LILJEDAHL, Jonas KJELLBERG, Pär GUNNARS RISBERG
  • Publication number: 20220114994
    Abstract: The present systems, devices, and methods generally relate to generating families of symbol sequences with controllable degree of correlation within and between them using quantum computers, and particularly to the exploitation of this capability to generate families of symbol sequences representing musical events such as, but not limited to, musical notes, musical chords, musical percussion strikes, musical time intervals, musical note intervals, and musical key changes that comprise a musical composition. Quantum random walks on graphs representing allowed transitions between musical events are also employed in some implementations.
    Type: Application
    Filed: December 17, 2021
    Publication date: April 14, 2022
    Inventors: Colin P. Williams, Gregory Gabrenya
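The walk on a graph of allowed transitions between musical events can be illustrated with its classical counterpart (the patent's quantum version is not reproduced here; the note names and transition table below are illustrative only):

```python
import random

def random_walk(transitions, start, steps, rng=None):
    """Generate a symbol sequence by walking a graph whose edges encode
    allowed transitions between musical events (notes, chords, etc.)."""
    rng = rng or random.Random(0)  # fixed seed for reproducibility
    seq = [start]
    for _ in range(steps):
        seq.append(rng.choice(transitions[seq[-1]]))
    return seq
```

Every emitted symbol is guaranteed to be a legal successor of the previous one, which is the property the graph encodes.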
  • Publication number: 20220114995
    Abstract: Audio signal dereverberation can be carried out, in accordance with instructions on a machine-readable storage medium, using a processor. In an example, a location of a person in a room can be determined. An audio signal received from the location of the person can be captured using beamforming. Room properties can be determined based in part on a signal sweep of the room. A dereverberation parameter can be determined based in part on the location of the person and the room properties. The dereverberation parameter can be applied to the audio signal.
    Type: Application
    Filed: July 3, 2019
    Publication date: April 14, 2022
    Applicant: Hewlett-Packard Development Company, L.P.
    Inventors: Srikanth Kuthuru, Sunil Bharitkar, Madhu Sudan Athreya
  • Publication number: 20220114996
    Abstract: Rotor noise cancellation for a personal aerial drone vehicle through the use of mechanical means. Active noise cancellation is achieved by creating an antiphase amplitude wave through modulation of the propeller blades, utilizing magnets embedded in the blades and an electromagnetic coil encircling them. A noise level sensor signals the rotor control system to adjust the frequency of the electromagnetic field surrounding the rotor and control the speed of the rotor. An additional method comprises incorporating a phase-locked loop within the control system configured to determine the frequencies corresponding to the rotors and generate corrective audio signals to achieve active noise cancellation.
    Type: Application
    Filed: October 3, 2021
    Publication date: April 14, 2022
    Inventor: Alan Richard Greenberg
  • Publication number: 20220114997
    Abstract: An information processing apparatus including a noise reduction unit that reduces noise generated from an unmanned aerial vehicle, included in an audio signal picked up by a microphone mounted on the unmanned aerial vehicle, on the basis of state information on a noise source.
    Type: Application
    Filed: November 7, 2019
    Publication date: April 14, 2022
    Inventors: Naoya TAKAHASHI, Weihsiang LIAO
  • Publication number: 20220114998
    Abstract: First data comprising a first range of audio frequencies is received. The first range of audio frequencies corresponds to a predetermined cochlear region of a listener. Second data comprising a second range of audio frequencies is also received. Third data comprising a first modulated range of audio frequencies is acquired. The third data is acquired by modulating the first range of audio frequencies according to a stimulation protocol that is configured to provide neural stimulation of a brain of the listener. The second data and the third data are arranged to generate an audio composition from the second data and the third data.
    Type: Application
    Filed: December 20, 2021
    Publication date: April 14, 2022
    Applicant: BrainFM, Inc.
    Inventor: Adam HEWETT
  • Publication number: 20220114999
    Abstract: A voice system of a moving machine, driven by a driver who is exposed to the outside of the moving machine, includes: a noise estimating section, which estimates a future noise state based on information related to a noise generation factor; and a voice control section, which changes an attribute of the voice output to the driver in accordance with the estimated noise state.
    Type: Application
    Filed: May 30, 2019
    Publication date: April 14, 2022
    Inventors: Masanori KINUHATA, Daisuke KAWAI, Shohei TERAI, Hisanosuke KAWADA, Hirotoshi SHIMURA
  • Publication number: 20220115000
    Abstract: Processor(s) of a client device can: identify a textual segment stored locally at the client device; process the textual segment, using an on-device TTS generator model, to generate synthesized speech audio data that includes synthesized speech of the textual segment; process the synthesized speech, using an on-device ASR model to generate predicted ASR output; and generate a gradient based on comparing the predicted ASR output to ground truth output corresponding to the textual segment. Processor(s) of the client device can also: process the synthesized speech audio data using an on-device TTS generator model to make a prediction; and generate a gradient based on the prediction. In these implementations, the generated gradient(s) can be used to update weight(s) of the respective on-device model(s) and/or transmitted to a remote system for use in remote updating of respective global model(s). The updated weight(s) and/or the updated model(s) can be transmitted to client device(s).
    Type: Application
    Filed: October 28, 2020
    Publication date: April 14, 2022
    Inventors: Françoise Beaufays, Johan Schalkwyk, Khe Chai Sim
  • Publication number: 20220115001
    Abstract: A voice-based digital assistant (VDA) uses a conversation intelligence (CI) manager module having a rule-based engine on conversational intelligence to process information from one or more modules to make determinations on both i) understanding the human conversational cues and ii) generating the human conversational cues, including at least understanding and generating a backchannel utterance, in a flow and exchange of human communication in order to at least one of grab or yield a conversational floor between a user and the VDA. The CI manager module uses the rule-based engine to analyze and make a determination on a conversational cue of, at least, prosody in a user's flow of speech to generate the backchannel utterance to signal any of i) an understanding, ii) a correction, iii) a confirmation, and iv) a questioning of verbal communications conveyed by the user in the flow of speech during a time frame when the user still holds the conversational floor.
    Type: Application
    Filed: May 7, 2020
    Publication date: April 14, 2022
    Inventors: Harry Bratt, Kristin Precoda, Dimitra Vergyri
  • Publication number: 20220115002
    Abstract: Disclosed are a speech recognition method and device, and electronic equipment. In the speech recognition method, when a user performs speech input, images of the user's lips may be captured while audio is collected; then a second lip region of the user in a current frame image is obtained based on the current frame image and at least one first lip region in a historical frame image. Concurrently, a second speech feature of current frame audio may be obtained based on the current frame audio and at least one first speech feature of historical frame audio. Then, the phoneme probability distribution of the current frame audio may be obtained according to the speech features and the lip regions, and the speech recognition result of the current frame audio may be obtained according to the phoneme probability distribution.
    Type: Application
    Filed: October 14, 2021
    Publication date: April 14, 2022
    Applicant: BEIJING HORIZON ROBOTICS TECHNOLOGY RESEARCH AND DEVELOPMENT CO., LTD.
    Inventor: Yichen GONG
  • Publication number: 20220115003
    Abstract: A method of determining an alignment sequence between a reference sequence of symbols and a hypothesis sequence of symbols includes loading a reference sequence of symbols to a computing system and creating a reference finite state automaton for the reference sequence of symbols. The method further includes loading a hypothesis sequence of symbols to the computing system and creating a hypothesis finite state automaton for the hypothesis sequence of symbols. The method further includes traversing the reference finite state automaton, adding new reference arcs and new reference transforming properties arcs and traversing the hypothesis finite state automaton, adding new hypothesis arcs and new hypothesis transforming properties arcs. The method further includes composing the hypothesis finite state automaton with the reference finite state automaton creating alternative paths to form a composed finite state automaton and tracking a number of the alternative paths created.
    Type: Application
    Filed: October 13, 2020
    Publication date: April 14, 2022
    Inventors: Jean-Philippe Robichaud, Miguel Jette, Joshua Ian Dong, Quinten McNamara, Nishchal Bhandari, Michelle Kai Yu Huang
  • Publication number: 20220115004
    Abstract: System and method for generating disambiguated terms in automatically generated transcriptions including instructions within a knowledge domain and employing the system are disclosed.
    Type: Application
    Filed: December 21, 2021
    Publication date: April 14, 2022
    Inventor: Ahmad Badary
  • Publication number: 20220115005
    Abstract: Methods and apparatuses are provided for performing sequence to sequence (Seq2Seq) speech recognition training performed by at least one processor. The method includes acquiring a training set comprising a plurality of pairs of input data and target data corresponding to the input data, encoding the input data into a sequence of hidden states, performing a connectionist temporal classification (CTC) model training based on the sequence of hidden states, performing an attention model training based on the sequence of hidden states, and decoding the sequence of hidden states to generate target labels by independently performing the CTC model training and the attention model training.
    Type: Application
    Filed: December 22, 2021
    Publication date: April 14, 2022
    Applicant: TENCENT AMERICA LLC
    Inventors: Jia CUI, Chao WENG, Guangsen WANG, Jun WANG, Chengzhu YU, Dan SU, Dong YU
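Joint CTC/attention training of the kind described above is commonly realized as an interpolated objective, e.g. loss = λ·L_CTC + (1−λ)·L_attention over the same encoder hidden states. The CTC term marginalizes over all blank-augmented alignments of the label sequence; a minimal forward-algorithm sketch in plain probabilities (an illustration of standard CTC, not the patent's implementation):

```python
def ctc_forward_prob(probs, labels, blank=0):
    """CTC probability of `labels` given per-frame distributions.

    probs[t][k] is the probability of symbol k at frame t. Labels are
    expanded with blanks (b, l1, b, l2, b, ...) and summed over all
    valid alignments via the forward recursion.
    """
    ext = [blank]
    for label in labels:
        ext += [label, blank]
    S = len(ext)
    alpha = [0.0] * S
    alpha[0] = probs[0][ext[0]]
    if S > 1:
        alpha[1] = probs[0][ext[1]]
    for t in range(1, len(probs)):
        new = [0.0] * S
        for s in range(S):
            a = alpha[s]                       # stay on the same state
            if s > 0:
                a += alpha[s - 1]              # advance by one state
            if s > 1 and ext[s] != blank and ext[s] != ext[s - 2]:
                a += alpha[s - 2]              # skip a blank between labels
            new[s] = a * probs[t][ext[s]]
        alpha = new
    return alpha[-1] + (alpha[-2] if S > 1 else 0.0)
```

With two uniform frames and one label, the valid alignments are "1 1", "blank 1", and "1 blank", giving probability 0.75 at p = 0.5 per symbol.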
  • Publication number: 20220115006
    Abstract: This invention relates generally to speech processing and more particularly to end-to-end automatic speech recognition (ASR) that utilizes long contextual information. Some embodiments of the invention provide a system and a method for end-to-end ASR suitable for recognizing long audio recordings such as lecture and conversational speeches. This disclosure includes a Transformer-based ASR system that utilizes contextual information, wherein the Transformer accepts multiple utterances at the same time and predicts transcript for the last utterance. This is repeated in a sliding-window fashion with one-utterance shifts to recognize the entire recording. In addition, some embodiments of the present invention may use acoustic and/or text features obtained from only the previous utterances spoken by the same speaker as the last utterance when the long audio recording includes multiple speakers.
    Type: Application
    Filed: October 13, 2020
    Publication date: April 14, 2022
    Applicant: Mitsubishi Electric Research Laboratories, Inc.
    Inventors: Takaaki Hori, Niko Moritz, Chiori Hori, Jonathan Le Roux
  • Publication number: 20220115007
    Abstract: A device includes a memory configured to store instructions and one or more processors configured to execute the instructions. The one or more processors are configured to execute the instructions to receive audio data including first audio data corresponding to a first output of a first microphone and second audio data corresponding to a second output of a second microphone. The one or more processors are also configured to execute the instructions to provide the audio data to a dynamic classifier. The dynamic classifier is configured to generate a classification output corresponding to the audio data. The one or more processors are further configured to execute the instructions to determine, at least partially based on the classification output, whether the audio data corresponds to user voice activity.
    Type: Application
    Filed: May 5, 2021
    Publication date: April 14, 2022
    Inventors: Taher SHAHBAZI MIRZAHASANLOO, Rogerio Guedes ALVES, Erik VISSER, Lae-Hoon KIM
  • Publication number: 20220115008
    Abstract: The method S200 can include: at an aircraft, receiving an audio utterance from air traffic control S210, converting the audio utterance to text, determining commands from the text using a question-and-answer model S240, and optionally controlling the aircraft based on the commands S250. The method functions to automatically interpret flight commands from the air traffic control (ATC) stream.
    Type: Application
    Filed: October 13, 2021
    Publication date: April 14, 2022
    Inventors: Michael Pust, Joseph Bondaryk, Matthew George
  • Publication number: 20220115009
    Abstract: Techniques are described herein for cross-device data synchronization based on simultaneous hotword triggers.
    Type: Application
    Filed: December 8, 2020
    Publication date: April 14, 2022
    Inventors: Matthew Sharifi, Victor Carbune
  • Publication number: 20220115010
    Abstract: Embodiments of the present disclosure set forth a computer-implemented method comprising detecting an initial phrase portion included in a first auditory signal generated by a user, identifying, based on the initial phrase portion, a supplemental phrase portion that complements the initial phrase portion to form a complete phrase, and providing a command signal that drives an output device to generate an audio output corresponding to the supplemental phrase portion.
    Type: Application
    Filed: October 8, 2020
    Publication date: April 14, 2022
    Inventors: Stefan MARTI, Joseph VERBEKE, Evgeny BURMISTROV, Priya SESHADRI
  • Publication number: 20220115011
    Abstract: Techniques are described herein for identifying a failed hotword attempt. A method includes: receiving first audio data; processing the first audio data to generate a first predicted output; determining that the first predicted output satisfies a secondary threshold but does not satisfy a primary threshold; receiving second audio data; processing the second audio data to generate a second predicted output; determining that the second predicted output satisfies the secondary threshold but does not satisfy the primary threshold; in response to the first predicted output and the second predicted output satisfying the secondary threshold but not satisfying the primary threshold, and in response to the first spoken utterance and the second spoken utterance satisfying one or more temporal criteria relative to one another, identifying a failed hotword attempt; and in response to identifying the failed hotword attempt, providing a hint that is responsive to the failed hotword attempt.
    Type: Application
    Filed: October 27, 2020
    Publication date: April 14, 2022
    Inventors: Matthew Sharifi, Victor Carbune
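The two-tier threshold logic above can be sketched directly: a score in the band between the secondary and primary thresholds is a near miss, and two near misses close together in time are flagged as a failed hotword attempt. The specific thresholds and window below are illustrative assumptions, not values from the patent:

```python
def detect_failed_hotword(events, primary=0.85, secondary=0.5, window_s=5.0):
    """Flag a failed hotword attempt.

    events: list of (timestamp_seconds, score) in chronological order.
    A score in [secondary, primary) is a near miss; two consecutive near
    misses within window_s seconds satisfy the temporal criterion.
    """
    near_misses = [(t, s) for t, s in events if secondary <= s < primary]
    for (t1, _), (t2, _) in zip(near_misses, near_misses[1:]):
        if t2 - t1 <= window_s:
            return True
    return False
```

On a failed attempt, the system would then surface a hint (e.g. restate the hotword) rather than silently ignoring the user.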
  • Publication number: 20220115012
    Abstract: The present application discloses a method and apparatus for processing voices, a device and a computer storage medium, and relates to the technical field of voices. An implementation includes: recognizing a received voice request by a server of a first voice assistant to obtain a text request; sending the recognized text request to a server of a second voice assistant; receiving token information generated and returned by the server of the second voice assistant for the text request; and sending the text request and the token information to a client of the first voice assistant, such that the client of the first voice assistant calls a client of the second voice assistant to respond to the text request based on the token information. Based on the present application, after a user inputs the voice request with the first voice assistant, the first voice assistant may call the second voice assistant to respond to the voice request when the second voice assistant may better respond to the voice request.
    Type: Application
    Filed: May 7, 2020
    Publication date: April 14, 2022
    Inventors: Jizhou Huang, Shiqiang Ding, Changshun Hou
  • Publication number: 20220115013
    Abstract: In one aspect, a server that receives, from a client terminal via a network, a request to initiate a verbal conversation using natural language that is in a spoken or textual format, extracts information during the verbal conversation, determines a context of the verbal conversation, receives an inquiry during the verbal conversation, processes the inquiry to determine an appropriate response, acquires response information based on the determined appropriate response, and transmits the response information to the client terminal.
    Type: Application
    Filed: July 19, 2021
    Publication date: April 14, 2022
    Applicant: FIRST ADVANTAGE CORPORATION
    Inventors: Arun N. KUMAR, Stefano MALNATI
  • Publication number: 20220115014
    Abstract: A vehicle agent device receives utterance information from an on-board unit, analyzes the content of the utterance, detects, as a non-installed function from a database, a function that an occupant intended to utilize but which was not installed and is installable, generates proposal information for furnishing the occupant with information relating to the non-installed function it detected, and sends the proposal information that has been generated to the on-board unit to thereby send the information relating to the non-installed function to a preregistered mobile device carried by the occupant.
    Type: Application
    Filed: October 6, 2021
    Publication date: April 14, 2022
    Applicant: TOYOTA JIDOSHA KABUSHIKI KAISHA
    Inventors: Chikage KUBO, Keiko Nakano, Eiichi Maeda, Hiroyuki Nishizawa
  • Publication number: 20220115015
    Abstract: Systems and methods presented herein generally include multi-wake phrase detection executed on a single device utilizing multiple voice assistants. Systems and methods presented herein can further include continuously running a Voice Activity Detection (VAD) process which detects presence of human speech. The multi-wake phrase detection can activate when the VAD process detects human speech. Once activated, the multi-wake phrase detection can determine which (if any) of the wake phrases of the multiple voice assistants might be in the detected speech. Operation of the multi-wake phrase detection can have a low miss-rate. In some examples, operation of the multi-wake phrase detection can be granular to accomplish the low miss-rates at low power with a tolerance for false positives on wake phrase detection.
    Type: Application
    Filed: October 12, 2021
    Publication date: April 14, 2022
    Inventors: Mouna Elkhatib, Adil Benyassine
  • Publication number: 20220115016
    Abstract: A system may receive audio data that represents a wakeword associated with a first speech-processing system and a command associated with a second speech-processing system. Different indications of handing the audio data off to the second speech-processing system may be determined based on a determined amount of interaction with the second speech-processing system. If the amount of interaction is low, a longer, more detailed indication is generated; if the amount of interaction is high, a brief, less detailed indication is generated. A local device may output audio corresponding to the indication before outputting audio generated by the second speech-processing system in response to the command.
    Type: Application
    Filed: October 14, 2021
    Publication date: April 14, 2022
    Inventor: Timothy Whalin
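The interaction-dependent handoff indication could look like the following sketch; the wording, interaction threshold, and assistant name are purely illustrative, not from the patent:

```python
def handoff_indication(interaction_count, assistant="SecondAssistant",
                       threshold=5):
    """Choose a verbose or brief handoff announcement based on how often
    the user has previously interacted with the second system."""
    if interaction_count < threshold:
        # Unfamiliar user: longer, more detailed indication.
        return (f"I'll hand this over to {assistant}, a separate assistant "
                f"that will answer from here on.")
    # Familiar user: brief, less detailed indication.
    return f"Asking {assistant}."
```

The local device would play this indication before the second system's own response audio.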
  • Publication number: 20220115017
    Abstract: Methods, apparatuses, and computing systems are provided for integrating logic services with a group communication service. In an implementation, a method may include receiving a spoken message from a communication node in a communication group, determining that the spoken message relates to a logic service, and transferring the spoken message to a voice assistant service with an indication that the spoken message relates to the logic service. The method also includes receiving status information from the logic service indicative of a status of a networked device associated with the logic service. The method further includes sending an audible announcement to the communication nodes in the communication group expressive of the status of the networked device.
    Type: Application
    Filed: December 20, 2021
    Publication date: April 14, 2022
    Inventors: Greg Albrecht, Ellen Juhlin, Jesse Robbins, Justin Black
  • Publication number: 20220115018
    Abstract: A messaging system, which hosts a backend service for an associated messaging client, includes a voice chat system that provides voice chat functionality that enables users to dictate their messages, while delivering the resulting message to the intended recipient as both the associated audio and text content. When a user at a sender client device begins dictating a voice message, the voice chat system starts converting the received audio stream into text and, also, starts communicating the audio content together with the generated text to the recipient client device. The recipient user can listen to the voice message and read the text generated from the audio in real time. It is also possible for the recipient user to consume the voice message in a textual form only, if the sound at the client device is undesirable.
    Type: Application
    Filed: October 14, 2020
    Publication date: April 14, 2022
    Inventors: Laurent Desserrey, Jeremy Baker Voss
  • Publication number: 20220115019
    Abstract: Methods and systems for enabling an efficient review of meeting content via a metadata-enriched, speaker-attributed and multiuser-editable transcript are disclosed. By incorporating speaker diarization and other metadata, the system can provide a structured and effective way to review and/or edit the transcript by one or more editors. One type of metadata can be image or video data to represent the meeting content. Furthermore, the present subject matter utilizes a multimodal diarization model to identify and label different speakers. The system can synchronize various sources of data, e.g., audio channel data, voice feature vectors, acoustic beamforming, image identification, and extrinsic data, to implement speaker diarization.
    Type: Application
    Filed: October 11, 2021
    Publication date: April 14, 2022
    Applicant: SoundHound, Inc.
    Inventors: Kiersten L. BRADLEY, Ethan COEYTAUX, Ziming YIN
  • Publication number: 20220115020
    Abstract: Methods and systems for enabling an efficient review of meeting content via a metadata-enriched, speaker-attributed transcript are disclosed. By incorporating speaker diarization and other metadata, the system can provide a structured and effective way to review and/or edit the transcript. One type of metadata can be image or video data to represent the meeting content. Furthermore, the present subject matter utilizes a multimodal diarization model to identify and label different speakers. The system can synchronize various sources of data, e.g., audio channel data, voice feature vectors, acoustic beamforming, image identification, and extrinsic data, to implement speaker diarization.
    Type: Application
    Filed: October 11, 2021
    Publication date: April 14, 2022
    Applicant: SoundHound, Inc.
    Inventors: Kiersten L. BRADLEY, Ethan COEYTAUX, Ziming YIN
  • Publication number: 20220115021
    Abstract: A talker prediction method obtains a voice from a plurality of talkers, records a conversation history of the plurality of talkers, identifies a talker of the obtained voice, and predicts a next talker among the plurality of talkers based on the identified talker and the conversation history.
    Type: Application
    Filed: October 5, 2021
    Publication date: April 14, 2022
    Inventors: Satoshi UKAI, Ryo Tanaka
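One simple way to realize the prediction step is a bigram model over the recorded conversation history: count who tends to follow whom, then predict the most frequent successor of the identified current talker. A hedged sketch (the class and method names are assumptions, not from the patent):

```python
from collections import Counter, defaultdict

class TalkerPredictor:
    """Predict the next talker from bigram counts over the history."""

    def __init__(self):
        self.bigrams = defaultdict(Counter)  # bigrams[prev][next] = count
        self.last = None

    def observe(self, talker):
        """Record an identified talker turn into the conversation history."""
        if self.last is not None:
            self.bigrams[self.last][talker] += 1
        self.last = talker

    def predict_next(self):
        """Return the most frequent successor of the current talker,
        or None if there is no usable history yet."""
        if self.last is None or not self.bigrams[self.last]:
            return None
        return self.bigrams[self.last].most_common(1)[0][0]
```

In an A/B alternating conversation, the model quickly learns that B tends to follow A and vice versa.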
  • Publication number: 20220115022
    Abstract: Implementations relate to automatic generation of speaker features for each of one or more particular text-dependent speaker verifications (TD-SVs) for a user. Implementations can generate speaker features for a particular TD-SV using instances of audio data that each capture a corresponding spoken utterance of the user during normal non-enrollment interactions with an automated assistant via one or more respective assistant devices. For example, a portion of an instance of audio data can be used in response to: (a) determining that recognized term(s) for the spoken utterance captured by that portion correspond to the particular TD-SV; and (b) determining that an authentication measure, for the user and for the spoken utterance, satisfies a threshold. Implementations additionally or alternatively relate to utilization of speaker features, for each of one or more particular TD-SVs for a user, in determining whether to authenticate a spoken utterance for the user.
    Type: Application
    Filed: October 13, 2020
    Publication date: April 14, 2022
    Inventors: Matthew Sharifi, Victor Carbune
  • Publication number: 20220115023
    Abstract: A voice recognition transmission system for preventing unauthorized activation of a vehicle is provided. The system includes a microphone, a voice recognition module, and an automotive gear shifter. The microphone receives a voice command. The voice recognition module analyzes the voice command against a plurality of authorized voice profiles. The voice recognition module determines if the voice command reaches a threshold of similarity in tone and tenor to one of the authorized voice profiles. A voice command that reaches the threshold of similarity will activate the automotive gear shifter to perform the voice command. A voice command that fails to reach the threshold of similarity will activate an interlock. The interlock will prevent an unauthorized individual from operating the vehicle and activate the vehicle alarm system. A notification will be sent to a connected electronic device when a voice command activates the automotive gear shifter or the interlock.
    Type: Application
    Filed: October 13, 2021
    Publication date: April 14, 2022
    Inventor: Milia Cora
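The threshold comparison described above can be sketched roughly as follows. The cosine-similarity measure, the 0.85 threshold, the toy feature vectors, and all function names are illustrative assumptions, not details from the patent:

```python
import math

SIMILARITY_THRESHOLD = 0.85  # assumed value; the patent does not give one


def cosine_similarity(a, b):
    # Similarity between two voice feature vectors, in [-1, 1].
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)


def process_voice_command(command_features, authorized_profiles):
    """Match a command's tone/tenor features against authorized profiles."""
    best = max(cosine_similarity(command_features, p) for p in authorized_profiles)
    if best >= SIMILARITY_THRESHOLD:
        return "gear_shifter_activated"  # command executes; device notified
    return "interlock_activated"         # vehicle locked; alarm and notification


profiles = [[0.9, 0.1, 0.4], [0.2, 0.8, 0.5]]
authorized = process_voice_command([0.88, 0.12, 0.41], profiles)
intruder = process_voice_command([0.0, 0.1, -0.9], profiles)
```

In either branch the system would also dispatch the notification to the connected electronic device described in the abstract.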
  • Publication number: 20220115024
    Abstract: Examples of the disclosure relate to apparatus, methods and computer programs for encoding spatial metadata. The example apparatus includes circuitry configured for obtaining spatial metadata associated with spatial audio content and obtaining a configuration parameter indicative of a source format of the spatial audio content. The circuitry is also configured to use the configuration parameter to select a method of compression of the spatial metadata associated with the spatial audio content.
    Type: Application
    Filed: October 28, 2019
    Publication date: April 14, 2022
    Inventors: Tapani PIHLAJAKUJA, Lasse LAAKSONEN, Antti ERONEN, Arto LEHTINIEMI
  • Publication number: 20220115025
    Abstract: Encoding/decoding an audio signal having one or more audio components, wherein each audio component is associated with a spatial location. A first audio signal presentation (z) of the audio components, a first set of transform parameters (w(f)), and signal level data (?2) are encoded and transmitted to the decoder. The decoder uses the first set of transform parameters (w(f)) to form a reconstructed simulation input signal intended for an acoustic environment simulation, and applies a signal level modification (?) to the reconstructed simulation input signal. The signal level modification is based on the signal level data (?2) and data (p2) related to the acoustic environment simulation. The attenuated reconstructed simulation input signal is then processed in an acoustic environment simulator. With this process, the decoder does not need to determine the signal level of the simulation input signal, thereby reducing processing load.
    Type: Application
    Filed: October 25, 2021
    Publication date: April 14, 2022
    Applicant: Dolby Laboratories Licensing Corporation
    Inventor: Dirk Jeroen BREEBAART
  • Publication number: 20220115026
    Abstract: An apparatus includes a receiver and a decoder. The receiver is configured to receive a bitstream that includes a first frame and a second frame. The first frame includes a first portion of a mid channel and a first quantized stereo parameter. The second frame includes a second portion of the mid channel and a second quantized stereo parameter. The decoder is configured to generate a first portion of a channel based on the first portion of the mid channel and the first quantized stereo parameter. The decoder is configured to, in response to the second frame being unavailable for decoding operations, estimate the second quantized stereo parameter based on stereo parameters of one or more preceding frames and generate a second portion of the channel based on the estimated second quantized stereo parameter. The second portion of the channel corresponds to a decoded version of the second frame.
    Type: Application
    Filed: December 20, 2021
    Publication date: April 14, 2022
    Inventors: Venkata Subrahmanyam Chandra Sekhar CHEBIYYAM, Venkatraman Atti
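As a minimal sketch of the concealment idea above (estimating a lost frame's stereo parameter from one or more preceding frames), the code below averages the most recent received parameters. The averaging rule, window size, and function names are assumptions for illustration; the patent only states that the estimate is based on preceding frames:

```python
def estimate_stereo_parameter(history, num_frames=3):
    """Estimate a lost frame's stereo parameter from preceding frames."""
    recent = history[-num_frames:]
    return sum(recent) / len(recent)


def decode_frame(frame, history):
    """Decode one frame, concealing the stereo parameter if the frame is lost."""
    if frame is None:  # frame unavailable, e.g. due to packet loss
        param = estimate_stereo_parameter(history)
    else:
        param = frame["stereo_param"]
    history.append(param)
    return param


history = []
params = [decode_frame(f, history)
          for f in [{"stereo_param": 0.4}, {"stereo_param": 0.5}, None]]
```

The estimated parameter then drives generation of the channel portion for the missing frame, exactly as a received parameter would.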
  • Publication number: 20220115027
    Abstract: Higher Order Ambisonics (HOA) represents three-dimensional sound independent of a specific loudspeaker set-up. However, transmission of an HOA representation results in a very high bit rate. Therefore, compression with a fixed number of channels is used, in which directional and ambient signal components are processed differently. For coding, portions of the original HOA representation are predicted from the directional signal components. This prediction provides side information which is required for a corresponding decoding. By using some additional specific-purpose bits, a known side-information coding processing is improved in that the required number of bits for coding that side information is reduced on average.
    Type: Application
    Filed: December 21, 2021
    Publication date: April 14, 2022
    Applicant: Dolby Laboratories Licensing Corporation
    Inventors: Sven Kordon, Alexander Krueger, Oliver Wuebbolt
  • Publication number: 20220115028
    Abstract: Information loss in speech-to-text conversion and the inability to preserve vocal emotion information without changing the artificial intelligence model infrastructure are essential drawbacks of conventional speech-to-speech translation systems. Embodiments of the invention provide a direct speech-to-speech translation system. The direct speech-to-speech translation system uses a one-tier approach, creating a unified model for the whole application. The single-model ecosystem takes audio (a mel spectrogram) as input and gives audio (a mel spectrogram) as output. This solves the bottleneck problem by not converting speech directly to text but having text as a byproduct of speech-to-speech translation, preserving phonetic information along the way. This model also uses pre-processing and post-processing scripts, but only for the whole model. This model needs parallel audio samples in two languages.
    Type: Application
    Filed: December 24, 2021
    Publication date: April 14, 2022
    Inventors: Sandeep Dhawan, Kapil Dhawan, Dennis Reutter, Chris Beckman, Ahsan Memon
  • Publication number: 20220115029
    Abstract: In one aspect, a method includes detecting a fingerprint match between query fingerprint data representing at least one audio segment within podcast content and reference fingerprint data representing known repetitive content within other podcast content, detecting a feature match between a set of audio features across multiple time-windows of the podcast content, and detecting a text match between at least one query text sentence from a transcript of the podcast content and reference text sentences, the reference text sentences comprising text sentences from the known repetitive content within the other podcast content. The method also includes, responsive to the detections, generating sets of labels identifying potential repetitive content within the podcast content. The method also includes selecting, from the sets of labels, a consolidated set of labels identifying segments of repetitive content within the podcast content, and, responsive to selecting the consolidated set of labels, performing an action.
    Type: Application
    Filed: December 10, 2020
    Publication date: April 14, 2022
    Inventors: Amanmeet Garg, Aneesh Vartakavi
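The consolidation step above, selecting one set of labels from the fingerprint, feature, and text detections, might be sketched as a simple vote across detectors. The voting rule, the `(start, end)` segment representation, and the `min_votes` parameter are assumptions for illustration, not the patented selection method:

```python
def consolidate_labels(label_sets, min_votes=2):
    """Keep a labeled segment if at least `min_votes` detectors flagged it."""
    votes = {}
    for labels in label_sets:
        for segment in labels:
            votes[segment] = votes.get(segment, 0) + 1
    return sorted(s for s, v in votes.items() if v >= min_votes)


# Per-detector label sets: (start_second, end_second) segments.
fingerprint = [(0, 30), (120, 150)]
feature = [(0, 30)]
text = [(0, 30), (300, 330)]
consolidated = consolidate_labels([fingerprint, feature, text])
```

A production system would also need to merge overlapping but non-identical segments before voting; that is omitted here for brevity.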
  • Publication number: 20220115030
    Abstract: The disclosed computer-implemented method may include obtaining an audio sample from a content source, inputting the obtained audio sample into a trained machine learning model, obtaining the output of the trained machine learning model, wherein the output is a profile of an environment in which the input audio sample was recorded, obtaining an acoustic impulse response corresponding to the profile of the environment in which the input audio sample was recorded, obtaining a second audio sample, processing the obtained acoustic impulse response with the second audio sample, and inserting a result of processing the obtained acoustic impulse response and the second audio sample into an audio track. Various other methods, systems, and computer-readable media are also disclosed.
    Type: Application
    Filed: December 17, 2021
    Publication date: April 14, 2022
    Inventors: Yadong Wang, Shilpa Jois Rao, Murthy Parthasarathi, Kyle Tacke
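"Processing the obtained acoustic impulse response with the second audio sample" is, in signal-processing terms, a convolution. A minimal direct-form sketch is below (the sample values are toy data; real pipelines would typically use FFT-based convolution for long impulse responses):

```python
def convolve(signal, impulse_response):
    """Apply an acoustic impulse response to an audio signal (direct form)."""
    n = len(signal) + len(impulse_response) - 1
    out = [0.0] * n
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


# A dry audio sample processed with a toy two-tap "room" response:
dry = [1.0, 0.0, 0.5]
ir = [1.0, 0.3]  # direct path plus one attenuated reflection
wet = convolve(dry, ir)
```

The resulting "wet" signal is what would then be inserted into the audio track, so the second sample sounds as if recorded in the profiled environment.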
  • Publication number: 20220115031
    Abstract: A credit segment identifying device includes an extracting unit which extracts, from a first speech signal, a plurality of first partial speech signals which are each a part of the first speech signal and shifted from each other in the time direction, and an identifying unit which identifies a credit segment in the first speech signal by determining whether each of the first partial speech signals includes a credit according to an association between each of second partial signals extracted from a second speech signal and the presence/absence of a credit, so that credit segments can be identified more efficiently.
    Type: Application
    Filed: January 24, 2020
    Publication date: April 14, 2022
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Yasunori OISHI, Takahito KAWANISHI, Kunio KASHINO
  • Publication number: 20220115032
    Abstract: A technique capable of detecting harmful behavior such as power harassment, sexual harassment, or bullying in a work environment and supporting its handling is provided. A harmful behavior detecting system includes a computer that executes observation and detection regarding harmful behavior including power harassment, sexual harassment, and bullying among people in a work environment. The computer obtains voice data capturing voice around a target person; obtains voice information containing words and emotion information from the voice data; and obtains data such as vital data, the date and time, or a location of the target person. The computer uses five elements, namely the words and emotion of the other person and the words, emotion, and vital data of the target person, to calculate an index value regarding the harmful behavior; estimate a state of the harmful behavior based on the index value; and output handling data for handling the harmful behavior in accordance with the estimated state.
    Type: Application
    Filed: November 12, 2019
    Publication date: April 14, 2022
    Inventors: Satoshi IWAGAKI, Atsushi SHIMADA, Masumi SUEHIRO, Hidenori CHIBA, Kouichi HORIUCHI
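The five-element index calculation above might be sketched as a weighted sum. Each input is assumed to be pre-scored in [0, 1] (e.g. by keyword matching, emotion recognition, and vital-sign anomaly detection); the weights, thresholds, state names, and function names here are all illustrative assumptions, not values from the patent:

```python
def harassment_index(other_words, other_emotion, target_words,
                     target_emotion, target_vitals,
                     weights=(0.25, 0.2, 0.2, 0.2, 0.15)):
    """Combine the five elements into a single harmful-behavior index."""
    elements = (other_words, other_emotion, target_words,
                target_emotion, target_vitals)
    return sum(w * e for w, e in zip(weights, elements))


def estimate_state(index, alert_threshold=0.6, watch_threshold=0.3):
    # Map the index value to a handling state.
    if index >= alert_threshold:
        return "alert"   # output handling data immediately
    if index >= watch_threshold:
        return "watch"   # continue observing the target person
    return "normal"


idx = harassment_index(0.9, 0.8, 0.2, 0.7, 0.6)
state = estimate_state(idx)
```

The handling data output in the "alert" state would correspond to the estimated state, per the abstract.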
  • Publication number: 20220115033
    Abstract: A toxicity moderation system has an input configured to receive speech from a speaker. The system includes a multi-stage toxicity machine learning system having a first stage and a second stage. The first stage is trained to analyze the received speech to determine whether a toxicity level of the speech meets a toxicity threshold. The first stage is also configured to filter-through, to the second stage, speech that meets the toxicity threshold, and is further configured to filter-out speech that does not meet the toxicity threshold.
    Type: Application
    Filed: October 8, 2021
    Publication date: April 14, 2022
    Inventors: William Carter Huffman, Michael Pappas, Henry Howie
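The first-stage gating described above can be sketched as follows. The keyword-fraction scorer is only a cheap stand-in for the trained first-stage model, and the threshold, word list, and function names are assumptions for illustration:

```python
def first_stage_score(utterance):
    """Cheap first-stage scorer (stand-in for a small trained model):
    fraction of words appearing on a toxic-keyword list."""
    toxic_words = {"idiot", "trash", "loser"}
    words = utterance.lower().split()
    return sum(w in toxic_words for w in words) / max(len(words), 1)


def second_stage(utterance):
    # Stand-in for the more expensive, more accurate second-stage model.
    return {"utterance": utterance, "verdict": "review"}


def moderate(utterances, toxicity_threshold=0.2):
    """Multi-stage pipeline: the first stage filters through only speech
    that meets the toxicity threshold; the rest is filtered out."""
    return [second_stage(u) for u in utterances
            if first_stage_score(u) >= toxicity_threshold]


flagged = moderate(["you are an idiot", "nice game everyone"])
```

The design rationale matches the abstract: a cheap first stage discards most speech so the costlier second stage only sees candidates that met the threshold.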
  • Publication number: 20220115034
    Abstract: An audio response system can generate multimodal messages that can be dynamically updated on a viewer's client device based on the type of audio response detected. The audio responses can include keywords or continuum-based signals (e.g., levels of wind noise). A machine learning scheme can be trained to output classification data from the audio response data for content selection and dynamic display updates.
    Type: Application
    Filed: December 22, 2021
    Publication date: April 14, 2022
    Inventors: Gurunandan Krishnan Gorumkonda, Shree K. Nayar
  • Publication number: 20220115035
    Abstract: The present disclosure generally relates to a read head assembly having a dual free layer (DFL) structure disposed between a first shield and a second shield at a media facing surface. The read head assembly further comprises a rear hard bias (RHB) structure disposed adjacent to the DFL structure recessed from the media facing surface, where an insulation layer separates the RHB structure from the DFL structure. The insulation layer is disposed perpendicularly between the first shield and the second shield. The DFL structure comprises a first free layer and a second free layer having equal stripe heights from the media facing surface to the insulation layer. The RHB structure comprises a seed layer, a bulk layer, and a capping layer. The capping layer and the insulation layer prevent the bulk layer from contacting the second shield.
    Type: Application
    Filed: February 24, 2021
    Publication date: April 14, 2022
    Inventors: Ming MAO, Chen-Jung CHIEN, Daniele MAURI, Goncalo Marcos BAIÃO DE ALBUQUERQUE