Patent Applications Published on February 1, 2024
-
Publication number: 20240038198
Abstract: A mura compensation method of a display panel and a display panel are provided. The display panel includes a display area. The display area includes a regular display sub-area and a function display sub-area. The mura compensation method of the display panel includes: performing an initial compensation for each pixel unit in the display area; acquiring an actual grayscale value and an actual brightness value of each pixel unit in the function display sub-area; and determining a secondary compensation value of each pixel unit in the function display sub-area and performing a secondary compensation.
Type: Application
Filed: December 20, 2021
Publication date: February 1, 2024
Applicant: Wuhan China Star Optoelectronics Semiconductor Display Technology Co., Ltd.
Inventor: Yedong WANG
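The two-stage scheme above lends itself to a short sketch. The following illustrates the secondary-compensation step only, assuming a gamma-law pixel response; the gamma model, `max_lum`, and all names are assumptions for illustration, not details from the patent.

```python
import numpy as np

def secondary_compensation(gray, measured_lum, gamma=2.2, max_lum=500.0):
    """Per-pixel secondary compensation for the function display sub-area.

    Compares each pixel's measured luminance against the luminance its
    grayscale value should produce under an assumed gamma curve, and
    returns the grayscale offset that cancels the difference.
    """
    gray = np.asarray(gray, dtype=np.float64)
    measured_lum = np.asarray(measured_lum, dtype=np.float64)

    # Luminance the grayscale value *should* produce (assumed gamma model).
    target_lum = max_lum * (gray / 255.0) ** gamma

    # Rescale grayscale so the measured response would hit the target.
    scale = np.where(measured_lum > 0, target_lum / measured_lum, 1.0)
    corrected_gray = gray * scale ** (1.0 / gamma)

    # Secondary compensation value = required change in grayscale.
    return np.clip(corrected_gray, 0, 255) - gray
```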
-
Publication number: 20240038199
Abstract: An image processing device according to the present disclosure includes: an input section that receives an input image signal including an input synchronization signal and input image data; a first processor that performs first processing on the basis of the input image data at a timing corresponding to the input synchronization signal; a synchronization signal generator that generates an output synchronization signal; a second processor that performs second processing on the basis of a processing result of the first processing at a timing corresponding to the input synchronization signal or a timing corresponding to the output synchronization signal; a controller that controls at which timing of the timing corresponding to the input synchronization signal and the timing corresponding to the output synchronization signal the second processor is to perform the second processing; a third processor that performs third processing on the basis of a processing result of the second processing at a timing corresponding…
Type: Application
Filed: December 1, 2021
Publication date: February 1, 2024
Inventors: Akira Shimizu, Toshichika Mikami
-
Publication number: 20240038200
Abstract: An electronic device may be provided with a display. The display may be formed from liquid crystal display pixels, organic light-emitting diode pixels, or other pixels. The display may have an active area that is bordered along at least one edge by an inactive area. The active area contains pixels and displays images. The inactive area does not contain any pixels and does not display images. The inactive area may have a layer of black ink or other masking material to block internal components from view. The active area may have an opening that contains an isolated portion of the inactive area or may contain a recess into which a portion of the inactive area protrudes. An electrical component such as a speaker, camera, light-emitting diode, light sensor, or other electrical device may be mounted in the inactive area in the recess or opening of the active area.
Type: Application
Filed: October 16, 2023
Publication date: February 1, 2024
Inventors: Mikael M. Silvanto, Dinesh C. Mathew, Victor H. Yin
-
Publication number: 20240038201
Abstract: An image display method according to the present disclosure includes: a step of measuring a distance in a depth direction of a display target when viewed from the user (S12); and a step of displaying the display target on either the first display device or the second display device based on the measured distance (S14 and S15).
Type: Application
Filed: December 11, 2020
Publication date: February 1, 2024
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventor: Makoto MUTO
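The selection step reduces to a threshold test. A sketch under assumptions (the 1.0 m threshold is invented for illustration; the abstract only says the choice is based on the measured distance):

```python
def choose_display(distance_m, first_display, second_display, threshold_m=1.0):
    """Steps S14/S15: route the display target by the depth measured in S12."""
    return first_display if distance_m < threshold_m else second_display
```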
-
Publication number: 20240038202
Abstract: An image display method and an image display device are provided, which are applied to a display panel including a plurality of pixel units having A pixel unit rows and B pixel unit columns, including: acquiring data image information including C pixel data rows and D pixel data columns of a plurality of pixel data, wherein C<A and/or D<B; inputting the plurality of pixel data into a plurality of data lines corresponding to the B pixel unit columns; and enabling the A pixel unit rows sequentially, to make at least part of the pixel units display according to a same pixel data.
Type: Application
Filed: December 17, 2021
Publication date: February 1, 2024
Applicant: TCL CHINA STAR OPTOELECTRONICS TECHNOLOGY CO., LTD.
Inventors: Xiaoli FANG, Yu WU
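One plausible reading of "at least part of the pixel units display according to a same pixel data" is nearest-neighbour duplication of the smaller data image across the larger panel. A hedged sketch of that mapping:

```python
import numpy as np

def expand_to_panel(data_img, panel_rows, panel_cols):
    """Drive an A-row x B-column panel with a smaller C x D data image by
    letting groups of adjacent pixel units share the same pixel data."""
    c, d = data_img.shape[:2]
    row_idx = (np.arange(panel_rows) * c) // panel_rows  # panel row -> data row
    col_idx = (np.arange(panel_cols) * d) // panel_cols  # panel col -> data col
    return data_img[row_idx][:, col_idx]

# e.g. a 2x2 data image on a 4x4 panel: each datum drives a 2x2 pixel block.
```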
-
Publication number: 20240038203
Abstract: Guitar strings composed of an Iron and Nickel smelted alloy for use with wide aperture single coil pickups, and guitars therefor.
Type: Application
Filed: July 14, 2023
Publication date: February 1, 2024
Inventors: Dean Farley, Robert L. Butterworth
-
Publication number: 20240038204
Abstract: A sound bar has a striking surface. The sound bar includes: a surface layer having a first surface constituting at least a part of the striking surface and a second surface opposite across a thickness of the surface layer from the first surface; and a base fixed to the second surface of the surface layer. A cutout surface is provided on a peripheral edge portion of the striking surface. The first surface of the surface layer is smaller than the base in a plan view.
Type: Application
Filed: October 16, 2023
Publication date: February 1, 2024
Inventors: Eri HIRAI, Kazuki SOGA, Hisaaki MUKAI, Junnosuke KASEDA, Ichiro OSUGA, Yuichi TADANO, Ayumi IRISA
-
Publication number: 20240038205
Abstract: Disclosed herein are systems, apparatus, and/or methods for automatically generating in real-time an adaptive digital music stream that satisfies a particular scenario (either static or changing), as described by the dynamic input of an emotional vector (possibly with emotional direction/target), one or more musical styles and one or more musical themes, using an automated music composition, performance, and audio production system which utilizes machine learning and artificial intelligence techniques.
Type: Application
Filed: July 26, 2023
Publication date: February 1, 2024
Inventors: Ryan Alexander GROVES, Andrew John ELMSLEY, Valerio VELARDO
-
Publication number: 20240038206
Abstract: An electronic musical instrument, method for a musical sound generation process and a non-transitory computer readable medium that stores an electronic musical instrument program are provided. The program causes a computer provided with a storage part to execute a musical sound generation process using sound data. The program causes the computer to execute: acquiring, from the storage part, first sound data and first user identification information indicating a user who has acquired the first sound data from a distribution server; acquiring second user identification information indicating a user who causes the musical sound generation process to be executed using the first sound data; determining whether or not the first user identification information matches the second user identification information; and inhibiting execution of the musical sound generation process using the first sound data in a case when the first user identification information does not match the second user identification information.
Type: Application
Filed: October 13, 2023
Publication date: February 1, 2024
Applicant: Roland Corporation
Inventor: Yusuke MIYAMA
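The inhibit logic is a straightforward ownership check. A sketch assuming the storage part keeps the downloader's user ID alongside each piece of sound data (the `store` layout is an assumption):

```python
def may_generate_sound(store, sound_id, requesting_user_id):
    """Return the sound data only if the requesting user is the one who
    acquired it from the distribution server; otherwise inhibit generation."""
    sound_data, downloader_id = store[sound_id]
    if downloader_id != requesting_user_id:
        return None          # user IDs do not match: inhibit the process
    return sound_data        # IDs match: proceed with sound generation
```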
-
Publication number: 20240038207
Abstract: A live distribution device includes an obtaining circuit, a data processing circuit, and a distribution circuit. The obtaining circuit is configured to obtain a piece of music and/or a user reaction to the piece of music. The user reaction is obtained from a viewing user, among a plurality of users, who is viewing a performance. The data processing circuit is configured to generate processed data based on the piece of music and/or the user reaction obtained by the obtaining circuit. The processed data indicates how the performance is viewed by the viewing user. The distribution circuit is configured to distribute the generated processed data to a terminal device of a non-viewing user, among the plurality of users, who is not viewing the performance.
Type: Application
Filed: October 16, 2023
Publication date: February 1, 2024
Inventors: Taiki SHIMOZONO, Keijiro SAINO
-
Publication number: 20240038208
Abstract: An ultrasonic sensor includes a magnetic substance, a diaphragm, and an electromagnetic transducer. The electromagnetic transducer is disposed opposing the magnetic substance across an outer plate of a vehicle. The diaphragm has the shape of a thin film with a film thickness direction along an axial direction parallel to a directional axis. The diaphragm is able to ultrasonically oscillate by being joined to an outer surface of the outer plate at an outer edge in a radial direction crossing the directional axis. With the outer edge of the diaphragm joined to the outer surface of the outer plate, an internal space that expands and contracts along the axial direction in response to ultrasonic oscillation of the diaphragm is formed between the diaphragm and the outer surface of the outer plate.
Type: Application
Filed: October 11, 2023
Publication date: February 1, 2024
Inventors: Masayoshi SATAKE, Tetsuya AOYAMA, Mitsumasa MIYAZAKI
-
Publication number: 20240038209
Abstract: A damping panel includes a first metal and a second metal. The first metal has a first surface and a second surface opposite the first surface. The second metal has a third surface and a fourth surface opposite the third surface. A fixed region is where the second surface of the first metal is fixed to the third surface of the second metal. A non-fixed region is where the second surface is not fixed to the third surface. A cavity is disposed between the second surface and the third surface at the non-fixed region. A damping material is disposed on at least one of: at least a portion of the cavity on the second surface, at least a portion of the cavity on the third surface, at least a portion of the first surface, and at least a portion of the fourth surface.
Type: Application
Filed: July 26, 2022
Publication date: February 1, 2024
Inventors: Song HE, Arianna T. MORALES, Anil K. SACHDEV, Pavan Kumar PATRUNI, Hung-yih Isaac DU
-
Publication number: 20240038210
Abstract: A flexural wave absorber includes an L-shaped cantilever beam lossy acoustic black hole disposed on a surface of a mechanical structure and an L-shaped cantilever beam lossless acoustic black hole disposed on the surface and spaced apart from the L-shaped cantilever beam lossy acoustic black hole a predefined distance. The L-shaped cantilever beam lossy acoustic black hole and the L-shaped cantilever beam lossless acoustic black hole, in combination, are configured to asymmetrically absorb a plurality of different frequencies of flexural waves within at least a 2000 Hz frequency band acting on the mechanical structure. The L-shaped cantilever beam lossy acoustic black hole and the L-shaped cantilever beam lossless acoustic black hole can both include a projecting beam with an outer surface or an inner surface having a power law profile.
Type: Application
Filed: July 28, 2022
Publication date: February 1, 2024
Applicants: Toyota Motor Engineering & Manufacturing North America, Inc., Toyota Jidosha Kabushiki Kaisha
Inventors: Xiaopeng Li, Ziqi Yu, Taehwa Lee
-
Publication number: 20240038211
Abstract: An adaptive noise-canceling system generates an anti-noise signal with a filter that has a response controlled by a set of coefficients selected from a collection of coefficient sets. The adaptive noise-canceling system includes an acoustic output transducer for reproducing a signal containing the anti-noise signal, a first microphone for measuring ambient noise at a first location to produce a first noise measurement signal, a second microphone for measuring the ambient noise at a second location to generate a second noise measurement signal, and an analysis subsystem for analyzing the first noise measurement signal and the second noise measurement signal. The adaptive noise-canceling system also includes a controller that selects the set of coefficients from the collection of coefficient sets according to a phase difference between the first noise measurement signal and the second noise measurement signal as determined by the analysis subsystem.
Type: Application
Filed: July 27, 2022
Publication date: February 1, 2024
Inventors: John Bryan-Merrett, Mert Salahi, Wilbur Lawrence, Samuel P. Ebenezer, Rachid Kerkoud
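The coefficient-selection step can be sketched as: estimate the inter-microphone phase difference from the cross-spectrum, then pick the stored set whose nominal phase is closest. The band limits, averaging scheme, and dictionary layout below are assumptions, not details from the patent:

```python
import numpy as np

def select_coefficient_set(mic1, mic2, coeff_sets, fs=48000, f_lo=100.0, f_hi=1000.0):
    """Choose anti-noise filter coefficients by inter-microphone phase.

    coeff_sets: dict mapping a nominal phase difference (radians) to a
    coefficient array for the anti-noise filter.
    """
    spec1, spec2 = np.fft.rfft(mic1), np.fft.rfft(mic2)
    freqs = np.fft.rfftfreq(len(mic1), d=1.0 / fs)
    band = (freqs >= f_lo) & (freqs <= f_hi)

    # Average cross-spectrum phase over the band of interest.
    phase = np.angle(spec1[band] * np.conj(spec2[band])).mean()

    # Nearest stored coefficient set wins.
    nominal = min(coeff_sets, key=lambda p: abs(p - phase))
    return coeff_sets[nominal]
```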
-
Publication number: 20240038212
Abstract: Disclosed are apparatuses, systems, and techniques that may use machine learning for implementing generative text-to-speech models. The techniques include identifying a mapping of speech characteristics (SC) on a target distribution of a latent variable using a non-linear transformation for at least a subset of the SC. Parameters of the non-linear transformation are determined using a neural network that approximates statistics of the SC with statistics predicted for the SC based on the identified mapping and the target distribution of the latent variable.
Type: Application
Filed: January 20, 2023
Publication date: February 1, 2024
Inventors: Kevin Shih, José Rafael Valle Gomes da Costa, Rohan Badlani, João Felipe Santos, Bryan Catanzaro
-
Publication number: 20240038213
Abstract: A generation device (100) extracts a plurality of integrated speech samples by repeatedly executing processing of integrating a plurality of consecutive speech samples included in speech waveform information into one speech sample, and generates a compressed speech sample by compressing the plurality of integrated speech samples extracted. The generation device (100) generates a plurality of new integrated speech samples subsequent to the plurality of integrated speech samples by inputting the compressed speech sample and an acoustic feature value calculated from the speech waveform information to a speech waveform generation model, and repeatedly executes processing of inputting a compressed speech sample obtained by compressing the plurality of new integrated speech samples and the acoustic feature value to the speech waveform generation model, to generate a plurality of new integrated speech samples a plurality of times.
Type: Application
Filed: November 25, 2020
Publication date: February 1, 2024
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventor: Hiroki KANAGAWA
-
Publication number: 20240038214
Abstract: A method for representing an intended prosody in synthesized speech includes receiving a text utterance having at least one word, and selecting an utterance embedding for the text utterance. Each word in the text utterance has at least one syllable and each syllable has at least one phoneme. The utterance embedding represents an intended prosody. For each syllable, using the selected utterance embedding, the method also includes: predicting a duration of the syllable by decoding a prosodic syllable embedding for the syllable based on attention by an attention mechanism to linguistic features of each phoneme of the syllable and generating a plurality of fixed-length predicted frames based on the predicted duration for the syllable.
Type: Application
Filed: October 16, 2023
Publication date: February 1, 2024
Applicant: Google LLC
Inventors: Robert Clark, Chun-an Chan, Vincent Wan
-
Publication number: 20240038215
Abstract: The present disclosure provides methods, devices, and computer-readable mediums for audio signal processing. In some embodiments, a method executed by an electronic device includes obtaining guidance features corresponding to an audio signal to be processed, the guidance features indicating distinguishable features of at least one signal type of at least one signal category. The method further includes extracting, according to the guidance features, target audio features corresponding to the audio signal. The method further includes determining, according to the target audio features, a target signal type of the audio signal from among the at least one signal type of the at least one signal category. The method further includes performing corresponding processing according to the target signal type of the audio signal.
Type: Application
Filed: July 18, 2023
Publication date: February 1, 2024
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Qiuyue MA, Yuxing ZHENG, Hosang SUNG, Lizhong WANG, Xiaoyan LOU
-
Publication number: 20240038216
Abstract: An example system includes a processor to receive encoded audio from an encoder of a pre-trained speech-to-text (STT) model. The processor is to further train a language identification (LID) classifier to detect a language of the encoded audio using training samples labeled by language.
Type: Application
Filed: July 27, 2022
Publication date: February 1, 2024
Inventor: Zvi KONS
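The described setup is a small classification head on frozen encoder output. A PyTorch sketch under assumptions (mean pooling and illustrative dimensions; the patent does not specify the encoder or head shapes):

```python
import torch
import torch.nn as nn

class LidClassifier(nn.Module):
    """Language-ID head over frozen STT encoder frames."""
    def __init__(self, enc_dim=512, n_langs=10):
        super().__init__()
        self.head = nn.Linear(enc_dim, n_langs)

    def forward(self, encoded):          # encoded: (batch, frames, enc_dim)
        pooled = encoded.mean(dim=1)     # utterance-level embedding
        return self.head(pooled)         # language logits

model = LidClassifier()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# One training step on encoder outputs labelled by language.
encoded = torch.randn(8, 100, 512)   # stand-in for frozen STT encoder output
labels = torch.randint(0, 10, (8,))  # language labels
loss = loss_fn(model(encoded), labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```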
-
Publication number: 20240038217
Abstract: In an embodiment a system includes a training data preparation device configured to obtain a speech recognition rate of speech data for training using a target speech recognition model, a recognition rate prediction model configured to estimate an expected recognition rate of the target speech recognition model for clean speech data in which noise is removed from the speech data for training and a speech preprocessing model configured to preprocess the speech data for training to obtain the clean speech data and to update the speech preprocessing model based on a recognition rate loss corresponding to a difference between the expected recognition rate and a maximum recognition rate.
Type: Application
Filed: April 4, 2023
Publication date: February 1, 2024
Inventor: Yong Hyeok Lee
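The update rule can be sketched as a single gradient step on the recognition-rate loss. Everything below is a stand-in (toy modules, shapes, learning rate); only the loss shape, the difference between the expected and maximum recognition rates, follows the abstract:

```python
import torch
import torch.nn as nn

preproc = nn.Conv1d(1, 1, kernel_size=9, padding=4)   # speech preprocessing model (toy)
rate_predictor = nn.Sequential(                        # recognition rate prediction model (toy)
    nn.Flatten(), nn.Linear(16000, 1), nn.Sigmoid())

# Only the preprocessing model is updated; the predictor stays fixed.
optimizer = torch.optim.Adam(preproc.parameters(), lr=1e-4)

noisy = torch.randn(4, 1, 16000)   # speech data for training
max_rate = torch.ones(4, 1)        # maximum (target) recognition rate

clean_est = preproc(noisy)                      # estimated clean speech data
expected_rate = rate_predictor(clean_est)       # expected recognition rate
rate_loss = (max_rate - expected_rate).mean()   # recognition rate loss

optimizer.zero_grad()
rate_loss.backward()
optimizer.step()
```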
-
Publication number: 20240038218
Abstract: An apparatus for a speech model with personalization via ambient context harvesting is described herein. The apparatus includes a microphone, context harvesting module, confidence module, and training module. The context harvesting module is to determine a context associated with the captured audio signals. A confidence module is to determine a confidence of the context as applied to the audio signals. A training module is to train a neural network in response to the confidence being above a predetermined threshold.
Type: Application
Filed: August 10, 2023
Publication date: February 1, 2024
Inventors: Gabriel Amores, Guillermo Perez, Moshe Wasserblat, Michael Deisher, Loic Dufresne de Virel
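The confidence gate is simple to express. A sketch in which every name and the 0.8 threshold are illustrative stand-ins for the patent's modules:

```python
def maybe_train(model, audio, harvest_context, score_confidence, train_step,
                threshold=0.8):
    """Personalize the speech model only when the harvested ambient context
    is confidently applicable to the captured audio."""
    context = harvest_context(audio)               # context harvesting module
    confidence = score_confidence(context, audio)  # confidence module
    if confidence > threshold:
        train_step(model, audio, context)          # training module
        return True
    return False
```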
-
Publication number: 20240038219
Abstract: Methods, systems, apparatus, including computer programs encoded on a computer storage medium, for a user device to learn offline voice actions. In one aspect, the method includes actions of detecting, by the user device, an utterance at a first time when the user device is connected to a server by a network, providing, by the user device, the utterance to the server using the network, receiving, by the user device and from the server, an update to the grammar of the user device, detecting, by the user device, a subsequent utterance of the utterance at a second time when the user device is not connected to the server by a network, and in response to detecting, by the user device, the subsequent utterance of the utterance at the second time, identifying, by the user device, an operation to perform based on (i) the subsequent utterance, and (ii) the updated grammar.
Type: Application
Filed: October 16, 2023
Publication date: February 1, 2024
Inventors: Vikram Aggarwal, Moises Morgenstern Gali
-
Publication number: 20240038220
Abstract: A computer-implemented technique is described herein for expediting a user's interaction with a digital assistant. In one implementation, the technique involves receiving a system prompt generated by a digital assistant in response to an input command provided by a user via an input device. The technique then generates a predicted response based on linguistic content of the system prompt, together with contextual features pertaining to a circumstance in which the system prompt was issued. The predicted response corresponds to a prediction of how the user will respond to the system prompt. The technique then selects one or more dialogue actions from a plurality of dialogue actions, based on a confidence value associated with the predicted response. The technique expedites the user's interaction with the digital assistant by reducing the number of system prompts that the user is asked to respond to.
Type: Application
Filed: October 9, 2023
Publication date: February 1, 2024
Inventors: Vipul AGARWAL, Rahul Kumar JHA, Soumya BATRA, Karthik TANGIRALA, Mohammad MAKARECHIAN, Imed ZITOUNI
-
Publication number: 20240038221
Abstract: Systems, computer-implemented methods, and computer program products to facilitate multi-task training a recurrent neural network transducer (RNN-T) using automatic speech recognition (ASR) information are provided. According to an embodiment, a system can comprise a memory that stores computer executable components and a processor that executes the computer executable components stored in the memory. The computer executable components can include an RNN-T that can receive ASR information. The computer executable components can include a voice activity detection (VAD) model that trains the RNN-T using the ASR information, where the RNN-T can further comprise an encoder and a joint network. One or more outputs of the encoder can be integrated with the joint network and one or more outputs of the VAD model.
Type: Application
Filed: July 28, 2022
Publication date: February 1, 2024
Inventors: Sashi Novitasari, Takashi Fukuda, Gakuto Kurata
-
Publication number: 20240038222
Abstract: A system is provided for consent detection and validation. Uttered speech signals and sensor data of at least one user are received from a first electronic device during a personal interaction, scheduled based on a consent response corresponding to an acceptance of a consent request, between two users. A confidence score is determined based on intent of the two users, current sensor data and a new set of user characteristics for the two users during the personal interaction. An immediate consent or dissent of one of the two users is detected at a defined timestamp during the personal interaction based on comparison of the confidence score with a threshold value, explicit or implied keywords from uttered speech signals, and extent of deviated values of sensor data. Based on a plurality of criteria, immediate consent or dissent of one of the two users is validated and a second set of tasks is performed.
Type: Application
Filed: July 26, 2022
Publication date: February 1, 2024
Inventor: Wassim Samir Chaar
-
Publication number: 20240038223
Abstract: This application relates to a speech recognition method and apparatus. The speech recognition method includes: A terminal device inputs a to-be-recognized phoneme into a first multitask neural network model; the first multitask neural network model outputs a first prediction result, where the first prediction result includes a character prediction result and a punctuation prediction result that correspond to the to-be-recognized phoneme; and the terminal device displays at least a part of the first prediction result on a display of the terminal device. A neural network model for simultaneously predicting a character and a punctuation corresponding to a phoneme is constructed, so that the character and the punctuation corresponding to the phoneme can be simultaneously output. In addition, the neural network model is small-sized, and can be deployed on a terminal side.
Type: Application
Filed: December 29, 2021
Publication date: February 1, 2024
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Xuxian Yin
-
Publication number: 20240038224
Abstract: Accuracy of the transcript is of foremost importance in Automatic Speech Recognition (ASR). State-of-the-art systems mostly rely on spelling correction based contextual improvement in ASR, which is generally a static vocabulary based biasing approach. Embodiments of the present disclosure provide a method and system for visual context aware ASR. The method provides biasing using a shallow fusion biasing approach with a modified beam search decoding technique, which introduces a non-greedy pruning strategy to allow biasing at the sub-word level. The biasing algorithm brings in the visual context of the robot to the speech recognizer based on a dynamic biasing vocabulary, improving the transcription accuracy. The dynamic biasing vocabulary, comprising objects in a current environment accompanied by their self and relational attributes, is generated using a bias prediction network that explicitly adds labels to objects, which are detected and captioned via a state-of-the-art dense image captioning network.
Type: Application
Filed: June 13, 2023
Publication date: February 1, 2024
Applicant: Tata Consultancy Services Limited
Inventors: CHAYAN SARKAR, PRADIP PRAMANICK, RUCHIRA SINGH
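A hedged sketch of one decoding step with shallow-fusion biasing at the sub-word level; the dict-of-dicts trie, the bias weight `lam`, and the margin rule standing in for "non-greedy pruning" are assumptions consistent with, but not taken verbatim from, the abstract:

```python
def beam_step(hyps, step_logps, bias_trie, beam=8, lam=1.5, margin=5.0):
    """One beam-search step with dynamic-vocabulary biasing.

    hyps: list of (tokens, logp, trie_node or None); step_logps: token -> logp.
    """
    cands = []
    for tokens, logp, node in hyps:
        for tok, tok_logp in step_logps.items():
            # Sub-word level biasing: advance through the biasing trie.
            nxt = node.get(tok) if node is not None else bias_trie.get(tok)
            bonus = lam if nxt is not None else 0.0
            cands.append((tokens + [tok], logp + tok_logp + bonus, nxt))
    cands.sort(key=lambda c: c[1], reverse=True)
    best = cands[0][1]
    # Non-greedy pruning: keep everything within `margin` of the best score
    # (capped at `beam`), so partially matched bias words are not discarded
    # before their bonus has accrued.
    return [c for c in cands if c[1] >= best - margin][:beam]
```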
-
Publication number: 20240038225
Abstract: There is provided a method that includes obtaining data that describes (a) a situation, (b) a gesture for a response to the situation, (c) a prompt to accompany the response, and (d) a gestural annotation for the response, and utilizing a conversational machine learning technique to train a natural language understanding (NLU) model to address the situation, based on the data.
Type: Application
Filed: July 26, 2022
Publication date: February 1, 2024
Applicant: NUANCE COMMUNICATIONS, INC.
Inventors: Abhishek ROHATGI, Eduardo OLVERA, Dinesh SAMTANI, Flaviu Gelu NEGREAN, Manar ALAZMA
-
Publication number: 20240038226
Abstract: Systems and methods relate to executing a task using a machine learning model based on prompt generation and collaborative interactions with a user. The machine learning model generates a set of questions based on a task request. The user interactively answers the questions. A task processor generates a set of question-answer pairs based on the questions generated by the machine learning model and the answers given by the user. The machine learning model generates a task specific output based on the set of question-answer pairs. The machine learning model represents a large language model with deep learning. The simple question-and-answer prompts enable non-expert users to instruct the machine learning model with information that is sufficient to execute the task without overwhelming the users with the operations. The machine learning model leverages the answers to execute the task with accuracy, thereby demonstrating the efficacy of the prompting technique.
Type: Application
Filed: October 21, 2022
Publication date: February 1, 2024
Applicant: Microsoft Technology Licensing, LLC
Inventors: Elnaz NOURI, Swaroop Ranjan MISHRA
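The interaction loop can be sketched in a few lines; `llm` below is a hypothetical text-in/text-out completion function (not a specific API), and the prompt strings are illustrative:

```python
def run_task_with_qa(llm, task_request, ask_user):
    """Collaborative prompting: the model asks, the user answers, and the
    question-answer pairs are folded back into the final task prompt."""
    questions = llm(
        f"Task: {task_request}\n"
        "List the questions you need answered to do this task, one per line."
    ).splitlines()

    qa_pairs = [(q, ask_user(q)) for q in questions if q.strip()]

    context = "\n".join(f"Q: {q}\nA: {a}" for q, a in qa_pairs)
    return llm(f"Task: {task_request}\n{context}\nNow complete the task.")

# For a console session, ask_user can simply be the built-in input().
```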
-
Publication number: 20240038227
Abstract: A system includes a multi-user interface and a collaborative drawing tool. The collaborative drawing tool can receive a first set of voice-based drawing commands from a first user, perform a first speech-to-text conversion to transform the first set of voice-based drawing commands into a first set of drawing commands, and render a first drawing update on the multi-user interface based on the first set of drawing commands. The collaborative drawing tool can receive a second set of voice-based drawing commands from a second user, perform a second speech-to-text conversion to transform the second set of voice-based drawing commands into a second set of drawing commands, render a second drawing update on the multi-user interface based on the second set of drawing commands, and save, to a storage system, a drawing output of the multi-user interface resulting from the first drawing update and the second drawing update.
Type: Application
Filed: July 29, 2022
Publication date: February 1, 2024
Inventors: Obaid Shaikh, Chandra S. Bathula, Rebecca T. Scanlon, Annapurna Jagasia, Bharghavi Tanamala, Robert O'Connor, Matthew W. Kropp, Jhin McGlynn
-
Publication number: 20240038228
Abstract: In some implementations, a method includes displaying, on a display, an environment that includes a representation of a virtual agent that is associated with a sensory characteristic. In some implementations, the method includes selecting, based on the sensory characteristic associated with the virtual agent, a subset of a plurality of sensors to provide sensor data for the virtual agent. In some implementations, the method includes providing the sensor data captured by the subset of the plurality of sensors to the virtual agent in order to reduce power consumption of the device. In some implementations, the method includes displaying a manipulation of the representation of the virtual agent based on an interpretation of the sensor data by the virtual agent.
Type: Application
Filed: July 26, 2023
Publication date: February 1, 2024
Inventors: Dan Feng, Behrooz Mahasseni, Bo Morgan, Daniel L. Kovacs, Mu Qiao
-
Publication number: 20240038229
Abstract: Systems and methods are disclosed for enabling verbal interaction with an NLUI application without relying on express wake terms. The NLUI application receives an audio input comprising a plurality of terms. In response to determining that none of the terms is an express wake term pre-programmed into the NLUI application, the NLUI application determines a topic for the plurality of terms. The NLUI application then determines whether the topic is within a plurality of topics for which a response should be generated. If the determined topic of the audio input is within the plurality of topics, the NLUI application generates a response to the audio input.
Type: Application
Filed: August 1, 2023
Publication date: February 1, 2024
Inventors: Vikram Makam Gupta, Sukanya Agarwal, Gyanveer Singh
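The gate reduces to a topic-membership test. All names below are assumptions; `topic_of` stands in for whatever topic classifier the NLUI application uses:

```python
def handle_audio(terms, wake_words, topic_of, allowed_topics, respond):
    """Respond without an express wake term when the topic is in scope."""
    if any(t.lower() in wake_words for t in terms):
        return respond(terms)        # ordinary wake-term path
    topic = topic_of(terms)          # no wake term: classify the topic
    if topic in allowed_topics:      # in-scope topic: generate a response
        return respond(terms)
    return None                      # otherwise stay silent
```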
-
Publication number: 20240038230
Abstract: A display apparatus includes an input unit configured to receive a user command; an output unit configured to output a registration suitability determination result for the user command; and a processor configured to generate phonetic symbols for the user command, analyze the generated phonetic symbols to determine registration suitability for the user command, and control the output unit to output the registration suitability determination result for the user command. Therefore, the display apparatus may register a user command which is resistant to misrecognition and guarantees high recognition rate among user commands defined by a user.
Type: Application
Filed: October 6, 2023
Publication date: February 1, 2024
Applicant: Samsung Electronics Co., Ltd.
Inventors: Nam-yeong KWON, Kyung-mi PARK
-
Publication number: 20240038231
Abstract: Systems and methods for determining whether to combine responses from multiple automated assistants. An automated assistant may be invoked by a user utterance, followed by a query, which is provided to a plurality of automated assistants. A first response is received from a first automated assistant and a second response is received from a second automated assistant. Based on similarity between the responses, a primary automated assistant determines whether to combine the responses into a combined response. Once the combined response has been generated, one or more actions are performed in response to the combined response.
Type: Application
Filed: October 9, 2023
Publication date: February 1, 2024
Inventors: Matthew Sharifi, Victor Carbune
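A sketch of the combine-or-not decision; difflib string similarity stands in for whatever semantic similarity the primary assistant actually computes, and the threshold is an assumed value:

```python
from difflib import SequenceMatcher

def maybe_combine(resp_a, resp_b, threshold=0.6):
    """Merge two assistant responses when they are similar enough."""
    similarity = SequenceMatcher(None, resp_a, resp_b).ratio()
    if similarity >= threshold:
        # Near-duplicates: keep the longer, more detailed answer.
        return max(resp_a, resp_b, key=len)
    # Otherwise present both as one combined response.
    return f"{resp_a} Also: {resp_b}"
```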
-
Publication number: 20240038232
Abstract: The present disclosure is generally related to a data processing system to process data packets in a voice activated computer network environment. The data processing system can improve the efficiency of the network by generating non-video data responses to voice commands received from a client device if a display associated with a client device is in an OFF state. A digital assistant application executed on the client device can send to the data processing system client device configuration data, which includes the state of the display device, among status data of other components of the client device. The data processing system can receive a current volume of speakers associated with the client device, and set a volume level for the client device based on the current volume level and a minimum response volume level at the client device.
Type: Application
Filed: October 11, 2023
Publication date: February 1, 2024
Inventor: Jian Wei Leong
-
Publication number: 20240038233
Abstract: Custom acoustic models can be configured by developers by providing audio files with custom recordings. The custom acoustic model is trained by tuning a baseline model using the audio files. Audio files may contain custom noise to apply to clean speech for training. The custom acoustic model is provided as an alternative to a standard acoustic model. A speech recognition system can select an acoustic model for use upon receiving metadata about the device conditions or type. Speech recognition is performed on speech audio using one or more acoustic models. The result can be provided to developers through the user interface, and an error rate can be computed and also provided.
Type: Application
Filed: October 12, 2023
Publication date: February 1, 2024
Applicant: SoundHound AI IP, LLC
Inventors: Keyvan Mohajer, Mehul Patel
-
Publication number: 20240038234
Abstract: A device may receive a command associated with identifying a merchant for a virtual card swap procedure wherein the virtual card swap procedure is to replace a credit card of a user with a virtual card corresponding to the credit card. The device may identify the merchant for the virtual card swap procedure based on the command. The device may obtain the virtual card for the user. The device may determine a virtual card swap procedure template for the merchant. The device may perform the virtual card swap procedure based on the virtual card swap procedure template.
Type: Application
Filed: September 25, 2023
Publication date: February 1, 2024
Inventors: Adam VUKICH, Abdelkadar M'Hamed BENKREIRA, Vu NGUYEN, Joshua EDWARDS, Jonatan YUCRA RODRIGUEZ, David GABRIELE
-
Publication number: 20240038235
Abstract: A method includes, during a teleconference between a first audio input/output device and a second audio input/output device, receiving, at an analysis and response device, a signal indicating a spoken command, the spoken command associated with a command mode. The method further includes, in response to receiving the signal, generating, at the device, a reply message based on the spoken command, the reply message to be output to one or more devices selected based on the command mode. The one or more devices includes the first audio input/output device, the second audio input/output device, or a combination thereof.
Type: Application
Filed: October 12, 2023
Publication date: February 1, 2024
Inventors: Kwan Truong, Yibo Liu, Peter L. Chu, Zhemin Tu, Jesse Coleman, Cody Schnacker, Andrew Lochbaum
-
Publication number: 20240038236
Abstract: Utterance-based user interfaces can include activation trigger processing techniques for detecting activation triggers and causing execution of certain commands associated with particular command pattern activation triggers without waiting for output from a separate speech processing engine. The activation trigger processing techniques can also detect speech analysis patterns and selectively activate a speech processing engine.
Type: Application
Filed: October 12, 2023
Publication date: February 1, 2024
Applicant: Spotify AB
Inventor: Richard Mitic
-
Publication number: 20240038237
Abstract: A system for determining intent in a voice signal receives a first voice signal that indicates to perform a task. The system sends a first response that comprises a hyperlink associated with a particular webpage used to perform the task. The system receives a second voice signal that indicates whether to access the hyperlink. The system determines intent of the second voice signal by comparing keywords of the second voice signal with keywords of the first response. The system activates the hyperlink in response to determining that the keywords of the second voice signal correspond to the keywords of the first response.
Type: Application
Filed: October 16, 2023
Publication date: February 1, 2024
Inventor: Emad Noorizadeh
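The keyword comparison can be sketched with set intersection; the stopword list and `min_overlap` are assumed values:

```python
STOPWORDS = {"the", "a", "an", "to", "of", "and", "is", "it", "please"}

def keywords(text):
    return {w for w in text.lower().split() if w not in STOPWORDS}

def should_activate_link(second_voice_text, first_response_text, min_overlap=2):
    """Activate the hyperlink when enough keywords of the follow-up
    utterance match keywords of the first response."""
    overlap = keywords(second_voice_text) & keywords(first_response_text)
    return len(overlap) >= min_overlap
```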
-
Publication number: 20240038238
Abstract: Embodiments of this application provide a speech recognition method. The speech recognition method includes: obtaining a facial depth image and a to-be-recognized voice of a user, where the facial depth image is an image collected by using a depth camera; recognizing a mouth shape feature from the facial depth image, and recognizing a voice feature from a to-be-recognized audio; and fusing the voice feature and the mouth shape feature into an audio-video feature, and recognizing, based on the audio-video feature, a voice uttered by the user.
Type: Application
Filed: August 12, 2021
Publication date: February 1, 2024
Inventors: Lei QIN, Lele ZHANG, Hao LIU, Yuewan LU
-
Publication number: 20240038239
Abstract: A user device (e.g., voice assistant device, voice enabled device, smart device, computing device, etc.) may receive/detect audio content (e.g., speech, etc.) that includes a wake word and/or words similar to a wake word. The user device may require a wake word, a portion of the wake word, or words similar to the wake word to be detected prior to interacting with a user. The user device may, based on characteristics of the audio content, determine if the audio content originates from an authorized user. The user device may decrease and/or increase scrutiny applied to wake word detection based on whether audio content originates from an authorized user.
Type: Application
Filed: October 6, 2023
Publication date: February 1, 2024
Inventors: Hans Sayyadi, Nima Bina
-
Publication number: 20240038240
Abstract: A biometric authentication device is provided with: a replay unit for reproducing a sound; an ear authentication unit for acquiring a reverberation sound of the sound in an ear of a user to be authenticated, extracting an ear acoustic feature from the reverberation sound, and calculating an ear authentication score by comparing the extracted ear acoustic feature with an ear acoustic feature stored in advance; a voice authentication unit for extracting a talker feature from a voice of the user that has been input, and calculating a voice authentication score by comparing the extracted talker feature with a talker feature stored in advance; and an authentication integration unit for outputting an authentication integration result calculated based on the ear authentication score and the voice authentication score. After the sound is output into the ear, a recording unit inputs the voice of the user.
Type: Application
Filed: October 10, 2023
Publication date: February 1, 2024
Applicant: NEC Corporation
Inventors: Koji OKABE, Takayuki ARAKAWA, Takafumi KOSHINAKA
-
Publication number: 20240038241
Abstract: A biometric authentication device is provided with: a replay unit for reproducing a sound; an ear authentication unit for acquiring a reverberation sound of the sound in an ear of a user to be authenticated, extracting an ear acoustic feature from the reverberation sound, and calculating an ear authentication score by comparing the extracted ear acoustic feature with an ear acoustic feature stored in advance; a voice authentication unit for extracting a talker feature from a voice of the user that has been input, and calculating a voice authentication score by comparing the extracted talker feature with a talker feature stored in advance; and an authentication integration unit for outputting an authentication integration result calculated based on the ear authentication score and the voice authentication score. After the sound is output into the ear, a recording unit inputs the voice of the user.
Type: Application
Filed: October 10, 2023
Publication date: February 1, 2024
Applicant: NEC Corporation
Inventors: Koji OKABE, Takayuki ARAKAWA, Takafumi KOSHINAKA
-
Publication number: 20240038242
Abstract: A biometric authentication device is provided with: a replay unit for reproducing a sound; an ear authentication unit for acquiring a reverberation sound of the sound in an ear of a user to be authenticated, extracting an ear acoustic feature from the reverberation sound, and calculating an ear authentication score by comparing the extracted ear acoustic feature with an ear acoustic feature stored in advance; a voice authentication unit for extracting a talker feature from a voice of the user that has been input, and calculating a voice authentication score by comparing the extracted talker feature with a talker feature stored in advance; and an authentication integration unit for outputting an authentication integration result calculated based on the ear authentication score and the voice authentication score. After the sound is output into the ear, a recording unit inputs the voice of the user.
Type: Application
Filed: October 10, 2023
Publication date: February 1, 2024
Applicant: NEC Corporation
Inventors: Koji Okabe, Takayuki Arakawa, Takafumi Koshinaka
-
Publication number: 20240038243
Abstract: A biometric authentication device is provided with: a replay unit for reproducing a sound; an ear authentication unit for acquiring a reverberation sound of the sound in an ear of a user to be authenticated, extracting an ear acoustic feature from the reverberation sound, and calculating an ear authentication score by comparing the extracted ear acoustic feature with an ear acoustic feature stored in advance; a voice authentication unit for extracting a talker feature from a voice of the user that has been input, and calculating a voice authentication score by comparing the extracted talker feature with a talker feature stored in advance; and an authentication integration unit for outputting an authentication integration result calculated based on the ear authentication score and the voice authentication score. After the sound is output into the ear, a recording unit inputs the voice of the user.
Type: Application
Filed: October 10, 2023
Publication date: February 1, 2024
Applicant: NEC Corporation
Inventors: Koji OKABE, Takayuki Arakawa, Takafumi Koshinaka
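The four NEC publications above share one abstract. A minimal sketch of the authentication integration step, assuming a weighted-sum fusion; the weights, threshold, and the rule itself are assumptions, since the abstract only says the result is calculated based on the two scores:

```python
def integrate_scores(ear_score, voice_score, w_ear=0.5, w_voice=0.5,
                     accept_threshold=0.7):
    """Fuse the ear acoustic score and the voice (talker) score and decide."""
    integrated = w_ear * ear_score + w_voice * voice_score
    return integrated >= accept_threshold, integrated

# e.g. accepted, score = integrate_scores(ear_score=0.82, voice_score=0.74)
```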
-
Publication number: 20240038244
Abstract: A speaker subset selection means 81 selects speakers corresponding to an attribute from subset information of the entire speaker set to determine a subset of the speech model from which a test utterance is identified. A speaker identification means 82 identifies a speaker of the test utterance from the determined subset of the speech model based on features extracted from the test utterance.
Type: Application
Filed: December 25, 2020
Publication date: February 1, 2024
Applicant: NEC Corporation
Inventors: Qiongqiong Wang, Takafumi KOSHINAKA
-
Publication number: 20240038245
Abstract: This document generally describes systems, methods, devices, and other techniques related to speaker verification, including (i) training a neural network for a speaker verification model, (ii) enrolling users at a client device, and (iii) verifying identities of users based on characteristics of the users' voices. Some implementations include a computer-implemented method. The method can include receiving, at a computing device, data that characterizes an utterance of a user of the computing device. A speaker representation can be generated, at the computing device, for the utterance using a neural network on the computing device. The neural network can be trained based on a plurality of training samples that each: (i) include data that characterizes a first utterance and data that characterizes one or more second utterances, and (ii) are labeled as a matching speakers sample or a non-matching speakers sample.
Type: Application
Filed: October 11, 2023
Publication date: February 1, 2024
Applicant: Google LLC
Inventors: Georg Heigold, Samuel Bengio, Ignacio Lopez Moreno
-
Publication number: 20240038246
Abstract: Implementations relate to an automated assistant that is responsive, without requiring an invocation phrase or other invocation input(s), to certain spoken utterances when certain display content is being accessed by a user. The display content can be processed to identify certain inputs and/or other intents and parameters that are associated with assistant operations and are relevant to the display content. Thereafter, the automated assistant can determine whether any spoken utterances from the user correspond to those certain inputs, intents, and/or parameters. In response to receiving such a spoken utterance, the automated assistant can initialize performance of the relevant operation without necessitating that the user provides a preceding invocation phrase or other invocation input(s). When other display content is being accessed, the automated assistant can repeat the process for other inputs and operations.
Type: Application
Filed: July 28, 2022
Publication date: February 1, 2024
Inventors: Pu-sen Chao, Alex Fandrianto, Muhammad Umair
-
Publication number: 20240038247
Abstract: This application relates to a method and an apparatus for controlling a sound receiving device based on a dual-mode audio three-dimensional code and a method and an apparatus for parsing a control signal of a sound receiving device based on a dual-mode audio three-dimensional code. The method includes: receiving an operation instruction and encoding the operation instruction as a digital vector; obtaining a first audio three-dimensional code corresponding to a preset speech signal, and encoding the digital vector into the first audio three-dimensional code to obtain a second audio three-dimensional code; and converting the second audio three-dimensional code into a speech signal and sending the speech signal to a sound receiving device. In the method, operation convenience can be improved, and an instruction operation can be performed without arrangement of any additional module.
Type: Application
Filed: May 11, 2023
Publication date: February 1, 2024
Applicant: AUDICON CORPORATION
Inventors: Zhihui Xiong, Wang Chen