Patents by Inventor Georg Stemmer

Georg Stemmer has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Methods and apparatus to generate binaural sounds for hearing devices

Patent number: 12684304

Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed to generate binaural sounds for hearing devices. An example apparatus includes processor circuitry to at least access audio data corresponding to multiple devices, ones of the multiple devices positioned at spatial locations relative to a listener, identify a position of the listener relative to the multiple devices, adjust, based on the spatial locations and the position of the listener, the audio data associated with at least one of the multiple devices, transmit the adjusted audio data to a hearing device associated with the listener, the adjusted audio data including a binaural sound corresponding to each of the spatial locations.

Type: Grant

Filed: May 27, 2022

Date of Patent: July 14, 2026

Assignee: Intel Corporation

Inventors: Georg Stemmer, Hector Cordourier Maruri, Willem Beltman

Automatic personal identifiable information removal from audio

Patent number: 12614546

Abstract: This disclosure describes systems, methods, and devices related to automatic personal identifiable information (PII) removal. A system may detect a sound signal received from a vicinity of a machine during the operation of the machine. The system may perform speech detection to detect a segment of the sound signal that comprises a speech signal. The system may modify the sound signal at the segment of the sound signal by performing a segment replacement mechanism. The system may generate a filtered sound signal to be used for monitoring the operation of the machine.

Type: Grant

Filed: November 23, 2021

Date of Patent: April 28, 2026

Assignee: Intel Corporation

Inventors: Raju Arvind, Jose Lopez, Georg Stemmer

PERSONALIZED DEEPFAKE DETECTION

Publication number: 20260088032

Abstract: Systems and methods are provided for detecting audio deepfakes, including synthetic speech generated using advanced artificial intelligence techniques. The disclosed techniques address the shortcomings of existing deepfake detection models, which often fail to robustly distinguish between authentic and synthetic audio and may require extensive retraining or large datasets. The deepfake detection system leverages verified audio samples of a known speaker to generate a distribution of detection scores using a speaker-independent deepfake detector, without modifying or retraining the underlying model. By segmenting the verified samples and constructing a statistical reference distribution, the system applies a statistical test to determine whether the detection scores from an unverified audio input are consistent with the reference distribution.
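As an illustration only, the comparison of an unverified input against a speaker's reference score distribution might be sketched as follows. The abstract does not name the statistical test, so this sketch assumes a two-sided permutation test on the difference of mean detection scores. The function names, the significance level, and the idea of passing scores in as plain lists are all hypothetical; the detector that produces the scores is outside the sketch.

```python
import random
from statistics import mean


def permutation_p_value(reference, candidate, n_permutations=2000, seed=0):
    """Two-sided permutation test on the difference of mean scores.

    `reference` holds detector scores for segments of verified audio,
    `candidate` holds scores for segments of the unverified input.
    """
    rng = random.Random(seed)
    observed = abs(mean(candidate) - mean(reference))
    pooled = list(reference) + list(candidate)
    n_candidate = len(candidate)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_candidate]) - mean(pooled[n_candidate:]))
        if diff >= observed:
            hits += 1
    # Add-one smoothing keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_permutations + 1)


def is_consistent(reference, candidate, alpha=0.05):
    """True when the candidate scores are not significantly different."""
    return permutation_p_value(reference, candidate) >= alpha
```

A low p-value would flag the input as inconsistent with the verified speaker's scores; the underlying detector is never retrained, which matches the constraint stated in the abstract.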

Type: Application

Filed: November 24, 2025

Publication date: March 26, 2026

Applicant: Intel Corporation

Inventors: Jose Lopez, Georg Stemmer

VOICE TRANSFORMATION FOR THROAT MICROPHONES

Publication number: 20260080888

Abstract: Systems and methods are provided for transforming audio signals captured by a throat microphone into signals emulating speech recorded with a conventional air-conduction microphone. Throat microphones employ vibration sensors positioned on the neck to capture audio, making them suitable for high-noise environments. However, throat microphone signals lack high-frequency components, reducing intelligibility and degrading automatic speech recognition performance. The techniques provided herein apply signal-processing operations and a lightweight neural network to reconstruct missing spectral details. The input signal is converted to log-Mel spectra and modeled as a smooth average spectrum (SAS) plus a residual component. A neural network predicts a conventional-microphone SAS. A vocoder synthesizes an enhanced audio signal after combining the predicted SAS with the residual component.
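The split of a log-Mel frame into a smooth average spectrum (SAS) plus a residual, and the later recombination with a predicted SAS, could look roughly like the sketch below. The abstract does not say how the SAS is computed, so a moving average across Mel bins is assumed here. The function names and the `half_width` parameter are hypothetical, and the neural network and vocoder are left out entirely.

```python
def smooth_average_spectrum(log_mel, half_width=2):
    """Moving average across Mel bins (an assumed smoothing method)."""
    n = len(log_mel)
    sas = []
    for i in range(n):
        lo, hi = max(0, i - half_width), min(n, i + half_width + 1)
        window = log_mel[lo:hi]
        sas.append(sum(window) / len(window))
    return sas


def decompose(log_mel, half_width=2):
    """Return (SAS, residual) so that SAS + residual equals the input frame."""
    sas = smooth_average_spectrum(log_mel, half_width)
    residual = [x - s for x, s in zip(log_mel, sas)]
    return sas, residual


def recombine(predicted_sas, residual):
    """Add the fine-detail residual back onto a predicted SAS."""
    return [s + r for s, r in zip(predicted_sas, residual)]
```

In this arrangement a model would only need to predict the smooth envelope, while the residual carries the fine spectral detail through unchanged.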

Type: Application

Filed: November 20, 2025

Publication date: March 19, 2026

Applicant: Intel Corporation

Inventors: Hector Alfonso Cordourier Maruri, Julio Cesar Zamora Esquivel, Paulo Lopez Meyer, Alejandro Ibarra Von Borstel, Leobardo Campos Macias, Margarita Jauregui Franco, Rodrigo Aldana Lopez, Edgar Macias Garcia, Georg Stemmer, Nathan Mataya, Priyanka Dhage, Johan Rivera, Karla Cruz-Lee, Saran Poovarodom

Method and system of automatic context-bound domain-specific speech recognition

Patent number: 12555572

Abstract: A system, article, and method of automatic context-bound domain-specific speech recognition uses general language models.

Type: Grant

Filed: December 24, 2021

Date of Patent: February 17, 2026

Assignee: Intel Corporation

Inventors: Szymon Jessa, Jakub Nowicki, Michal Papaj, Piotr Hoffmann, Krzysztof Swider, Georg Stemmer

Deepfake detection models utilizing subject-specific libraries

Patent number: 12475695

Abstract: An apparatus to facilitate deepfake detection models utilizing subject-specific libraries is disclosed. The apparatus includes one or more processors to store a plurality of deepfake detection models corresponding to a plurality of subjects of interest; receive a query to identify whether data pertaining to a target subject of interest is a deepfake, the target subject of interest comprised in the plurality of subjects of interest and associated with a subject identifier (ID); identify a deepfake detection model corresponding to the subject ID; extract features for deepfake detection from the data; input the extracted features to the identified deepfake detection model corresponding to the subject ID; and responsive to an output of the deepfake detection model exceeding a determined deepfake threshold, generate a notification, in response to the query, indicating a possible deepfake attack corresponding to the target subject of interest.

Type: Grant

Filed: September 22, 2021

Date of Patent: November 18, 2025

Assignee: INTEL CORPORATION

Inventors: Georg Stemmer, Carl Marshall, Satyam Srivastava, Ilke Demir

Enhanced spatial audio-based virtual seating arrangements

Patent number: 12475439

Abstract: This disclosure describes systems, methods, and devices related to presenting video conferencing virtual seating arrangements. A method may include generating a first similarity score indicative of a first similarity between a first voice of a first virtual meeting user and a second voice of a second virtual meeting user; generating a second similarity score indicative of a second similarity between the first voice of the first virtual meeting user and a third voice of a third virtual meeting user; determining, based on the first similarity score and the second similarity score, a similarity loss for a virtual seating arrangement; determining that the similarity loss is a minimum similarity loss of respective similarity losses for different virtual seating arrangements; generating presentation data, for the virtual meeting, including virtual representations of the virtual meeting users arranged based on the virtual seating arrangement; and presenting the presentation data.
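To make the selection step concrete, here is a minimal sketch of choosing the arrangement with the smallest similarity loss. The abstract does not define the loss, so this example assumes it is the sum of voice-similarity scores between users in adjacent seats of a single row, which pushes similar-sounding voices apart. The names `similarity_loss` and `best_arrangement` and the brute-force search are hypothetical.

```python
from itertools import permutations


def similarity_loss(arrangement, similarity):
    """Sum of pairwise voice similarity over adjacent seats (assumed loss)."""
    return sum(
        similarity[frozenset((a, b))]
        for a, b in zip(arrangement, arrangement[1:])
    )


def best_arrangement(users, similarity):
    """Exhaustively pick the seating order with the minimum similarity loss."""
    return min(
        permutations(users),
        key=lambda arrangement: similarity_loss(arrangement, similarity),
    )
```

Exhaustive search is only practical for small meetings, since the number of orderings grows factorially with the number of users.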

Type: Grant

Filed: June 29, 2022

Date of Patent: November 18, 2025

Assignee: Intel Corporation

Inventors: Georg Stemmer, Willem Beltman, Hector Cordourier Maruri

VOICE COMMAND RECOGNITION FOR HUMAN-ROBOT COMMUNICATION

Publication number: 20250322831

Abstract: Techniques for the use of verbal commands in human-robot communication. The number of tasks the robot can perform is limited to a specific set, while providing syntactic flexibility to users. The system includes two components: a speech recognizer for speech-to-text conversion and a natural language understanding module that maps the text to a command for the robot. After speech is transcribed to text, a nearest neighbor classifier can be applied in the high dimensional space of embedding tokens. Multiple variants of each command are provided in a database of reference embeddings, and the classifier can identify the k nearest reference embedding tokens to determine the command. The text similarity model allows for quick detection solutions to be deployed locally on a robot or other device. Local deployment reduces potential latency caused by a cloud connection, which can be important in many assistant robot applications.
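The nearest-neighbor step over reference embeddings might be sketched as below. This is an assumption-laden illustration: the abstract does not specify the distance measure or voting rule, so cosine similarity and a simple majority vote are used here. The function names are hypothetical, and the text embedding model that would produce the vectors is not shown.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify_command(query, references, k=3):
    """Majority vote among the k reference embeddings closest to the query.

    `references` is a list of (embedding, command) pairs, with several
    phrasing variants stored per command.
    """
    ranked = sorted(
        references,
        key=lambda ref: cosine_similarity(query, ref[0]),
        reverse=True,
    )
    votes = Counter(command for _, command in ranked[:k])
    return votes.most_common(1)[0][0]
```

Because the lookup is a similarity search over a small, fixed reference set, it can plausibly run on the device itself, which is the latency benefit the abstract points to.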

Type: Application

Filed: June 26, 2025

Publication date: October 16, 2025

Applicant: Intel Corporation

Inventors: Hector Alfonso Cordourier Maruri, Georg Stemmer, Laura Gonzalez Ojeda, Monica Rivera Aguilera, Nathan Mataya, Priyanka Dhage, Johan Rivera, Karla Cruz-Lee, Saran Poovarodom

Adaptively recognizing speech using key phrases

Patent number: 12125482

Abstract: An example apparatus for recognizing speech includes an audio receiver to receive a stream of audio. The apparatus also includes a key phrase detector to detect a key phrase in the stream of audio. The apparatus further includes a model adapter to dynamically adapt a model based on the detected key phrase. The apparatus also includes a query recognizer to detect a voice query following the key phrase in a stream of audio via the adapted model.

Type: Grant

Filed: November 22, 2019

Date of Patent: October 22, 2024

Assignee: Intel Corporation

Inventors: Krzysztof Czarnowski, Munir Nikolai Alexander Georges, Tobias Bocklet, Georg Stemmer

METHODS AND APPARATUS FOR REAL-TIME VOICE TYPE DETECTION IN AUDIO DATA

Publication number: 20240290343

Abstract: Methods, apparatus, systems, and articles of manufacture for real-time voice type detection in audio data are disclosed. An example non-transitory computer-readable medium disclosed herein includes instructions, which when executed, cause one or more processors to at least identify a first vocal effort of a first audio segment of first audio data and a second vocal effort of a second audio segment of the first audio data, train a neural network including training data, the training data including the first vocal effort, the first audio segment, the second audio segment, and the second vocal effort, and deploy the neural network, the neural network to distinguish between the first vocal effort and the second vocal effort.

Type: Application

Filed: February 28, 2023

Publication date: August 29, 2024

Inventors: Hector Alfonso Cordourier Maruri, Himanshu Bhalla, Georg Stemmer, Sinem Aslan, Julio Cesar Zamora, Jose Rodrigo Camacho Perez, Paulo Lopez Meyer, Alejandro Ibarra Von Borstel, Jose Israel Torres Ortega, Juan Antonio Del Hoyo Ontiveros

SOUND SOURCE SEPARATION USING ANGULAR LOCATION

Publication number: 20240274148

Abstract: Systems and methods for audio source separation. A deep learning-based system uses an azimuth angle location to separate an audio signal originating from a selected location from other sound. Techniques are disclosed for steering a virtual direction of a microphone towards a selected speaker. A deep-learning based audio regression method, which can be implemented as a neural network, learns to separate out various speakers by leveraging spectral and spatial characteristics of all sources. The neural network can focus on multiple sources in multiple respective target directions, and cancel out other sounds. A user can choose which source to listen to. The network can use the time-domain signal and a frequency-domain signal to separate out the target signal and generate a separated audio output. The direction of the selected speaker relative to the microphone array can be input to the system as a vector.

Type: Application

Filed: April 25, 2024

Publication date: August 15, 2024

Applicant: Intel Corporation

Inventors: Jesus Ferrer Romero, Hector Cordourier Maruri, Georg Stemmer, Willem Beltman

DEEP LEARNING SOLUTION FOR VIRTUAL ROTATION OF BINAURAL AUDIO SIGNALS

Publication number: 20240244389

Abstract: Techniques are provided herein for providing binaural sound signals that are virtually rotated to match head rotation, such that audio output to headphones is perceived to maintain its location relative to user when a user turns their head. In particular, techniques are presented to extract spherical location information already embedded in binaural signals to generate binaural sound signals that change to match head rotation. A deep-learning based audio regression method can use a 2-channel binaural audio signal and a rotation angle as input, and generate a new binaural audio output signal with the rotated environment corresponding to the rotation angle. The deep-learning based audio regression method can be implemented as a neural network, and can include deep learning operations, such as convolution, pooling, elementwise operation, linear operation, and nonlinear operation. A deep learning operation may be performed on internal parameters of the DNNs and one or more activations.

Type: Application

Filed: February 6, 2024

Publication date: July 18, 2024

Applicant: Intel Corporation

Inventors: Hector Cordourier Maruri, Jesus Ferrer Romero, Willem Beltman, Georg Stemmer

METHODS AND APPARATUS FOR AUDIO ADJUSTMENT BASED ON VOCAL EFFORT

Publication number: 20230410810

Abstract: Methods and apparatus to audio adjustment based on vocal effort are disclosed herein. An example apparatus comprising interface circuitry, machine readable instructions, and programmable circuitry to at least one of instantiate or execute the machine readable instructions to identify speech with a soft voice type in audio from a first user device, the speech with the soft voice type including phonation, modify the audio to generate modified audio based on the identification of the speech with the soft voice type, and output the modified audio from a second user device.

Type: Application

Filed: August 28, 2023

Publication date: December 21, 2023

Inventors: Hector Alfonso Cordourier Maruri, Georg Stemmer, Lukasz Kurylo, Himanshu Bhalla

ENHANCED SPATIAL AUDIO-BASED VIRTUAL SEATING ARRANGEMENTS

Publication number: 20220343289

Abstract: This disclosure describes systems, methods, and devices related to presenting video conferencing virtual seating arrangements. A method may include generating a first similarity score indicative of a first similarity between a first voice of a first virtual meeting user and a second voice of a second virtual meeting user; generating a second similarity score indicative of a second similarity between the first voice of the first virtual meeting user and a third voice of a third virtual meeting user; determining, based on the first similarity score and the second similarity score, a similarity loss for a virtual seating arrangement; determining that the similarity loss is a minimum similarity loss of respective similarity losses for different virtual seating arrangements; generating presentation data, for the virtual meeting, including virtual representations of the virtual meeting users arranged based on the virtual seating arrangement; and presenting the presentation data.

Type: Application

Filed: June 29, 2022

Publication date: October 27, 2022

Inventors: Georg Stemmer, Willem Beltman, Hector Cordourier Maruri

METHODS AND APPARATUS TO GENERATE BINAURAL SOUNDS FOR HEARING DEVICES

Publication number: 20220286798

Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed to generate binaural sounds for hearing devices. An example apparatus includes processor circuitry to at least access audio data corresponding to multiple devices, ones of the multiple devices positioned at spatial locations relative to a listener, identify a position of the listener relative to the multiple devices, adjust, based on the spatial locations and the position of the listener, the audio data associated with at least one of the multiple devices, transmit the adjusted audio data to a hearing device associated with the listener, the adjusted audio data including a binaural sound corresponding to each of the spatial locations.

Type: Application

Filed: May 27, 2022

Publication date: September 8, 2022

Inventors: Georg Stemmer, Hector Cordourier Maruri, Willem Beltman

METHOD AND SYSTEM OF AUTOMATIC CONTEXT-BOUND DOMAIN-SPECIFIC SPEECH RECOGNITION

Publication number: 20220122596

Abstract: A system, article, and method of automatic context-bound domain-specific speech recognition uses general language models.

Type: Application

Filed: December 24, 2021

Publication date: April 21, 2022

Applicant: Intel Corporation

Inventors: Szymon Jessa, Jakub Nowicki, Michal Papaj, Piotr Hoffmann, Krzysztof Swider, Georg Stemmer

Systems and methods for energy efficient and low power distributed automatic speech recognition on wearable devices

Patent number: 11308978

Abstract: Methods, apparatus, systems and articles of manufacture are disclosed for distributed automatic speech recognition. An example apparatus includes a detector to process an input audio signal and identify a portion of the input audio signal including a sound to be evaluated, the sound to be evaluated organized into a plurality of audio features representing the sound. The example apparatus includes a quantizer to process the audio features using a quantization process to reduce the audio features to generate a reduced set of audio features for transmission. The example apparatus includes a transmitter to transmit the reduced set of audio features over a low-energy communication channel for processing.
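One way to picture the quantization step is a uniform 8-bit quantizer that shrinks floating-point audio features to one byte each before transmission. This is a hedged sketch, not the patented quantization process: the abstract does not specify the scheme, and the function names, the 8-bit width, and the min/max scaling are all assumptions made for illustration.

```python
def quantize(features):
    """Map float features to one byte each using uniform min/max scaling."""
    lo, hi = min(features), max(features)
    # Guard against a constant feature vector (zero range).
    scale = (hi - lo) / 255 if hi > lo else 1.0
    codes = bytes(round((x - lo) / scale) for x in features)
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Approximate reconstruction on the receiving side."""
    return [lo + c * scale for c in codes]
```

The receiver needs only the byte string plus the two scalars `lo` and `scale`, so the payload sent over a low-energy link is a fraction of the original floating-point size.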

Type: Grant

Filed: August 5, 2019

Date of Patent: April 19, 2022

Assignee: INTEL CORPORATION

Inventors: Binuraj K. Ravindran, Francis M. Tharappel, Prabhakar R. Datta, Tobias Bocklet, Maciej Muchlinski, Tomasz Dorau, Josef G. Bauer, Saurin Shah, Georg Stemmer

AUTOMATIC PERSONAL IDENTIFIABLE INFORMATION REMOVAL FROM AUDIO

Publication number: 20220084521

Abstract: This disclosure describes systems, methods, and devices related to automatic personal identifiable information (PII) removal. A system may detect a sound signal received from a vicinity of a machine during the operation of the machine. The system may perform speech detection to detect a segment of the sound signal that comprises a speech signal. The system may modify the sound signal at the segment of the sound signal by performing a segment replacement mechanism. The system may generate a filtered sound signal to be used for monitoring the operation of the machine.

Type: Application

Filed: November 23, 2021

Publication date: March 17, 2022

Inventors: Raju Arvind, Jose Lopez, Georg Stemmer

DEEPFAKE DETECTION MODELS UTILIZING SUBJECT-SPECIFIC LIBRARIES

Publication number: 20220004904

Abstract: An apparatus to facilitate deepfake detection models utilizing subject-specific libraries is disclosed. The apparatus includes one or more processors to store a plurality of deepfake detection models corresponding to a plurality of subjects of interest; receive a query to identify whether data pertaining to a target subject of interest is a deepfake, the target subject of interest comprised in the plurality of subjects of interest and associated with a subject identifier (ID); identify a deepfake detection model corresponding to the subject ID; extract features for deepfake detection from the data; input the extracted features to the identified deepfake detection model corresponding to the subject ID; and responsive to an output of the deepfake detection model exceeding a determined deepfake threshold, generate a notification, in response to the query, indicating a possible deepfake attack corresponding to the target subject of interest.

Type: Application

Filed: September 22, 2021

Publication date: January 6, 2022

Applicant: Intel Corporation

Inventors: Georg Stemmer, Carl Marshall, Satyam Srivastava, Ilke Demir

Continuous topic detection and adaption in audio environments

Patent number: 11031005

Abstract: A mechanism is described for facilitating continuous topic detection and adaption in audio environments, according to one embodiment. A method of embodiments, as described herein, includes detecting a term relating to a topic in an audio input received from one or more microphones of the computing device including a voice-enabled device; analyzing the term based on the topic to determine an action to be performed by the computing device; and triggering an event to facilitate the computing device to perform the action consistent with the term and the topic.

Type: Grant

Filed: December 17, 2018

Date of Patent: June 8, 2021

Assignee: INTEL CORPORATION

Inventors: Georg Stemmer, Andrzej Mialkowski, Joachim Hofer, Piotr Rozen, Tomasz Szmelczynski
