Patents by Inventor Luca Bondi

Luca Bondi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250022296
    Abstract: A method of controlling navigation of a device in an environment using machine learning (ML) models includes receiving visual and audio observation data of the environment as sensed by the device, determining classification scores for objects and regions in the environment based on the visual and audio observation data, encoding visual information based on the classification scores, determining audio-semantic feature embeddings based at least in part on the classification scores, the audio-semantic feature embeddings indicating spatial relationships between objects in the environment, between regions in the environment, and between objects and regions in the environment, and determining and outputting, based on the encoded visual information and the audio-semantic feature embeddings, a state representation corresponding to a state of the device within the environment.
    Type: Application
    Filed: July 14, 2023
    Publication date: January 16, 2025
    Inventors: Jonathan Francis, Luca Bondi, Gyan Tatiya, Ingrid Navarro
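The pipeline the abstract describes (encode visual classification scores, embed audio-semantic scores, combine both into a state vector) can be sketched roughly as follows. This is a minimal illustration, not the patented method: the tanh encoders, weight matrices, and dimensions are all assumptions.

```python
import numpy as np

def state_representation(visual_scores, audio_scores, w_visual, w_audio):
    """Hypothetical sketch: fuse visual and audio-semantic information
    into a single state vector for the navigating device."""
    visual_enc = np.tanh(w_visual @ visual_scores)  # encoded visual information
    audio_emb = np.tanh(w_audio @ audio_scores)     # audio-semantic feature embedding
    # state of the device within the environment
    return np.concatenate([visual_enc, audio_emb])

rng = np.random.default_rng(0)
state = state_representation(rng.random(5), rng.random(5),
                             rng.random((4, 5)), rng.random((4, 5)))
```

In practice the two branches would be learned networks; the point of the sketch is only the fusion of two score-derived encodings into one state representation.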
  • Publication number: 20250021770
    Abstract: In some implementations, the device may receive a first and a second audio dataset. In addition, the device may generate a first, a second, a third, and a fourth audio sample. Moreover, the device may determine a level of similarity between the first and second audio samples. Also, the device may combine the first and second audio samples into an audio pair. Further, the device may train a machine learning model to map audio samples to a latent space visualization in view of time and the similarities between the first and second audio samples, yielding a trained machine learning model. In addition, the device may map, by the machine learning model, the third and fourth audio samples in the latent space visualization, where placement of the third and fourth audio samples depends on the level of similarity between them.
    Type: Application
    Filed: July 14, 2023
    Publication date: January 16, 2025
    Inventors: Alessandro Ilic Mezza, Luca Bondi, Shabnam Ghaffarzadegan, Pongtep Angkititrakul
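The pairing step described above (measure similarity between two audio samples and combine them into a training pair when similar enough) can be sketched as below; the cosine-similarity measure and the 0.8 threshold are illustrative assumptions.

```python
import numpy as np

def cosine_similarity(a, b):
    """Similarity between two audio feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def pair_if_similar(sample_a, sample_b, threshold=0.8):
    """Combine two samples into an audio pair only when their level of
    similarity exceeds a (hypothetical) threshold; otherwise no pair."""
    if cosine_similarity(sample_a, sample_b) >= threshold:
        return (sample_a, sample_b)
    return None
```

Pairs produced this way could then serve as supervision for a model that places similar samples close together in a latent space visualization.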
  • Publication number: 20250005426
    Abstract: A method of training a machine learning (ML) model includes obtaining a dataset that includes first training data obtained using two or more ground truth sensing systems and second training data obtained using a prediction sensing system configured to implement the ML model, determining a loss function based on the first training data, the loss function defining a region of zero loss based on a minimum and a maximum of the first training data, calculating, using the ML model, a prediction output based on the second training data, calculating, using the loss function, a loss of the ML model based on the prediction output, and updating the ML model based on the calculated loss.
    Type: Application
    Filed: June 27, 2023
    Publication date: January 2, 2025
    Inventors: Luca Bondi, Shabnam Ghaffarzadegan, Samarjit Das
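The loss function above is concrete enough to sketch: zero loss whenever the prediction falls between the minimum and maximum of the ground-truth sensor readings, and a penalty outside that region. The squared distance to the nearest bound used here is an assumption; the abstract specifies only the zero-loss region.

```python
def interval_loss(prediction, ground_truth_readings):
    """Loss with a region of zero loss spanning the min..max of the
    ground-truth data from the two or more sensing systems."""
    lo, hi = min(ground_truth_readings), max(ground_truth_readings)
    if lo <= prediction <= hi:
        return 0.0  # prediction is consistent with the ground-truth spread
    nearest = lo if prediction < lo else hi
    return (prediction - nearest) ** 2  # assumed penalty outside the region
```

For example, with readings [1.0, 2.0] a prediction of 1.5 incurs no loss, while 3.0 is penalized by its squared distance to the upper bound.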
  • Patent number: 12020156
    Abstract: A method includes receiving audio stream data associated with a data capture environment, and receiving sensor data associated with the data capture environment. The method also includes identifying at least some events in the sensor data, and calculating at least one offset value for at least a portion of the audio stream data that corresponds to at least one event of the sensor data. The method also includes synchronizing at least a portion of the sensor data associated with the portion of the audio stream data that corresponds to the at least one event of the sensor data, and labeling at least the portion of the audio stream data that corresponds to the at least one event of the sensor data. The method also includes generating training data using at least some of the labeled portion of the audio stream data, and training a machine learning model using the training data.
    Type: Grant
    Filed: July 13, 2022
    Date of Patent: June 25, 2024
    Assignee: Robert Bosch GmbH
    Inventors: Luca Bondi, Shabnam Ghaffarzadegan, Samarjit Das
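The offset-calculation step (find a time offset that aligns sensor events with the audio stream) can be sketched with a brute-force search over candidate offsets; the candidate set and misalignment cost below are assumptions for illustration.

```python
def best_offset(audio_event_times, sensor_event_times):
    """Hypothetical sketch: candidate offsets are all pairwise timestamp
    differences; pick the one minimizing total misalignment."""
    candidates = [a - s for a in audio_event_times for s in sensor_event_times]

    def misalignment(offset):
        # Sum of distances from each shifted sensor event to its
        # nearest audio event.
        return sum(min(abs(a - (s + offset)) for a in audio_event_times)
                   for s in sensor_event_times)

    return min(candidates, key=misalignment)
```

Once the offset is known, sensor-side event labels can be transferred to the corresponding stretches of audio, which is what makes the weakly supervised training data described in the abstract possible.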
  • Publication number: 20240153524
    Abstract: A system for automatically selecting a sound recognition model for an environment based on audio data and image data associated with the environment. The system includes a camera, a microphone, a memory including a plurality of sound recognition models, and an electronic processor. The electronic processor is configured to receive the audio data associated with the environment from the microphone, receive the image data associated with the environment from the camera, and determine one or more characteristics of the environment based on the audio data and the image data. The electronic processor is also configured to select the sound recognition model from the plurality of sound recognition models based on the one or more characteristics of the environment, receive additional audio data associated with the environment from the microphone, and analyze the additional audio data using the sound recognition model to perform a sound recognition task.
    Type: Application
    Filed: November 3, 2022
    Publication date: May 9, 2024
    Inventors: Luca Bondi, Irtsam Ghazi
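The model-selection step (pick a sound recognition model from a stored set based on characteristics inferred from audio and image data) can be sketched as a tag-overlap lookup; the tag vocabulary and model names below are invented for illustration.

```python
def select_model(environment_characteristics, model_tags):
    """Hypothetical sketch: model_tags maps each stored model to the
    environment characteristics it handles; choose the best overlap."""
    wanted = set(environment_characteristics)
    return max(model_tags, key=lambda name: len(wanted & set(model_tags[name])))

models = {
    "office_model": ["indoor", "speech"],
    "traffic_model": ["outdoor", "vehicles"],
}
choice = select_model(["outdoor", "vehicles"], models)
```

The selected model would then be applied to subsequent audio from the same microphone to perform the sound recognition task.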
  • Publication number: 20240020526
    Abstract: A method of data augmentation includes receiving audio stream data associated with at least one impulse event, receiving a label associated with the audio stream data, and detecting, using an onset detector, at least one peak of the at least one impulse event. The method also includes extracting at least one positive sample of the audio stream data associated with the at least one impulse event. The method also includes applying, to the at least one positive sample, the label associated with the audio stream data and extracting at least one negative sample of the audio stream data associated with the at least one impulse event. The method also includes augmenting training data based on the at least one positive sample and the at least one negative sample and training at least one machine-learning model using the augmented training data.
    Type: Application
    Filed: July 13, 2022
    Publication date: January 18, 2024
    Inventors: Luca Bondi, Samarjit Das, Shabnam Ghaffarzadegan
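The onset-detection and sample-extraction steps can be sketched with a toy peak picker: local maxima above a threshold mark impulse peaks, windows around them become positive samples, and values far from any peak serve as negatives. The threshold, window size, and peak criterion are all assumptions.

```python
def extract_samples(signal, threshold, window):
    """Hypothetical sketch of onset-based positive/negative extraction."""
    # A peak is a local maximum above the threshold.
    peaks = [i for i in range(1, len(signal) - 1)
             if signal[i] > threshold
             and signal[i] >= signal[i - 1]
             and signal[i] >= signal[i + 1]]
    # Positive samples: windows around each detected impulse peak.
    positives = [signal[max(0, p - window): p + window + 1] for p in peaks]
    # Negative samples: values far from every peak.
    negatives = [signal[i] for i in range(len(signal))
                 if all(abs(i - p) > window for p in peaks)]
    return positives, negatives

pos, neg = extract_samples([0, 0, 1, 5, 1, 0, 0, 0], threshold=2, window=1)
```

The label attached to the stream would then be applied to the positive windows, and both sets would augment the training data.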
  • Publication number: 20240020525
    Abstract: A method includes receiving audio stream data associated with a data capture environment, and receiving sensor data associated with the data capture environment. The method also includes identifying at least some events in the sensor data, and calculating at least one offset value for at least a portion of the audio stream data that corresponds to at least one event of the sensor data. The method also includes synchronizing at least a portion of the sensor data associated with the portion of the audio stream data that corresponds to the at least one event of the sensor data, and labeling at least the portion of the audio stream data that corresponds to the at least one event of the sensor data. The method also includes generating training data using at least some of the labeled portion of the audio stream data, and training a machine learning model using the training data.
    Type: Application
    Filed: July 13, 2022
    Publication date: January 18, 2024
    Inventors: Luca Bondi, Shabnam Ghaffarzadegan, Samarjit Das
  • Patent number: 11830239
    Abstract: A method for labeling audio data includes receiving video stream data and audio stream data that corresponds to at least a portion of the video stream data. The method also includes labeling at least some objects of the video stream data. The method also includes calculating at least one offset value for at least a portion of the audio stream data that corresponds to at least one labeled object of the video stream data. The method also includes synchronizing at least a portion of the video stream data with the portion of the audio stream data. The method also includes labeling at least the portion of the audio stream data that corresponds to the at least one labeled object of the video stream data and generating training data using at least some of the labeled portion of the audio stream data.
    Type: Grant
    Filed: July 13, 2022
    Date of Patent: November 28, 2023
    Assignee: Robert Bosch GmbH
    Inventors: Shabnam Ghaffarzadegan, Samarjit Das, Luca Bondi
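The label-propagation step above (transfer labels from labeled video objects to the synchronized audio stream via a computed offset) can be sketched as a timestamp shift; the fixed segment duration and tuple layout are assumptions for illustration.

```python
def label_audio_segments(video_object_labels, offset, segment_duration):
    """Hypothetical sketch: video_object_labels is a list of
    (label, video_timestamp) pairs; the offset maps video time onto
    audio time, yielding labeled (label, start, end) audio segments."""
    return [(label, t + offset, t + offset + segment_duration)
            for label, t in video_object_labels]

segments = label_audio_segments([("door_slam", 1.0), ("glass_break", 4.0)],
                                offset=0.5, segment_duration=2.0)
```

The resulting labeled audio segments are what the abstract's final step turns into training data.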
  • Publication number: 20230259810
    Abstract: A computer-implemented system and method includes obtaining a plurality of tasks from a first domain. A machine learning system is trained to perform a first task. A first set of prototypes is generated. The first set of prototypes is associated with a first set of classes of the first task. The machine learning system is updated based on a first loss output. The first loss output includes a first task loss, which takes into account the first set of prototypes. The machine learning system is trained to perform a second task. A second set of prototypes is generated. The second set of prototypes is associated with a second set of classes of the second task. The machine learning system is updated based on a second loss output. The second loss output includes a second task loss, which takes into account the second set of prototypes. The machine learning system is updated based on the second loss output. The machine learning system is fine-tuned with a new task from a second domain.
    Type: Application
    Filed: February 11, 2022
    Publication date: August 17, 2023
    Inventors: Bingqing Chen, Luca Bondi, Samarjit Das
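A common reading of the per-task "prototypes" above is the mean embedding of each class, with new inputs matched to the nearest prototype; that interpretation (and the Euclidean distance) is an assumption, sketched below.

```python
import numpy as np

def class_prototypes(embeddings, labels):
    """One prototype per class: the mean embedding of that class."""
    return {c: np.mean([e for e, l in zip(embeddings, labels) if l == c], axis=0)
            for c in set(labels)}

def nearest_prototype(embedding, prototypes):
    """Assign a new embedding to the class with the closest prototype."""
    return min(prototypes,
               key=lambda c: np.linalg.norm(np.asarray(embedding) - prototypes[c]))

embs = [[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]]
protos = class_prototypes(embs, ["a", "a", "b", "b"])
```

In the abstract's setting, each task contributes its own prototype set, and the task loss penalizes embeddings that drift from their class prototypes as new tasks are learned.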