Patents by Inventor Han Lu

Han Lu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Materials and methods related to image processing

Patent number: 12626364

Abstract: The present disclosure provides materials and methods related to image processing. In particular, the present disclosure provides methods for enhancing target signal detection using image processing analysis that identifies and removes non-specific background signals. The image processing methods of the present disclosure are useful for enhancing target signals in a variety of assays that involve fluorescent detection (e.g., fluorescent in situ hybridization, immunofluorescence).

Type: Grant

Filed: April 15, 2022

Date of Patent: May 12, 2026

Assignee: Advanced Cell Diagnostics, Inc.

Inventors: Ching-Wei Chang, Xiao-Jun Ma, Han Lu, Bing-Qing Zhang, HaYeun Ji, Ming Yu
Frequency hopping communication method for short-distance wireless communication, and related device

Patent number: 12621020

Abstract: A frequency hopping communication method for short-distance wireless communication between a primary device and a secondary device comprises switching by the primary device from the second frequency hopping sequence to the first frequency hopping sequence at a time point of frequency hopping switching, wherein M reserved channels are between a first frequency hopping sequence and a second frequency hopping sequence; and performing frequency hopping communication with the secondary device based on the first frequency hopping sequence.

Type: Grant

Filed: March 22, 2024

Date of Patent: May 5, 2026

Assignee: HUAWEI TECHNOLOGIES CO., LTD.

Inventors: Tong Chen, Han Lu, Zehong Zhang, Bixiang Hu, Yufei Yang, Rui Cui, Shaojie Xue
METHODS FOR MULTIPLEX DETECTION OF NUCLEIC ACIDS BY IN SITU HYBRIDIZATION

Publication number: 20260110019

Abstract: The invention relates to methods of multiplex detection of a plurality of target nucleic acids by contacting a sample comprising a cell with target probe sets that specifically hybridize to target nucleic acids, with pre-amplifiers or pre-pre-amplifiers specific for each target probe set, with amplifiers specific for the pre-amplifiers, and with label probes specific for the amplifiers, resulting in specific labeling of multiple target nucleic acids. The invention also relates to samples, slides and kits relating to detection of multiple target nucleic acids.

Type: Application

Filed: December 17, 2025

Publication date: April 23, 2026

Inventors: Xiao-Jun MA, Bingqing ZHANG, Li-chong WANG, Han LU, Li WANG, Hailing ZONG
Clustering and mining accented speech for inclusive and fair speech recognition

Patent number: 12609111

Abstract: A method of training an accent recognition model includes receiving a corpus of training utterances spoken across various accents, each training utterance in the corpus including training audio features characterizing the training utterance, and executing a training process to train the accent recognition model on the corpus of training utterances to teach the accent recognition model to learn how to predict accent representations from the training audio features. The accent recognition model includes one or more strided convolution layers, a stack of multi-headed attention layers, and a pooling layer configured to generate a corresponding accent representation.

Type: Grant

Filed: February 26, 2024

Date of Patent: April 21, 2026

Assignee: Google LLC

Inventors: Jaeyoung Kim, Han Lu, Soheil Khorram, Anshuman Tripathi, Qian Zhang, Hasim Sak
ACCELERATING SPEAKER DIARIZATION WITH MULTI-STAGE CLUSTERING

Publication number: 20260105919

Abstract: A method (500) includes receiving an input audio signal (122) that corresponds to utterances (120) spoken by multiple speakers. The method also includes processing the input audio to generate a transcription (200) of the utterances and a sequence of speaker turn tokens (224) each indicating a location of a respective speaker turn. The method also includes segmenting the input audio signal into a plurality of speaker segments (225) based on the sequence of speaker tokens. The method also includes extracting a speaker-discriminative embedding from each speaker segment and performing spectral clustering on the speaker-discriminative embeddings to cluster the plurality of speaker segments into k classes. The method also includes assigning a respective speaker label (250) to each speaker segment clustered into the respective class that is different than the respective speaker label assigned to the speaker segments clustered into each other class of the k classes.

Type: Application

Filed: October 5, 2022

Publication date: April 16, 2026

Applicant: Google LLC

Inventors: Quan Wang, Yiling Huang, Han Lu, Guanlong Zhao
METHODS FOR DETECTING TARGET NUCLEIC ACIDS USING RNA BLOCKING MOLECULES

Publication number: 20260078430

Abstract: Embodiments of the present disclosure include compositions and methods for performing in situ hybridization reactions. In particular, the present disclosure provides RNA blocking molecules that enhance detection of a target RNA molecule (e.g., an mRNA molecule, a microRNA (miRNA) molecule, a small non-coding RNA (sncRNA) molecule, a PIWI-interacting RNA (piRNA) molecule, a small interfering RNA (siRNA) molecule, and/or an anti-sense oligo (ASO) molecule) by reducing binding of a target probe to a non-target RNA molecule in a sample.

Type: Application

Filed: September 7, 2023

Publication date: March 19, 2026

Inventors: Sonali Anantprakash Deshpande, Han Lu, Aparna Sahajan, Manvir Sambhi, Wei Wei, Xiao-Jun Ma, Bingqing Zhang
SPEAKER-TURN-BASED ONLINE SPEAKER DIARIZATION WITH CONSTRAINED SPECTRAL CLUSTERING

Publication number: 20260073923

Abstract: A method includes receiving an input audio signal that corresponds to utterances spoken by multiple speakers. The method also includes processing the input audio to generate a transcription of the utterances and a sequence of speaker turn tokens each indicating a location of a respective speaker turn. The method also includes segmenting the input audio signal into a plurality of speaker segments based on the sequence of speaker tokens. The method also includes extracting a speaker-discriminative embedding from each speaker segment and performing spectral clustering on the speaker-discriminative embeddings to cluster the plurality of speaker segments into k classes. The method also includes assigning a respective speaker label to each speaker segment clustered into the respective class that is different than the respective speaker label assigned to the speaker segments clustered into each other class of the k classes.

Type: Application

Filed: November 13, 2025

Publication date: March 12, 2026

Applicant: Google LLC

Inventors: Quan Wang, Han Lu, Evan Clark, Ignacio Lopez Moreno, Hasim Sak, Wei Xia, Taral Joglekar, Anshuman Tripathi
Methods for multiplex detection of nucleic acids by in situ hybridization

Patent number: 12571028

Abstract: The invention relates to methods of multiplex detection of a plurality of target nucleic acids by contacting a sample comprising a cell with target probe sets that specifically hybridize to target nucleic acids, with pre-amplifiers or pre-pre-amplifiers specific for each target probe set, with amplifiers specific for the pre-amplifiers, and with label probes specific for the amplifiers, resulting in specific labeling of multiple target nucleic acids. The invention also relates to samples, slides and kits relating to detection of multiple target nucleic acids.

Type: Grant

Filed: February 14, 2020

Date of Patent: March 10, 2026

Assignee: ADVANCED CELL DIAGNOSTICS, INC.

Inventors: Xiao-Jun Ma, Bingqing Zhang, Li-chong Wang, Han Lu, Li Wang, Hailing Zong
EFFICIENT TRAINING TECHNIQUES FOR GENERATIVE MODEL BASED RESPONSE SYSTEMS

Publication number: 20260037822

Abstract: Some implementations relate to receiving input data; generating, using a low-rank representation of a machine-learned generative model, a generative output from the input data; determining, based on a machine-learned reward model, a corresponding reward from the generative output, and updating, based on the corresponding reward, one or more parameters of the low-rank representation of the machine-learned model. Further, some additional or alternative implementations relate to receiving input data associated with a client device; generating, using a general purpose agent, responsive content to the input data, wherein the general purpose agent is configured based on a machine-learned generative model and a low-rank representation of the machine-learned generative model; and causing the client device to render the responsive content.

Type: Application

Filed: August 5, 2024

Publication date: February 5, 2026

Inventors: Sanil Jain, Mark Geller, Majd Al Merey, Rakesh Shivanna, Valentin Anklin, Ciprian Baetu, Martin Bölle, Hongkun Yu, Han Lu
Evaluation-based speaker change detection evaluation metrics

Patent number: 12518762

Abstract: A method includes obtaining a multi-utterance training sample that includes audio data characterizing utterances spoken by two or more different speakers and obtaining ground-truth speaker change intervals indicating time intervals in the audio data where speaker changes among the two or more different speakers occur. The method also includes processing the audio data to generate a sequence of predicted speaker change tokens using a sequence transduction model. For each corresponding predicted speaker change token, the method includes labeling the corresponding predicted speaker change token as correct when the predicted speaker change token overlaps with one of the ground-truth speaker change intervals. The method also includes determining a precision metric of the sequence transduction model based on a number of the predicted speaker change tokens labeled as correct and a total number of the predicted speaker change tokens in the sequence of predicted speaker change tokens.

Type: Grant

Filed: October 9, 2023

Date of Patent: January 6, 2026

Assignee: Google LLC

Inventors: Guanlong Zhao, Quan Wang, Han Lu, Yiling Huang, Jason Pelecanos
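
The precision metric described in the abstract above can be illustrated with a minimal sketch. This is not the patented implementation. It assumes each predicted speaker change token is represented by a (start, end) time span in seconds, that "overlaps" means two closed intervals share at least one point, and that an empty prediction list yields a precision of 0.0. The names `Interval`, `overlaps`, and `speaker_change_precision` are hypothetical.

```python
from typing import List, Tuple

# Hypothetical representation: (start_seconds, end_seconds).
Interval = Tuple[float, float]


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True when two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def speaker_change_precision(
    predicted: List[Interval], ground_truth: List[Interval]
) -> float:
    """Fraction of predicted speaker change tokens labeled as correct.

    A predicted token is labeled correct when it overlaps at least one
    ground-truth speaker change interval.
    """
    if not predicted:
        # Assumed convention for this sketch: no predictions gives 0.0.
        return 0.0
    correct = sum(
        1 for p in predicted if any(overlaps(p, g) for g in ground_truth)
    )
    return correct / len(predicted)


if __name__ == "__main__":
    predicted = [(1.0, 1.2), (3.0, 3.1), (5.0, 5.2)]
    ground_truth = [(0.9, 1.5), (4.9, 5.1)]
    # Two of the three predicted tokens overlap a ground-truth interval.
    print(speaker_change_precision(predicted, ground_truth))
```

Dividing the number of correct tokens by the total number of predicted tokens follows the wording of the abstract; how token timestamps are actually derived from the sequence transduction model is not stated there and is left out of the sketch.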
Application Programming Interfaces For On-Device Speech Services

Publication number: 20250378286

Abstract: A method (500) includes receiving, from an application (50) executing on a client device (110), at a speech service interface (200), configuration parameters (211) for integrating a speech service (250) into the application. The configuration parameters include a language pack directory (225) that maps a primary language code (235) to an on-device path of a primary language pack (110) of the speech service for use in recognizing speech in a primary language and each of one or more codeswitch language codes to an on-device path. The method also includes receiving audio data (102) characterizing an utterance (106) and processing, using a language ID predictor model (230), the audio data to determine that the audio data is associated with the primary language code. The method also includes processing, using the primary language pack, the audio data to determine a transcription (120) that includes one or more words in the primary language.

Type: Application

Filed: November 23, 2022

Publication date: December 11, 2025

Applicant: Google LLC

Inventors: Quan Wang, Evan Clark, Yang Yu, Han Lu, Taral Pradeep Joglekar, Qi Cao, Dharmeshkumar Mokani, Diego Melendo Casado, Ignacio Lopez Moreno, Hasim Sak
Speaker-turn-based online speaker diarization with constrained spectral clustering

Patent number: 12482470

Abstract: A method includes receiving an input audio signal that corresponds to utterances spoken by multiple speakers. The method also includes processing the input audio to generate a transcription of the utterances and a sequence of speaker turn tokens each indicating a location of a respective speaker turn. The method also includes segmenting the input audio signal into a plurality of speaker segments based on the sequence of speaker tokens. The method also includes extracting a speaker-discriminative embedding from each speaker segment and performing spectral clustering on the speaker-discriminative embeddings to cluster the plurality of speaker segments into k classes. The method also includes assigning a respective speaker label to each speaker segment clustered into the respective class that is different than the respective speaker label assigned to the speaker segments clustered into each other class of the k classes.

Type: Grant

Filed: December 14, 2021

Date of Patent: November 25, 2025

Assignee: Google LLC

Inventors: Quan Wang, Han Lu, Evan Clark, Ignacio Lopez Moreno, Hasim Sak, Wei Xia, Taral Joglekar, Anshuman Tripathi
METHODS OF LOCATING A TARGET OBJECT IN A RANGING DEVICE, AND RANGING DEVICES THEREOF

Publication number: 20250271563

Abstract: Methods of locating a target object in a ranging device and the ranging devices thereof are provided. First, a signal processor is used to process a first signal of a first signal module corresponding to a first target object, where the first signal at least includes location information. Then, a first target distance corresponding to the first target object is determined based on a device location of the ranging device and the location information in the first signal. After that, a laser ranging unit is used to perform a ranging operation on the first target object to obtain a first laser ranging result, and a display unit is used to display a first icon corresponding to the first target object according to the first target distance or the first laser ranging result.

Type: Application

Filed: December 4, 2024

Publication date: August 28, 2025

Inventors: Hua-Tang Liu, Sheng Luo, Han Lu, Peng-Fei Song, Yang-Yang Yu
WIRELESS CHARGING RANGEFINDER

Publication number: 20250266716

Abstract: A wireless charging rangefinder includes at least one rangefinder body and a magnetic unit. The rangefinder body includes a distance measuring unit, a first control unit, a first power source, a first coil, an objective lens unit, and an eyepiece unit, wherein the distance measuring unit, the first power source and the first coil are electrically connected to the first control unit, and an optical axis is configured to pass through the objective lens unit and the eyepiece unit. The magnetic unit is placed with the first coil corresponding to a second coil of a charging device so that the charging device can charge the rangefinder body.

Type: Application

Filed: January 2, 2025

Publication date: August 21, 2025

Inventors: Chun-Guang Zhang, Han Lu, Hua-Tang Liu, Sheng Luo, Yin-Liang Liao, Yu-Hui Hsu
Contrastive Siamese network for semi-supervised speech recognition

Patent number: 12334059

Abstract: A method includes receiving a plurality of unlabeled audio samples corresponding to spoken utterances not paired with corresponding transcriptions. At a target branch of a contrastive Siamese network, the method also includes generating a sequence of encoder outputs for the plurality of unlabeled audio samples and modifying time characteristics of the encoder outputs to generate a sequence of target branch outputs. At an augmentation branch of a contrastive Siamese network, the method also includes performing augmentation on the unlabeled audio samples, generating a sequence of augmented encoder outputs for the augmented unlabeled audio samples, and generating predictions of the sequence of target branch outputs generated at the target branch. The method also includes determining an unsupervised loss term based on target branch outputs and predictions of the sequence of target branch outputs. The method also includes updating parameters of the audio encoder based on the unsupervised loss term.

Type: Grant

Filed: March 28, 2024

Date of Patent: June 17, 2025

Assignee: Google LLC

Inventors: Jaeyoung Kim, Soheil Khorram, Hasim Sak, Anshuman Tripathi, Han Lu, Qian Zhang
Semi-supervised training scheme for speech recognition

Patent number: 12315499

Abstract: A method includes receiving a sequence of acoustic frames extracted from unlabeled audio samples that correspond to spoken utterances not paired with any corresponding transcriptions. The method also includes generating, using a supervised audio encoder, a target higher order feature representation for a corresponding acoustic frame. The method also includes augmenting the sequence of acoustic frames and generating, as output from an unsupervised audio encoder, a predicted higher order feature representation for a corresponding augmented acoustic frame in the sequence of augmented acoustic frames. The method also includes determining an unsupervised loss term based on the target higher order feature representation and the predicted higher order feature representation and updating parameters of the speech recognition model based on the unsupervised loss term.

Type: Grant

Filed: December 14, 2022

Date of Patent: May 27, 2025

Assignee: Google LLC

Inventors: Soheil Khorram, Anshuman Tripathi, Kim Jaeyoung, Han Lu, Qian Zhang, Hasim Sak
WORD-LEVEL END-TO-END NEURAL SPEAKER DIARIZATION WITH AUXNET

Publication number: 20250118292

Abstract: A method includes obtaining labeled training data including a plurality of spoken terms spoken during a conversation. For each respective spoken term, the method includes generating a corresponding sequence of intermediate audio encodings from a corresponding sequence of acoustic frames, generating a corresponding sequence of final audio encodings from the corresponding sequence of intermediate audio encodings, generating a corresponding speech recognition result, and generating a respective speaker token representing a predicted identity of a speaker for each corresponding speech recognition result. The method also includes training the joint speech recognition and speaker diarization model jointly based on a first loss derived from the generated speech recognition results and the corresponding transcriptions and a second loss derived from the generated speaker tokens and the corresponding speaker labels.

Type: Application

Filed: September 20, 2024

Publication date: April 10, 2025

Applicant: Google LLC

Inventors: Yiling Huang, Weiran Wang, Quan Wang, Guanlong Zhao, Hank Liao, Han Lu
End-to-end multi-talker overlapping speech recognition

Patent number: 12266347

Abstract: A method for training a speech recognition model with a loss function includes receiving an audio signal including a first segment corresponding to audio spoken by a first speaker, a second segment corresponding to audio spoken by a second speaker, and an overlapping region where the first segment overlaps the second segment. The overlapping region includes a known start time and a known end time. The method also includes generating a respective masked audio embedding for each of the first and second speakers. The method also includes applying a masking loss after the known end time to the respective masked audio embedding for the first speaker when the first speaker was speaking prior to the known start time, or applying the masking loss prior to the known start time when the first speaker was speaking after the known end time.

Type: Grant

Filed: November 15, 2022

Date of Patent: April 1, 2025

Assignee: Google LLC

Inventors: Anshuman Tripathi, Han Lu, Hasim Sak
One model unifying streaming and non-streaming speech recognition

Patent number: 12254869

Abstract: A transformer-transducer model for unifying streaming and non-streaming speech recognition includes an audio encoder, a label encoder, and a joint network. The audio encoder receives a sequence of acoustic frames, and generates, at each of a plurality of time steps, a higher order feature representation for a corresponding acoustic frame. The label encoder receives a sequence of non-blank symbols output by a final softmax layer, and generates, at each of the plurality of time steps, a dense representation. The joint network receives the higher order feature representation and the dense representation at each of the plurality of time steps, and generates a probability distribution over possible speech recognition hypotheses. The audio encoder of the model further includes a neural network having an initial stack of transformer layers trained with zero look ahead audio context, and a final stack of transformer layers trained with a variable look ahead audio context.

Type: Grant

Filed: July 24, 2023

Date of Patent: March 18, 2025

Assignee: Google LLC

Inventors: Anshuman Tripathi, Hasim Sak, Han Lu, Qian Zhang, Jaeyoung Kim
Reducing Streaming ASR Model Delay With Self Alignment

Publication number: 20240371379

Abstract: A streaming speech recognition model includes an audio encoder configured to receive a sequence of acoustic frames and generate a higher order feature representation for a corresponding acoustic frame in the sequence of acoustic frames. The streaming speech recognition model also includes a label encoder configured to receive a sequence of non-blank symbols output by a final softmax layer and generate a dense representation. The streaming speech recognition model also includes a joint network configured to receive the higher order feature representation generated by the audio encoder and the dense representation generated by the label encoder and generate a probability distribution over possible speech recognition hypotheses. Here, the streaming speech recognition model is trained using self-alignment to reduce prediction delay by encouraging an alignment path that is one frame left from a reference forced-alignment frame.

Type: Application

Filed: July 17, 2024

Publication date: November 7, 2024

Applicant: Google LLC

Inventors: Jaeyoung Kim, Han Lu, Anshuman Tripathi, Qian Zhang, Hasim Sak
