Patents by Inventor Xiao Shi Huang

Xiao Shi Huang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230252301
    Abstract: An online system trains a transformer architecture by an initialization method which allows the transformer architecture to be trained without normalization layers or learning rate warmup, resulting in significant improvements in computational efficiency for transformer architectures. Specifically, an attention block included in an encoder or a decoder of the transformer architecture generates the set of attention representations by applying a key matrix to the input key, a query matrix to the input query, and a value matrix to the input value to generate an output, and applying an output matrix to the output to generate the set of attention representations. The initialization method may be performed by scaling the parameters of the value matrix and the output matrix by a factor that is inverse to the number of encoders or the number of decoders.
    Type: Application
    Filed: April 19, 2023
    Publication date: August 10, 2023
    Inventors: Maksims Volkovs, Xiao Shi Huang, Juan Felipe Perez Vallejo
  • Patent number: 11663488
    Abstract: An online system trains a transformer architecture by an initialization method which allows the transformer architecture to be trained without normalization layers or learning rate warmup, resulting in significant improvements in computational efficiency for transformer architectures. Specifically, an attention block included in an encoder or a decoder of the transformer architecture generates the set of attention representations by applying a key matrix to the input key, a query matrix to the input query, and a value matrix to the input value to generate an output, and applying an output matrix to the output to generate the set of attention representations. The initialization method may be performed by scaling the parameters of the value matrix and the output matrix by a factor that is inverse to the number of encoders or the number of decoders.
    Type: Grant
    Filed: February 5, 2021
    Date of Patent: May 30, 2023
    Assignee: THE TORONTO-DOMINION BANK
    Inventors: Maksims Volkovs, Xiao Shi Huang, Juan Felipe Perez Vallejo
  • Publication number: 20230119108
    Abstract: An autoencoder model includes an encoder portion and a decoder portion. The encoder encodes an input token sequence to an input sequence representation that is decoded by the decoder to generate an output token sequence. The autoencoder model may decode multiple output tokens in parallel, such that the decoder may be applied iteratively. The decoder may receive an output estimate from a prior iteration to predict output tokens. To improve positional representation and reduce positional errors and repetitive tokens, the autoencoder may include a trained layer for combining token embeddings with positional encodings. In addition, the model may be trained with a corrective loss based on output predictions when the model receives a masked input as the output estimate.
    Type: Application
    Filed: October 18, 2022
    Publication date: April 20, 2023
    Inventors: Maksims Volkovs, Juan Felipe Perez Vallejo, Xiao Shi Huang
  • Publication number: 20220300903
    Abstract: A computing device is configured to communicate with a central server in order to predict the likelihood of fraud in current transactions for a target claim. The computing device extracts, from information stored in the central server (relating to the target claim and to past transactions for past claims, including those marked as fraud), a plurality of distinct sets of features: text-based features derived from the descriptions of communications between the requesting device and the endpoint device, graph-based features derived from information relating to a network of claims and policies connected through shared information, and tabular features derived from the details related to claim information and exposure details. The features are input into a machine learning model for generating a likelihood of fraud in the current transactions and triggering an action based on the likelihood of fraud (e.g., stopping subsequent transactions related to the target claim).
    Type: Application
    Filed: March 19, 2021
    Publication date: September 22, 2022
    Inventors: Xiao Shi Huang, Sandra Aziz, Juan Felipe Perez Vallejo, Jean-Christophe Bouëtté, Jennifer Bouchard, Mathieu Jean Rémi Ravaut, Maksims Volkovs, Tomi Johan Poutanen, Joseph Pun, Ghaith Kazma, Olivier Gandouet
  • Publication number: 20210255862
    Abstract: An online system trains a transformer architecture by an initialization method which allows the transformer architecture to be trained without normalization layers or learning rate warmup, resulting in significant improvements in computational efficiency for transformer architectures. Specifically, an attention block included in an encoder or a decoder of the transformer architecture generates the set of attention representations by applying a key matrix to the input key, a query matrix to the input query, and a value matrix to the input value to generate an output, and applying an output matrix to the output to generate the set of attention representations. The initialization method may be performed by scaling the parameters of the value matrix and the output matrix by a factor that is inverse to the number of encoders or the number of decoders.
    Type: Application
    Filed: February 5, 2021
    Publication date: August 19, 2021
    Inventors: Maksims Volkovs, Xiao Shi Huang, Juan Felipe Perez Vallejo
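
The following is a minimal sketch, not the patented implementation, of the initialization idea described in publication 20230252301, patent 11663488, and publication 20210255862 above: the value and output projection weights of each attention block are scaled by a factor that shrinks as the number of encoder or decoder layers grows, with the aim of training without normalization layers or learning rate warmup. The module names and the exact scaling exponent below are assumptions for illustration only.

```python
import torch
import torch.nn as nn


class ScaledInitAttention(nn.Module):
    """Single attention block with scaled value/output initialization (illustrative)."""

    def __init__(self, d_model: int, num_layers: int):
        super().__init__()
        # Query, key, value, and output projections of one attention block.
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.out_proj = nn.Linear(d_model, d_model)

        # Scale the value and output projection parameters by a factor that is
        # inverse in the number of layers (num_layers ** -0.5 is an assumed
        # choice; the abstract only says the factor is inverse to the count).
        scale = num_layers ** -0.5
        with torch.no_grad():
            self.v_proj.weight.mul_(scale)
            self.out_proj.weight.mul_(scale)

    def forward(self, query, key, value):
        q, k, v = self.q_proj(query), self.k_proj(key), self.v_proj(value)
        attn = torch.softmax(q @ k.transpose(-2, -1) / q.size(-1) ** 0.5, dim=-1)
        return self.out_proj(attn @ v)
```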
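Below is a minimal sketch, under assumptions, of two ideas from publication 20230119108: a trained layer that combines token embeddings with positional encodings rather than simply summing them, and a corrective loss computed when the decoder receives a fully masked sequence as its output estimate so that all positions are predicted in parallel. All module and variable names are illustrative, not from the filing.

```python
import torch
import torch.nn as nn


class LearnedPositionCombiner(nn.Module):
    """Combine token and positional embeddings through a trained projection."""

    def __init__(self, vocab_size: int, max_len: int, d_model: int):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        # Trained combination layer: concatenate token and position vectors,
        # then project back to d_model, instead of adding them directly.
        self.combine = nn.Linear(2 * d_model, d_model)

    def forward(self, tokens):
        positions = torch.arange(tokens.size(1), device=tokens.device)
        tok = self.tok_emb(tokens)
        pos = self.pos_emb(positions).unsqueeze(0).expand_as(tok)
        return self.combine(torch.cat([tok, pos], dim=-1))


def corrective_loss(decoder, input_repr, target_tokens, mask_id: int):
    # Feed an all-masked output estimate so the decoder must predict every
    # output token from the input sequence representation alone.
    masked_estimate = torch.full_like(target_tokens, mask_id)
    logits = decoder(input_repr, masked_estimate)  # (batch, seq, vocab)
    return nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)), target_tokens.reshape(-1)
    )
```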
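Finally, a minimal sketch, under assumptions, of the feature pipeline described in publication 20220300903: text-based, graph-based, and tabular features for a target claim are combined and scored by a machine learning model, and the resulting fraud likelihood can trigger an action such as blocking subsequent related transactions. The model choice, threshold, and function names are illustrative, not from the filing.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier


def score_claim(model, text_feats, graph_feats, tabular_feats, threshold=0.9):
    """Return (fraud_probability, should_block) for a single target claim."""
    # Concatenate the three distinct feature sets into one feature vector.
    features = np.concatenate([text_feats, graph_feats, tabular_feats]).reshape(1, -1)
    prob = model.predict_proba(features)[0, 1]
    # Trigger an action (e.g., stop related transactions) above the threshold.
    return prob, prob >= threshold


# Illustrative usage: the model would be trained on past claims, including
# those marked as fraud, using the same concatenation of feature sets.
# model = GradientBoostingClassifier().fit(X_train, y_train)
# prob, block = score_claim(model, text_vec, graph_vec, tab_vec)
```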