Patents by Inventor Mohsen Fayyaz

Mohsen Fayyaz has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250086187
    Abstract: A technique executes a client machine-trained model (“client model”) on a client device. In operation, the client device submits, to a network-accessible main system, a description of a task to be performed by the client device. The main system uses a main-system machine-trained model (“main-system model”) to produce a task prompt based on the task description. The client device subsequently uses the task prompt to process queries pertaining to the task. The main-system model is trained to increase the accuracy of responses produced by the client model while reducing the sizes of the task prompts produced by the main system. The training process is performed with the weights of the client model held constant.
    Type: Application
    Filed: September 9, 2023
    Publication date: March 13, 2025
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Mohsen FAYYAZ, Ayyoob IMANIGOOGHARI, Eric Chris Wolfgang SOMMERLADE
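The abstract above describes a split between a lightweight client model and a network-accessible main system that produces compact task prompts. The following is a minimal sketch of that flow, not the patented implementation; the function names and the keyword-based prompt generator are illustrative stand-ins for the two machine-trained models.

```python
# Hypothetical sketch of the client/main-system flow: the client sends a task
# description once, receives a compact task prompt, and reuses that prompt locally.

def main_system_generate_task_prompt(task_description: str) -> str:
    """Stand-in for the main-system model, which the patent trains to keep prompts
    short while keeping the (frozen) client model's responses accurate."""
    keywords = sorted(set(task_description.lower().split()))[:5]  # toy compression
    return "task: " + ", ".join(keywords)

def client_model_answer(task_prompt: str, query: str) -> str:
    """Stand-in for the client model, whose weights stay constant during training."""
    return f"[client answer to '{query}' given ({task_prompt})]"

task_description = "Summarize incoming customer emails and flag urgent ones"
task_prompt = main_system_generate_task_prompt(task_description)  # one round trip
for query in ["email from Alice about an outage", "newsletter from a vendor"]:
    print(client_model_answer(task_prompt, query))
```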
  • Publication number: 20250053852
    Abstract: A machine-trained model is described using a data structure that includes a plurality of paths between a root node and respective leaf nodes. One such path is the main root-to-leaf (RTL) path, while the other paths are referred to as non-main-RTL paths. Each node along the main RTL path is associated with a portion of base model weights. At least one node along a non-main-RTL path is associated with a portion of model-variance information. A training system trains the portions of model-variance information as variations of corresponding portions of base model weights, while keeping the portions of base model weights fixed. In some cases, a local system obtains portions of model weights described by the data structure from a source system on an as-needed basis. The above characteristics contribute to the efficient storage, transfer, and execution of the machine-trained model.
    Type: Application
    Filed: August 10, 2023
    Publication date: February 13, 2025
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Mohsen FAYYAZ, Eric Chris Wolfgang SOMMERLADE, Marcelo GENNARI DO NASCIMENTO, Ebey Paulose ABRAHAM
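To picture the weight tree described in the abstract above, here is a minimal, hypothetical sketch in which nodes on the main root-to-leaf path hold base weights and a node on a non-main path holds trained variance values applied on top of them; the class layout and field names are assumptions, not the patented data structure.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    base_weights: list | None = None        # set on main-RTL-path nodes
    variance: list | None = None            # set on non-main-RTL-path nodes
    children: list["Node"] = field(default_factory=list)

def resolve_weights(main_path: list[Node], variant_path: list[Node]) -> list[list[float]]:
    """Combine the fixed base weights with the trained per-variant deltas, node by node."""
    resolved = []
    for base_node, var_node in zip(main_path, variant_path):
        base = base_node.base_weights or []
        delta = var_node.variance or [0.0] * len(base)
        resolved.append([b + d for b, d in zip(base, delta)])
    return resolved

# Two-level example: a shared root, one main leaf with base weights, and one
# variant leaf that stores only a small delta on top of the main leaf's weights.
root = Node("root", base_weights=[1.0, 2.0])
leaf_main = Node("leaf_main", base_weights=[3.0, 4.0])
leaf_variant = Node("leaf_variant", variance=[0.1, -0.2])
root.children = [leaf_main, leaf_variant]

print(resolve_weights([root, leaf_main], [root, leaf_variant]))
```

Storing only deltas on the variant path is what lets a local system fetch a small amount of data per variant while reusing the base weights it already holds.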
  • Publication number: 20250053748
    Abstract: A technique uses a machine-trained model to generate a response based on a prompt which expresses current input information and abstract token information. The abstract token information summarizes the full history of a dialogue and is generated by the model itself. The technique reduces the size of the prompt by incorporating the abstract token information in lieu of the full dialogue history. A training system trains the machine-trained model by successively improving its predictive accuracy, while rewarding the machine-trained model based on the extent to which it compresses instances of abstract token information.
    Type: Application
    Filed: August 10, 2023
    Publication date: February 13, 2025
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Mohsen FAYYAZ, Eric Chris Wolfgang SOMMERLADE, Justin James WAGLE, Vivek PRADEEP
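A minimal sketch of the prompt-compression loop described above: instead of resending the full dialogue history, each prompt carries only a short, model-generated summary plus the new user turn. The `model_generate` stub and its return convention are assumptions standing in for the machine-trained model.

```python
# Hypothetical sketch: the prompt holds abstract summary tokens, not the full history.

def model_generate(prompt: str) -> tuple[str, str]:
    """Stand-in for the machine-trained model; in the patent the model itself emits
    the compressed abstract token information along with its response."""
    response = f"[assistant reply conditioned on: {prompt}]"
    updated_summary = "<compressed summary of the dialogue so far>"
    return response, updated_summary

summary = "<empty summary>"
for user_turn in ["Book a flight to Oslo", "Make it a window seat", "Add a hotel"]:
    prompt = f"{summary} | user: {user_turn}"   # compact prompt, not the full history
    response, summary = model_generate(prompt)
    print(response)
```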
  • Publication number: 20240354317
    Abstract: A technique uses an encoder system to produce an index of target item embeddings. Each target item embedding is input-agnostic and universal in the sense that different expressions of a target concept, produced using different combinations of input modes, map to the same target item embedding in the index. The encoder system throttles the amount of computations it performs based on the assessed capabilities of an execution platform. A retrieval system processes a multimodal input query by first generating a candidate set of target item embeddings in the index that match the input query, and then using a filtering operation to identify those target item embeddings that are most likely to match the input query. The encoder system and the retrieval system rely on language-based components having weights that are held constant during a training operation. Other weights of these systems are updated during the training operation.
    Type: Application
    Filed: April 21, 2023
    Publication date: October 24, 2024
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Mohsen FAYYAZ, Eric Chris Wolfgang SOMMERLADE, Justin James WAGLE
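The retrieval side of the abstract above can be pictured as a two-stage lookup over an index of embeddings: a coarse candidate pass followed by a stricter filter. The toy index, query vector, and thresholds below are assumptions used only to show the shape of the pipeline, not the patented encoder or its throttling behavior.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Index of "universal" target item embeddings: different input modes expressing the
# same concept are assumed to map to the same entry.
index = {
    "concept:cat": [0.9, 0.1, 0.0],
    "concept:dog": [0.7, 0.6, 0.1],
    "concept:car": [0.0, 0.2, 0.95],
}

query_embedding = [0.85, 0.2, 0.05]   # stand-in for the embedded multimodal query

# Stage 1: coarse candidate set of the top-k entries by similarity.
candidates = sorted(index.items(),
                    key=lambda kv: cosine(query_embedding, kv[1]),
                    reverse=True)[:2]

# Stage 2: stricter filtering of the candidates.
matches = [name for name, emb in candidates if cosine(query_embedding, emb) > 0.9]
print("candidates:", [name for name, _ in candidates])
print("matches:", matches)
```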
  • Patent number: 12106487
    Abstract: A technique is described herein that interprets some frames in a stream of video content as key frames and other frames as predicted frames. The technique uses an image analysis system to produce feature information for each key frame. The technique uses a prediction model to produce feature information for each predicted frame. The prediction model operates on two inputs: (1) feature information that has been computed for the immediately-preceding frame; and (2) frame-change information. A motion-determining model produces the frame-change information by computing the change in video content between the current frame being predicted and the immediately-preceding frame. The technique reduces the number of image-processing operations used to process the stream of video content compared to a base case of processing all of the frames using the image analysis system. As such, the technique uses fewer computing resources than the base case.
    Type: Grant
    Filed: November 24, 2021
    Date of Patent: October 1, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Mohsen Fayyaz, Hamidreza Vaezi Joze, Eric Chris Wolfgang Sommerlade
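The key-frame/predicted-frame split in the abstract above can be sketched as follows; the three stand-in functions are hypothetical placeholders for the image analysis system, the motion-determining model, and the prediction model.

```python
def image_analysis(frame) -> list[float]:
    """Stand-in for the expensive per-frame image analysis system."""
    return [float(sum(frame))]

def frame_change(prev_frame, frame) -> float:
    """Stand-in for the motion-determining model: total pixel difference."""
    return float(sum(abs(a - b) for a, b in zip(prev_frame, frame)))

def predict_features(prev_features: list[float], change: float) -> list[float]:
    """Stand-in for the prediction model: adjust the previous features by the change."""
    return [f + change for f in prev_features]

video = [[1, 1, 1], [1, 2, 1], [5, 5, 5], [5, 6, 5]]   # toy 4-frame "video"
key_frame_interval = 2
features = None
for i, frame in enumerate(video):
    if i % key_frame_interval == 0:        # key frame: run the full analysis
        features = image_analysis(frame)
    else:                                  # predicted frame: reuse + cheap update
        features = predict_features(features, frame_change(video[i - 1], frame))
    print(i, features)
```

Only half of the frames in this toy run pass through `image_analysis`, which is the source of the claimed savings.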
  • Publication number: 20240296373
    Abstract: A technique implements a machine-trained model using resources of a local system. The technique operates by successively obtaining portions of model weights on an as-needed basis. The local system obtains at least some of the portions by downloading them from a source system in a streaming operation. The technique further successively executes parts of the machine-trained model in the local system using the portions of model weights that have been obtained, to provide an output result. An entirety of the model weights used by the local system to provide the output result is less than an entirety of the model weights available for download at the source system. The technique enables the local system to locally execute the machine-trained model without overburdening its local resources, and with reduced consumption of network resources.
    Type: Application
    Filed: March 1, 2023
    Publication date: September 5, 2024
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Eric Chris Wolfgang SOMMERLADE, Marcelo GENNARI DO NASCIMENTO, Mohsen FAYYAZ, Aleksandar UZELAC
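A minimal sketch of the on-demand weight streaming described above: each layer's weight portion is fetched only when that layer is about to execute, so layers that never run are never downloaded. The dictionary-backed "source system" and the element-wise layers are illustrative assumptions.

```python
# Weight portions available at the source system, keyed by layer name.
SOURCE_WEIGHTS = {
    "embed":   [0.1, 0.2],
    "layer_1": [0.3, 0.4],
    "layer_2": [0.5, 0.6],    # never requested below, so never "downloaded"
    "head":    [0.7, 0.8],
}

downloaded: dict[str, list[float]] = {}

def fetch_weights(layer: str) -> list[float]:
    """Stand-in for streaming one weight portion from the source system."""
    if layer not in downloaded:
        downloaded[layer] = SOURCE_WEIGHTS[layer]   # simulate the network transfer
    return downloaded[layer]

def run_layer(layer: str, activations: list[float]) -> list[float]:
    weights = fetch_weights(layer)                  # obtained just before execution
    return [a * w for a, w in zip(activations, weights)]

activations = [1.0, 1.0]
for layer in ["embed", "layer_1", "head"]:          # the parts actually executed
    activations = run_layer(layer, activations)

print(activations, "downloaded:", sorted(downloaded))
```

The final print shows that "layer_2" never appears in the downloaded set, which is the sense in which the weights used locally are less than the entirety available at the source.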
  • Publication number: 20240184629
    Abstract: A technique executes tasks using a data store of machine-trained models. The data store specifically includes a subset of encoder-type machine-trained models for converting input data items having different input data types into respective embeddings in a vector space, and a subset of decoder-type machine-trained models for converting embeddings in the same vector space into data items having respective different output data types. When executing a particular task that involves one or more data types, the technique selects one or more machine-trained models that match those data types. In some implementations, the technique provides a clipboard store for storing embeddings produced by the encoder-type machine-trained models and consumable by the decoder-type machine-trained models. The technique includes provisions for ensuring that any decoder-type machine-trained model is capable of processing embeddings produced by different versions of the encoder-type machine-trained models.
    Type: Application
    Filed: December 1, 2022
    Publication date: June 6, 2024
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Eric Chris Wolfgang SOMMERLADE, Mohsen FAYYAZ, Nazuk JAIN
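The encoder/decoder store and clipboard described above can be sketched with a small registry keyed by data type; the lambdas, the clipboard slot naming, and the toy embeddings are hypothetical stand-ins for the machine-trained models and the versioning provisions.

```python
# Hypothetical registry of encoder-type and decoder-type models sharing one vector space.
ENCODERS = {
    "text":  lambda item: [float(len(item)), 1.0],
    "image": lambda item: [float(sum(item)), 2.0],
}
DECODERS = {
    "text":  lambda emb: f"text rendering of {emb}",
    "audio": lambda emb: f"audio rendering of {emb}",
}
clipboard: dict[str, list[float]] = {}              # holds embeddings between steps

def run_task(input_type: str, item, output_type: str, slot: str = "default") -> str:
    clipboard[slot] = ENCODERS[input_type](item)    # encode into the shared space
    return DECODERS[output_type](clipboard[slot])   # decode out of the shared space

# Image-to-text via the shared embedding space:
print(run_task("image", [3, 4, 5], "text"))
```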
  • Publication number: 20230153379
    Abstract: A technique is described herein for using transformer-based technology to process data items (e.g., image items). The technique increases the efficiency of the transformer-based technology by using a modified attention component. In operation, the modified attention component accepts embedding vectors that represent a plurality of item tokens, together with a classification token. A first stage of the modified attention component generates original attention information based on the embedding vectors. A second stage generates score information based on a portion of the original attention information that pertains to the classification token. A third stage produces modified attention information by removing attention values from the original attention information, as guided by a sampling operation that is performed on the score information. The second and third stages do not rely on machine-trained values, which expedites the deployment of these functions in existing transformers.
    Type: Application
    Filed: November 14, 2021
    Publication date: May 18, 2023
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Mohsen FAYYAZ, Soroush ABBASI KOOHPAYEGANI, Eric Chris Wolfgang SOMMERLADE, Hamidreza VAEZI JOZE
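To make the three stages above concrete, here is a toy, training-free sketch: the classification token's attention row supplies the scores, and low-scoring item tokens are dropped. For brevity it keeps the top-k tokens deterministically, whereas the patent describes a sampling operation over the score information; all numbers are illustrative.

```python
import math

def softmax(xs: list[float]) -> list[float]:
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Stage 1 (stand-in): the classification token's attention over 5 item tokens.
cls_attention = softmax([2.0, 0.1, 1.5, 0.2, 0.05])

# Stage 2: score information derived from the CLS row (no learned parameters).
scores = cls_attention

# Stage 3: remove attention to low-scoring item tokens (top-k here; the patent
# uses a sampling operation over these scores instead).
k = 2
keep = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
print("kept item tokens:", sorted(keep))
```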
  • Publication number: 20230036743
    Abstract: A method for coding a predefined time sequence of video images into a machine-evaluable representation made up of stationary features and nonstationary features. In the method: at least one function parameterized using trainable parameters is provided, which maps sequences of video images onto representations; from the sequence of video images, N adjoining, non-overlapping short extracts and one long extract, which contains all N short extracts, are selected; using the parameterized function, a representation of the long extract and multiple representations of the short extracts are ascertained; the parameterized function is assessed using a cost function; the parameters of the function are optimized with the goal that the cost function's assessment of representations ascertained in the future is expected to improve; and, using the function parameterized by the final optimized parameters, the predefined time sequence of video images is mapped onto the sought representation.
    Type: Application
    Filed: July 7, 2022
    Publication date: February 2, 2023
    Inventors: Mehdi Noroozi, Mohsen Fayyaz, Nadine Behrmann
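The extract selection described in the abstract above can be sketched with a one-dimensional stand-in for the video: one long extract covers a window, and N adjoining, non-overlapping short extracts tile the same window; each extract is then mapped to a representation by the parameterized function. The `represent` stub below is a hypothetical placeholder, and the optimization of its parameters against a cost function is not shown.

```python
video = list(range(32))          # stand-in for a time sequence of video images
N = 4                            # number of short extracts

long_extract = video             # the long extract contains all N short extracts
step = len(video) // N
short_extracts = [video[i * step:(i + 1) * step] for i in range(N)]  # adjoining, non-overlapping

def represent(extract: list[int]) -> list[float]:
    """Hypothetical stand-in for the parameterized function; its trainable parameters
    would be optimized against a cost function comparing long and short representations."""
    return [sum(extract) / len(extract), float(len(extract))]

long_repr = represent(long_extract)
short_reprs = [represent(e) for e in short_extracts]
print(long_repr)
print(short_reprs)
```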