Patents by Inventor Leo Moreno BETTHAUSER
Leo Moreno BETTHAUSER has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240370570
Abstract: Disclosed is a machine learning model architecture that leverages existing large language models to analyze log files for security vulnerabilities. In some configurations, log files are processed by an encoder machine learning model to generate embeddings. Embeddings generated by the encoder model are used to construct graphs. The graphs are in turn used to train a graph classifier model for identifying security vulnerabilities. The encoder model may be an existing general-purpose large language model. In some configurations, the nodes of the graphs are the embedding vectors generated by the encoder model while edges represent similarities between nodes. Graphs constructed in this way may be pruned to highlight more meaningful node topologies. The graphs may then be labeled based on a security analysis of the corresponding log files. A graph classifier model trained on the labeled graphs may be used to identify security vulnerabilities.
Type: Application
Filed: May 4, 2023
Publication date: November 7, 2024
Inventors: Leo Moreno BETTHAUSER, Andrew White WICKER, Bryan (Ning) XIA
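The abstract describes turning log-line embeddings into a similarity graph that is later pruned, labeled, and classified. Below is a minimal sketch of that graph-construction step only, assuming cosine similarity and a fixed pruning threshold; the encoder stand-in, threshold value, and function names are illustrative and not taken from the patent application.

```python
# Hypothetical sketch: log-line embeddings become nodes, cosine similarity above a
# threshold becomes an edge, and weak edges are pruned before a graph classifier sees
# the result. The encoder and classifier themselves are not reproduced here.
import numpy as np


def build_similarity_graph(embeddings: np.ndarray, threshold: float = 0.8) -> np.ndarray:
    """Return an adjacency matrix whose edges link similar log-line embeddings."""
    # Normalize rows so the dot product equals cosine similarity.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)
    similarity = unit @ unit.T
    # Prune: keep only edges whose similarity clears the threshold, drop self-loops.
    adjacency = (similarity >= threshold).astype(float)
    np.fill_diagonal(adjacency, 0.0)
    return adjacency


if __name__ == "__main__":
    # Pretend each log line was already embedded by a general-purpose LLM encoder.
    rng = np.random.default_rng(0)
    log_embeddings = rng.normal(size=(6, 32))
    graph = build_similarity_graph(log_embeddings, threshold=0.2)
    print(graph.sum(), "edges survive pruning")  # the graph would then be labeled and classified
```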
-
Publication number: 20240370714
Abstract: Disclosed is a machine learning model architecture that can incorporate structure information from multiple types of structured text into a single unified machine learning model. For example, a single unified model may be trained with structure information from XML files, tabular data, and/or flat text files. A structure-aware attention mechanism builds on the attention mechanism of the transformer architecture. Specifically, values computed for a traditional transformer attention mechanism are used to compute structure-aware attention scores. In some configurations, the location of a token in the structured text is incorporated into that token's embedding. Similarly, metadata about a token, such as whether the token is a key or a value of a key/value pair, may be incorporated into the token's embedding. This enables the model to reason over token metadata and the location of the token in the structured text in addition to the meaning of the token itself.
Type: Application
Filed: May 4, 2023
Publication date: November 7, 2024
Inventors: Leo Moreno BETTHAUSER, Muhammed Fatih BULUT, Bryan (Ning) XIA
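To make the idea of structure-aware attention scores concrete, here is a minimal sketch in which a structure-derived bias is added to standard scaled dot-product attention. The bias construction (token nesting depth) and all names are assumptions for illustration; the actual mechanism in the application is not reproduced.

```python
# Hypothetical sketch: scaled dot-product attention with an additive structure bias.
# Tokens that are close in the document structure receive a higher attention score.
import numpy as np


def structure_aware_attention(q, k, v, structure_bias):
    """q, k, v: (tokens, dim) arrays; structure_bias: (tokens, tokens) additive scores."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d) + structure_bias      # fold structure into the attention scores
    scores -= scores.max(axis=-1, keepdims=True)        # numerical stability for softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)      # softmax over keys
    return weights @ v


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    tokens, dim = 5, 16
    q, k, v = (rng.normal(size=(tokens, dim)) for _ in range(3))
    depth = np.array([0, 1, 2, 2, 1])                   # e.g., nesting depth of each token in an XML tree
    bias = -np.abs(depth[:, None] - depth[None, :])     # closer in structure -> larger bias
    print(structure_aware_attention(q, k, v, bias).shape)
```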
-
Publication number: 20240330446
Abstract: Methods and apparatuses for improving the performance and energy efficiency of machine learning systems that generate security specific machine learning models and generate security related information using security specific machine learning models are described. A security specific machine learning model may comprise a security specific large language model (LLM). The security specific LLM may be trained and deployed to generate semantically related security information. The security specific LLM may be pretrained with a security specific data set that was generated using similarity deduplication and long line handling, and with security specific objectives, such as next log line prediction based on host, system, application, and cyber attacker behavior. The security specific large language model may be fine-tuned using a security specific similarity dataset that may be generated to align the security specific LLM to capture similarity between different security events.
Type: Application
Filed: June 14, 2023
Publication date: October 3, 2024
Inventors: Muhammed Fatih BULUT, Lloyd Geoffrey GREENWALD, Aditi Kamlesh SHAH, Leo Moreno BETTHAUSER, Yingqi LIU, Ning XIA, Siyue WANG
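Two of the data-preparation ideas named in the abstract, similarity deduplication and a next-log-line objective, can be illustrated with a short sketch. The similarity ratio, threshold, and helper names below are assumptions, not the pretraining pipeline described in the application.

```python
# Hypothetical sketch: near-duplicate filtering of log lines, followed by building
# (context, next line) pairs that could serve a next-log-line pretraining objective.
from difflib import SequenceMatcher


def deduplicate_lines(lines, threshold=0.9):
    """Drop lines that are near-duplicates of an already kept line."""
    kept = []
    for line in lines:
        if all(SequenceMatcher(None, line, k).ratio() < threshold for k in kept):
            kept.append(line)
    return kept


def next_line_pairs(lines, context_size=3):
    """Turn a log into (context, next line) training examples."""
    return [("\n".join(lines[i - context_size:i]), lines[i])
            for i in range(context_size, len(lines))]


if __name__ == "__main__":
    log = [
        "auth: login succeeded for user a",
        "auth: login succeeded for user b",   # near-duplicate, likely dropped
        "net: outbound connection to 10.0.0.5",
        "proc: powershell spawned by winword",
    ]
    unique = deduplicate_lines(log)
    print(len(unique), "unique lines,", len(next_line_pairs(unique, 2)), "training pairs")
```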
-
Publication number: 20240256948
Abstract: In some examples, a method for orchestrating an execution plan is provided. The method includes receiving an input embedding that is generated by a machine-learning model and receiving a plurality of stored semantic embeddings, from an embedding object memory, based on the input embedding. The plurality of stored semantic embeddings each correspond to a respective historic plan. Each historic plan includes one or more executable skills. The method further includes determining a subset of semantic embeddings from the plurality of stored semantic embeddings based on a similarity to the input embedding, and generating a new plan based on the subset of semantic embeddings and the input embedding. The new plan may be different than the historic plans that correspond to the subset of semantic embeddings. The method further includes providing the new plan as an output.
Type: Application
Filed: March 24, 2023
Publication date: August 1, 2024
Applicant: Microsoft Technology Licensing, LLC
Inventors: Leo Moreno BETTHAUSER, William BLUM, Andrew W. WICKER, Eric Paul DOUGLAS, Lloyd Geoffrey GREENWALD, Nicholas BECKER
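The retrieval-and-composition step in the abstract can be sketched briefly: match the input embedding against stored plan embeddings, take the most similar historic plans, and combine their skills into a candidate new plan. The memory layout, skill format, and top-k choice below are illustrative assumptions only.

```python
# Hypothetical sketch of plan orchestration from an embedding object memory.
import numpy as np


def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))


def propose_plan(input_embedding, memory, top_k=2):
    """memory: list of (embedding, plan) where plan is a list of skill names."""
    ranked = sorted(memory, key=lambda item: cosine(input_embedding, item[0]), reverse=True)
    skills = []
    for _, plan in ranked[:top_k]:
        for skill in plan:
            if skill not in skills:          # merge skills from the most similar historic plans
                skills.append(skill)
    return skills                            # a new plan, possibly unlike any single historic plan


if __name__ == "__main__":
    rng = np.random.default_rng(2)
    memory = [(rng.normal(size=8), ["fetch_alerts", "summarize"]),
              (rng.normal(size=8), ["fetch_alerts", "enrich_ip", "open_ticket"]),
              (rng.normal(size=8), ["rotate_keys"])]
    print(propose_plan(rng.normal(size=8), memory))
```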
-
Publication number: 20240256780
Abstract: In some examples, a method of generating a security report is provided. The method includes receiving a user query and security data, and providing the user query and security data to a semantic model. The semantic model generates one or more first embeddings. The method further includes receiving, from a data model, one or more second embeddings. The data model is generated based on historical threat intelligence data. The method further includes generating an execution plan based on the one or more first embeddings and the one or more second embeddings, and returning a report that corresponds to the execution plan.
Type: Application
Filed: March 24, 2023
Publication date: August 1, 2024
Applicant: Microsoft Technology Licensing, LLC
Inventors: Eric Paul DOUGLAS, Mario Davis GOERTZEL, Lloyd Geoffrey GREENWALD, Aditi Kamlesh SHAH, Leo Moreno BETTHAUSER, Daniel Lee MACE, Nicholas BECKER
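The overall shape of this pipeline, embeddings from a semantic model combined with embeddings from a threat-intelligence data model to drive a report, can be sketched with stubs. Every component below (the embedding stub, the pairing heuristic, the report format) is an assumption for illustration, not the claimed method.

```python
# Hypothetical end-to-end shape of the reporting pipeline: semantic-model embeddings
# plus threat-intel embeddings yield a toy "execution plan" that drives the report text.
import numpy as np


def embed(texts, dim=8, seed=0):
    """Stand-in semantic/data model: deterministic pseudo-embeddings for each text."""
    rng = np.random.default_rng(seed)
    return {t: rng.normal(size=dim) for t in texts}


def build_report(user_query, security_data, threat_intel):
    first = embed([user_query] + security_data, seed=1)    # "first embeddings" (semantic model)
    second = embed(threat_intel, seed=2)                    # "second embeddings" (data model)
    # Toy execution plan: pair each piece of security data with its closest intel item.
    plan = [(item, max(threat_intel, key=lambda t: float(first[item] @ second[t])))
            for item in security_data]
    return "\n".join(f"{event} -- related intel: {intel}" for event, intel in plan)


if __name__ == "__main__":
    print(build_report("why did host A beacon out?",
                       ["host A beaconed to 203.0.113.7"],
                       ["203.0.113.7 is a known C2 server", "benign CDN range 198.51.100.0/24"]))
```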
-
Publication number: 20240211796
Abstract: The present disclosure relates to utilizing an embedding space relationship query exploration system to explore embedding spaces generated by machine-learning models. For example, the embedding space relationship query exploration system facilitates efficiently and flexibly revealing relationships that are encoded in a machine-learning model during training and inferencing. In particular, the embedding space relationship query exploration system utilizes various embedding relationship query models to explore and discover the relationship types being learned and preserved within the embedding space of a machine-learning model.
Type: Application
Filed: December 22, 2022
Publication date: June 27, 2024
Inventors: Maurice DIESENDRUCK, Leo Moreno BETTHAUSER, Urszula Stefania CHAJEWSKA, Rohith Venkata PESALA, Robin ABRAHAM
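One simple form a relationship query over an embedding space can take is checking whether example pairs that share a relationship map to a consistent offset direction. The query form and scoring below are illustrative assumptions and not the query models described in the application.

```python
# Hypothetical sketch of a relationship query: if the offset vectors between example
# pairs point in a consistent direction, the embedding space likely encodes that relationship.
import numpy as np


def relationship_consistency(embedding, pairs):
    """embedding: {name: vector}; pairs: [(a, b)] examples of one relationship type."""
    offsets = np.stack([embedding[b] - embedding[a] for a, b in pairs])
    mean_offset = offsets.mean(axis=0)
    # Cosine similarity of each pair's offset to the mean offset; values near 1.0
    # suggest the relationship corresponds to a consistent direction in the space.
    sims = [float(o @ mean_offset / (np.linalg.norm(o) * np.linalg.norm(mean_offset) + 1e-12))
            for o in offsets]
    return float(np.mean(sims))


if __name__ == "__main__":
    rng = np.random.default_rng(7)
    direction = rng.normal(size=16)
    emb = {}
    for i in range(4):
        emb[f"a{i}"] = rng.normal(size=16)
        emb[f"b{i}"] = emb[f"a{i}"] + direction           # synthetic shared relationship
    print(relationship_consistency(emb, [(f"a{i}", f"b{i}") for i in range(4)]))
```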
-
Patent number: 11868358
Abstract: A data processing system implements obtaining query parameters for a query for content items in a datastore, the query parameters including attributes of content items for which a search is to be conducted; obtaining a first set of content items from a content datastore based on the query parameters; analyzing the first set of content items using a first machine learning model trained to generate relevant content information that identifies a plurality of relevant content items included in the first set of content items; and analyzing the plurality of relevant content items using a second machine learning model configured to output novel content information, the novel content information including a plurality of content items predicted to be relevant and novel, the novel content information ranking the plurality of content items predicted to be relevant and novel based on a novelty score associated with each respective content item.
Type: Grant
Filed: June 15, 2022
Date of Patent: January 9, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventors: Leo Moreno Betthauser, Jing Tian, Yijian Xiang, Pramod Kumar Sharma
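The two-stage arrangement described here, a relevance model followed by a novelty model whose scores drive the ranking, can be shown in a few lines. The cutoff value and both scoring placeholders below are illustrative assumptions, not the patented models.

```python
# Hypothetical sketch of the two-stage retrieval: filter items by a relevance model,
# then rank the surviving items by a novelty score from a second model.
def rank_relevant_and_novel(items, relevance_model, novelty_model, relevance_cutoff=0.5):
    relevant = [item for item in items if relevance_model(item) >= relevance_cutoff]
    scored = [(novelty_model(item), item) for item in relevant]
    return [item for _, item in sorted(scored, reverse=True)]   # most novel first


if __name__ == "__main__":
    catalog = ["intro to phishing", "novel kernel exploit writeup", "intro to phishing (mirror)"]
    relevance = lambda text: 1.0 if "phishing" in text or "exploit" in text else 0.0
    novelty = lambda text: 0.1 if "mirror" in text else len(set(text.split())) / 10
    print(rank_relevant_and_novel(catalog, relevance, novelty))
```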
-
Publication number: 20230418845
Abstract: The interpretation of a graph data structure represented on a computing system in which the connection between a pair of nodes in the graph may be interpreted by which intermediary entity (node or edge) on a path (e.g., a shortest path) between the node pair is most dominant. That is, if the intermediary entity were not present, a detour path is determined. The greater the difference between the detour path and the original path, the more significant that intermediary entity is. The significance of multiple intermediary entities in the original path may be determined in this way.
Type: Application
Filed: September 11, 2023
Publication date: December 28, 2023
Inventors: Leo Moreno BETTHAUSER, Maurice DIESENDRUCK, Harsh SHRIVASTAVA
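The detour idea lends itself to a short sketch: for each intermediary node on the shortest path, remove it, recompute the shortest path, and treat the increase in length as that node's significance. The use of networkx and the exact scoring are illustrative choices, not taken from the publication text.

```python
# Hypothetical sketch: score intermediaries on a shortest path by how much longer
# the detour becomes when they are removed.
import networkx as nx


def intermediary_significance(graph, source, target):
    base_path = nx.shortest_path(graph, source, target)
    base_len = len(base_path) - 1
    scores = {}
    for node in base_path[1:-1]:                      # intermediaries only
        pruned = graph.copy()
        pruned.remove_node(node)
        try:
            detour = nx.shortest_path_length(pruned, source, target)
        except nx.NetworkXNoPath:
            detour = float("inf")                     # no detour exists: maximally significant
        scores[node] = detour - base_len
    return scores


if __name__ == "__main__":
    g = nx.Graph([("a", "b"), ("b", "c"), ("a", "d"), ("d", "e"), ("e", "c")])
    print(intermediary_significance(g, "a", "c"))     # removing "b" forces the longer detour
```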
-
Publication number: 20230409581
Abstract: A data processing system implements obtaining query parameters for a query for content items in a datastore, the query parameters including attributes of content items for which a search is to be conducted; obtaining a first set of content items from a content datastore based on the query parameters; analyzing the first set of content items using a first machine learning model trained to generate relevant content information that identifies a plurality of relevant content items included in the first set of content items; and analyzing the plurality of relevant content items using a second machine learning model configured to output novel content information, the novel content information including a plurality of content items predicted to be relevant and novel, the novel content information ranking the plurality of content items predicted to be relevant and novel based on a novelty score associated with each respective content item.
Type: Application
Filed: June 15, 2022
Publication date: December 21, 2023
Applicant: Microsoft Technology Licensing, LLC
Inventors: Leo Moreno BETTHAUSER, Jing TIAN, Yijian XIANG, Pramod Kumar SHARMA
-
Publication number: 20230401491
Abstract: A data processing system implements obtaining attention matrices from a first machine learning model that is pretrained and includes a plurality of self-attention layers. The data processing system further implements analyzing the attention matrices to generate a computation graph based on the attention matrices. The computation graph provides a representation of behavior of the first machine learning model across the plurality of self-attention layers. The data processing system further implements analyzing the computation graph using a second machine learning model. The second machine learning model is trained to receive the computation graph and output model behavior information. The model behavior information identifies which layers of the model performed specific tasks associated with generating predictions by the first machine learning model.
Type: Application
Filed: June 14, 2022
Publication date: December 14, 2023
Applicant: Microsoft Technology Licensing, LLC
Inventors: Leo Moreno BETTHAUSER, Maurice DIESENDRUCK
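The first step, turning per-layer attention matrices into a computation graph, can be sketched simply: threshold each layer's attention matrix and record which token attends to which token at which layer. The thresholding and edge encoding below are illustrative assumptions.

```python
# Hypothetical sketch: build a computation graph from per-layer self-attention matrices,
# with each edge recording (from_token, to_token, layer).
import numpy as np


def attention_to_graph(attention_matrices, threshold=0.2):
    """attention_matrices: list of (tokens, tokens) arrays, one per self-attention layer."""
    edges = []
    for layer, attn in enumerate(attention_matrices):
        sources, targets = np.nonzero(attn >= threshold)
        edges.extend((int(s), int(t), layer) for s, t in zip(sources, targets))
    return edges                                   # a second model could then analyze these edges


if __name__ == "__main__":
    rng = np.random.default_rng(3)
    layers = [rng.dirichlet(np.ones(4), size=4) for _ in range(2)]   # rows sum to 1, like softmax
    graph_edges = attention_to_graph(layers)
    print(len(graph_edges), "edges across", len(layers), "layers")
```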
-
Patent number: 11797580
Abstract: The interpretation of a graph data structure represented on a computing system in which the connection between a pair of nodes in the graph may be interpreted by which intermediary entity (node or edge) on a path (e.g., a shortest path) between the node pair is most dominant. That is, if the intermediary entity were not present, a detour path is determined. The greater the difference between the detour path and the original path, the more significant that intermediary entity is. The significance of multiple intermediary entities in the original path may be determined in this way.
Type: Grant
Filed: December 20, 2021
Date of Patent: October 24, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Leo Moreno Betthauser, Maurice Diesendruck, Harsh Shrivastava
-
Publication number: 20230195838
Abstract: The monitoring of performance of a machine-learned model for use in generating an embedding space. The system uses two embedding spaces: a reference embedding space generated by applying an embedding model to reference data, and an evaluation embedding space generated by applying the embedding model to evaluation data. The system obtains multiple views of the reference embedding space, and uses those multiple views to determine a distance threshold. The system determines a distance between the evaluation and reference embedding spaces, and compares that distance with the distance threshold. Based on the comparison, the system determines a level of acceptability of the model for use with the evaluation dataset.
Type: Application
Filed: December 20, 2021
Publication date: June 22, 2023
Inventors: Leo Moreno BETTHAUSER, Urszula Stefania CHAJEWSKA, Maurice DIESENDRUCK, Rohith Venkata PESALA
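A minimal version of this monitoring loop can be sketched by subsampling the reference embeddings to get several "views", using the spread of view-to-view distances to set the threshold, and then comparing the evaluation set's distance against it. The distance measure (difference of mean embeddings), quantile, and naming are simplifying assumptions.

```python
# Hypothetical sketch: set a distance threshold from multiple views of the reference
# embedding space, then accept the evaluation set only if its distance stays under it.
import numpy as np


def space_distance(a, b):
    return float(np.linalg.norm(a.mean(axis=0) - b.mean(axis=0)))


def acceptability(reference, evaluation, n_views=50, quantile=0.95, seed=0):
    rng = np.random.default_rng(seed)
    half = len(reference) // 2
    view_distances = []
    for _ in range(n_views):
        idx = rng.permutation(len(reference))             # one "view": a random split of the reference set
        view_distances.append(space_distance(reference[idx[:half]], reference[idx[half:]]))
    threshold = float(np.quantile(view_distances, quantile))
    observed = space_distance(reference, evaluation)
    return observed <= threshold, observed, threshold


if __name__ == "__main__":
    rng = np.random.default_rng(4)
    ref = rng.normal(size=(200, 16))
    ok, observed, threshold = acceptability(ref, rng.normal(loc=0.5, size=(100, 16)))
    print(ok, round(observed, 3), round(threshold, 3))
```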
-
Publication number: 20230196181
Abstract: A computer system is configured to provide an intelligent machine-learning (ML) model catalog containing data associated with multiple ML models. The multiple ML models are trained over multiple training datasets respectively, and the intelligent ML model catalog contains at least multiple training data spaces of embeddings generated based on the multiple ML models and the multiple training datasets. In response to receiving a user dataset, for at least one ML model in the plurality of ML models, the computer system is configured to extract a user data space of embeddings based on the at least one ML model and the user dataset, and evaluate the user data space against the training data space to determine whether the at least one ML model is a good fit for the user dataset.
Type: Application
Filed: December 20, 2021
Publication date: June 22, 2023
Inventors: Leo Moreno BETTHAUSER, Urszula Stefania CHAJEWSKA, Maurice DIESENDRUCK, Henry Hun-Li Reid PAN, Rohith Venkata PESALA
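As a rough illustration of the catalog lookup, each catalogued model can carry the embedding space of its training data, and the user dataset can be embedded with each model and compared against that space to pick the best fit. The catalog structure, the fit score (distance of mean embeddings), and the toy embedding functions are all assumptions.

```python
# Hypothetical sketch: pick the catalogued model whose training embedding space best
# matches the embeddings of the user dataset.
import numpy as np


def fit_score(training_space, user_space):
    return float(np.linalg.norm(training_space.mean(axis=0) - user_space.mean(axis=0)))


def best_model(catalog, user_data):
    """catalog: {model_name: (embed_fn, training_embedding_space)}."""
    scores = {name: fit_score(space, embed_fn(user_data))
              for name, (embed_fn, space) in catalog.items()}
    return min(scores, key=scores.get), scores


if __name__ == "__main__":
    rng = np.random.default_rng(5)
    catalog = {
        "text-model": (lambda d: d @ np.eye(8), rng.normal(size=(100, 8))),
        "log-model": (lambda d: d * 2.0, rng.normal(loc=3.0, size=(100, 8))),
    }
    name, scores = best_model(catalog, rng.normal(size=(20, 8)))
    print(name, {k: round(v, 2) for k, v in scores.items()})
```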
-
Publication number: 20230195758
Abstract: The interpretation of a graph data structure represented on a computing system in which the connection between a pair of nodes in the graph may be interpreted by which intermediary entity (node or edge) on a path (e.g., a shortest path) between the node pair is most dominant. That is, if the intermediary entity were not present, a detour path is determined. The greater the difference between the detour path and the original path, the more significant that intermediary entity is. The significance of multiple intermediary entities in the original path may be determined in this way.
Type: Application
Filed: December 20, 2021
Publication date: June 22, 2023
Inventors: Leo Moreno BETTHAUSER, Maurice DIESENDRUCK, Harsh SHRIVASTAVA
-
Publication number: 20230044182
Abstract: A computer implemented method includes obtaining deep learning model embedding for each instance present in a dataset, the embedding incorporating a measure of concept similarity. An identifier of a first instance of the dataset is received. A similarity distance is determined based on the respective embeddings of the first instance and a second instance. Similarity distances between embeddings, represented as points, imply a graph, where each instance's embedding is connected by an edge to a set of similar instances' embeddings. Sequences of connected points, referred to as walks, provide valuable information about the dataset and the deep learning model.
Type: Application
Filed: July 29, 2021
Publication date: February 9, 2023
Inventors: Robin Abraham, Leo Moreno Betthauser, Maurice Diesendruck, Urszula Stefania Chajewska
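The implied graph and the walks over it can be shown compactly: connect each instance's embedding to its most similar neighbors, then take a walk from a chosen instance. The neighbor count and random-walk policy below are illustrative choices, not the claimed method.

```python
# Hypothetical sketch: a k-nearest-neighbour graph implied by embedding similarity,
# and a short walk over it starting from one instance.
import numpy as np


def knn_edges(embeddings, k=2):
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)
    sims = unit @ unit.T
    np.fill_diagonal(sims, -np.inf)                         # no self-edges
    return {i: list(np.argsort(sims[i])[::-1][:k]) for i in range(len(embeddings))}


def walk(edges, start, steps=4, seed=0):
    rng = np.random.default_rng(seed)
    path = [start]
    for _ in range(steps):
        path.append(int(rng.choice(edges[path[-1]])))       # hop to a similar instance
    return path


if __name__ == "__main__":
    rng = np.random.default_rng(6)
    emb = rng.normal(size=(10, 16))
    print(walk(knn_edges(emb), start=0))    # a walk through instances similar to instance 0
```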
-
Publication number: 20220230053
Abstract: Creating a machine learning graph neural network configured to process signals. A method includes identifying a plurality of machine learning graphs where each of the machine learning graphs are for different types of data. The method further includes receiving input identifying shared content of different machine learning graph nodes from different graphs in the plurality of machine learning graphs. The method further includes creating a combined machine learning graph neural network, configured to process signals, from the plurality of machine learning graphs based on the shared content. The combined machine learning graph neural network comprises nodes corresponding to nodes in the plurality of machine learning graphs, such that its output is generated based on relationships between nodes that correspond to nodes in different machine learning graphs.
Type: Application
Filed: January 15, 2021
Publication date: July 21, 2022
Inventors: Leo Moreno BETTHAUSER, Ziyao LI
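The graph-combination step can be sketched by mapping nodes identified as sharing content to a single canonical node and composing the graphs; the merged graph is what a GNN would then be built over. The mapping format and use of networkx are assumptions for illustration, and the GNN itself is out of scope here.

```python
# Hypothetical sketch: merge several graphs into one by collapsing nodes that share content.
import networkx as nx


def combine_graphs(graphs, shared_content):
    """graphs: list of nx.Graph; shared_content: {(graph_index, node): canonical_name}."""
    combined = nx.Graph()
    for idx, graph in enumerate(graphs):
        mapping = {n: shared_content.get((idx, n), f"g{idx}:{n}") for n in graph.nodes}
        combined = nx.compose(combined, nx.relabel_nodes(graph, mapping))
    return combined


if __name__ == "__main__":
    users = nx.Graph([("alice", "bob")])                 # e.g., a social graph
    logins = nx.Graph([("alice@corp", "server-1")])      # e.g., an authentication graph
    shared = {(0, "alice"): "alice", (1, "alice@corp"): "alice"}   # same person in both graphs
    merged = combine_graphs([users, logins], shared)
    print(sorted(merged.nodes))                          # "alice" now links the two graphs
```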