Patents by Inventor Leo Moreno BETTHAUSER

Leo Moreno BETTHAUSER has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240370570
    Abstract: Disclosed is a machine learning model architecture that leverages existing large language models to analyze log files for security vulnerabilities. In some configurations, log files are processed by an encoder machine learning model to generate embeddings. Embeddings generated by the encoder model are used to construct graphs. The graphs are in turn used to train a graph classifier model for identifying security vulnerabilities. The encoder model may be an existing general-purpose large language model. In some configurations, the nodes of the graphs are the embedding vectors generated by the encoder model while edges represent similarities between nodes. Graphs constructed in this way may be pruned to highlight more meaningful node topologies. The graphs may then be labeled based on a security analysis of the corresponding log files. A graph classifier model trained on the labeled graphs may be used to identify security vulnerabilities.
    Type: Application
    Filed: May 4, 2023
    Publication date: November 7, 2024
    Inventors: Leo Moreno BETTHAUSER, Andrew White WICKER, Bryan (Ning) XIA
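The pipeline this abstract describes (embed log lines, connect similar embeddings into a graph, prune weak edges) can be illustrated with a minimal sketch. The function name, the cosine-similarity measure, and the fixed pruning threshold are illustrative assumptions, not details from the filing:

```python
import numpy as np

def build_similarity_graph(embeddings, threshold=0.8):
    """Connect embedding nodes whose cosine similarity exceeds a threshold.

    Nodes are the embedding vectors produced by the encoder model; edges
    represent similarity between nodes, and pruning edges below `threshold`
    keeps only the more meaningful topology.
    """
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = normed @ normed.T              # pairwise cosine similarity
    edges = []
    n = len(embeddings)
    for i in range(n):
        for j in range(i + 1, n):
            if sims[i, j] >= threshold:   # prune weak edges
                edges.append((i, j, float(sims[i, j])))
    return edges
```

The resulting labeled edge lists would then serve as training input for a graph classifier.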
  • Publication number: 20240370714
    Abstract: Disclosed is a machine learning model architecture that can incorporate structure information from multiple types of structured text into a single unified machine learning model. For example, a single unified model may be trained with structure information from XML files, tabular data, and/or flat text files. A structure-aware attention mechanism builds on the attention mechanism of the transformer architecture. Specifically, values computed for a traditional transformer attention mechanism are used to compute structure-aware attention scores. In some configurations, the location of a token in the structured text is incorporated into that token's embedding. Similarly, metadata about a token, such as whether the token is a key or a value of a key/value pair, may be incorporated into the token's embedding. This enables the model to reason over token metadata and the location of the token in the structured text in addition to the meaning of the token itself.
    Type: Application
    Filed: May 4, 2023
    Publication date: November 7, 2024
    Inventors: Leo Moreno BETTHAUSER, Muhammed Fatih BULUT, Bryan (Ning) XIA
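The structure-aware attention mechanism the abstract builds on standard transformer attention can be sketched as an additive bias on the attention logits. The additive-bias formulation and all names here are assumptions for illustration, not the claimed mechanism itself:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def structure_aware_attention(Q, K, V, structure_bias):
    """Scaled dot-product attention with an additive structure bias.

    `structure_bias[i, j]` encodes a structural relationship between
    tokens i and j (e.g. same XML element, same table row); adding it to
    the ordinary attention logits before the softmax lets structural
    proximity raise the attention score.
    """
    d = Q.shape[-1]
    logits = (Q @ K.T) / np.sqrt(d) + structure_bias
    return softmax(logits) @ V
```

With uninformative queries and keys, the bias alone steers attention toward structurally related tokens, which is the intuition behind a single model handling XML, tabular, and flat text.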
  • Publication number: 20240330446
    Abstract: Methods and apparatuses for improving the performance and energy efficiency of machine learning systems that generate security specific machine learning models and generate security related information using security specific machine learning models are described. A security specific machine learning model may comprise a security specific large language model (LLM). The security specific LLM may be trained and deployed to generate semantically related security information. The security specific LLM may be pretrained with a security specific data set that was generated using similarity deduplication and long line handling, and with security specific objectives, such as next log line prediction based on host, system, application, and cyber attacker behavior. The security specific large language model may be fine-tuned using a security specific similarity dataset that may be generated to align the security specific LLM to capture similarity between different security events.
    Type: Application
    Filed: June 14, 2023
    Publication date: October 3, 2024
    Inventors: Muhammed Fatih BULUT, Lloyd Geoffrey GREENWALD, Aditi Kamlesh SHAH, Leo Moreno BETTHAUSER, Yingqi LIU, Ning XIA, Siyue WANG
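The dataset-preparation steps named in this abstract (similarity deduplication and long line handling) can be sketched as follows. Character-shingle Jaccard similarity stands in here for whatever deduplication measure the filing actually uses; the function names and thresholds are illustrative:

```python
def shingles(line, k=5):
    """Character k-grams of a line, as a set."""
    return {line[i:i + k] for i in range(max(1, len(line) - k + 1))}

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 1.0

def prepare_pretraining_lines(lines, max_len=512, dup_threshold=0.85):
    """Truncate overly long log lines and drop near-duplicate lines."""
    kept, kept_shingles = [], []
    for line in lines:
        line = line[:max_len]             # long line handling
        sh = shingles(line)
        if any(jaccard(sh, other) >= dup_threshold for other in kept_shingles):
            continue                      # similarity deduplication
        kept.append(line)
        kept_shingles.append(sh)
    return kept
```

Deduplicating near-identical log lines keeps the pretraining corpus from being dominated by repeated host or application boilerplate.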
  • Publication number: 20240256948
    Abstract: In some examples, a method for orchestrating an execution plan is provided. The method includes receiving an input embedding that is generated by a machine-learning model and receiving a plurality of stored semantic embeddings, from an embedding object memory, based on the input embedding. The plurality of stored semantic embeddings each correspond to a respective historic plan. Each historic plan includes one or more executable skills. The method further includes determining a subset of semantic embeddings from the plurality of stored semantic embeddings based on a similarity to the input embedding, and generating a new plan based on the subset of semantic embeddings and the input embedding. The new plan may be different than the historic plans that correspond to the subset of semantic embeddings. The method further includes providing the new plan as an output.
    Type: Application
    Filed: March 24, 2023
    Publication date: August 1, 2024
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Leo Moreno BETTHAUSER, William BLUM, Andrew W. WICKER, Eric Paul DOUGLAS, Lloyd Geoffrey GREENWALD, Nicholas BECKER
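The retrieval step described above (selecting a subset of stored semantic embeddings by similarity to the input embedding) can be sketched with cosine-similarity top-k search. All names and the choice of cosine similarity are assumptions for illustration:

```python
import numpy as np

def retrieve_similar_plans(input_embedding, stored_embeddings, plans, k=2):
    """Return the k historic plans whose stored embeddings are most
    similar (by cosine similarity) to the input embedding."""
    q = input_embedding / np.linalg.norm(input_embedding)
    M = stored_embeddings / np.linalg.norm(stored_embeddings, axis=1,
                                           keepdims=True)
    scores = M @ q                        # cosine similarity per plan
    top = np.argsort(scores)[::-1][:k]    # indices of the k best matches
    return [plans[i] for i in top]
```

A new plan would then be generated from this subset together with the input embedding, rather than replaying any single historic plan verbatim.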
  • Publication number: 20240256780
Abstract: In some examples, a method of generating a security report is provided. The method includes receiving a user query and security data, and providing the user query and security data to a semantic model. The semantic model generates one or more first embeddings. The method further includes receiving, from a data model, one or more second embeddings. The data model is generated based on historical threat intelligence data. The method further includes generating an execution plan based on the one or more first embeddings and the one or more second embeddings, and returning a report that corresponds to the execution plan.
    Type: Application
    Filed: March 24, 2023
    Publication date: August 1, 2024
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Eric Paul DOUGLAS, Mario Davis GOERTZEL, Lloyd Geoffrey GREENWALD, Aditi Kamlesh SHAH, Leo Moreno BETTHAUSER, Daniel Lee MACE, Nicholas BECKER
  • Publication number: 20240211796
    Abstract: The present disclosure relates to utilizing an embedding space relationship query exploration system to explore embedding spaces generated by machine-learning models. For example, the embedding space relationship query exploration system facilitates efficiently and flexibly revealing relationships that are encoded in a machine-learning model during training and inferencing. In particular, the embedding space relationship query exploration system utilizes various embeddings relationship query models to explore and discover the relationship types being learned and preserved within the embedding space of a machine-learning model.
    Type: Application
    Filed: December 22, 2022
    Publication date: June 27, 2024
    Inventors: Maurice DIESENDRUCK, Leo Moreno BETTHAUSER, Urszula Stefania CHAJEWSKA, Rohith Venkata PESALA, Robin ABRAHAM
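One simple, concrete example of an embedding relationship query is the classic vector-arithmetic analogy probe; the patent's query models are more general, and everything in this sketch is an illustrative assumption:

```python
import numpy as np

def analogy_query(vocab, a, b, c):
    """Answer 'a is to b as c is to ?' by nearest neighbor to b - a + c.

    `vocab` maps items to their embedding vectors. The nearest remaining
    item to the target vector reveals a relationship the embedding space
    has learned and preserved.
    """
    target = vocab[b] - vocab[a] + vocab[c]
    best, best_sim = None, -np.inf
    for word, vec in vocab.items():
        if word in (a, b, c):
            continue
        sim = vec @ target / (np.linalg.norm(vec) * np.linalg.norm(target))
        if sim > best_sim:
            best, best_sim = word, sim
    return best
```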
  • Patent number: 11868358
    Abstract: A data processing system implements obtaining query parameters for a query for content items in a datastore, the query parameters including attributes of content items for which a search is to be conducted; obtaining a first set of content items from a content datastore based on the query parameters; analyzing the first set of content items using a first machine learning model trained to generate relevant content information that identifies a plurality of relevant content items included in the first set of content items; and analyzing the plurality of relevant content items using a second machine learning model configured to output novel content information, the novel content information including a plurality of content items predicted to be relevant and novel, the novel content information ranking the plurality of content items predicted to be relevant and novel based on a novelty score associated with each respective content item.
    Type: Grant
    Filed: June 15, 2022
    Date of Patent: January 9, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Leo Moreno Betthauser, Jing Tian, Yijian Xiang, Pramod Kumar Sharma
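The second stage described above (ranking relevant items by a novelty score) can be sketched by scoring each candidate by its embedding distance to the nearest already-seen item. The distance measure and all names are illustrative assumptions, not the patented scoring:

```python
import numpy as np

def rank_by_novelty(relevant_embeddings, seen_embeddings, items):
    """Rank relevant items by a novelty score: the distance from each
    item's embedding to its nearest already-seen embedding."""
    scores = []
    for emb in relevant_embeddings:
        nearest = min(np.linalg.norm(emb - s) for s in seen_embeddings)
        scores.append(nearest)
    order = np.argsort(scores)[::-1]      # most novel first
    return [(items[i], float(scores[i])) for i in order]
```

Items far from everything the user has seen rank first, which is the intent of surfacing content that is both relevant and novel.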
  • Publication number: 20230418845
    Abstract: The interpretation of a graph data structure represented on a computing system in which the connection between a pair of nodes in the graph may be interpreted by which intermediary entity (node or edge) on a path (e.g., a shortest path) between the node pair is most dominant. That is, if the intermediary entity were not present, a detour path is determined. The greater the difference between the detour path and the original path, the more significant that intermediary entity is. The significance of multiple intermediary entities in the original path may be determined in this way.
    Type: Application
    Filed: September 11, 2023
    Publication date: December 28, 2023
    Inventors: Leo Moreno BETTHAUSER, Maurice DIESENDRUCK, Harsh SHRIVASTAVA
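The detour-path idea in this abstract can be sketched on an unweighted graph: remove the intermediary, recompute the shortest path, and take the length difference as the intermediary's significance. BFS and the specific significance formula here are illustrative assumptions:

```python
from collections import deque

def shortest_path_len(adj, src, dst, removed=None):
    """BFS shortest-path length in an unweighted graph; None if unreachable."""
    removed = removed or set()
    if src in removed or dst in removed:
        return None
    seen, q = {src}, deque([(src, 0)])
    while q:
        node, d = q.popleft()
        if node == dst:
            return d
        for nxt in adj[node]:
            if nxt not in seen and nxt not in removed:
                seen.add(nxt)
                q.append((nxt, d + 1))
    return None

def intermediary_significance(adj, src, dst, intermediary):
    """How much longer the detour path is once the intermediary is
    removed; infinite if no detour exists at all."""
    base = shortest_path_len(adj, src, dst)
    detour = shortest_path_len(adj, src, dst, removed={intermediary})
    if detour is None:
        return float("inf")
    return detour - base
```

The larger the detour penalty, the more dominant that intermediary is for the connection between the node pair.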
  • Publication number: 20230409581
    Abstract: A data processing system implements obtaining query parameters for a query for content items in a datastore, the query parameters including attributes of content items for which a search is to be conducted; obtaining a first set of content items from a content datastore based on the query parameters; analyzing the first set of content items using a first machine learning model trained to generate relevant content information that identifies a plurality of relevant content items included in the first set of content items; and analyzing the plurality of relevant content items using a second machine learning model configured to output novel content information, the novel content information including a plurality of content items predicted to be relevant and novel, the novel content information ranking the plurality of content items predicted to be relevant and novel based on a novelty score associated with each respective content item.
    Type: Application
    Filed: June 15, 2022
    Publication date: December 21, 2023
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Leo Moreno BETTHAUSER, Jing TIAN, Yijian XIANG, Pramod Kumar SHARMA
  • Publication number: 20230401491
Abstract: A data processing system implements obtaining attention matrices from a first machine learning model that is pretrained and includes a plurality of self-attention layers. The data processing system further implements analyzing the attention matrices to generate a computation graph based on the attention matrices. The computation graph provides a representation of behavior of the first machine learning model across the plurality of self-attention layers. The data processing system further implements analyzing the computation graph using a second machine learning model. The second machine learning model is trained to receive the computation graph and to output model behavior information. The model behavior information identifies which layers of the model performed specific tasks associated with generating predictions by the first machine learning model.
    Type: Application
    Filed: June 14, 2022
    Publication date: December 14, 2023
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Leo Moreno BETTHAUSER, Maurice DIESENDRUCK
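The first step described above (turning per-layer attention matrices into a cross-layer computation graph) can be sketched by thresholding attention weights into edges. The thresholding rule and node encoding are illustrative assumptions, not the claimed construction:

```python
import numpy as np

def attention_to_graph(attention_per_layer, threshold=0.3):
    """Build a cross-layer computation graph from attention matrices.

    Nodes are (layer, token) pairs; an edge (l, i) -> (l + 1, j) is kept
    when token j at layer l + 1 attends to token i at layer l with weight
    at least `threshold`.
    """
    edges = []
    for layer, attn in enumerate(attention_per_layer):
        n_to, n_from = attn.shape         # rows: queries, cols: keys
        for j in range(n_to):
            for i in range(n_from):
                if attn[j, i] >= threshold:
                    edges.append(((layer, i), (layer + 1, j)))
    return edges
```

A second model would then consume such graphs to report which layers carry out which sub-tasks.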
  • Patent number: 11797580
    Abstract: The interpretation of a graph data structure represented on a computing system in which the connection between a pair of nodes in the graph may be interpreted by which intermediary entity (node or edge) on a path (e.g., a shortest path) between the node pair is most dominant. That is, if the intermediary entity were not present, a detour path is determined. The greater the difference between the detour path and the original path, the more significant that intermediary entity is. The significance of multiple intermediary entities in the original path may be determined in this way.
    Type: Grant
    Filed: December 20, 2021
    Date of Patent: October 24, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Leo Moreno Betthauser, Maurice Diesendruck, Harsh Shrivastava
  • Publication number: 20230195838
Abstract: The monitoring of performance of a machine-learned model for use in generating an embedding space. The system uses two embedding spaces: a reference embedding space generated by applying an embedding model to reference data, and an evaluation embedding space generated by applying the embedding model to evaluation data. The system obtains multiple views of the reference embedding space, and uses those multiple views to determine a distance threshold. The system determines a distance between the evaluation and reference embedding spaces, and compares that distance with the distance threshold. Based on the comparison, the system determines a level of acceptability of the model for use with the evaluation dataset.
    Type: Application
    Filed: December 20, 2021
    Publication date: June 22, 2023
    Inventors: Leo Moreno BETTHAUSER, Urszula Stefania CHAJEWSKA, Maurice DIESENDRUCK, Rohith Venkata PESALA
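The thresholding scheme in this abstract can be sketched as follows: random subsamples ("views") of the reference embeddings give a distribution of within-reference distances, and a high quantile of that distribution becomes the threshold. Centroid distance stands in here for whatever embedding-space distance the filing uses; all names are illustrative:

```python
import numpy as np

def centroid_distance(x, y):
    """Distance between the mean embeddings of two point sets."""
    return float(np.linalg.norm(x.mean(axis=0) - y.mean(axis=0)))

def model_fit_acceptable(reference, evaluation, n_views=50,
                         quantile=0.95, seed=0):
    """Decide whether an embedding model still fits the evaluation data.

    Splitting the reference embeddings into random halves `n_views`
    times yields a distribution of within-reference distances; the
    threshold is a high quantile of that distribution.
    """
    rng = np.random.default_rng(seed)
    n = len(reference)
    dists = []
    for _ in range(n_views):
        idx = rng.permutation(n)
        half = n // 2
        dists.append(centroid_distance(reference[idx[:half]],
                                       reference[idx[half:]]))
    threshold = float(np.quantile(dists, quantile))
    ok = centroid_distance(reference, evaluation) <= threshold
    return ok, threshold
```

Evaluation data drawn from the same distribution as the reference data passes; a shifted distribution is flagged as a poor fit.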
  • Publication number: 20230196181
    Abstract: A computer system is configured to provide an intelligent machine-learning (ML) model catalog containing data associated with multiple ML models. The multiple ML models are trained over multiple training datasets respectively, and the intelligent ML model catalog contains at least multiple training data spaces of embeddings generated based on the multiple ML models and the multiple training datasets. In response to receiving a user dataset, for at least one ML model in the plurality of ML models, the computer system is configured to extract a user data space of embeddings based on the at least one ML model and the user dataset, and evaluate the user data space against the training data space to determine whether the at least one ML model is a good fit for the user dataset.
    Type: Application
    Filed: December 20, 2021
    Publication date: June 22, 2023
    Inventors: Leo Moreno BETTHAUSER, Urszula Stefania CHAJEWSKA, Maurice DIESENDRUCK, Henry Hun-Li Reid PAN, Rohith Venkata PESALA
  • Publication number: 20230195758
    Abstract: The interpretation of a graph data structure represented on a computing system in which the connection between a pair of nodes in the graph may be interpreted by which intermediary entity (node or edge) on a path (e.g., a shortest path) between the node pair is most dominant. That is, if the intermediary entity were not present, a detour path is determined. The greater the difference between the detour path and the original path, the more significant that intermediary entity is. The significance of multiple intermediary entities in the original path may be determined in this way.
    Type: Application
    Filed: December 20, 2021
    Publication date: June 22, 2023
    Inventors: Leo Moreno BETTHAUSER, Maurice DIESENDRUCK, Harsh SHRIVASTAVA
  • Publication number: 20230044182
Abstract: A computer-implemented method includes obtaining a deep learning model embedding for each instance present in a dataset, the embedding incorporating a measure of concept similarity. An identifier of a first instance of the dataset is received. A similarity distance is determined based on the respective embeddings of the first instance and a second instance. Similarity distances between embeddings, represented as points, imply a graph, where each instance's embedding is connected by an edge to a set of similar instances' embeddings. Sequences of connected points, referred to as walks, provide valuable information about the dataset and the deep learning model.
    Type: Application
    Filed: July 29, 2021
    Publication date: February 9, 2023
    Inventors: Robin Abraham, Leo Moreno Betthauser, Maurice Diesendruck, Urszula Stefania Chajewska
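The implied graph and the walks over it can be sketched with a k-nearest-neighbor graph and a random walk. The k-NN construction, Euclidean distance, and names are illustrative assumptions:

```python
import numpy as np

def knn_edges(embeddings, k=2):
    """Connect each instance's embedding to its k nearest neighbors."""
    n = len(embeddings)
    adj = {}
    for i in range(n):
        d = np.linalg.norm(embeddings - embeddings[i], axis=1)
        d[i] = np.inf                     # exclude self
        adj[i] = list(np.argsort(d)[:k])
    return adj

def walk(adj, start, length, seed=0):
    """A random walk over similar instances' embeddings."""
    rng = np.random.default_rng(seed)
    path = [start]
    for _ in range(length):
        path.append(int(rng.choice(adj[path[-1]])))
    return path
```

Walks that stay inside one tight cluster, or jump between clusters, say different things about how the model groups concepts.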
  • Publication number: 20220230053
    Abstract: Creating a machine learning graph neural network configured to process signals. A method includes identifying a plurality of machine learning graphs where each of the machine learning graphs are for different types of data. The method further includes receiving input identifying shared content of different machine learning graph nodes from different graphs in the plurality of machine learning graphs. The method further includes creating a combined machine learning graph neural network, configured to process signals, using the plurality of machine learning graphs based on the shared content, the combined machine learning graph neural network comprising nodes corresponding to nodes in the plurality of machine learning graphs such that output from the combined machine learning graph neural network comprises outputs generated based on relationships of nodes in the combined machine learning graph corresponding to nodes in different machine learning graphs in the plurality of machine learning graphs.
    Type: Application
    Filed: January 15, 2021
    Publication date: July 21, 2022
    Inventors: Leo Moreno BETTHAUSER, Ziyao LI
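The merging step this abstract describes (combining graphs by identifying nodes with shared content) can be sketched by mapping shared nodes to a canonical identity before unioning edge lists. The data layout and names are illustrative assumptions:

```python
def combine_graphs(graphs, shared):
    """Merge several graphs into one by identifying shared-content nodes.

    `graphs` maps a graph name to its edge list; `shared` maps a node to
    the canonical node it is identified with across graphs.
    """
    def canon(node):
        return shared.get(node, node)

    combined = set()
    for edges in graphs.values():
        for a, b in edges:
            combined.add((canon(a), canon(b)))
    return combined
```

In the patent's setting, the combined graph would then back a graph neural network whose outputs reflect relationships spanning the original graphs.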