Patents by Inventor Jannik Stroetgen
Jannik Stroetgen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12223440
Abstract: A device and method for determining a knowledge graph. A second embedding is determined for a first embedding for a word including a function. A first classification, which determines whether or not the word is an entity for the knowledge graph, or which defines to which entity or to which type of entity for the knowledge graph the word in the knowledge graph is to be assigned, is determined for the second embedding using a first classifier. A second classification, which defines to which type of embeddings from a plurality of types of embeddings the second embedding is to be assigned, is determined for the second embedding using a second classifier. At least one parameter for the function is trained in a training as a function of a gradient for the training of the first classifier and as a function of a gradient for the training of the second classifier.
Type: Grant
Filed: April 22, 2021
Date of Patent: February 11, 2025
Assignee: ROBERT BOSCH GMBH
Inventors: Heike Adel-Vu, Jannik Stroetgen, Lukas Lange
-
Publication number: 20240354602
Abstract: A device and a computer-implemented method for machine learning a fact in particular for populating a knowledge base. A character string is provided. A first set of embeddings of parts of the character string is determined. A second set of embeddings of parts of the character string is determined. For mutually corresponding embeddings from the sets, one of the variables for predicting the fact is determined in each case. The fact is determined, in particular in the knowledge base, depending on the variables.
Type: Application
Filed: April 11, 2024
Publication date: October 24, 2024
Inventors: Lukas Lange, Heike Adel-Vu, Jannik Stroetgen
-
Patent number: 12061871
Abstract: A device for the automatic analysis of multilingual text, including an embedder, which is configured for assigning a numeric representation to each of the text components from the multilingual text, and a temporal tagger, which is configured for identifying and tagging temporal expressions in the multilingual text depending on the assigned numeric representations. The embedder is configured for assigning the numeric representations of temporal expressions in such a way that it is not possible to ascertain, on the basis of the numeric representation, in which language the associated text component is written.
Type: Grant
Filed: March 30, 2021
Date of Patent: August 13, 2024
Assignee: ROBERT BOSCH GMBH
Inventors: Anastasiia Iurshina, Heike Adel-Vu, Jannik Stroetgen, Lukas Lange
-
Publication number: 20230351108
Abstract: A method and device for processing temporal expressions from unstructured texts for filling a knowledge database. A temporal expression in a text is determined. A type of the temporal expression is determined as a function of the text. The temporal expression and the type are mapped on a prediction of a value of the temporal expression in a context-free representation of the temporal expression.
Type: Application
Filed: April 24, 2023
Publication date: November 2, 2023
Inventors: Lukas Lange, Jannik Stroetgen, Heike Adel-Vu
-
Patent number: 11783202
Abstract: A method for predicting a persistence over time of entries of a knowledge base variable over time, the knowledge base including triples of entities, property identifiers of properties of the respective entities, and expressions of these respective properties, the prediction being made as a function of an output value of a classifier, and the classifier being trained as a function of triples that are present in the knowledge base at two different points in time separated by a time interval, to output the output value that characterizes for a predefinable triple whether or not the expression stored in the triple is stable over this time interval.
Type: Grant
Filed: October 29, 2019
Date of Patent: October 10, 2023
Assignee: ROBERT BOSCH GMBH
Inventors: Simon Razniewski, Ioannis Dikeoulias, Jannik Stroetgen
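As a rough illustration of the training setup described in the abstract above, the following Python sketch derives stability labels from two knowledge-base snapshots and fits a deliberately simple stand-in classifier. Everything here is an assumption for illustration only: the snapshot layout (a dict keyed by entity and property), the names `label_triples`, `fit_property_rates`, and `predict_stable`, and the per-property frequency model are hypothetical and are not taken from the patent, which does not specify the classifier in this listing.

```python
from collections import defaultdict


def label_triples(snapshot_old, snapshot_new):
    """Label triples present in both snapshots as stable (True) or changed (False).

    Each snapshot is assumed to map (entity, property) -> value.
    """
    examples = []
    for (entity, prop), old_value in snapshot_old.items():
        if (entity, prop) in snapshot_new:
            stable = snapshot_new[(entity, prop)] == old_value
            examples.append((entity, prop, old_value, stable))
    return examples


def fit_property_rates(examples):
    """Stand-in 'classifier': the fraction of stable triples per property."""
    counts = defaultdict(lambda: [0, 0])  # property -> [stable count, total count]
    for _entity, prop, _value, stable in examples:
        counts[prop][1] += 1
        if stable:
            counts[prop][0] += 1
    return {prop: stable / total for prop, (stable, total) in counts.items()}


def predict_stable(rates, prop, threshold=0.5):
    """Predict stability for a triple from its property; unseen properties default to unstable."""
    return rates.get(prop, 0.0) >= threshold


# Toy snapshots taken at two points in time (made-up values).
old = {
    ("Berlin", "country"): "Germany",
    ("Berlin", "population"): "3.6M",
    ("Paris", "country"): "France",
    ("Paris", "population"): "2.1M",
}
new = {
    ("Berlin", "country"): "Germany",
    ("Berlin", "population"): "3.7M",
    ("Paris", "country"): "France",
    ("Paris", "population"): "2.2M",
}

rates = fit_property_rates(label_triples(old, new))
print(predict_stable(rates, "country"), predict_stable(rates, "population"))
```

A real system would presumably use richer features of the triple than the property identifier alone; the frequency model is only the smallest thing that makes the labeling step concrete.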
-
Publication number: 20230306283
Abstract: A device and method for training a model for linking a mention in textual context to an entity across knowledge bases. In the method, depending on training data, training the model for mapping an entity of a first knowledge base to its first representation in a vector space, for mapping an entity of a second knowledge base to its second representation in the vector space, for mapping the mention to a third representation in the vector space. The training data includes a set of pairs in which each pair includes a mention in a textual context and its corresponding reference entity in either the first knowledge base or the second knowledge base. Training the model includes evaluating a loss function.
Type: Application
Filed: March 3, 2023
Publication date: September 28, 2023
Inventors: Hassan Soliman, Dragan Milchevski, Heike Adel-Vu, Mohamed Gad-Elrab, Jannik Stroetgen
-
Publication number: 20230267341
Abstract: A device and a computer-implemented method for adding a quantity fact to a knowledge base, in particular a knowledge graph. The method includes: providing the knowledge base; providing a textual resource; providing an entity from the knowledge base; providing a relation from the knowledge base; providing a set of different units; determining a quantity comprising a unit within the set of different units that is within the textual resource depending on the entity, the relation, and the set of different units; determining a quantity fact comprising the entity, the relation, the quantity and the unit; and adding the quantity fact to the knowledge base.
Type: Application
Filed: February 14, 2023
Publication date: August 24, 2023
Inventors: Daria Stepanova, Dragan Milchevski, Gerhard Weikum, Jannik Stroetgen, Vinh Thinh Ho
-
Patent number: 11687725
Abstract: A computer-implemented method for processing text data including a multitude of text modules. In the method, a representation of the text is provided, and a model is used which predicts a classification for a respective text module of the text as a function of the representation of the text. The provision of the representation of the text includes the provision of a total word vector for a respective text module of the text. The total word vector is formed from at least two, preferably multiple word vectors, and a respective word vector being weighted as a function of properties of the respective text module.
Type: Grant
Filed: October 21, 2020
Date of Patent: June 27, 2023
Assignee: ROBERT BOSCH GMBH
Inventors: Heike Adel-Vu, Jannik Stroetgen, Lukas Lange
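One plausible reading of the "total word vector" in the abstract above can be sketched in Python as a softmax-weighted sum of several vectors for the same text module. This is a minimal sketch under stated assumptions, not the patented method: the weighted sum (rather than, say, a weighted concatenation), the softmax normalization, and the helper names `softmax`, `scores_from_properties`, and `total_word_vector` are all hypothetical, and the length-based scoring rule is a made-up placeholder for whatever properties of the text module a real model would use.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scores_from_properties(token):
    """Hypothetical scoring rule: favor the first vector for short tokens, the second for long ones."""
    return [1.0, 0.0] if len(token) <= 5 else [0.0, 1.0]


def total_word_vector(word_vectors, scores):
    """Combine same-sized vectors for one text module into a single weighted-sum vector."""
    weights = softmax(scores)
    combined = [0.0] * len(word_vectors[0])
    for weight, vector in zip(weights, word_vectors):
        for i, value in enumerate(vector):
            combined[i] += weight * value
    return combined, weights


# Two made-up 3-dimensional vectors for the same token, e.g. from two embedding types.
vectors = [[1.0, 0.0, 2.0], [0.0, 1.0, 4.0]]
combined, weights = total_word_vector(vectors, scores_from_properties("knowledge"))
print(weights, combined)
```

The resulting vector could then be passed to whatever model predicts the classification for the text module; that model is outside the scope of this sketch.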
-
Publication number: 20220300758
Abstract: A device and a computer-implemented method, for determining a similarity between data sets. A first data set that includes a plurality of first embeddings, and a second data set that includes a plurality of second embeddings, are predefined. A first model is trained on the first data set, and a second model is trained on the second data set. A set of first features of the first model is determined on the second data set, which for each second embedding includes a feature of the first model, and a set of second features of the second model is determined on the second data set, which for each second embedding includes a feature of the second model. A map that optimally maps the set of first features onto the set of second features is determined. The similarity is determined as a function of a distance of the map from a reference.
Type: Application
Filed: March 11, 2022
Publication date: September 22, 2022
Inventors: Lukas Lange, Heike Adel-Vu, Jannik Stroetgen
-
Publication number: 20220300750
Abstract: A device and method for classifying data sets are provided. A model for solving a task, and training data sets are predefined. For each of the training data sets, a trained model for solving the task is determined by pretraining the model on the training data set and training the model on a reference training data set. A trained reference model for solving the task is determined by training the model on the reference training data set without pretraining with the plurality of training data sets. The trained models are classified as suitable or unsuitable for the pretraining as a function of a deviation of their particular quality from a reference quality. In the plurality of training data sets, nearest neighbors of a data set are determined. Each data set is classified as suitable or unsuitable for the pretraining.
Type: Application
Filed: March 11, 2022
Publication date: September 22, 2022
Inventors: Lukas Lange, Jannik Stroetgen
-
Publication number: 20210350077
Abstract: A computer-implemented method. The method includes: providing input data for a model; anonymizing at least a portion of the input data, the anonymizing including the provision of masked embeddings of the input data, and extracting pieces of information from the masked embeddings. The steps for anonymizing at least a portion of the input data and for extracting pieces of information are carried out using a hierarchical model.
Type: Application
Filed: May 3, 2021
Publication date: November 11, 2021
Inventors: Lukas Lange, Heike Adel-Vu, Jannik Stroetgen
-
Publication number: 20210342716
Abstract: A device and method for determining a knowledge graph. A second embedding is determined for a first embedding for a word including a function. A first classification, which determines whether or not the word is an entity for the knowledge graph, or which defines to which entity or to which type of entity for the knowledge graph the word in the knowledge graph is to be assigned, is determined for the second embedding using a first classifier. A second classification, which defines to which type of embeddings from a plurality of types of embeddings the second embedding is to be assigned, is determined for the second embedding using a second classifier. At least one parameter for the function is trained in a training as a function of a gradient for the training of the first classifier and as a function of a gradient for the training of the second classifier.
Type: Application
Filed: April 22, 2021
Publication date: November 4, 2021
Inventors: Heike Adel-Vu, Jannik Stroetgen, Lukas Lange
-
Publication number: 20210326530
Abstract: A device for the automatic analysis of multilingual text, including an embedder, which is configured for assigning a numeric representation to each of the text components from the multilingual text, and a temporal tagger, which is configured for identifying and tagging temporal expressions in the multilingual text depending on the assigned numeric representations. The embedder is configured for assigning the numeric representations of temporal expressions in such a way that it is not possible to ascertain, on the basis of the numeric representation, in which language the associated text component is written.
Type: Application
Filed: March 30, 2021
Publication date: October 21, 2021
Inventors: Anastasiia Iurshina, Heike Adel-Vu, Jannik Stroetgen, Lukas Lange
-
Publication number: 20210192316
Abstract: A computer-implemented method for processing digital data of a specific domain, the digital data including a multitude of data sequences, a respective data sequence including in each case multiple data elements, and the data elements, following a logical and/or syntactic structure being joined together to form the respective data sequence. The method encompasses the following steps: parsing a respective data sequence into multiple components utilizing its logical and/or syntactic structure, providing vector representations of the components, determining degrees of similarity between individual vector representations and determining degrees of similarity between individual data sequences based on degrees of similarity between individual vector representations.
Type: Application
Filed: December 3, 2020
Publication date: June 24, 2021
Inventors: Jannik Stroetgen, Marvin Schiller
-
Publication number: 20210124877
Abstract: A computer-implemented method for processing text data including a multitude of text modules. In the method, a representation of the text is provided, and a model is used which predicts a classification for a respective text module of the text as a function of the representation of the text. The provision of the representation of the text includes the provision of a total word vector for a respective text module of the text. The total word vector is formed from at least two, preferably multiple word vectors, and a respective word vector being weighted as a function of properties of the respective text module.
Type: Application
Filed: October 21, 2020
Publication date: April 29, 2021
Inventors: Heike Adel-Vu, Jannik Stroetgen, Lukas Lange
-
Publication number: 20210056448
Abstract: A computer-implemented method for computing inconsistency explanations in a first data set, enhanced with an ontology, the first data set comprising data elements, called individuals, and facts about the individuals; the facts are expressed according to an ontology language in terms of class assertions and/or property assertions, a class assertion relates one individual with a class and a property assertion relates one individual with a second individual. The ontology includes a formal explicit description of the classes and/or properties and further including axioms about the classes and/or properties; wherein the method includes the steps of: constructing a second data set being an abstract description of the first data set; computing inconsistency explanations in the second data set with regard to the axioms of the ontology, and computing inconsistency explanations for the first data set with regard to the ontology based on the computed inconsistency explanations in the second data set.
Type: Application
Filed: July 20, 2020
Publication date: February 25, 2021
Inventors: Daria Stepanova, Evgeny Kharlamov, Jannik Stroetgen, Mohamed Gad-Elrab, Trung Kien Tran
-
Publication number: 20210034988
Abstract: A device and method for activating a machine or for machine learning or for filling a knowledge graph. Training data are made available, including texts having labels with regard to a structured piece of information. A system for classification is trained using the training data, the system for classification including an attention function that weighs individual vector representations of individual parts of a sentence as a function of weights, a classification of the sentence is determined as a function of an output of the attention function. The machine is activated in response to the input data or a knowledge graph is filled with information, i.e., expanded or built anew, in response to input data.
Type: Application
Filed: July 23, 2020
Publication date: February 4, 2021
Inventors: Heike Adel-Vu, Jannik Stroetgen
-
Publication number: 20210027139
Abstract: A computer-implemented method for machine learning and processing of a digital data stream as well as devices for this purpose. A representation of a text is provided independently of a domain, a representation of a structure of the domain being provided, and a model for automatically detecting sensitive text elements being trained as a function of the representations, and data from at least a portion of the data stream, which represent a word, being replaced by data that represent a placeholder for the word, an output of the model being determined as a function of the data, data to be replaced in the data and data that replace the data to be replaced being determined as a function of the output of the model.
Type: Application
Filed: July 17, 2020
Publication date: January 28, 2021
Inventors: Lukas Lange, Heike Adel-Vu, Jannik Stroetgen
-
Publication number: 20200202228
Abstract: A method for predicting a persistence over time of entries of a knowledge base variable over time, the knowledge base including triples of entities, property identifiers of properties of the respective entities, and expressions of these respective properties, the prediction being made as a function of an output value of a classifier, and the classifier being trained as a function of triples that are present in the knowledge base at two different points in time separated by a time interval, to output the output value that characterizes for a predefinable triple whether or not the expression stored in the triple is stable over this time interval.
Type: Application
Filed: October 29, 2019
Publication date: June 25, 2020
Inventors: Simon Razniewski, Ioannis Dikeoulias, Jannik Stroetgen