Patents by Inventor Francesco Fusco
Francesco Fusco has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11663228
Abstract: Various embodiments are provided for intelligent management of data flows in a computing environment by a processor. One or more data transformation in time-series data applications templates may be created and managed according to concepts, one or more instances of the concepts, relationships between the concepts, and a mapping of the concepts to one or more data sources.
Type: Grant
Filed: January 15, 2020
Date of Patent: May 30, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Francesco Fusco, Robert Gormally, Mark Purcell, Seshu Tirupathi
-
Patent number: 11663407
Abstract: A tool for managing text-item recognition systems such as NER (Named Entity Recognition) systems. The tool applies the system to a text corpus containing instances of text items, such as named entities, to be recognized by the system, and selecting from the text corpus a set of instances of text items which the system recognized. The tool tokenizes the text corpus such that each instance in the aforementioned set is encoded as a single token and processing the tokenized text via a word embedding scheme to generate a word embedding matrix. The tool, responsive to selecting a seed token corresponding to an instance in the aforementioned set, performs a nearest-neighbor search of the embedding space to identify a set of neighboring tokens for the seed token, and identifies the text corresponding to each neighboring token as a potential instance of a text item to be annotated.
Type: Grant
Filed: December 2, 2020
Date of Patent: May 30, 2023
Assignee: International Business Machines Corporation
Inventors: Francesco Fusco, Abderrahim Labbi, Peter Willem Jan Staar
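The nearest-neighbor step in this abstract can be illustrated with a small sketch. This is not the patented implementation: the toy embedding table, the token names, and the choice of cosine similarity as the distance measure are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest_neighbors(embeddings, seed, k=2):
    """Return the k tokens closest to `seed` in the embedding space."""
    seed_vec = embeddings[seed]
    scored = [
        (cosine(seed_vec, vec), token)
        for token, vec in embeddings.items()
        if token != seed
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [token for _, token in scored[:k]]

# Hypothetical embedding table; multi-word entities are single tokens.
toy_embeddings = {
    "New_York": [0.9, 0.1, 0.0],
    "Los_Angeles": [0.8, 0.2, 0.1],
    "Zurich": [0.85, 0.05, 0.2],
    "banana": [0.0, 0.9, 0.4],
}
candidates = nearest_neighbors(toy_embeddings, "New_York", k=2)
```

Each returned token would then be surfaced as a candidate text item for annotation.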
-
Publication number: 20230129390
Abstract: Various embodiments are provided for managing performance of a data processing system in a computing environment using one or more processors in a computing system. A drift may be dynamically detected in one or more machine learning models generating a plurality of predictions and deployed in a computing system. A plurality of metrics and data may be collected of the one or more machine learning models based on the drift. One or more additional machine learning models may be trained based of the drift and the plurality of metrics and data.
Type: Application
Filed: October 27, 2021
Publication date: April 27, 2023
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Francesco FUSCO, Venkata Sitaramagiridharganesh GANAPAVARAPU, Seshu TIRUPATHI
-
Publication number: 20230055769
Abstract: Ranking a plurality of text elements, each comprising at least one word, by specificity. For each text element to be ranked, such a method includes computing an embedding vector that locates a text element in an embedding space, and selecting a set of text fragments from reference text. Each of these text fragments contains the text element to be ranked and further text elements. For each text fragment, the method calculates respective distances in the embedding space between the further text elements. The method further includes calculating a specificity score for the text element to be ranked and storing the specificity score. After ranking the plurality of text elements, a text data structure using the specificity scores for text elements to extract data having a desired specificity from the data structure may be processed.
Type: Application
Filed: August 23, 2021
Publication date: February 23, 2023
Inventors: Francesco Fusco, Cesar Berrospi Ramis, Peter Willem Jan Staar
-
Patent number: 11507601
Abstract: A method for matching first elements with second elements. Each of the first elements and second elements is a character string. The method comprises: calculating a first integer hash value for each of the first elements using a string hash function, wherein the first integer hash value is an output integer calculated from using each of the first elements as an input character string of the function; calculating second integer hash values for each of the second elements using the function; grouping each of the first elements into at least one group of a set of blocking groups using its first integer hash value; grouping each of the second elements into at the least one group of the set of blocking groups using its second integer hash value; and matching first elements with second elements within each group of the set of blocking groups using a string comparison function.
Type: Grant
Filed: August 18, 2016
Date of Patent: November 22, 2022
Assignee: International Business Machines Corporation
Inventors: Francesco Fusco, Yves G. Ineichen, Michel F. Speiser
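The hash-then-block-then-compare flow described in this abstract can be sketched as follows. The abstract does not say which hash or comparison function is used, so everything here is an illustrative assumption: `string_hash` is a hypothetical polynomial hash over a normalized three-character prefix, the number of groups is arbitrary, and `difflib.SequenceMatcher` stands in for the string comparison function.

```python
import re
from collections import defaultdict
from difflib import SequenceMatcher

def normalize(s):
    """Lowercase and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", s.lower())

def string_hash(s, prefix_len=3):
    """Hypothetical integer hash: polynomial hash of a normalized prefix."""
    h = 0
    for ch in normalize(s)[:prefix_len]:
        h = h * 31 + ord(ch)
    return h

def block_and_match(first, second, num_groups=64, threshold=0.6):
    """Group both lists by hash value, then compare only within each group."""
    groups = defaultdict(lambda: ([], []))
    for s in first:
        groups[string_hash(s) % num_groups][0].append(s)
    for s in second:
        groups[string_hash(s) % num_groups][1].append(s)

    matches = []
    for left, right in groups.values():
        for a in left:
            for b in right:
                score = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
                if score >= threshold:
                    matches.append((a, b))
    return matches
```

Blocking pays off because the pairwise comparison, which is the expensive part, only runs on strings that share a group rather than on every pair across both lists.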
-
Patent number: 11475332
Abstract: A computer-implemented method, a computer program product, and a computer system for selecting predictions by models. A computer receives a request for a forecast of a dependent variable in a time domain, where the time domain includes first time periods that have normal labels due to normal predictor variable data and second time periods that have anomalous labels due to anomalous predictor variable data. The computer retrieves accuracy scores and robustness scores of models, where the accuracy scores indicate forecasting accuracy in the first time periods and the robustness scores indicate forecasting accuracy in the second time periods. For predictions in the first time period, the computer selects dependent variable values predicted by a first model that has highest values of the accuracy scores. For predictions in the second time periods, the computer selects dependent variable values predicted by a second model that has highest values of the robustness scores.
Type: Grant
Filed: July 12, 2020
Date of Patent: October 18, 2022
Assignee: International Business Machines Corporation
Inventors: Robert Gormally, Bradley Eck, Francesco Fusco, Mark Purcell, Seshu Tirupathi
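A minimal sketch of the selection rule in this abstract, assuming each time period already carries a "normal" or "anomalous" label and each model has one accuracy score and one robustness score. The function name, the label strings, and the data layout are hypothetical.

```python
def select_forecast(labels, predictions, accuracy, robustness):
    """Pick, per time period, the prediction of the best-suited model.

    labels      -- list of "normal" / "anomalous", one per time period
    predictions -- dict: model name -> list of predicted values
    accuracy    -- dict: model name -> score on normal periods
    robustness  -- dict: model name -> score on anomalous periods
    """
    best_accurate = max(accuracy, key=accuracy.get)
    best_robust = max(robustness, key=robustness.get)
    return [
        predictions[best_robust if label == "anomalous" else best_accurate][t]
        for t, label in enumerate(labels)
    ]

# Hypothetical scores and forecasts for two models over three periods.
forecast = select_forecast(
    labels=["normal", "anomalous", "normal"],
    predictions={"m1": [1.0, 2.0, 3.0], "m2": [1.5, 2.5, 3.5]},
    accuracy={"m1": 0.9, "m2": 0.7},
    robustness={"m1": 0.4, "m2": 0.8},
)
```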
-
Publication number: 20220207349
Abstract: A computer-implemented method of generating a machine learning model pipeline (“pipeline”) for a task, where the pipeline includes a machine learning model and at least one feature. A machine learning task including a data set and a set of first tags related to the task are received from a user. It is determined whether a database stores a first machine learning model pipeline correlated in the database with a second tag matching at least one first tag received from the user. Upon determining that the database stores the first machine learning model pipeline, the first machine learning model pipeline is retrieved, the retrieved first machine learning model pipeline is run, and the machine learning task is responded to. Pipelines may also be created based on stored pipelines correlated with a tag related to a tag in the task, or from received feature generator(s) and models.
Type: Application
Filed: December 29, 2020
Publication date: June 30, 2022
Inventors: Francesco Fusco, Fearghal O'Donncha, Seshu Tirupathi
-
Patent number: 11361571
Abstract: A language model is fine-tuned by extracting terminology terms from a text document. The method comprises identifying a text snippet, identifying candidate multi-word expressions using part of speech tags, and determining a specificity score value for each of the candidate multi-word expressions. Moreover, the method comprises determining a topic similarity score value for each of the candidate multi-word expressions, selecting remaining expressions from the candidate multi-word expressions using a function of a specificity value and a topic similarity value of each of the candidate multi-word expressions, adding a noun comprised in the text snippet to the remaining expressions depending on a correlation function, labeling the remaining multi-word expressions, and fine-tuning an existing pre-trained transformer-based language model using as training data the identified text snippet marked with the labeled remaining expressions.
Type: Grant
Filed: June 28, 2021
Date of Patent: June 14, 2022
Assignee: International Business Machines Corporation
Inventors: Francesco Fusco, Peter Willem Jan Staar
-
Publication number: 20220171931
Abstract: A tool for managing text-item recognition systems such as NER (Named Entity Recognition) systems. The tool applies the system to a text corpus containing instances of text items, such as named entities, to be recognized by the system, and selecting from the text corpus a set of instances of text items which the system recognized. The tool tokenizes the text corpus such that each instance in the aforementioned set is encoded as a single token and processing the tokenized text via a word embedding scheme to generate a word embedding matrix. The tool, responsive to selecting a seed token corresponding to an instance in the aforementioned set, performs a nearest-neighbor search of the embedding space to identify a set of neighboring tokens for the seed token, and identifies the text corresponding to each neighboring token as a potential instance of a text item to be annotated.
Type: Application
Filed: December 2, 2020
Publication date: June 2, 2022
Inventors: Francesco Fusco, Abderrahim Labbi, Peter Willem Jan Staar
-
Publication number: 20220075809
Abstract: Computer-implemented methods and systems are provided for generating training datasets for bootstrapping text classifiers. Such a method includes providing a word embedding matrix. This matrix is generated from a text corpus by encoding words in the text as respective tokens such that selected compound keywords in the text are encoded as single tokens. The method includes receiving, via a user interface, a user-selected set of the keywords a nearest neighbor search of the embedding space is performed for each keyword in the set to identify neighboring keywords, and a plurality of the neighboring keywords are added to the keyword-set. The method further comprises, for a corpus of documents, string-matching keywords in the keyword-sets to text in each document to identify, based on results of the string-matching, documents associated with each text class. The documents identified for each text class are stored as the training dataset for the classifier.
Type: Application
Filed: September 10, 2020
Publication date: March 10, 2022
Inventors: Francesco Fusco, Mattia Atzeni, Abderrahim Labbi
-
Publication number: 20220012609
Abstract: A computer-implemented method, a computer program product, and a computer system for selecting predictions by models. A computer receives a request for a forecast of a dependent variable in a time domain, where the time domain includes first time periods that have normal labels due to normal predictor variable data and second time periods that have anomalous labels due to anomalous predictor variable data. The computer retrieves accuracy scores and robustness scores of models, where the accuracy scores indicate forecasting accuracy in the first time periods and the robustness scores indicate forecasting accuracy in the second time periods. For predictions in the first time period, the computer selects dependent variable values predicted by a first model that has highest values of the accuracy scores. For predictions in the second time periods, the computer selects dependent variable values predicted by a second model that has highest values of the robustness scores.
Type: Application
Filed: July 12, 2020
Publication date: January 13, 2022
Inventors: Robert Gormally, Bradley Eck, Francesco Fusco, Mark Purcell, Seshu Tirupathi
-
Patent number: 11176148
Abstract: Embodiments for automated data exploration and validation by a processor. One or more optimal data flows are provided in response to a query for one or more heterogeneous data sources according to an inference model based on a knowledge graph a plurality of data flows between one or more heterogeneous data sources relating to the query. An analytical flow is provided for one or more of the plurality of data flows for those of the one or more heterogeneous data sources that are undetected, and two or more of the one or more of the plurality of data flows are aggregated or disaggregated for the one or more heterogeneous data sources that are nested within the knowledge graph. One or more criteria is received from a user via an interactive graphical user interface (GUI) to use for defining the one or more optimal data flows.
Type: Grant
Filed: August 9, 2019
Date of Patent: November 16, 2021
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Ulrike Fischer, Francesco Fusco, Pascal Pompey, Mathieu Sinn
-
Publication number: 20210216545
Abstract: Various embodiments are provided for intelligent management of data flows in a computing environment by a processor. One or more data transformation in time-series data applications templates may be created and managed according to concepts, one or more instances of the concepts, relationships between the concepts, and a mapping of the concepts to one or more data sources.
Type: Application
Filed: January 15, 2020
Publication date: July 15, 2021
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Francesco FUSCO, Robert Gormally, Mark PURCELL, Seshu Tirupathi
-
Patent number: 10782655
Abstract: A sensor data fusion system includes a processor coupled to a plurality of sensors. The system is initialized by providing access to a data store storing at least one time series of sensor data; a semantic store storing semantic data including system variables, and relations between the system variables; and a mapping therebetween. A registration of a set of one or more variables of interest for which appropriate data is not available is obtained. An initially empty inference model is extended with the set of variables, to obtain an extended model. A request to observe a given one of the set of variables at a given timestamp is obtained. Responsive thereto, time series data for the set of registered variables is retrieved. The extended model is run with the retrieved data to obtain an estimate of the given one of the variables at the given timestamp.
Type: Grant
Filed: June 16, 2018
Date of Patent: September 22, 2020
Assignee: International Business Machines Corporation
Inventors: Bradley Eck, Francesco Fusco, Seshu Tirupathi
-
Patent number: 10657299
Abstract: A system for posterior estimation of variables. Receiving a set of data inputs. Determining a first model of the water distribution network based on the set of data inputs. Determining a second model of the water distribution network based on the set of data inputs, and the first model.
Type: Grant
Filed: September 21, 2018
Date of Patent: May 19, 2020
Assignee: International Business Machines Corporation
Inventors: Francesco Fusco, Sergiy Zhuk
-
Patent number: 10635763
Abstract: A fluid is modeled as a set of discrete particles. Each of the particles is associated with one or more properties, and a spatial distance comprising a smoothing length over which the one or more properties are to be smoothed. A corresponding trajectory is simulated for each of the particles. The corresponding trajectory is used to formulate a first solution for simulating transport within the fluid. A first predicted error is determined for the first solution. An iterative adjustment is performed to at least one of: a quantity of particles, the smoothing length, or the one or more corresponding properties, to formulate a second solution for simulating transport with the fluid, and a second predicted error is determined for the second solution, until the second predicted error is within a predefined boundary.
Type: Grant
Filed: March 7, 2017
Date of Patent: April 28, 2020
Assignee: International Business Machines Corporation
Inventors: Francesco Fusco, Fearghal O'Donncha, Emanuele Ragnoli
-
Patent number: 10528578
Abstract: A method for data mining on compressed data vectors by a certain metric being expressible as a function of the Euclidean distance is suggested. In a first step, for each compressed data vector, positions and values of such coefficients having the largest energy in the compressed data vector are stored. In a second step, for each compressed data vector, the coefficients having not the largest energy in the compressed data vector are discarded. In a third step, for each compressed data vector, a compression error is determined in dependence on the discarded coefficients in the compressed data vector. In a fourth step, at least one of an upper and a lower bound for the certain metric is retrieved in dependence on the stored positions and the stored values of the coefficients having the largest energy and the determined compression errors.
Type: Grant
Filed: April 24, 2013
Date of Patent: January 7, 2020
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Nikolaos Freris, Francesco Fusco, Michail Vlachos
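The four steps in this abstract can be illustrated with a rough sketch. It keeps the k highest-energy coefficients of each vector, records the compression error as the norm of what was discarded, and derives distance bounds from the triangle inequality. That bound is a simple stand-in chosen for illustration and is likely looser than whatever the patent actually computes. The function names and the assumption that the coefficients come from an orthonormal transform are hypothetical.

```python
import math

def compress(coeffs, k):
    """Keep the k largest-energy coefficients; return them with the error.

    Returns (kept, error), where kept maps position -> value and error is
    the Euclidean norm of the discarded coefficients.
    """
    order = sorted(range(len(coeffs)), key=lambda i: coeffs[i] ** 2, reverse=True)
    kept = {i: coeffs[i] for i in order[:k]}
    error = math.sqrt(sum(coeffs[i] ** 2 for i in order[k:]))
    return kept, error

def distance_bounds(a, b):
    """Lower and upper bounds on the Euclidean distance of the originals.

    Uses only the stored coefficients and compression errors, via the
    triangle inequality: |d_hat - (e_a + e_b)| side clamps at zero.
    """
    kept_a, err_a = a
    kept_b, err_b = b
    positions = set(kept_a) | set(kept_b)
    d_hat = math.sqrt(
        sum((kept_a.get(i, 0.0) - kept_b.get(i, 0.0)) ** 2 for i in positions)
    )
    slack = err_a + err_b
    return max(0.0, d_hat - slack), d_hat + slack
```

Bounds like these let a search discard candidates whose lower bound already exceeds the best distance found so far, without decompressing them.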
-
Publication number: 20190384235
Abstract: A sensor data fusion system includes a processor coupled to a plurality of sensors. The system is initialized by providing access to a data store storing at least one time series of sensor data; a semantic store storing semantic data including system variables, and relations between the system variables; and a mapping therebetween. A registration of a set of one or more variables of interest for which appropriate data is not available is obtained. An initially empty inference model is extended with the set of variables, to obtain an extended model. A request to observe a given one of the set of variables at a given timestamp is obtained. Responsive thereto, time series data for the set of registered variables is retrieved. The extended model is run with the retrieved data to obtain an estimate of the given one of the variables at the given timestamp.
Type: Application
Filed: June 16, 2018
Publication date: December 19, 2019
Inventors: BRADLEY ECK, FRANCESCO FUSCO, SESHU TIRUPATHI
-
Publication number: 20190361902
Abstract: Embodiments for automated data exploration and validation by a processor. One or more optimal data flows are provided in response to a query for one or more heterogeneous data sources according to an inference model based on a knowledge graph a plurality of data flows between one or more heterogeneous data sources relating to the query. An analytical flow is provided for one or more of the plurality of data flows for those of the one or more heterogeneous data sources that are undetected, and two or more of the one or more of the plurality of data flows are aggregated or disaggregated for the one or more heterogeneous data sources that are nested within the knowledge graph. One or more criteria is received from a user via an interactive graphical user interface (GUI) to use for defining the one or more optimal data flows.
Type: Application
Filed: August 9, 2019
Publication date: November 28, 2019
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Ulrike FISCHER, Francesco FUSCO, Pascal POMPEY, Mathieu SINN
-
Publication number: 20190335688
Abstract: Embodiments for groundwater and solar energy usage optimization for an agricultural region in an Internet of Things (IoT) computing environment by one or more processors are described. An amount of water required for an agricultural region and an amount of solar energy required to pump the water in a water pumping system for the agricultural region may be determined according to groundwater characteristics, weather data, weather forecasts, solar energy forecasts, historical water pumping data, crop and soil characteristics, agricultural management strategies, or a combination thereof.
Type: Application
Filed: May 4, 2018
Publication date: November 7, 2019
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Seshu TIRUPATHI, Francesco FUSCO, Sean A. MCKENNA