Patents Assigned to Informatica LLC
-
Patent number: 10324947
Abstract: A data analysis server maintains database operation history data and context data for database operations performed on tables by a set of training users. The data analysis server builds predictive models for using the maintained data to recommend database operations and operands to a set of guided users. The data analysis server trains the predictive models by determining and weighting features derived from context data that are predictive of performing database operations to tables with similar context data. Using the predictive model, the data analysis server generates recommended database operations and operands based on context data received from a data analysis application of a guided user and sends the recommendations to the data analysis application for presentation to the guided user.
Type: Grant
Filed: April 26, 2016
Date of Patent: June 18, 2019
Assignee: Informatica LLC
Inventors: Atreyee Dey, Sanjay Kaluskar, Udayakumar Dhansingh
-
Patent number: 10257211
Abstract: An apparatus, computer-readable medium, and computer-implemented method for detecting anomalous user behavior, including storing user activity data collected over an observation interval, the user activity data comprising a plurality of data objects and corresponding to a plurality of users, grouping a plurality of data objects into a plurality of clusters, calculating one or more outlier metrics corresponding to each cluster, calculating an irregularity score for each of one or more data objects in the plurality of data objects, generating one or more object postures for the one or more data objects, and comparing each of at least one object posture in the one or more object postures with one or more previous object postures corresponding to a same user as the object posture to identify anomalous activity of one or more users in the plurality of users.
Type: Grant
Filed: May 20, 2016
Date of Patent: April 9, 2019
Assignee: Informatica LLC
Inventor: Igor Balabine
-
Patent number: 10235437
Abstract: A computer system and computer-implemented method for extracting a data set from data clusters that comprise rows and columns of heterogeneous data values. A plurality of random data groups comprising at least one of a plurality of contiguous rows or columns of data values are selected. Each data value has a data type. A table template type is identified based on detection of a pattern between the data cells of the contiguous rows or columns. A table template header is identified that comprises a starting position, an ending position, and a width. A reference row or reference column indicating a start of a table body is determined. The data cells of the subsequent rows or columns in the table body are compared to the data cells of the reference rows to identify noise rows or columns that are removed from the table body.
Type: Grant
Filed: March 31, 2015
Date of Patent: March 19, 2019
Assignee: Informatica LLC
Inventors: Saurabh Diwan, Shivananda P. J.
-
Patent number: 10164945
Abstract: An apparatus, computer-readable medium, and computer-implemented method for masking data, including applying an irreversible function to a first data element to generate a derivative data element, the first data element being of a first data type and the derivative data element being of a second data type different than the first data type, selecting at least a portion of the derivative data element to serve as a template, and generating a masked data element as the result of converting the template from the second data type to the first data type.
Type: Grant
Filed: May 23, 2016
Date of Patent: December 25, 2018
Assignee: Informatica LLC
Inventors: Igor Balabine, Bala Kumaresan
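The masking steps in the abstract above can be sketched roughly as follows. This is an illustration only, not the patented implementation: it assumes the first data type is a string of decimal digits, uses salted SHA-256 as a stand-in irreversible function, and the name `mask_digits` and the `salt` parameter are hypothetical.

```python
import hashlib


def mask_digits(value: str, salt: bytes = b"demo-salt") -> str:
    """Mask a decimal-digit string with another digit string of equal length."""
    if not (value.isascii() and value.isdigit()):
        raise ValueError("expected a non-empty string of decimal digits")
    # Irreversible function (assumed: salted SHA-256). The derivative is a
    # byte string, i.e. a different data type than the digit string.
    derivative = hashlib.sha256(salt + value.encode("ascii")).digest()
    # Extend the derivative if the input is longer than one digest.
    while len(derivative) < len(value):
        derivative += hashlib.sha256(derivative).digest()
    # Select a portion of the derivative to serve as the template.
    template = derivative[: len(value)]
    # Convert the template back to the first data type (decimal digits).
    return "".join(str(byte % 10) for byte in template)


masked = mask_digits("4111111111111111")
```

The output keeps the length and character class of the input, so downstream code that expects digits keeps working, while the original value cannot be read back from the masked one.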
-
Patent number: 10135854
Abstract: An apparatus, computer-readable medium, and computer-implemented method for generating a data proliferation graph, including receiving a selection of a target data store, identifying a plurality of data stores which have either received data that was previously on the target data store or which have sent data that was subsequently on the target data store, the plurality of data stores being divided into a plurality of proliferation levels corresponding to degrees of separation from the target data store and direction of data propagation relative to the target data store, generating a data proliferation graph, and transmitting at least one portion of the data proliferation graph.
Type: Grant
Filed: April 7, 2015
Date of Patent: November 20, 2018
Assignee: Informatica LLC
Inventors: Richard Grondin, Gary Patterson, Rahul Gupta, Ranjeet Tayi, Vikram Tyarla
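One way to picture the proliferation levels described above is a breadth-first search from the target data store in each direction. The sketch below is a hedged illustration under the assumption that data movement is available as a list of `(source, destination)` pairs; the function name and the result keys are hypothetical and do not come from the patent.

```python
from collections import deque


def proliferation_levels(flows, target):
    """Group data stores by degrees of separation from `target`, per direction."""
    downstream, upstream = {}, {}
    for src, dst in flows:
        downstream.setdefault(src, set()).add(dst)
        upstream.setdefault(dst, set()).add(src)

    def bfs(adjacency):
        # Standard breadth-first search; level = number of hops from target.
        levels = {target: 0}
        queue = deque([target])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency.get(node, ()):
                if neighbor not in levels:
                    levels[neighbor] = levels[node] + 1
                    queue.append(neighbor)
        del levels[target]
        return levels

    return {
        "received_from_target": bfs(downstream),  # stores the data flowed to
        "sent_to_target": bfs(upstream),          # stores the data came from
    }


flows = [("erp", "crm"), ("crm", "warehouse"), ("warehouse", "mart")]
levels = proliferation_levels(flows, "warehouse")
```

Here `levels["sent_to_target"]` places `crm` at level 1 and `erp` at level 2, while `levels["received_from_target"]` places `mart` at level 1.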
-
Patent number: 10061816
Abstract: A system and method are disclosed for providing metric recommendations by a cloud event log analytics system. The log analytics system includes a user interface which allows users to view metric recommendations and to view, modify, annotate, delete, or create log metrics. In a first embodiment, centroid vectors are created from metadata associated with user access of log metrics. The centroid vectors are compared to metrics vectors created from log metrics and the results are ranked and provided to users as metric recommendations. In a second embodiment, classification rules are inferred for metric matrix tables containing metadata about log metric usage. Classification rules are assigned to a decision tree used to calculate composite probabilities of interest of log metrics. A recommendation matrix incorporates the composite probabilities of interest to predict the degree of interest an analytics user may have in a log metric for a given role.
Type: Grant
Filed: May 11, 2015
Date of Patent: August 28, 2018
Assignee: Informatica LLC
Inventors: Gregorio Convertino, Mark Detweiler, Maoyuan Sun
-
Patent number: 9785795
Abstract: A data management service identifies sensitive data stored on enterprise databases according to record classification rules that classify a data record as having a sensitive data type if the data record includes fields matching at least one of the record classification rules. The data management service determines assessment scores for enterprise databases according to sensitive data records and protection policies on the enterprise databases. The data management service provides an interface that groups enterprise databases having common attributes or common sensitive data types and indicates aggregated assessment scores for the groups of enterprise databases. Through the interface with the grouped enterprise databases, an administrator applies protection policies to enterprise databases. To apply the protection policy, the data management service applies the protection policy to a source database from which dependent enterprise databases access the sensitive database.
Type: Grant
Filed: May 6, 2015
Date of Patent: October 10, 2017
Assignee: Informatica, LLC
Inventors: Richard Grondin, Rahul Gupta
-
Patent number: 9779158
Abstract: An apparatus, computer-readable medium, and computer-implemented method for data subsetting, including receiving a request for a subset of data from a plurality of tables, generating an entity graph corresponding to the plurality of tables, expanding the entity graph if the entity graph does not have any cycles, and performing acyclic subset processing on the expanded entity graph if the entity graph does not have any cycles and the expanded entity graph does not have any cycles.
Type: Grant
Filed: December 29, 2015
Date of Patent: October 3, 2017
Assignee: Informatica LLC
Inventors: Vinayak Borkar, Richard Grondin, Ankur Gupta, Bhupendra Chopra
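The abstract above hinges on checking whether an entity graph has cycles but does not say how the check is done. A minimal sketch, assuming the entity graph is a directed graph of table-to-table references stored as an adjacency dictionary, could use a standard depth-first search with three node states; `has_cycle` is a hypothetical name.

```python
def has_cycle(graph):
    """Return True if the directed graph {node: [referenced nodes]} has a cycle."""
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {}

    def visit(node):
        state[node] = IN_PROGRESS
        for child in graph.get(node, ()):
            child_state = state.get(child, UNSEEN)
            if child_state == IN_PROGRESS:
                return True  # back edge: we returned to a node on the current path
            if child_state == UNSEEN and visit(child):
                return True
        state[node] = DONE
        return False

    return any(state.get(node, UNSEEN) == UNSEEN and visit(node) for node in graph)


acyclic = {"order_items": ["orders"], "orders": ["customers"], "customers": []}
cyclic = {"employees": ["departments"], "departments": ["employees"]}
```

With these inputs, `has_cycle(acyclic)` is `False` and `has_cycle(cyclic)` is `True`. The recursive form is fine for schemas with a modest number of tables; a very deep graph would call for an explicit stack.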
-
Patent number: 9762603
Abstract: A data management service identifies sensitive data stored on enterprise databases according to record classification rules that classify a data record as having a sensitive data type if the data record includes fields matching at least one of the record classification rules. Methods and systems rely on a set of impact factors, each having a set of value bands representing a range for the impact factor and a corresponding value (between 0 and 1). The factors, ranges, and values are all customizable for an organization. Impact scoring calculations take into account each of the impact factors, and each is weighted to represent a specific risk perception or assessment type. A similar impact scoring is applied to data quality using volume of data as a key attribute of the quality.
Type: Grant
Filed: May 8, 2015
Date of Patent: September 12, 2017
Assignee: Informatica LLC
Inventors: Richard Grondin, Rahul Gupta, Bala Kumaresan
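The value bands and weights described above can be illustrated with a small calculation. The abstract does not state how the weighted factors are combined, so the weighted average below is an assumption, as are the factor names, band boundaries, and weights.

```python
def band_value(bands, observed):
    """Return the value (0..1) of the first band whose upper bound covers `observed`."""
    for upper_bound, value in bands:
        if observed <= upper_bound:
            return value
    return bands[-1][1]  # fall back to the last band


def impact_score(factors, observations):
    """Weighted average of band values (an assumed way to combine the factors)."""
    total_weight = sum(factor["weight"] for factor in factors.values())
    weighted_sum = sum(
        factor["weight"] * band_value(factor["bands"], observations[name])
        for name, factor in factors.items()
    )
    return weighted_sum / total_weight


# Hypothetical factors: each band is (upper bound of the range, value).
factors = {
    "sensitive_records": {
        "weight": 2.0,
        "bands": [(1_000, 0.25), (100_000, 0.5), (float("inf"), 1.0)],
    },
    "users_with_access": {
        "weight": 1.0,
        "bands": [(10, 0.25), (float("inf"), 1.0)],
    },
}
score = impact_score(factors, {"sensitive_records": 50_000, "users_with_access": 5})
```

For these inputs the band values are 0.5 and 0.25, so the score is (2.0 × 0.5 + 1.0 × 0.25) / 3.0. Changing the weights is how an organization would express a different risk perception in this sketch.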
-
Patent number: 9672272
Abstract: An apparatus, computer-readable medium, and computer-implemented method for efficiently performing operations on distinct data values, including receiving a query directed to a column of data, the query defining one or more group sets for grouping the data retrieved in response to the query, and for each of the one or more group sets, generating one or more entity map vectors, the length of each entity map vector being equal to the number of unique data values in a domain which corresponds to the column of data, the position of each bit in the entity map vector corresponding to the lexical position of a corresponding unique data value in a lexically ordered list of the unique data values, and the value of each bit in the entity map vector indicating the presence or absence of the corresponding unique data value in the group set.
Type: Grant
Filed: November 16, 2015
Date of Patent: June 6, 2017
Assignee: Informatica LLC
Inventors: Richard Grondin, Evgueni Fadeitchev
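The entity map vector described above is essentially one bitmap per group set, indexed by the lexical position of each unique value. The sketch below is illustrative only: it assumes the domain is derived from the rows themselves and that rows arrive as `(group, value)` pairs; `entity_map_vectors` is a hypothetical name.

```python
def entity_map_vectors(rows):
    """Build one bit vector per group over the lexically ordered unique values."""
    rows = list(rows)
    # Lexically ordered list of the unique data values in the domain.
    ordered_values = sorted({value for _, value in rows})
    position = {value: index for index, value in enumerate(ordered_values)}
    vectors = {}
    for group, value in rows:
        bits = vectors.setdefault(group, [0] * len(ordered_values))
        bits[position[value]] = 1  # 1 = value is present in this group set
    return ordered_values, vectors


rows = [("east", "pear"), ("east", "apple"), ("west", "apple"), ("east", "apple")]
ordered_values, vectors = entity_map_vectors(rows)
distinct_in_east = sum(vectors["east"])
```

Counting distinct values per group becomes a bit count, and set operations across groups (for example, values present in both `east` and `west`) become element-wise AND or OR over the vectors.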
-
Patent number: 9477729
Abstract: A database keyword search technique that relies on a domain based storage infrastructure is disclosed. In operation, a keyword search string is processed to generate a set of search string permutations. Each string permutation specifies a different ordering of one or more portions of the search string. A domain based search process is then executed asynchronously for each string permutation. Each execution generates a search result set that identifies rows in the database that include data relevant to the string permutation. The results in each result set are scored and ranked based in part on the similarity between the string permutation and the search string provided by the user. The rankings determine which of the results are to be presented to the user.
Type: Grant
Filed: May 9, 2014
Date of Patent: October 25, 2016
Assignee: Informatica LLC
Inventors: Pradeep Bhattiprolu, Richard Grondin
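The permutation and similarity steps above can be sketched with the Python standard library. This is a rough illustration, not the patented method: splitting the search string on whitespace, using `difflib.SequenceMatcher` as the similarity measure, and the function name are all assumptions, and the asynchronous domain-based search itself is not modeled.

```python
from difflib import SequenceMatcher
from itertools import permutations


def ranked_permutations(search):
    """Return (similarity, permutation) pairs, most similar to `search` first."""
    tokens = search.split()
    candidates = []
    # Orderings of all tokens first, then of progressively smaller subsets.
    for size in range(len(tokens), 0, -1):
        for ordering in permutations(tokens, size):
            candidate = " ".join(ordering)
            if candidate not in candidates:
                candidates.append(candidate)
    scored = [(SequenceMatcher(None, search, c).ratio(), c) for c in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


ranking = ranked_permutations("john smith")
```

The original string always ranks first with a similarity of 1.0, so results found through reordered or partial permutations are weighted lower than exact matches. The number of permutations grows factorially with the token count, so a real system would need to cap the search string length.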
-
Patent number: 9418237
Abstract: A system, computer-readable medium, and method for masking data including receiving a request directed to a network service, applying a rule set to the request to identify sensitive data which is responsive to the request, rewriting the request, based on the rule set, such that the rewritten request will result in the sensitive data being retrieved and converted into a masked format according to one or more instructions in the rewritten request, and transmitting the rewritten request to the network service.
Type: Grant
Filed: July 30, 2014
Date of Patent: August 16, 2016
Assignee: Informatica LLC
Inventor: Eric Boukobza
-
Patent number: 9336256
Abstract: An apparatus, computer-readable medium, and computer-implemented method for data tokenization are disclosed. The method includes receiving, at a database network router, a database access request directed to a tokenized database, the tokenized database containing one or more tokenized data values, applying one or more rules to the request, rewriting the request based on at least one of the one or more rules, such that data values being added to the database will be tokenized data values, and data values received from the database will be non-tokenized data values, and transmitting the rewritten request to the database.
Type: Grant
Filed: March 15, 2013
Date of Patent: May 10, 2016
Assignee: Informatica LLC
Inventor: Eric Boukobza
-
Patent number: 9235496
Abstract: A test data extraction and persistence technique that relies on a data domain based storage infrastructure is disclosed. In operation, a test data server receives a test data query that specifies selection parameters for selecting test data and any transformation operations to be performed on the test data. The test data server identifies domains associated with the selection parameters and traverses the tables in the database based on the identified domains to extract test data that satisfies the selection parameters. The test data server optionally performs transformation operations, such as masking operations, specified by the test data query on the extracted data. The identified domains are stored such that test data that satisfies the test data query may be extracted from the database repetitively without reevaluating the test data query each time.
Type: Grant
Filed: October 17, 2013
Date of Patent: January 12, 2016
Assignee: Informatica LLC
Inventor: Richard Grondin
-
Patent number: 9218379
Abstract: An apparatus, computer-readable medium, and computer-implemented method for efficiently performing operations on distinct data values, including storing a tokenized column of data in a table by mapping each unique data value in a corresponding domain to a unique entity ID, and replacing each of the data values in the column with the corresponding entity ID to generate a column of tokenized data containing one or more entity IDs, receiving a query directed to the column of data, the query defining one or more group sets for grouping the data retrieved in response to the query, and generating an entity map vector for each group set, the length of each entity map vector equal to the number of unique entity IDs for the domain, and the value of each bit in the entity map vector indicating the presence or absence of a different unique entity ID in the group set.
Type: Grant
Filed: March 15, 2013
Date of Patent: December 22, 2015
Assignee: Informatica LLC
Inventors: Richard Grondin, Evgueni Fadeitchev
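The tokenization step in the abstract above (unique value to entity ID, then one bit per entity ID per group set) can be sketched as follows. It is an illustration under assumptions: entity IDs are assigned in order of first appearance, each entity map vector is held as a Python integer bitmask, and both function names are hypothetical.

```python
def tokenize_column(values):
    """Map each unique value to an entity ID and replace values with those IDs."""
    entity_ids = {}
    tokens = []
    for value in values:
        # Assign the next unused ID the first time a value is seen.
        tokens.append(entity_ids.setdefault(value, len(entity_ids)))
    return entity_ids, tokens


def group_bitmaps(groups, tokens):
    """Build one integer bitmask per group; bit i is set if entity ID i is present."""
    bitmaps = {}
    for group, token in zip(groups, tokens):
        bitmaps[group] = bitmaps.get(group, 0) | (1 << token)
    return bitmaps


states = ["NY", "CA", "NY", "TX"]
regions = ["a", "a", "b", "b"]
entity_ids, tokens = tokenize_column(states)
bitmaps = group_bitmaps(regions, tokens)
distinct_in_a = bin(bitmaps["a"]).count("1")
```

Because the column stores small integer IDs rather than the values themselves, a distinct count for a group is just the number of set bits in its bitmask, with no comparison of the underlying strings.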