Patents by Inventor Alexander Gorelik
Alexander Gorelik has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11281626
Abstract: In system for analyzing large data sets, document/file format can be discovered by attempting to parse the file using several parsers to generate a schema, assigning a score to each parsing, and selecting a parser based on the assigned scores. Schema element attributes, such as statistical parameters, can be derived and used in identifying schema elements associated with other files. Attributes of identified schema elements can be used to substitute missing data values with values based on such attributes. Data values corresponding schema elements can be selected and highlighted, and schema elements and/or attributes thereof can be highlighted based on selected data values. From a cluster of files, a lineage relationship between file pairs, indicating whether one file is derived from another, can be determined for several files. In reducing/compacting data, utilization of all available reducers can be optimized according to current utilization of one or more reducers.
Type: Grant
Filed: May 17, 2019
Date of Patent: March 22, 2022
Assignee: HITACHI VANTARA LLC
Inventor: Alexander Gorelik
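The format-discovery idea in the abstract above (try several parsers, score each parsing, select a parser by score) can be illustrated with a minimal sketch. This is not the patented implementation: the two parsers, the scoring rule (fraction of lines or rows that parse consistently), and all names such as `discover_format` and `PARSERS` are assumptions made for illustration only.

```python
import csv
import io
import json


def parse_json_lines(text):
    """Hypothetical parser: expects one JSON object per line.

    Returns (schema, score); score is the fraction of non-empty
    lines that decode to a JSON object.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    rows = []
    for ln in lines:
        try:
            obj = json.loads(ln)
        except json.JSONDecodeError:
            continue
        if isinstance(obj, dict):
            rows.append(obj)
    schema = sorted({key for row in rows for key in row})
    score = len(rows) / len(lines) if lines else 0.0
    return schema, score


def parse_csv(text):
    """Hypothetical parser: expects comma-separated rows with a header.

    Returns (schema, score); score is the fraction of data rows whose
    field count matches the header.
    """
    rows = [r for r in csv.reader(io.StringIO(text)) if r]
    if len(rows) < 2 or len(rows[0]) < 2:
        return [], 0.0
    header = rows[0]
    consistent = sum(1 for r in rows[1:] if len(r) == len(header))
    return header, consistent / (len(rows) - 1)


# Assumed registry of candidate parsers, keyed by format name.
PARSERS = {"jsonl": parse_json_lines, "csv": parse_csv}


def discover_format(text, parsers=PARSERS):
    """Run every parser, then pick the one with the highest score."""
    results = {name: fn(text) for name, fn in parsers.items()}
    best = max(results, key=lambda name: results[name][1])
    return best, results[best][0]
```

For example, `discover_format("id,name\n1,a\n2,b\n")` selects the `csv` parser, because no line of that input decodes as a JSON object. A real system would likely use richer scores and a tie-breaking rule; here ties fall to whichever parser is registered first.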
-
Publication number: 20190384745
Abstract: In system for analyzing large data sets, document/file format can be discovered by attempting to parse the file using several parsers to generate a schema, assigning a score to each parsing, and selecting a parser based on the assigned scores. Schema element attributes, such as statistical parameters, can be derived and used in identifying schema elements associated with other files. Attributes of identified schema elements can be used to substitute missing data values with values based on such attributes. Data values corresponding schema elements can be selected and highlighted, and schema elements and/or attributes thereof can be highlighted based on selected data values. From a cluster of files, a lineage relationship between file pairs, indicating whether one file is derived from another, can be determined for several files. In reducing/compacting data, utilization of all available reducers can be optimized according to current utilization of one or more reducers.
Type: Application
Filed: May 17, 2019
Publication date: December 19, 2019
Inventor: Alexander Gorelik
-
Patent number: 10346358
Abstract: In system for analyzing large data sets, document/file format can be discovered by attempting to parse the file using several parsers to generate a schema, assigning a score to each parsing, and selecting a parser based on the assigned scores. Schema element attributes, such as statistical parameters, can be derived and used in identifying schema elements associated with other files. Attributes of identified schema elements can be used to substitute missing data values with values based on such attributes. Data values corresponding schema elements can be selected and highlighted, and schema elements and/or attributes thereof can be highlighted based on selected data values. From a cluster of files, a lineage relationship between file pairs, indicating whether one file is derived from another, can be determined for several files. In reducing/compacting data, utilization of all available reducers can be optimized according to current utilization of one or more reducers.
Type: Grant
Filed: June 4, 2014
Date of Patent: July 9, 2019
Assignee: Waterline Data Science, Inc.
Inventor: Alexander Gorelik
-
Patent number: 10242016
Abstract: In system for analyzing large data sets, document/file format can be discovered by attempting to parse the file using several parsers to generate a schema, assigning a score to each parsing, and selecting a parser based on the assigned scores. Schema element attributes, such as statistical parameters, can be derived and used in identifying schema elements associated with other files. Attributes of identified schema elements can be used to substitute missing data values with values based on such attributes. Data values corresponding schema elements can be selected and highlighted, and schema elements and/or attributes thereof can be highlighted based on selected data values. From a cluster of files, a lineage relationship between file pairs, indicating whether one file is derived from another, can be determined for several files. In reducing/compacting data, utilization of all available reducers can be optimized according to current utilization of one or more reducers.
Type: Grant
Filed: November 14, 2016
Date of Patent: March 26, 2019
Assignee: Waterline Data Science, Inc.
Inventor: Alexander Gorelik
-
Patent number: 10198460
Abstract: In system for analyzing large data sets, document/file format can be discovered by attempting to parse the file using several parsers to generate a schema, assigning a score to each parsing, and selecting a parser based on the assigned scores. Schema element attributes, such as statistical parameters, can be derived and used in identifying schema elements associated with other files. Attributes of identified schema elements can be used to substitute missing data values with values based on such attributes. Data values corresponding schema elements can be selected and highlighted, and schema elements and/or attributes thereof can be highlighted based on selected data values. From a cluster of files, a lineage relationship between file pairs, indicating whether one file is derived from another, can be determined for several files. In reducing/compacting data, utilization of all available reducers can be optimized according to current utilization of one or more reducers.
Type: Grant
Filed: June 4, 2014
Date of Patent: February 5, 2019
Assignee: Waterline Data Science, Inc.
Inventor: Alexander Gorelik
-
Patent number: 9659072
Abstract: A computer software implemented method for transforming a first extract transform load (ETL) job having at least some unload transform load (UTL) portions. The method includes the following steps: (i) decomposing the first ETL job into an intermediate set of one or more jobs; and (ii) for each job of the intermediate set, transforming the job into a transactionally equivalent job to yield a final set of one or more jobs. The decomposing is performed so that each job of the intermediate jobs set is a Simple UTL job. The transforming is performed so that each job of the final set includes no UTL portions.
Type: Grant
Filed: January 29, 2016
Date of Patent: May 23, 2017
Assignee: International Business Machines Corporation
Inventors: Alexander Gorelik, Sriram K. Padmanabhan, James D. Spyker
-
Publication number: 20160147851
Abstract: A computer software implemented method for transforming a first extract transform load (ETL) job having at least some unload transform load (UTL) portions. The method includes the following steps: (i) decomposing the first ETL job into an intermediate set of one or more jobs; and (ii) for each job of the intermediate set, transforming the job into a transactionally equivalent job to yield a final set of one or more jobs. The decomposing is performed so that each job of the intermediate jobs set is a Simple UTL job. The transforming is performed so that each job of the final set includes no UTL portions.
Type: Application
Filed: January 29, 2016
Publication date: May 26, 2016
Inventors: Alexander Gorelik, Sriram K. Padmanabhan, James D. Spyker
-
Patent number: 9336253
Abstract: An apparatus and method are described for the discovery of semantics, relationships and mappings between data in different software applications, databases, files, reports, messages, or systems. In one aspect, semantics and relationships and mappings are identified between a first and a second data source. A binding condition is discovered between portions of data in the first and the second data source. The binding condition is used to discover correlations between portions of data in the first and the second data source. The binding condition and the correlations are used to discover a transformation function between portions of data in the first and the second data source.
Type: Grant
Filed: October 6, 2014
Date of Patent: May 10, 2016
Assignee: International Business Machines Corporation
Inventors: Alexander Gorelik, Lingling Yan
-
Patent number: 9336246
Abstract: According to one embodiment of the present invention, a system determines key relationships between database tables and includes a computer system including at least one processor. The system determines a sampling range for one or more matching columns between first and second database tables. The matching columns satisfy one or more matching criteria and the sampling range is based on quantities of distinct values within the matching columns. Data is sampled from the first and second database tables in accordance with the sampling ranges to determine a sample set. Keys between the first and second database tables are determined based on matching between columns within the sample set. Embodiments of the present invention further include a method and computer program product for determining key relationships between database tables in substantially the same manner described above.
Type: Grant
Filed: February 28, 2012
Date of Patent: May 10, 2016
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Alexander Gorelik, Sharad Santhanam, Lev M. Tsentsiper
-
Patent number: 9305067
Abstract: A computer software implemented method for transforming a first extract transform load (ETL) job having at least some unload transform load (UTL) portions. The method includes the following steps: (i) decomposing the first ETL job into an intermediate set of one or more jobs; and (ii) for each job of the intermediate set, transforming the job into a transactionally equivalent job to yield a final set of one or more jobs. The decomposing is performed so that each job of the intermediate jobs set is a Simple UTL job. The transforming is performed so that each job of the final set includes no UTL portions.
Type: Grant
Filed: July 19, 2013
Date of Patent: April 5, 2016
Assignee: International Business Machines Corporation
Inventors: Alexander Gorelik, Sriram K. Padmanabhan, James D. Spyker
-
Patent number: 9239853
Abstract: A method, computer program product, and system for analyzing data within one or more databases, comprising selecting one or more databases for analysis, each database comprising one or more database objects comprising one or more data values, applying a function to each data value in each database object within the one or more databases, where the function produces function values limited to a predetermined range, identifying for analysis the data values producing a certain function value within the predetermined range to form a sampled data set, and analyzing the sampled data set to determine relationships between the database objects within and across the one or more databases.
Type: Grant
Filed: September 18, 2014
Date of Patent: January 19, 2016
Assignee: International Business Machines Corporation
Inventor: Alexander Gorelik
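The sampling step in the abstract above (apply a function with a bounded range to every data value, keep the values that produce one chosen function value) resembles hash-bucket sampling, and a minimal sketch under that reading follows. It is an illustration, not the patented method: the choice of SHA-256, the bucket count, and the names `bucket`, `sample_column`, and `sampled_overlap` are all assumptions.

```python
import hashlib


def bucket(value, num_buckets=16):
    """Map a value to an integer in range(num_buckets).

    Assumption: a SHA-256 digest of the value's string form stands in
    for the abstract's "function ... limited to a predetermined range".
    """
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def sample_column(values, num_buckets=16, keep_bucket=0):
    """Keep only the distinct values whose bucket equals keep_bucket."""
    return {v for v in values if bucket(v, num_buckets) == keep_bucket}


def sampled_overlap(col_a, col_b, num_buckets=16, keep_bucket=0):
    """Estimate how much of col_a's distinct values also occur in col_b,
    using only the sampled values from each column."""
    a = sample_column(col_a, num_buckets, keep_bucket)
    b = sample_column(col_b, num_buckets, keep_bucket)
    if not a:
        return 0.0
    return len(a & b) / len(a)
```

Because the function depends only on the value, a given value is either sampled in every column or in none. Matching values across tables therefore survive sampling together, which is what allows relationships between columns to be estimated from the sample alone.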
-
Publication number: 20150074117
Abstract: An apparatus and method are described for the discovery of semantics, relationships and mappings between data in different software applications, databases, files, reports, messages, or systems. In one aspect, semantics and relationships and mappings are identified between a first and a second data source. A binding condition is discovered between portions of data in the first and the second data source. The binding condition is used to discover correlations between portions of data in the first and the second data source. The binding condition and the correlations are used to discover a transformation function between portions of data in the first and the second data source.
Type: Application
Filed: October 6, 2014
Publication date: March 12, 2015
Inventors: Alexander Gorelik, Lingling Yan
-
Publication number: 20150026115
Abstract: A computer software implemented method for transforming a first extract transform load (ETL) job having at least some unload transform load (UTL) portions. The method includes the following steps: (i) decomposing the first ETL job into an intermediate set of one or more jobs; and (ii) for each job of the intermediate set, transforming the job into a transactionally equivalent job to yield a final set of one or more jobs. The decomposing is performed so that each job of the intermediate jobs set is a Simple UTL job. The transforming is performed so that each job of the final set includes no UTL portions.
Type: Application
Filed: July 19, 2013
Publication date: January 22, 2015
Inventors: Alexander Gorelik, Sriram K. Padmanabhan, James D. Spyker
-
Patent number: 8930303
Abstract: According to a present invention embodiment, a system determines a relationship between source and target database tables, and includes a computer system including at least one processor. Potential pivot keys of the target database table are determined, and maps are created for each potential pivot key between the database tables based on distinct values. Transformations for each map are generated that enable target data to be produced from source data. The transformations for each potential pivot key are analyzed and the potential pivot key with the transformations that generate the greatest amount of matching data is selected as the resulting pivot key. The database table columns corresponding to the resulting pivot key are determined to be associated by the relationship. Embodiments of the present invention further include a method and computer program product for determining a relationship between source and target database tables in substantially the same manner described above.
Type: Grant
Filed: March 30, 2012
Date of Patent: January 6, 2015
Assignee: International Business Machines Corporation
Inventors: Leon Burda, Salil Datta, Alexander Gorelik, Dongmei Ren, Lev M. Tsentsiper
-
Publication number: 20150006542
Abstract: A method, computer program product, and system for analyzing data within one or more databases, comprising selecting one or more databases for analysis, each database comprising one or more database objects comprising one or more data values, applying a function to each data value in each database object within the one or more databases, where the function produces function values limited to a predetermined range, identifying for analysis the data values producing a certain function value within the predetermined range to form a sampled data set, and analyzing the sampled data set to determine relationships between the database objects within and across the one or more databases.
Type: Application
Filed: September 18, 2014
Publication date: January 1, 2015
Inventor: Alexander Gorelik
-
Patent number: 8898194
Abstract: A data source is accessed to provide information.
Type: Grant
Filed: September 14, 2012
Date of Patent: November 25, 2014
Assignee: International Business Machines Corporation
Inventor: Alexander Gorelik
-
Patent number: 8892525
Abstract: A method, computer program product, and system for analyzing data within one or more databases, comprising selecting one or more databases for analysis, each database comprising one or more database objects comprising one or more data values, applying a function to each data value in each database object within the one or more databases, where the function produces function values limited to a predetermined range, identifying for analysis the data values producing a certain function value within the predetermined range to form a sampled data set, and analyzing the sampled data set to determine relationships between the database objects within and across the one or more databases.
Type: Grant
Filed: September 6, 2013
Date of Patent: November 18, 2014
Assignee: International Business Machines Corporation
Inventor: Alexander Gorelik
-
Patent number: 8874613
Abstract: An apparatus and method are described for the discovery of semantics, relationships and mappings between data in different software applications, databases, files, reports, messages, or systems. In one aspect, semantics and relationships and mappings are identified between a first and a second data source. A binding condition is discovered between portions of data in the first and the second data source. The binding condition is used to discover correlations between portions of data in the first and the second data source. The binding condition and the correlations are used to discover a transformation function between portions of data in the first and the second data source.
Type: Grant
Filed: May 9, 2013
Date of Patent: October 28, 2014
Assignee: International Business Machines Corporation
Inventors: Alexander Gorelik, Lingling Yan
-
Patent number: 8856085
Abstract: A method, computer program product, and system for analyzing data within one or more databases, comprising selecting one or more databases for analysis, each database comprising one or more database objects comprising one or more data values, applying a function to each data value in each database object within the one or more databases, where the function produces function values limited to a predetermined range, identifying for analysis the data values producing a certain function value within the predetermined range to form a sampled data set, and analyzing the sampled data set to determine relationships between the database objects within and across the one or more databases.
Type: Grant
Filed: July 19, 2011
Date of Patent: October 7, 2014
Assignee: International Business Machines Corporation
Inventor: Alexander Gorelik
-
Publication number: 20140100879
Abstract: The invention disclosed herein overcomes obstacles in providing connectivity to personal medical meters and mobile computing devices. Systems, methods, and devices of the invention allow subjects to connect personal medical monitors/meters to the cloud through mobile computing devices. The invention further allows a user or a subject remote access to medical meter data using a mobile computing device on demand.
Type: Application
Filed: October 7, 2013
Publication date: April 10, 2014
Applicant: Infometers, Inc.
Inventors: Akhsar Kharebov, Alan Kharebov, Alexander Gorelik