Patents by Inventor Alexander Gorelik

Alexander Gorelik has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Systems and methods for management of data platforms

Patent number: 11281626

Abstract: In a system for analyzing large data sets, document/file format can be discovered by attempting to parse the file using several parsers to generate a schema, assigning a score to each parsing, and selecting a parser based on the assigned scores. Schema element attributes, such as statistical parameters, can be derived and used in identifying schema elements associated with other files. Attributes of identified schema elements can be used to substitute missing data values with values based on such attributes. Data values corresponding to schema elements can be selected and highlighted, and schema elements and/or attributes thereof can be highlighted based on selected data values. From a cluster of files, a lineage relationship between file pairs, indicating whether one file is derived from another, can be determined for several files. In reducing/compacting data, utilization of all available reducers can be optimized according to current utilization of one or more reducers.

Type: Grant

Filed: May 17, 2019

Date of Patent: March 22, 2022

Assignee: HITACHI VANTARA LLC

Inventor: Alexander Gorelik
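
The format-discovery step in the abstract above (try several parsers, score each parsing, select a parser by score) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the patent: the three toy parsers, the scoring rule (the fraction of lines that parse consistently), and the names `PARSERS` and `discover_format` are all hypothetical.

```python
import csv
import io
import json


def parse_json_lines(text):
    """Hypothetical parser: treat each non-empty line as one JSON object."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return [], 0.0
    keys, ok = [], 0
    for ln in lines:
        try:
            obj = json.loads(ln)
        except ValueError:  # json.JSONDecodeError is a subclass of ValueError
            continue
        if isinstance(obj, dict):
            ok += 1
            for k in obj:
                if k not in keys:
                    keys.append(k)
    # Score (assumed rule): fraction of lines that are valid JSON objects.
    return keys, ok / len(lines)


def make_delimited_parser(delimiter):
    """Hypothetical parser factory for delimiter-separated text."""
    def parse(text):
        rows = [r for r in csv.reader(io.StringIO(text), delimiter=delimiter) if r]
        if not rows or len(rows[0]) < 2:
            return [], 0.0
        width = len(rows[0])
        consistent = sum(1 for r in rows if len(r) == width)
        # Schema is taken from the first row; score (assumed rule) is the
        # fraction of rows with the same number of fields as the first row.
        return rows[0], consistent / len(rows)
    return parse


PARSERS = {
    "json_lines": parse_json_lines,
    "csv": make_delimited_parser(","),
    "tsv": make_delimited_parser("\t"),
}


def discover_format(text):
    """Run every parser, then select the one with the highest score."""
    results = {name: parser(text) for name, parser in PARSERS.items()}
    best = max(results, key=lambda name: results[name][1])
    schema, score = results[best]
    return best, schema, score
```

Calling `discover_format("id,name,city\n1,Ann,Oslo\n2,Bob,Rome\n")` selects the `csv` parser, since the other two toy parsers score zero on that input. A real system would presumably use richer scoring than row-width consistency; this sketch only shows the try, score, and select shape.
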
Systems and Methods for Management of Data Platforms

Publication number: 20190384745

Abstract: In a system for analyzing large data sets, document/file format can be discovered by attempting to parse the file using several parsers to generate a schema, assigning a score to each parsing, and selecting a parser based on the assigned scores. Schema element attributes, such as statistical parameters, can be derived and used in identifying schema elements associated with other files. Attributes of identified schema elements can be used to substitute missing data values with values based on such attributes. Data values corresponding to schema elements can be selected and highlighted, and schema elements and/or attributes thereof can be highlighted based on selected data values. From a cluster of files, a lineage relationship between file pairs, indicating whether one file is derived from another, can be determined for several files. In reducing/compacting data, utilization of all available reducers can be optimized according to current utilization of one or more reducers.

Type: Application

Filed: May 17, 2019

Publication date: December 19, 2019

Inventor: Alexander Gorelik

Systems and methods for management of data platforms

Patent number: 10346358

Abstract: In a system for analyzing large data sets, document/file format can be discovered by attempting to parse the file using several parsers to generate a schema, assigning a score to each parsing, and selecting a parser based on the assigned scores. Schema element attributes, such as statistical parameters, can be derived and used in identifying schema elements associated with other files. Attributes of identified schema elements can be used to substitute missing data values with values based on such attributes. Data values corresponding to schema elements can be selected and highlighted, and schema elements and/or attributes thereof can be highlighted based on selected data values. From a cluster of files, a lineage relationship between file pairs, indicating whether one file is derived from another, can be determined for several files. In reducing/compacting data, utilization of all available reducers can be optimized according to current utilization of one or more reducers.

Type: Grant

Filed: June 4, 2014

Date of Patent: July 9, 2019

Assignee: Waterline Data Science, Inc.

Inventor: Alexander Gorelik

Systems and methods for management of data platforms

Patent number: 10242016

Abstract: In a system for analyzing large data sets, document/file format can be discovered by attempting to parse the file using several parsers to generate a schema, assigning a score to each parsing, and selecting a parser based on the assigned scores. Schema element attributes, such as statistical parameters, can be derived and used in identifying schema elements associated with other files. Attributes of identified schema elements can be used to substitute missing data values with values based on such attributes. Data values corresponding to schema elements can be selected and highlighted, and schema elements and/or attributes thereof can be highlighted based on selected data values. From a cluster of files, a lineage relationship between file pairs, indicating whether one file is derived from another, can be determined for several files. In reducing/compacting data, utilization of all available reducers can be optimized according to current utilization of one or more reducers.

Type: Grant

Filed: November 14, 2016

Date of Patent: March 26, 2019

Assignee: Waterline Data Science, Inc.

Inventor: Alexander Gorelik

Systems and methods for management of data platforms

Patent number: 10198460

Abstract: In a system for analyzing large data sets, document/file format can be discovered by attempting to parse the file using several parsers to generate a schema, assigning a score to each parsing, and selecting a parser based on the assigned scores. Schema element attributes, such as statistical parameters, can be derived and used in identifying schema elements associated with other files. Attributes of identified schema elements can be used to substitute missing data values with values based on such attributes. Data values corresponding to schema elements can be selected and highlighted, and schema elements and/or attributes thereof can be highlighted based on selected data values. From a cluster of files, a lineage relationship between file pairs, indicating whether one file is derived from another, can be determined for several files. In reducing/compacting data, utilization of all available reducers can be optimized according to current utilization of one or more reducers.

Type: Grant

Filed: June 4, 2014

Date of Patent: February 5, 2019

Assignee: Waterline Data Science, Inc.

Inventor: Alexander Gorelik

Creation of change-based data integration jobs

Patent number: 9659072

Abstract: A computer software implemented method for transforming a first extract transform load (ETL) job having at least some unload transform load (UTL) portions. The method includes the following steps: (i) decomposing the first ETL job into an intermediate set of one or more jobs; and (ii) for each job of the intermediate set, transforming the job into a transactionally equivalent job to yield a final set of one or more jobs. The decomposing is performed so that each job of the intermediate jobs set is a Simple UTL job. The transforming is performed so that each job of the final set includes no UTL portions.

Type: Grant

Filed: January 29, 2016

Date of Patent: May 23, 2017

Assignee: International Business Machines Corporation

Inventors: Alexander Gorelik, Sriram K. Padmanabhan, James D. Spyker

CREATION OF CHANGE-BASED DATA INTEGRATION JOBS

Publication number: 20160147851

Abstract: A computer software implemented method for transforming a first extract transform load (ETL) job having at least some unload transform load (UTL) portions. The method includes the following steps: (i) decomposing the first ETL job into an intermediate set of one or more jobs; and (ii) for each job of the intermediate set, transforming the job into a transactionally equivalent job to yield a final set of one or more jobs. The decomposing is performed so that each job of the intermediate jobs set is a Simple UTL job. The transforming is performed so that each job of the final set includes no UTL portions.

Type: Application

Filed: January 29, 2016

Publication date: May 26, 2016

Inventors: Alexander Gorelik, Sriram K. Padmanabhan, James D. Spyker

Semantic discovery and mapping between data sources

Patent number: 9336253

Abstract: An apparatus and method are described for the discovery of semantics, relationships and mappings between data in different software applications, databases, files, reports, messages, or systems. In one aspect, semantics and relationships and mappings are identified between a first and a second data source. A binding condition is discovered between portions of data in the first and the second data source. The binding condition is used to discover correlations between portions of data in the first and the second data source. The binding condition and the correlations are used to discover a transformation function between portions of data in the first and the second data source.

Type: Grant

Filed: October 6, 2014

Date of Patent: May 10, 2016

Assignee: International Business Machines Corporation

Inventors: Alexander Gorelik, Lingling Yan

Generating composite key relationships between database objects based on sampling

Patent number: 9336246

Abstract: According to one embodiment of the present invention, a system determines key relationships between database tables and includes a computer system including at least one processor. The system determines a sampling range for one or more matching columns between first and second database tables. The matching columns satisfy one or more matching criteria and the sampling range is based on quantities of distinct values within the matching columns. Data is sampled from the first and second database tables in accordance with the sampling ranges to determine a sample set. Keys between the first and second database tables are determined based on matching between columns within the sample set. Embodiments of the present invention further include a method and computer program product for determining key relationships between database tables in substantially the same manner described above.

Type: Grant

Filed: February 28, 2012

Date of Patent: May 10, 2016

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Alexander Gorelik, Sharad Santhanam, Lev M. Tsentsiper

Creation of change-based data integration jobs

Patent number: 9305067

Abstract: A computer software implemented method for transforming a first extract transform load (ETL) job having at least some unload transform load (UTL) portions. The method includes the following steps: (i) decomposing the first ETL job into an intermediate set of one or more jobs; and (ii) for each job of the intermediate set, transforming the job into a transactionally equivalent job to yield a final set of one or more jobs. The decomposing is performed so that each job of the intermediate jobs set is a Simple UTL job. The transforming is performed so that each job of the final set includes no UTL portions.

Type: Grant

Filed: July 19, 2013

Date of Patent: April 5, 2016

Assignee: International Business Machines Corporation

Inventors: Alexander Gorelik, Sriram K. Padmanabhan, James D. Spyker

Automatic consistent sampling for data analysis

Patent number: 9239853

Abstract: A method, computer program product, and system for analyzing data within one or more databases, comprising selecting one or more databases for analysis, each database comprising one or more database objects comprising one or more data values, applying a function to each data value in each database object within the one or more databases, where the function produces function values limited to a predetermined range, identifying for analysis the data values producing a certain function value within the predetermined range to form a sampled data set, and analyzing the sampled data set to determine relationships between the database objects within and across the one or more databases.

Type: Grant

Filed: September 18, 2014

Date of Patent: January 19, 2016

Assignee: International Business Machines Corporation

Inventor: Alexander Gorelik
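
The sampling idea in the abstract above (apply a function with a bounded output range to every data value, then keep only the values that produce a chosen function value) can be sketched as follows. This is an illustration under stated assumptions, not the patented method: the use of SHA-256 as the function, the range size `BUCKETS`, the chosen bucket `0`, and the helper names are all hypothetical.

```python
import hashlib

BUCKETS = 10  # assumed size of the predetermined range: values fall in 0..9


def bucket(value):
    """Map a data value to an integer in range(BUCKETS), deterministically.

    Values are compared by their string form, so 5 and "5" land in the
    same bucket (an assumption of this sketch).
    """
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % BUCKETS


def consistent_sample(values, selected_bucket=0):
    """Keep only the distinct values whose function value equals selected_bucket."""
    return {v for v in values if bucket(v) == selected_bucket}


def sampled_containment(col_a, col_b, selected_bucket=0):
    """Fraction of sampled values of col_a that also appear in the sample of col_b."""
    a = consistent_sample(col_a, selected_bucket)
    b = consistent_sample(col_b, selected_bucket)
    if not a:
        return 0.0
    return len(a & b) / len(a)
```

Because the same function is applied to every column, a value that is sampled in one table is sampled wherever else it occurs. Relationships such as "every `orders.customer_id` appears in `customers.id`" therefore remain visible in the sampled data, which independent random sampling of each table would not guarantee.
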
SEMANTIC DISCOVERY AND MAPPING BETWEEN DATA SOURCES

Publication number: 20150074117

Abstract: An apparatus and method are described for the discovery of semantics, relationships and mappings between data in different software applications, databases, files, reports, messages, or systems. In one aspect, semantics and relationships and mappings are identified between a first and a second data source. A binding condition is discovered between portions of data in the first and the second data source. The binding condition is used to discover correlations between portions of data in the first and the second data source. The binding condition and the correlations are used to discover a transformation function between portions of data in the first and the second data source.

Type: Application

Filed: October 6, 2014

Publication date: March 12, 2015

Inventors: Alexander Gorelik, Lingling Yan

CREATION OF CHANGE-BASED DATA INTEGRATION JOBS

Publication number: 20150026115

Abstract: A computer software implemented method for transforming a first extract transform load (ETL) job having at least some unload transform load (UTL) portions. The method includes the following steps: (i) decomposing the first ETL job into an intermediate set of one or more jobs; and (ii) for each job of the intermediate set, transforming the job into a transactionally equivalent job to yield a final set of one or more jobs. The decomposing is performed so that each job of the intermediate jobs set is a Simple UTL job. The transforming is performed so that each job of the final set includes no UTL portions.

Type: Application

Filed: July 19, 2013

Publication date: January 22, 2015

Inventors: Alexander Gorelik, Sriram K. Padmanabhan, James D. Spyker

Discovering pivot type relationships between database objects

Patent number: 8930303

Abstract: According to a present invention embodiment, a system determines a relationship between source and target database tables, and includes a computer system including at least one processor. Potential pivot keys of the target database table are determined, and maps are created for each potential pivot key between the database tables based on distinct values. Transformations for each map are generated that enable target data to be produced from source data. The transformations for each potential pivot key are analyzed and the potential pivot key with the transformations that generate the greatest amount of matching data is selected as the resulting pivot key. The database table columns corresponding to the resulting pivot key are determined to be associated by the relationship. Embodiments of the present invention further include a method and computer program product for determining a relationship between source and target database tables in substantially the same manner described above.

Type: Grant

Filed: March 30, 2012

Date of Patent: January 6, 2015

Assignee: International Business Machines Corporation

Inventors: Leon Burda, Salil Datta, Alexander Gorelik, Dongmei Ren, Lev M. Tsentsiper

AUTOMATIC CONSISTENT SAMPLING FOR DATA ANALYSIS

Publication number: 20150006542

Abstract: A method, computer program product, and system for analyzing data within one or more databases, comprising selecting one or more databases for analysis, each database comprising one or more database objects comprising one or more data values, applying a function to each data value in each database object within the one or more databases, where the function produces function values limited to a predetermined range, identifying for analysis the data values producing a certain function value within the predetermined range to form a sampled data set, and analyzing the sampled data set to determine relationships between the database objects within and across the one or more databases.

Type: Application

Filed: September 18, 2014

Publication date: January 1, 2015

Inventor: Alexander Gorelik

Searching and displaying data objects residing in data management systems

Patent number: 8898194

Abstract: A data source is accessed to provide information.

Type: Grant

Filed: September 14, 2012

Date of Patent: November 25, 2014

Assignee: International Business Machines Corporation

Inventor: Alexander Gorelik

Automatic consistent sampling for data analysis

Patent number: 8892525

Abstract: A method, computer program product, and system for analyzing data within one or more databases, comprising selecting one or more databases for analysis, each database comprising one or more database objects comprising one or more data values, applying a function to each data value in each database object within the one or more databases, where the function produces function values limited to a predetermined range, identifying for analysis the data values producing a certain function value within the predetermined range to form a sampled data set, and analyzing the sampled data set to determine relationships between the database objects within and across the one or more databases.

Type: Grant

Filed: September 6, 2013

Date of Patent: November 18, 2014

Assignee: International Business Machines Corporation

Inventor: Alexander Gorelik

Semantic discovery and mapping between data sources

Patent number: 8874613

Abstract: An apparatus and method are described for the discovery of semantics, relationships and mappings between data in different software applications, databases, files, reports, messages, or systems. In one aspect, semantics and relationships and mappings are identified between a first and a second data source. A binding condition is discovered between portions of data in the first and the second data source. The binding condition is used to discover correlations between portions of data in the first and the second data source. The binding condition and the correlations are used to discover a transformation function between portions of data in the first and the second data source.

Type: Grant

Filed: May 9, 2013

Date of Patent: October 28, 2014

Assignee: International Business Machines Corporation

Inventors: Alexander Gorelik, Lingling Yan

Automatic consistent sampling for data analysis

Patent number: 8856085

Abstract: A method, computer program product, and system for analyzing data within one or more databases, comprising selecting one or more databases for analysis, each database comprising one or more database objects comprising one or more data values, applying a function to each data value in each database object within the one or more databases, where the function produces function values limited to a predetermined range, identifying for analysis the data values producing a certain function value within the predetermined range to form a sampled data set, and analyzing the sampled data set to determine relationships between the database objects within and across the one or more databases.

Type: Grant

Filed: July 19, 2011

Date of Patent: October 7, 2014

Assignee: International Business Machines Corporation

Inventor: Alexander Gorelik

Systems and Methods for Device and Meter Monitoring

Publication number: 20140100879

Abstract: The invention disclosed herein overcomes obstacles in providing connectivity to personal medical meters and mobile computing devices. Systems, methods, and devices of the invention allow subjects to connect personal medical monitors/meters to the cloud through mobile computing devices. The invention further allows a user or a subject remote access to medical meter data using a mobile computing device on demand.

Type: Application

Filed: October 7, 2013

Publication date: April 10, 2014

Applicant: Infometers, Inc.

Inventors: Akhsar Kharebov, Alan Kharebov, Alexander Gorelik
