Patents by Inventor Alexander Gorelik

Alexander Gorelik has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11281626
    Abstract: In a system for analyzing large data sets, document/file format can be discovered by attempting to parse the file using several parsers to generate a schema, assigning a score to each parsing, and selecting a parser based on the assigned scores. Schema element attributes, such as statistical parameters, can be derived and used in identifying schema elements associated with other files. Attributes of identified schema elements can be used to substitute missing data values with values based on such attributes. Data values corresponding to schema elements can be selected and highlighted, and schema elements and/or attributes thereof can be highlighted based on selected data values. From a cluster of files, a lineage relationship between file pairs, indicating whether one file is derived from another, can be determined for several files. In reducing/compacting data, utilization of all available reducers can be optimized according to current utilization of one or more reducers.
    Type: Grant
    Filed: May 17, 2019
    Date of Patent: March 22, 2022
    Assignee: Hitachi Vantara LLC
    Inventor: Alexander Gorelik
  • Publication number: 20190384745
    Abstract: In a system for analyzing large data sets, document/file format can be discovered by attempting to parse the file using several parsers to generate a schema, assigning a score to each parsing, and selecting a parser based on the assigned scores. Schema element attributes, such as statistical parameters, can be derived and used in identifying schema elements associated with other files. Attributes of identified schema elements can be used to substitute missing data values with values based on such attributes. Data values corresponding to schema elements can be selected and highlighted, and schema elements and/or attributes thereof can be highlighted based on selected data values. From a cluster of files, a lineage relationship between file pairs, indicating whether one file is derived from another, can be determined for several files. In reducing/compacting data, utilization of all available reducers can be optimized according to current utilization of one or more reducers.
    Type: Application
    Filed: May 17, 2019
    Publication date: December 19, 2019
    Inventor: Alexander Gorelik
  • Patent number: 10346358
    Abstract: In a system for analyzing large data sets, document/file format can be discovered by attempting to parse the file using several parsers to generate a schema, assigning a score to each parsing, and selecting a parser based on the assigned scores. Schema element attributes, such as statistical parameters, can be derived and used in identifying schema elements associated with other files. Attributes of identified schema elements can be used to substitute missing data values with values based on such attributes. Data values corresponding to schema elements can be selected and highlighted, and schema elements and/or attributes thereof can be highlighted based on selected data values. From a cluster of files, a lineage relationship between file pairs, indicating whether one file is derived from another, can be determined for several files. In reducing/compacting data, utilization of all available reducers can be optimized according to current utilization of one or more reducers.
    Type: Grant
    Filed: June 4, 2014
    Date of Patent: July 9, 2019
    Assignee: Waterline Data Science, Inc.
    Inventor: Alexander Gorelik
  • Patent number: 10242016
    Abstract: In a system for analyzing large data sets, document/file format can be discovered by attempting to parse the file using several parsers to generate a schema, assigning a score to each parsing, and selecting a parser based on the assigned scores. Schema element attributes, such as statistical parameters, can be derived and used in identifying schema elements associated with other files. Attributes of identified schema elements can be used to substitute missing data values with values based on such attributes. Data values corresponding to schema elements can be selected and highlighted, and schema elements and/or attributes thereof can be highlighted based on selected data values. From a cluster of files, a lineage relationship between file pairs, indicating whether one file is derived from another, can be determined for several files. In reducing/compacting data, utilization of all available reducers can be optimized according to current utilization of one or more reducers.
    Type: Grant
    Filed: November 14, 2016
    Date of Patent: March 26, 2019
    Assignee: Waterline Data Science, Inc.
    Inventor: Alexander Gorelik
  • Patent number: 10198460
    Abstract: In a system for analyzing large data sets, document/file format can be discovered by attempting to parse the file using several parsers to generate a schema, assigning a score to each parsing, and selecting a parser based on the assigned scores. Schema element attributes, such as statistical parameters, can be derived and used in identifying schema elements associated with other files. Attributes of identified schema elements can be used to substitute missing data values with values based on such attributes. Data values corresponding to schema elements can be selected and highlighted, and schema elements and/or attributes thereof can be highlighted based on selected data values. From a cluster of files, a lineage relationship between file pairs, indicating whether one file is derived from another, can be determined for several files. In reducing/compacting data, utilization of all available reducers can be optimized according to current utilization of one or more reducers.
    Type: Grant
    Filed: June 4, 2014
    Date of Patent: February 5, 2019
    Assignee: Waterline Data Science, Inc.
    Inventor: Alexander Gorelik
  • Patent number: 9659072
    Abstract: A computer software implemented method for transforming a first extract transform load (ETL) job having at least some unload transform load (UTL) portions. The method includes the following steps: (i) decomposing the first ETL job into an intermediate set of one or more jobs; and (ii) for each job of the intermediate set, transforming the job into a transactionally equivalent job to yield a final set of one or more jobs. The decomposing is performed so that each job of the intermediate jobs set is a Simple UTL job. The transforming is performed so that each job of the final set includes no UTL portions.
    Type: Grant
    Filed: January 29, 2016
    Date of Patent: May 23, 2017
    Assignee: International Business Machines Corporation
    Inventors: Alexander Gorelik, Sriram K. Padmanabhan, James D. Spyker
  • Publication number: 20160147851
    Abstract: A computer software implemented method for transforming a first extract transform load (ETL) job having at least some unload transform load (UTL) portions. The method includes the following steps: (i) decomposing the first ETL job into an intermediate set of one or more jobs; and (ii) for each job of the intermediate set, transforming the job into a transactionally equivalent job to yield a final set of one or more jobs. The decomposing is performed so that each job of the intermediate jobs set is a Simple UTL job. The transforming is performed so that each job of the final set includes no UTL portions.
    Type: Application
    Filed: January 29, 2016
    Publication date: May 26, 2016
    Inventors: Alexander Gorelik, Sriram K. Padmanabhan, James D. Spyker
  • Patent number: 9336253
    Abstract: An apparatus and method are described for the discovery of semantics, relationships and mappings between data in different software applications, databases, files, reports, messages, or systems. In one aspect, semantics and relationships and mappings are identified between a first and a second data source. A binding condition is discovered between portions of data in the first and the second data source. The binding condition is used to discover correlations between portions of data in the first and the second data source. The binding condition and the correlations are used to discover a transformation function between portions of data in the first and the second data source.
    Type: Grant
    Filed: October 6, 2014
    Date of Patent: May 10, 2016
    Assignee: International Business Machines Corporation
    Inventors: Alexander Gorelik, Lingling Yan
  • Patent number: 9336246
    Abstract: According to one embodiment of the present invention, a system determines key relationships between database tables and includes a computer system including at least one processor. The system determines a sampling range for one or more matching columns between first and second database tables. The matching columns satisfy one or more matching criteria and the sampling range is based on quantities of distinct values within the matching columns. Data is sampled from the first and second database tables in accordance with the sampling ranges to determine a sample set. Keys between the first and second database tables are determined based on matching between columns within the sample set. Embodiments of the present invention further include a method and computer program product for determining key relationships between database tables in substantially the same manner described above.
    Type: Grant
    Filed: February 28, 2012
    Date of Patent: May 10, 2016
    Assignee: International Business Machines Corporation
    Inventors: Alexander Gorelik, Sharad Santhanam, Lev M. Tsentsiper
  • Patent number: 9305067
    Abstract: A computer software implemented method for transforming a first extract transform load (ETL) job having at least some unload transform load (UTL) portions. The method includes the following steps: (i) decomposing the first ETL job into an intermediate set of one or more jobs; and (ii) for each job of the intermediate set, transforming the job into a transactionally equivalent job to yield a final set of one or more jobs. The decomposing is performed so that each job of the intermediate jobs set is a Simple UTL job. The transforming is performed so that each job of the final set includes no UTL portions.
    Type: Grant
    Filed: July 19, 2013
    Date of Patent: April 5, 2016
    Assignee: International Business Machines Corporation
    Inventors: Alexander Gorelik, Sriram K. Padmanabhan, James D. Spyker
  • Patent number: 9239853
    Abstract: A method, computer program product, and system for analyzing data within one or more databases, comprising selecting one or more databases for analysis, each database comprising one or more database objects comprising one or more data values, applying a function to each data value in each database object within the one or more databases, where the function produces function values limited to a predetermined range, identifying for analysis the data values producing a certain function value within the predetermined range to form a sampled data set, and analyzing the sampled data set to determine relationships between the database objects within and across the one or more databases.
    Type: Grant
    Filed: September 18, 2014
    Date of Patent: January 19, 2016
    Assignee: International Business Machines Corporation
    Inventor: Alexander Gorelik
  • Publication number: 20150074117
    Abstract: An apparatus and method are described for the discovery of semantics, relationships and mappings between data in different software applications, databases, files, reports, messages, or systems. In one aspect, semantics and relationships and mappings are identified between a first and a second data source. A binding condition is discovered between portions of data in the first and the second data source. The binding condition is used to discover correlations between portions of data in the first and the second data source. The binding condition and the correlations are used to discover a transformation function between portions of data in the first and the second data source.
    Type: Application
    Filed: October 6, 2014
    Publication date: March 12, 2015
    Inventors: Alexander Gorelik, Lingling Yan
  • Publication number: 20150026115
    Abstract: A computer software implemented method for transforming a first extract transform load (ETL) job having at least some unload transform load (UTL) portions. The method includes the following steps: (i) decomposing the first ETL job into an intermediate set of one or more jobs; and (ii) for each job of the intermediate set, transforming the job into a transactionally equivalent job to yield a final set of one or more jobs. The decomposing is performed so that each job of the intermediate jobs set is a Simple UTL job. The transforming is performed so that each job of the final set includes no UTL portions.
    Type: Application
    Filed: July 19, 2013
    Publication date: January 22, 2015
    Inventors: Alexander Gorelik, Sriram K. Padmanabhan, James D. Spyker
  • Patent number: 8930303
    Abstract: According to a present invention embodiment, a system determines a relationship between source and target database tables, and includes a computer system including at least one processor. Potential pivot keys of the target database table are determined, and maps are created for each potential pivot key between the database tables based on distinct values. Transformations for each map are generated that enable target data to be produced from source data. The transformations for each potential pivot key are analyzed and the potential pivot key with the transformations that generate the greatest amount of matching data is selected as the resulting pivot key. The database table columns corresponding to the resulting pivot key are determined to be associated by the relationship. Embodiments of the present invention further include a method and computer program product for determining a relationship between source and target database tables in substantially the same manner described above.
    Type: Grant
    Filed: March 30, 2012
    Date of Patent: January 6, 2015
    Assignee: International Business Machines Corporation
    Inventors: Leon Burda, Salil Datta, Alexander Gorelik, Dongmei Ren, Lev M. Tsentsiper
  • Publication number: 20150006542
    Abstract: A method, computer program product, and system for analyzing data within one or more databases, comprising selecting one or more databases for analysis, each database comprising one or more database objects comprising one or more data values, applying a function to each data value in each database object within the one or more databases, where the function produces function values limited to a predetermined range, identifying for analysis the data values producing a certain function value within the predetermined range to form a sampled data set, and analyzing the sampled data set to determine relationships between the database objects within and across the one or more databases.
    Type: Application
    Filed: September 18, 2014
    Publication date: January 1, 2015
    Inventor: Alexander Gorelik
  • Patent number: 8898194
    Abstract: A data source is accessed to provide information.
    Type: Grant
    Filed: September 14, 2012
    Date of Patent: November 25, 2014
    Assignee: International Business Machines Corporation
    Inventor: Alexander Gorelik
  • Patent number: 8892525
    Abstract: A method, computer program product, and system for analyzing data within one or more databases, comprising selecting one or more databases for analysis, each database comprising one or more database objects comprising one or more data values, applying a function to each data value in each database object within the one or more databases, where the function produces function values limited to a predetermined range, identifying for analysis the data values producing a certain function value within the predetermined range to form a sampled data set, and analyzing the sampled data set to determine relationships between the database objects within and across the one or more databases.
    Type: Grant
    Filed: September 6, 2013
    Date of Patent: November 18, 2014
    Assignee: International Business Machines Corporation
    Inventor: Alexander Gorelik
  • Patent number: 8874613
    Abstract: An apparatus and method are described for the discovery of semantics, relationships and mappings between data in different software applications, databases, files, reports, messages, or systems. In one aspect, semantics and relationships and mappings are identified between a first and a second data source. A binding condition is discovered between portions of data in the first and the second data source. The binding condition is used to discover correlations between portions of data in the first and the second data source. The binding condition and the correlations are used to discover a transformation function between portions of data in the first and the second data source.
    Type: Grant
    Filed: May 9, 2013
    Date of Patent: October 28, 2014
    Assignee: International Business Machines Corporation
    Inventors: Alexander Gorelik, Lingling Yan
  • Patent number: 8856085
    Abstract: A method, computer program product, and system for analyzing data within one or more databases, comprising selecting one or more databases for analysis, each database comprising one or more database objects comprising one or more data values, applying a function to each data value in each database object within the one or more databases, where the function produces function values limited to a predetermined range, identifying for analysis the data values producing a certain function value within the predetermined range to form a sampled data set, and analyzing the sampled data set to determine relationships between the database objects within and across the one or more databases.
    Type: Grant
    Filed: July 19, 2011
    Date of Patent: October 7, 2014
    Assignee: International Business Machines Corporation
    Inventor: Alexander Gorelik
  • Publication number: 20140100879
    Abstract: The invention disclosed herein overcomes obstacles in providing connectivity to personal medical meters and mobile computing devices. Systems, methods, and devices of the invention allow subjects to connect personal medical monitors/meters to the cloud through mobile computing devices. The invention further allows a user or a subject remote access to medical meter data using a mobile computing device on demand.
    Type: Application
    Filed: October 7, 2013
    Publication date: April 10, 2014
    Applicant: Infometers, Inc.
    Inventors: Akhsar Kharebov, Alan Kharebov, Alexander Gorelik
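
Several of the abstracts above (for example patent number 9239853) describe sampling by applying a function with a bounded output range to every data value and keeping only the values that map to a chosen result. The Python sketch below illustrates one possible reading of that idea; it is not taken from the patents. The SHA-256 hash, the 16-bucket range, the overlap measure, and the names bucket, sample_column and sampled_overlap are all assumptions made for illustration.

```python
import hashlib


def bucket(value, num_buckets=16):
    """Map a value deterministically to an integer in [0, num_buckets).

    SHA-256 is an assumed choice; any stable function with a bounded
    output range would play the same role.
    """
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def sample_column(values, num_buckets=16, keep_bucket=0):
    """Keep only the distinct values whose function value equals keep_bucket."""
    return {v for v in values if bucket(v, num_buckets) == keep_bucket}


def sampled_overlap(col_a, col_b, num_buckets=16, keep_bucket=0):
    """Estimate how strongly two columns overlap using only the samples.

    Returns the shared fraction of the smaller sample, or 0.0 when either
    sample is empty. This measure is a hypothetical stand-in for the
    relationship analysis the abstract mentions.
    """
    a = sample_column(col_a, num_buckets, keep_bucket)
    b = sample_column(col_b, num_buckets, keep_bucket)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


if __name__ == "__main__":
    order_customer_ids = list(range(1000))   # hypothetical foreign-key column
    customer_ids = list(range(2000))         # hypothetical primary-key column
    unrelated_ids = list(range(5000, 6000))  # hypothetical unrelated column
    print(sampled_overlap(order_customer_ids, customer_ids))
    print(sampled_overlap(order_customer_ids, unrelated_ids))
```

Because the same value always maps to the same function value, a value kept in one column's sample is also kept in the sample of every other column that contains it, so the sampled sets can be compared directly without scanning the full columns against each other.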