Patents by Inventor Isil Pekel

Isil Pekel has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Runtime estimation for machine learning data processing pipeline

Patent number: 12299544

Abstract: Inputs may be received for constructing a data processing pipeline configured to implement a process to generate a machine learning model for performing a task associated with an input dataset. The process may include a plurality of machine learning trials, each of which applies, to a training dataset and/or a validation dataset generated based on the input dataset, a different type of machine learning model and/or a different set of trial parameters. The machine learning model may be generated based on a result of the plurality of machine learning trials. A runtime estimate for the process to generate the machine learning model may be determined. The runtime estimate may enable the allocation of a sufficient time budget for the process. Moreover, the process may be executed if the runtime of the process does not exceed the available time budget.

Type: Grant

Filed: September 24, 2020

Date of Patent: May 13, 2025

Assignee: SAP SE

Inventors: Steven Jaeger, Isil Pekel, Manuel Zeise
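
The budget check described in the abstract above can be illustrated with a minimal sketch. It assumes a simple linear cost model in which each trial's runtime scales with the number of rows; the abstract does not say how the estimate is computed, so `Trial`, `seconds_per_1k_rows`, `estimate_runtime`, and `should_execute` are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Trial:
    model_type: str
    seconds_per_1k_rows: float  # hypothetical per-model cost factor (assumption)


def estimate_runtime(trials: List[Trial], n_rows: int) -> float:
    """Sum the estimated runtime of all trials under the assumed linear model."""
    return sum(t.seconds_per_1k_rows * n_rows / 1000.0 for t in trials)


def should_execute(trials: List[Trial], n_rows: int, budget_seconds: float) -> bool:
    """Run the process only if the estimate fits within the time budget."""
    return estimate_runtime(trials, n_rows) <= budget_seconds
```

For example, two trials costing 2.0 and 0.5 seconds per thousand rows on a 10,000-row dataset would be estimated at 25 seconds in total, so the process would run under a 30-second budget but not under a 20-second one.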
Preparing data for machine learning processing

Patent number: 11886961

Abstract: Data for processing by a machine learning model may be prepared by encoding a first portion of the data including spatial data. The spatial data may include a spatial coordinate including one or more values identifying a geographical location. The encoding of the first portion of the data may include mapping, to a cell in a grid system, the spatial coordinate such that the spatial coordinate is represented by an identifier of the cell instead of the one or more values. The data may be further prepared by embedding a second portion of the data including textual data, preparing a third portion of the data including hierarchical data, and/or preparing a fourth portion of the data including numerical data. The machine learning model may be applied to the prepared data in order to train, validate, test, and/or deploy the machine learning model to perform a cognitive task.

Type: Grant

Filed: September 25, 2019

Date of Patent: January 30, 2024

Assignee: SAP SE

Inventors: Manuel Zeise, Isil Pekel, Steven Jaeger
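
The spatial encoding described in the abstract above, in which a coordinate is replaced by the identifier of its grid cell, can be sketched as follows. The abstract does not name a grid system, so this example assumes a plain fixed-size latitude/longitude grid; `cell_id` and `cell_size_deg` are hypothetical names.

```python
import math


def cell_id(lat: float, lon: float, cell_size_deg: float = 1.0) -> str:
    """Map a (lat, lon) coordinate to the identifier of its grid cell.

    Assumes a regular grid of square cells measured in degrees,
    with rows counted from the south pole and columns from -180 degrees.
    """
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("coordinate out of range")
    row = math.floor((lat + 90.0) / cell_size_deg)
    col = math.floor((lon + 180.0) / cell_size_deg)
    return f"r{row}_c{col}"
```

Nearby coordinates that fall in the same cell receive the same identifier, which turns a pair of continuous values into a single categorical feature.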
Optimizations for machine learning data processing pipeline

Patent number: 11797885

Abstract: A data processing pipeline may be generated to include an orchestrator node, a preparator node, and an executor node. The preparator node may generate a training dataset. The executor node may execute machine learning trials by applying, to the training dataset, a machine learning model and/or a different set of trial parameters. The orchestrator node may identify, based on a result of the machine learning trials, a machine learning model for performing a task. The execution of the data processing pipeline may be optimized. Examples of optimizations include pooling multiple machine learning trials for execution at a single executor node, executing at least some machine learning trials using a sub-sample of the training dataset, and adjusting a proportion of trial parameters sampled from a uniform distribution to avoid a premature convergence to a local minimum within the hyper-parameter space for generating the machine learning model.

Type: Grant

Filed: September 24, 2020

Date of Patent: October 24, 2023

Assignee: SAP SE

Inventors: Steven Jaeger, Isil Pekel, Manuel Zeise
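
The last optimization named in the abstract above, adjusting the proportion of trial parameters drawn from a uniform distribution, can be illustrated with a minimal sketch. It assumes that non-uniform samples are drawn from a Gaussian centered on the best parameters found so far; that choice, and the names `sample_trial_params`, `uniform_fraction`, `best`, and `bounds`, are assumptions for illustration and are not taken from the patent.

```python
import random
from typing import Dict, List, Tuple


def sample_trial_params(
    best: Dict[str, float],
    bounds: Dict[str, Tuple[float, float]],
    uniform_fraction: float,
    n_trials: int,
    rng: random.Random,
) -> List[Dict[str, float]]:
    """Mix uniform exploration with sampling near the current best point."""
    trials = []
    for _ in range(n_trials):
        if rng.random() < uniform_fraction:
            # Explore: sample each parameter uniformly over its full range.
            params = {k: rng.uniform(lo, hi) for k, (lo, hi) in bounds.items()}
        else:
            # Exploit (assumed strategy): perturb the best known parameters,
            # clamping the result to the allowed range.
            params = {}
            for k, (lo, hi) in bounds.items():
                value = rng.gauss(best[k], 0.1 * (hi - lo))
                params[k] = min(hi, max(lo, value))
        trials.append(params)
    return trials
```

Raising `uniform_fraction` spreads more trials across the whole hyper-parameter space, which is one way to reduce the risk of settling early on a local minimum.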
Hyper-parameter space optimization for machine learning data processing pipeline

Patent number: 11544136

Abstract: A data processing pipeline may be generated to include an orchestrator node, a preparator node, and an executor node. The preparator node may generate a training dataset. The executor node may execute machine learning trials by applying, to the training dataset, a machine learning model and/or a different set of trial parameters. The orchestrator node may identify, based on a result of the machine learning trials, a machine learning model for performing a task. Data associated with the execution of the data processing pipeline may be collected for storage in a tracking database. A report including de-normalized and enriched data from the tracking database may be generated. The hyper-parameter space of the machine learning model may be analyzed based on the report. A root cause of at least one fault associated with the execution of the data processing pipeline may be identified based on the analysis.

Type: Grant

Filed: August 5, 2021

Date of Patent: January 3, 2023

Assignee: SAP SE

Inventors: Isil Pekel, Steven Jaeger, Manuel Zeise
Machine learning data processing pipeline

Patent number: 11443234

Abstract: A user interface may be generated to receive inputs for constructing a data processing pipeline that includes an orchestrator node, a preparator node, and an executor node. The preparator node may generate a training dataset and a validation dataset for a machine learning model. The executor node may execute machine learning trials by applying, to the training dataset and the validation dataset, machine learning models having different sets of trial parameters. The orchestrator node may identify, based on a result of the machine learning trials, an optimal machine learning model for performing a task. The data processing pipeline may be adapted dynamically based on the input dataset and/or computational resource budget. The optimal machine learning model for performing the task may be generated by executing, based on the graph, the data processing pipeline including the orchestrator node, the preparator node, and the executor node.

Type: Grant

Filed: September 25, 2019

Date of Patent: September 13, 2022

Assignee: SAP SE

Inventors: Manuel Zeise, Isil Pekel, Steven Jaeger
OPTIMIZATIONS FOR MACHINE LEARNING DATA PROCESSING PIPELINE

Publication number: 20220092471

Abstract: A data processing pipeline may be generated to include an orchestrator node, a preparator node, and an executor node. The preparator node may generate a training dataset. The executor node may execute machine learning trials by applying, to the training dataset, a machine learning model and/or a different set of trial parameters. The orchestrator node may identify, based on a result of the machine learning trials, a machine learning model for performing a task. The execution of the data processing pipeline may be optimized. Examples of optimizations include pooling multiple machine learning trials for execution at a single executor node, executing at least some machine learning trials using a sub-sample of the training dataset, and adjusting a proportion of trial parameters sampled from a uniform distribution to avoid a premature convergence to a local minimum within the hyper-parameter space for generating the machine learning model.

Type: Application

Filed: September 24, 2020

Publication date: March 24, 2022

Inventors: Steven Jaeger, Isil Pekel, Manuel Zeise
RUNTIME ESTIMATION FOR MACHINE LEARNING DATA PROCESSING PIPELINE

Publication number: 20220092470

Abstract: Inputs may be received for constructing a data processing pipeline configured to implement a process to generate a machine learning model for performing a task associated with an input dataset. The process may include a plurality of machine learning trials, each of which applies, to a training dataset and/or a validation dataset generated based on the input dataset, a different type of machine learning model and/or a different set of trial parameters. The machine learning model may be generated based on a result of the plurality of machine learning trials. A runtime estimate for the process to generate the machine learning model may be determined. The runtime estimate may enable the allocation of a sufficient time budget for the process. Moreover, the process may be executed if the runtime of the process does not exceed the available time budget.

Type: Application

Filed: September 24, 2020

Publication date: March 24, 2022

Inventors: Steven Jaeger, Isil Pekel, Manuel Zeise
PREPARING DATA FOR MACHINE LEARNING PROCESSING

Publication number: 20210089970

Abstract: Data for processing by a machine learning model may be prepared by encoding a first portion of the data including spatial data. The spatial data may include a spatial coordinate including one or more values identifying a geographical location. The encoding of the first portion of the data may include mapping, to a cell in a grid system, the spatial coordinate such that the spatial coordinate is represented by an identifier of the cell instead of the one or more values. The data may be further prepared by embedding a second portion of the data including textual data, preparing a third portion of the data including hierarchical data, and/or preparing a fourth portion of the data including numerical data. The machine learning model may be applied to the prepared data in order to train, validate, test, and/or deploy the machine learning model to perform a cognitive task.

Type: Application

Filed: September 25, 2019

Publication date: March 25, 2021

Inventors: Manuel Zeise, Isil Pekel, Steven Jaeger
MACHINE LEARNING DATA PROCESSING PIPELINE

Publication number: 20210089961

Abstract: A user interface may be generated to receive inputs for constructing a data processing pipeline that includes an orchestrator node, a preparator node, and an executor node. The preparator node may generate a training dataset and a validation dataset for a machine learning model. The executor node may execute machine learning trials by applying, to the training dataset and the validation dataset, machine learning models having different sets of trial parameters. The orchestrator node may identify, based on a result of the machine learning trials, an optimal machine learning model for performing a task. The data processing pipeline may be adapted dynamically based on the input dataset and/or computational resource budget. The optimal machine learning model for performing the task may be generated by executing, based on the graph, the data processing pipeline including the orchestrator node, the preparator node, and the executor node.

Type: Application

Filed: September 25, 2019

Publication date: March 25, 2021

Inventors: Manuel Zeise, Isil Pekel, Steven Jaeger
Querying spatial data in column stores using grid-order scans

Patent number: 10380130

Abstract: A query of spatial data is received by a database comprising a columnar data store storing data in a column-oriented structure. Thereafter, a minimal bounding rectangle associated with the query is identified using a grid order scanning technique. The spatial data set corresponding to the received query is then mapped to physical storage in the database using the identified minimal bounding rectangle so that the spatial data set can be retrieved. Related apparatus, systems, techniques and articles are also described.

Type: Grant

Filed: June 27, 2017

Date of Patent: August 13, 2019

Assignee: SAP SE

Inventors: Edward-Robert Tyercha, Gerrit Simon Kazmaier, Hinnerk Gildhoff, Isil Pekel, Lars Volker, Tim Grouisborn
Database calculation engine with nested multiprovider merging

Patent number: 10324930

Abstract: A query is received by a database server from a remote application server that is associated with a calculation scenario that defines a data flow model including one or more calculation nodes including stacked multiproviders. Subsequently, the database server instantiates the calculation scenario and afterwards optimizes the calculation scenario. As part of the optimization, the calculation scenario is optimized by merging the two multiproviders. Thereafter, the operations defined by the calculation nodes of the optimized calculation scenario can be executed to result in a responsive data set. Next, the data set is provided to the application server by the database server.

Type: Grant

Filed: May 27, 2015

Date of Patent: June 18, 2019

Assignee: SAP SE

Inventors: Christoph Weyerhaeuser, Tobias Mindnich, Johannes Merx, Julian Schwing, Daniel Patejdl, Isil Pekel
Data-driven union pruning in a database semantic layer

Patent number: 10324927

Abstract: Methods and apparatus, including computer program products, are provided for union node pruning. In one aspect, there is provided a method, which may include receiving, by a calculation engine, a query; processing a calculation scenario including a union node; accessing a pruning table associated with the union node, wherein the pruning table includes semantic information describing the first input from the first data source node and the second input from the second data source node; determining whether the first data source node and the second data source node can be pruned by at least comparing the semantic information to at least one filter of the query; and pruning, based on a result of the determining, at least one of the first data source node or the second data source node. Related apparatus, systems, methods, and articles are also described.

Type: Grant

Filed: November 19, 2015

Date of Patent: June 18, 2019

Assignee: SAP SE

Inventors: Tobias Mindnich, Julian Schwing, Christoph Weyerhaeuser, Isil Pekel, Johannes Merx, Alena Bakulina
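
The pruning decision described in the abstract above can be sketched as follows. The sketch assumes that the pruning table stores, for each data source feeding the union, the value range it holds per column, and that the query filter is a closed range on one column; this representation and the names `prune_union_inputs`, `pruning_table`, and `query_filter` are hypothetical.

```python
from typing import Dict, List, Tuple

Range = Tuple[int, int]


def prune_union_inputs(
    pruning_table: Dict[str, Dict[str, Range]],
    query_filter: Tuple[str, int, int],
) -> List[str]:
    """Return the union inputs that may still contribute rows to the query."""
    column, lo, hi = query_filter
    kept = []
    for source, ranges in pruning_table.items():
        if column not in ranges:
            # No semantic information for this column: keep the source,
            # because it cannot be pruned safely.
            kept.append(source)
            continue
        src_lo, src_hi = ranges[column]
        # Keep the source only if its range overlaps the filter range.
        if src_hi >= lo and src_lo <= hi:
            kept.append(source)
    return kept
```

With one source holding only 2014 data and another holding only 2015 data, a filter on years 2015 to 2016 would prune the first source before the union is executed.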
Hilbert curve partitioning for parallelization of DBSCAN

Patent number: 10318557

Abstract: DBSCAN clustering analyses can be improved by pre-processing of a data set using a Hilbert curve to intelligently identify the centers for initial partitional analysis by a partitional clustering algorithm such as CLARANS. Partitions output by the partitional clustering algorithm can be processed by DBSCAN running in parallel before intermediate cluster results are merged.

Type: Grant

Filed: June 9, 2017

Date of Patent: June 11, 2019

Assignee: SAP SE

Inventors: Edward-Robert Tyercha, Gerrit Simon Kazmaier, Hinnerk Gildhoff, Isil Pekel, Lars Volker, Tim Grouisborn
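
The Hilbert curve pre-processing mentioned in the abstract above can be illustrated with a minimal sketch. It uses the widely known iterative conversion from grid coordinates to a Hilbert curve index, then picks evenly spaced points along the curve as initial centers. The center selection rule is an assumption made for illustration, not the patented method, and `hilbert_index` and `initial_centers` are hypothetical names.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def hilbert_index(n: int, x: int, y: int) -> int:
    """Position of cell (x, y) along the Hilbert curve of an n x n grid.

    n must be a power of two, and 0 <= x, y < n.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so the sub-curve is oriented correctly.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def initial_centers(points: List[Point], k: int, n: int) -> List[Point]:
    """Pick k points evenly spaced along the Hilbert ordering (assumed rule)."""
    ordered = sorted(points, key=lambda p: hilbert_index(n, p[0], p[1]))
    step = len(ordered) / k
    return [ordered[int(i * step + step / 2)] for i in range(k)]
```

Because points that are close along the Hilbert curve are also close in space, evenly spaced picks along the ordering tend to be spread across the data set, which makes them plausible seeds for a partitional algorithm before DBSCAN runs on each partition.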
Database calculation engine with dynamic top operator

Patent number: 10275490

Abstract: A database server receives a query from a remote application server that is associated with a calculation scenario. The calculation scenario defines a data flow model that includes one or more calculation nodes that each define one or more operations to execute on the database server. A top operator node of the calculation nodes specifies a plurality of attributes and the query requests a subset of the attributes specified by the top operator node. Thereafter, the database server instantiates the calculation scenario so that it is optimized by requesting only the subset of attributes. The database server then executes the operations defined by the calculation nodes of the optimized calculation scenario to result in a responsive data set. The database server then provides the data set to the application server.

Type: Grant

Filed: January 28, 2015

Date of Patent: April 30, 2019

Assignee: SAP SE

Inventors: Christoph Weyerhaeuser, Tobias Mindnich, Isil Pekel, Johannes Merx, Daniel Patejdl
Hilbert Curve Partitioning for Parallelization of DBSCAN

Publication number: 20170308605

Abstract: DBSCAN clustering analyses can be improved by pre-processing of a data set using a Hilbert curve to intelligently identify the centers for initial partitional analysis by a partitional clustering algorithm such as CLARANS. Partitions output by the partitional clustering algorithm can be processed by DBSCAN running in parallel before intermediate cluster results are merged.

Type: Application

Filed: June 9, 2017

Publication date: October 26, 2017

Inventors: Edward-Robert Tyercha, Gerrit Simon Kazmaier, Hinnerk Gildhoff, Isil Pekel, Lars Volker, Tim Grouisborn
QUERYING SPATIAL DATA IN COLUMN STORES USING GRID-ORDER SCANS

Publication number: 20170293662

Abstract: A query of spatial data is received by a database comprising a columnar data store storing data in a column-oriented structure. Thereafter, a minimal bounding rectangle associated with the query is identified using a grid order scanning technique. The spatial data set corresponding to the received query is then mapped to physical storage in the database using the identified minimal bounding rectangle so that the spatial data set can be retrieved. Related apparatus, systems, techniques and articles are also described.

Type: Application

Filed: June 27, 2017

Publication date: October 12, 2017

Inventors: Edward-Robert Tyercha, Gerrit Simon Kazmaier, Hinnerk Gildhoff, Isil Pekel, Lars Volker, Tim Grouisborn
Querying spatial data in column stores using grid-order scans

Patent number: 9720931

Abstract: A query of spatial data is received by a database comprising a columnar data store storing data in a column-oriented structure. Thereafter, a minimal bounding rectangle associated with the query is identified using a grid order scanning technique. The spatial data set corresponding to the received query is then mapped to physical storage in the database using the identified minimal bounding rectangle so that the spatial data set can be retrieved. Related apparatus, systems, techniques and articles are also described.

Type: Grant

Filed: May 9, 2014

Date of Patent: August 1, 2017

Assignee: SAP SE

Inventors: Edward-Robert Tyercha, Gerrit Simon Kazmaier, Hinnerk Gildhoff, Isil Pekel, Lars Volker, Tim Grouisborn
Hilbert curve partitioning for parallelization of DBSCAN

Patent number: 9703856

Abstract: DBSCAN clustering analyses can be improved by pre-processing of a data set using a Hilbert curve to intelligently identify the centers for initial partitional analysis by a partitional clustering algorithm such as CLARANS. Partitions output by the partitional clustering algorithm can be processed by DBSCAN running in parallel before intermediate cluster results are merged.

Type: Grant

Filed: July 7, 2014

Date of Patent: July 11, 2017

Assignee: SAP SE

Inventors: Edward-Robert Tyercha, Gerrit Simon Kazmaier, Hinnerk Gildhoff, Isil Pekel, Lars Volker, Tim Grouisborn
DATA-DRIVEN UNION PRUNING IN A DATABASE SEMANTIC LAYER

Publication number: 20170147637

Abstract: Methods and apparatus, including computer program products, are provided for union node pruning. In one aspect, there is provided a method, which may include receiving, by a calculation engine, a query; processing a calculation scenario including a union node; accessing a pruning table associated with the union node, wherein the pruning table includes semantic information describing the first input from the first data source node and the second input from the second data source node; determining whether the first data source node and the second data source node can be pruned by at least comparing the semantic information to at least one filter of the query; and pruning, based on a result of the determining, at least one of the first data source node or the second data source node. Related apparatus, systems, methods, and articles are also described.

Type: Application

Filed: November 19, 2015

Publication date: May 25, 2017

Inventors: Tobias Mindnich, Julian Schwing, Christoph Weyerhaeuser, Isil Pekel, Johannes Merx, Alena Bakulina
Querying spatial data in column stores using tree-order scans

Patent number: 9613055

Abstract: A query of spatial data is received by a database comprising a columnar data store storing data in a column-oriented structure. Thereafter, a minimal bounding rectangle associated with the query is identified using a tree-order scanning technique. A spatial data set that corresponds to the received query is then mapped to the physical storage in the database using the identified minimal bounding rectangle. Next, the spatial data set is retrieved. Related apparatus, systems, techniques and articles are also described.

Type: Grant

Filed: May 9, 2014

Date of Patent: April 4, 2017

Assignee: SAP SE

Inventors: Edward-Robert Tyercha, Gerrit Simon Kazmaier, Hinnerk Gildhoff, Isil Pekel, Lars Volker, Tim Grouisborn
