Patents by Inventor Fatma Ozcan
Fatma Ozcan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20190034444
Abstract: The embodiments relate to assigning data to processors of a file system. Metadata associated with respective blocks of data, and an initial batch of the blocks is assigned to nodes of a file system based on the metadata. Unassigned blocks are selectively assigned to one or more of the nodes. The selective assignment includes constructing a linear regression model based on node data, and determining a value for each node based on the linear regression model. Each value is associated with a predicted load corresponding to a new assignment of one or more unassigned blocks.
Type: Application
Filed: October 4, 2018
Publication date: January 31, 2019
Applicant: International Business Machines Corporation
Inventors: Uttam Jain, Nimrod Megiddo, Umar F. Minhas, Fatma Ozcan, Robbert Van Der Linden
-
Patent number: 10127237
Abstract: The embodiments relate to assigning data to processors of a file system. Metadata associated with respective blocks of data, and an initial batch of the blocks is assigned to nodes of a file system based on the metadata. Unassigned blocks are selectively assigned to one or more of the nodes. The selective assignment includes constructing a linear regression model based on node data, and determining a value for each node based on the linear regression model. Each value is associated with a predicted load corresponding to a new assignment of one or more unassigned blocks.
Type: Grant
Filed: December 18, 2015
Date of Patent: November 13, 2018
Assignee: International Business Machines Corporation
Inventors: Uttam Jain, Nimrod Megiddo, Umar F. Minhas, Fatma Ozcan, Robbert Van Der Linden
-
Publication number: 20180253653
Abstract: A computer program product, according to one embodiment, includes a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to: identify, by the processor, key concepts in a domain ontology, and use, by the processor, the key concepts to create a rich entity. Identifying the key concepts includes performing centrality analysis of concepts extracted from the domain ontology. Other systems, methods, and computer program products are described in additional embodiments.
Type: Application
Filed: March 3, 2017
Publication date: September 6, 2018
Inventors: Fatma Ozcan, Abdul H. Quamar, Konstantinos Xirogiannopoulos
-
Patent number: 10067885
Abstract: In one embodiment, a computer-implemented method includes inserting a set of accessed objects into a cache, where the set of accessed objects varies in size. An object includes a set of object components, and responsive to receiving a request to access the object, it is determined that the object does not fit into the cache given the set of accessed objects and a total size of the cache. A heuristic algorithm is applied, by a computer processor, to identify in the set of object components one or more object components for insertion into the cache. The heuristic algorithm considers at least a priority of the object compared to priorities of one or more objects in the set of accessed objects. The one or more object components are inserted into the cache.
Type: Grant
Filed: November 22, 2016
Date of Patent: September 4, 2018
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Avrilia Floratou, Uday B. Kale, Nimrod Megiddo, Fatma Ozcan, Navneet S. Potti
-
Publication number: 20180225215
Abstract: A computer-implemented method according to one embodiment includes receiving a request for data, locating the data at one or more partitions of a heterogeneously partitioned table, determining an access method associated with each of the one or more partitions, and requesting the data from the one or more partitions, utilizing the access method associated with each of the one or more partitions.
Type: Application
Filed: February 6, 2017
Publication date: August 9, 2018
Inventors: Avrilia Floratou, Fatma Ozcan, Mir H. Pirahesh, Navneet S. Potti
-
Publication number: 20180196828
Abstract: Embodiments of the present invention relate to elimination of blocks such as splits in distributed processing systems such as MapReduce systems using the Hadoop Distributed Filing System (HDFS). In one embodiment, a method of and computer program product for optimizing queries in distributed processing systems are provided. A query is received. The query includes at least one predicate. The query refers to data. The data includes a plurality of records. Each record comprises a plurality of values in a plurality of attributes. Each record is located in at least one of a plurality of blocks of a distributed file system. Each block has a unique identifier. For each block of the distributed file system, at least one value cluster is determined for an attribute of the plurality of attributes. Each value cluster has a range. The predicate of the query is compared with the at least one value cluster of each block.
Type: Application
Filed: March 5, 2018
Publication date: July 12, 2018
Inventors: Mohamed Eltabakh, Peter J. Haas, Fatma Ozcan, Mir Hamid Pirahesh, John (Yannis) Sismanis, Jan Vondrak
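As a rough illustration of the comparison described in this abstract, the sketch below keeps a block only if the query predicate's range overlaps at least one of that block's value clusters. It is a minimal Python sketch under stated assumptions, not the method claimed in the application: the names `ValueCluster`, `overlaps`, and `blocks_to_read` are hypothetical, the predicate is assumed to be a closed numeric range on a single attribute, and skipping non-overlapping blocks is inferred from the abstract's stated aim of eliminating blocks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValueCluster:
    """Hypothetical per-block summary: a closed range of attribute values."""
    low: float
    high: float


def overlaps(cluster, pred_low, pred_high):
    """True if the closed predicate range intersects the cluster range."""
    return cluster.low <= pred_high and pred_low <= cluster.high


def blocks_to_read(clusters_by_block, pred_low, pred_high):
    """Return identifiers of blocks that may contain matching records.

    clusters_by_block maps a block's unique identifier to the value
    clusters determined for one attribute of that block. A block whose
    clusters all miss the predicate range is skipped (assumed behaviour).
    """
    return [
        block_id
        for block_id, clusters in clusters_by_block.items()
        if any(overlaps(c, pred_low, pred_high) for c in clusters)
    ]


# Illustrative data only; the block identifiers and ranges are made up.
example = {
    "b1": [ValueCluster(0, 10), ValueCluster(50, 60)],
    "b2": [ValueCluster(100, 200)],
    "b3": [ValueCluster(5, 7)],
}
selected = blocks_to_read(example, 8, 55)
```

With the made-up data above, only block `b1` has a cluster that intersects the range 8 to 55, so it is the only block a query with that predicate would need to scan.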
-
Patent number: 9910860
Abstract: Embodiments of the present invention relate to elimination of blocks such as splits in distributed processing systems such as MapReduce systems using the Hadoop Distributed Filing System (HDFS). In one embodiment, a method of and computer program product for optimizing queries in distributed processing systems are provided. A query is received. The query includes at least one predicate. The query refers to data. The data includes a plurality of records. Each record comprises a plurality of values in a plurality of attributes. Each record is located in at least one of a plurality of blocks of a distributed file system. Each block has a unique identifier. For each block of the distributed file system, at least one value cluster is determined for an attribute of the plurality of attributes. Each value cluster has a range. The predicate of the query is compared with the at least one value cluster of each block.
Type: Grant
Filed: February 6, 2014
Date of Patent: March 6, 2018
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Mohamed Eltabakh, Peter J. Haas, Fatma Ozcan, Mir Hamid Pirahesh, John (Yannis) Sismanis, Jan Vondrak
-
Publication number: 20180018329
Abstract: A computer-implemented method according to one embodiment includes receiving an ontology language query, receiving a mapping of an ontology to a relational database, and generating a structured query language (SQL) query, utilizing the ontology language query and the mapping of the ontology to the relational database.
Type: Application
Filed: July 18, 2016
Publication date: January 18, 2018
Inventors: Avrilia Floratou, Fatma Ozcan
-
Patent number: 9836506
Abstract: In one embodiment, a computer-implemented method includes selecting one or more sub-expressions of a query during compile time. One or more pilot runs are performed by one or more computer processors. The one or more pilot runs include a pilot run associated with each of one or more of the selected sub-expressions, and each pilot run includes at least partial execution of the associated selected sub-expression. The pilot runs are performed during execution time. Statistics are collected on the one or more pilot runs during performance of the one or more pilot runs. The query is optimized based at least in part on the statistics collected during the one or more pilot runs, where the optimization includes basing cardinality and cost estimates on the statistics collected during the pilot runs.
Type: Grant
Filed: June 11, 2014
Date of Patent: December 5, 2017
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Andrey Balmin, Vuk Ercegovac, Jesse E. Jackson, Konstantinos Karanasos, Marcel Kutsch, Fatma Ozcan, Chunyang Xia
-
Patent number: 9779031
Abstract: In one embodiment, a computer-implemented method includes inserting a set of accessed objects into a cache, where the set of accessed objects varies in size. An object includes a set of object components, and responsive to receiving a request to access the object, it is determined that the object does not fit into the cache given the set of accessed objects and a total size of the cache. A heuristic algorithm is applied, by a computer processor, to identify in the set of object components one or more object components for insertion into the cache. The heuristic algorithm considers at least a priority of the object compared to priorities of one or more objects in the set of accessed objects. The one or more object components are inserted into the cache.
Type: Grant
Filed: June 17, 2015
Date of Patent: October 3, 2017
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Avrilia Floratou, Uday B. Kale, Nimrod Megiddo, Fatma Ozcan, Navneet S. Potti
-
Patent number: 9774682
Abstract: Embodiments relate to parallel data streaming between a first computer system and a second computer system. Aspects include transmitting a request to establish an authenticated connection between a processing job on the first computer system and a process on the second computer system and transmitting a query to the process on the second computer system over the authenticated connection. Aspects further include creating one or more tasks on the first computer system configured to receive data from the second computer system in parallel and reading data received by the one or more tasks by the processing job on the first computer system.
Type: Grant
Filed: January 8, 2015
Date of Patent: September 26, 2017
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Sangeeta T. Doraiswamy, Marc Hörsken, Fatma Ozcan, Mir H. Pirahesh
-
Patent number: 9767149
Abstract: Embodiments relate to joining data across a parallel database and a distributed processing system. Aspects include receiving a query on data stored in parallel database T and data stored in distributed processing system L, applying local query predicates and projection to data T to create T′, and applying local query predicates and projection to L to create L′. Based on determining that a size of L′ is less than a size of T′ and that the size of L′ is less than a first threshold, transmitting L′ to the parallel database and executing a join between T′ and L′. Based on determining that a number of the nodes distributed processing system n multiplied by the size of T′ is less than the size of L′ and that the size of T′ is less than a second threshold; transmitting T′ to the distributed processing system and executing a join between T′ and L′.
Type: Grant
Filed: October 10, 2014
Date of Patent: September 19, 2017
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Fatma Ozcan, Hamid Pirahesh, Yuanyuan Tian, Tao Zou
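The two size conditions in this abstract can be read as a small decision rule over the filtered, projected inputs. The following is a minimal Python sketch of that rule only, under stated assumptions: `JoinSite` and `choose_join_site` are hypothetical names, sizes and thresholds are assumed to be in the same unit, and the `UNDECIDED` outcome stands in for any case the abstract does not describe.

```python
from enum import Enum


class JoinSite(Enum):
    """Hypothetical labels for where the join runs."""
    PARALLEL_DATABASE = "ship filtered L to the parallel database"
    DISTRIBUTED_SYSTEM = "ship filtered T to the distributed system"
    UNDECIDED = "neither condition stated in the abstract holds"


def choose_join_site(size_t, size_l, n_nodes, first_threshold, second_threshold):
    """Apply the two conditions from the abstract, in the order given.

    size_t, size_l: sizes of the filtered, projected database data and
    distributed-system data. n_nodes: number of nodes in the distributed
    processing system. The thresholds are free parameters here.
    """
    # Filtered L is the smaller side and small enough to transmit.
    if size_l < size_t and size_l < first_threshold:
        return JoinSite.PARALLEL_DATABASE
    # Filtered T, multiplied by the node count, is still smaller than L.
    if n_nodes * size_t < size_l and size_t < second_threshold:
        return JoinSite.DISTRIBUTED_SYSTEM
    return JoinSite.UNDECIDED


# Illustrative numbers only; they are not taken from the patent.
small_l = choose_join_site(size_t=100, size_l=10, n_nodes=4,
                           first_threshold=50, second_threshold=50)
small_t = choose_join_site(size_t=10, size_l=1000, n_nodes=4,
                           first_threshold=50, second_threshold=50)
```

In the second call the node count matters: 4 × 10 is less than 1000, so the smaller database side is the one that gets shipped.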
-
Publication number: 20170177599
Abstract: The embodiments relate to assigning data to processors of a file system. Metadata associated with respective blocks of data, and an initial batch of the blocks is assigned to nodes of a file system based on the metadata. Unassigned blocks are selectively assigned to one or more of the nodes. The selective assignment includes constructing a linear regression model based on node data, and determining a value for each node based on the linear regression model. Each value is associated with a predicted load corresponding to a new assignment of one or more unassigned blocks.
Type: Application
Filed: December 18, 2015
Publication date: June 22, 2017
Applicant: International Business Machines Corporation
Inventors: Uttam Jain, Nimrod Megiddo, Umar F. Minhas, Fatma Ozcan, Robbert Van Der Linden
-
Publication number: 20170075819
Abstract: In one embodiment, a computer-implemented method includes inserting a set of accessed objects into a cache, where the set of accessed objects varies in size. An object includes a set of object components, and responsive to receiving a request to access the object, it is determined that the object does not fit into the cache given the set of accessed objects and a total size of the cache. A heuristic algorithm is applied, by a computer processor, to identify in the set of object components one or more object components for insertion into the cache. The heuristic algorithm considers at least a priority of the object compared to priorities of one or more objects in the set of accessed objects. The one or more object components are inserted into the cache.
Type: Application
Filed: November 22, 2016
Publication date: March 16, 2017
Inventors: Avrilia Floratou, Uday B. Kale, Nimrod Megiddo, Fatma Ozcan, Navneet S. Potti
-
Patent number: 9576000
Abstract: Scheduling mechanisms for assigning data in a distributed file system to database workers are provided. In one embodiment, a method of and computer program product for assignment of data blocks to database workers are provided. A request for table data is received. Metadata for a plurality of blocks in a file system is retrieved from a metadata store. Each of the plurality of blocks contains a subset of the table data. A request for work is received from a requestor. An assignment of one or more of the plurality of blocks is provided to the requestor.
Type: Grant
Filed: April 25, 2014
Date of Patent: February 21, 2017
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Andrey Balmin, Romulo Antonio Pereira Goncalves, Fatma Ozcan, Jonas Traub
-
Publication number: 20160371193
Abstract: In one embodiment, a computer-implemented method includes inserting a set of accessed objects into a cache, where the set of accessed objects varies in size. An object includes a set of object components, and responsive to receiving a request to access the object, it is determined that the object does not fit into the cache given the set of accessed objects and a total size of the cache. A heuristic algorithm is applied, by a computer processor, to identify in the set of object components one or more object components for insertion into the cache. The heuristic algorithm considers at least a priority of the object compared to priorities of one or more objects in the set of accessed objects. The one or more object components are inserted into the cache.
Type: Application
Filed: June 17, 2015
Publication date: December 22, 2016
Inventors: Avrilia Floratou, Uday B. Kale, Nimrod Megiddo, Fatma Ozcan, Navneet S. Potti
-
Publication number: 20160205188
Abstract: Embodiments relate to parallel data streaming between a first computer system and a second computer system. Aspects include transmitting a request to establish an authenticated connection between a processing job on the first computer system and a process on the second computer system and transmitting a query to the process on the second computer system over the authenticated connection. Aspects further include creating one or more tasks on the first computer system configured to receive data from the second computer system in parallel and reading data received by the one or more tasks by the processing job on the first computer system.
Type: Application
Filed: January 8, 2015
Publication date: July 14, 2016
Inventors: Sangeeta T. Doraiswamy, Marc Hörsken, Fatma Ozcan, Mir H. Pirahesh
-
Publication number: 20160103877
Abstract: Embodiments relate to joining data across a parallel database and a distributed processing system. Aspects include receiving a query on data stored in parallel database T and data stored in distributed processing system L, applying local query predicates and projection to data T to create T′, and applying local query predicates and projection to L to create L′. Based on determining that a size of L′ is less than a size of T′ and that the size of L′ is less than a first threshold, transmitting L′ to the parallel database and executing a join between T′ and L′. Based on determining that a number of the nodes distributed processing system n multiplied by the size of T′ is less than the size of L′ and that the size of T′ is less than a second threshold; transmitting T′ to the distributed processing system and executing a join between T′ and L′.
Type: Application
Filed: October 10, 2014
Publication date: April 14, 2016
Inventors: Fatma Ozcan, Hamid Pirahesh, Yuanyuan Tian, Tao Zou
-
Publication number: 20150363466
Abstract: In one embodiment, a computer-implemented method includes selecting one or more sub-expressions of a query during compile time. One or more pilot runs are performed by one or more computer processors. The one or more pilot runs include a pilot run associated with each of one or more of the selected sub-expressions, and each pilot run includes at least partial execution of the associated selected sub-expression. The pilot runs are performed during execution time. Statistics are collected on the one or more pilot runs during performance of the one or more pilot runs. The query is optimized based at least in part on the statistics collected during the one or more pilot runs, where the optimization includes basing cardinality and cost estimates on the statistics collected during the pilot runs.
Type: Application
Filed: June 11, 2014
Publication date: December 17, 2015
Inventors: Andrey Balmin, Vuk Ercegovac, Jesse E. Jackson, Konstantinos Karanasos, Marcel Kutsch, Fatma Ozcan, Chunyang Xia
-
Publication number: 20150310030
Abstract: Scheduling mechanisms for assigning data in a distributed file system to database workers are provided. In one embodiment, a method of and computer program product for assignment of data blocks to database workers are provided. A request for table data is received. Metadata for a plurality of blocks in a file system is retrieved from a metadata store. Each of the plurality of blocks contains a subset of the table data. A request for work is received from a requestor. An assignment of one or more of the plurality of blocks is provided to the requestor.
Type: Application
Filed: April 25, 2014
Publication date: October 29, 2015
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Andrey Balmin, Romulo Antonio Pereira Goncalves, Fatma Ozcan, Jonas Traub