Patents Assigned to Qubole, Inc.
  • Patent number: 11704316
    Abstract: The present invention is generally directed to systems and methods of determining and provisioning peak memory requirements in Structured Query Language Processing engines. More specifically, methods may include determining or obtaining a query execution plan; gathering statistics associated with each database table; breaking the query execution plan into one or more subtasks; calculating an estimated memory usage for each subtask using the statistics; determining or obtaining a dependency graph of the one or more subtasks; based at least in part on the dependency graph, determining which subtasks can execute concurrently on a single worker node; and totaling the amount of estimated memory for each subtask that can execute concurrently on a single worker node and setting this amount of estimated memory as the estimated peak memory requirement for the specific database query.
    Type: Grant
    Filed: July 24, 2019
    Date of Patent: July 18, 2023
    Assignee: Qubole, Inc.
    Inventors: Ankit Dixit, Shubham Tagra
  • Patent number: 11436667
    Abstract: The present invention is generally directed to systems and methods of providing automatic scaling pure-spot clusters. Such clusters may be dynamically rebalanced for further costs savings. In accordance with some methods of the present invention may include a method of utilizing a cluster in a big data cloud computing environment where instances may include reserved on-demand instances for a set price and on-demand spot instances that may be bid on by a user, the method including: creating one or more stable nodes, comprising spot instances with a bid price above a price for an equivalent on-demand instance; creating one or more volatile nodes, comprising spot instances with a bid price below a price for an equivalent on-demand instance; using one or more of the stable nodes as a master node; and using the volatile nodes as slave nodes.
    Type: Grant
    Filed: June 7, 2016
    Date of Patent: September 6, 2022
    Assignee: Qubole, Inc.
    Inventors: Hariharan Iyer, Joydeep Sen Sarma, Mayank Ahuja
  • Patent number: 11080207
    Abstract: The present invention is generally directed to a caching framework that provides a common abstraction across one or more big data engines, comprising a cache filesystem including a cache filesystem interface used by applications to access cloud storage through a cache subsystem, the cache filesystem interface in communication with a big data engine extension and a cache manager; the big data engine extension, providing cluster information to the cache filesystem and working with the cache filesystem interface to determine which nodes cache which part of a file; and a cache manager for maintaining metadata about the cache, the metadata comprising the status of blocks for each file. The invention may provide common abstraction across big data engines that does not require changes to the setup of infrastructure or user workloads, allows sharing of cached data and caching only the parts of files that are required, can process columnar format.
    Type: Grant
    Filed: June 7, 2017
    Date of Patent: August 3, 2021
    Assignee: Qubole, Inc.
    Inventors: Joydeep Sen Sarma, Rajat Venkatesh, Shubham Tagra
  • Patent number: 10733024
    Abstract: In general, the invention is directed to systems and methods of distributing tasks amongst servers or nodes in a cluster in a cloud-based big data environment, including: establishing a high_server_threshold; dividing active servers/nodes into at least three (3) categories of high usage servers, comprising servers on which usage is greater than the high_server_threshold; medium usage servers, comprising servers on which usage is less than the high_server_threshold, but is greater than zero; and low usage servers, comprising servers that are currently not utilized; receiving one or more tasks to be performed; scheduling the tasks by: first requesting that medium usage servers take tasks; if tasks remain that are not scheduled on the medium usage servers, schedule remaining tasks on low usage servers; if any tasks remain that are not scheduled on medium usage servers or low usage servers, scheduling remaining tasks on high usage servers.
    Type: Grant
    Filed: May 24, 2018
    Date of Patent: August 4, 2020
    Assignee: Qubole Inc.
    Inventors: Joydeep Sen Sarma, Abhishek Modi
  • Patent number: 10606664
    Abstract: The present invention is generally directed to systems and methods of provisioning and using heterogeneous clusters in a cloud-based big data system, the heterogeneous clusters made up of primary instance types and different types of instances, the method including: determining if there are composition requirements of any heterogeneous cluster, the composition requirements defining instance types permitted for use; determining if any of the permitted different types of instances are required or advantageous for use; determining an amount of different types of instances to utilize, this determination based at least in part on an instance weight; provisioning the heterogeneous cluster comprising both primary instances and permitted different types of instances.
    Type: Grant
    Filed: September 7, 2017
    Date of Patent: March 31, 2020
    Assignee: Qubole Inc.
    Inventors: Joydeep Sen Sarma, Mayank Ahuja, Ajaya Agrawal, Prakhar Jain, Hariharan Iyer
  • Patent number: 10606478
    Abstract: The present invention is generally directed to a distributed computing system comprising a plurality of computational clusters, each computational cluster comprising a plurality of compute optimized instances, each instance comprising local instance data storage and in communication with reserved disk storage, wherein processing hierarchy provides priority to local instance data storage before providing priority to reserved disk storage.
    Type: Grant
    Filed: October 22, 2015
    Date of Patent: March 31, 2020
    Assignee: Qubole, Inc.
    Inventors: Mayank Ahuja, Joydeep Sen Sarma, Shrikanth Shankar
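
Illustrative sketch (not taken from any patent text): the tiered scheduling order summarized in the abstract of patent 10733024 can be pictured roughly as follows. This is a minimal Python sketch under stated assumptions, not the patented implementation. The names `Server`, `categorize`, and `schedule` are hypothetical, as is the slot-based model in which `usage` counts running tasks and `capacity` caps them. The abstract does not say how a server at exactly the threshold is classified or how a server declines work; here such a server is treated as medium usage, and a server declines once it is full.

```python
from dataclasses import dataclass


@dataclass
class Server:
    """Hypothetical server model: usage and capacity are task-slot counts."""
    name: str
    usage: int      # tasks currently running
    capacity: int   # assumed upper bound on running tasks


def categorize(servers, high_server_threshold):
    """Split servers into (high, medium, low) usage tiers.

    Assumption: usage equal to the threshold counts as medium.
    """
    high, medium, low = [], [], []
    for server in servers:
        if server.usage > high_server_threshold:
            high.append(server)
        elif server.usage > 0:
            medium.append(server)
        else:
            low.append(server)  # currently not utilized
    return high, medium, low


def schedule(tasks, servers, high_server_threshold):
    """Offer tasks to medium-usage servers, then low, then high.

    Tiers are computed once, before any task is placed.
    Returns (placement, unscheduled): a task -> server-name mapping and
    the tasks that no server had room for.
    """
    high, medium, low = categorize(servers, high_server_threshold)
    pending = list(tasks)
    placement = {}
    for tier in (medium, low, high):
        for server in tier:
            # Assumed acceptance rule: a server takes tasks until it is full.
            while pending and server.usage < server.capacity:
                task = pending.pop(0)
                server.usage += 1
                placement[task] = server.name
    return placement, pending


if __name__ == "__main__":
    cluster = [
        Server("a", usage=8, capacity=10),  # high usage (above threshold 6)
        Server("b", usage=3, capacity=5),   # medium usage
        Server("c", usage=0, capacity=2),   # low usage (idle)
    ]
    placed, left_over = schedule(["t1", "t2", "t3", "t4", "t5"], cluster, 6)
    print(placed)
    print(left_over)
```

Filling medium-usage servers before idle ones tends to leave idle servers empty, and trying high-usage servers last avoids adding load where usage is already above the threshold.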