Patents by Inventor Joydeep sen Sarma

Joydeep sen Sarma has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Systems and methods for auto-scaling a big data system

Patent number: 11474874

Abstract: Systems and methods for automatically scaling a big data system. Methods include determining, at a first time, a first number of nodes for a cluster to process a request; assigning an amount of nodes equal to the first number of nodes to the cluster; determining a rate of progress of the request; determining, at a second time and based on the rate of progress, a second number of nodes; and modifying the amount of nodes to equal the second number of nodes. Systems include a cluster manager, to add and/or remove any nodes; the big data system, to process requests that utilize the cluster and nodes; and an automatic scaling cluster manager including a big data interface for communicating with the big data system; a cluster manager interface for communicating with the cluster manager; and a cluster state machine.

Type: Grant

Filed: August 14, 2014

Date of Patent: October 18, 2022

Assignee: QUBOLE, INC.

Inventors: Joydeep Sen Sarma, Mayank Ahuja, Sivaramakrishnan Narayanan, Shrikanth Shankar
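The two-step sizing described in the abstract for patent 11474874 can be illustrated with a minimal Python sketch. The abstract does not say how the node count is derived from the rate of progress, so the formula below (remaining work divided by what one node can finish before a deadline) is purely an assumption, as are the names `nodes_needed`, `remaining_work`, `rate_per_node`, and `seconds_left`.

```python
import math


def nodes_needed(remaining_work: float, rate_per_node: float, seconds_left: float) -> int:
    """Hypothetical sizing rule: nodes required to finish the remaining work in time.

    Assumption (not from the patent): work is divisible and each node
    progresses at the same measured rate, in work units per second.
    """
    if rate_per_node <= 0 or seconds_left <= 0:
        raise ValueError("rate_per_node and seconds_left must be positive")
    return max(1, math.ceil(remaining_work / (rate_per_node * seconds_left)))


# First time: size the cluster from an estimated per-node rate.
first_count = nodes_needed(remaining_work=1000, rate_per_node=1.0, seconds_left=100)

# Second time (50 s later): only 250 units are done, so the observed
# per-node rate is 250 / (first_count * 50) = 0.5 units per second.
observed_rate = 250 / (first_count * 50)
second_count = nodes_needed(remaining_work=750, rate_per_node=observed_rate, seconds_left=50)
```

In this example the request progresses at half the estimated rate, so the second determination asks for more nodes than the first; a faster-than-expected rate would shrink the cluster instead.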
Pure-spot and dynamically rebalanced auto-scaling clusters

Patent number: 11436667

Abstract: The present invention is generally directed to systems and methods of providing automatic scaling pure-spot clusters. Such clusters may be dynamically rebalanced for further cost savings. Some embodiments of the present invention may include a method of utilizing a cluster in a big data cloud computing environment where instances may include reserved on-demand instances for a set price and on-demand spot instances that may be bid on by a user, the method including: creating one or more stable nodes, comprising spot instances with a bid price above a price for an equivalent on-demand instance; creating one or more volatile nodes, comprising spot instances with a bid price below a price for an equivalent on-demand instance; using one or more of the stable nodes as a master node; and using the volatile nodes as slave nodes.

Type: Grant

Filed: June 7, 2016

Date of Patent: September 6, 2022

Assignee: Qubole, Inc.

Inventors: Hariharan Iyer, Joydeep Sen Sarma, Mayank Ahuja
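The stable/volatile split in the abstract for patent 11436667 can be sketched in Python as follows. This is an illustration only: `SpotNode`, `classify`, and `assign_roles` are hypothetical names, and the abstract does not say what happens to a bid exactly equal to the on-demand price or to stable nodes beyond the master, so the sketch leaves both unassigned.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class SpotNode:
    name: str
    bid_price: float  # bid for the spot instance backing this node


def classify(nodes: List[SpotNode], on_demand_price: float) -> Tuple[List[SpotNode], List[SpotNode]]:
    """Split nodes into (stable, volatile) by comparing bid to the on-demand price."""
    stable = [n for n in nodes if n.bid_price > on_demand_price]
    volatile = [n for n in nodes if n.bid_price < on_demand_price]
    return stable, volatile


def assign_roles(nodes: List[SpotNode], on_demand_price: float) -> Dict[str, str]:
    """Use one stable node as master and every volatile node as a slave."""
    stable, volatile = classify(nodes, on_demand_price)
    if not stable:
        raise ValueError("at least one stable node is needed to host the master")
    roles = {stable[0].name: "master"}
    for node in volatile:
        roles[node.name] = "slave"
    # Assumption: remaining stable nodes and equal-price bids are left unassigned here.
    return roles


roles = assign_roles(
    [SpotNode("a", 1.2), SpotNode("b", 0.4), SpotNode("c", 0.5)],
    on_demand_price=1.0,
)
```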
Heterogeneous auto-scaling big-data clusters in the cloud

Patent number: 11113121

Abstract: The present invention is generally directed to systems and methods of provisioning and using heterogeneous clusters in a cloud-based big data system, the heterogeneous clusters made up of primary instance types and different types of instances, the method including: determining if there are composition requirements of any heterogeneous cluster, the composition requirements defining instance types permitted for use; determining if any of the permitted different types of instances are required or advantageous for use; determining an amount of different types of instances to utilize, this determination based at least in part on an instance weight; provisioning the heterogeneous cluster comprising both primary instances and permitted different types of instances.

Type: Grant

Filed: March 2, 2020

Date of Patent: September 7, 2021

Inventors: Joydeep Sen Sarma, Mayank Ahuja, Ajaya Agrawal, Prakhar Jain, Hariharan Iyer
Caching framework for big-data engines in the cloud

Patent number: 11080207

Abstract: The present invention is generally directed to a caching framework that provides a common abstraction across one or more big data engines, comprising a cache filesystem including a cache filesystem interface used by applications to access cloud storage through a cache subsystem, the cache filesystem interface in communication with a big data engine extension and a cache manager; the big data engine extension, providing cluster information to the cache filesystem and working with the cache filesystem interface to determine which nodes cache which part of a file; and a cache manager for maintaining metadata about the cache, the metadata comprising the status of blocks for each file. The invention may provide a common abstraction across big data engines that does not require changes to the setup of infrastructure or user workloads, allows sharing of cached data and caching of only the parts of files that are required, and can process columnar formats.

Type: Grant

Filed: June 7, 2017

Date of Patent: August 3, 2021

Assignee: Qubole, Inc.

Inventors: Joydeep Sen Sarma, Rajat Venkatesh, Shubham Tagra
Task packing scheduling process for long running applications

Patent number: 10733024

Abstract: In general, the invention is directed to systems and methods of distributing tasks amongst servers or nodes in a cluster in a cloud-based big data environment, including: establishing a high_server_threshold; dividing active servers/nodes into at least three (3) categories of high usage servers, comprising servers on which usage is greater than the high_server_threshold; medium usage servers, comprising servers on which usage is less than the high_server_threshold, but is greater than zero; and low usage servers, comprising servers that are currently not utilized; receiving one or more tasks to be performed; scheduling the tasks by: first requesting that medium usage servers take tasks; if tasks remain that are not scheduled on the medium usage servers, scheduling remaining tasks on low usage servers; and if any tasks remain that are not scheduled on medium usage servers or low usage servers, scheduling remaining tasks on high usage servers.

Type: Grant

Filed: May 24, 2018

Date of Patent: August 4, 2020

Assignee: Qubole Inc.

Inventors: Joydeep Sen Sarma, Abhishek Modi
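The scheduling order in the abstract for patent 10733024 (medium usage servers first, then unused servers, then high usage servers) can be sketched in Python. Only `high_server_threshold` and the three categories come from the abstract; the `Server` class, its `free_slots` capacity field, and the choice to treat usage exactly equal to the threshold as high usage are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Server:
    name: str
    usage: float     # fraction of capacity in use; 0.0 means not utilized
    free_slots: int  # hypothetical: number of additional tasks the server accepts


def pack_tasks(
    tasks: List[str], servers: List[Server], high_server_threshold: float
) -> Tuple[Dict[str, str], List[str]]:
    """Return (task -> server name, tasks that could not be placed)."""
    medium = [s for s in servers if 0 < s.usage < high_server_threshold]
    low = [s for s in servers if s.usage == 0]
    # Assumption: usage equal to the threshold is treated as high usage.
    high = [s for s in servers if s.usage >= high_server_threshold]

    assignment: Dict[str, str] = {}
    pending = list(tasks)
    # Offer tasks to each category in the order given in the abstract.
    for group in (medium, low, high):
        for server in group:
            take = min(server.free_slots, len(pending))
            for task in pending[:take]:
                assignment[task] = server.name
            pending = pending[take:]
    return assignment, pending


servers = [Server("hot", 0.9, 2), Server("idle", 0.0, 1), Server("warm", 0.5, 1)]
assignment, unplaced = pack_tasks(["t1", "t2", "t3"], servers, high_server_threshold=0.8)
```

Filling partly used servers before idle ones keeps idle servers empty for as long as possible, which is the packing behavior the title refers to.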
Heterogeneous Auto-Scaling Big-Data Clusters in the Cloud

Publication number: 20200241932

Abstract: The present invention is generally directed to systems and methods of provisioning and using heterogeneous clusters in a cloud-based big data system, the heterogeneous clusters made up of primary instance types and different types of instances, the method including: determining if there are composition requirements of any heterogeneous cluster, the composition requirements defining instance types permitted for use; determining if any of the permitted different types of instances are required or advantageous for use; determining an amount of different types of instances to utilize, this determination based at least in part on an instance weight; provisioning the heterogeneous cluster comprising both primary instances and permitted different types of instances.

Type: Application

Filed: March 2, 2020

Publication date: July 30, 2020

Inventors: Joydeep Sen Sarma, Mayank Ahuja, Ajaya Agrawal, Prakhar Jain, Hariharan Iyer
Heterogeneous auto-scaling big-data clusters in the cloud

Patent number: 10606664

Abstract: The present invention is generally directed to systems and methods of provisioning and using heterogeneous clusters in a cloud-based big data system, the heterogeneous clusters made up of primary instance types and different types of instances, the method including: determining if there are composition requirements of any heterogeneous cluster, the composition requirements defining instance types permitted for use; determining if any of the permitted different types of instances are required or advantageous for use; determining an amount of different types of instances to utilize, this determination based at least in part on an instance weight; provisioning the heterogeneous cluster comprising both primary instances and permitted different types of instances.

Type: Grant

Filed: September 7, 2017

Date of Patent: March 31, 2020

Assignee: Qubole Inc.

Inventors: Joydeep Sen Sarma, Mayank Ahuja, Ajaya Agrawal, Prakhar Jain, Hariharan Iyer
High performance hadoop with new generation instances

Patent number: 10606478

Abstract: The present invention is generally directed to a distributed computing system comprising a plurality of computational clusters, each computational cluster comprising a plurality of compute optimized instances, each instance comprising local instance data storage and in communication with reserved disk storage, wherein processing hierarchy provides priority to local instance data storage before providing priority to reserved disk storage.

Type: Grant

Filed: October 22, 2015

Date of Patent: March 31, 2020

Assignee: Qubole, Inc.

Inventors: Mayank Ahuja, Joydeep Sen Sarma, Shrikanth Shankar
Task Packing Scheduling Process for Long Running Applications

Publication number: 20180341524

Abstract: In general, the invention is directed to systems and methods of distributing tasks amongst servers or nodes in a cluster in a cloud-based big data environment, including: establishing a high_server_threshold; dividing active servers/nodes into at least three (3) categories of high usage servers, comprising servers on which usage is greater than the high_server_threshold; medium usage servers, comprising servers on which usage is less than the high_server_threshold, but is greater than zero; and low usage servers, comprising servers that are currently not utilized; receiving one or more tasks to be performed; scheduling the tasks by: first requesting that medium usage servers take tasks; if tasks remain that are not scheduled on the medium usage servers, scheduling remaining tasks on low usage servers; and if any tasks remain that are not scheduled on medium usage servers or low usage servers, scheduling remaining tasks on high usage servers.

Type: Application

Filed: May 24, 2018

Publication date: November 29, 2018

Inventors: Joydeep Sen Sarma, Abhishek Modi
Heterogeneous Auto-Scaling Big-Data Clusters in the Cloud

Publication number: 20180067783

Abstract: The present invention is generally directed to systems and methods of provisioning and using heterogeneous clusters in a cloud-based big data system, the heterogeneous clusters made up of primary instance types and different types of instances, the method including: determining if there are composition requirements of any heterogeneous cluster, the composition requirements defining instance types permitted for use; determining if any of the permitted different types of instances are required or advantageous for use; determining an amount of different types of instances to utilize, this determination based at least in part on an instance weight; provisioning the heterogeneous cluster comprising both primary instances and permitted different types of instances.

Type: Application

Filed: September 7, 2017

Publication date: March 8, 2018

Inventors: Joydeep Sen Sarma, Mayank Ahuja, Ajaya Agrawal, Prakhar Jain, Hariharan Iyer
Caching Framework for Big-Data Engines in the Cloud

Publication number: 20170351620

Abstract: The present invention is generally directed to a caching framework that provides a common abstraction across one or more big data engines, comprising a cache filesystem including a cache filesystem interface used by applications to access cloud storage through a cache subsystem, the cache filesystem interface in communication with a big data engine extension and a cache manager; the big data engine extension, providing cluster information to the cache filesystem and working with the cache filesystem interface to determine which nodes cache which part of a file; and a cache manager for maintaining metadata about the cache, the metadata comprising the status of blocks for each file. The invention may provide a common abstraction across big data engines that does not require changes to the setup of infrastructure or user workloads, allows sharing of cached data and caching of only the parts of files that are required, and can process columnar formats.

Type: Application

Filed: June 7, 2017

Publication date: December 7, 2017

Inventors: Joydeep Sen Sarma, Rajat Venkatesh, Shubham Tagra
Pure-Spot and Dynamically Rebalanced Auto-Scaling Clusters

Publication number: 20160358249

Abstract: The present invention is generally directed to systems and methods of providing automatic scaling pure-spot clusters. Such clusters may be dynamically rebalanced for further cost savings. Some embodiments of the present invention may include a method of utilizing a cluster in a big data cloud computing environment where instances may include reserved on-demand instances for a set price and on-demand spot instances that may be bid on by a user, the method including: creating one or more stable nodes, comprising spot instances with a bid price above a price for an equivalent on-demand instance; creating one or more volatile nodes, comprising spot instances with a bid price below a price for an equivalent on-demand instance; using one or more of the stable nodes as a master node; and using the volatile nodes as slave nodes.

Type: Application

Filed: June 7, 2016

Publication date: December 8, 2016

Inventors: Hariharan Iyer, Joydeep Sen Sarma, Mayank Ahuja
High Performance Hadoop with New Generation Instances

Publication number: 20160117107

Abstract: The present invention is generally directed to a distributed computing system comprising a plurality of computational clusters, each computational cluster comprising a plurality of compute optimized instances, each instance comprising local instance data storage and in communication with reserved disk storage, wherein processing hierarchy provides priority to local instance data storage before providing priority to reserved disk storage.

Type: Application

Filed: October 22, 2015

Publication date: April 28, 2016

Inventors: Mayank Ahuja, Joydeep Sen Sarma, Shrikanth Shankar
Systems and Methods for Auto-Scaling a Big Data System

Publication number: 20160048415

Abstract: Systems and methods for automatically scaling a big data system are disclosed. Methods may include: determining, at a first time, a first optimal number of nodes for a cluster to adequately process a request; assigning an amount of nodes equal to the first optimal number; determining a rate of progress of the request; determining, at a second time and based on the rate of progress, a second optimal number of nodes; and modifying the number of nodes assigned to the cluster to equal the second optimal number. Systems may include: a cluster manager, to add and/or remove nodes; a big data system, to process requests that utilize the cluster and nodes; and an automatic scaling cluster manager, including: a big data interface, for communicating with the big data system; a cluster manager interface, for communicating with a cluster manager instructions for adding and/or removing nodes from a cluster used to process a request; and a cluster state machine.

Type: Application

Filed: August 14, 2014

Publication date: February 18, 2016

Inventors: Joydeep Sen Sarma, Mayank Ahuja, Sivaramakrishnan Narayanan, Shrikanth Shankar
System and method for cluster management

Patent number: 9104493

Abstract: A system and method of managing a cluster of distributed machines is described. A cluster manager receives status updates regarding tasks running on each machine in the cluster from a task tracker running on the machine. The cluster manager receives resource requests from a job tracker created by a client wishing to run a job in the cluster. The cluster manager is responsible for implementing push-based fair scheduling of resources to the job trackers. The job tracker is responsible for running tasks for one job in the resource identified by the cluster manager. In one embodiment, the job tracker can run in the client for small jobs and in the cluster for larger jobs. The cluster manager can also be restarted, for example, for software updates without restarting the cluster.

Type: Grant

Filed: November 6, 2012

Date of Patent: August 11, 2015

Assignee: FACEBOOK, INC.

Inventors: Dmytro Molkov, Ramkumar Venkat Vadali, Chun-Yang Chen, Joydeep Sen Sarma
System and Method for Cluster Management

Publication number: 20140130054

Abstract: A system and method of managing a cluster of distributed machines is described. A cluster manager receives status updates regarding tasks running on each machine in the cluster from a task tracker running on the machine. The cluster manager receives resource requests from a job tracker created by a client wishing to run a job in the cluster. The cluster manager is responsible for implementing push-based fair scheduling of resources to the job trackers. The job tracker is responsible for running tasks for one job in the resource identified by the cluster manager. In one embodiment, the job tracker can run in the client for small jobs and in the cluster for larger jobs. The cluster manager can also be restarted, for example, for software updates without restarting the cluster.

Type: Application

Filed: November 6, 2012

Publication date: May 8, 2014

Inventors: Dmytro Molkov, Ramkumar Vadali, Chung-Yang Chen, Joydeep Sen Sarma
Transferring behavioral profiles anonymously across domains for behavioral targeting

Patent number: 8660899

Abstract: A system and method are disclosed for transferring a behavior profile anonymously across multiple domains. The behavior profile may be established from a first domain, but transferred anonymously such that it is accessible by other domains. The behavior profile may be used for generating targeted advertisements.

Type: Grant

Filed: December 19, 2006

Date of Patent: February 25, 2014

Assignee: Yahoo! Inc.

Inventors: Joydeep Sen Sarma, Wu Wang
System and method of implementing disk ownership in networked storage

Patent number: 8380824

Abstract: A method and apparatus for identifying ownership by a computer of a storage device connected to a computer network is described. A first ownership information is written into a selected sector of the storage device by a computer having ownership of the device as a first indicia of ownership. A second ownership information is written into a storage device label of the storage device by the computer having ownership as a second indicia of ownership, the storage device label visible to a plurality of computers connected to the computer network. In the event that at a future time the first indicia of ownership does not match the second indicia of ownership, the first indicia of ownership is taken as definitive of ownership of the storage device.

Type: Grant

Filed: August 11, 2009

Date of Patent: February 19, 2013

Assignee: NetApp, Inc.

Inventors: Susan M. Coatney, Alan L. Rowe, Radek Aster, Joydeep Sen Sarma
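The tie-breaking rule in the abstract for patent 8380824 (when the two indicia of ownership disagree, the one written in the selected sector is definitive) can be shown with a short Python sketch. The function name `resolve_disk_owner` and the use of plain strings as owner identifiers are assumptions for illustration; the abstract does not describe the on-disk layout.

```python
from typing import Tuple


def resolve_disk_owner(sector_owner: str, label_owner: str) -> Tuple[str, bool]:
    """Return (definitive owner, whether the two indicia agreed).

    sector_owner: first indicia, read from the selected sector of the device.
    label_owner:  second indicia, read from the device label visible to all
                  computers on the network.
    """
    consistent = sector_owner == label_owner
    # On a mismatch the sector value is taken as definitive, per the abstract.
    return sector_owner, consistent


owner, consistent = resolve_disk_owner(sector_owner="filer-A", label_owner="filer-B")
```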
Mirror split brain avoidance

Patent number: 8060776

Abstract: A data storage system has two computers. Each computer is assigned to a set of data. Two copies of each set of data are maintained. A first copy is stored on a first set of disks and a second copy is stored on a second set of disks. Each time that data is written by a computer, a label is written to each set of disks, the label having fields for a status of each computer, a first ordinal which is increased each time that new data is written, and a time stamp giving a time at which the last write was performed. After failure of a computer, a processor determines, in response to reading the labels of the first set of disks and the second set of disks, the most up to date copy of the data assigned to the failed computer.

Type: Grant

Filed: June 19, 2008

Date of Patent: November 15, 2011

Assignee: NetApp, Inc.

Inventors: Scott Schoenthal, Steven H. Rodrigues, Alan L. Rowe, Joydeep sen Sarma, Susan M. Coatney
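The label comparison in the abstract for patent 8060776 can be sketched in Python. The label fields (per-computer status, an ordinal that increases on each write, and a time stamp) come from the abstract; the comparison rule itself, ordinal first and time stamp as a tie-breaker, is an assumption, since the abstract only says the processor determines the most up to date copy from the labels. `Label` and `most_recent_copy` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Label:
    ordinal: int       # increased each time new data is written
    timestamp: float   # time at which the last write was performed
    computer_status: Dict[str, str] = field(default_factory=dict)


def most_recent_copy(first: Label, second: Label) -> str:
    """Return which set of disks holds the most up to date copy.

    Assumption: the higher ordinal wins, and the time stamp breaks ties.
    """
    first_key = (first.ordinal, first.timestamp)
    second_key = (second.ordinal, second.timestamp)
    if first_key == second_key:
        return "either"
    return "first" if first_key > second_key else "second"


choice = most_recent_copy(
    Label(ordinal=7, timestamp=100.0),
    Label(ordinal=9, timestamp=90.0),
)
```

Preferring the ordinal over the time stamp in this sketch avoids depending on the two computers having synchronized clocks.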
System and method for coordinating cluster state information

Patent number: 7953924

Abstract: A method for managing a plurality of servers is disclosed. Each server of the plurality of servers has access to data stored by other servers. The data is stored to one or more data storage devices. Coordinating information is written for the plurality of servers to a master mailbox record. The coordinating information includes data that each server uses to recover after a failure by a server. The master mailbox record is stored on a selected storage device at a location known to the plurality of servers, and the selected storage device is designated as a lock storage device. A plurality of lock storage devices is chosen so that in the event of failure of a server of the plurality of servers, at least one lock storage device will be available to the remaining servers.

Type: Grant

Filed: January 22, 2010

Date of Patent: May 31, 2011

Assignee: NetApp, Inc.

Inventors: Richard O. Larson, Alan L. Rowe, Joydeep sen Sarma
