Patents by Inventor Abhinav Duggal

Abhinav Duggal has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11349915
    Abstract: A source worker node at a source site fetches a task from a message queue. The task specifies replicating a first object at the source site to a destination site. A request for a connection is issued from the source worker node to the destination site. The request is received by a load balancer at the destination site and assigned to a destination worker node. A connection is established between the source and destination worker nodes. A determination is made that the destination site does not include an object that is the same as the first object. Upon the determination, a deduplication is performed between the source and destination worker nodes of segments into which the first object has been divided. Deduplicated segments of the first object are transmitted from the source worker node to the destination worker node for storage at the destination site.
    Type: Grant
    Filed: February 2, 2018
    Date of Patent: May 31, 2022
    Assignee: EMC IP Holding Company LLC
    Inventors: Kevin Xu, Abhinav Duggal, Atul Avinash Karmarkar, Philip Shilane
  • Patent number: 11327799
    Abstract: A schedule is stored indicating a frequency of replication from source to destination sites. When a replication job is initiated, information identifying one or more objects at the source site to be replicated is copied into a snapshot without pausing user operations against the one or more objects. The snapshot is compared with a previous snapshot to generate replication tasks for the replication job. The replication tasks are placed onto a message queue at the source site, where a worker node at the source site retrieves a replication task from the message queue and processes the replication task in conjunction with a worker node at the destination site.
    Type: Grant
    Filed: October 29, 2019
    Date of Patent: May 10, 2022
    Assignee: EMC IP Holding Company LLC
    Inventors: Atul Avinash Karmarkar, Philip Shilane, Kevin Xu, Abhinav Duggal
  • Publication number: 20220113902
    Abstract: An intelligent method of scheduling garbage collection (GC) in a storage system. A GC scheduler obtains capacity utilization and ingest rate of the storage system and calculates therefrom a predicted capacity utilization. When the predicted capacity utilization reaches a threshold, the GC scheduler invokes GC, but otherwise skips GC until such time as predicted capacity utilization reaches the threshold. The ingest rate may be calculated by performing a linear fit on past data ingest. The GC scheduler may calculate predicted capacity utilization periodically according to a preset period. The GC scheduler may calculate the predicted capacity utilization to a future date beyond the next period. The future date may be at least as far as the next period plus total ingest time.
    Type: Application
    Filed: October 13, 2020
    Publication date: April 14, 2022
    Inventors: Tony T. Wong, Abhinav Duggal, Joseph Jobi
  • Publication number: 20220083513
    Abstract: A method, apparatus, and system for redistributing files in a multi-node storage system to improve global deduplication storage savings is disclosed. A plurality of file cluster candidates are generated for a plurality of files stored at a multi-node storage system comprising a plurality of data nodes. A similarity index is determined for each of the plurality of file cluster candidates based on similarity of the files comprised in the file cluster candidate. A ranked recipe list comprising a plurality of recipes is generated. Each recipe is associated with one of the plurality of file cluster candidates, comprises a destination data node for the associated file cluster candidate, and is associated with a deduplication space savings. At least some of the plurality of files are moved between the plurality of data nodes based on the recipes in the ranked recipe list to improve deduplication space savings in the multi-node storage system.
    Type: Application
    Filed: September 17, 2020
    Publication date: March 17, 2022
    Inventors: Tony Wong, Abhinav Duggal, Smriti Thakkar, Yu Qiu, Pei Jie Sim, Rahul Nihalani
  • Patent number: 11226865
    Abstract: Embodiments for a mostly unique file selection (MUFS) process for a deduplication backup system are described. The process assigns tags to files. A tag serves as a hint about the similarity of files in a deduplication file system. It is expected that files from the same client machine will be assigned the same tag. The tag is the smallest unit of migration and serves as a hint of the similarity of the files. The MUFS process measures the uniqueness using a u-index that is a function of the total unique size of a tag relative to the total size of the tag. A load balancer then selects the most unique tags for migration to free the maximum space. It uses the u-index to measure the uniqueness percentage of a tag, so that tags with the highest u-index are selected for migration to free up maximum space on the source node.
    Type: Grant
    Filed: January 18, 2019
    Date of Patent: January 18, 2022
    Assignee: EMC IP Holding Company LLC
    Inventors: Tony Wong, Hemanth Satyanarayana, Abhinav Duggal
  • Publication number: 20210374124
    Abstract: Systems and methods for verifying files in bulk in a file system. When files are represented by a segment tree, the levels of the segment trees are walked by level such that multiple files are verified at the same time in order to identify missing segments. Then, a bottom-up scan is performed using the missing segments to identify the files corresponding to the missing segments. The missing files can then be handled by the file system.
    Type: Application
    Filed: August 17, 2021
    Publication date: December 2, 2021
    Inventors: Abhinav Duggal, Tony Wong
  • Publication number: 20210271644
    Abstract: A garbage collection assisted deduplication process determines whether or not data segments should be deduplicated based on the liveness of segment data in a region, and the number of segments subject to deduplication in the region. Ingested data is divided into a plurality of segments, and a fingerprint is calculated for each segment. An index table entry maps a fingerprint to a region and container ID, and a perfect hash vector is set up for this mapping. A percentage of live segments in the region relative to a liveness threshold is determined, as is a number of segments in the region subject to deduplication relative to a deduplication threshold. If a region is sufficiently live, deduplication is performed, but if the region is dead, deduplication is not performed. For a live region, if the number of deduplicated segments is too low, deduplication is not performed.
    Type: Application
    Filed: February 28, 2020
    Publication date: September 2, 2021
    Inventors: Ramprasad Chinthekindi, Abhinav Duggal
  • Patent number: 11100088
    Abstract: Systems and methods for verifying files in bulk in a file system. When files are represented by a segment tree, the levels of the segment trees are walked by level such that multiple files are verified at the same time in order to identify missing segments. Then, a bottom-up scan is performed using the missing segments to identify the files corresponding to the missing segments. The missing files can then be handled by the file system.
    Type: Grant
    Filed: October 21, 2019
    Date of Patent: August 24, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Abhinav Duggal, Tony Wong
  • Patent number: 11093387
    Abstract: System generates data structure based on unique identifiers of objects in object storages and sets indicators in positions that correspond to hashes of unique identifiers of active objects. If a first number of regions of active data objects in first data storage and second number of regions of active data objects in second data storage each fail to satisfy data threshold, then system creates model identifying locations and sizes of regions of active data objects in first data storage and regions of active data objects in second data storage. System resets indicators in positions in data structure which correspond to hashes of unique identifiers of active data objects associated with model and enables remote storage to use model to copy regions of active data objects in first data storage and second data storage to third data storage, and to delete first data storage and second data storage.
    Type: Grant
    Filed: October 26, 2018
    Date of Patent: August 17, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Ramprasad Chinthekindi, Abhinav Duggal, Kalidas Balakrishnan, Fani Jenkins
  • Patent number: 11093453
    Abstract: A data management device includes a persistent storage and a processor. The persistent storage includes meta-data of data stored in a long term retention (LTR) storage. The processor obtains a file storage request for a file and deduplicates the file against segments stored in the LTR storage while performing garbage collection on the LTR storage. Performing garbage collection includes deleting segments of the data stored in the LTR storage using the meta-data. The meta-data is not stored in the LTR storage.
    Type: Grant
    Filed: August 31, 2017
    Date of Patent: August 17, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Abdullah Reza, Abhinav Duggal, Lan Bai
  • Patent number: 11010257
    Abstract: Embodiments for a memory efficient perfect hashing for large records. A container ID set is divided into multiple fixed range sizes. These ranges are then mapped into perfect hash buckets until each bucket is filled to uniformly distribute the container IDs across different perfect hash buckets so that the number of CIDs in every perfect hash bucket is the same or nearly the same. Individual perfect hash functions are created for each perfect hash bucket. With container IDs as keys, the process maps n keys to n positions to reduce any extra memory overhead. The perfect hash function is implemented using a compress, hash, displace (CHD) algorithm using two levels of hash functions. The level 1 hash function divides the keys into multiple internal buckets with a defined average number of keys per bucket. The CHD algorithm iteratively tries different level 2 hash variables to achieve collision-free mapping.
    Type: Grant
    Filed: October 12, 2018
    Date of Patent: May 18, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Tony Wong, Hemanth Satyanarayana, Abhinav Duggal, Ranganathan Dhathri Purohith
  • Patent number: 11010240
    Abstract: A controller at a source site generates a set of tasks associated with a replication job. Each task includes one or more of copying an object from the source to destination site, or deleting an object from the destination site. The tasks are placed onto a message queue at the source site. Source worker nodes at the source site retrieve the tasks from the source site message queue for processing in conjunction with destination worker nodes at the destination site. A destination worker node, upon receiving a task from a source worker node, places the task onto a message queue at the destination site for retrieval by a backend worker node that handles writing to an object store at the destination site.
    Type: Grant
    Filed: January 14, 2020
    Date of Patent: May 18, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Philip Shilane, Kevin Xu, Abhinav Duggal, Atul Avinash Karmarkar
  • Patent number: 11003624
    Abstract: Systems and methods for incrementally repairing physical locality for live or active data are provided. Files that are enumerated to determine their locality are identified using dataless consistency points. The files are walked in order to measure their locality or at least the locality of their data segments. Locality repair is performed when the locality is greater than a threshold locality.
    Type: Grant
    Filed: March 18, 2019
    Date of Patent: May 11, 2021
    Assignee: EMC IP Holding Company LLC
    Inventor: Abhinav Duggal
  • Patent number: 10949088
    Abstract: A data management device includes a persistent storage and a processor. The persistent storage includes an object storage. The processor generates a collision free hash function based on segments stored in the object storage. The processor generates a hash vector using the collision free hash function. The processor deduplicates the segments using the hash vector. The processor stores the deduplicated segments in the object storage.
    Type: Grant
    Filed: July 21, 2017
    Date of Patent: March 16, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Ramprasad Chinthekindi, Nitin Madan, Abhinav Duggal, Lan Bai
  • Patent number: 10929176
    Abstract: In an embodiment, a system and method for supporting a seeding process with suspend and resume capabilities are described. A resumable seeding component in a data seeding module can be used to move data from a source tier to a target tier. A resumption context including a perfect hash function (PHF) and a perfect hash vector (PHV) persists a state of a seeding process at the end of each operation in the seeding process. The PHV represents data segments of the data using the PHF. The resumption context is loaded into memory upon resumption of the seeding process after it is suspended. Information in the resumable context is used to determine a last successfully completed operation, and a last copied container. The seeding process is resumed by executing an operation following the completed operation in the resumable context.
    Type: Grant
    Filed: October 24, 2018
    Date of Patent: February 23, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Ramprasad Chinthekindi, Abhinav Duggal, Srikanth Srinivasan, Lan Bai
  • Publication number: 20210049044
    Abstract: Embodiments for allocating and reclaiming memory using dynamic buffer allocation for a slab memory allocator. The method keeps track of a count of a total number of worker threads and a count of a total number of quiesced threads, and determines if there is any free slab memory. If there is no free slab memory, the method triggers an out of memory event and increments the count of the total number of quiesced threads. It reclaims all objects currently allocated in an object pool, and allocates a buffer of a next smaller size than an original buffer until a sufficient amount of slab memory is freed.
    Type: Application
    Filed: November 3, 2020
    Publication date: February 18, 2021
    Inventors: Tony Wong, Abhinav Duggal, Hemanth Satyanarayana
  • Patent number: 10860212
    Abstract: A data management device includes a persistent storage and a processor. The persistent storage includes an object storage that stores segments. The processor generates a collision free hash function based on the segments, generates a hash vector using the collision free hash function, deduplicates a portion of the segments associated with to-be-migrated files using the hash vector, and migrates the to-be-migrated files using the deduplicated portion of the segments to a remote storage.
    Type: Grant
    Filed: July 21, 2017
    Date of Patent: December 8, 2020
    Assignee: EMC IP Holding Company LLC
    Inventors: Ramprasad Chinthekindi, Nitin Madan, Abhinav Duggal, Lan Bai
  • Patent number: 10853140
    Abstract: Embodiments for dynamically resizing buffers for a slab allocator process are described. The slab allocator informs the consumer that the memory buffer must be shrunk to a smaller size. A buffer allocation process dynamically reclaims portions of larger memory buffers to make room for a smaller allocation by shrinking data objects in larger slabs and returning slabs to reserve or free slab lists. Initially a large limit is set, and it is dynamically reduced once all the available memory is exhausted. This allows the slab allocator to adapt to the workload.
    Type: Grant
    Filed: January 31, 2019
    Date of Patent: December 1, 2020
    Assignee: EMC IP Holding Company LLC
    Inventors: Tony Wong, Abhinav Duggal, Hemanth Satyanarayana
  • Publication number: 20200349066
    Abstract: Systems and methods for performing data protection operations including garbage collection operations and copy forward operations. For deduplicated data stored in a cloud-based storage or in a cloud tier that stores containers containing dead and live segments or dead and live regions such as compression regions, the dead compression regions are deleted by copying the live compression regions into new containers and then deleting the old containers. The copy forward is based on a recipe from a data protection system and is performed using a serverless approach.
    Type: Application
    Filed: May 3, 2019
    Publication date: November 5, 2020
    Inventors: Ramprasad Chinthekindi, Philip Shilane, Abhinav Duggal
  • Publication number: 20200348852
    Abstract: A source site includes a controller, a set of source worker nodes, and a message queue connected between the controller and source worker nodes. The message queue receives messages and stores the messages for retrieval. A destination site includes a set of destination worker nodes. Tasks are generated to replicate changes to objects at the source site to the destination site. The controller pushes messages corresponding to the tasks onto the message queue. A source worker node retrieves a message corresponding to a task from the message queue for processing in conjunction with a destination worker node. The message is indicated as having been retrieved from the message queue.
    Type: Application
    Filed: July 16, 2020
    Publication date: November 5, 2020
    Inventors: Abhinav Duggal, Atul Avinash Karmarkar, Philip Shilane, Kevin Xu
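
The message flow in the abstract of publication 20200348852 (controller pushes task messages, a source worker retrieves one, and the message is indicated as retrieved) can be illustrated with a minimal in-memory sketch. This is not the patented implementation: the class and function names (`MessageQueue`, `controller_push`, `source_worker_step`), the `DONE` state, and the use of a Python set as a stand-in for the destination site are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    QUEUED = "queued"        # pushed by the controller, not yet picked up
    RETRIEVED = "retrieved"  # handed to a source worker node
    DONE = "done"            # assumption: not mentioned in the abstract


@dataclass
class Message:
    task_id: int
    object_name: str
    state: State = State.QUEUED


class MessageQueue:
    """Hypothetical in-memory stand-in for the source-site message queue."""

    def __init__(self):
        self._pending = deque()
        self.messages = {}  # task_id -> Message, kept so state can be inspected

    def push(self, message):
        self.messages[message.task_id] = message
        self._pending.append(message)

    def retrieve(self):
        """Return the oldest queued message and indicate it as retrieved."""
        if not self._pending:
            return None
        message = self._pending.popleft()
        message.state = State.RETRIEVED
        return message

    def complete(self, task_id):
        self.messages[task_id].state = State.DONE


def controller_push(queue, changed_objects):
    """Controller: one task message per changed object."""
    for task_id, name in enumerate(changed_objects):
        queue.push(Message(task_id, name))


def source_worker_step(queue, destination_store):
    """Source worker: take one message and replicate the object it names."""
    message = queue.retrieve()
    if message is None:
        return None
    # Stand-in for processing in conjunction with a destination worker node.
    destination_store.add(message.object_name)
    queue.complete(message.task_id)
    return message


queue = MessageQueue()
destination = set()
controller_push(queue, ["a.txt", "b.txt"])
first = source_worker_step(queue, destination)
```

After the single worker step above, the first message has been processed while the second is still queued. Marking a message as retrieved instead of deleting it right away is what would let a real queue avoid handing the same task to two workers.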