Patents by Inventor Abhinav Duggal

Abhinav Duggal has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20200341892
    Abstract: Systems and methods for performing data protection operations including garbage collection operations and copy forward operations. For deduplicated data stored in cloud-based storage or in a cloud tier that stores containers containing dead and live regions such as compression regions, the dead segments in the dead compression regions are deleted by copying the live compression regions into new containers and then deleting the old containers. The copy forward is based on a recipe from a data protection system and is performed using a microservices-based approach.
    Type: Application
    Filed: April 26, 2019
    Publication date: October 29, 2020
    Inventors: Abhinav Duggal, Ramprasad Chinthekindi, Philip Shilane
  • Publication number: 20200341891
    Abstract: Systems and methods for performing data protection operations including garbage collection operations and copy forward operations. For deduplicated data stored in cloud-based storage or in a cloud tier that stores containers containing dead and live segments, the dead segments are deleted by copying live segments into new containers and then deleting the old containers. The copy forward is based on a recipe from a data protection system and is performed using microservices that can be run as needed in the cloud.
    Type: Application
    Filed: April 26, 2019
    Publication date: October 29, 2020
    Inventors: Philip Shilane, Abhinav Duggal, Ramprasad Chinthekindi
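
The two sibling publications above (20200341892 and 20200341891) describe the same copy-forward pattern at different granularities. A minimal Python sketch of that pattern follows, assuming a toy in-memory container layout and a recipe mapping; the names and structure are illustrative, not the filed implementation.

    # Sketch of copy-forward garbage collection for container-based storage:
    # live regions (or segments) are gathered into new containers, the new
    # containers are written, and only then are the old containers deleted.

    from typing import Dict, Set

    Container = Dict[str, bytes]          # region_id -> region data
    Store = Dict[str, Container]          # container_id -> container


    def copy_forward(store: Store, recipe: Dict[str, Set[str]],
                     regions_per_container: int = 4) -> None:
        """`recipe` maps an old container_id to its still-live region_ids."""
        new_containers = []
        current: Container = {}
        for old_id, live_regions in recipe.items():
            for region_id in sorted(live_regions):
                current[region_id] = store[old_id][region_id]
                if len(current) >= regions_per_container:
                    new_containers.append(current)
                    current = {}
        if current:
            new_containers.append(current)

        for i, container in enumerate(new_containers):   # write new containers
            store[f"new-{i}"] = container
        for old_id in recipe:                             # then delete old ones
            del store[old_id]


    if __name__ == "__main__":
        store: Store = {
            "c1": {"r1": b"live", "r2": b"dead", "r3": b"live"},
            "c2": {"r4": b"dead", "r5": b"live"},
        }
        copy_forward(store, recipe={"c1": {"r1", "r3"}, "c2": {"r5"}})
        print(sorted(store))       # ['new-0'] -- only the new container remains
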
  • Patent number: 10810162
    Abstract: A perfect hash vector (PHVEC) is created to track segments in a deduplication file system. Files are represented by segment trees having hierarchical segment levels. Containers store the segments and fingerprints of segments. Upper-level segments are traversed to identify a first set of fingerprints at each level. These fingerprints correspond to segments that should be present. The first set of fingerprints is hashed and bits are set in the PHVEC at the positions produced by the hashing. The containers are read to identify a second set of fingerprints. These fingerprints correspond to segments that are present. The second set of fingerprints is hashed and bits are cleared in the PHVEC at the positions produced by the hashing. If a bit was set and not cleared, it is determined that at least one segment is missing. If all set bits were also cleared, it is determined that no segments are missing.
    Type: Grant
    Filed: July 12, 2018
    Date of Patent: October 20, 2020
    Assignee: EMC IP Holding Company LLC
    Inventors: Tony Wong, Abhinav Duggal, Ramprasad Chinthekindi
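
A small Python sketch of the set-then-clear check described in the abstract above. An ordinary bit array and a plain cryptographic hash stand in for the perfect hash vector, and the helper names are assumptions for illustration.

    # Sketch of detecting missing segments with a bit vector:
    # set a bit for every fingerprint that *should* exist (walked from the
    # segment trees), clear a bit for every fingerprint actually found in
    # the containers, and report a problem if any bit stays set.
    # A real PHVEC uses a perfect hash so distinct fingerprints never share
    # a bit; here a plain hash stands in, so treat this as illustrative only.

    import hashlib


    def _position(fingerprint: bytes, size: int) -> int:
        return int.from_bytes(hashlib.sha256(fingerprint).digest()[:8], "big") % size


    def segments_missing(expected_fps, present_fps, size=1 << 20) -> bool:
        bits = bytearray(size)                 # stand-in for the PHVEC
        for fp in expected_fps:                # fingerprints referenced by files
            bits[_position(fp, size)] = 1
        for fp in present_fps:                 # fingerprints read from containers
            bits[_position(fp, size)] = 0
        return any(bits)                       # a leftover set bit => missing segment


    if __name__ == "__main__":
        expected = [b"fp-a", b"fp-b", b"fp-c"]
        present = [b"fp-a", b"fp-c"]           # fp-b was never written
        print(segments_missing(expected, present))   # True
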
  • Patent number: 10795812
    Abstract: A garbage collection (GC) process within a deduplication backup network comprises a GC component that identifies metadata stored in file system (FS) segments, stores the metadata in a metadata container both locally on the server and on cloud storage, and reads the locally stored metadata container to obtain metadata of the FS containers and determine their live data regions, wherein the metadata contains fingerprints of all segments of the FS containers. A copy forward component forwards the live data regions to new containers written both locally on the server and on the cloud storage, writes live portions of the metadata container to a new metadata container, also written both locally and on the cloud storage, and deletes dead compression regions from the cloud storage and the original metadata container from local storage and the cloud storage.
    Type: Grant
    Filed: June 30, 2017
    Date of Patent: October 6, 2020
    Assignee: EMC IP Holding Company LLC
    Inventors: Abhinav Duggal, Chinthekindi Ramprasad, Mahesh Kamat, Bhimsen Bhanjois
  • Patent number: 10761765
    Abstract: A source site includes a controller, a set of source worker nodes, and a message queue connected between the controller and source worker nodes. A destination site includes a set of destination worker nodes. The controller identifies differences between a first snapshot created at the source site at a first time and a second snapshot created at a second time, after the first time. Based on the differences, a set of tasks is generated. The tasks include one or more of copying an object from the source to the destination or deleting an object from the destination. The controller places the tasks onto the message queue. A first source worker node retrieves the first task and coordinates with a first destination worker node to perform the first task. A second source worker node retrieves the second task and coordinates with a second destination worker node to perform the second task.
    Type: Grant
    Filed: February 2, 2018
    Date of Patent: September 1, 2020
    Assignee: EMC IP Holding Company LLC
    Inventors: Abhinav Duggal, Atul Avinash Karmarkar, Philip Shilane, Kevin Xu
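
A rough Python sketch of the snapshot-difference step described above: the diff between two snapshots becomes copy and delete tasks, which are placed on a queue for worker nodes to retrieve. The task tuples and the in-process queue are assumptions, not the patented protocol.

    # Sketch: diff two snapshots (object name -> content hash) into replication
    # tasks and push them onto a queue that worker nodes would consume.

    import queue


    def diff_to_tasks(first_snapshot: dict, second_snapshot: dict) -> list:
        tasks = []
        for name, digest in second_snapshot.items():
            if first_snapshot.get(name) != digest:
                tasks.append(("copy", name))        # new or changed object
        for name in first_snapshot:
            if name not in second_snapshot:
                tasks.append(("delete", name))      # object removed at the source
        return tasks


    if __name__ == "__main__":
        snap1 = {"a.dat": "h1", "b.dat": "h2"}
        snap2 = {"a.dat": "h1", "b.dat": "h9", "c.dat": "h3"}

        mq: "queue.Queue[tuple]" = queue.Queue()    # stands in for the message queue
        for task in diff_to_tasks(snap1, snap2):
            mq.put(task)

        # A source worker node would retrieve tasks like this and coordinate
        # with a destination worker node to perform each one.
        while not mq.empty():
            print(mq.get())        # ('copy', 'b.dat'), then ('copy', 'c.dat')
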
  • Publication number: 20200249995
    Abstract: Embodiments for dynamically resizing buffers for a slab allocator process are described. The slab allocator informs the consumer that a memory buffer must be shrunk. A buffer allocation process dynamically reclaims portions of larger memory buffers to make room for a smaller allocation by shrinking data objects in larger slabs and returning slabs to reserve or free slab lists. Initially, a large limit is set; it is dynamically reduced once all the available memory is exhausted, allowing the slab allocator to adapt to the workload.
    Type: Application
    Filed: January 31, 2019
    Publication date: August 6, 2020
    Inventors: Tony Wong, Abhinav Duggal, Hemanth Satyanarayana
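
A simplified Python sketch of the adaptive-limit behavior described above: a generous per-buffer limit is set initially and reduced once the memory budget is exhausted, with oversized consumers asked to shrink. The class and field names are invented for illustration.

    # Sketch: a toy slab-style allocator that starts with a generous buffer
    # limit and shrinks it when the overall memory budget runs out, notifying
    # consumers so they can return space.

    class ToySlabAllocator:
        def __init__(self, budget: int, initial_limit: int):
            self.budget = budget            # total memory available
            self.limit = initial_limit      # current max size per buffer
            self.used = 0
            self.buffers = {}               # name -> allocated size

        def allocate(self, name: str, size: int) -> int:
            size = min(size, self.limit)
            while self.used + size > self.budget and self.limit > 1:
                self._shrink_limit()
                size = min(size, self.limit)
            self.buffers[name] = size
            self.used += size
            return size

        def _shrink_limit(self) -> None:
            """Halve the limit and ask oversized consumers to give memory back."""
            self.limit //= 2
            for name, size in list(self.buffers.items()):
                if size > self.limit:
                    self.used -= size - self.limit
                    self.buffers[name] = self.limit   # consumer shrank its buffer


    if __name__ == "__main__":
        alloc = ToySlabAllocator(budget=100, initial_limit=64)
        print(alloc.allocate("a", 64))   # 64 -- plenty of room under the large limit
        print(alloc.allocate("b", 64))   # limit drops so the request still fits
        print(alloc.limit, alloc.used)
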
  • Publication number: 20200233752
    Abstract: Embodiments for a mostly unique file selection (MUFS) process for a deduplication backup system are described. The process assigns tags to files; a tag serves as a hint about the similarity of files in a deduplication file system, and files from the same client machine are expected to be assigned the same tag. The tag is the smallest unit of migration. The MUFS process measures uniqueness using a u-index, which is a function of the total unique size of a tag relative to the total size of the tag. A load balancer then uses the u-index to measure the uniqueness percentage of each tag and selects the tags with the highest u-index for migration, freeing the maximum space on the source node.
    Type: Application
    Filed: January 18, 2019
    Publication date: July 23, 2020
    Inventors: Tony Wong, Hemanth Satyanarayana, Abhinav Duggal
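
A brief Python sketch of the u-index selection described above, where the u-index of a tag is its unique (non-deduplicated) size divided by its total size and the tags with the highest u-index are chosen for migration. The data layout is an assumption.

    # Sketch: compute u-index = unique_size / total_size per tag and pick the
    # most unique tags first, since migrating them frees the most space on
    # the source node.

    def u_index(unique_size: int, total_size: int) -> float:
        return unique_size / total_size if total_size else 0.0


    def select_tags_for_migration(tags: dict, space_needed: int) -> list:
        """tags maps tag -> (unique_size, total_size)."""
        ranked = sorted(tags, key=lambda t: u_index(*tags[t]), reverse=True)
        selected, freed = [], 0
        for tag in ranked:
            if freed >= space_needed:
                break
            selected.append(tag)
            freed += tags[tag][0]          # migrating a tag frees its unique size
        return selected


    if __name__ == "__main__":
        tags = {
            "client-a": (90, 100),   # u-index 0.90 -- mostly unique data
            "client-b": (10, 100),   # u-index 0.10 -- mostly shared (deduplicated) data
            "client-c": (50, 100),   # u-index 0.50
        }
        print(select_tags_for_migration(tags, space_needed=120))
        # ['client-a', 'client-c']
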
  • Patent number: 10713217
    Abstract: In general, embodiments of the invention relate to a method and system for managing persistent storage in a local computing device. More specifically, embodiments of the invention relate to determining the amount of space that will be freed up (or become available) in the persistent storage during a data transfer using a perfect hash function. Once the amount of data to be transferred is determined, embodiments of the invention initiate the allocation of an appropriate amount of space in the remote storage device and, subsequently, initiate the transfer of the data to the remote storage device.
    Type: Grant
    Filed: October 30, 2018
    Date of Patent: July 14, 2020
    Assignee: EMC IP Holding Company LLC
    Inventors: Srikanth Srinivasan, Ramprasad Chinthekindi, Abhinav Duggal
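
A small Python sketch of the estimation step described above: before the transfer, find the segments referenced only by the files being moved and sum their sizes to learn how much local space will be freed and how much remote space to allocate. Plain Python sets stand in for the perfect hash function, and all names are illustrative.

    # Sketch: estimate how much space a transfer will free locally (and how much
    # to pre-allocate remotely) by finding segments referenced *only* by the
    # files being moved.

    def estimate_transfer_size(moving_files, staying_files, segment_sizes):
        """Each *_files value is an iterable of fingerprints; segment_sizes maps
        fingerprint -> size in bytes."""
        still_referenced = set()
        for fps in staying_files.values():
            still_referenced.update(fps)

        to_move = set()
        for fps in moving_files.values():
            to_move.update(fps)

        exclusive = to_move - still_referenced          # freed once the files move
        return sum(segment_sizes[fp] for fp in exclusive)


    if __name__ == "__main__":
        segment_sizes = {"s1": 4096, "s2": 8192, "s3": 4096}
        moving = {"old.vm": ["s1", "s2"]}
        staying = {"live.vm": ["s2", "s3"]}             # s2 is shared, so it stays
        bytes_needed = estimate_transfer_size(moving, staying, segment_sizes)
        print(bytes_needed)                             # 4096 -- only s1 is exclusive
        # The caller would then ask the remote storage device to allocate
        # roughly this much space before starting the transfer.
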
  • Publication number: 20200159611
    Abstract: A controller at a source site generates a set of tasks associated with a replication job. Each task includes one or more of copying an object from the source to the destination site, or deleting an object from the destination site. The tasks are placed onto a message queue at the source site. Source worker nodes at the source site retrieve the tasks from the source site message queue for processing in conjunction with destination worker nodes at the destination site. A destination worker node, upon receiving a task from a source worker node, places the task onto a message queue at the destination site for retrieval by a backend worker node that handles writing to an object store at the destination site.
    Type: Application
    Filed: January 14, 2020
    Publication date: May 21, 2020
    Inventors: Philip Shilane, Kevin Xu, Abhinav Duggal, Atul Avinash Karmarkar
  • Patent number: 10649682
    Abstract: Described is a deduplicated storage system that may perform a focused sanitization process by reducing the number of data storage containers that must be sanitized. The system leverages additional characteristics of the files that need to be sanitized, such as the initial storage date (e.g., a data breach date) when a sensitive file (e.g., a file to be sanitized) was actually stored on the deduplicated storage system. By maintaining a creation date for data containers, the system may limit sanitization to those containers having a creation date on or after the initial storage date of the sensitive file. Accordingly, the system is capable of performing a more focused overwriting of data, thereby improving the overall efficiency of the sanitization process.
    Type: Grant
    Filed: October 6, 2017
    Date of Patent: May 12, 2020
    Assignee: EMC IP HOLDING COMPANY LLC
    Inventors: Ramprasad Chinthekindi, Shah Veeral, Abhinav Duggal
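
A short Python sketch of the focused selection described above: only containers created on or after the sensitive file's initial storage date (for example, a breach date) need to be overwritten. The field names are illustrative.

    # Sketch: limit sanitization to containers that could possibly hold the
    # sensitive data, i.e. those created on or after the date the sensitive
    # file was first stored.

    from datetime import date


    def containers_to_sanitize(containers, breach_date: date):
        """containers maps container_id -> creation date."""
        return [cid for cid, created in containers.items() if created >= breach_date]


    if __name__ == "__main__":
        containers = {
            "c-100": date(2017, 1, 5),
            "c-101": date(2017, 3, 20),
            "c-102": date(2017, 6, 1),
        }
        # The sensitive file first landed on the system on 2017-03-01, so c-100
        # cannot contain its segments and is skipped.
        print(containers_to_sanitize(containers, date(2017, 3, 1)))
        # ['c-101', 'c-102']
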
  • Patent number: 10649807
    Abstract: In an embodiment, a method for validating data integrity of a seeding process is described. The seeding process for migrating data from a source tier to a target tier persists a perfect hash vector (PHV) to a disk when the seeding process is suspended for various reasons. The PHV includes bits for fingerprints for data segments corresponding to the data, and can be reloaded into memory upon resumption of the seeding process. One or more bits corresponding to fingerprints for copied data segments are reset prior to starting the copy phase in the resumed run. A checksum of the PHV is calculated after the seeding process completes copying data segments in the containers. A non-zero checksum of the PHV indicates that one or more data segments are missing on the source tier or the data segments are not successfully copied to the target tier. The missing data segments and/or one or more related files are reported to a user via a user interface.
    Type: Grant
    Filed: October 24, 2018
    Date of Patent: May 12, 2020
    Assignee: EMC IP HOLDING COMPANY LLC
    Inventors: Ramprasad Chinthekindi, Abhinav Duggal, Srikanth Srinivasan, Lan Bai
  • Publication number: 20200133720
    Abstract: In an embodiment, a method for validating data integrity of a seeding process is described. The seeding process for migrating data from a source tier to a target tier persists a perfect hash vector (PHV) to a disk when the seeding process is suspended for various reasons. The PHV includes bits for fingerprints for data segments corresponding to the data, and can be reloaded into memory upon resumption of the seeding process. One or more bits corresponding to fingerprints for copied data segments are reset prior to starting the copy phase in the resumed run. A checksum of the PHV is calculated after the seeding process completes copying data segments in the containers. A non-zero checksum of the PHV indicates that one or more data segments are missing on the source tier or the data segments are not successfully copied to the target tier. The missing data segments and/or one or more related files are reported to a user via a user interface.
    Type: Application
    Filed: October 24, 2018
    Publication date: April 30, 2020
    Inventors: Ramprasad Chinthekindi, Abhinav Duggal, Srikanth Srinivasan, Lan Bai
  • Publication number: 20200133719
    Abstract: In an embodiment, a system and method for supporting a seeding process with suspend and resume capabilities are described. A resumable seeding component in a data seeding module can be used to move data from a source tier to a target tier. A resumption context including a perfect hash function (PHF) and a perfect hash vector (PHV) persists the state of the seeding process at the end of each operation in the seeding process. The PHV represents data segments of the data using the PHF. The resumption context is loaded into memory upon resumption of the seeding process after it is suspended. Information in the resumption context is used to determine the last successfully completed operation and the last copied container. The seeding process is resumed by executing the operation following the completed operation recorded in the resumption context.
    Type: Application
    Filed: October 24, 2018
    Publication date: April 30, 2020
    Inventors: Ramprasad Chinthekindi, Abhinav Duggal, Srikanth Srinivasan, Lan Bai
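
A minimal Python sketch of the resumption-context idea shared by this application and 20200133720 above: the seeding state (here just the completed operation and the last copied container) is persisted after each operation, and a restart resumes at the operation that follows the last completed one. The file format and operation names are assumptions.

    # Sketch: a resumable multi-phase job.  After each operation completes, its
    # name and the last copied container are persisted; on restart the job
    # resumes at the operation that follows the last completed one.

    import json
    import os

    OPERATIONS = ["build_phv", "enumerate_files", "copy_containers", "verify"]
    CONTEXT_PATH = "resumption_context.json"     # stand-in for the persisted context


    def save_context(completed_op: str, last_container: int) -> None:
        with open(CONTEXT_PATH, "w") as f:
            json.dump({"completed": completed_op, "last_container": last_container}, f)


    def load_context():
        if not os.path.exists(CONTEXT_PATH):
            return None
        with open(CONTEXT_PATH) as f:
            return json.load(f)


    def run_seeding():
        ctx = load_context()
        start = OPERATIONS.index(ctx["completed"]) + 1 if ctx else 0
        last_container = ctx["last_container"] if ctx else -1
        for op in OPERATIONS[start:]:
            print(f"running {op} (resuming after container {last_container})")
            # ... real work for the operation would happen here ...
            last_container += 1
            save_context(op, last_container)    # persist state for a later resume


    if __name__ == "__main__":
        run_seeding()   # run once, interrupt it, run again: it resumes where it left off
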
  • Publication number: 20200134042
    Abstract: In general, embodiments of the invention relate to a method and system for managing persistent storage in a local computing device. More specifically, embodiments of the invention relate to determining the amount of space that will be freed up (or become available) in the persistent storage during a data transfer using a perfect hash function. Once the amount of data to be transferred is determined, embodiments of the invention initiate the allocation of an appropriate amount of space in the remote storage device and, subsequently, initiate the transfer of the data to the remote storage device.
    Type: Application
    Filed: October 30, 2018
    Publication date: April 30, 2020
    Inventors: Srikanth Srinivasan, Ramprasad Chinthekindi, Abhinav Duggal
  • Publication number: 20200125410
    Abstract: A schedule is stored indicating a frequency of replication from source to destination sites. When a replication job is initiated, information identifying one or more objects at the source site to be replicated is copied into a snapshot without pausing user operations against the one or more objects. The snapshot is compared with a previous snapshot to generate replication tasks for the replication job. The replication tasks are placed onto a message queue at the source site, where a worker node at the source site retrieves a replication task from the message queue and processes the replication task in conjunction with a worker node at the destination site.
    Type: Application
    Filed: October 29, 2019
    Publication date: April 23, 2020
    Inventors: Atul Avinash Karmarkar, Philip Shilane, Kevin Xu, Abhinav Duggal
  • Patent number: 10628298
    Abstract: A first data structure is generated based on unique identifiers of objects in object storages. Indicators are set in positions in the first data structure corresponding to hashes of the unique identifiers of active objects in the storages. When garbage collection is suspended, suspension information is stored to persistent storage. Indicators are set in positions in a second data structure corresponding to hashes of the unique identifiers of data objects that are deduplicated to the storages while garbage collection is suspended. When garbage collection is resumed, the suspension information is retrieved from persistent storage. Indicators are then set in positions in the first data structure corresponding to hashes of the unique identifiers of the data objects whose indicators were set in the second data structure. Active objects are copied from a first object storage to a second object storage if the number of active objects in the first object storage does not satisfy a threshold.
    Type: Grant
    Filed: October 26, 2018
    Date of Patent: April 21, 2020
    Assignee: EMC IP HOLDING COMPANY LLC
    Inventors: Ramprasad Chinthekindi, Abhinav Duggal
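
A condensed Python sketch of the two-structure scheme described above: one structure tracks live objects, a second records objects deduplicated while collection is suspended, and on resume the second is folded back into the first before sparsely used storages are copied forward. Ordinary Python sets stand in for the hashed indicator structures.

    # Sketch: garbage collection that can be suspended and resumed without
    # losing track of objects that were written (deduplicated) in the meantime.

    class ResumableGC:
        def __init__(self):
            self.live = set()                      # first structure: known-live object ids
            self.suspended = False
            self.written_while_suspended = set()   # second structure

        def mark_live(self, object_id: str) -> None:
            if self.suspended:
                self.written_while_suspended.add(object_id)
            else:
                self.live.add(object_id)

        def suspend(self) -> None:
            self.suspended = True      # a real system also persists `live` here

        def resume(self) -> None:
            # Fold the objects recorded during suspension back into the first
            # structure so they are not treated as garbage.
            self.live |= self.written_while_suspended
            self.written_while_suspended.clear()
            self.suspended = False

        def sweep(self, storages: dict, threshold: int) -> None:
            """Copy live objects out of sparsely used storages, then drop them."""
            for sid, objects in list(storages.items()):
                live_here = objects & self.live
                if len(live_here) < threshold:
                    storages.setdefault("copied", set()).update(live_here)
                    del storages[sid]


    if __name__ == "__main__":
        gc = ResumableGC()
        gc.mark_live("o1")
        gc.suspend()
        gc.mark_live("o2")             # deduplicated while GC was suspended
        gc.resume()
        storages = {"s1": {"o1", "o2", "dead"}}
        gc.sweep(storages, threshold=3)
        print(storages)                # {'copied': {'o1', 'o2'}}
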
  • Publication number: 20200117546
    Abstract: Embodiments for memory-efficient perfect hashing for large records are described. A container ID set is divided into multiple fixed-size ranges. These ranges are then mapped into perfect hash buckets, filling each bucket in turn, so that the container IDs (CIDs) are distributed uniformly and every perfect hash bucket holds the same or nearly the same number of CIDs. An individual perfect hash function is created for each perfect hash bucket. With container IDs as keys, the process maps n keys to n positions to reduce extra memory overhead. The perfect hash function is implemented using a compress, hash, displace (CHD) algorithm that uses two levels of hash functions. The level 1 hash function divides the keys into multiple internal buckets with a defined average number of keys per bucket. The CHD algorithm iteratively tries different level 2 hash variables to achieve a collision-free mapping.
    Type: Application
    Filed: October 12, 2018
    Publication date: April 16, 2020
    Inventors: Tony Wong, Hemanth Satyanarayana, Abhinav Duggal, Ranganathan Dhathri Purohith
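
A simplified Python sketch of the two-level construction outlined above: a level 1 hash partitions keys into internal buckets, and for each bucket, largest first, increasing level 2 displacement values are tried until every key lands in a free slot. This is a toy hash-and-displace variant with a relaxed load factor, not the filed CHD implementation.

    # Sketch: build a minimal perfect-hash-style mapping with two hash levels.
    # Level 1 groups keys into buckets; level 2 tries displacement values per
    # bucket until the bucket's keys map to unused slots.

    import hashlib


    def _h(key: str, seed: int, mod: int) -> int:
        data = f"{seed}:{key}".encode()
        return int.from_bytes(hashlib.sha256(data).digest()[:8], "big") % mod


    def build(keys, num_buckets=None, load_factor=0.8):
        table_size = max(1, int(len(keys) / load_factor))
        num_buckets = num_buckets or max(1, len(keys) // 4)

        buckets = [[] for _ in range(num_buckets)]
        for key in keys:
            buckets[_h(key, 0, num_buckets)].append(key)   # level 1

        displacements = [0] * num_buckets
        used = [False] * table_size
        # Handle the fullest buckets first: they are the hardest to place.
        for b in sorted(range(num_buckets), key=lambda i: -len(buckets[i])):
            d = 1
            while True:                                    # level 2: try displacements
                slots = [_h(key, d, table_size) for key in buckets[b]]
                if len(set(slots)) == len(slots) and not any(used[s] for s in slots):
                    for s in slots:
                        used[s] = True
                    displacements[b] = d
                    break
                d += 1
        return displacements, num_buckets, table_size


    def lookup(key, displacements, num_buckets, table_size):
        b = _h(key, 0, num_buckets)
        return _h(key, displacements[b], table_size)


    if __name__ == "__main__":
        keys = [f"container-{i}" for i in range(100)]
        params = build(keys)
        positions = {lookup(k, *params) for k in keys}
        print(len(positions) == len(keys))    # True: collision-free mapping
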
  • Patent number: 10592158
    Abstract: A method for transferring data includes populating a perfect hash bit vector (PHV) using a perfect hash function (PHF) and a target index file to obtain a populated PHV, determining required segment references using the populated PHV and received segment references, providing the required segment references to a source storage device, and receiving segments corresponding to the required segment references from the source storage device.
    Type: Grant
    Filed: October 30, 2018
    Date of Patent: March 17, 2020
    Assignee: EMC IP Holding Company LLC
    Inventors: Ramprasad Chinthekindi, Abhinav Duggal
  • Patent number: 10585746
    Abstract: A controller at a source site generates a set of tasks associated with a replication job. Each task involves a source worker node from among a set of source worker nodes at the source site, a destination worker node from among a set of destination worker nodes at the destination site, and includes one or more of copying an object from the source to destination site, or deleting an object from the destination site. Status update messages concerning the tasks are received at a message queue connected between the controller and the set of source worker nodes. The status update messages are logged into a persistent key-value store. Upon a failure to complete the replication job, the key-value store is accessed to identify tasks that were and were not completed before the failure. The tasks that were not completed are resent to the source worker nodes.
    Type: Grant
    Filed: February 2, 2018
    Date of Patent: March 10, 2020
    Assignee: EMC IP Holding Company LLC
    Inventors: Philip Shilane, Kevin Xu, Abhinav Duggal, Atul Avinash Karmarkar
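
A small Python sketch of the recovery path described above: task status updates are logged to a persistent key-value store (a JSON file stands in for it here), and after a failure the store is consulted so that only tasks never recorded as done are resent. Names and the storage format are assumptions.

    # Sketch: log per-task status to a persistent key-value store so that, after
    # a crashed replication job, only the unfinished tasks are resent.

    import json
    import os

    STORE_PATH = "task_status.json"     # stand-in for the persistent key-value store


    def _load() -> dict:
        if os.path.exists(STORE_PATH):
            with open(STORE_PATH) as f:
                return json.load(f)
        return {}


    def record_status(task_id: str, status: str) -> None:
        store = _load()
        store[task_id] = status          # e.g. "queued", "in_progress", "done"
        with open(STORE_PATH, "w") as f:
            json.dump(store, f)


    def tasks_to_resend(all_task_ids) -> list:
        store = _load()
        return [t for t in all_task_ids if store.get(t) != "done"]


    if __name__ == "__main__":
        tasks = ["copy:a.dat", "copy:b.dat", "delete:c.dat"]
        record_status("copy:a.dat", "done")
        record_status("copy:b.dat", "in_progress")
        # Simulate a controller restart after a failure: resend whatever is
        # not recorded as done.
        print(tasks_to_resend(tasks))    # ['copy:b.dat', 'delete:c.dat'] on a fresh store
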
  • Publication number: 20200019623
    Abstract: A perfect hash vector (PHVEC) is created to track segments in a deduplication file system. Files are represented by segment trees having hierarchical segment levels. Containers store the segments and fingerprints of segments. Upper-level segments are traversed to identify a first set of fingerprints at each level. These fingerprints correspond to segments that should be present. The first set of fingerprints is hashed and bits are set in the PHVEC at the positions produced by the hashing. The containers are read to identify a second set of fingerprints. These fingerprints correspond to segments that are present. The second set of fingerprints is hashed and bits are cleared in the PHVEC at the positions produced by the hashing. If a bit was set and not cleared, it is determined that at least one segment is missing. If all set bits were also cleared, it is determined that no segments are missing.
    Type: Application
    Filed: July 12, 2018
    Publication date: January 16, 2020
    Inventors: Tony Wong, Abhinav Duggal, Ramprasad Chinthekindi