Patents by Inventor Avnish Kumar Rastogi

Avnish Kumar Rastogi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230289370
    Abstract: The present disclosure relates to a system and a method for processing distributed data files. The processor executes instructions to receive a set of instructions from a primary device, wherein the set of instructions comprises verification rules, validators, primary transformers and structured query transformers; generate processed data files by processing the distributed data files; and transfer the processed data files to a data warehouse. The distributed data files are processed by performing at least one of: executing one of the verification rules, the validators and the primary transformers on the distributed data files; and transforming the distributed data files by executing the structured query transformers. The execution of the structured query transformers comprises generating a dependency graph based upon dependencies between the structured query transformers, and determining a sequence of execution of the structured query transformers based upon the dependency graph.
    Type: Application
    Filed: May 23, 2023
    Publication date: September 14, 2023
    Inventors: Avnish Kumar Rastogi, Nitin Narang, Mohammad Ajmal
  • Patent number: 11727009
    Abstract: Disclosed is a method and system for processing skewed datasets. The processor 202 is configured to capture a broadcast size of non-skewed datasets to be loaded onto a memory associated with one or more nodes in a distributed system. The skewed dataset is identified from two or more datasets to be joined. Each non-skewed dataset is divided into a plurality of non-skewed data chunks at the node, and each non-skewed data chunk is broadcast to one or more nodes having the skewed dataset. The joining operation is then performed between each skewed dataset and a non-skewed data chunk until all the non-skewed data chunks are consumed in the join operation. The resultant joined dataset is then collected as a single joined dataset from the nodes involved in the joining operation.
    Type: Grant
    Filed: September 29, 2020
    Date of Patent: August 15, 2023
    Inventor: Avnish Kumar Rastogi
  • Patent number: 11693884
    Abstract: The present disclosure relates to a system and a method for processing distributed data files. The processor executes instructions to receive a set of instructions from a primary device, wherein the set of instructions comprises verification rules, validators, primary transformers and structured query transformers; generate processed data files by processing the distributed data files; and transfer the processed data files to a data warehouse. The distributed data files are processed by performing at least one of: executing one of the verification rules, the validators and the primary transformers on the distributed data files; and transforming the distributed data files by executing the structured query transformers. The execution of the structured query transformers comprises generating a dependency graph based upon dependencies between the structured query transformers, and determining a sequence of execution of the structured query transformers based upon the dependency graph.
    Type: Grant
    Filed: March 4, 2020
    Date of Patent: July 4, 2023
    Assignee: HCL TECHNOLOGIES LIMITED
    Inventors: Avnish Kumar Rastogi, Nitin Narang, Mohammad Ajmal
  • Patent number: 11615094
    Abstract: Disclosed is a method and system for joining datasets in a distributed computing environment. The system comprises a memory 206 and a processor 202. The processor 202 identifies a skewed dataset from two or more datasets to be joined. The processor 202 identifies a replication parameter from a configuration file. The processor 202 then assigns a random machine number to each chunk of the skewed dataset owned by the nodes/machines involved in the join operation. The processor 202 forms copies of the non-skewed dataset equal in number to the replication parameter and adds the copy number to each sample of each copy formed. Further, the processor 202 merges each copy into the final copy of the non-skewed dataset, forming a single non-skewed dataset. The processor 202 then repeats these steps for all the non-skewed datasets involved in the join operation, resulting in the generation of merged copies of all the non-skewed datasets, and then performs the joining operation.
    Type: Grant
    Filed: August 12, 2020
    Date of Patent: March 28, 2023
    Assignee: HCL TECHNOLOGIES LIMITED
    Inventor: Avnish Kumar Rastogi
  • Publication number: 20220100752
    Abstract: Disclosed is a method and system for processing skewed datasets. The processor 202 is configured to capture a broadcast size of non-skewed datasets to be loaded onto a memory associated with one or more nodes in a distributed system. The skewed dataset is identified from two or more datasets to be joined. Each non-skewed dataset is divided into a plurality of non-skewed data chunks at the node, and each non-skewed data chunk is broadcast to one or more nodes having the skewed dataset. The joining operation is then performed between each skewed dataset and a non-skewed data chunk until all the non-skewed data chunks are consumed in the join operation. The resultant joined dataset is then collected as a single joined dataset from the nodes involved in the joining operation.
    Type: Application
    Filed: September 29, 2020
    Publication date: March 31, 2022
    Inventor: Avnish Kumar Rastogi
  • Publication number: 20220050845
    Abstract: Disclosed is a method and system for joining datasets in a distributed computing environment. The system comprises a memory 206 and a processor 202. The processor 202 identifies a skewed dataset from two or more datasets to be joined. The processor 202 identifies a replication parameter from a configuration file. The processor 202 then assigns a random machine number to each chunk of the skewed dataset owned by the nodes/machines involved in the join operation. The processor 202 forms copies of the non-skewed dataset equal in number to the replication parameter and adds the copy number to each sample of each copy formed. Further, the processor 202 merges each copy into the final copy of the non-skewed dataset, forming a single non-skewed dataset. The processor 202 then repeats these steps for all the non-skewed datasets involved in the join operation, resulting in the generation of merged copies of all the non-skewed datasets, and then performs the joining operation.
    Type: Application
    Filed: August 12, 2020
    Publication date: February 17, 2022
    Applicant: HCL TECHNOLOGIES LIMITED
    Inventor: Avnish Kumar Rastogi
  • Patent number: 11126642
    Abstract: The disclosed method for generating synthetic data for minority classes in a very large dataset comprises grouping samples stored on several devices into different groups. A pivot is identified to be used as a reference for grouping the samples into bins. The samples are assigned to a bin based on the closest pivot. The samples are regrouped into different groups based on identities of the bins, and each of the groups is distributed to the several devices. Samples belonging to the majority class and to minority classes for which synthetic data is not being generated are removed from each of the different groups. Samples of each of these groups are arranged in different M-Trees to facilitate identification of K-nearest neighbours for each sample within each of the different groups to generate K pairs of nearest neighbours. Finally, synthetic samples are generated for the K pairs of nearest neighbours by creating random samples.
    Type: Grant
    Filed: July 29, 2019
    Date of Patent: September 21, 2021
    Inventors: Avnish Kumar Rastogi, Nitin Narang, Mohammad Ajmal
  • Publication number: 20210279259
    Abstract: The present disclosure relates to a system and a method for processing distributed data files. The processor executes instructions to receive a set of instructions from a primary device, wherein the set of instructions comprises verification rules, validators, primary transformers and structured query transformers; generate processed data files by processing the distributed data files; and transfer the processed data files to a data warehouse. The distributed data files are processed by performing at least one of: executing one of the verification rules, the validators and the primary transformers on the distributed data files; and transforming the distributed data files by executing the structured query transformers. The execution of the structured query transformers comprises generating a dependency graph based upon dependencies between the structured query transformers, and determining a sequence of execution of the structured query transformers based upon the dependency graph.
    Type: Application
    Filed: March 4, 2020
    Publication date: September 9, 2021
    Applicant: HCL TECHNOLOGIES LIMITED
    Inventors: Avnish Kumar Rastogi, Nitin Narang, Mohammad Ajmal
  • Publication number: 20210034645
    Abstract: The disclosed method for generating synthetic data for minority classes in a very large dataset comprises grouping samples stored on several devices into different groups. A pivot is identified to be used as a reference for grouping the samples into bins. The samples are assigned to a bin based on the closest pivot. The samples are regrouped into different groups based on identities of the bins, and each of the groups is distributed to the several devices. Samples belonging to the majority class and to minority classes for which synthetic data is not being generated are removed from each of the different groups. Samples of each of these groups are arranged in different M-Trees to facilitate identification of K-nearest neighbours for each sample within each of the different groups to generate K pairs of nearest neighbours. Finally, synthetic samples are generated for the K pairs of nearest neighbours by creating random samples.
    Type: Application
    Filed: July 29, 2019
    Publication date: February 4, 2021
    Inventors: Avnish Kumar Rastogi, Nitin Narang, Mohammad Ajmal
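
Illustrative note on the distributed data file abstracts (patent 11693884 and its related publications): those abstracts describe building a dependency graph between the structured query transformers and deriving an execution sequence from it. The following is only a minimal sketch of that general idea, not the patented implementation. It assumes the dependencies are already known as a mapping from each transformer to the transformers it depends on, and it uses a standard topological sort from Python's standard library. The transformer names and the execution_order helper are hypothetical.

```python
from graphlib import TopologicalSorter, CycleError


def execution_order(dependencies):
    """Return transformer names so that each one comes after all of its dependencies.

    `dependencies` maps a transformer name to the set of names it depends on.
    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical transformers, for illustration only.
transformers = {
    "load_orders": set(),
    "load_customers": set(),
    "join_orders_customers": {"load_orders", "load_customers"},
    "daily_revenue": {"join_orders_customers"},
}

order = execution_order(transformers)
print(order)

# A circular dependency has no valid execution sequence.
try:
    execution_order({"a": {"b"}, "b": {"a"}})
    cycle_detected = False
except CycleError:
    cycle_detected = True
```

Transformers with no dependency on one another (here the two load steps) may appear in either order, so a caller should rely only on the relative order of dependent steps.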