Patents by Inventor Avnish Kumar Rastogi

Avnish Kumar Rastogi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

ADAPATIVE SYSTEM FOR PROCESSING DISTRIBUTED DATA FILES AND A METHOD THEREOF

Publication number: 20230289370

Abstract: The present disclosure relates to a system and a method for processing distributed data files. The processor executes instructions to receive a set of instructions from a primary device, wherein the set of instructions comprises verification rules, validators, primary transformers and structure query transformers; generate processed data files by processing the distributed data files. The distributed data files are processed by performing at least one of: executing one of the verification rules, the validators and the primary transformers on the distributed data files; and transforming the distributed data files by executing the structure query transformers. The execution of the structured query transformers comprises steps of generating a dependency graph based upon dependencies between the structure query transformers; and determining a sequence of execution of the structured query transformers based upon the dependency graph; and transfer the processed data files to a data warehouse.
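The dependency-graph step in this abstract can be illustrated with a topological sort. The following is only a minimal sketch, not the patented implementation: the transformer names and the `execution_order` helper are hypothetical, and the standard-library `graphlib` module (Python 3.9+) stands in for whatever graph machinery the disclosure uses.

```python
from graphlib import TopologicalSorter


def execution_order(dependencies):
    """Return transformer names so each one runs after everything it depends on.

    `dependencies` maps a transformer name to the set of transformers whose
    output it reads (hypothetical names, for illustration only).
    """
    # static_order() raises graphlib.CycleError if the dependencies are cyclic.
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical structured query transformers and their dependencies.
deps = {
    "clean_orders": set(),
    "clean_customers": set(),
    "orders_enriched": {"clean_orders", "clean_customers"},
    "daily_revenue": {"orders_enriched"},
}
order = execution_order(deps)
```

Several valid orders may exist here; the only guarantee is that a transformer never precedes one of its dependencies.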

Type: Application

Filed: May 23, 2023

Publication date: September 14, 2023

Inventors: Avnish Kumar Rastogi, Nitin Narang, Mohammad Ajmal

System and method for processing skewed datasets

Patent number: 11727009

Abstract: Disclosed is a method and system for processing skewed datasets. The processor 202 is configured to capture a broadcast size of non-skewed datasets to be loaded onto a memory associated with one or more nodes in a distributed system. The skewed dataset is identified from two or more datasets to be joined. Each of the non-skewed dataset is divided into a plurality of non-skewed data chunks at the node and each of the non-skewed data chunk is broadcasted to one or more nodes having the skewed dataset. The joining operation is then performed between each of the skewed dataset and the non-skewed data chunk till all the non-skewed data chunks are consumed in the join operation. Resultant joined dataset is then collected as a single joined dataset from the nodes involved in the joining operation.
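The chunked broadcast join in this abstract can be sketched in a single process. This is an illustrative approximation under stated assumptions, not the patented method: the lists in `skewed_partitions` stand in for the nodes that hold the skewed dataset, `broadcast_size` is taken to be a row count, and every function name is hypothetical.

```python
from collections import defaultdict


def chunked(rows, size):
    """Yield consecutive slices of `rows` with at most `size` rows each."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def chunked_broadcast_join(skewed_partitions, small_rows, broadcast_size):
    """Join each skewed partition against the small dataset one chunk at a time."""
    joined = []
    for chunk in chunked(small_rows, broadcast_size):
        # "Broadcast" the chunk: build a lookup that every partition reuses.
        lookup = defaultdict(list)
        for key, value in chunk:
            lookup[key].append(value)
        for partition in skewed_partitions:
            for key, left_value in partition:
                for right_value in lookup.get(key, ()):
                    joined.append((key, left_value, right_value))
    # Results from all partitions are collected into one joined dataset.
    return joined


skewed_partitions = [[("a", 1), ("a", 2)], [("a", 3), ("b", 4)]]
small_rows = [("a", "x"), ("b", "y"), ("c", "z")]
result = chunked_broadcast_join(skewed_partitions, small_rows, broadcast_size=2)
```

Because only one chunk of the non-skewed dataset is held in memory at a time, the skewed rows never need to be shuffled between nodes.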

Type: Grant

Filed: September 29, 2020

Date of Patent: August 15, 2023

Inventor: Avnish Kumar Rastogi

Adapative system for processing distributed data files and a method thereof

Patent number: 11693884

Abstract: The present disclosure relates to a system and a method for processing distributed data files. The processor executes instructions to receive a set of instructions from a primary device, wherein the set of instructions comprises verification rules, validators, primary transformers and structure query transformers; generate processed data files by processing the distributed data files. The distributed data files are processed by performing at least one of: executing one of the verification rules, the validators and the primary transformers on the distributed data files; and transforming the distributed data files by executing the structure query transformers. The execution of the structured query transformers comprises steps of generating a dependency graph based upon dependencies between the structure query transformers; and determining a sequence of execution of the structured query transformers based upon the dependency graph; and transfer the processed data files to a data warehouse.

Type: Grant

Filed: March 4, 2020

Date of Patent: July 4, 2023

Assignee: HCL TECHNOLOGIES LIMITED

Inventors: Avnish Kumar Rastogi, Nitin Narang, Mohammad Ajmal

System and method for joining skewed datasets in a distributed computing environment

Patent number: 11615094

Abstract: Disclosed is a method and system for joining datasets in a distributed computing environment. The system comprises a memory 206 and a processor 202. The processor 202 identifies a skewed dataset from two or more datasets to be joined. The processor 202 identifies a replication parameter from a configuration file. The processor 202 then assigns a randomly assigned machine number to each chunk of the skewed dataset owned by the nodes/machines involved in the join operation. The processor 202 forms copies of the non-skewed dataset equal to the replication parameter and adds the copy number to each sample of the copy of the non-skewed dataset formed. Further, the processor 202 merges each non-skewed dataset into the final copy of the non-skewed dataset, forming a single non skewed dataset. The processor 202 then repeats these steps for all the non-skewed datasets involved in the join operation resulting in generation of merged copies of all the non-skewed datasets and then performs the joining operation.
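The replication-based join in this abstract resembles what is often called a salted join. The sketch below is a single-process approximation and not the patented implementation: `replication` plays the role of the replication parameter read from the configuration file, the random number assigned to each skewed chunk stands in for the machine number, and all names are hypothetical.

```python
import random


def salted_join(skewed_chunks, small_rows, replication, seed=0):
    """Join skewed chunks against a replicated copy of the small dataset."""
    rng = random.Random(seed)

    # Tag every row of a skewed chunk with that chunk's random number.
    salted_skewed = []
    for chunk in skewed_chunks:
        salt = rng.randrange(replication)
        for key, value in chunk:
            salted_skewed.append(((key, salt), value))

    # Make `replication` copies of the small dataset, add the copy number to
    # each row, and merge all copies into one lookup.
    replicated = {}
    for copy_number in range(replication):
        for key, value in small_rows:
            replicated.setdefault((key, copy_number), []).append(value)

    # Join on the combined (key, number) pair.
    return [
        (key, left, right)
        for (key, salt), left in salted_skewed
        for right in replicated.get((key, salt), ())
    ]


skewed_chunks = [[("a", 1), ("a", 2)], [("a", 3), ("b", 4)]]
small_rows = [("a", "x"), ("b", "y")]
result = salted_join(skewed_chunks, small_rows, replication=3)
```

Since every copy number appears in the replicated dataset, the join result is the same as an unsalted join, while rows sharing a hot key can be spread across up to `replication` machines.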

Type: Grant

Filed: August 12, 2020

Date of Patent: March 28, 2023

Assignee: HCL TECHNOLOGIES LIMITED

Inventor: Avnish Kumar Rastogi

SYSTEM AND METHOD FOR PROCESSING SKEWED DATASETS

Publication number: 20220100752

Abstract: Disclosed is a method and system for processing skewed datasets. The processor 202 is configured to capture a broadcast size of non-skewed datasets to be loaded onto a memory associated with one or more nodes in a distributed system. The skewed dataset is identified from two or more datasets to be joined. Each of the non-skewed dataset is divided into a plurality of non-skewed data chunks at the node and each of the non-skewed data chunk is broadcasted to one or more nodes having the skewed dataset. The joining operation is then performed between each of the skewed dataset and the non-skewed data chunk till all the non-skewed data chunks are consumed in the join operation. Resultant joined dataset is then collected as a single joined dataset from the nodes involved in the joining operation.

Type: Application

Filed: September 29, 2020

Publication date: March 31, 2022

Inventor: Avnish Kumar RASTOGI

SYSTEM AND METHOD FOR JOINING SKEWED DATASETS IN A DISTRIBUTED COMPUTING ENVIRONMENT

Publication number: 20220050845

Abstract: Disclosed is a method and system for joining datasets in a distributed computing environment. The system comprises a memory 206 and a processor 202. The processor 202 identifies a skewed dataset from two or more datasets to be joined. The processor 202 identifies a replication parameter from a configuration file. The processor 202 then assigns a randomly assigned machine number to each chunk of the skewed dataset owned by the nodes/machines involved in the join operation. The processor 202 forms copies of the non-skewed dataset equal to the replication parameter and adds the copy number to each sample of the copy of the non-skewed dataset formed. Further, the processor 202 merges each non-skewed dataset into the final copy of the non-skewed dataset, forming a single non skewed dataset. The processor 202 then repeats these steps for all the non-skewed datasets involved in the join operation resulting in generation of merged copies of all the non-skewed datasets and then performs the joining operation.

Type: Application

Filed: August 12, 2020

Publication date: February 17, 2022

Applicant: HCL TECHNOLOGIES LIMITED

Inventor: Avnish Kumar RASTOGI

System and method for generating synthetic data for minority classes in a large dataset

Patent number: 11126642

Abstract: Disclosed method for generating synthetic data for minority classes in a very large dataset comprises grouping samples stored on several devices, into different groups. A pivot is identified to be used as a reference for grouping the samples into bins. The samples are assigned to a bin, based on a closest pivot. The samples are regrouped into different groups, based on identities of the bins, and each of the groups is distributed to the several devices. Samples belonging to majority class and minority classes for which synthetic data is not being generated are removed from each of the different groups. Samples of each of these groups are arranged in different M-Trees to facilitate identification of K-nearest neighbours for each sample within each of the different groups to generate K pairs of nearest neighbours. Finally, synthetic samples are generated for the K pairs of nearest neighbours by creating random samples.
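The final step of this abstract, generating synthetic samples from nearest-neighbour pairs, can be sketched as follows. This is an assumption-laden illustration rather than the patented method: it uses a brute-force neighbour search instead of M-Trees, it runs on one machine with no pivots or bins, and it assumes SMOTE-style interpolation between a sample and its neighbour, which the abstract does not specify. All names are hypothetical.

```python
import math
import random


def k_nearest(sample, others, k):
    """Return the k points in `others` closest to `sample` (brute force)."""
    return sorted(others, key=lambda other: math.dist(sample, other))[:k]


def synthesize(minority, k, seed=0):
    """Create one synthetic point per (sample, neighbour) pair."""
    rng = random.Random(seed)
    synthetic = []
    for i, sample in enumerate(minority):
        others = minority[:i] + minority[i + 1:]
        for neighbour in k_nearest(sample, others, k):
            # Pick a random point on the segment from sample to neighbour.
            gap = rng.random()
            synthetic.append(
                tuple(s + gap * (n - s) for s, n in zip(sample, neighbour))
            )
    return synthetic


minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
synthetic = synthesize(minority, k=2)
```

Each minority sample yields `k` synthetic points, so the example produces eight new points that all lie within the unit square spanned by the originals.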

Type: Grant

Filed: July 29, 2019

Date of Patent: September 21, 2021

Inventors: Avnish Kumar Rastogi, Nitin Narang, Mohammad Ajmal

ADAPATIVE SYSTEM FOR PROCESSING DISTRIBUTED DATA FILES AND A METHOD THEREOF

Publication number: 20210279259

Abstract: The present disclosure relates to a system and a method for processing distributed data files. The processor executes instructions to receive a set of instructions from a primary device, wherein the set of instructions comprises verification rules, validators, primary transformers and structure query transformers; generate processed data files by processing the distributed data files. The distributed data files are processed by performing at least one of: executing one of the verification rules, the validators and the primary transformers on the distributed data files; and transforming the distributed data files by executing the structure query transformers. The execution of the structured query transformers comprises steps of generating a dependency graph based upon dependencies between the structure query transformers; and determining a sequence of execution of the structured query transformers based upon the dependency graph; and transfer the processed data files to a data warehouse.

Type: Application

Filed: March 4, 2020

Publication date: September 9, 2021

Applicant: HCL TECHNOLOGIES LIMITED

Inventors: Avnish Kumar RASTOGI, Nitin NARANG, Mohammad AJMAL

SYSTEM AND METHOD FOR GENERATING SYNTHETIC DATA FOR MINORITY CLASSES IN A LARGE DATASET

Publication number: 20210034645

Abstract: Disclosed method for generating synthetic data for minority classes in a very large dataset comprises grouping samples stored on several devices, into different groups. A pivot is identified to be used as a reference for grouping the samples into bins. The samples are assigned to a bin, based on a closest pivot. The samples are regrouped into different groups, based on identities of the bins, and each of the groups is distributed to the several devices. Samples belonging to majority class and minority classes for which synthetic data is not being generated are removed from each of the different groups. Samples of each of these groups are arranged in different M-Trees to facilitate identification of K-nearest neighbours for each sample within each of the different groups to generate K pairs of nearest neighbours. Finally, synthetic samples are generated for the K pairs of nearest neighbours by creating random samples.

Type: Application

Filed: July 29, 2019

Publication date: February 4, 2021

Inventors: Avnish Kumar Rastogi, Nitin Narang, Mohammad Ajmal