Patents by Inventor Rohit Ranchal
Rohit Ranchal has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11586598
Abstract: One embodiment of the invention provides a method for data deduplication storage management in a data platform including a plurality of data stores. The method comprises, for each data store of the plurality of data stores, determining a corresponding multi-level signature mapping data content of the data store into an ordered logical form comprising a plurality of data abstraction levels, determining a data similarity between the data store and each other data store of the plurality of data stores based on the multi-level signature corresponding to the data store and another multi-level signature corresponding to the other data store, and determining data usage of the data content of the data store. The method further comprises improving storage in the data platform by detecting duplicate data across the plurality of data stores based on each data similarity determined and each data usage determined.
Type: Grant
Filed: October 12, 2021
Date of Patent: February 21, 2023
Assignee: International Business Machines Corporation
Inventors: Rohit Ranchal, Aris Gkoulalas-Divanis, Paul R. Bastide
-
Patent number: 11580951
Abstract: One embodiment of the invention provides a method for speaker identity and content de-identification under privacy guarantees. The method comprises receiving input indicative of privacy protection levels to enforce, extracting features from a speech recorded in a voice recording, recognizing and extracting textual content from the speech, parsing the textual content to recognize privacy-sensitive personal information about an individual, generating de-identified textual content by anonymizing the personal information to an extent that satisfies the privacy protection levels and conceals the individual's identity, and mapping the de-identified textual content to a speaker who delivered the speech. The method further comprises generating a synthetic speaker identity based on other features that are dissimilar from the features to an extent that satisfies the privacy protection levels, and synthesizing a new speech waveform based on the synthetic speaker identity to deliver the de-identified textual content.
Type: Grant
Filed: October 27, 2021
Date of Patent: February 14, 2023
Assignee: International Business Machines Corporation
Inventors: Aris Gkoulalas-Divanis, Xu Wang, Paul R. Bastide, Rohit Ranchal
-
Patent number: 11500929
Abstract: A method, apparatus, system, and computer program product for training a global machine learning model. A hierarchical structure for nodes in which the global machine learning model is located at a primary node in the hierarchical structure is identified. Authorized nodes in which local data is authorized for use in training in the authorized nodes for a local training of local machine learning models are determined. The machine learning models in the authorized nodes are trained using the local data in the authorized nodes to generate local model updates to weights in the local machine learning models. The local model updates to the weights are propagated upward in the hierarchical structure to the global machine learning model, wherein a node receiving local model updates to the weights from nodes from a lower level aggregates the weights in the local model updates received from the nodes in the lower level.
Type: Grant
Filed: November 7, 2019
Date of Patent: November 15, 2022
Assignee: International Business Machines Corporation
Inventors: Olivia Choudhury, Rohit Ranchal, HariGovind Venkatraj Ramasamy, Amarendra Das
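The upward propagation described in this abstract can be pictured with a small sketch. This is not the patented method: the `Node` structure, the `aggregate` function, and the use of a plain element-wise mean as the aggregation rule are all assumptions made for illustration. Updates from nodes that are not authorized are ignored, and each parent averages whatever its lower level sends up.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One node in the hierarchy (hypothetical structure for illustration)."""
    name: str
    authorized: bool = False                      # may local data be used for training here?
    local_update: Optional[List[float]] = None    # weight update from local training, if any
    children: List["Node"] = field(default_factory=list)


def mean_updates(updates: List[List[float]]) -> List[float]:
    """Element-wise mean of equally sized weight-update vectors (assumed rule)."""
    return [sum(column) / len(updates) for column in zip(*updates)]


def aggregate(node: Node) -> Optional[List[float]]:
    """Return the update this node passes upward, or None if it has nothing to send."""
    # Collect the aggregated updates arriving from the level below.
    received = [u for u in (aggregate(child) for child in node.children) if u is not None]
    # A node contributes its own update only when its local data is authorized.
    if node.authorized and node.local_update is not None:
        received.append(node.local_update)
    return mean_updates(received) if received else None


# Two authorized sites and one unauthorized site under a regional node.
region = Node("region", children=[
    Node("site-a", authorized=True, local_update=[1.0, 2.0]),
    Node("site-b", authorized=True, local_update=[3.0, 0.0]),
    Node("site-c", authorized=False, local_update=[9.0, 9.0]),
])
primary = Node("primary", children=[region])
global_update = aggregate(primary)
print(global_update)
```

In this example the unauthorized site's update never reaches the primary node, so the global update is the mean of the two authorized sites only.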
-
Patent number: 11456996
Abstract: A method, system, and computer program product for privacy protection of records based on attribute-based determination of quasi-identifiers within the records is provided. The method receives a first set of records containing a first set of attributes for a set of individuals. The method receives a second set of records for the set of individuals, with the second set of records containing a second set of attributes. A first set of quasi-identifiers, based on the first set of attributes, is accessed for the first set of records. The method determines a set of new attributes of the second set of attributes based on the first set of attributes. A second set of quasi-identifiers is generated based on the first set of quasi-identifiers and the set of new attributes. The method generates an anonymized set of records from the second set of records based on the second set of quasi-identifiers.
Type: Grant
Filed: December 10, 2019
Date of Patent: September 27, 2022
Assignee: International Business Machines Corporation
Inventors: Aris Gkoulalas-Divanis, Rohit Ranchal, Paul R. Bastide
-
Patent number: 11455391
Abstract: A computer-implemented system and method for a data leakage and misuse detection system comprises receiving an evaluation dataset A, and building a signature of the evaluation dataset A (sig(A)), where a signature of a dataset is a multi-level evaluation data abstraction representation of the dataset. The method further comprises building a signature for each of existing datasets B (B1, B2, . . . , Bn) (sig(Bx)) that are stored in a memory. The method then compares the sig(A) with each of the sig(Bx)s. A similarity score is derived based on the comparing, and responsive to determining the similarity score exceeds a predefined threshold, the method comprises generating an activity related to the determination.
Type: Grant
Filed: October 28, 2020
Date of Patent: September 27, 2022
Assignee: International Business Machines Corporation
Inventors: Aris Gkoulalas-Divanis, Paul R. Bastide, Rohit Ranchal
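A rough sketch of the compare-and-threshold flow in this abstract follows. The abstract does not say what the signature levels contain or how similarity is scored, so everything here is an assumption for illustration: a three-level signature (column names, column types, column values), Jaccard overlap per level averaged into one score, and the hypothetical function names `build_signature`, `similarity`, and `flag_similar`.

```python
from typing import Dict, List, Set

Dataset = List[Dict[str, object]]
Signature = List[Set[str]]


def build_signature(rows: Dataset) -> Signature:
    """Build an assumed three-level signature: column names, types, values."""
    names = {col for row in rows for col in row}
    types = {f"{col}:{type(val).__name__}" for row in rows for col, val in row.items()}
    values = {f"{col}={val!r}" for row in rows for col, val in row.items()}
    return [names, types, values]


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity(sig_a: Signature, sig_b: Signature) -> float:
    """Average the per-level Jaccard overlap into one score in [0, 1]."""
    return sum(jaccard(x, y) for x, y in zip(sig_a, sig_b)) / len(sig_a)


def flag_similar(evaluation: Dataset, existing: Dict[str, Dataset], threshold: float) -> List[str]:
    """Return names of existing datasets whose score against A exceeds the threshold."""
    sig_a = build_signature(evaluation)
    return [name for name, rows in existing.items()
            if similarity(sig_a, build_signature(rows)) > threshold]


dataset_a = [{"id": 1, "zip": "10001"}, {"id": 2, "zip": "10002"}]
stored = {
    "B1": [{"id": 1, "zip": "10001"}, {"id": 2, "zip": "10002"}],  # copy of A
    "B2": [{"sku": "x", "price": 3.5}],                            # unrelated
}
flagged = flag_similar(dataset_a, stored, threshold=0.8)
print(flagged)
```

The "activity" the abstract mentions (for example, raising an alert) would be triggered for each name in `flagged`; that step is left out here.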
-
Patent number: 11449674
Abstract: One embodiment of the invention provides a method for utility-preserving text de-identification. The method comprises generating corresponding processed text for each text document by applying at least one natural language processor (NLP) annotator to the text document to recognize and tag privacy-sensitive personal information corresponding to an individual, and replacing some words in the text document with some replacement values. The method further comprises determining infrequent terms occurring across all processed texts, filtering out the infrequent terms from the processed texts, and selectively reinstating to the processed texts at least one of the infrequent terms that is innocuous.
Type: Grant
Filed: April 28, 2020
Date of Patent: September 20, 2022
Assignee: International Business Machines Corporation
Inventors: Aris Gkoulalas-Divanis, Paul R. Bastide, Xu Wang, Rohit Ranchal
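The filter-then-reinstate step in this abstract can be sketched as below. This is a minimal illustration under stated assumptions, not the patented procedure: the texts are taken to be already tokenized and tagged by an annotator (the `[NAME]` tag is a stand-in), "infrequent" is assumed to mean appearing in fewer than `min_docs` documents, and the `innocuous` set is a hypothetical allow-list supplied by the caller.

```python
from collections import Counter
from typing import List, Optional, Set


def filter_and_reinstate(texts: List[List[str]], min_docs: int,
                         innocuous: Set[str]) -> List[List[str]]:
    """Drop terms seen in fewer than min_docs texts, then restore innocuous ones."""
    # Document frequency: in how many processed texts does each term occur?
    doc_freq = Counter(term for tokens in texts for term in set(tokens))
    infrequent = {term for term, count in doc_freq.items() if count < min_docs}

    result = []
    for tokens in texts:
        # Step 1: filter out every infrequent term (None marks a removed slot).
        filtered: List[Optional[str]] = [None if t in infrequent else t for t in tokens]
        # Step 2: selectively reinstate removed terms that are on the allow-list.
        restored = [orig if kept is None and orig in innocuous else kept
                    for orig, kept in zip(tokens, filtered)]
        result.append([t for t in restored if t is not None])
    return result


processed = [
    ["[NAME]", "reports", "mild", "headache"],
    ["[NAME]", "reports", "rare", "porphyria"],
    ["[NAME]", "reports", "mild", "nausea", "today"],
]
cleaned = filter_and_reinstate(processed, min_docs=2,
                               innocuous={"headache", "nausea", "today"})
for tokens in cleaned:
    print(" ".join(tokens))
```

Rare terms can act as identifiers, which is why they are removed by default; reinstating the harmless ones keeps more of the text useful for later analysis.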
-
Publication number: 20220222370
Abstract: Examples described herein provide a computer-implemented method that includes scanning, by a processing device, a code dependency list and a hierarchy of a core code component. The method further includes pulling, by the processing device, data of the core code using the scanned code dependency list. The method further includes extracting, by the processing device, information from the data for each dependency. The method further includes scoring, by the processing device, the information between versions to detect a likelihood of user data posture changes. The method further includes enforcing, by the processing device, a compensating control for the core code.
Type: Application
Filed: January 12, 2021
Publication date: July 14, 2022
Inventors: Paul R. Bastide, Xu Wang, Rohit Ranchal, Senthil Bakthavachalam, Shakil Manzoor Khan
-
Patent number: 11334268
Abstract: One embodiment of the invention provides a method for data lineage and data provenance enhancement. The method comprises arranging a data set into a logical ordering, and partitioning the data set into at least one set of partitions based on the logical ordering. The method further comprises, for each partition of the at least one set of partitions, determining a corresponding score for the partition, and determining a data similarity between the partition and each other partition of each other data set based on the corresponding score for the partition and another score corresponding to the other partition. The method further comprises determining data lineage of the data set based on each data similarity determined.
Type: Grant
Filed: January 10, 2020
Date of Patent: May 17, 2022
Assignee: International Business Machines Corporation
Inventors: Paul R. Bastide, Aris Gkoulalas-Divanis, Rohit Ranchal
-
Publication number: 20220129548
Abstract: A computer-implemented system and method for a data leakage and misuse detection system comprises receiving an evaluation dataset A, and building a signature of the evaluation dataset A (sig(A)), where a signature of a dataset is a multi-level evaluation data abstraction representation of the dataset. The method further comprises building a signature for each of existing datasets B (B1, B2, . . . , Bn) (sig(Bx)) that are stored in a memory. The method then compares the sig(A) with each of the sig(Bx)s. A similarity score is derived based on the comparing, and responsive to determining the similarity score exceeds a predefined threshold, the method comprises generating an activity related to the determination.
Type: Application
Filed: October 28, 2020
Publication date: April 28, 2022
Inventors: Aris Gkoulalas-Divanis, Paul R. Bastide, Rohit Ranchal
-
Patent number: 11250939
Abstract: A computer system manages administration of substances. Information is received including administration of one or more substances and user preferences for the administration. Features of the received information are extracted, and interactions between the one or more substances and a new substance for administration are identified based on the extracted features. A schedule is generated for administration of the one or more substances and the new substance based on the interactions. Embodiments of the present invention further include a method and program product for managing administration of substances in substantially the same manner described above.
Type: Grant
Filed: January 15, 2019
Date of Patent: February 15, 2022
Assignee: International Business Machines Corporation
Inventors: Rohit Ranchal, Fang Lu, Paul R. Bastide, Grant Covell
-
Publication number: 20220043789
Abstract: One embodiment of the invention provides a method for data deduplication storage management in a data platform including a plurality of data stores. The method comprises, for each data store of the plurality of data stores, determining a corresponding multi-level signature mapping data content of the data store into an ordered logical form comprising a plurality of data abstraction levels, determining a data similarity between the data store and each other data store of the plurality of data stores based on the multi-level signature corresponding to the data store and another multi-level signature corresponding to the other data store, and determining data usage of the data content of the data store. The method further comprises improving storage in the data platform by detecting duplicate data across the plurality of data stores based on each data similarity determined and each data usage determined.
Type: Application
Filed: October 12, 2021
Publication date: February 10, 2022
Inventors: Rohit Ranchal, Aris Gkoulalas-Divanis, Paul R. Bastide
-
Publication number: 20220044667
Abstract: One embodiment of the invention provides a method for speaker identity and content de-identification under privacy guarantees. The method comprises receiving input indicative of privacy protection levels to enforce, extracting features from a speech recorded in a voice recording, recognizing and extracting textual content from the speech, parsing the textual content to recognize privacy-sensitive personal information about an individual, generating de-identified textual content by anonymizing the personal information to an extent that satisfies the privacy protection levels and conceals the individual's identity, and mapping the de-identified textual content to a speaker who delivered the speech. The method further comprises generating a synthetic speaker identity based on other features that are dissimilar from the features to an extent that satisfies the privacy protection levels, and synthesizing a new speech waveform based on the synthetic speaker identity to deliver the de-identified textual content.
Type: Application
Filed: October 27, 2021
Publication date: February 10, 2022
Inventors: Aris Gkoulalas-Divanis, Xu Wang, Paul R. Bastide, Rohit Ranchal
-
Patent number: 11217223
Abstract: One embodiment of the invention provides a method for speaker identity and content de-identification under privacy guarantees. The method comprises receiving input indicative of privacy protection levels to enforce, extracting features from a speech recorded in a voice recording, recognizing and extracting textual content from the speech, parsing the textual content to recognize privacy-sensitive personal information about an individual, generating de-identified textual content by anonymizing the personal information to an extent that satisfies the privacy protection levels and conceals the individual's identity, and mapping the de-identified textual content to a speaker who delivered the speech. The method further comprises generating a synthetic speaker identity based on other features that are dissimilar from the features to an extent that satisfies the privacy protection levels, and synthesizing a new speech waveform based on the synthetic speaker identity to deliver the de-identified textual content.
Type: Grant
Filed: April 28, 2020
Date of Patent: January 4, 2022
Assignee: International Business Machines Corporation
Inventors: Aris Gkoulalas-Divanis, Xu Wang, Paul R. Bastide, Rohit Ranchal
-
Patent number: 11216589
Abstract: Embodiments also include a method for filtering and securing content of datasets in computer readable form designated for release to reduce discernable inferences therein. The method includes receiving a first dataset having first records associated with a quasi-identifier. The first records have respective first data values associated with the quasi-identifier. The method includes receiving a second dataset having second records associated with the quasi-identifier. The second records have respective second data values associated with the quasi-identifier. The method includes defining a first cluster having a first boundary based on a combination of the first dataset and the second dataset. The method includes replacing a first one of the first data values with the first boundary and a second one of the second data values with the first boundary.
Type: Grant
Filed: March 11, 2020
Date of Patent: January 4, 2022
Assignee: International Business Machines Corporation
Inventors: Aris Gkoulalas-Divanis, Paul R. Bastide, Rohit Ranchal
-
Patent number: 11182359
Abstract: One embodiment of the invention provides a method for data deduplication storage management in a data platform including a plurality of data stores. The method comprises, for each data store of the plurality of data stores, determining a corresponding multi-level signature mapping data content of the data store into an ordered logical form comprising a plurality of data abstraction levels, determining a data similarity between the data store and each other data store of the plurality of data stores based on the multi-level signature corresponding to the data store and another multi-level signature corresponding to the other data store, and determining data usage of the data content of the data store. The method further comprises improving storage in the data platform by detecting duplicate data across the plurality of data stores based on each data similarity determined and each data usage determined.
Type: Grant
Filed: January 10, 2020
Date of Patent: November 23, 2021
Assignee: International Business Machines Corporation
Inventors: Rohit Ranchal, Aris Gkoulalas-Divanis, Paul R. Bastide
-
Patent number: 11182721
Abstract: An approach is provided in which an information handling system trains on a set of historical data that includes a set of first infractions caused by a set of first businesses and a set of fines imposed on the set of first businesses based on the set of first infractions. The trained information handling system then performs a risk assessment of a second business that includes predicting a set of possible infractions of the second business based on a set of characteristics of the second business. Then, the information handling system predicts a set of possible fines corresponding to the set of possible infractions based on the historical data. In turn, the information handling system generates a risk report that includes the set of possible infractions and the corresponding set of possible fines.
Type: Grant
Filed: May 22, 2018
Date of Patent: November 23, 2021
Assignee: International Business Machines Corporation
Inventors: Cindy Harro, Mark E. Elliott, Thomas H. Rogers, Paul R. Bastide, Aashita Shekhar, Rohit Ranchal
-
Patent number: 11178022
Abstract: A method, apparatus, system, and computer program product for facilitating evidence collection. A set of evidence requirements is identified by computer system in which the set of evidence requirements is for a control that manages a set of resources in the computer system to enforce a policy in the computer system. Labels are associated by the computer system with historical evidence for the set of requirements. The historical evidence comprises prior evidence collected for compliance with the set of evidence requirements for the control and was accepted to meet the set of evidence requirements for the control. The historical evidence with the labels form labeled historical evidence. Rules for mining evidence for the set of evidence requirements for the control using the labeled historical evidence are learned by a machine learning model in the computer system.
Type: Grant
Filed: September 26, 2019
Date of Patent: November 16, 2021
Assignee: International Business Machines Corporation
Inventors: Rohit Ranchal, Uttam Thakore, HariGovind Venkatraj Ramasamy, Yi-hsiu Wei
-
Publication number: 20210335337
Abstract: One embodiment of the invention provides a method for speaker identity and content de-identification under privacy guarantees. The method comprises receiving input indicative of privacy protection levels to enforce, extracting features from a speech recorded in a voice recording, recognizing and extracting textual content from the speech, parsing the textual content to recognize privacy-sensitive personal information about an individual, generating de-identified textual content by anonymizing the personal information to an extent that satisfies the privacy protection levels and conceals the individual's identity, and mapping the de-identified textual content to a speaker who delivered the speech. The method further comprises generating a synthetic speaker identity based on other features that are dissimilar from the features to an extent that satisfies the privacy protection levels, and synthesizing a new speech waveform based on the synthetic speaker identity to deliver the de-identified textual content.
Type: Application
Filed: April 28, 2020
Publication date: October 28, 2021
Inventors: Aris Gkoulalas-Divanis, Xu Wang, Paul R. Bastide, Rohit Ranchal
-
Publication number: 20210334455
Abstract: One embodiment of the invention provides a method for utility-preserving text de-identification. The method comprises generating corresponding processed text for each text document by applying at least one natural language processor (NLP) annotator to the text document to recognize and tag privacy-sensitive personal information corresponding to an individual, and replacing some words in the text document with some replacement values. The method further comprises determining infrequent terms occurring across all processed texts, filtering out the infrequent terms from the processed texts, and selectively reinstating to the processed texts at least one of the infrequent terms that is innocuous.
Type: Application
Filed: April 28, 2020
Publication date: October 28, 2021
Inventors: Aris Gkoulalas-Divanis, Paul R. Bastide, Xu Wang, Rohit Ranchal
-
Publication number: 20210286898
Abstract: Embodiments also include a method for filtering and securing content of datasets in computer readable form designated for release to reduce discernable inferences therein. The method includes receiving a first dataset having first records associated with a quasi-identifier. The first records have respective first data values associated with the quasi-identifier. The method includes receiving a second dataset having second records associated with the quasi-identifier. The second records have respective second data values associated with the quasi-identifier. The method includes defining a first cluster having a first boundary based on a combination of the first dataset and the second dataset. The method includes replacing a first one of the first data values with the first boundary and a second one of the second data values with the first boundary.
Type: Application
Filed: March 11, 2020
Publication date: September 16, 2021
Inventors: Aris Gkoulalas-Divanis, Paul R. Bastide, Rohit Ranchal