Patents by Inventor Rohit Ranchal

Rohit Ranchal has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Data deduplication in data platforms

Patent number: 11586598

Abstract: One embodiment of the invention provides a method for data deduplication storage management in a data platform including a plurality of data stores. The method comprises, for each data store of the plurality of data stores, determining a corresponding multi-level signature mapping data content of the data store into an ordered logical form comprising a plurality of data abstraction levels, determining a data similarity between the data store and each other data store of the plurality of data stores based on the multi-level signature corresponding to the data store and another multi-level signature corresponding to the other data store, and determining data usage of the data content of the data store. The method further comprises improving storage in the data platform by detecting duplicate data across the plurality of data stores based on each data similarity determined and each data usage determined.

Type: Grant

Filed: October 12, 2021

Date of Patent: February 21, 2023

Assignee: International Business Machines Corporation

Inventors: Rohit Ranchal, Aris Gkoulalas-Divanis, Paul R. Bastide
Speaker identity and content de-identification

Patent number: 11580951

Abstract: One embodiment of the invention provides a method for speaker identity and content de-identification under privacy guarantees. The method comprises receiving input indicative of privacy protection levels to enforce, extracting features from a speech recorded in a voice recording, recognizing and extracting textual content from the speech, parsing the textual content to recognize privacy-sensitive personal information about an individual, generating de-identified textual content by anonymizing the personal information to an extent that satisfies the privacy protection levels and conceals the individual's identity, and mapping the de-identified textual content to a speaker who delivered the speech. The method further comprises generating a synthetic speaker identity based on other features that are dissimilar from the features to an extent that satisfies the privacy protection levels, and synthesizing a new speech waveform based on the synthetic speaker identity to deliver the de-identified textual content.

Type: Grant

Filed: October 27, 2021

Date of Patent: February 14, 2023

Assignee: International Business Machines Corporation

Inventors: Aris Gkoulalas-Divanis, Xu Wang, Paul R. Bastide, Rohit Ranchal
Hierarchical federated learning using access permissions

Patent number: 11500929

Abstract: A method, apparatus, system, and computer program product for training a global machine learning model. A hierarchical structure for nodes in which the global machine learning model is located at a primary node in the hierarchical structure is identified. Authorized nodes in which local data is authorized for use in training in the authorized nodes for a local training of local machine learning models are determined. The machine learning models in the authorized nodes are trained using the local data in the authorized nodes to generate local model updates to weights in the local machine learning models. The local model updates to the weights are propagated upward in the hierarchical structure to the global machine learning model, wherein a node receiving local model updates to the weights from nodes from a lower level aggregates the weights in the local model updates received from the nodes in the lower level.

Type: Grant

Filed: November 7, 2019

Date of Patent: November 15, 2022

Assignee: International Business Machines Corporation

Inventors: Olivia Choudhury, Rohit Ranchal, HariGovind Venkatraj Ramasamy, Amarendra Das
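
Illustration: the upward aggregation described in the abstract above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the `Node` class, the boolean `authorized` flag, and the plain element-wise mean are all hypothetical choices made for the example, since the abstract does not specify how permissions are represented or how weights are combined.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A node in the hierarchy (hypothetical structure for illustration)."""
    name: str
    authorized: bool = True                      # assumed access-permission flag
    local_update: Optional[List[float]] = None   # weight update from local training
    children: List["Node"] = field(default_factory=list)


def aggregate(node: Node) -> Optional[List[float]]:
    """Return the averaged weight update for the subtree rooted at `node`.

    Updates from unauthorized nodes are ignored; each parent averages
    whatever its own local training and its children pass upward.
    """
    updates: List[List[float]] = []
    if node.authorized and node.local_update is not None:
        updates.append(node.local_update)
    for child in node.children:
        child_update = aggregate(child)
        if child_update is not None:
            updates.append(child_update)
    if not updates:
        return None
    # Element-wise mean of the collected updates (an assumed aggregation rule).
    return [sum(values) / len(values) for values in zip(*updates)]


# Usage: the primary node holds the global model and has three child nodes.
root = Node("primary", children=[
    Node("site-a", local_update=[1.0, 3.0]),
    Node("site-b", authorized=False, local_update=[100.0, 100.0]),
    Node("site-c", local_update=[3.0, 5.0]),
])
global_update = aggregate(root)
```

Because each level averages the averages below it, this sketch does not weight nodes by the amount of local data they hold; a real system would likely need to.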
Data leakage and misuse detection

Patent number: 11455391

Abstract: A computer-implemented system and method for a data leakage and misuse detection system comprises receiving an evaluation dataset A, and building a signature of the evaluation dataset A (sig(A)), where a signature of a dataset is a multi-level evaluation data abstraction representation of the dataset. The method further comprises building a signature for each of existing datasets B (B1, B2, . . . , Bn) (sig(Bx)) that are stored in a memory. The method then compares the sig(A) with each of the sig(Bx)s. A similarity score is derived based on the comparing, and responsive to determining the similarity score exceeds a predefined threshold, the method comprises generating an activity related to the determination.

Type: Grant

Filed: October 28, 2020

Date of Patent: September 27, 2022

Assignee: International Business Machines Corporation

Inventors: Aris Gkoulalas-Divanis, Paul R. Bastide, Rohit Ranchal
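
Illustration: the compare-and-threshold flow in the abstract above (build sig(A), build each sig(Bx), score, compare against a threshold) might look like the sketch below. Everything concrete here is an assumption made for the example and is not taken from the patent: the two signature levels (`schema` and `values`), the use of SHA-256 hashes, Jaccard similarity, the averaging across levels, and the default threshold of 0.8.

```python
import hashlib
from typing import Dict, List, Set

Row = Dict[str, str]
Signature = Dict[str, Set[str]]


def build_signature(rows: List[Row]) -> Signature:
    """Build a two-level signature: column names, and hashed column=value pairs."""
    columns: Set[str] = set()
    values: Set[str] = set()
    for row in rows:
        for column, value in row.items():
            columns.add(column)
            digest = hashlib.sha256(f"{column}={value}".encode("utf-8")).hexdigest()
            values.add(digest)
    return {"schema": columns, "values": values}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity; two empty sets are treated as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity(sig_a: Signature, sig_b: Signature) -> float:
    """Average the per-level Jaccard scores into one similarity score."""
    return sum(jaccard(sig_a[level], sig_b[level]) for level in sig_a) / len(sig_a)


def flag_matches(candidate: List[Row], existing: Dict[str, List[Row]],
                 threshold: float = 0.8) -> List[str]:
    """Return names of existing datasets whose score exceeds the threshold."""
    sig_a = build_signature(candidate)
    return [name for name, rows in existing.items()
            if similarity(sig_a, build_signature(rows)) > threshold]


# Usage: dataset A is compared against two stored datasets.
dataset_a = [{"id": "1", "name": "x"}, {"id": "2", "name": "y"}]
stored = {
    "B1": [{"id": "1", "name": "x"}, {"id": "2", "name": "y"}],
    "B2": [{"sku": "9"}],
}
flagged = flag_matches(dataset_a, stored)
```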
Attribute-based quasi-identifier discovery

Patent number: 11456996

Abstract: A method, system, and computer program product for privacy protection of records based on attribute-based determination of quasi-identifiers within the records is provided. The method receives a first set of records containing a first set of attributes for a set of individuals. The method receives a second set of records for the set of individuals, with the second set of records containing a second set of attributes. A first set of quasi-identifiers, based on the first set of attributes, is accessed for the first set of records. The method determines a set of new attributes of the second set of attributes based on the first set of attributes. A second set of quasi-identifiers is generated based on the first set of quasi-identifiers and the set of new attributes. The method generates an anonymized set of records from the second set of records based on the second set of quasi-identifiers.

Type: Grant

Filed: December 10, 2019

Date of Patent: September 27, 2022

Assignee: International Business Machines Corporation

Inventors: Aris Gkoulalas-Divanis, Rohit Ranchal, Paul R. Bastide
Utility-preserving text de-identification with privacy guarantees

Patent number: 11449674

Abstract: One embodiment of the invention provides a method for utility-preserving text de-identification. The method comprises generating corresponding processed text for each text document by applying at least one natural language processor (NLP) annotator to the text document to recognize and tag privacy-sensitive personal information corresponding to an individual, and replacing some words in the text document with some replacement values. The method further comprises determining infrequent terms occurring across all processed texts, filtering out the infrequent terms from the processed texts, and selectively reinstating to the processed texts at least one of the infrequent terms that is innocuous.

Type: Grant

Filed: April 28, 2020

Date of Patent: September 20, 2022

Assignee: International Business Machines Corporation

Inventors: Aris Gkoulalas-Divanis, Paul R. Bastide, Xu Wang, Rohit Ranchal
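
Illustration: the final steps of the abstract above (find infrequent terms across all processed texts, filter them out, reinstate the innocuous ones) can be sketched as below. This is a simplified sketch, not the patented method: it assumes the texts are already tokenized, treats "infrequent" as appearing in fewer than a hypothetical `min_docs` documents, takes the innocuous terms as a caller-supplied set, and folds the filter and reinstate steps into a single pass.

```python
from collections import Counter
from typing import List, Set


def filter_infrequent(docs: List[List[str]], min_docs: int,
                      innocuous: Set[str]) -> List[List[str]]:
    """Drop terms that occur in fewer than `min_docs` documents,
    except those listed as innocuous."""
    document_frequency: Counter = Counter()
    for doc in docs:
        # Count each term once per document.
        document_frequency.update(set(doc))
    infrequent = {term for term, n in document_frequency.items() if n < min_docs}
    # Innocuous infrequent terms are kept (the "reinstating" step).
    to_remove = infrequent - innocuous
    return [[term for term in doc if term not in to_remove] for doc in docs]


# Usage with three already-processed, tokenized texts.
processed = [
    ["patient", "has", "rare", "zebrafishitis"],
    ["patient", "has", "mild", "cough"],
    ["patient", "has", "cough"],
]
cleaned = filter_infrequent(processed, min_docs=2, innocuous={"mild"})
```

Here "rare" and "zebrafishitis" are dropped because each appears in only one text, while "mild" survives because it is on the innocuous list.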
Privacy change risk remediation for dependent product code

Publication number: 20220222370

Abstract: Examples described herein provide a computer-implemented method that includes scanning, by a processing device, a code dependency list and a hierarchy of a core code component. The method further includes pulling, by the processing device, data of the core code using the scanned code dependency list. The method further includes extracting, by the processing device, information from the data for each dependency. The method further includes scoring, by the processing device, the information between versions to detect a likelihood of user data posture changes. The method further includes enforcing, by the processing device, a compensating control for the core code.

Type: Application

Filed: January 12, 2021

Publication date: July 14, 2022

Inventors: Paul R. Bastide, Xu Wang, Rohit Ranchal, Senthil Bakthavachalam, Shakil Manzoor Khan
Data lineage and data provenance enhancement

Patent number: 11334268

Abstract: One embodiment of the invention provides a method for data lineage and data provenance enhancement. The method comprises arranging a data set into a logical ordering, and partitioning the data set into at least one set of partitions based on the logical ordering. The method further comprises, for each partition of the at least one set of partitions, determining a corresponding score for the partition, and determining a data similarity between the partition and each other partition of each other data set based on the corresponding score for the partition and another score corresponding to the other partition. The method further comprises determining data lineage of the data set based on each data similarity determined.

Type: Grant

Filed: January 10, 2020

Date of Patent: May 17, 2022

Assignee: International Business Machines Corporation

Inventors: Paul R. Bastide, Aris Gkoulalas-Divanis, Rohit Ranchal
Data leakage and misuse detection

Publication number: 20220129548

Abstract: A computer-implemented system and method for a data leakage and misuse detection system comprises receiving an evaluation dataset A, and building a signature of the evaluation dataset A (sig(A)), where a signature of a dataset is a multi-level evaluation data abstraction representation of the dataset. The method further comprises building a signature for each of existing datasets B (B1, B2, . . . , Bn) (sig(Bx)) that are stored in a memory. The method then compares the sig(A) with each of the sig(Bx)s. A similarity score is derived based on the comparing, and responsive to determining the similarity score exceeds a predefined threshold, the method comprises generating an activity related to the determination.

Type: Application

Filed: October 28, 2020

Publication date: April 28, 2022

Inventors: Aris Gkoulalas-Divanis, Paul R. Bastide, Rohit Ranchal
Managing personalized substance administration

Patent number: 11250939

Abstract: A computer system manages administration of substances. Information is received including administration of one or more substances and user preferences for the administration. Features of the received information are extracted, and interactions between the one or more substances and a new substance for administration are identified based on the extracted features. A schedule is generated for administration of the one or more substances and the new substance based on the interactions. Embodiments of the present invention further include a method and program product for managing administration of substances in substantially the same manner described above.

Type: Grant

Filed: January 15, 2019

Date of Patent: February 15, 2022

Assignee: International Business Machines Corporation

Inventors: Rohit Ranchal, Fang Lu, Paul R. Bastide, Grant Covell
Data deduplication in data platforms

Publication number: 20220043789

Abstract: One embodiment of the invention provides a method for data deduplication storage management in a data platform including a plurality of data stores. The method comprises, for each data store of the plurality of data stores, determining a corresponding multi-level signature mapping data content of the data store into an ordered logical form comprising a plurality of data abstraction levels, determining a data similarity between the data store and each other data store of the plurality of data stores based on the multi-level signature corresponding to the data store and another multi-level signature corresponding to the other data store, and determining data usage of the data content of the data store. The method further comprises improving storage in the data platform by detecting duplicate data across the plurality of data stores based on each data similarity determined and each data usage determined.

Type: Application

Filed: October 12, 2021

Publication date: February 10, 2022

Inventors: Rohit Ranchal, Aris Gkoulalas-Divanis, Paul R. Bastide
Speaker identity and content de-identification

Publication number: 20220044667

Abstract: One embodiment of the invention provides a method for speaker identity and content de-identification under privacy guarantees. The method comprises receiving input indicative of privacy protection levels to enforce, extracting features from a speech recorded in a voice recording, recognizing and extracting textual content from the speech, parsing the textual content to recognize privacy-sensitive personal information about an individual, generating de-identified textual content by anonymizing the personal information to an extent that satisfies the privacy protection levels and conceals the individual's identity, and mapping the de-identified textual content to a speaker who delivered the speech. The method further comprises generating a synthetic speaker identity based on other features that are dissimilar from the features to an extent that satisfies the privacy protection levels, and synthesizing a new speech waveform based on the synthetic speaker identity to deliver the de-identified textual content.

Type: Application

Filed: October 27, 2021

Publication date: February 10, 2022

Inventors: Aris Gkoulalas-Divanis, Xu Wang, Paul R. Bastide, Rohit Ranchal
Dataset origin anonymization and filtration

Patent number: 11216589

Abstract: Embodiments also include a method for filtering and securing content of datasets in computer readable form designated for release to reduce discernable inferences therein. The method includes receiving a first dataset having first records associated with a quasi-identifier. The first records have respective first data values associated with the quasi-identifier. The method includes receiving a second dataset having second records associated with the quasi-identifier. The second records have respective second data values associated with the quasi-identifier. The method includes defining a first cluster having a first boundary based on a combination of the first dataset and the second dataset. The method includes replacing a first one of the first data values with the first boundary and a second one of the second data values with the first boundary.

Type: Grant

Filed: March 11, 2020

Date of Patent: January 4, 2022

Assignee: International Business Machines Corporation

Inventors: Aris Gkoulalas-Divanis, Paul R. Bastide, Rohit Ranchal
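
Illustration: the replacement step in the abstract above, where values of a quasi-identifier from two datasets are replaced with a shared cluster boundary, could be sketched as below. The abstract does not say how clusters or boundaries are computed, so this sketch assumes numeric values, a single cluster covering both datasets, and a boundary represented as a `(min, max)` range; the function name is hypothetical.

```python
from typing import List, Tuple

Boundary = Tuple[int, int]


def generalize_to_boundary(
    first_values: List[int], second_values: List[int]
) -> Tuple[Boundary, List[Boundary], List[Boundary]]:
    """Replace every quasi-identifier value in both datasets with one
    boundary computed from the combined values."""
    combined = first_values + second_values
    if not combined:
        raise ValueError("at least one value is required to define a cluster")
    # Assumed boundary: the smallest range covering both datasets.
    boundary = (min(combined), max(combined))
    first_out = [boundary for _ in first_values]
    second_out = [boundary for _ in second_values]
    return boundary, first_out, second_out


# Usage: ages from two releases are generalized to the same range, so the
# value alone no longer hints at which dataset a record came from.
boundary, first_release, second_release = generalize_to_boundary([31, 35], [33, 38])
```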
Speaker identity and content de-identification

Patent number: 11217223

Abstract: One embodiment of the invention provides a method for speaker identity and content de-identification under privacy guarantees. The method comprises receiving input indicative of privacy protection levels to enforce, extracting features from a speech recorded in a voice recording, recognizing and extracting textual content from the speech, parsing the textual content to recognize privacy-sensitive personal information about an individual, generating de-identified textual content by anonymizing the personal information to an extent that satisfies the privacy protection levels and conceals the individual's identity, and mapping the de-identified textual content to a speaker who delivered the speech. The method further comprises generating a synthetic speaker identity based on other features that are dissimilar from the features to an extent that satisfies the privacy protection levels, and synthesizing a new speech waveform based on the synthetic speaker identity to deliver the de-identified textual content.

Type: Grant

Filed: April 28, 2020

Date of Patent: January 4, 2022

Assignee: International Business Machines Corporation

Inventors: Aris Gkoulalas-Divanis, Xu Wang, Paul R. Bastide, Rohit Ranchal
Healthcare risk analytics

Patent number: 11182721

Abstract: An approach is provided in which an information handling system trains on a set of historical data that includes a set of first infractions caused by a set of first businesses and a set of fines imposed on the set of first businesses based on the set of first infractions. The trained information handling system then performs a risk assessment of a second business that includes predicting a set of possible infractions of the second business based on a set of characteristics of the second business. Then, the information handling system predicts a set of possible fines corresponding to the set of possible infractions based on the historical data. In turn, the information handling system generates a risk report that includes the set of possible infractions and the corresponding set of possible fines.

Type: Grant

Filed: May 22, 2018

Date of Patent: November 23, 2021

Assignee: International Business Machines Corporation

Inventors: Cindy Harro, Mark E. Elliott, Thomas H. Rogers, Paul R. Bastide, Aashita Shekhar, Rohit Ranchal
Data deduplication in data platforms

Patent number: 11182359

Abstract: One embodiment of the invention provides a method for data deduplication storage management in a data platform including a plurality of data stores. The method comprises, for each data store of the plurality of data stores, determining a corresponding multi-level signature mapping data content of the data store into an ordered logical form comprising a plurality of data abstraction levels, determining a data similarity between the data store and each other data store of the plurality of data stores based on the multi-level signature corresponding to the data store and another multi-level signature corresponding to the other data store, and determining data usage of the data content of the data store. The method further comprises improving storage in the data platform by detecting duplicate data across the plurality of data stores based on each data similarity determined and each data usage determined.

Type: Grant

Filed: January 10, 2020

Date of Patent: November 23, 2021

Assignee: International Business Machines Corporation

Inventors: Rohit Ranchal, Aris Gkoulalas-Divanis, Paul R. Bastide
Evidence mining for compliance management

Patent number: 11178022

Abstract: A method, apparatus, system, and computer program product for facilitating evidence collection. A set of evidence requirements is identified by computer system in which the set of evidence requirements is for a control that manages a set of resources in the computer system to enforce a policy in the computer system. Labels are associated by the computer system with historical evidence for the set of requirements. The historical evidence comprises prior evidence collected for compliance with the set of evidence requirements for the control and was accepted to meet the set of evidence requirements for the control. The historical evidence with the labels form labeled historical evidence. Rules for mining evidence for the set of evidence requirements for the control using the labeled historical evidence are learned by a machine learning model in the computer system.

Type: Grant

Filed: September 26, 2019

Date of Patent: November 16, 2021

Assignee: International Business Machines Corporation

Inventors: Rohit Ranchal, Uttam Thakore, HariGovind Venkatraj Ramasamy, Yi-hsiu Wei
Speaker identity and content de-identification

Publication number: 20210335337

Abstract: One embodiment of the invention provides a method for speaker identity and content de-identification under privacy guarantees. The method comprises receiving input indicative of privacy protection levels to enforce, extracting features from a speech recorded in a voice recording, recognizing and extracting textual content from the speech, parsing the textual content to recognize privacy-sensitive personal information about an individual, generating de-identified textual content by anonymizing the personal information to an extent that satisfies the privacy protection levels and conceals the individual's identity, and mapping the de-identified textual content to a speaker who delivered the speech. The method further comprises generating a synthetic speaker identity based on other features that are dissimilar from the features to an extent that satisfies the privacy protection levels, and synthesizing a new speech waveform based on the synthetic speaker identity to deliver the de-identified textual content.

Type: Application

Filed: April 28, 2020

Publication date: October 28, 2021

Inventors: Aris Gkoulalas-Divanis, Xu Wang, Paul R. Bastide, Rohit Ranchal
Utility-preserving text de-identification with privacy guarantees

Publication number: 20210334455

Abstract: One embodiment of the invention provides a method for utility-preserving text de-identification. The method comprises generating corresponding processed text for each text document by applying at least one natural language processor (NLP) annotator to the text document to recognize and tag privacy-sensitive personal information corresponding to an individual, and replacing some words in the text document with some replacement values. The method further comprises determining infrequent terms occurring across all processed texts, filtering out the infrequent terms from the processed texts, and selectively reinstating to the processed texts at least one of the infrequent terms that is innocuous.

Type: Application

Filed: April 28, 2020

Publication date: October 28, 2021

Inventors: Aris Gkoulalas-Divanis, Paul R. Bastide, Xu Wang, Rohit Ranchal
Dataset origin anonymization and filtration

Publication number: 20210286898

Abstract: Embodiments also include a method for filtering and securing content of datasets in computer readable form designated for release to reduce discernable inferences therein. The method includes receiving a first dataset having first records associated with a quasi-identifier. The first records have respective first data values associated with the quasi-identifier. The method includes receiving a second dataset having second records associated with the quasi-identifier. The second records have respective second data values associated with the quasi-identifier. The method includes defining a first cluster having a first boundary based on a combination of the first dataset and the second dataset. The method includes replacing a first one of the first data values with the first boundary and a second one of the second data values with the first boundary.

Type: Application

Filed: March 11, 2020

Publication date: September 16, 2021

Inventors: Aris Gkoulalas-Divanis, Paul R. Bastide, Rohit Ranchal
