Patents by Inventor Arun Kumar Jagota

Arun Kumar Jagota has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20210241047
    Abstract: An online system performs predictions for real-time tasks and near real-time tasks that need to be performed by a deadline. A client device receives a real-time machine learning based model associated with a measure of accuracy. If the client device determines that a task can be performed using predictions having less than the specified measure of accuracy, the client device uses the real-time machine learning based model. If the client device determines that a higher level of accuracy of results is required, the client device sends a request to an online system. The online system provides a prediction along with a string representing a rationale for the prediction.
    Type: Application
    Filed: January 31, 2020
    Publication date: August 5, 2021
    Inventors: Rakesh Ganapathi Karanth, Arun Kumar Jagota, Kaushal Bansal, Amrita Dasgupta
  • Publication number: 20210241179
    Abstract: An online system performs predictions for real-time tasks and near real-time tasks that need to be performed by a deadline. A client device receives a real-time machine learning based model associated with a measure of accuracy. If the client device determines that a task can be performed using predictions having less than the specified measure of accuracy, the client device uses the real-time machine learning based model. If the client device determines that a higher level of accuracy of results is required, the client device sends a request to an online system. The online system provides a prediction along with a string representing a rationale for the prediction.
    Type: Application
    Filed: January 30, 2020
    Publication date: August 5, 2021
    Inventors: Rakesh Ganapathi Karanth, Arun Kumar Jagota, Kaushal Bansal, Amrita Dasgupta
  • Publication number: 20210232637
    Abstract: Determine first count of first records storing first value in first field, second count of second records storing second value in second field, third count of third records storing third value in third field. Determine count threshold using first, second and third counts, dispersion measure based on dispersion of values stored in second field by first records and other dispersion measure based on other dispersion of values stored in third field by first records. Train machine-learning model to determine dispersion measure threshold based on dispersion and other dispersion measures. If first count is greater than count threshold, and dispersion measure is greater than dispersion measure threshold, create match index based on first and second fields. Receive prospective record storing first value in first field, second value in second field. Use match index to identify record storing first value in first field, second value in second field as matching prospective record.
    Type: Application
    Filed: January 29, 2020
    Publication date: July 29, 2021
    Inventors: Arun Kumar Jagota, Ajitesh Jain, Rahul Mathias Madan, Shravani Madhavaram
  • Publication number: 20210224482
    Abstract: A system receives a record which includes a string and separates the string into a number of tokens, including a token and another token. The system identifies a pattern that includes an entity, another entity, and a number of entities that equals the number of tokens, and another pattern that includes the same number of entities as the number of tokens. The system determines a combined probability that combines a probability based on the number of entries in the entity's dictionary which stores the token, and another probability based on a number of character types in the other entity that match characters in the other token. If the combined probability associated with the pattern is greater than another combined probability associated with the other pattern, the system matches the record to a system record based on recognizing the token as the entity and the other token as the other entity.
    Type: Application
    Filed: January 22, 2020
    Publication date: July 22, 2021
    Inventors: Arun Kumar Jagota, Ajitesh Jain
  • Publication number: 20210224614
    Abstract: A model is trained to create a probability distribution of counts based on counts of distinct values stored by person profiles in a field. The model is trained to create another probability distribution of counts based on other counts of other distinct values stored by the person profiles in another field. The count of distinct values stored by a person profile in the field is identified. Another count of distinct values stored by the person profile in the other field is identified. A score is determined based on a cumulative distribution function of the count under the probability distribution of counts. Another score is determined based on the cumulative distribution function of the other count under the other probability distribution of counts. If the score and the other score combine in an overall score that satisfies a threshold, a message is output about the person profile being suspected of corruption.
    Type: Application
    Filed: January 17, 2020
    Publication date: July 22, 2021
    Inventor: Arun Kumar Jagota
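The scoring idea in the abstract of publication 20210224614 can be illustrated with a minimal Python sketch. This is an illustration under stated assumptions, not the method claimed in the application: the "model" here is simply an empirical distribution of counts, the two scores are combined by averaging, and the names `fit_counts`, `cdf`, and `is_suspect` are hypothetical.

```python
from bisect import bisect_right


def fit_counts(counts):
    # Assumption: the "trained model" is just the sorted list of observed counts,
    # which is enough to evaluate an empirical cumulative distribution function.
    return sorted(counts)


def cdf(sorted_counts, count):
    # Fraction of training profiles whose count is <= the given count.
    return bisect_right(sorted_counts, count) / len(sorted_counts)


def is_suspect(profile_counts, models, threshold):
    # One CDF score per field; a score near 1.0 means the profile stores
    # unusually many distinct values in that field.
    scores = [cdf(models[name], profile_counts[name]) for name in models]
    # Assumption: the overall score is the plain average of the field scores.
    overall = sum(scores) / len(scores)
    return overall >= threshold
```

For example, with `models = {"email": fit_counts([1, 1, 2, 3])}`, a profile storing 50 distinct email values gets a CDF score of 1.0 for that field and would be flagged at any threshold up to 1.0.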
  • Patent number: 11016959
    Abstract: A system tokenizes values stored in a field by multiple records. The system creates a trie from the tokenized values, each branch in the trie labeled with one of the tokenized values, each node storing a count indicating the number of the multiple records associated with a tokenized value sequence beginning from a root of the trie. The system tokenizes a value stored in the field by a prospective record. Beginning from the root of the trie, the system identifies each node corresponding to a token value sequence for the prospective record's tokenized value. Beginning from the most recently identified node for the prospective record's token value sequence, the system identifies each extending node which stores a count that satisfies a threshold, each identified extending node corresponding to another token value sequence. The system uses the other token value sequence to identify one of the multiple records that matches the prospective record.
    Type: Grant
    Filed: January 31, 2018
    Date of Patent: May 25, 2021
    Assignee: salesforce.com, inc.
    Inventors: Arun Kumar Jagota, Ajitesh Jain, Dmytro Kudriavtsev
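The counted trie described in the abstract of patent 11016959 can be sketched in a few lines of Python. This is only an illustration under assumptions, not the patented implementation: the whitespace tokenizer, the `TrieNode` class, the function names `build_trie` and `extensions`, and the reading of "satisfies a threshold" as `count >= threshold` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TrieNode:
    count: int = 0  # records whose token sequence passes through this node
    children: dict = field(default_factory=dict)  # token -> TrieNode


def tokenize(value):
    # Assumption: lowercase whitespace tokenization; the abstract names no tokenizer.
    return value.lower().split()


def build_trie(values):
    root = TrieNode()
    for value in values:
        node = root
        for token in tokenize(value):
            node = node.children.setdefault(token, TrieNode())
            node.count += 1
    return root


def extensions(root, prospective_value, threshold):
    """Return token sequences that extend the prospective value's own
    token sequence and whose nodes store a count >= threshold."""
    node = root
    prefix = []
    for token in tokenize(prospective_value):
        if token not in node.children:
            return []  # the prospective value is not a prefix of any stored value
        node = node.children[token]
        prefix.append(token)
    results = []
    stack = [(node, prefix)]
    while stack:
        current, seq = stack.pop()
        for token, child in current.children.items():
            if child.count >= threshold:
                extended = seq + [token]
                results.append(extended)
                stack.append((child, extended))
    return results
```

Each returned sequence is a candidate key for looking up existing records that may match the prospective record; the lookup itself is outside this sketch.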
  • Patent number: 11010771
    Abstract: A system determines factored score by multiplying factor and match score for values of field in two records, offset score by adding offset to factored score, and weighted score by applying weight to offset score. The system determines status for two records based on combining weighted score with other weighted score corresponding to other field of two records. The system revises factor, offset, and weight based on feedback associated with two records. The system determines revised factored score by multiplying revised factor and match score for other values of field in two other records, revised offset score by adding revised offset to revised factored score, and revised weighted score by applying revised weight to revised offset score. The system determines learned status for two other records based on combining revised weighted score with additional weighted score corresponding to other field for two other records.
    Type: Grant
    Filed: January 31, 2019
    Date of Patent: May 18, 2021
    Assignee: salesforce.com, inc.
    Inventors: Arun Kumar Jagota, Piranavan Selvanandan
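The score arithmetic in the abstract of patent 11010771 (factor, then offset, then weight) can be written out as a short Python sketch. Several details are assumptions made for illustration only: "applying weight" is taken to mean multiplication, field scores are combined by summation, the status is decided by a single threshold, and the names `FieldParams`, `weighted_score`, and `match_status` are hypothetical. The feedback-driven revision of factor, offset, and weight is not shown.

```python
from dataclasses import dataclass


@dataclass
class FieldParams:
    factor: float
    offset: float
    weight: float


def weighted_score(match_score, params):
    factored = params.factor * match_score  # factored score
    offset_score = factored + params.offset  # offset score
    # Assumption: "applying weight" means multiplying by the weight.
    return params.weight * offset_score


def match_status(match_scores, params_by_field, threshold):
    # Assumption: per-field weighted scores are combined by summation
    # and compared against one threshold.
    total = sum(
        weighted_score(score, params_by_field[name])
        for name, score in match_scores.items()
    )
    return "match" if total >= threshold else "no match"
```

In this form, revising the three parameters from feedback only changes the `FieldParams` values; the scoring functions stay the same for the later pairs of records.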
  • Publication number: 20210124779
    Abstract: A system creates a graph of nodes connected by edges, the nodes including: i) a first node associated with a first value and a count of the first value, and ii) a second node associated with a second value and a count of the second value, the edges including an edge that connects the first and second nodes and is associated with a count of instances of the first value being stored with the second value. The system includes each node and each edge associated with clique count less than clique threshold in keys set and deletes each node and each edge associated with clique count less than clique threshold. The system identifies triplet nodes connected by triplet edges. If estimated clique count for triplet values represented by triplet nodes is less than clique threshold, the system includes triplet values in keys set and identifies triplet of nodes as analyzed.
    Type: Application
    Filed: October 23, 2019
    Publication date: April 29, 2021
    Inventor: Arun Kumar Jagota
  • Patent number: 10956450
    Abstract: Some embodiments of the present invention include a method for determining a dense subset from a group of records using a graphical representation of the group of records, the graphical representation having nodes and edges, a node associated with a record from the group of records, an edge connecting two nodes associated with two related records, wherein a node is associated with a weight corresponding to a number of edges connected to the node, wherein a record is added to the dense subset based on its associated node having a highest weight and a density that satisfies a density threshold, the density being based on the content of the dense subset, and wherein the content of the dense subset is to be processed as including duplicate records.
    Type: Grant
    Filed: March 28, 2016
    Date of Patent: March 23, 2021
    Assignee: salesforce.com, inc.
    Inventors: Dai Duong Doan, Arun Kumar Jagota
  • Patent number: 10949395
    Abstract: Some embodiments of the present invention include a method for determining duplicate records in multiple objects and may include combining records associated with a first object with records associated with a second object to generate a third object, wherein the first object is related to the second object; performing de-duplication on the third object to generate a combined group of duplicate sets; and from the combined group of duplicate sets, identifying at least one duplicate set associated with both the first object and the second object based on the duplicate set having at least one record associated with the first object and at least one record associated with the second object.
    Type: Grant
    Filed: March 30, 2016
    Date of Patent: March 16, 2021
    Assignee: salesforce.com, inc.
    Inventors: Dai Duong Doan, Arun Kumar Jagota, Chenghung Ker, Parth Vaishnav, Danil Dvinov, Dmytro Kudriavtsev
  • Publication number: 20210034596
    Abstract: A training set is created via creating adjacent classified substrings by using character classes to replace corresponding characters in adjacent substrings in each training character string, and associating each pair of adjacent classified substrings and each pair of adjacent substrings with corresponding labels indicating whether corresponding pairs include any token boundary. The system splits input character string into beginning and ending parts and creates classified beginning part by replacing beginning part character with corresponding class and classified ending part by replacing ending part character with corresponding class. The machine-learning model determines probability of token identification, based on training set to determine count of instances that classified beginning part is paired with classified ending part and count of corresponding labels that indicate inclusion of any token boundary.
    Type: Application
    Filed: July 30, 2019
    Publication date: February 4, 2021
    Applicant: salesforce.com, inc.
    Inventor: Arun Kumar Jagota
  • Publication number: 20210034638
    Abstract: A system tokenizes raw values and corresponding standardized values into raw token sequences and corresponding standardized token sequences. A machine-learning model learns standardization from token insertions and token substitutions that modify the raw token sequences to match the corresponding standardized token sequences. The system tokenizes an input value into an input token sequence. The machine-learning model determines a probability of inserting an insertion token after an insertion markable token in the input token sequence. If the probability of inserting the insertion token satisfies a threshold, the system inserts the insertion token after the insertion markable token in the input token sequence. The machine-learning model determines a probability of substituting a substitution token for a substitutable token in the input token sequence.
    Type: Application
    Filed: July 31, 2019
    Publication date: February 4, 2021
    Applicant: salesforce.com, inc.
    Inventors: Arun Kumar Jagota, Stanislav Georgiev
  • Patent number: 10909575
    Abstract: New account recommendations for user account sets are described. A system creates an accounts profile for a set of accounts based on multiple attributes associated with each account of the set of accounts. The system calculates an account score for an account based on comparing multiple attributes associated with the account against the accounts profile, wherein the account is not in the set of accounts. The system determines whether the account score satisfies an account score threshold. The system recommends the account to a user associated with the set of accounts if the account score satisfies the account score threshold.
    Type: Grant
    Filed: June 25, 2015
    Date of Patent: February 2, 2021
    Assignee: salesforce.com, inc.
    Inventors: Arun Kumar Jagota, Sancho S. Pinto, Saurin G. Shah, Stanislav Georgiev
  • Patent number: 10901996
    Abstract: Some embodiments of the present invention include a method for identifying duplicate records from a group of records in a database system.
    Type: Grant
    Filed: February 24, 2016
    Date of Patent: January 26, 2021
    Assignee: salesforce.com, inc.
    Inventors: Dai Duong Doan, Arun Kumar Jagota, Chenghung Ker, Parth Vaishnav, Danil Dvinov, Dmytro Kudriavtsev
  • Publication number: 20200401595
    Abstract: A method and system for estimating a number of distinct entities in a set of records are described. For each one of a subset of records, a set of match rule keys are generated based on a set of match rules. Each match rule from the set of match rules defines a match between records, and each match rule key from the set of match rule keys includes at least a key field value. A high order key for the record is determined based on the match rule keys, and a counter associated with the high order key is incremented. When each record from the subset of records has been processed by determining the match rule keys, and incrementing the counter(s) of the high order keys, a sum of a number of counters that have a non-zero value is performed to estimate the distinct entities in the records.
    Type: Application
    Filed: June 21, 2019
    Publication date: December 24, 2020
    Applicant: salesforce.com, inc.
    Inventor: Arun Kumar Jagota
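The counting scheme in the abstract of publication 20200401595 can be illustrated with a small Python sketch. It rests on assumptions that the abstract does not confirm: a match rule is modeled as a tuple of field names, a match rule key is the rule's field names joined with the lowercased field values, and the high order key is taken to be the key of the first applicable rule in priority order. The function names `match_rule_keys` and `estimate_distinct` are hypothetical.

```python
from collections import Counter


def match_rule_keys(record, match_rules):
    # Build one key per match rule whose fields are all present in the record.
    keys = []
    for rule in match_rules:
        values = [record.get(field_name) for field_name in rule]
        if all(values):
            keys.append("|".join([*rule, *(str(v).lower() for v in values)]))
    return keys


def estimate_distinct(records, match_rules):
    counters = Counter()
    for record in records:
        keys = match_rule_keys(record, match_rules)
        if not keys:
            continue  # no rule applies, so the record contributes no key
        # Assumption: the high order key is the key of the first applicable rule.
        high_order_key = keys[0]
        counters[high_order_key] += 1
    # The estimate is the number of counters with a non-zero value.
    return sum(1 for value in counters.values() if value > 0)
```

With rules `[("email",), ("name",)]`, two records that share an email address (ignoring case) increment the same counter and are therefore counted as one entity.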
  • Publication number: 20200401587
    Abstract: A method and system of matching field values of a field type are described. Blurring operations are applied on first and second values to obtain blurred values. A first maximum score is determined from first scores for the blurred values, where each one of the first scores is indicative of a confidence that a match of the first and the second values occurs with knowledge of a first blurred value. A second maximum score is determined from second scores for the blurred values, where each one of the second scores is indicative of a confidence that a non-match of the first and the second values occurs with knowledge of the first blurred value. Responsive to determining that the first maximum score is greater than the second maximum score, an indication that the first value matches the second value is output.
    Type: Application
    Filed: June 21, 2019
    Publication date: December 24, 2020
    Applicant: salesforce.com, inc.
    Inventor: Arun Kumar Jagota
  • Publication number: 20200356574
    Abstract: A system determines a name probability based on a first name dataset frequency of a first name value stored by a first name field in a personal record and a last name dataset frequency of a last name value stored by a last name field in a personal record. The system determines at least one other probability based on another dataset frequency of another value stored by another field in the personal record and an additional dataset frequency of an additional value stored by an additional field in the personal record. The system determines a combined probability based on the name probability and the at least one other probability. The system increments a count of identifiable personal records for each personal record that has a corresponding combined probability that satisfies an identifiability threshold. The system outputs a message based on the count of identifiable personal records.
    Type: Application
    Filed: May 10, 2019
    Publication date: November 12, 2020
    Applicant: salesforce.com, inc.
    Inventors: Arun Kumar Jagota, Stanislav Georgiev
  • Patent number: 10817465
    Abstract: A system identifies a first number of distinct values stored in a first field by a dataset of records. The system identifies a second number of distinct values stored in a second field by the dataset of records. The system creates a trie from values stored in a field by multiple records, the field corresponding to the first field or the second field, based on comparing the first number to the second number. The system associates a node in the trie with one of the multiple records, based on a value stored in the field by the record. The system identifies a branch sequence in the trie as a key for a prospective record, based on a prospective value stored in a corresponding field by the prospective record. The system uses the key for the prospective record to identify one of the multiple records that matches the prospective record.
    Type: Grant
    Filed: April 25, 2017
    Date of Patent: October 27, 2020
    Assignee: salesforce.com, inc.
    Inventors: Arun Kumar Jagota, Dmytro Kudriavtsev
  • Patent number: 10817479
    Abstract: Recommending data providers' datasets based on database value densities is described. A database system determines a provider dataset density for a value by identifying a frequency of the value in a dataset that is provided by a data provider. The database system determines a user database density for the value by identifying a frequency of the value in a database used by a data user. The database system determines a relative density based on a relationship between the provider dataset density and the user database density. The database system determines an evaluation metric for the value, based on a combination of the relative density and the user database density. The database system causes a recommendation to be outputted, based on a relationship of the evaluation metric relative to other evaluation metrics for other values, which recommends that the data user acquire at least a part of the dataset.
    Type: Grant
    Filed: June 23, 2017
    Date of Patent: October 27, 2020
    Assignee: salesforce.com, inc.
    Inventors: Arun Kumar Jagota, Marc Joseph Delurgio, Venkata Murali Tejomurtula
  • Patent number: 10817549
    Abstract: System creates three tries based on values stored in first three fields by records. System associates node in third trie with record, based on value stored in third field by record. System associates node with first dispersion measure, based on values stored in first field by records associated with node, and with second dispersion measure, based on values stored in second field by records associated with node. System identifies branch sequence in third trie as key for prospective record, based on value stored in third field by prospective record. System uses key to identify a subset of records that match prospective record. If a count of the subset exceeds threshold, the system identifies other branch sequence in first trie or second trie as other key for prospective record, based on first dispersion measure and second dispersion measure. System uses the key and the other key to identify at least one record that matches prospective record.
    Type: Grant
    Filed: May 9, 2017
    Date of Patent: October 27, 2020
    Assignee: salesforce.com, inc.
    Inventors: Arun Kumar Jagota, Dmytro Kudriavtsev