Patents by Inventor Varun Mithal

Varun Mithal has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240070549
    Abstract: Systems and methods for extracting rule lists from tree ensembles are provided. A system extracts first stage candidate rules from individual trees. The system identifies the first stage candidate rules that satisfy a precision threshold and places those rules in a solution set. Subsequently, a determination is made whether a further stage is needed based on whether a predetermined number of positive data samples of the data set are covered by the solution set. In the further stage, the system generates next stage candidate rules from previous stage candidate rules that have not been pruned and identifies the next stage candidate rules that satisfy the precision threshold, placing those rules in the solution set. A simplified rule list is generated by identifying a minimum subset of rules in the solution set that covers the positive data samples within the precision threshold.
    Type: Application
    Filed: August 24, 2022
    Publication date: February 29, 2024
    Inventors: Gopiram Roshan Lal, Varun Mithal, Xiaotong Chen
  • Patent number: 11769087
    Abstract: A machine learning based method for multilabel learning with label relationships is provided. This methodology addresses the technical problem of alleviating the computational complexity of training a machine learning model that generates multilabel output with constraints, especially in contexts characterized by a large volume of data, by providing a new formulation that encodes probabilistic relationships among the labels as a regularization parameter in the training objective of the underlying model. For example, the training process of the model may be configured to have two objectives. Namely, in addition to the objective of minimizing conventional multilabel loss, there is another training objective, which is to minimize the penalty associated with the prediction generated by the model breaking probabilistic relationships among the labels.
    Type: Grant
    Filed: June 4, 2020
    Date of Patent: September 26, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Girish Kathalagiri Somashekairah, Varun Mithal, Aman Grover
  • Patent number: 11397899
    Abstract: In some embodiments, a computer system selects a first subset of candidate content items based on their filter scores that are generated based on a partial generalized linear mixed model comprising a baseline model and a user-based model, with the baseline model being a generalized linear model, and the user-based model being a random effects model based on user actions by the target user directed towards reference content items related to the candidate content items. In some embodiments, the computer system then selects a second subset from the first subset based on recommendation scores that are generated based on a full generalized linear mixed model comprising the baseline model, the user-based model, and an item-based model, with the item-based model being a random effects model based on user actions directed towards the candidate online content item by reference users related to the target user.
    Type: Grant
    Filed: March 26, 2019
    Date of Patent: July 26, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Huichao Xue, Girish Kathalagiri Somashekariah, Ye Yuan, Varun Mithal, Junrui Xu, Ada Cheuk Ying Yu
  • Publication number: 20210383306
    Abstract: A machine learning based method for multilabel learning with label relationships is provided. This methodology addresses the technical problem of alleviating the computational complexity of training a machine learning model that generates multilabel output with constraints, especially in contexts characterized by a large volume of data, by providing a new formulation that encodes probabilistic relationships among the labels as a regularization parameter in the training objective of the underlying model. For example, the training process of the model may be configured to have two objectives. Namely, in addition to the objective of minimizing conventional multilabel loss, there is another training objective, which is to minimize the penalty associated with the prediction generated by the model breaking probabilistic relationships among the labels.
    Type: Application
    Filed: June 4, 2020
    Publication date: December 9, 2021
    Inventors: Girish Kathalagiri Somashekairah, Varun Mithal, Aman Grover
  • Patent number: 11138281
    Abstract: Techniques for using online user activity in determining relevance of attributes to improve computer functionality in generating recommendations of online content are disclosed herein. In some embodiments, a computer system calculates a corresponding relevance score for each attribute of a user based on a total number of online postings for which the user has performed at least one of a plurality of online actions within a particular sliding window of time defining a most recent time period, an attribute activity number representing a number of online postings in the plurality of online postings that have the attribute, and an inverse of a frequency value representing how many of a total number of online postings published within the particular sliding window of time have the attribute. In some embodiments, the computer system causes at least one recommendation associated with the user to be displayed based on the calculated relevance scores.
    Type: Grant
    Filed: May 22, 2019
    Date of Patent: October 5, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Vita G. Markman, Ye Yuan, Varun Mithal, Igor Vladimir Yagolnitser
  • Publication number: 20200410451
    Abstract: Disclosed are systems, methods, and non-transitory computer-readable media for extracting a title hierarchy from trajectory data. A computing system generates a title hierarchy using a graph of connected nodes generated based on career trajectory data. Each distinct node in the graph represents a unique employment title identified in the career trajectory data. Connections established among pairs of nodes in the graph indicate user transitions among the employment titles associated with the nodes, and edge values assigned to the connections indicate the number of users that transitioned from the employment titles associated with the nodes in the pair of nodes. The edge values are used to assign seniority values to each node in the graph, for example, by performing a topological sort of the nodes in the graph. The seniority values are used to establish the title hierarchy.
    Type: Application
    Filed: June 27, 2019
    Publication date: December 31, 2020
    Inventor: Varun Mithal
  • Publication number: 20200409960
    Abstract: Described herein are methods and systems for using weak labels to train a model for use in identifying job listings that are relevant to a user of an online job hosting service. The weak labels correspond with various user actions that a user has undertaken with respect to job listings presented to the user. By way of example, the relevant user actions may include: Job Applies, Job Saves, Job Views, Job Skips and Job Dismisses.
    Type: Application
    Filed: June 27, 2019
    Publication date: December 31, 2020
    Inventors: Varun Mithal, Girish Kathalagiri Somashekariah
  • Publication number: 20200372090
    Abstract: Techniques for using online user activity in determining relevance of attributes to improve computer functionality in generating recommendations of online content are disclosed herein. In some embodiments, a computer system calculates a corresponding relevance score for each attribute of a user based on a total number of online postings for which the user has performed at least one of a plurality of online actions within a particular sliding window of time defining a most recent time period, an attribute activity number representing a number of online postings in the plurality of online postings that have the attribute, and an inverse of a frequency value representing how many of a total number of online postings published within the particular sliding window of time have the attribute. In some embodiments, the computer system causes at least one recommendation associated with the user to be displayed based on the calculated relevance scores.
    Type: Application
    Filed: May 22, 2019
    Publication date: November 26, 2020
    Inventors: Vita G. Markman, Ye Yuan, Varun Mithal, Igor Vladimir Yagolnitser
  • Publication number: 20200311162
    Abstract: The disclosed embodiments provide a system for selecting recommendations based on title transition embeddings. During operation, the system obtains a word embedding model of a set of job histories. Next, the system calculates similarities between pairs of the embeddings produced by the word embedding model from attributes associated with titles in the set of job histories. The system then identifies, based on the similarities, job titles with high similarity to a current title of the candidate. Finally, the system outputs the job titles for use in selecting job recommendations for the candidate.
    Type: Application
    Filed: March 28, 2019
    Publication date: October 1, 2020
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Junrui Xu, Meng Meng, Girish Kathalagiri Somashekariah, Huichao Xue, Varun Mithal, Ada Cheuk Ying Yu
  • Publication number: 20200311157
    Abstract: In some embodiments, a computer system determines that online postings belong to a cohort based on the postings having an attribute of the cohort, identifies skills from the postings, determines that a user belongs to the cohort based on a determination that a profile of the user includes the attribute of the cohort, determines that one or more of the skills is stored in association with the profile, determines a user confidence score that indicates a relevance level of the skill to the user for each one of the one or more of the skills, determines a cohort confidence score for each one of the one or more of the skills based on how many of the postings include the skill, and displays a recommendation based on a combination of the user confidence score and the cohort confidence score for at least a portion of the skills.
    Type: Application
    Filed: March 28, 2019
    Publication date: October 1, 2020
    Inventors: Ye Yuan, Girish Kathalagiri Somashekariah, Huichao Xue, Varun Mithal, Ada Cheuk Ying Yu, Junrui Xu
  • Publication number: 20200311568
    Abstract: In some embodiments, a computer system selects a first subset of candidate content items based on their filter scores that are generated based on a partial generalized linear mixed model comprising a baseline model and a user-based model, with the baseline model being a generalized linear model, and the user-based model being a random effects model based on user actions by the target user directed towards reference content items related to the candidate content items. In some embodiments, the computer system then selects a second subset from the first subset based on recommendation scores that are generated based on a full generalized linear mixed model comprising the baseline model, the user-based model, and an item-based model, with the item-based model being a random effects model based on user actions directed towards the candidate online content item by reference users related to the target user.
    Type: Application
    Filed: March 26, 2019
    Publication date: October 1, 2020
    Inventors: Huichao Xue, Girish Kathalagiri Somashekariah, Ye Yuan, Varun Mithal, Junrui Xu, Ada Cheuk Ying Yu
  • Patent number: 10776713
    Abstract: A method for identifying highly-skewed classes using an imperfect annotation of every instance together with a set of features for all instances. The imperfect annotations designate a plurality of instances as belonging to the target rare class and others to the majority class. First, a classifier is trained on the set of features using the imperfect annotation as supervision, to designate each instance to either the rare class or majority class. A combination of the predictions from the trained classifier and the imperfect annotations is then used to classify each instance to either the rare class or majority class. In particular, an instance is classified to the rare class only when both the trained classifier and the imperfect annotation classify the instance to the rare class. Finally, for each instance assigned as a rare class instance by the combination stage, all instances in its neighborhood are re-classified as either rare class or majority class.
    Type: Grant
    Filed: April 25, 2016
    Date of Patent: September 15, 2020
    Assignee: Regents of the University of Minnesota
    Inventors: Vipin Kumar, Varun Mithal, Guruprasad Nayak, Ankush Khandelwal
  • Publication number: 20180315132
    Abstract: Among other things, embodiments of the present disclosure discussed herein help to identify peers of various individuals and organizations who are members of an online social network. Groups of peers may be identified based on various criteria, and some embodiments may generate a probability score reflecting a confidence level that two or more members of the online social network are peers of one another.
    Type: Application
    Filed: April 28, 2017
    Publication date: November 1, 2018
    Inventors: Aibo Tian, Varun Mithal, Suman Sundaresh, Cissy Chen, Bowen Meng, Lanxiao Xu
  • Publication number: 20180130193
    Abstract: A method improves automated water body extent determinations using satellite sensor values and includes a processor receiving a time-sequence of land cover labels for a plurality of geographic areas represented by pixels in the satellite sensor values. The processor alternates between ordering the geographic areas based on water level estimates at each time point in the time sequence, such that the order of the geographic areas reflects an estimate of the relative elevations of the geographic areas, and updating the water level estimates based on the land cover labels for the geographic areas. A final ordering of the geographic areas and a final water level estimate are used to correct the time-sequence of land cover labels.
    Type: Application
    Filed: November 8, 2017
    Publication date: May 10, 2018
    Inventors: Varun Mithal, Ankush Khandelwal, Vipin Kumar
  • Publication number: 20160314411
    Abstract: A method for identifying highly-skewed classes using an imperfect annotation of every instance together with a set of features for all instances. The imperfect annotations designate a plurality of instances as belonging to the target rare class and others to the majority class. First, a classifier is trained on the set of features using the imperfect annotation as supervision, to designate each instance to either the rare class or majority class. A combination of the predictions from the trained classifier and the imperfect annotations is then used to classify each instance to either the rare class or majority class. In particular, an instance is classified to the rare class only when both the trained classifier and the imperfect annotation classify the instance to the rare class. Finally, for each instance assigned as a rare class instance by the combination stage, all instances in its neighborhood are re-classified as either rare class or majority class.
    Type: Application
    Filed: April 25, 2016
    Publication date: October 27, 2016
    Inventors: Vipin Kumar, Varun Mithal, Guruprasad Nayak, Ankush Khandelwal
  • Patent number: 9478038
    Abstract: A method reduces processing time required to identify locations burned by fire by receiving a feature value for each pixel in an image, each pixel representing a sub-area of a location. Pixels are then grouped based on similarities of the feature values to form candidate burn events. For each candidate burn event, a probability that the candidate burn event is a true burn event is determined based on at least one further feature value for each pixel in the candidate burn event. Candidate burn events that have a probability below a threshold are removed from further consideration as burn events to produce a set of remaining candidate burn events.
    Type: Grant
    Filed: March 30, 2015
    Date of Patent: October 25, 2016
    Assignee: Regents of the University of Minnesota
    Inventors: Shyam Boriah, Vipin Kumar, Varun Mithal, Ankush Khandelwal
  • Publication number: 20150278603
    Abstract: A method reduces processing time required to identify locations burned by fire by receiving a feature value for each pixel in an image, each pixel representing a sub-area of a location. Pixels are then grouped based on similarities of the feature values to form candidate burn events. For each candidate burn event, a probability that the candidate burn event is a true burn event is determined based on at least one further feature value for each pixel in the candidate burn event. Candidate burn events that have a probability below a threshold are removed from further consideration as burn events to produce a set of remaining candidate burn events.
    Type: Application
    Filed: March 30, 2015
    Publication date: October 1, 2015
    Applicant: Regents of the University of Minnesota
    Inventors: Shyam Boriah, Vipin Kumar, Varun Mithal, Ankush Khandelwal
  • Patent number: 8958603
    Abstract: A system has an aerial image database containing sensor data representing a plurality of aerial images of an area having multiple sub-areas. A processor applies a classifier to the sensor values to identify a label for each sub-area in each aerial image and to thereby generate an initial label sequence for each sub-area. The processor identifies a most likely land cover state for each sub-area based on the initial label sequence, a confusion matrix and a transition matrix. For each sub-area, the processor stores the most likely land cover state sequence for the sub-area.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: February 17, 2015
    Assignee: Regents of the University of Minnesota
    Inventors: Shyam Boriah, Ankush Khandelwal, Vipin Kumar, Varun Mithal, Karsten Steinhaeuser
  • Publication number: 20140212055
    Abstract: A system has an aerial image database containing sensor data representing a plurality of aerial images of an area having multiple sub-areas. A processor applies a classifier to the sensor values to identify a label for each sub-area in each aerial image and to thereby generate an initial label sequence for each sub-area. The processor identifies a most likely land cover state for each sub-area based on the initial label sequence, a confusion matrix and a transition matrix. For each sub-area, the processor stores the most likely land cover state sequence for the sub-area.
    Type: Application
    Filed: March 15, 2013
    Publication date: July 31, 2014
    Inventors: Shyam Boriah, Ankush Khandelwal, Vipin Kumar, Varun Mithal, Karsten Steinhaeuser
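
The land cover abstract above (patent 8958603 and application 20140212055) describes finding a most likely land cover state sequence for each sub-area from an initial label sequence, a confusion matrix and a transition matrix. The abstract does not name the decoding algorithm, so the sketch below is only an illustration that assumes standard Viterbi decoding of a hidden Markov model. The function name `most_likely_states`, the uniform prior, and the matrix conventions (`confusion[s][o]` is the probability that the classifier outputs label `o` when the true state is `s`; `transition[s][t]` is the probability of moving from state `s` to state `t` between consecutive images) are all assumptions made for this example and are not taken from the patent.

```python
import math


def most_likely_states(labels, confusion, transition, prior=None):
    """Viterbi decoding of one sub-area's noisy classifier label sequence.

    labels:     classifier outputs over time, as integer state indices
    confusion:  confusion[s][o] = P(classifier says o | true state is s)
    transition: transition[s][t] = P(next state is t | current state is s)
    prior:      P(initial state); uniform if omitted (an assumption)
    """
    n = len(transition)
    if not labels:
        return []
    if prior is None:
        prior = [1.0 / n] * n

    def log(p):
        # Work in log space; zero probabilities become -infinity.
        return math.log(p) if p > 0 else float("-inf")

    # score[s] = best log-probability of any state path ending in state s.
    score = [log(prior[s]) + log(confusion[s][labels[0]]) for s in range(n)]
    back = []  # back[i][t] = best predecessor of state t at step i + 1

    for obs in labels[1:]:
        prev, score, pointers = score, [], []
        for t in range(n):
            best = max(range(n), key=lambda s: prev[s] + log(transition[s][t]))
            pointers.append(best)
            score.append(
                prev[best] + log(transition[best][t]) + log(confusion[t][obs])
            )
        back.append(pointers)

    # Trace the best path backwards from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical two-state example: 0 = land, 1 = water.
confusion = [[0.9, 0.1], [0.1, 0.9]]
transition = [[0.95, 0.05], [0.05, 0.95]]
noisy = [0, 0, 1, 0, 0]  # a single, isolated "water" label
print(most_likely_states(noisy, confusion, transition))
```

With these illustrative matrices, an isolated label that disagrees with its neighbours is treated as classifier noise and smoothed away, while a sustained change of label is kept as a real transition.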