Patents by Inventor Varun Mithal
Varun Mithal has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240070549
Abstract: Systems and methods for extracting rule lists from tree ensembles are provided. A system extracts first stage candidate rules from individual trees. The system identifies the first stage candidate rules that satisfy a precision threshold and places those rules in a solution set. Subsequently, a determination is made whether a further stage is needed based on whether a predetermined number of positive data samples of the data set are covered by the solution set. In the further stage, the system generates next stage candidate rules from previous stage candidate rules that have not been pruned and identifies the next stage candidate rules that satisfy the precision threshold, placing those rules in the solution set. A simplified rule list is generated by identifying a minimum subset of rules in the solution set that covers the positive data samples within the precision threshold.
Type: Application
Filed: August 24, 2022
Publication date: February 29, 2024
Inventors: Gopiram Roshan Lal, Varun Mithal, Xiaotong Chen
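The last two steps of the abstract above (precision filtering, then picking a small covering subset) can be illustrated with a minimal sketch. It is not the patented procedure: the rule representation (a dict with `pos` and `neg` sets of covered sample ids), the function names, and the use of a greedy set-cover heuristic in place of an exact minimum subset are all assumptions made for illustration.

```python
def precision(rule):
    """Fraction of the samples covered by a rule that are positive."""
    covered = len(rule["pos"]) + len(rule["neg"])
    return len(rule["pos"]) / covered if covered else 0.0


def simplify_rules(rules, positives, threshold):
    """Keep rules that meet the precision threshold, then greedily pick
    rules until the positive samples are covered (a set-cover heuristic,
    assumed here; it approximates a minimum subset)."""
    solution = {name: r["pos"] for name, r in rules.items()
                if precision(r) >= threshold}
    uncovered = set(positives)
    chosen = []
    while uncovered and solution:
        # Pick the rule that covers the most still-uncovered positives.
        best = max(solution, key=lambda name: len(solution[name] & uncovered))
        gain = solution[best] & uncovered
        if not gain:
            break  # remaining positives cannot be covered by any rule
        chosen.append(best)
        uncovered -= gain
    return chosen, uncovered


# Hypothetical candidate rules with the sample ids they cover.
rules = {
    "r1": {"pos": {1, 2, 3}, "neg": set()},
    "r2": {"pos": {3, 4}, "neg": {9}},
    "r3": {"pos": {4, 5}, "neg": set()},
    "r4": {"pos": {1}, "neg": {7, 8}},
}
chosen, uncovered = simplify_rules(rules, {1, 2, 3, 4, 5}, threshold=0.6)
```

Any positives left in `uncovered` would correspond to the case where, per the abstract, a further stage of candidate rules is needed.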
-
Patent number: 11769087
Abstract: A machine learning based method for multilabel learning with label relationships is provided. This methodology addresses the technical problem of alleviating computational complexity of training a machine learning model that generates multilabel output with constraints, especially in contexts characterized by a large volume of data, by providing a new formulation that encodes probabilistic relationships among the labels as a regularization parameter in the training objective of the underlying model. For example, the training process of the model may be configured to have two objectives. Namely, in addition to the objective of minimizing conventional multilabel loss, there is another training objective, which is to minimize the penalty associated with the prediction generated by the model breaking probabilistic relationships among the labels.
Type: Grant
Filed: June 4, 2020
Date of Patent: September 26, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Girish Kathalagiri Somashekairah, Varun Mithal, Aman Grover
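The two-part objective described in the abstract above can be sketched as a conventional multilabel loss plus a weighted penalty. The abstract does not give the penalty's form, so everything specific here is an assumption: the relationship type ("label a implies label b"), the hinge-style penalty `max(0, p_a - p_b)`, the weight `lam`, and the function name.

```python
import math


def multilabel_loss_with_penalty(probs, targets, implications, lam=1.0):
    """Binary cross-entropy over all labels plus a penalty for predictions
    that break assumed 'a implies b' relationships between labels.

    probs, targets: lists of rows, one row per sample, one column per label.
    implications: list of (a, b) label-index pairs meaning a implies b.
    """
    eps = 1e-12
    n, k = len(probs), len(probs[0])
    bce = 0.0
    for p_row, t_row in zip(probs, targets):
        for p, t in zip(p_row, t_row):
            p = min(max(p, eps), 1 - eps)  # avoid log(0)
            bce -= t * math.log(p) + (1 - t) * math.log(1 - p)
    bce /= n * k
    penalty = 0.0
    for a, b in implications:
        # If a implies b, then P(a) should not exceed P(b).
        penalty += sum(max(0.0, row[a] - row[b]) for row in probs) / n
    return bce + lam * penalty, bce, penalty


# One sample, two labels, label 0 assumed to imply label 1.
total, bce, penalty = multilabel_loss_with_penalty(
    [[0.9, 0.2]], [[1, 1]], implications=[(0, 1)], lam=1.0)
```

Here the prediction violates the assumed relationship (label 0 is scored higher than the label it implies), so the penalty term is positive and the total loss exceeds the plain cross-entropy.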
-
Patent number: 11397899
Abstract: In some embodiments, a computer system selects a first subset of candidate content items based on their filter scores that are generated based on a partial generalized linear mixed model comprising a baseline model and a user-based model, with the baseline model being a generalized linear model, and the user-based model being a random effects model based on user actions by the target user directed towards reference content items related to the candidate content items. In some embodiments, the computer system then selects a second subset from the first subset based on recommendation scores that are generated based on a full generalized linear mixed model comprising the baseline model, the user-based model, and an item-based model, with the item-based model being a random effects model based on user actions directed towards the candidate online content item by reference users related to the target user.
Type: Grant
Filed: March 26, 2019
Date of Patent: July 26, 2022
Assignee: Microsoft Technology Licensing, LLC
Inventors: Huichao Xue, Girish Kathalagiri Somashekariah, Ye Yuan, Varun Mithal, Junrui Xu, Ada Cheuk Ying Yu
-
Publication number: 20210383306
Abstract: A machine learning based method for multilabel learning with label relationships is provided. This methodology addresses the technical problem of alleviating computational complexity of training a machine learning model that generates multilabel output with constraints, especially in contexts characterized by a large volume of data, by providing a new formulation that encodes probabilistic relationships among the labels as a regularization parameter in the training objective of the underlying model. For example, the training process of the model may be configured to have two objectives. Namely, in addition to the objective of minimizing conventional multilabel loss, there is another training objective, which is to minimize the penalty associated with the prediction generated by the model breaking probabilistic relationships among the labels.
Type: Application
Filed: June 4, 2020
Publication date: December 9, 2021
Inventors: Girish Kathalagiri Somashekairah, Varun Mithal, Aman Grover
-
Patent number: 11138281
Abstract: Techniques for using online user activity in determining relevance of attributes to improve computer functionality in generating recommendations of online content are disclosed herein. In some embodiments, a computer system calculates a corresponding relevance score for each attribute of a user based on a total number of online postings for which the user has performed at least one of a plurality of online actions within a particular sliding window of time defining a most recent time period, an attribute activity number representing a number of online postings in the plurality of online postings that have the attribute, and an inverse of a frequency value representing how many of a total number of online postings published within the particular sliding window of time have the attribute. In some embodiments, the computer system causes at least one recommendation associated with the user to be displayed based on the calculated relevance scores.
Type: Grant
Filed: May 22, 2019
Date of Patent: October 5, 2021
Assignee: Microsoft Technology Licensing, LLC
Inventors: Vita G. Markman, Ye Yuan, Varun Mithal, Igor Vladimir Yagolnitser
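The three quantities named in the abstract above resemble the ingredients of a TF-IDF weighting, and one plausible way to combine them is sketched below. The abstract does not state the formula, so the combination (activity ratio times the logarithm of the inverse frequency), the function name, and the parameter names are assumptions for illustration only.

```python
import math


def attribute_relevance(user_total, user_with_attr,
                        published_total, published_with_attr):
    """TF-IDF-style relevance of one attribute for one user (assumed form).

    user_total: postings the user acted on within the sliding window.
    user_with_attr: how many of those postings have the attribute.
    published_total: all postings published within the window.
    published_with_attr: how many published postings have the attribute.
    """
    if user_total == 0 or published_with_attr == 0:
        return 0.0
    activity_ratio = user_with_attr / user_total
    inverse_frequency = math.log(published_total / published_with_attr)
    return activity_ratio * inverse_frequency


# A rare attribute the user engages with often scores high;
# an attribute present on every posting scores zero.
rare = attribute_relevance(10, 5, 1000, 10)
common = attribute_relevance(10, 5, 1000, 1000)
```

Under this assumed form, an attribute that appears on every posting carries no signal about the user, which is the usual motivation for the inverse-frequency factor.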
-
Publication number: 20200410451
Abstract: Disclosed are systems, methods, and non-transitory computer-readable media for extracting a title hierarchy from trajectory data. A computing system generates a title hierarchy using a graph of connected nodes generated based on career trajectory data. Each distinct node in the graph represents a unique employment title identified in the career trajectory data. Connections established among pairs of nodes in the graph indicate user transitions among the employment titles associated with the nodes, and edge values assigned to the connections indicate the number of users that transitioned between the employment titles associated with the nodes in the pair of nodes. The edge values are used to assign seniority values to each node in the graph, for example, by performing a topological sort of the nodes in the graph. The seniority values are used to establish the title hierarchy.
Type: Application
Filed: June 27, 2019
Publication date: December 31, 2020
Inventor: Varun Mithal
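A minimal sketch of the topological-sort idea in the abstract above follows. The abstract does not say how edge values turn into an ordering, so this sketch assumes that only the dominant direction of each title pair is kept (more users moved a to b than b to a) and that seniority is the longest-path depth found by Kahn's algorithm. The function name and the sample titles are hypothetical.

```python
from collections import defaultdict, deque


def seniority_from_transitions(transitions):
    """transitions maps (from_title, to_title) to a count of users.
    Returns a dict of title -> seniority level (0 = most junior)."""
    successors = defaultdict(set)
    indegree = defaultdict(int)
    titles = set()
    for (a, b), count in transitions.items():
        titles.update((a, b))
        # Assumed rule: keep an edge only in its dominant direction.
        if count > transitions.get((b, a), 0):
            successors[a].add(b)
            indegree[b] += 1
    level = {t: 0 for t in titles}
    queue = deque(sorted(t for t in titles if indegree[t] == 0))
    while queue:  # Kahn's algorithm, tracking longest-path depth
        title = queue.popleft()
        for nxt in sorted(successors[title]):
            level[nxt] = max(level[nxt], level[title] + 1)
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    return level


levels = seniority_from_transitions({
    ("Engineer", "Senior Engineer"): 50,
    ("Senior Engineer", "Engineer"): 5,
    ("Senior Engineer", "Staff Engineer"): 20,
    ("Engineer", "Staff Engineer"): 2,
})
```

If cycles remain after the dominant-direction filter, the titles on a cycle are never dequeued and keep whatever level they had; a real system would need an explicit cycle-breaking rule, which the abstract does not describe.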
-
Publication number: 20200409960
Abstract: Described herein are methods and systems for using weak labels to train a model for use in identifying job listings that are relevant to a user of an online job hosting service. The weak labels correspond with various user actions that a user has undertaken with respect to job listings presented to the user. By way of example, the relevant user actions may include: Job Applies, Job Saves, Job Views, Job Skips and Job Dismisses.
Type: Application
Filed: June 27, 2019
Publication date: December 31, 2020
Inventors: Varun Mithal, Girish Kathalagiri Somashekariah
-
Publication number: 20200372090
Abstract: Techniques for using online user activity in determining relevance of attributes to improve computer functionality in generating recommendations of online content are disclosed herein. In some embodiments, a computer system calculates a corresponding relevance score for each attribute of a user based on a total number of online postings for which the user has performed at least one of a plurality of online actions within a particular sliding window of time defining a most recent time period, an attribute activity number representing a number of online postings in the plurality of online postings that have the attribute, and an inverse of a frequency value representing how many of a total number of online postings published within the particular sliding window of time have the attribute. In some embodiments, the computer system causes at least one recommendation associated with the user to be displayed based on the calculated relevance scores.
Type: Application
Filed: May 22, 2019
Publication date: November 26, 2020
Inventors: Vita G. Markman, Ye Yuan, Varun Mithal, Igor Vladimir Yagolnitser
-
Publication number: 20200311162
Abstract: The disclosed embodiments provide a system for selecting recommendations based on title transition embeddings. During operation, the system obtains a word embedding model of a set of job histories. Next, the system calculates similarities between pairs of the embeddings produced by the word embedding model from attributes associated with titles in the set of job histories. The system then identifies, based on the similarities, job titles with high similarity to a current title of the candidate. Finally, the system outputs the job titles for use in selecting job recommendations for the candidate.
Type: Application
Filed: March 28, 2019
Publication date: October 1, 2020
Applicant: Microsoft Technology Licensing, LLC
Inventors: Junrui Xu, Meng Meng, Girish Kathalagiri Somashekariah, Huichao Xue, Varun Mithal, Ada Cheuk Ying Yu
-
Publication number: 20200311157
Abstract: In some embodiments, a computer system determines that online postings belong to a cohort based on the postings having an attribute of the cohort, identifies skills from the postings, determines that a user belongs to the cohort based on a determination that a profile of the user includes the attribute of the cohort, determines that one or more of the skills is stored in association with the profile, determines a user confidence score that indicates a relevance level of the skill to the user for each one of the one or more of the skills, determines a cohort confidence score for each one of the one or more of the skills based on how many of the postings include the skill, and displays a recommendation based on a combination of the user confidence score and the cohort confidence score for at least a portion of the skills.
Type: Application
Filed: March 28, 2019
Publication date: October 1, 2020
Inventors: Ye Yuan, Girish Kathalagiri Somashekariah, Huichao Xue, Varun Mithal, Ada Cheuk Ying Yu, Junrui Xu
-
Publication number: 20200311568
Abstract: In some embodiments, a computer system selects a first subset of candidate content items based on their filter scores that are generated based on a partial generalized linear mixed model comprising a baseline model and a user-based model, with the baseline model being a generalized linear model, and the user-based model being a random effects model based on user actions by the target user directed towards reference content items related to the candidate content items. In some embodiments, the computer system then selects a second subset from the first subset based on recommendation scores that are generated based on a full generalized linear mixed model comprising the baseline model, the user-based model, and an item-based model, with the item-based model being a random effects model based on user actions directed towards the candidate online content item by reference users related to the target user.
Type: Application
Filed: March 26, 2019
Publication date: October 1, 2020
Inventors: Huichao Xue, Girish Kathalagiri Somashekariah, Ye Yuan, Varun Mithal, Junrui Xu, Ada Cheuk Ying Yu
-
Patent number: 10776713
Abstract: A method is provided for identifying highly-skewed classes using an imperfect annotation of every instance together with a set of features for all instances. The imperfect annotations designate a plurality of instances as belonging to the target rare class and others to the majority class. First, a classifier is trained on the set of features using the imperfect annotation as supervision, to designate each instance to either the rare class or majority class. A combination of the predictions from the trained classifier and the imperfect annotations is then used to classify each instance to either the rare class or majority class. In particular, an instance is classified to the rare class only when both the trained classifier and the imperfect annotation classify the instance to the rare class. Finally, for each instance assigned as a rare class instance by the combination stage, all instances in its neighborhood are re-classified as either rare class or majority class.
Type: Grant
Filed: April 25, 2016
Date of Patent: September 15, 2020
Assignee: Regents of the University of Minnesota
Inventors: Vipin Kumar, Varun Mithal, Guruprasad Nayak, Ankush Khandelwal
-
Publication number: 20180315132
Abstract: Among other things, embodiments of the present disclosure discussed herein help to identify peers of various individuals and organizations who are members of an online social network. Groups of peers may be identified based on various criteria, and some embodiments may generate a probability score reflecting a confidence level that two or more members of the online social network are peers of one another.
Type: Application
Filed: April 28, 2017
Publication date: November 1, 2018
Inventors: Aibo Tian, Varun Mithal, Suman Sundaresh, Cissy Chen, Bowen Meng, Lanxiao Xu
-
Publication number: 20180130193
Abstract: A method improves automated water body extent determinations using satellite sensor values and includes a processor receiving a time-sequence of land cover labels for a plurality of geographic areas represented by pixels in the satellite sensor values. The processor alternates between ordering the geographic areas based on water level estimates at each time point in the time sequence, such that the order of the geographic areas reflects an estimate of the relative elevations of the geographic areas, and updating the water level estimates based on the land cover labels for the geographic areas. A final ordering of the geographic areas and a final water level estimate are used to correct the time-sequence of land cover labels.
Type: Application
Filed: November 8, 2017
Publication date: May 10, 2018
Inventors: Varun Mithal, Ankush Khandelwal, Vipin Kumar
-
Publication number: 20160314411
Abstract: A method is provided for identifying highly-skewed classes using an imperfect annotation of every instance together with a set of features for all instances. The imperfect annotations designate a plurality of instances as belonging to the target rare class and others to the majority class. First, a classifier is trained on the set of features using the imperfect annotation as supervision, to designate each instance to either the rare class or majority class. A combination of the predictions from the trained classifier and the imperfect annotations is then used to classify each instance to either the rare class or majority class. In particular, an instance is classified to the rare class only when both the trained classifier and the imperfect annotation classify the instance to the rare class. Finally, for each instance assigned as a rare class instance by the combination stage, all instances in its neighborhood are re-classified as either rare class or majority class.
Type: Application
Filed: April 25, 2016
Publication date: October 27, 2016
Inventors: Vipin Kumar, Varun Mithal, Guruprasad Nayak, Ankush Khandelwal
-
Patent number: 9478038
Abstract: A method reduces processing time required to identify locations burned by fire by receiving a feature value for each pixel in an image, each pixel representing a sub-area of a location. Pixels are then grouped based on similarities of the feature values to form candidate burn events. For each candidate burn event, a probability that the candidate burn event is a true burn event is determined based on at least one further feature value for each pixel in the candidate burn event. Candidate burn events that have a probability below a threshold are removed from further consideration as burn events to produce a set of remaining candidate burn events.
Type: Grant
Filed: March 30, 2015
Date of Patent: October 25, 2016
Assignee: Regents of the University of Minnesota
Inventors: Shyam Boriah, Vipin Kumar, Varun Mithal, Ankush Khandelwal
-
Publication number: 20150278603
Abstract: A method reduces processing time required to identify locations burned by fire by receiving a feature value for each pixel in an image, each pixel representing a sub-area of a location. Pixels are then grouped based on similarities of the feature values to form candidate burn events. For each candidate burn event, a probability that the candidate burn event is a true burn event is determined based on at least one further feature value for each pixel in the candidate burn event. Candidate burn events that have a probability below a threshold are removed from further consideration as burn events to produce a set of remaining candidate burn events.
Type: Application
Filed: March 30, 2015
Publication date: October 1, 2015
Applicant: Regents of the University of Minnesota
Inventors: Shyam Boriah, Vipin Kumar, Varun Mithal, Ankush Khandelwal
-
Patent number: 8958603
Abstract: A system has an aerial image database containing sensor data representing a plurality of aerial images of an area having multiple sub-areas. A processor applies a classifier to the sensor values to identify a label for each sub-area in each aerial image and to thereby generate an initial label sequence for each sub-area. The processor identifies a most likely land cover state for each sub-area based on the initial label sequence, a confusion matrix and a transition matrix. For each sub-area, the processor stores the most likely land cover state sequence for the sub-area.
Type: Grant
Filed: March 15, 2013
Date of Patent: February 17, 2015
Assignee: Regents of the University of Minnesota
Inventors: Shyam Boriah, Ankush Khandelwal, Vipin Kumar, Varun Mithal, Karsten Steinhaeuser
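Recovering a most likely state sequence from noisy labels, a confusion matrix, and a transition matrix has the shape of hidden Markov model decoding. The sketch below uses the Viterbi algorithm for that purpose; the abstract above does not name the algorithm, so Viterbi, the uniform prior, the function name, and the example matrices are all assumptions for illustration.

```python
import math


def most_likely_states(labels, prior, transition, confusion):
    """Viterbi decoding of one sub-area's label sequence.

    labels: observed classifier label indices over time.
    prior[s]: probability of starting in true state s.
    transition[r][s]: probability of moving from state r to state s.
    confusion[s][l]: probability the classifier emits label l in state s.
    """
    def log(x):
        return math.log(x) if x > 0 else float("-inf")

    n = len(prior)
    score = [log(prior[s]) + log(confusion[s][labels[0]]) for s in range(n)]
    back = []
    for obs in labels[1:]:
        new_score, pointers = [], []
        for s in range(n):
            # Best predecessor state for landing in state s at this step.
            prev = max(range(n), key=lambda r: score[r] + log(transition[r][s]))
            new_score.append(score[prev] + log(transition[prev][s])
                             + log(confusion[s][obs]))
            pointers.append(prev)
        score = new_score
        back.append(pointers)
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):  # trace the best path backwards
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Two hypothetical land cover states with persistent ("sticky") transitions
# and a classifier that is right 80% of the time.
transition = [[0.95, 0.05], [0.05, 0.95]]
confusion = [[0.8, 0.2], [0.2, 0.8]]
smoothed = most_likely_states([0, 0, 1, 0, 0], [0.5, 0.5], transition, confusion)
```

With these example matrices, a single out-of-place label is treated as classifier noise rather than a real change of land cover, because one misclassification is more probable than two state transitions.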
-
Publication number: 20140212055
Abstract: A system has an aerial image database containing sensor data representing a plurality of aerial images of an area having multiple sub-areas. A processor applies a classifier to the sensor values to identify a label for each sub-area in each aerial image and to thereby generate an initial label sequence for each sub-area. The processor identifies a most likely land cover state for each sub-area based on the initial label sequence, a confusion matrix and a transition matrix. For each sub-area, the processor stores the most likely land cover state sequence for the sub-area.
Type: Application
Filed: March 15, 2013
Publication date: July 31, 2014
Inventors: Shyam Boriah, Ankush Khandelwal, Vipin Kumar, Varun Mithal, Karsten Steinhaeuser