Patents by Inventor Girish Kumar

Girish Kumar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12517942
    Abstract: Improved solutions for dataset clustering and evaluation are disclosed. Examples cluster a set of documents into a set of clusters using a language model, in an iterative process. In second and later clustering tasks, the current cluster titles and descriptions are provided in the language model prompt, to avoid near-duplications. Upon determining that the set of clusters is sufficiently complete and representative of the set of documents, the tasking switches to classification of the set of documents into the set of clusters using a language model. Classification continues until a sufficient percentage of the set of documents is classified. Some examples use batching, to avoid overloading the language model(s). In some examples, different language models are used for clustering and classification. Some examples use intruder detection to determine the quality of the clustering.
    Type: Grant
    Filed: March 25, 2024
    Date of Patent: January 6, 2026
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Seyedeh Hoda Shajari, Julia S. McAnallen, David B. Levitan, Girish Kumar, Jiantao Pan
  • Publication number: 20250298834
    Abstract: Improved solutions for dataset clustering and evaluation are disclosed. Examples cluster a set of documents into a set of clusters using a language model, in an iterative process. In second and later clustering tasks, the current cluster titles and descriptions are provided in the language model prompt, to avoid near-duplications. Upon determining that the set of clusters is sufficiently complete and representative of the set of documents, the tasking switches to classification of the set of documents into the set of clusters using a language model. Classification continues until a sufficient percentage of the set of documents is classified. Some examples use batching, to avoid overloading the language model(s). In some examples, different language models are used for clustering and classification. Some examples use intruder detection to determine the quality of the clustering.
    Type: Application
    Filed: March 25, 2024
    Publication date: September 25, 2025
    Inventors: Seyedeh Hoda SHAJARI, Julia S. MCANALLEN, David B. LEVITAN, Girish KUMAR, Jiantao PAN
  • Patent number: 12401742
    Abstract: According to an example aspect of the present invention, there is provided an apparatus comprising at least one processor; and at least one memory including computer program code; the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus at least to function as a point of interception in an application server or border control function of a communication network, receive an incoming protocol message requesting initiation of a call, transmit an outgoing protocol message to advance initiation of the call, and receive a cryptographic token comprising a cryptographically signed identity of a caller initiating the call, and transmit a lawful interception message comprising information on the call to a lawful interception party as a response to at least one trigger being fulfilled.
    Type: Grant
    Filed: January 5, 2023
    Date of Patent: August 26, 2025
    Assignee: Nokia Technologies Oy
    Inventors: Nagaraja Rao, Girish Kumar
  • Patent number: 12353580
    Abstract: Systems and methods are directed to building annotated models based on eyes-off data. Specifically, a synthetic data generation model is trained and used to further train a target model. The synthetic data generation model is trained within an eyes-off environment using an anonymity technique on confidential data. The synthetic data generation model is then used to create synthetic data that closely represents the confidential data but without any specific details that can be linked back to the confidential data. The synthetic data is then annotated and used to train the target model within an eyes-on environment. Subsequently, the target model is deployed back within the eyes-off environment to classify the confidential data.
    Type: Grant
    Filed: October 24, 2022
    Date of Patent: July 8, 2025
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: David Benjamin Levitan, Robert Alexander Sim, Julia S. McAnallen, Huseyin Atahan Inan, Girish Kumar, Xiang Yue
  • Patent number: 12105837
    Abstract: A method and system for generating synthetic privacy-preserving training data for training a language classifier machine-learning (ML) model includes receiving a request to generate the synthetic privacy-preserving training data for the language classifier ML model, retrieving labeled training data associated with training the language classifier ML model, providing the labeled training data, one or more privacy parameters, and a domain type associated with the labeled training data to a synthetic data generation ML model, the synthetic data generation ML model being configured to generate synthetic training data in a privacy-preserving manner, receiving synthetic privacy-preserving training data as an output from the synthetic data generation ML model, and providing the synthetic privacy-preserving training data to the language classifier ML model for training the language classifier ML model in classifying text.
    Type: Grant
    Filed: November 2, 2021
    Date of Patent: October 1, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Christopher Lawrence LaTerza, Girish Kumar, David Benjamin Levitan
  • Publication number: 20240232405
    Abstract: Systems and methods are directed to building annotated models based on eyes-off data. Specifically, a synthetic data generation model is trained and used to further train a target model. The synthetic data generation model is trained within an eyes-off environment using an anonymity technique on confidential data. The synthetic data generation model is then used to create synthetic data that closely represents the confidential data but without any specific details that can be linked back to the confidential data. The synthetic data is then annotated and used to train the target model within an eyes-on environment. Subsequently, the target model is deployed back within the eyes-off environment to classify the confidential data.
    Type: Application
    Filed: October 24, 2022
    Publication date: July 11, 2024
    Inventors: David Benjamin LEVITAN, Robert Alexander SIM, Julia S. MCANALLEN, Huseyin Atahan INAN, Girish KUMAR, Xiang YUE
  • Publication number: 20240135015
    Abstract: Systems and methods are directed to building annotated models based on eyes-off data. Specifically, a synthetic data generation model is trained and used to further train a target model. The synthetic data generation model is trained within an eyes-off environment using an anonymity technique on confidential data. The synthetic data generation model is then used to create synthetic data that closely represents the confidential data but without any specific details that can be linked back to the confidential data. The synthetic data is then annotated and used to train the target model within an eyes-on environment. Subsequently, the target model is deployed back within the eyes-off environment to classify the confidential data.
    Type: Application
    Filed: October 23, 2022
    Publication date: April 25, 2024
    Inventors: David Benjamin LEVITAN, Robert Alexander SIM, Julia S. MCANALLEN, Huseyin Atahan INAN, Girish KUMAR, Xiang YUE
  • Publication number: 20230396706
    Abstract: According to an example aspect of the present invention, there is provided an apparatus comprising at least one processor; and at least one memory including computer program code; the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus at least to function as a point of interception in an application server or border control function of a communication network, receive an incoming protocol message requesting initiation of a call, transmit an outgoing protocol message to advance initiation of the call, and receive a cryptographic token comprising a cryptographically signed identity of a caller initiating the call, and transmit a lawful interception message comprising information on the call to a lawful interception party as a response to at least one trigger being fulfilled.
    Type: Application
    Filed: January 5, 2023
    Publication date: December 7, 2023
    Inventors: Nagaraja RAO, Girish KUMAR
  • Publication number: 20230137378
    Abstract: A method and system for generating synthetic privacy-preserving training data for training a language classifier machine-learning (ML) model includes receiving a request to generate the synthetic privacy-preserving training data for the language classifier ML model, retrieving labeled training data associated with training the language classifier ML model, providing the labeled training data, one or more privacy parameters, and a domain type associated with the labeled training data to a synthetic data generation ML model, the synthetic data generation ML model being configured to generate synthetic training data in a privacy-preserving manner, receiving synthetic privacy-preserving training data as an output from the synthetic data generation ML model, and providing the synthetic privacy-preserving training data to the language classifier ML model for training the language classifier ML model in classifying text.
    Type: Application
    Filed: November 2, 2021
    Publication date: May 4, 2023
    Inventors: Christopher Lawrence LaTERZA, Girish KUMAR, David Benjamin LEVITAN
  • Patent number: 11270073
    Abstract: Disclosed is a method and a system for extracting entity information from target data. The method comprises: providing the target data; refining the target data to obtain at least one base entity information having a plurality of base entity units using an algorithm, wherein the algorithm is based on a predefined syntax; generating a plurality of strings for each of the base entity information, wherein the plurality of strings comprises at least one base entity unit among the plurality of base entity units; sorting the plurality of strings in a decreasing order of length of the plurality of strings; identifying an entity type of the plurality of strings, based on an ontology, by processing the plurality of strings sequentially; assigning labels to the plurality of strings based on the entity type; and mapping the labelled plurality of strings to a predefined signature to obtain the entity information.
    Type: Grant
    Filed: December 27, 2018
    Date of Patent: March 8, 2022
    Assignee: Innoplexus AG
    Inventors: Gaurav Tripathi, Vatsal Agarwal, Prashant Patil, Girish Kumar, Tapashi Mandal, Sudhanshu Shekhar
  • Patent number: 10534824
    Abstract: In one embodiment, a method includes receiving a search query input comprising one or more n-grams; parsing the search query input to identify keywords; generating query commands for the keywords. Each query command may specify: a particular object-type; one or more identifiers of one or more objects that match the search query input; and one or more types of relationships with respect to the objects. The method may further include searching a particular vertical that stores objects of the particular object-type having a relationship of the type of relationship with respect to one or more of the objects; generating a plurality of search-result modules corresponding to the query commands, each search-result module comprising references to objects of the particular object-type specified by the query command; and sending, to a client device, instructions for presenting an interface comprising one or more of the search-result modules.
    Type: Grant
    Filed: March 10, 2017
    Date of Patent: January 14, 2020
    Assignee: Facebook, Inc.
    Inventors: Girish Kumar, Yuval Kesten, Xiao Li, Fabio Lopiano
  • Publication number: 20190205378
    Abstract: Disclosed is a method and a system for extracting entity information from target data. The method comprises: providing the target data; refining the target data to obtain at least one base entity information having a plurality of base entity units using an algorithm, wherein the algorithm is based on a predefined syntax; generating a plurality of strings for each of the base entity information, wherein the plurality of strings comprises at least one base entity unit among the plurality of base entity units; sorting the plurality of strings in a decreasing order of length of the plurality of strings; identifying an entity type of the plurality of strings, based on an ontology, by processing the plurality of strings sequentially; assigning labels to the plurality of strings based on the entity type; and mapping the labelled plurality of strings to a predefined signature to obtain the entity information.
    Type: Application
    Filed: December 27, 2018
    Publication date: July 4, 2019
    Inventors: Gaurav Tripathi, Vatsal Agarwal, Prashant Patil, Girish Kumar, Tapashi Mandal, Sudhanshu Shekhar
  • Patent number: 10310945
    Abstract: Exemplary embodiments relate to techniques that allow for file system support to be rapidly deployed for new or updated operating system distributions. In some embodiments, a management component is provided to perform data management on file systems. When a data management operation on a file system is requested, an operation component searches in a predetermined location for a named module that implements certain types of operations. The operation component then calls these operations (including validate, build and deport operations for the file system) to implement data management procedures in the file system. Implementing support for a new operating system or file system does not require that the management entity be rebuilt. Upon release of a new operating system or file system, a new named module can be written and placed in the predetermined location where the operation module is configured to search.
    Type: Grant
    Filed: April 28, 2016
    Date of Patent: June 4, 2019
    Assignee: NETAPP, INC.
    Inventors: Vasantha Prabhu, Nikhil Kaplingat, Girish Kumar
  • Publication number: 20170315872
    Abstract: Exemplary embodiments relate to techniques that allow for file system support to be rapidly deployed for new or updated operating system distributions. In some embodiments, a management component is provided to perform data management on file systems. When a data management operation on a file system is requested, an operation component searches in a predetermined location for a named module that implements certain types of operations. The operation component then calls these operations (including validate, build and deport operations for the file system) to implement data management procedures in the file system. Implementing support for a new operating system or file system does not require that the management entity be rebuilt. Upon release of a new operating system or file system, a new named module can be written and placed in the predetermined location where the operation module is configured to search.
    Type: Application
    Filed: April 28, 2016
    Publication date: November 2, 2017
    Inventors: Vasantha Prabhu, Nikhil Kaplingat, Girish Kumar
  • Publication number: 20170185689
    Abstract: In one embodiment, a method includes receiving a search query input comprising one or more n-grams; parsing the search query input to identify keywords; generating query commands for the keywords. Each query command may specify: a particular object-type; one or more identifiers of one or more objects that match the search query input; and one or more types of relationships with respect to the objects. The method may further include searching a particular vertical that stores objects of the particular object-type having a relationship of the type of relationship with respect to one or more of the objects; generating a plurality of search-result modules corresponding to the query commands, each search-result module comprising references to objects of the particular object-type specified by the query command; and sending, to a client device, instructions for presenting an interface comprising one or more of the search-result modules.
    Type: Application
    Filed: March 10, 2017
    Publication date: June 29, 2017
    Inventors: Girish Kumar, Yuval Kesten, Xiao Li, Fabio Lopiano
  • Patent number: 9646055
    Abstract: In one embodiment, a method includes receiving from a first user of an online social network a search query input including one or more n-grams; generating a number of query commands based on the search query input; and searching one or more verticals to identify one or more objects stored by the vertical that match the query commands. Each vertical stores one or more objects associated with the online social network. The method also includes generating a number of search-result modules. Each search-result module corresponds to a query command of the number of query commands. Each search-result module includes references to one or more of the identified objects matching the query command corresponding to the search-result module. The method also includes scoring the search-result modules; and sending each search-result module having a score greater than a threshold score to the first user for display.
    Type: Grant
    Filed: April 3, 2014
    Date of Patent: May 9, 2017
    Assignee: Facebook, Inc.
    Inventors: Girish Kumar, Yuval Kesten, Xiao Li, Fabio Lopiano
  • Patent number: 9251185
    Abstract: Computer-readable media, computer systems, and computing methods are provided for classifying search results as either of good quality or of poor quality. Initially, a portion of the search results, such as the highest ranked documents, are selected for evaluation. A level of quality for each of the selected search results is determined using a classification process that includes the following steps: targeting features demonstrated by the selected search results to be evaluated; evaluating the selected features to generate a level-of-quality score for each of the selected search results; comparing the score against a predefined threshold value; and, based on the comparison, assigning each of the selected search results an absolute measurement. The absolute measurement indicates poor quality when the score is less than the threshold value. Upon recognizing that the selected search results are of poor quality, a corrective action that reformulates the issued search query is automatically executed.
    Type: Grant
    Filed: December 15, 2010
    Date of Patent: February 2, 2016
    Inventors: Girish Kumar, Sanaz Ahari, Farid Hosseini, Nazan Khan, Ahmad Abdulkader, Ankur Gupta, Giridhar Kumaran, Vijay Nair
  • Publication number: 20150286643
    Abstract: In one embodiment, a method includes receiving from a first user of an online social network a search query input including one or more n-grams; generating a number of query commands based on the search query input; and searching one or more verticals to identify one or more objects stored by the vertical that match the query commands. Each vertical stores one or more objects associated with the online social network. The method also includes generating a number of search-result modules. Each search-result module corresponds to a query command of the number of query commands. Each search-result module includes references to one or more of the identified objects matching the query command corresponding to the search-result module. The method also includes scoring the search-result modules; and sending each search-result module having a score greater than a threshold score to the first user for display.
    Type: Application
    Filed: April 3, 2014
    Publication date: October 8, 2015
    Applicant: Facebook, Inc.
    Inventors: Girish Kumar, Yuval Kesten, Xiao Li, Fabio Lopiano
  • Patent number: 8868567
    Abstract: Subject matter described herein is related to determining a document score, which suggests a relevance of a document (e.g., webpage) to a search query. For example, a search query is received that is comprised of one or more terms, which represent a subject. An equivalent subject is identified that is semantically similar to the subject. The document score is determined by accounting for both a subject frequency and an equivalent-subject frequency.
    Type: Grant
    Filed: February 2, 2011
    Date of Patent: October 21, 2014
    Assignee: Microsoft Corporation
    Inventors: Girish Kumar, Alfian Tan, Nicholas Eric Craswell
  • Patent number: 8612416
    Abstract: Techniques are disclosed for providing a domain-aware snippet for a search result. A uniform resource locator (URL) is identified for a search result obtained in response to a search query, and it is determined that the URL corresponds to a single domain that has a plurality of web pages that are generated using a template that is common to each of the web pages in the domain. The template comprises a hypertext markup language (HTML) layout pattern that includes multiple sections shared by the web pages. A ranking value is assigned to the multiple sections and is used to identify a first section of the template that is relevant to the search query. A snippet is provided to a user for the search result; the snippet includes at least a portion of text from the first section.
    Type: Grant
    Filed: May 1, 2012
    Date of Patent: December 17, 2013
    Assignee: Microsoft Corporation
    Inventors: Girish Kumar, Fang Liu
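
As an illustration of the idea summarized in the abstract of patent 8868567 (a document score that accounts for both a subject frequency and an equivalent-subject frequency), the following is a minimal Python sketch. It is not the patented method: the function names, the whole-phrase matching, the additive combination, and the `equivalent_weight` value of 0.5 are all assumptions made for this example, and the list of equivalent subjects is supplied by hand rather than identified automatically.

```python
import re


def phrase_frequency(text: str, phrase: str) -> int:
    """Count case-insensitive, whole-word occurrences of a phrase in text."""
    pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
    return len(re.findall(pattern, text.lower()))


def document_score(text: str, subject: str, equivalents: list[str],
                   equivalent_weight: float = 0.5) -> float:
    """Combine subject frequency with down-weighted equivalent-subject frequency.

    The additive form and the default weight are hypothetical choices,
    not details taken from the patent.
    """
    subject_freq = phrase_frequency(text, subject)
    equivalent_freq = sum(phrase_frequency(text, e) for e in equivalents)
    return subject_freq + equivalent_weight * equivalent_freq


page = "Cheap flights to NYC. Flights to New York City are cheap. NYC guide."
score = document_score(page, "nyc", ["new york city"])
print(score)
```

In this sketch a page that mentions only the equivalent subject still receives a nonzero score, which is the behavior the abstract points to; how the two frequencies are actually weighted in the patent is not stated in the listing.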