Patents by Inventor Nicolae Duta

Nicolae Duta has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11348330
    Abstract: Systems, methods, and computer-executable instructions for extracting key value data. Optical character recognition (OCR) text of a document is received. The y-coordinate of characters are adjusted to a common y-coordinate. The rows of OCR text are tokenized into tokens based on a distance between characters. The tokens are ordered based on the x,y coordinates of the characters. The document is clustered into a cluster based on the ordered tokens and ordered tokens from other documents. Keys for the cluster are determined from the first set of documents. Each key is a token from a first set of documents. A value is assigned to each key based on the tokens for the document, and values are assigned to each key for the other documents. The values for the document and the values for the other documents are stored in an output document.
    Type: Grant
    Filed: June 9, 2020
    Date of Patent: May 31, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventor: Nicolae Duta
  • Patent number: 11030403
    Abstract: Methods and systems are provided for creating a calendar event using context. A natural language expression including at least one of words, terms, and phrases of text may be received at a calendar event creation module from an application. The calendar event creation module may identify one or more slots in the text of the natural language expression related to the calendar event using a first grammar module and a second grammar module. The one or more slots identified by the first grammar module and the second grammar module that indicate a calendar event may be compared to determine whether there is a match between the one or more identified slots. If a match is found, at least one calendar event using the one or more slots identified by the first grammar module and the second grammar module may be created.
    Type: Grant
    Filed: May 11, 2016
    Date of Patent: June 8, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Timothy J. Hazen, Diamond Bishop, Nicolae Duta, Mohammad Babaeizadeh, Peter Longo
  • Patent number: 10878195
    Abstract: A “Table Extractor” provides various techniques for automatically delimiting and extracting tables from arbitrary documents. In various implementations, the Table Extractor also generates functional relationships on those tables that are suitable for generating query responses via any of a variety of natural language processing techniques. In other words, the Table Extractor provides techniques for detecting and representing table information in a way suitable for information extraction. These techniques output relational functions on the table in the form of tuples constructed from automatically identified headers and labels and the relationships between those headers and labels and the contents of one or more cells of the table. These tuples are suitable for correlating natural language questions about a specific piece of information in the table with the rows, columns, and/or cells that contain that information.
    Type: Grant
    Filed: May 3, 2018
    Date of Patent: December 29, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventor: Nicolae Duta
  • Publication number: 20200302219
    Abstract: Systems, methods, and computer-executable instructions for extracting key value data. Optical character recognition (OCR) text of a document is received. The y-coordinate of characters are adjusted to a common y-coordinate. The rows of OCR text are tokenized into tokens based on a distance between characters. The tokens are ordered based on the x,y coordinates of the characters. The document is clustered into a cluster based on the ordered tokens and ordered tokens from other documents. Keys for the cluster are determined from the first set of documents. Each key is a token from a first set of documents. A value is assigned to each key based on the tokens for the document, and values are assigned to each key for the other documents. The values for the document and the values for the other documents are stored in an output document.
    Type: Application
    Filed: June 9, 2020
    Publication date: September 24, 2020
    Inventor: Nicolae Duta
  • Patent number: 10713524
    Abstract: Systems, methods, and computer-executable instructions for extracting key value data. Optical character recognition (OCR) text of a document is received. The y-coordinate of characters are adjusted to a common y-coordinate. The rows of OCR text are tokenized into tokens based on a distance between characters. The tokens are ordered based on the x,y coordinates of the characters. The document is clustered into a cluster based on the ordered tokens and ordered tokens from other documents. Keys for the cluster are determined from the first set of documents. Each key is a token from a first set of documents. A value is assigned to each key based on the tokens for the document, and values are assigned to each key for the other documents. The values for the document and the values for the other documents are stored in an output document.
    Type: Grant
    Filed: October 10, 2018
    Date of Patent: July 14, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventor: Nicolae Duta
  • Publication number: 20200117944
    Abstract: Systems, methods, and computer-executable instructions for extracting key value data. Optical character recognition (OCR) text of a document is received. The y-coordinate of characters are adjusted to a common y-coordinate. The rows of OCR text are tokenized into tokens based on a distance between characters. The tokens are ordered based on the x,y coordinates of the characters. The document is clustered into a cluster based on the ordered tokens and ordered tokens from other documents. Keys for the cluster are determined from the first set of documents. Each key is a token from a first set of documents. A value is assigned to each key based on the tokens for the document, and values are assigned to each key for the other documents. The values for the document and the values for the other documents are stored in an output document.
    Type: Application
    Filed: October 10, 2018
    Publication date: April 16, 2020
    Inventor: Nicolae Duta
  • Publication number: 20190340240
    Abstract: A “Table Extractor” provides various techniques for automatically delimiting and extracting tables from arbitrary documents. In various implementations, the Table Extractor also generates functional relationships on those tables that are suitable for generating query responses via any of a variety of natural language processing techniques. In other words, the Table Extractor provides techniques for detecting and representing table information in a way suitable for information extraction. These techniques output relational functions on the table in the form of tuples constructed from automatically identified headers and labels and the relationships between those headers and labels and the contents of one or more cells of the table. These tuples are suitable for correlating natural language questions about a specific piece of information in the table with the rows, columns, and/or cells that contain that information.
    Type: Application
    Filed: May 3, 2018
    Publication date: November 7, 2019
    Applicant: Microsoft Technology Licensing, LLC
    Inventor: Nicolae Duta
  • Patent number: 10282419
    Abstract: An arrangement and corresponding method are described for multi-domain natural language processing. Multiple parallel domain pipelines are used for processing a natural language input. Each domain pipeline represents a different specific subject domain of related concepts. Each domain pipeline includes a mention module that processes the natural language input using natural language understanding (NLU) to determine a corresponding list of mentions, and an interpretation generator that receives the list of mentions and produces a rank-ordered domain output set of sentence-level interpretation candidates. A global evidence ranker receives the domain output sets from the domain pipelines and produces an overall rank-ordered final output set of sentence-level interpretations.
    Type: Grant
    Filed: December 12, 2012
    Date of Patent: May 7, 2019
    Assignee: Nuance Communications, Inc.
    Inventors: Matthieu Hebert, Jean-Philippe Robichaud, Christopher M. Parisien, Nicolae Duta, Jerome Tremblay, Amjad Almahairi, Lakshmish Kaushik, Maryse Boisvert
  • Patent number: 9620110
    Abstract: An automated method is described for developing an automated speech input semantic classification system such as a call routing system. A set of semantic classifications is defined for classification of input speech utterances, where each semantic classification represents a specific semantic classification of the speech input. The semantic classification system is trained from training data substantially without manually transcribed in-domain training data, and then operated to assign input speech utterances to the defined semantic classifications. Adaptation training data based on input speech utterances is collected with manually assigned semantic labels from at least one source of already collected language data. When the adaptation training data satisfies a pre-determined adaptation criteria, the semantic classification system is automatically retrained based on the adaptation training data.
    Type: Grant
    Filed: April 28, 2014
    Date of Patent: April 11, 2017
    Assignee: Nuance Communications, Inc.
    Inventors: Nicolae Duta, Réal Tremblay, Andrew D. Mauro, S. Douglas Peters
  • Publication number: 20160253310
    Abstract: Methods and systems are provided for creating a calendar event using context. A natural language expression including at least one of words, terms, and phrases of text may be received at a calendar event creation module from an application. The calendar event creation module may identify one or more slots in the text of the natural language expression related to the calendar event using a first grammar module and a second grammar module. The one or more slots identified by the first grammar module and the second grammar module that indicate a calendar event may be compared to determine whether there is a match between the one or more identified slots. If a match is found, at least one calendar event using the one or more slots identified by the first grammar module and the second grammar module may be created.
    Type: Application
    Filed: May 11, 2016
    Publication date: September 1, 2016
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Timothy J. Hazen, Diamond Bishop, Nicolae Duta, Mohammad Babaeizadeh, Peter Longo
  • Patent number: 9372851
    Abstract: Methods and systems are provided for creating a calendar event using context. A natural language expression including at least one of words, terms, and phrases of text may be received at a calendar event creation module from an application. The calendar event creation module may identify one or more slots in the text of the natural language expression related to the calendar event using a first grammar module and a second grammar module. The one or more slots identified by the first grammar module and the second grammar module that indicate a calendar event may be compared to determine whether there is a match between the one or more identified slots. If a match is found, at least one calendar event using the one or more slots identified by the first grammar module and the second grammar module may be created.
    Type: Grant
    Filed: April 1, 2014
    Date of Patent: June 21, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Timothy J. Hazen, Diamond Bishop, Nicolae Duta, Mohammad Babaeizadeh, Peter Longo
  • Publication number: 20160140957
    Abstract: An automated method is described for developing an automated speech input semantic classification system such as a call routing system. A set of semantic classifications is defined for classification of input speech utterances, where each semantic classification represents a specific semantic classification of the speech input. The semantic classification system is trained from training data substantially without manually transcribed in-domain training data, and then operated to assign input speech utterances to the defined semantic classifications. Adaptation training data based on input speech utterances is collected with manually assigned semantic labels from at least one source of already collected language data. When the adaptation training data satisfies a pre-determined adaptation criteria, the semantic classification system is automatically retrained based on the adaptation training data.
    Type: Application
    Filed: April 28, 2014
    Publication date: May 19, 2016
    Applicant: Nuance Communications, Inc.
    Inventors: Nicolae Duta, Réal Tremblay, Andrew D. Mauro, S. Douglas Peters
  • Publication number: 20150278199
    Abstract: Methods and systems are provided for creating a calendar event using context. A natural language expression including at least one of words, terms, and phrases of text may be received at a calendar event creation module from an application. The calendar event creation module may identify one or more slots in the text of the natural language expression related to the calendar event using a first grammar module and a second grammar module. The one or more slots identified by the first grammar module and the second grammar module that indicate a calendar event may be compared to determine whether there is a match between the one or more identified slots. If a match is found, at least one calendar event using the one or more slots identified by the first grammar module and the second grammar module may be created.
    Type: Application
    Filed: April 1, 2014
    Publication date: October 1, 2015
    Applicant: Microsoft Corporation
    Inventors: Timothy J. Hazen, Diamond Bishop, Nicolae Duta, Mohammad Babaeizadeh, Peter Longo
  • Patent number: 8781833
    Abstract: An automated method is described for developing an automated speech input semantic classification system such as a call routing system. A set of semantic classifications is defined for classification of input speech utterances, where each semantic classification represents a specific semantic classification of the speech input. The semantic classification system is trained from training data having little or no in-domain manually transcribed training data, and then operated to assign input speech utterances to the defined semantic classifications. Adaptation training data based on input speech utterances is collected with manually assigned semantic labels. When the adaptation training data satisfies a pre-determined adaptation criteria, the semantic classification system is automatically retrained based on the adaptation training data.
    Type: Grant
    Filed: July 15, 2009
    Date of Patent: July 15, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Nicolae Duta, Réal Tremblay, Andy Mauro, Douglas Peters
  • Publication number: 20140163959
    Abstract: An arrangement and corresponding method are described for multi-domain natural language processing. Multiple parallel domain pipelines are used for processing a natural language input. Each domain pipeline represents a different specific subject domain of related concepts. Each domain pipeline includes a mention module that processes the natural language input using natural language understanding (NLU) to determine a corresponding list of mentions, and an interpretation generator that receives the list of mentions and produces a rank-ordered domain output set of sentence-level interpretation candidates. A global evidence ranker receives the domain output sets from the domain pipelines and produces an overall rank-ordered final output set of sentence-level interpretations.
    Type: Application
    Filed: December 12, 2012
    Publication date: June 12, 2014
    Applicant: Nuance Communications, Inc.
    Inventors: Matthieu Hebert, Jean-Philippe Robichaud, Christopher M. Parisien, Nicolae Duta, Jerome Tremblay, Amjad Almahairi, Lakshmish Kaushik, Maryse Boisvert
  • Patent number: 8515736
    Abstract: Techniques disclosed herein include systems and methods for reusing semantically-labeled data collected for previous or existing call routing applications. Such reuse of semantically-labeled utterances can be used for automating and accelerating application design as well as data transcription and labeling for new and future call routing applications. Such techniques include using a semantic database containing transcriptions and semantic labels for several call routing applications along with corresponding baseline routers trained for those applications. This semantic database can be used to derive a semantic similarity measure between any pair of utterances, such as transcribed sentences. A mathematical model predicts how semantically related two utterances are, such as by identifying a same user intent to identifying completely unrelated intents.
    Type: Grant
    Filed: September 30, 2010
    Date of Patent: August 20, 2013
    Assignee: Nuance Communications, Inc.
    Inventor: Nicolae Duta
  • Publication number: 20130018864
    Abstract: Some embodiments relate to techniques for receiving a query comprising content; in response to the query being received, determining that the content may have at least a first semantic meaning or a second semantic meaning that is different than the first semantic meaning; and identifying a plurality of search engines to which to submit a representation of the query, the plurality of search engines comprising a first search engine identified based on the first semantic meaning and a second search engine identified based on the second semantic meaning.
    Type: Application
    Filed: July 14, 2011
    Publication date: January 17, 2013
    Applicant: Nuance Communications, Inc.
    Inventors: Marc W. Regan, Vladimir Sejnoha, Matthieu Hebert, Nicolae Duta, Nir Halperin, Carmit Brikman, Michael Leong
  • Publication number: 20100023331
    Abstract: An automated method is described for developing an automated speech input semantic classification system such as a call routing system. A set of semantic classifications is defined for classification of input speech utterances, where each semantic classification represents a specific semantic classification of the speech input. The semantic classification system is trained from training data having little or no in-domain manually transcribed training data, and then operated to assign input speech utterances to the defined semantic classifications. Adaptation training data based on input speech utterances is collected with manually assigned semantic labels. When the adaptation training data satisfies a pre-determined adaptation criteria, the semantic classification system is automatically retrained based on the adaptation training data.
    Type: Application
    Filed: July 15, 2009
    Publication date: January 28, 2010
    Applicant: Nuance Communications, Inc.
    Inventors: Nicolae Duta, Réal Tremblay, Andy Mauro, Douglas Peters
  • Patent number: 7400757
    Abstract: A method is provided for segmenting an image of interest of a left ventricle. The method includes determining a myocardium contour according to a graph cut of candidate endocardium contours, and a spline fitting to candidate epicardium contours in the absence of shape propagation. The method further includes applying a plurality of shape constraints to candidate endocardium contours and candidate epicardium contours to determine the myocardium contour, wherein a template is determined by shape propagation of a plurality of images in a sequence including the image of interest in the presence of shape propagation.
    Type: Grant
    Filed: February 2, 2005
    Date of Patent: July 15, 2008
    Assignee: Siemens Medical Solutions USA, Inc.
    Inventors: Marie-Pierre Jolly, Ying Sun, Nicolae Duta
  • Publication number: 20030035573
    Abstract: An automated method for detection of an object of interest in magnetic resonance (MR) two-dimensional (2-D) images wherein the images comprise gray level patterns, the method includes a learning stage utilizing a set of positive/negative training samples drawn from a specified feature space.
    Type: Application
    Filed: December 20, 2000
    Publication date: February 20, 2003
    Inventors: Nicolae Duta, Marie-Pierre Jolly
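
The key value extraction abstracts listed above (for example, patent number 10713524) describe adjusting character y-coordinates to a common value, tokenizing rows by the distance between characters, and ordering tokens by their x,y coordinates. The following is only an illustrative sketch of those three steps, not the patented method. The Char class, the function names, and the y_tol and max_gap thresholds are all hypothetical and do not come from the patents.

```python
from dataclasses import dataclass


@dataclass
class Char:
    """One OCR character (hypothetical layout: left edge x, baseline y)."""
    text: str
    x: float
    y: float
    width: float = 1.0


def snap_rows(chars, y_tol=2.0):
    """Group characters into rows and give each row a common y-coordinate.

    Assumption: a character joins the current row when its y is within
    y_tol of the y of the first character placed in that row.
    """
    rows = []  # list of (row_y, [Char, ...]) in top-to-bottom order
    for ch in sorted(chars, key=lambda c: c.y):
        if rows and abs(ch.y - rows[-1][0]) <= y_tol:
            rows[-1][1].append(ch)
        else:
            rows.append((ch.y, [ch]))
    return [[Char(c.text, c.x, row_y, c.width) for c in members]
            for row_y, members in rows]


def tokenize_row(row, max_gap=1.5):
    """Split one row into tokens wherever the horizontal gap exceeds max_gap."""
    groups, current, prev_right = [], [], None
    for ch in sorted(row, key=lambda c: c.x):
        if prev_right is not None and ch.x - prev_right > max_gap:
            groups.append(current)
            current = []
        current.append(ch)
        prev_right = ch.x + ch.width
    if current:
        groups.append(current)
    # Each token is (text, x of its first character, row y).
    return [("".join(c.text for c in g), g[0].x, g[0].y) for g in groups]


def ordered_tokens(chars, y_tol=2.0, max_gap=1.5):
    """Return tokens ordered top to bottom, then left to right."""
    tokens = []
    for row in snap_rows(chars, y_tol):
        tokens.extend(tokenize_row(row, max_gap))
    return tokens
```

Clustering documents by their ordered tokens and assigning values to keys, the later steps in the abstract, are not sketched here.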