Patents by Inventor Swapna Somasundaran

Swapna Somasundaran has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11861310
    Abstract: A computer-implemented technique for characterizing lexical concreteness in narrative includes receiving data encapsulating narrative text having a plurality of words. Thereafter, the function words can be removed from the narrative text to result in only content words. A concreteness score can then be assigned to each content word by polling a database to identify matching words and to use concreteness scores associated with such matching words as specified by the database. Data can then be provided which characterizes the assigned concreteness scores. Related apparatus, systems, techniques and articles are also described.
    Type: Grant
    Filed: April 24, 2020
    Date of Patent: January 2, 2024
    Assignee: Educational Testing Service
    Inventors: Michael Flor, Swapna Somasundaran
  • Patent number: 11748571
    Abstract: Data is received that encapsulates a document of text. The text is then segmented into a plurality of semantically coherent units using a coherence-aware text segmentation (CATS) machine learning model. Data is then provided that characterizes the segmenting. Related apparatus, systems, techniques and articles are also described.
    Type: Grant
    Filed: May 20, 2020
    Date of Patent: September 5, 2023
    Assignee: Educational Testing Service
    Inventors: Goran Glavaš, Swapna Somasundaran
  • Patent number: 10885274
    Abstract: Systems and methods are provided for processing a response to essay prompts that request a narrative response. A data structure associated with a narrative essay is accessed. The essay is analyzed to generate an organization subscore, where the organization subscore is generated using a graph metric by identifying content words in each sentence of the essay and populating a data structure with links between related content words in neighboring sentences, wherein the organization subscore is determined based on the links. The essay is analyzed to generate a development subscore, where the development subscore is generated using a transition metric by accessing a transition cue data store and identifying transition words in the essay, wherein the development subscore is based on a number of words in the essay that match words in the transition cue data store. A narrative quality metric is determined based on the organization subscore and the development subscore.
    Type: Grant
    Filed: June 21, 2018
    Date of Patent: January 5, 2021
    Assignee: Educational Testing Service
    Inventors: Swapna Somasundaran, Michael Flor, Martin Chodorow, Binod Gyawali, Hillary Molloy, Laura McCulla
  • Patent number: 10380490
    Abstract: Computer-based systems and methods are provided for generating a narrative computer scoring model for assessing story narratives. In one embodiment, supervised machine learning is used to generate the narrative computer scoring model. For example, a collection of training story narratives with assigned scores may be used to train the model. In one embodiment, each training story narrative is processed to extract features that signify content relevance, collocation of commonly used words, coherency, detailing, and expressions of sentiment. These features, as well as others, may be selectively used to train a narrative computer scoring model. Once trained, the model can be used to automatically evaluate story narratives and assign appropriate scores.
    Type: Grant
    Filed: February 26, 2016
    Date of Patent: August 13, 2019
    Assignee: Educational Testing Service
    Inventors: Swapna Somasundaran, Chong Min Lee, Martin Chodorow, Xinhao Wang
  • Patent number: 10339826
    Abstract: Systems and methods are provided for automatically scoring essay responses to a prompt using a scoring model. A relevant word corpus and an irrelevant word corpus are accessed. A scoring model is generated by, for each of a plurality of words in the relevant word corpus, determining a topic signature score based on a number of appearances of that word in the relevant word corpus and a number of appearances of that word in the irrelevant word corpus. For each of a plurality of words in an essay response, a topic signature score is determined for that word. A score for the essay response is determined based on the identified topic signature scores.
    Type: Grant
    Filed: October 12, 2016
    Date of Patent: July 2, 2019
    Assignee: Educational Testing Service
    Inventors: Swapna Somasundaran, Martin Chodorow, Jill Burstein
  • Patent number: 10198431
    Abstract: For generating a word space, manual thresholding of word scores is used. Rather than requiring the user to select the threshold arbitrarily or review each word, the user is iteratively requested to indicate the relevance of a given word. Words with greater or lesser scores are labeled in the same way depending upon the response. For determining the relationship between named entities, Latent Dirichlet Allocation (LDA) is performed on text associated with the named entities rather than on an entire document. LDA for relationship mining may include context information and/or supervised learning.
    Type: Grant
    Filed: August 22, 2011
    Date of Patent: February 5, 2019
    Assignee: Siemens Corporation
    Inventors: Swapna Somasundaran, Dingcheng Li, Amit Chakraborty
  • Patent number: 9959776
    Abstract: Systems and methods are provided for measuring a user's English language proficiency. A constructed response generated by a user is received, the constructed response being based on a picture. The constructed response is processed to determine a first numerical measure indicative of a presence of one or more grammar errors in the constructed response. The constructed response is processed to determine a second numerical measure indicative of a degree to which the constructed response describes a subject matter of the picture. The constructed response is processed to determine a third numerical measure indicative of a degree of awkward word usage in the constructed response. A model is applied to the first, second, and third numerical measures to determine a score for the constructed response indicative of the user's English language proficiency. The model includes first, second, and third variables with associated first, second, and third weighting factors, respectively.
    Type: Grant
    Filed: October 31, 2017
    Date of Patent: May 1, 2018
    Assignee: Educational Testing Service
    Inventors: Swapna Somasundaran, Martin Chodorow, Joel Tetreault
  • Patent number: 9836985
    Abstract: Systems and methods are provided for measuring a user's English language proficiency. A constructed response generated by a user is received, the constructed response being based on a picture. The constructed response is processed to determine a first numerical measure indicative of a presence of one or more grammar errors in the constructed response. The constructed response is processed to determine a second numerical measure indicative of a degree to which the constructed response describes a subject matter of the picture. The constructed response is processed to determine a third numerical measure indicative of a degree of awkward word usage in the constructed response. A model is applied to the first, second, and third numerical measures to determine a score for the constructed response indicative of the user's English language proficiency. The model includes first, second, and third variables with associated first, second, and third weighting factors, respectively.
    Type: Grant
    Filed: February 27, 2015
    Date of Patent: December 5, 2017
    Assignee: Educational Testing Service
    Inventors: Swapna Somasundaran, Martin Chodorow, Joel Tetreault
  • Patent number: 9665566
    Abstract: Systems and methods are provided for automatically generating a coherence score for a text using a scoring model. A lexical chain is identified within a text to be scored, where the lexical chain comprises a set of words spaced within the text. A discourse element is identified within the text, where the discourse element comprises a word within the text. A coherence metric is determined based on a relationship between the lexical chain and the discourse element. A coherence score is generated using a scoring model by providing the coherence metric to the scoring model.
    Type: Grant
    Filed: February 27, 2015
    Date of Patent: May 30, 2017
    Assignee: Educational Testing Service
    Inventors: Jill Burstein, Swapna Somasundaran, Martin Chodorow
  • Publication number: 20150254229
    Abstract: Systems and methods are provided for a computer-implemented method of providing a score that measures an essay's usage of source material provided in at least one written text and an audio recording. Using one or more data processors, a determination is made of a list of n-grams present in a received essay. For each of a plurality of present n-grams, an n-gram weight is determined, where the n-gram weight is based on a number of appearances of that n-gram in the at least one written text and a number of appearances of that n-gram in the audio recording, and an n-gram sub-metric is determined based on the presence of the n-gram in the essay and the n-gram weight. A source usage metric is determined based on the n-gram sub-metrics for the plurality of present n-grams, and a scoring model is used to generate a score for the essay based on the source usage metric.
    Type: Application
    Filed: March 6, 2015
    Publication date: September 10, 2015
    Inventors: Beata Beigman Klebanov, Nitin Madnani, Jill Burstein, Swapna Somasundaran
  • Publication number: 20150248397
    Abstract: Systems and methods are provided for automatically generating a coherence score for a text using a scoring model. A lexical chain is identified within a text to be scored, where the lexical chain comprises a set of words spaced within the text. A discourse element is identified within the text, where the discourse element comprises a word within the text. A coherence metric is determined based on a relationship between the lexical chain and the discourse element. A coherence score is generated using a scoring model by providing the coherence metric to the scoring model.
    Type: Application
    Filed: February 27, 2015
    Publication date: September 3, 2015
    Inventors: Jill Burstein, Swapna Somasundaran, Martin Chodorow
  • Publication number: 20150243181
    Abstract: Systems and methods are provided for measuring a user's English language proficiency. A constructed response generated by a user is received, the constructed response being based on a picture. The constructed response is processed to determine a first numerical measure indicative of a presence of one or more grammar errors in the constructed response. The constructed response is processed to determine a second numerical measure indicative of a degree to which the constructed response describes a subject matter of the picture. The constructed response is processed to determine a third numerical measure indicative of a degree of awkward word usage in the constructed response. A model is applied to the first, second, and third numerical measures to determine a score for the constructed response indicative of the user's English language proficiency. The model includes first, second, and third variables with associated first, second, and third weighting factors, respectively.
    Type: Application
    Filed: February 27, 2015
    Publication date: August 27, 2015
    Inventors: Swapna Somasundaran, Martin Chodorow, Joel Tetreault
  • Patent number: 8700589
    Abstract: A system generates medical knowledge base information by using predetermined data source specific message syntax information in identifying first and second information received from first and second data sources respectively. The first and second information indicates at least one type of medical relationship between the received first and second medical terms. The system determines likelihood of existence of the at least one type of medical relationship indicated by a combination of the first and second information, in response to predetermined information indicating a number of occurrences of the at least one type of relationship in data of at least one of the first and second data source. The system outputs first and second medical terms and the at least one type of medical relationship in response to the determined likelihood of existence.
    Type: Grant
    Filed: May 24, 2012
    Date of Patent: April 15, 2014
    Assignee: Siemens Corporation
    Inventors: Kateryna Tymoshenko, Swapna Somasundaran, Vinay Damodar Shet
  • Patent number: 8639678
    Abstract: A system generates medical knowledge base information by searching at least one repository of medical information to identify sentences including a received medical term. A data processor searches the identified sentences to identify sentences including a medical term different to the received term in response to a predetermined repository of medical terms and excludes sentences without a term different to the received term, to provide remaining multiple term sentences. The data processor groups different terms of individual sentences of the multiple term sentences to provide grouped terms, determines whether a medically valid relationship occurs between different terms of an individual group of terms of the grouped terms by using predetermined sentence structure and syntax rules and outputs data representing grouped terms having a medically valid relationship.
    Type: Grant
    Filed: May 24, 2012
    Date of Patent: January 28, 2014
    Assignee: Siemens Corporation
    Inventors: Swapna Somasundaran, Vinodkumar Prabhakaran, Vinay Damodar Shet, Kateryna Tymoshenko, Mathäus Dejori
  • Publication number: 20130066870
    Abstract: A system generates medical knowledge base information by searching at least one repository of medical information to identify sentences including a received medical term. A data processor searches the identified sentences to identify sentences including a medical term different to the received term in response to a predetermined repository of medical terms and excludes sentences without a term different to the received term, to provide remaining multiple term sentences. The data processor groups different terms of individual sentences of the multiple term sentences to provide grouped terms, determines whether a medically valid relationship occurs between different terms of an individual group of terms of the grouped terms by using predetermined sentence structure and syntax rules and outputs data representing grouped terms having a medically valid relationship.
    Type: Application
    Filed: May 24, 2012
    Publication date: March 14, 2013
    Inventors: Swapna Somasundaran, Vinodkumar Prabhakaran, Vinay Damodar Shet, Kateryna Tymoshenko, Mathäus Dejori
  • Publication number: 20130066903
    Abstract: A system generates medical knowledge base information by using predetermined data source specific message syntax information in identifying first and second information received from first and second data sources respectively. The first and second information indicates at least one type of medical relationship between the received first and second medical terms. The system determines likelihood of existence of the at least one type of medical relationship indicated by a combination of the first and second information, in response to predetermined information indicating a number of occurrences of the at least one type of relationship in data of at least one of the first and second data source. The system outputs first and second medical terms and the at least one type of medical relationship in response to the determined likelihood of existence.
    Type: Application
    Filed: May 24, 2012
    Publication date: March 14, 2013
    Applicant: Siemens Corporation
    Inventors: Kateryna Tymoshenko, Swapna Somasundaran, Vinay Damodar Shet
  • Publication number: 20120078918
    Abstract: For generating a word space, manual thresholding of word scores is used. Rather than requiring the user to select the threshold arbitrarily or review each word, the user is iteratively requested to indicate the relevance of a given word. Words with greater or lesser scores are labeled in the same way depending upon the response. For determining the relationship between named entities, Latent Dirichlet Allocation (LDA) is performed on text associated with the named entities rather than on an entire document. LDA for relationship mining may include context information and/or supervised learning.
    Type: Application
    Filed: August 22, 2011
    Publication date: March 29, 2012
    Applicant: Siemens Corporation
    Inventors: Swapna Somasundaran, Dingcheng Li, Amit Chakraborty
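
As a rough illustration of the steps summarized in the abstract of patent 11861310 (remove function words, look up each remaining content word in a database, report the assigned concreteness scores), the following Python sketch may help. It is not the patented implementation: the function-word list, the in-memory dictionary standing in for the database, the example scores, and the use of a mean as the summary statistic are all assumptions made for this example.

```python
# Hypothetical sketch only: word lists and scores are invented for illustration.

# Assumed small function-word list; a real system would use a fuller inventory.
FUNCTION_WORDS = {"the", "a", "an", "of", "and", "to", "in", "on", "was", "is"}

# Assumed stand-in for the concreteness database described in the abstract.
CONCRETENESS_DB = {"dog": 4.9, "ball": 5.0, "idea": 1.6, "chased": 3.5}


def content_words(text):
    """Lowercase the tokens, strip surrounding punctuation, drop function words."""
    tokens = [t.strip(".,;:!?\"'()").lower() for t in text.split()]
    return [t for t in tokens if t and t not in FUNCTION_WORDS]


def concreteness_scores(text, db=CONCRETENESS_DB):
    """Return (word, score) pairs for content words that match a database entry."""
    return [(w, db[w]) for w in content_words(text) if w in db]


def mean_concreteness(text, db=CONCRETENESS_DB):
    """Summarize the assigned scores with a mean (an assumed choice of summary)."""
    scores = [s for _, s in concreteness_scores(text, db)]
    return sum(scores) / len(scores) if scores else None


if __name__ == "__main__":
    sample = "The dog chased the ball."
    print(concreteness_scores(sample))
    print(mean_concreteness(sample))
```

Content words with no database entry are simply skipped here; how unmatched words are treated in practice is not stated in the abstract.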