Patents by Inventor Shao Chin

Shao Chin has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7181451
    Abstract: Disclosed is an automated system, machine-readable storage medium embodying computer-executable code, and method for generating descriptive words and optionally, multi-word groups derived from a digitally encoded, natural-language input text that describes a concept, invention, or event in a selected field. The system includes (a) an electronic digital computer, (b) a database of words and optionally, word-groups derived from a plurality of texts, and (c) machine-readable storage medium embodying computer-executable code for accessing the database. The database provides, or can be used to calculate, a selectivity value for each of the words and optionally, word groups contained in or derived from the input text. Words and optionally, word groups having an above-threshold selectivity value are selected as descriptive terms from the input text.
    Type: Grant
    Filed: September 30, 2002
    Date of Patent: February 20, 2007
    Assignee: Word Data Corp.
    Inventors: Peter J. Dehlinger, Shao Chin
  • Patent number: 7024408
    Abstract: Disclosed are a computer-readable code, system and method for classifying a target document in the form of a digitally encoded natural-language text as belonging to one or more of two or more different classes. For each of a plurality of non-generic words and/or word groups characterizing the target document, there is determined a selectivity value calculated as the frequency of occurrence of that term in a library of texts in one field, relative to the frequency of occurrence of the same term in one or more other libraries of texts in one or more other fields, respectively, and the document is represented as a vector of terms, where the coefficient assigned to each term is a function of the selectivity value determined for that term. There is then determined, for each of the plurality of sample texts having associated classification identifiers, a match score related to the number of descriptive terms present in or derived from that text that match those in the target text.
    Type: Grant
    Filed: July 1, 2003
    Date of Patent: April 4, 2006
    Assignee: Word Data Corp.
    Inventors: Peter J. Dehlinger, Shao Chin
  • Patent number: 7016895
    Abstract: Disclosed are a computer-readable code, system and method for classifying a target document in the form of a digitally encoded natural-language text as belonging to one or more of two or more different classes. Each of a plurality of non-generic words and optionally, word groups characterizing the target document is selected as a descriptive term if the term has an above-threshold selectivity value in at least one library of texts in a field, where the selectivity value of a term is a measure of the field-specificity of that term. There is then determined, for each of the plurality of sample texts having associated classification identifiers, a match score related to the number of descriptive terms present in or derived from that text that match those in the target text. From the selected matched texts, and the associated classification identifiers, a classification determination of the target document is made.
    Type: Grant
    Filed: February 25, 2003
    Date of Patent: March 21, 2006
    Assignee: Word Data Corp.
    Inventors: Peter J. Dehlinger, Shao Chin
  • Publication number: 20060047656
    Abstract: Disclosed are a computer-readable code, system and method for retrieving one or more selected texts from a library of documents. The system processes a user-input search query representing the content of the text to be retrieved, and accesses a word index for the documents to identify those texts in the database having the highest word-match scores with the search query. The weights of words in the query may be adjusted to optimize the search.
    Type: Application
    Filed: August 31, 2005
    Publication date: March 2, 2006
    Inventors: Peter Dehlinger, Shao Chin
  • Patent number: 7003516
    Abstract: A computer method for representing a natural-language document in a vector form suitable for text manipulation operations is disclosed. The method involves determining (a) for each of a plurality of terms composed of non-generic words and, optionally, proximately arranged word groups in the document, a selectivity value of the term related to the frequency of occurrence of that term in a library of texts in one field, relative to the frequency of occurrence of the same term in one or more other libraries of texts in one or more other fields, respectively. The document is represented as a vector of terms, where the coefficient assigned to each term includes a function of the selectivity value determined for that term, and optionally related to the inverse document frequency of that word in one or more libraries of texts. Also disclosed are a computer-readable code for carrying out the method, a computer system that employs the code, and a vector produced by the method.
    Type: Grant
    Filed: May 15, 2003
    Date of Patent: February 21, 2006
    Assignee: Word Data Corp.
    Inventors: Peter J. Dehlinger, Shao Chin
  • Publication number: 20050278623
    Abstract: Disclosed are a computer-readable code, system and method for assisting in the preparation of a target document. The system stores a plurality of template documents which are each parsed into passages, typically paragraphs. The individual passages from the several template documents form a database of model passages from which a new document can be constructed. To retrieve a particular passage, the user describes the content of interest, or represents the content as a string of words and/or word groups. The system uses a word-records file to identify one or more descriptive passages having the highest match score with the user description. From these highest-matching passages, the user selects one or more descriptive passages for use in document construction.
    Type: Application
    Filed: May 13, 2005
    Publication date: December 15, 2005
    Inventors: Peter Dehlinger, Shao Chin
  • Publication number: 20050198026
    Abstract: Disclosed are a computer-readable code, system and method for generating candidate novel concepts in one or more selected fields. The system operates to generate strings of terms composed of combinations of word and optionally, word-group terms that are descriptive of concept elements in such field(s), and uses a genetic algorithm to find one or more high fitness strings, based on the application of a fitness metric which quantifies, e.g., the number of occurrences of pairs of terms in texts in a selected library of texts. The highest-score string or strings are then applied in a database search to identify one or more pairs of primary and secondary texts whose terms overlap with those of a high fitness string.
    Type: Application
    Filed: February 2, 2005
    Publication date: September 8, 2005
    Inventors: Peter Dehlinger, Shao Chin
  • Publication number: 20050120011
    Abstract: Disclosed are a computer-readable code, system and method for combining texts to form novel combinations of texts related to a desired target concept, where the concept is represented in the form of a natural-language text or a list of descriptive word and/or word-group terms. The system operates to find primary and secondary groups of texts having highest term match scores with a first and second subset of terms in the concept, respectively. It then generates pairs of texts containing a text from each of the primary and secondary groups of database texts, and selects for presentation to the user, those pairs of texts having highest overlap scores as determined from one or more of (i) term overlap, (ii) term coverage, (iii) feature-specific cross-correlation, (iv) attribute-specific correlation, and (v) citation score of one or both texts in the pair.
    Type: Application
    Filed: November 18, 2004
    Publication date: June 2, 2005
    Inventors: Peter Dehlinger, Shao Chin
  • Publication number: 20040064304
    Abstract: A computer method for representing a natural-language document in a vector form suitable for text manipulation operations is disclosed. The method involves determining (a) for each of a plurality of terms composed of non-generic words and, optionally, proximately arranged word groups in the document, a selectivity value of the term related to the frequency of occurrence of that term in a library of texts in one field, relative to the frequency of occurrence of the same term in one or more other libraries of texts in one or more other fields, respectively. The document is represented as a vector of terms, where the coefficient assigned to each term includes a function of the selectivity value determined for that term, and optionally related to the inverse document frequency of that word in one or more libraries of texts. Also disclosed are a computer-readable code for carrying out the method, a computer system that employs the code, and a vector produced by the method.
    Type: Application
    Filed: May 15, 2003
    Publication date: April 1, 2004
    Applicant: WORD DATA CORP
    Inventors: Peter J. Dehlinger, Shao Chin
  • Publication number: 20040059565
    Abstract: A computer method for representing a natural-language document in a vector form suitable for text manipulation operations is disclosed. The method involves determining (a) for each of a plurality of terms selected from one of (i) non-generic words in the document, (ii) proximately arranged word groups in the document, and (iii) a combination of (i) and (ii), a selectivity value of the term related to the frequency of occurrence of that term in a library of texts in one field, relative to the frequency of occurrence of the same term in one or more other libraries of texts in one or more other fields, respectively. The document is represented as a vector of terms, where the coefficient assigned to each term includes a function of the selectivity value determined for that term, and optionally related to the inverse document frequency of that word in one or more libraries of texts.
    Type: Application
    Filed: July 1, 2003
    Publication date: March 25, 2004
    Inventors: Peter J. Dehlinger, Shao Chin
  • Publication number: 20040054520
    Abstract: Disclosed are a computer-readable code, system and method for comparing a target concept, invention, or event with each of a plurality of texts. Each of a plurality of non-generic words and optionally, word groups characterizing the target concept, invention, or event, is selected as a vector term if the term has an above-threshold selectivity value in at least one library of texts in a field, where the selectivity value of a term is a measure of the field-specificity of that term. There is then determined, for each of the plurality of texts, a match score related to the number of vector terms present in or derived from that text that match those in the target concept, invention, or event. Texts having the highest match scores are selected.
    Type: Application
    Filed: July 1, 2003
    Publication date: March 18, 2004
    Inventors: Peter J. Dehlinger, Shao Chin
  • Publication number: 20040049498
    Abstract: Disclosed are a computer-readable code, system and method for classifying a target document in the form of a digitally encoded natural-language text as belonging to one or more of two or more different classes. For each of a plurality of non-generic words and/or word groups characterizing the target document, there is determined a selectivity value calculated as the frequency of occurrence of that term in a library of texts in one field, relative to the frequency of occurrence of the same term in one or more other libraries of texts in one or more other fields, respectively, and the document is represented as a vector of terms, where the coefficient assigned to each term is a function of the selectivity value determined for that term. There is then determined, for each of the plurality of sample texts having associated classification identifiers, a match score related to the number of descriptive terms present in or derived from that text that match those in the target text.
    Type: Application
    Filed: July 1, 2003
    Publication date: March 11, 2004
    Inventors: Peter J. Dehlinger, Shao Chin
  • Publication number: 20040006457
    Abstract: Disclosed are a computer-readable code, system and method for classifying a target document in the form of a digitally encoded natural-language text as belonging to one or more of two or more different classes. Each of a plurality of non-generic words and optionally, word groups characterizing the target document is selected as a descriptive term if the term has an above-threshold selectivity value in at least one library of texts in a field, where the selectivity value of a term is a measure of the field-specificity of that term. There is then determined, for each of the plurality of sample texts having associated classification identifiers, a match score related to the number of descriptive terms present in or derived from that text that match those in the target text. From the selected matched texts, and the associated classification identifiers, a classification determination of the target document is made.
    Type: Application
    Filed: February 25, 2003
    Publication date: January 8, 2004
    Inventors: Peter J. Dehlinger, Shao Chin
  • Publication number: 20040006558
    Abstract: Disclosed is an automated system, machine-readable code, and method for generating descriptive words and optionally, multi-word groups derived from a digitally encoded, natural-language input text that describes a concept, invention, or event in a selected field. The system includes (a) an electronic digital computer, (b) a database of words and optionally, word-groups derived from a plurality of texts, and (c) computer-readable code for accessing the database. The database provides, or can be used to calculate, a selectivity value for each of the words and optionally, word groups contained in or derived from the input text. Words and optionally, word groups having an above-threshold selectivity value are selected as descriptive terms from the input text.
    Type: Application
    Filed: September 30, 2002
    Publication date: January 8, 2004
    Inventors: Peter J. Dehlinger, Shao Chin
  • Publication number: 20040006459
    Abstract: Disclosed are a computer-readable code, system and method for comparing a target concept, invention, or event with each of a plurality of texts. Each of a plurality of non-generic words and optionally, word groups characterizing the target concept, invention, or event, is selected as a descriptive term if the term has an above-threshold selectivity value in at least one library of texts in a field, where the selectivity value of a term is a measure of the field-specificity of that term. There is then determined, for each of the plurality of texts, a match score related to the number of descriptive terms present in or derived from that text that match those in the target concept, invention, or event. Texts having the highest match scores are selected.
    Type: Application
    Filed: September 30, 2002
    Publication date: January 8, 2004
    Inventors: Peter J. Dehlinger, Shao Chin
  • Publication number: 20040006547
    Abstract: Disclosed is a computer-accessible database composed of a list of non-generic words contained in a plurality of digitally encoded texts. Associated with each term is a selectivity value or values that are related to the frequency of occurrence of that word in at least one library of texts in a field, relative to the frequency of occurrence of the same word in one or more libraries of texts in one or more other fields, respectively. Also associated with each term are one or more text identifiers identifying one or more of the digitally processed texts containing that word. Each text identifier may be further associated with sentence and word-number identifiers that identify the sentence and word number(s) of a given database word.
    Type: Application
    Filed: September 30, 2002
    Publication date: January 8, 2004
    Inventors: Peter J. Dehlinger, Shao Chin
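
Several of the abstracts above rely on a "selectivity value": the frequency of a term in a library of texts in one field, relative to its frequency in libraries from other fields, with above-threshold terms kept as descriptive terms and texts ranked by a match score. The Python sketch below illustrates that general idea only. It is not taken from the patents: the function names, the use of document frequency, the choice of the largest other-field frequency as the denominator, the `floor` guard against division by zero, and the threshold of 2.0 are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def doc_frequency(library):
    """Map each word to the fraction of texts in the library that contain it."""
    counts = Counter()
    for text in library:
        counts.update(tokenize(text))
    return {word: n / len(library) for word, n in counts.items()}


def selectivity(word, field_freq, other_freqs, floor=0.01):
    """Frequency in one field relative to the other fields.

    Assumption: the largest frequency among the other fields is used as
    the denominator, and `floor` avoids dividing by zero.
    """
    in_field = field_freq.get(word, 0.0)
    elsewhere = max((f.get(word, 0.0) for f in other_freqs), default=0.0)
    return in_field / max(elsewhere, floor)


def descriptive_terms(text, field_freq, other_freqs, threshold=2.0):
    """Keep the words of `text` whose selectivity meets the threshold."""
    return {
        word
        for word in tokenize(text)
        if selectivity(word, field_freq, other_freqs) >= threshold
    }


def match_score(terms, candidate_text):
    """Count how many descriptive terms also occur in the candidate text."""
    return len(terms & tokenize(candidate_text))


# Two tiny hypothetical libraries standing in for texts from two fields.
surgical = [
    "the catheter is inserted into the artery",
    "a stent and catheter assembly",
    "the device is sterile",
]
computing = [
    "the processor executes the code",
    "a memory device stores the data",
    "the network sends the packet",
]

surgical_freq = doc_frequency(surgical)
computing_freq = doc_frequency(computing)
terms = descriptive_terms("The catheter device", surgical_freq, [computing_freq])
print(terms)
print([match_score(terms, text) for text in surgical])
```

In this toy data, generic words such as "the" and words common to both fields such as "device" fall below the threshold, while a field-specific word such as "catheter" is retained. A real system along the lines the abstracts describe would also handle word groups and weight terms in a document vector, which this sketch omits.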