Patents by Inventor Shao Chin
Shao Chin has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 7181451Abstract: Disclosed is an automated system, machine-readable storage medium embodying computer-executable code, and method for generating descriptive words and optionally, multi-word groups derived from a digitally encoded, natural-language input text that describes a concept, invention, or event in a selected field. The system includes (a) an electronic digital computer, (b) a database of words and optionally, word-groups derived from a plurality of texts, and (c) machine-readable storage medium embodying computer-executable code for accessing the database. The database provides, or can be used to calculate, a selectivity value for each of the words and optionally, word groups contained in or derived from the input text. Words and optionally, word groups having an above-threshold selectivity value are selected as descriptive terms from the input text.Type: GrantFiled: September 30, 2002Date of Patent: February 20, 2007Assignee: Word Data Corp.Inventors: Peter J. Dehlinger, Shao Chin
-
Patent number: 7024408Abstract: Disclosed are a computer-readable code, system and method for classifying a target document in the form of a digitally encoded natural-language text as belonging to one or more of two or more different classes. For each of a plurality of non-generic words and/or words groups characterizing the target document, there is determined a selectivity value calculated as the frequency of occurrence of that term in a library of texts in one field, relative to the frequency of occurrence of the same term in one or more other libraries of texts in one or more other fields, respectively, and the document is represented as a vector of terms, where the coefficient assigned to each term is a function of the selectivity value determined for that term. There is then determined, for each of the plurality of sample texts having associated classification identifiers, a match score related to the number of descriptive terms present in or derived from that text that match those in the target text.Type: GrantFiled: July 1, 2003Date of Patent: April 4, 2006Assignee: Word Data Corp.Inventors: Peter J. Dehlinger, Shao Chin
-
Patent number: 7016895Abstract: Disclosed are a computer-readable code, system and method for classifying a target document in the form of a digitally encoded natural-language text as belonging to one or more of two or more different classes. Each of a plurality of non-generic words and optionally, words groups characterizing the target document is selected as a descriptive term if the term has an above-threshold selectivity value in at least one library of texts in a field, where the selectivity value of a term is a measure of the field-specificity of that term. There is then determined, for each of the plurality of sample texts having associated classification identifiers, a match score related to the number of descriptive terms present in or derived from that text that match those in the target text. From the selected matched texts, and the associated classification identifiers, a classification determination of the target document is made.Type: GrantFiled: February 25, 2003Date of Patent: March 21, 2006Assignee: Word Data Corp.Inventors: Peter J. Dehlinger, Shao Chin
-
Publication number: 20060047656Abstract: Disclosed are a computer-readable code, system and method for retrieving one or more selected texts from a library of documents. The system processes a user-input search query representing the content of the text to be retrieved, and accesses a word index for the documents to identify those texts in the database having the highest word-match scores with the search query. The weights of words in the query may be adjusted to optimize the search.Type: ApplicationFiled: August 31, 2005Publication date: March 2, 2006Inventors: Peter Dehlinger, Shao Chin
-
Patent number: 7003516Abstract: A computer method for representing a natural-language document in a vector form suitable for text manipulation operations is disclosed. The method involves determining (a) for each of a plurality of terms composed of non-generic words and, optionally, proximately arranged word groups in the document, a selectivity value of the term related to the frequency of occurrence of that term in a library of texts in one field, relative to the frequency of occurrence of the same term in one or more other libraries of texts in one or more other fields, respectively. The document is represented as a vector of terms, where the coefficient assigned to each term includes a function of the selectivity value determined for that term, and optionally related to the inverse document frequency of that word in one or more libraries of texts. Also disclosed are a computer-readable code for carrying out the method, a computer system that employs the code, and a vector produced by the method.Type: GrantFiled: May 15, 2003Date of Patent: February 21, 2006Assignee: Word Data Corp.Inventors: Peter J. Dehlinger, Shao Chin
-
Publication number: 20050278623Abstract: Disclosed are a computer-readable code, system and method for assisting in the preparation of a target document. The system stores a plurality of template documents which are each parsed into passages, typically paragraphs. The individual passages from the several template documents form a database of model passages from which a new document can be constructed. To retrieve a particular passage, the user describes the content of interest, or represents the content as a string of words and/or word groups. The system uses a word-records file to identify one or more descriptive passages having the highest match score with the user description. From these highest-matching passages, the user selects one or more descriptive passages for use in document construction.Type: ApplicationFiled: May 13, 2005Publication date: December 15, 2005Inventors: Peter Dehlinger, Shao Chin
-
Publication number: 20050198026Abstract: Disclosed are a computer-readable code, system and method for generating candidate novel concepts in one or more selected fields. The system operates to generate strings of terms composed of combinations of word and optionally, word-group terms that are descriptive of concept elements in such field(s), and uses a genetic algorithm to find one or more high fitness strings, based on the application of a fitness metric which quantifies, e.g., the number occurrence of pairs of terms in texts in a selected library of texts. The highest- score string or strings are then applied in a database search to identify one or more pairs of primary and secondary texts whose terms overlap with those of a high fitness string.Type: ApplicationFiled: February 2, 2005Publication date: September 8, 2005Inventors: Peter Dehlinger, Shao Chin
-
Publication number: 20050120011Abstract: Disclosed are a computer-readable code, system and method for combining texts to form novel combinations of texts related to a desired target concept, where the concept is represented in the form of a natural-language text or a list of descriptive word and/or word-group terms. The system operates to find primary and secondary groups of texts having highest term match scores with a first and second subset of terms in the concept, respectively. It then generates pairs of texts containing a text from each of the primary and secondary groups of database texts, and selects for presentation to the user, those pairs of texts having highest overlap scores as determined from one or more of (i) term overlap, (ii) term coverage, (iii) feature-specific cross-correlation, (iv) attribute-specific correlation, and (v) citation score of one or both texts in the pair.Type: ApplicationFiled: November 18, 2004Publication date: June 2, 2005Inventors: Peter Dehlinger, Shao Chin
-
Publication number: 20040064304Abstract: A computer method for representing a natural-language document in a vector form suitable for text manipulation operations is disclosed. The method involves determining (a) for each of a plurality of terms composed of non-generic words and, optionally, proximately arranged word groups in the document, a selectivity value of the term related to the frequency of occurrence of that term in a library of texts in one field, relative to the frequency of occurrence of the same term in one or more other libraries of texts in one or more other fields, respectively. The document is represented as a vector of terms, where the coefficient assigned to each term includes a function of the selectivity value determined for that term, and optionally related to the inverse document frequency of that word in one or more libraries of texts. Also disclosed are a computer-readable code for carrying out the method, a computer system that employs the code, and a vector produced by the method.Type: ApplicationFiled: May 15, 2003Publication date: April 1, 2004Applicant: WORD DATA CORPInventors: Peter J. Dehlinger, Shao Chin
-
Publication number: 20040059565Abstract: A computer method for representing a natural-language document in a vector form suitable for text manipulation operations is disclosed. The method involves determining (a) for each of a plurality of terms selected from one of (i) non-generic words in the document, (ii) proximately arranged word groups in the document, and (iii) a combination of (i) and (ii), a selectivity value of the term related to the frequency of occurrence of that term in a library of texts in one field, relative to the frequency of occurrence of the same term in one or more other libraries of texts in one or more other fields, respectively. The document is represented as a vector of terms, where the coefficient assigned to each term includes a function of the selectivity value determined for that term, and optionally related to the inverse document frequency of that word in one or more libraries of texts.Type: ApplicationFiled: July 1, 2003Publication date: March 25, 2004Inventors: Peter J. Dehlinger, Shao Chin
-
Publication number: 20040054520Abstract: Disclosed are a computer-readable code, system and method for comparing a target concept, invention, or event with each of a plurality of texts. Each of a plurality of non-generic words and optionally, words groups characterizing the target concept, invention, or event, is selected as a vector term if the term has an above-threshold selectivity value in at least one library of texts in a field, where the selectivity value of a term is a measure of the field-specificity of that term. There is then determined, for each of the plurality of texts, a match score related to the number of vector terms present in or derived from that text that match those in the target concept, invention, or event. Texts having the highest match scores are selected.Type: ApplicationFiled: July 1, 2003Publication date: March 18, 2004Inventors: Peter J. Dehlinger, Shao Chin
-
Publication number: 20040049498Abstract: Disclosed are a computer-readable code, system and method for classifying a target document in the form of a digitally encoded natural-language text as belonging to one or more of two or more different classes. For each of a plurality of non-generic words and/or words groups characterizing the target document, there is determined a selectivity value calculated as the frequency of occurrence of that term in a library of texts in one field, relative to the frequency of occurrence of the same term in one or more other libraries of texts in one or more other fields, respectively, and the document is represented as a vector of terms, where the coefficient assigned to each term is a function of the selectivity value determined for that term. There is then determined, for each of the plurality of sample texts having associated classification identifiers, a match score related to the number of descriptive terms present in or derived from that text that match those in the target text.Type: ApplicationFiled: July 1, 2003Publication date: March 11, 2004Inventors: Peter J. Dehlinger, Shao Chin
-
Publication number: 20040006457Abstract: Disclosed are a computer-readable code, system and method for classifying a target document in the form of a digitally encoded natural-language text as belonging to one or more of two or more different classes. Each of a plurality of non-generic words and optionally, words groups characterizing the target document is selected as a descriptive term if the term has an above-threshold selectivity value in at least one library of texts in a field, where the selectivity value of a term is a measure of the field-specificity of that term. There is then determined, for each of the plurality of sample texts having associated classification identifiers, a match score related to the number of descriptive terms present in or derived from that text that match those in the target text. From the selected matched texts, and the associated classification identifiers, a classification determination of the target document is made.Type: ApplicationFiled: February 25, 2003Publication date: January 8, 2004Inventors: Peter J. Dehlinger, Shao Chin
-
Publication number: 20040006558Abstract: Disclosed is an automated system, machine-readable code, and method for generating descriptive words and optionally, multi-word groups derived from a digitally encoded, natural-language input text that describes a concept, invention, or event in a selected field. The system includes (a) an electronic digital computer, (b) a database of words and optionally, word-groups derived from a plurality of texts, and (c) computer-readable code for accessing the database. The database provides, or can be used to calculate, a selectivity value for each of the words and optionally, word groups contained in or derived from the input text. Words and optionally, word groups having an above-threshold selectivity value are selected as descriptive terms from the input text.Type: ApplicationFiled: September 30, 2002Publication date: January 8, 2004Inventors: Peter J. Dehlinger, Shao Chin
-
Publication number: 20040006459Abstract: Disclosed are a computer-readable code, system and method for comparing a target concept, invention, or event with each of a plurality of texts. Each of a plurality of non-generic words and optionally, words groups characterizing the target concept, invention, or event, is selected as a descriptive term if the term has an above-threshold selectivity value in at least one library of texts in a field, where the selectivity value of a term is a measure of the field-specificity of that term. There is then determined, for each of the plurality of texts, a match score related to the number of descriptive terms present in or derived from that text that match those in the target concept, invention, or event. Texts having the highest match scores are selected.Type: ApplicationFiled: September 30, 2002Publication date: January 8, 2004Inventors: Peter J. Dehlinger, Shao Chin
-
Publication number: 20040006547Abstract: Disclosed is a computer-accessible database composed of a list of non-generic words contained in a plurality of digitally encoded texts. Associated with each term is a selectivity value or values that are related to the frequency of occurrence of that word in at least one library of texts in a field, relative to the frequency of occurrence of the same word in one or more libraries of texts in one or more other fields, respectively. Also associated with each term are one or more text identifiers identifying one or more of the digitally processed texts containing that word. Each text identifier may be further associated with sentence and word-number identifiers that identify the sentence and word number(s) of a given database word.Type: ApplicationFiled: September 30, 2002Publication date: January 8, 2004Inventors: Peter J. Dehlinger, Shao Chin