Patents by Inventor Shao Chin

Shao Chin has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Processing input text to generate the selectivity value of a word or word group in a library of texts in a field is related to the frequency of occurrence of that word or word group in library

Patent number: 7181451

Abstract: Disclosed is an automated system, machine-readable storage medium embodying computer-executable code, and method for generating descriptive words and optionally, multi-word groups derived from a digitally encoded, natural-language input text that describes a concept, invention, or event in a selected field. The system includes (a) an electronic digital computer, (b) a database of words and optionally, word-groups derived from a plurality of texts, and (c) machine-readable storage medium embodying computer-executable code for accessing the database. The database provides, or can be used to calculate, a selectivity value for each of the words and optionally, word groups contained in or derived from the input text. Words and optionally, word groups having an above-threshold selectivity value are selected as descriptive terms from the input text.

Type: Grant

Filed: September 30, 2002

Date of Patent: February 20, 2007

Assignee: Word Data Corp.

Inventors: Peter J. Dehlinger, Shao Chin
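
The selection step described in the abstract above can be sketched roughly as follows. This is an illustration only, not the patented implementation: the function names, the toy libraries, the use of per-text document frequency, the `floor` smoothing constant, and the threshold of 1.5 are all assumptions made for the example.

```python
from collections import Counter


def doc_frequency(library):
    """Fraction of texts in `library` (a list of token lists) containing each word."""
    counts = Counter()
    for tokens in library:
        counts.update(set(tokens))
    return {word: n / len(library) for word, n in counts.items()}


def selectivity(word, field_freq, other_freq, floor=0.01):
    """Ratio of a word's frequency in the field library to its frequency elsewhere.

    `floor` is an assumed smoothing constant that avoids division by zero
    for words never seen in the other library.
    """
    return field_freq.get(word, 0.0) / max(other_freq.get(word, 0.0), floor)


def descriptive_terms(input_tokens, field_library, other_library, threshold=1.5):
    """Return the distinct input words whose selectivity exceeds the threshold."""
    field_freq = doc_frequency(field_library)
    other_freq = doc_frequency(other_library)
    selected = []
    for word in dict.fromkeys(input_tokens):  # keeps first-seen order, drops repeats
        if selectivity(word, field_freq, other_freq) > threshold:
            selected.append(word)
    return selected


# Hypothetical toy libraries, one token list per text.
surgical = [["catheter", "device", "balloon"], ["catheter", "stent", "device"]]
computing = [["device", "driver", "kernel"], ["memory", "device", "cache"]]

terms = descriptive_terms(["a", "catheter", "device", "stent"], surgical, computing)
print(terms)
```

In this toy data, "device" is equally common in both libraries, so its ratio is 1.0 and it is not selected, while words seen only in the field library score well above the threshold.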
Text-classification code, system and method

Patent number: 7024408

Abstract: Disclosed are a computer-readable code, system and method for classifying a target document in the form of a digitally encoded natural-language text as belonging to one or more of two or more different classes. For each of a plurality of non-generic words and/or word groups characterizing the target document, there is determined a selectivity value calculated as the frequency of occurrence of that term in a library of texts in one field, relative to the frequency of occurrence of the same term in one or more other libraries of texts in one or more other fields, respectively, and the document is represented as a vector of terms, where the coefficient assigned to each term is a function of the selectivity value determined for that term. There is then determined, for each of the plurality of sample texts having associated classification identifiers, a match score related to the number of descriptive terms present in or derived from that text that match those in the target text.

Type: Grant

Filed: July 1, 2003

Date of Patent: April 4, 2006

Assignee: Word Data Corp.

Inventors: Peter J. Dehlinger, Shao Chin

Text-classification system and method

Patent number: 7016895

Abstract: Disclosed are a computer-readable code, system and method for classifying a target document in the form of a digitally encoded natural-language text as belonging to one or more of two or more different classes. Each of a plurality of non-generic words and optionally, word groups characterizing the target document is selected as a descriptive term if the term has an above-threshold selectivity value in at least one library of texts in a field, where the selectivity value of a term is a measure of the field-specificity of that term. There is then determined, for each of the plurality of sample texts having associated classification identifiers, a match score related to the number of descriptive terms present in or derived from that text that match those in the target text. From the selected matched texts, and the associated classification identifiers, a classification determination of the target document is made.

Type: Grant

Filed: February 25, 2003

Date of Patent: March 21, 2006

Assignee: Word Data Corp.

Inventors: Peter J. Dehlinger, Shao Chin
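
A minimal sketch of the match-score classification outlined in the abstract above, under stated assumptions: the match score is taken to be a plain count of shared descriptive terms, and the rule of summing the scores of the three best-matching sample texts per class is invented for illustration. The class labels and sample data are hypothetical.

```python
from collections import Counter


def match_score(target_terms, sample_terms):
    """Number of descriptive terms shared by the target and a sample text."""
    return len(set(target_terms) & set(sample_terms))


def classify(target_terms, samples, top_n=3):
    """Return the class with the largest summed match score among the top matches.

    `samples` is a non-empty list of (class_id, descriptive_terms) pairs.
    Only the `top_n` best-matching sample texts contribute (an assumed rule).
    """
    scored = [(match_score(target_terms, terms), class_id) for class_id, terms in samples]
    best = sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_n]
    votes = Counter()
    for score, class_id in best:
        votes[class_id] += score
    return votes.most_common(1)[0][0]


# Hypothetical sample texts, already reduced to descriptive terms.
samples = [
    ("A61", ["catheter", "balloon", "stent"]),
    ("A61", ["catheter", "lumen"]),
    ("G06", ["kernel", "cache", "memory"]),
    ("G06", ["stent", "driver"]),
]
label = classify(["catheter", "stent", "balloon"], samples)
print(label)
```

Here the three best matches score 3, 1, and 1, giving class "A61" a total of 4 against 1 for "G06".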
Code, system, and method for retrieving text material from a library of documents

Publication number: 20060047656

Abstract: Disclosed are a computer-readable code, system and method for retrieving one or more selected texts from a library of documents. The system processes a user-input search query representing the content of the text to be retrieved, and accesses a word index for the documents to identify those texts in the database having the highest word-match scores with the search query. The weights of words in the query may be adjusted to optimize the search.

Type: Application

Filed: August 31, 2005

Publication date: March 2, 2006

Inventors: Peter Dehlinger, Shao Chin
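
The word-index lookup and adjustable query weights mentioned in the abstract above might look something like the sketch below. The index layout, the scoring rule (sum of the weights of matched query words), and all names and documents are assumptions for illustration, not details taken from the application.

```python
from collections import defaultdict


def build_word_index(documents):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query_weights, top_n=2):
    """Rank documents by the summed weights of the query words they contain."""
    scores = defaultdict(float)
    for word, weight in query_weights.items():
        for doc_id in index.get(word, ()):
            scores[doc_id] += weight
    # Highest score first; ties broken by document id for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Hypothetical three-document library.
docs = {
    "d1": "balloon catheter with inflatable tip",
    "d2": "catheter guide wire",
    "d3": "memory cache controller",
}
index = build_word_index(docs)
hits = search(index, {"catheter": 1.0, "balloon": 2.0})
reweighted = search(index, {"catheter": 1.0, "wire": 3.0})
print(hits, reweighted)
```

Changing the weights changes the ranking: the first query favors "d1", while raising the weight of "wire" in the second query moves "d2" to the top.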
Text representation and method

Patent number: 7003516

Abstract: A computer method for representing a natural-language document in a vector form suitable for text manipulation operations is disclosed. The method involves determining (a) for each of a plurality of terms composed of non-generic words and, optionally, proximately arranged word groups in the document, a selectivity value of the term related to the frequency of occurrence of that term in a library of texts in one field, relative to the frequency of occurrence of the same term in one or more other libraries of texts in one or more other fields, respectively. The document is represented as a vector of terms, where the coefficient assigned to each term includes a function of the selectivity value determined for that term, and optionally related to the inverse document frequency of that word in one or more libraries of texts. Also disclosed are a computer-readable code for carrying out the method, a computer system that employs the code, and a vector produced by the method.

Type: Grant

Filed: May 15, 2003

Date of Patent: February 21, 2006

Assignee: Word Data Corp.

Inventors: Peter J. Dehlinger, Shao Chin
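
As a rough illustration of the vector form described in the abstract above, the sketch below builds a sparse term vector whose coefficients depend on selectivity and, optionally, inverse document frequency. The specific function used (square root of selectivity), the logarithmic IDF, the unit-length scaling, and all names and values are assumptions; the abstract only states that the coefficient is some function of the selectivity value.

```python
import math


def term_vector(terms, selectivity, doc_freq=None, n_docs=None):
    """Represent a document as {term: coefficient}, scaled to unit length.

    The coefficient is an assumed function of selectivity (its square root),
    optionally multiplied by an inverse document frequency log(n_docs / df).
    Terms whose coefficient is not positive are left out.
    """
    vector = {}
    for term in set(terms):
        coeff = math.sqrt(selectivity.get(term, 0.0))
        if doc_freq is not None and n_docs:
            coeff *= math.log(n_docs / doc_freq.get(term, 1))
        if coeff > 0:
            vector[term] = coeff
    norm = math.sqrt(sum(c * c for c in vector.values()))
    return {t: c / norm for t, c in vector.items()} if norm else {}


def dot(u, v):
    """Dot product of two sparse vectors; cosine similarity when both are unit length."""
    return sum(c * v.get(t, 0.0) for t, c in u.items())


# Hypothetical selectivity values.
sel = {"catheter": 4.0, "device": 1.0}
vec = term_vector(["catheter", "device", "the"], sel)
with_idf = term_vector(["catheter", "device"], sel,
                       doc_freq={"catheter": 1, "device": 10}, n_docs=10)
print(vec, with_idf)
```

With the optional IDF factor, a term that appears in every one of the 10 texts gets a coefficient of zero and drops out of the vector.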
Code, system, and method for generating documents

Publication number: 20050278623

Abstract: Disclosed are a computer-readable code, system and method for assisting in the preparation of a target document. The system stores a plurality of template documents which are each parsed into passages, typically paragraphs. The individual passages from the several template documents form a database of model passages from which a new document can be constructed. To retrieve a particular passage, the user describes the content of interest, or represents the content as a string of words and/or word groups. The system uses a word-records file to identify one or more descriptive passages having the highest match score with the user description. From these highest-matching passages, the user selects one or more descriptive passages for use in document construction.

Type: Application

Filed: May 13, 2005

Publication date: December 15, 2005

Inventors: Peter Dehlinger, Shao Chin

Code, system, and method for generating concepts

Publication number: 20050198026

Abstract: Disclosed are a computer-readable code, system and method for generating candidate novel concepts in one or more selected fields. The system operates to generate strings of terms composed of combinations of word and optionally, word-group terms that are descriptive of concept elements in such field(s), and uses a genetic algorithm to find one or more high fitness strings, based on the application of a fitness metric which quantifies, e.g., the number of occurrences of pairs of terms in texts in a selected library of texts. The highest-score string or strings are then applied in a database search to identify one or more pairs of primary and secondary texts whose terms overlap with those of a high fitness string.

Type: Application

Filed: February 2, 2005

Publication date: September 8, 2005

Inventors: Peter Dehlinger, Shao Chin
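
The genetic-algorithm search for high-fitness term strings mentioned in the abstract above can be illustrated with a toy example. Everything specific here is an assumption: the fitness metric (number of texts in which each pair of terms co-occurs), the keep-the-fitter-half selection, single-point crossover, the 0.2 mutation rate, and the sample data. The later database search for primary and secondary texts is not covered.

```python
import random
from itertools import combinations


def fitness(string, library):
    """Count, over all term pairs in `string`, the texts that contain both terms."""
    return sum(
        1
        for a, b in combinations(sorted(set(string)), 2)
        for text in library
        if a in text and b in text
    )


def evolve(vocabulary, library, length=3, pop_size=20, generations=30, seed=0):
    """Toy genetic algorithm: keep the fitter half, refill by crossover and mutation."""
    rng = random.Random(seed)
    population = [rng.sample(vocabulary, length) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda s: fitness(s, library), reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, length)
            child = mother[:cut] + father[cut:]  # single-point crossover
            if rng.random() < 0.2:  # assumed mutation rate
                child[rng.randrange(length)] = rng.choice(vocabulary)
            children.append(child)
        population = parents + children
    return max(population, key=lambda s: fitness(s, library))


# Hypothetical library: each text is reduced to a set of terms.
library = [
    {"laser", "catheter"},
    {"laser", "catheter", "balloon"},
    {"balloon", "stent"},
]
vocabulary = ["laser", "catheter", "balloon", "stent", "kernel", "cache"]
best = evolve(vocabulary, library)
print(best, fitness(best, library))
```

Because the fitter half of each generation is carried over unchanged, the best fitness in the population never decreases from one generation to the next.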
Code, method, and system for manipulating texts

Publication number: 20050120011

Abstract: Disclosed are a computer-readable code, system and method for combining texts to form novel combinations of texts related to a desired target concept, where the concept is represented in the form of a natural-language text or a list of descriptive word and/or word-group terms. The system operates to find primary and secondary groups of texts having highest term match scores with a first and second subset of terms in the concept, respectively. It then generates pairs of texts containing a text from each of the primary and secondary groups of database texts, and selects for presentation to the user, those pairs of texts having highest overlap scores as determined from one or more of (i) term overlap, (ii) term coverage, (iii) feature-specific cross-correlation, (iv) attribute-specific correlation, and (v) citation score of one or both texts in the pair.

Type: Application

Filed: November 18, 2004

Publication date: June 2, 2005

Inventors: Peter Dehlinger, Shao Chin

Text representation and method

Publication number: 20040064304

Abstract: A computer method for representing a natural-language document in a vector form suitable for text manipulation operations is disclosed. The method involves determining (a) for each of a plurality of terms composed of non-generic words and, optionally, proximately arranged word groups in the document, a selectivity value of the term related to the frequency of occurrence of that term in a library of texts in one field, relative to the frequency of occurrence of the same term in one or more other libraries of texts in one or more other fields, respectively. The document is represented as a vector of terms, where the coefficient assigned to each term includes a function of the selectivity value determined for that term, and optionally related to the inverse document frequency of that word in one or more libraries of texts. Also disclosed are a computer-readable code for carrying out the method, a computer system that employs the code, and a vector produced by the method.

Type: Application

Filed: May 15, 2003

Publication date: April 1, 2004

Applicant: WORD DATA CORP

Inventors: Peter J. Dehlinger, Shao Chin

Text-representation code, system, and method

Publication number: 20040059565

Abstract: A computer method for representing a natural-language document in a vector form suitable for text manipulation operations is disclosed. The method involves determining (a) for each of a plurality of terms selected from one of (i) non-generic words in the document, (ii) proximately arranged word groups in the document, and (iii) a combination of (i) and (ii), a selectivity value of the term related to the frequency of occurrence of that term in a library of texts in one field, relative to the frequency of occurrence of the same term in one or more other libraries of texts in one or more other fields, respectively. The document is represented as a vector of terms, where the coefficient assigned to each term includes a function of the selectivity value determined for that term, and optionally related to the inverse document frequency of that word in one or more libraries of texts.

Type: Application

Filed: July 1, 2003

Publication date: March 25, 2004

Inventors: Peter J. Dehlinger, Shao Chin

Text-searching code, system and method

Publication number: 20040054520

Abstract: Disclosed are a computer-readable code, system and method for comparing a target concept, invention, or event with each of a plurality of texts. Each of a plurality of non-generic words and optionally, word groups characterizing the target concept, invention, or event, is selected as a vector term if the term has an above-threshold selectivity value in at least one library of texts in a field, where the selectivity value of a term is a measure of the field-specificity of that term. There is then determined, for each of the plurality of texts, a match score related to the number of vector terms present in or derived from that text that match those in the target concept, invention, or event. Texts having the highest match scores are selected.

Type: Application

Filed: July 1, 2003

Publication date: March 18, 2004

Inventors: Peter J. Dehlinger, Shao Chin

Text-classification code, system and method

Publication number: 20040049498

Abstract: Disclosed are a computer-readable code, system and method for classifying a target document in the form of a digitally encoded natural-language text as belonging to one or more of two or more different classes. For each of a plurality of non-generic words and/or word groups characterizing the target document, there is determined a selectivity value calculated as the frequency of occurrence of that term in a library of texts in one field, relative to the frequency of occurrence of the same term in one or more other libraries of texts in one or more other fields, respectively, and the document is represented as a vector of terms, where the coefficient assigned to each term is a function of the selectivity value determined for that term. There is then determined, for each of the plurality of sample texts having associated classification identifiers, a match score related to the number of descriptive terms present in or derived from that text that match those in the target text.

Type: Application

Filed: July 1, 2003

Publication date: March 11, 2004

Inventors: Peter J. Dehlinger, Shao Chin

Text-classification system and method

Publication number: 20040006457

Abstract: Disclosed are a computer-readable code, system and method for classifying a target document in the form of a digitally encoded natural-language text as belonging to one or more of two or more different classes. Each of a plurality of non-generic words and optionally, word groups characterizing the target document is selected as a descriptive term if the term has an above-threshold selectivity value in at least one library of texts in a field, where the selectivity value of a term is a measure of the field-specificity of that term. There is then determined, for each of the plurality of sample texts having associated classification identifiers, a match score related to the number of descriptive terms present in or derived from that text that match those in the target text. From the selected matched texts, and the associated classification identifiers, a classification determination of the target document is made.

Type: Application

Filed: February 25, 2003

Publication date: January 8, 2004

Inventors: Peter J. Dehlinger, Shao Chin

Text-processing code, system and method

Publication number: 20040006558

Abstract: Disclosed is an automated system, machine-readable code, and method for generating descriptive words and optionally, multi-word groups derived from a digitally encoded, natural-language input text that describes a concept, invention, or event in a selected field. The system includes (a) an electronic digital computer, (b) a database of words and optionally, word-groups derived from a plurality of texts, and (c) computer-readable code for accessing the database. The database provides, or can be used to calculate, a selectivity value for each of the words and optionally, word groups contained in or derived from the input text. Words and optionally, word groups having an above-threshold selectivity value are selected as descriptive terms from the input text.

Type: Application

Filed: September 30, 2002

Publication date: January 8, 2004

Inventors: Peter J. Dehlinger, Shao Chin

Text-searching system and method

Publication number: 20040006459

Abstract: Disclosed are a computer-readable code, system and method for comparing a target concept, invention, or event with each of a plurality of texts. Each of a plurality of non-generic words and optionally, word groups characterizing the target concept, invention, or event, is selected as a descriptive term if the term has an above-threshold selectivity value in at least one library of texts in a field, where the selectivity value of a term is a measure of the field-specificity of that term. There is then determined, for each of the plurality of texts, a match score related to the number of descriptive terms present in or derived from that text that match those in the target concept, invention, or event. Texts having the highest match scores are selected.

Type: Application

Filed: September 30, 2002

Publication date: January 8, 2004

Inventors: Peter J. Dehlinger, Shao Chin

Text-processing database

Publication number: 20040006547

Abstract: Disclosed is a computer-accessible database composed of a list of non-generic words contained in a plurality of digitally encoded texts. Associated with each term is a selectivity value or values that are related to the frequency of occurrence of that word in at least one library of texts in a field, relative to the frequency of occurrence of the same word in one or more libraries of texts in one or more other fields, respectively. Also associated with each term are one or more text identifiers identifying one or more of the digitally processed texts containing that word. Each text identifier may be further associated with sentence and word-number identifiers that identify the sentence and word number(s) of a given database word.

Type: Application

Filed: September 30, 2002

Publication date: January 8, 2004

Inventors: Peter J. Dehlinger, Shao Chin
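
One possible in-memory shape for the word database described in the abstract above is sketched below. The record layout, the naive sentence splitting on periods, the generic-word list, the text identifier "US-1", and the selectivity value are all hypothetical; the abstract specifies only which pieces of information are associated with each word.

```python
from dataclasses import dataclass, field


@dataclass
class WordRecord:
    """One database entry for a non-generic word."""
    selectivity: dict = field(default_factory=dict)  # field name -> selectivity value
    occurrences: dict = field(default_factory=dict)  # text id -> [(sentence no., word no.)]


def add_text(database, text_id, text, generic_words):
    """Record where each non-generic word of `text` occurs."""
    for s_no, sentence in enumerate(text.split("."), start=1):
        for w_no, word in enumerate(sentence.lower().split(), start=1):
            if word in generic_words:
                continue
            record = database.setdefault(word, WordRecord())
            record.occurrences.setdefault(text_id, []).append((s_no, w_no))


db = {}
add_text(db, "US-1", "The catheter has a balloon. The balloon inflates.",
         generic_words={"the", "a", "has"})
db["balloon"].selectivity["surgery"] = 3.2  # hypothetical value for illustration
print(db["balloon"])
```

Each word maps to the texts that contain it and, within each text, to the sentence and word positions, which is enough to look up word proximity later without rescanning the texts.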
