Abstract: A method for creating and using a cross-idea association database. The method associates words and word strings in a language by analyzing word formations around a word or word string to identify other words or word strings that are semantically equivalent or nearly equivalent. One method for associating words and word strings includes querying a collection of documents with a user-supplied word or word string, determining a user-defined number of words or word strings to the left and right of the query string, determining the frequency of occurrence of the words or word strings located to the left and right of the query string, and ranking the located words.
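The query-and-rank steps described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function name `rank_context`, the whitespace tokenization, the case folding, and the choice to rank by raw count are all assumptions made for illustration.

```python
from collections import Counter


def rank_context(documents, query, window=2):
    """Rank words found within `window` positions left and right of `query`.

    Hypothetical helper: tokenizes on whitespace and lowercases everything,
    which is an assumption, not something the abstract specifies.
    """
    query_tokens = query.lower().split()
    n = len(query_tokens)
    left, right = Counter(), Counter()
    for doc in documents:
        tokens = doc.lower().split()
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == query_tokens:
                # Words immediately to the left of this occurrence.
                left.update(tokens[max(0, i - window):i])
                # Words immediately to the right of this occurrence.
                right.update(tokens[i + n:i + n + window])
    # most_common() sorts by descending frequency (the assumed ranking).
    return left.most_common(), right.most_common()


if __name__ == "__main__":
    docs = ["the big dog barked", "a big dog ran", "the small dog barked"]
    left_ranked, right_ranked = rank_context(docs, "dog", window=1)
    print(left_ranked)
    print(right_ranked)
```

With the toy documents above and a window of one word, "big" ranks first on the left and "barked" ranks first on the right, each with a count of 2.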
Abstract: A method and apparatus for translating a document segment in a first language into a document segment in a second language. A document segment can be text in the form of words or phrases in a document. The invention can be used where there is insufficient information to directly translate the document in the first language into the document in the second language. The invention includes providing an association between the document segment in the first language and a document segment in each of a plurality of third languages, providing an association between sample segments in the second language, each of which corresponds to a segment in each of the plurality of third languages, identifying at least two sample segments that are identical as a deduced association segment, and associating the deduced association segment with the document segment in the first language.
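The deduction through third languages can be sketched as a simple agreement count. This is an illustrative reading of the abstract, not the claimed apparatus: the function `deduce_translation`, the dictionary layouts, the `min_agree` parameter, and the toy vocabulary are all hypothetical.

```python
from collections import Counter


def deduce_translation(segment, to_pivot, pivot_to_target, min_agree=2):
    """Return second-language candidates that at least `min_agree` third
    languages agree on for the first-language `segment`.

    to_pivot:        {first-language segment: {third language: segment}}
    pivot_to_target: {(third language, segment): [second-language samples]}
    Both layouts are assumptions made for this sketch.
    """
    votes = Counter()
    for pivot_lang, pivot_segment in to_pivot.get(segment, {}).items():
        candidates = pivot_to_target.get((pivot_lang, pivot_segment), ())
        # set() ensures each third language contributes one vote per candidate.
        votes.update(set(candidates))
    # Identical samples reached via two or more third languages are deduced.
    return [target for target, count in votes.most_common() if count >= min_agree]


if __name__ == "__main__":
    # Toy data invented for illustration only.
    to_pivot = {"perro": {"fr": "chien", "de": "Hund", "it": "cane"}}
    pivot_to_target = {
        ("fr", "chien"): ["dog", "hound"],
        ("de", "Hund"): ["dog"],
        ("it", "cane"): ["dog", "cane"],
    }
    print(deduce_translation("perro", to_pivot, pivot_to_target))
```

In the toy data, only "dog" is reached through more than one third language, so it is the single deduced association for "perro".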