Abstract: A system and method for searching for and identifying traditional Arabic poems in unstructured text. The system includes a CPU, a computer readable memory and a computer readable storage media. The system further includes first program instructions to identify lines of text within the document that have equivalent length. The system further includes second program instructions to group the identified lines of text as candidate verses. The system further includes third program instructions to select the candidate verses to generate a candidate poem. The first, second, and third program instructions are stored on the computer readable storage media for execution by the CPU via the computer readable memory.
Type:
Application
Filed:
May 11, 2011
Publication date:
November 15, 2012
Applicant:
KING ABDULAZIZ CITY FOR SCIENCE AND TECHNOLOGY
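The three steps in the abstract above (find lines of equivalent length, group them as candidate verses, select groups as a candidate poem) can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the abstract does not say how length is measured, so the sketch assumes plain character count, and the names `candidate_poems`, `tolerance`, and `min_verses` are hypothetical.

```python
def candidate_poems(text, tolerance=0, min_verses=2):
    """Group consecutive lines of (near-)equal length into candidate poems.

    Assumption: "equivalent length" means character counts that differ
    by at most `tolerance` from the first line of the current group.
    """
    poems, run = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if line and run and abs(len(line) - len(run[0])) <= tolerance:
            # Same length as the current group: treat as another candidate verse.
            run.append(line)
        else:
            # Length changed (or blank line): close the current group.
            if len(run) >= min_verses:
                poems.append(run)
            run = [line] if line else []
    if len(run) >= min_verses:
        poems.append(run)
    return poems
```

Calling `candidate_poems(document_text)` returns a list of groups, each group being a list of equal-length lines; single lines and runs shorter than `min_verses` are discarded.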
Abstract: Computational models of dialog context have often focused on unimodal spoken dialog or text, using the language itself as the primary locus of contextual information. But as spoken unimodal interaction is replaced by situated multimodal interaction on mobile platforms supporting a combination of spoken dialog with graphical interaction, touch-screen input, geolocation, and other non-linguistic contextual factors, a need arises for more sophisticated models of context that capture the influence of these factors on semantic interpretation and dialog flow. The systems, methods, and computer program products disclosed herein address this need. A method for multimodal search includes, in part, determining an intended location of a search query based upon information received from a remote mobile device that issued the search query.
Abstract: A computer-implemented method is provided for searching documents containing complex bodies of knowledge, such as patents and research papers. The computer-implemented method and related hardware and software provide a methodology to interpret the intent of the searcher (the meaning of the searcher's query) into a MetaLanguage, including but not limited to the use of Fundamental Nature Attributes, Fundamental Action Attributes and Weighting of these attributes as they pertain to the intent of the searcher. The invention relates to semantic based searches. The same methodology that is used on the searcher's query is also used to mine and store the existing databases of patents and research papers into databases of MetaLanguage for the purpose of producing search results that better match search inquiries.
Abstract: Various embodiments provide a system, method, and computer program product for sorting and/or selectively retrieving a plurality of documents in response to a user query. More particularly, embodiments are provided that convert each document into a corresponding document language model and convert the user query into a corresponding query language model. The language models are used to define a vector space having dimensions corresponding to terms in the documents and in the user query. The language models are mapped in the vector space. Each of the documents is then ranked, wherein the ranking is based at least in part on a position of the mapped language models in the vector space, so as to determine a relative relevance of each of the plurality of documents to the user query.
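The ranking idea in the abstract above can be illustrated with a small sketch. It assumes unigram relative-frequency language models and cosine similarity as the measure of "position" in the vector space; the abstract names neither, so both are stand-ins, and `language_model`, `cosine`, and `rank` are hypothetical names.

```python
import math
from collections import Counter

def language_model(text):
    """Unigram model: maps each term to its relative frequency in `text`."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}

def cosine(a, b):
    """Cosine similarity of two sparse term-weight vectors."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank(documents, query):
    """Return document indices ordered from most to least similar to the query."""
    q = language_model(query)
    scored = [(cosine(language_model(doc), q), i) for i, doc in enumerate(documents)]
    # Sort by descending score; ties keep the original document order.
    return [i for _, i in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]
```

Each dimension of the space is a term from the documents or the query, which matches the abstract's description; a production system would likely add smoothing and term weighting, which are omitted here.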
Abstract: A computer implemented data processor system automatically disambiguates a contextual meaning of natural language symbols to enable precise meanings to be stored for later retrieval from a natural language database, so that natural language database design is automatic, to enable flexible and efficient natural language interfaces to computers, household appliances and hand-held devices.
Abstract: A method and apparatus for sub-topic identification from a search result that matches a query, said method including the steps of receiving a search result, extracting snippets from said search result that contain said query, truncating snippets on an instance of a boundary token, identifying phrases within said snippets that include the query, comparing all said phrases to determine optimal phrases, and presenting said optimal phrases. The apparatus for sub-topic identification from a search result that matches a query may include a dedicated server or a proxy for processing the search and sub-topic query.
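The pipeline in the abstract above (extract snippets containing the query, truncate at boundary tokens, collect phrases that include the query, keep the best ones) might look like the sketch below. Several choices are assumptions rather than details from the abstract: punctuation marks serve as the boundary tokens, the query is a single word, phrases extend to the right of the query, and "optimal" is taken to mean most frequent. The names `BOUNDARY`, `subtopics`, `window`, and `top` are hypothetical.

```python
import re
from collections import Counter

# Assumed boundary tokens: common punctuation marks.
BOUNDARY = re.compile(r"[.,;:!?()]")

def subtopics(results, query, window=2, top=3):
    """Return the most frequent phrases that start with a single-word `query`."""
    query = query.lower()
    counts = Counter()
    for text in results:
        # Truncate on boundary tokens so no phrase crosses a boundary.
        for piece in BOUNDARY.split(text.lower()):
            words = piece.split()
            for i, word in enumerate(words):
                if word != query:
                    continue
                # Collect phrases of the query plus 1..window following words.
                for n in range(1, window + 1):
                    if i + n < len(words):
                        counts[" ".join(words[i:i + n + 1])] += 1
    return [phrase for phrase, _ in counts.most_common(top)]
```

Frequency is only one possible ranking criterion; the comparison step in the abstract could equally weigh phrase length or position.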
Abstract: System and method for performing Unicode matching for comparing and merging similar data objects having Unicode strings that are equivalent yet not exact matches. Unicode characters are characterized by number of strokes, stroke order, radicals, geometry, and phonemes, in association with input method editor (IME) and keyboard characteristics such as the location of a character on an IME or keyboard (or the number of GUI interface interactions used in entering the character, e.g., via tapping, where “a” on a mobile device keyboard takes 1 tap of a key and “b” takes 2 taps). These characteristics associated with code points and IMEs/keyboards are utilized to create subdomains for matching and for determining “distance” to other Unicode code points (e.g., number of keyboard keys away). Allows for determining whether close, yet incorrect, data entry may have taken place. Enables merging of duplicate data objects into a master data object where objects that differ only by minor differences or spelling errors actually represent duplicate data.
Type:
Application
Filed:
December 21, 2007
Publication date:
June 25, 2009
Inventors:
Paul N. Weinberg, Richard T. Endo, Xidong Zheng, Nathan F. Yospe, Ariel Hazi
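One of the characteristics named in the abstract above, the number of keyboard keys between two characters, can be sketched as a simple distance function. This is an illustration under stated assumptions, not the inventors' method: it covers only lowercase Latin letters on a QWERTY layout, ignores the physical stagger between rows, and uses hypothetical names (`QWERTY_ROWS`, `key_distance`, `likely_duplicates`).

```python
# Assumed layout: the three letter rows of a QWERTY keyboard, unstaggered.
QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
POSITION = {
    ch: (row, col)
    for row, keys in enumerate(QWERTY_ROWS)
    for col, ch in enumerate(keys)
}

def key_distance(a, b):
    """Number of keys between two letters, counting diagonal moves as one step."""
    (r1, c1), (r2, c2) = POSITION[a], POSITION[b]
    return max(abs(r1 - r2), abs(c1 - c2))

def likely_duplicates(s, t, max_key_distance=1):
    """True if s and t differ only by substitutions of nearby keys."""
    s, t = s.lower(), t.lower()
    if len(s) != len(t):
        return False
    for a, b in zip(s, t):
        if a == b:
            continue
        if a not in POSITION or b not in POSITION:
            return False
        if key_distance(a, b) > max_key_distance:
            return False
    return True
```

Under this sketch, "Smith" and "Smuth" are flagged as a possible data-entry slip because "i" and "u" are adjacent keys, while "Smith" and "Smath" are not. The other characteristics in the abstract (strokes, radicals, phonemes, tap counts) would need their own distance tables.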
Abstract: The present invention relates to a system and methodology to facilitate automated retrieval and classification of information. A system and associated methods are provided that facilitate generation of code and/or documents. The system includes a component that receives data relating to at least one of a user's request for desired code functionality and one or more desired documents. A mapping component correlates parsed subsets of the data to specific functional objects respectively located remote from the user, wherein a generator employs the functional objects to form at least one of the desired code and the documents.