Assisted Knowledge Discovery and Publication System and Method

A system and method are presented for knowledge discovery that incorporate both humans and computers to index, process, communicate, and share knowledge and electronic content. It also provides a platform for launching an unlimited number of qualified, content-reviewed publishing/broadcasting ventures. The system assists individuals in faster and more efficient discovery/creation of new and useful knowledge and valuable artistic content. It also provides incentives to the owners of the ventures and a method for rewarding or compensating all contributors.

Description
CROSS REFERENCE TO RELATED FOREIGN PATENT APPLICATION

The present application claims priority to Canadian patent application No. CA 2,595,541, filed on Jul. 6, 2007 entitled “Assisted knowledge discovery and publication system and method” by the same applicant.

FIELD OF INVENTION

This invention generally relates to knowledge discovery, content creation, and content sharing using people, computer systems, software program agents, and databases.

BACKGROUND OF THE INVENTION

The Internet has provided a long-awaited tool for connecting people around the world and letting them communicate. One of the most important applications and implications of the Internet is its use in enhancing ideas and enabling rapid information exchange between people or groups of people with similar interests. Such growing interest has created many applications and systems for group discussion and question answering, such as Yahoo ask, Wikipedia, search engines, photo and video sharing, numerous portals, discussion groups, and the like. These systems and applications have accelerated knowledge discovery, the creation of artistic content, the production of novel and useful inventions, and in general the advancement of our understanding of the universe around us.

However, since most of these contributions are arbitrarily qualified, it takes time for the general public to come to a robust and lasting understanding of a subject, or appreciation of a content. The vast amount of data generated daily therefore has to be filtered over a relatively long period of time by the collective wisdom of the public before it can be used. In most subject matters of general public interest, ordinary people may contribute to the subject and let the facts and the best solution be found over time, but this unsupervised growth of public understanding lacks the rigor and credibility needed for a real advancement of public well-being. The rigor and credibility come only after a relatively long period of time. Most of the information available through the Internet needs further verification and research by the consumer, which can be time consuming and frustrating.

The process of peer-reviewed scientific publication, on the other hand, has the rigor and substance, and therefore the credibility, needed for true advancement of human knowledge. Nevertheless, it is a very slow process and does not offer the speed and ease of access necessary to tap into the vast potential of the general public's brain power and knowledge. Editors and reviewers of scientific journals do not have much incentive to serve unknown ordinary contributors. Moreover, they naturally do not have the resources or expertise to find and cover all the subject matters of importance and to assess and investigate all submitted content.

Therefore there is a need in the art for a system that, automatically or semi-automatically, can assist both publication/broadcasting administrators and contributors in screening and assessing all submitted content in terms of its intrinsic value and substance before it is viewed or used by the public, without imposing the above-mentioned constraints. It is also desirable to have a system that can systematically guide users through their research to discover, innovate, create, and make valuable contributions. It is also advantageous to have a central system that allows all qualified experts to launch their own publication/broadcasting ventures with the least amount of investment and overhead for commercial gain, thereby accelerating the rate of knowledge discovery, knowledge distillation, and economic growth.

SUMMARY OF THE INVENTION

In this application a system and method are presented for knowledge sharing and discovery by analyzing the content of online repositories, building an association database of ontological subjects, and soliciting electronic content in the form of text, audio, video, or any combination of them. The system and method can assist and guide users and creators, regardless of their level of knowledge, to make valuable contributions while significantly shortening the research and creation time. The shared knowledge is peer reviewed by authorities in each subject so that its quality and substance are more reliable than the arbitrarily qualified content presently available on the Internet.

The system is comprised of information processing units in the form of hardware and software that are connected to the Internet by communication means. The processing units can be comprised of electronic hardware, such as CPUs (central processing units) and memories, and software in the form of specialized programs, algorithms, and intelligent agent programs, in any applicable computer language.

In building the system, software agents are used to find important subject matters/fields of interest by looking up a list of subjects gathered from various sources such as lexicons, ontologies, dictionaries, and special dictionaries, searching through the Internet, and ranking the importance of each subject by counting the number of documents containing that subject or by any other ranking method for concepts. At the same time the software agent looks for proper names, affiliations, and addresses that are associated with the subject and ranks them according to their level of authority. Alternatively the system finds the subjects of importance and interest and the associated experts by directly searching through readily available databases where it can find the desired information, such as university URLs, specialized professional associations, who's who, and all available online publication collections.

The system then assigns appropriate names or titles to such subject matters and makes a list of available subjects and titles as candidate names for publication/broadcasting shops, to which users may subscribe and which they may run. In the preferred embodiment, the system further provides one or more online publishing/broadcasting formats for each subject matter in the form of online journals, knowledge sharing groups, interactive conferences, broadcasting templates, and the like, each of which is called a publishing/broadcasting shop in this application. The system further contains a database of authorities (experts) in each subject matter for consultation and reviewing.

Users who want to establish their own online publishing/broadcasting shop may then apply to subscribe to or buy one or more online publishing/broadcasting shop titles from among the topics and titles available. Alternatively the system accepts suggestions from interested users or subscribers to open a shop with their own suggested title or name. Interested users can include individuals, legal entities, groups of individuals, and computer agents. The system grants the privilege of establishing an online publication/broadcasting shop according to the system's predetermined standards. Once the application is approved and a publishing/broadcasting shop title is assigned to the user, the owner of the online shop can use the services of the system and start soliciting content and providing the service to her/his group of people interested in that subject matter.

To assist the editors and contributors, reviewers, and users, the system has a distilled universal repository of human knowledge that is called Ontological Subject Map (OSM) in this application. The OSM is used to screen, evaluate, guide and assist, and measure the value of a submitted content, its novelty, and overall merit of a contribution. By consulting the OSM the system can pose useful questions and make intelligent suggestions and guides for further research or clarification.

The OSM is a layered, indexed repository of universal knowledge that is built by indexing all related existing concepts and subjects, nouns, proper nouns, compound nouns, named entities, or in general all such conceivable entities and concepts, which we call Ontological Subjects (OS) in this invention. The layered index or database is built by starting from one or a number of the most popular ontological subjects and searching the available databases to find all other ontological subjects associated with each of them, ordered by their association ranks (e.g., counts). Each ontological subject is then indexed with a desired number of other ontological subjects in each layer, ordered by their association ranking. Once this layer is constructed and indexed, we repeat the procedure to find the OSs most related to each member of this layer. Each OS may be represented by a node in an open, 2-dimensional, tree-like graph. Each node can therefore be connected only to the OS node above it and to a number of other nodes below it. In each layer there are two types of nodes, namely Dormant and Non-Dormant (growing). A node in a layer is dormant if the corresponding OS is already growing in an upper layer or in the same layer. According to one exemplary embodiment, if an OS is found associated with several upper nodes and it is not growing in an upper layer, then it becomes Non-Dormant only under the single node to which it has the highest-ranking association. In this manner each ontological subject grows only once in the whole index. Each non-dormant node is therefore connected to one node above it and to a number of nodes below it, while a dormant node is connected only to the node immediately above it. If the desired number of associated OSs is not found for a node, then we add extra nodes and mark them as unknown. The desired number of associated OSs for each node can be selected arbitrarily. However, for simplicity we may choose a constant number of associations for each node.
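
By way of illustration only, the layered construction described above may be sketched in Python as follows. The names Node, build_osm, fanout, and depth, and the in-memory associations dictionary that stands in for the searched databases, are assumptions made for this sketch and are not part of the disclosure.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    name: str                      # OS label, or "unknown" for a padding node
    parent: Optional["Node"]
    score: float                   # association value to the node above
    dormant: bool = True           # dormant nodes are never expanded
    children: list = field(default_factory=list)


def build_osm(seed, associations, fanout=3, depth=2):
    """Build a layered OSM tree; `associations` maps an OS to {associate: value}."""
    root = Node(seed, None, 0.0, dormant=False)
    growing = {seed}               # every OS that is already non-dormant
    layer = [root]
    for _ in range(depth):
        candidates = []
        for parent in layer:
            # Keep the `fanout` strongest associates of this growing node.
            ranked = sorted(associations.get(parent.name, {}).items(),
                            key=lambda kv: kv[1], reverse=True)[:fanout]
            for name, score in ranked:
                child = Node(name, parent, score)
                parent.children.append(child)
                candidates.append(child)
            # Pad with "unknown" nodes when too few associates were found.
            for _pad in range(fanout - len(ranked)):
                parent.children.append(Node("unknown", parent, 0.0))
        # An OS not growing above becomes non-dormant only under the parent
        # to which it has the highest association value.
        best = {}
        for node in candidates:
            if node.name in growing:
                continue
            if node.name not in best or node.score > best[node.name].score:
                best[node.name] = node
        for node in best.values():
            node.dormant = False
            growing.add(node.name)
        layer = list(best.values())
    return root
```

In this sketch an OS that appears under two parents of the same layer grows only under the parent with the larger association value, and every other occurrence stays dormant, so that each OS grows once in the whole index.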

Furthermore, we may represent an OS with a discrete, spectrum-like function whose horizontal axis is the associated OSs and whose vertical axis is the value of each associate. In this way an Association Value (AV) function is defined and stored in the database for each OS for further use. The AV function can be considered a signature spectrum of an OS. Using signal-processing techniques such as cross-correlation, autocorrelation, the Fourier Transform (FT), and the Discrete Fourier Transform (DFT), one is then able to extract information and find hidden relationships between OSs. For instance, using the concept of power spectral density, one may define and measure the power of an OS as a sign of its importance, or use it for approximate reasoning applications.
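
As a minimal sketch of how such an AV function might be handled numerically, the following Python fragment lays an AV function out over a fixed OS axis, computes its average power from a DFT, and compares two OSs by a normalized zero-lag cross-correlation. The function names and the choice of these particular measures are assumptions of the sketch; the disclosure does not prescribe them.

```python
import cmath


def av_vector(av, axis):
    """Lay an AV function {associate: value} out over a fixed OS axis."""
    return [float(av.get(name, 0.0)) for name in axis]


def dft(signal):
    """Plain discrete Fourier transform, O(n^2), adequate for a sketch."""
    n = len(signal)
    return [sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal))
            for k in range(n)]


def power(signal):
    """Average power from the DFT; by Parseval's theorem this equals
    the mean of the squared association values."""
    n = len(signal)
    return sum(abs(c) ** 2 for c in dft(signal)) / (n * n)


def correlation(a, b):
    """Normalized zero-lag cross-correlation of two AV vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```

The ordering of the OS axis is arbitrary, so individual DFT coefficients depend on that ordering, whereas the power and the zero-lag correlation computed here do not.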

At the same time, or after the indexing of OS associations is completed, another software agent looks for the kind of association between each OS and its associates by searching through databases such as WordNet, FrameNet, the whole Internet, or any such database in which a relation between an OS and its components is expressed in natural language. The agent looks for patterns of explicitly expressed statements or semantic frames, as defined by the FrameNet project at Berkeley, to establish the kind of relationship between each two OSs. The agent may also use natural language processing (NLP) methods and algorithms, such as text simplification, to find such an association pattern. However, since there is a vast amount of data available, chances are that the agent will be able to find the explicitly expressed and verified statement or frame, composed by humans, that it is looking for. The verification of relations is done by statistical analysis of the database: the diversity of sources and the number of times that a statement is repeated to express a relation between two OSs lead to the verification of that statement. These statements, or semantic frames, expressing a relationship between an OS and its components are also stored and indexed for further reference.
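
The statistical verification described above may, for example, be approximated by counting how often a statement is observed and in how many distinct sources. The following Python sketch assumes that statements have already been extracted as (OS, relation, OS, source) tuples; the function name and the two thresholds are hypothetical values chosen for illustration.

```python
from collections import defaultdict


def verify_relations(observations, min_sources=2, min_count=3):
    """Return the (os_a, relation, os_b) statements treated as verified.

    `observations` is an iterable of (os_a, relation, os_b, source) tuples.
    Both thresholds are illustrative assumptions, not disclosed values.
    """
    counts = defaultdict(int)      # how often each statement was seen
    sources = defaultdict(set)     # which distinct sources stated it
    for os_a, relation, os_b, source in observations:
        key = (os_a, relation, os_b)
        counts[key] += 1
        sources[key].add(source)
    return {key for key in counts
            if counts[key] >= min_count and len(sources[key]) >= min_sources}
```

A statement repeated many times by a single source is not accepted under this rule, which reflects the role that diversity of sources plays in the verification.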

This database is then used to assess textual documents or any electronic content, such as audio, video, pictures, graphs, curves, etc., whose information has been transferred to textual format. The system first extracts the ontological subjects of a document and forms an OS spectrum, or associated set, for the document using predetermined weighting coefficient rules. In one simple aspect of the invention, the system can then select an OS as the principal OS of the document and compare the document spectrum with the spectrum of the principal OS stored in the database for further analysis. Alternatively one may partition a document into a number of parts and repeat the process of OS mapping on this collection of smaller contents, in the same way that an OSM is made from a larger collection of contents.
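
One simple way to form such a document spectrum, given here only as a sketch, is to count whole-word occurrences of each known OS in the text and scale the counts by per-OS weights. The function names, the counting rule, and the default weight of 1.0 are assumptions of this sketch.

```python
import re


def document_spectrum(text, known_os, weights=None):
    """Map each known OS found in `text` to its weighted occurrence count."""
    weights = weights or {}
    lowered = text.lower()
    spectrum = {}
    for os_name in known_os:
        # Whole-word, case-insensitive match of the OS label.
        pattern = r"\b" + re.escape(os_name.lower()) + r"\b"
        hits = len(re.findall(pattern, lowered))
        if hits:
            spectrum[os_name] = hits * weights.get(os_name, 1.0)
    return spectrum


def principal_os(spectrum):
    """Pick the highest-amplitude OS as the principal OS of the document."""
    return max(spectrum, key=spectrum.get)
```

For example, a text that mentions "cell" three times and "energy" twice yields a spectrum in which "cell" is the principal OS under the default weights.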

The analysis includes, but is not limited to, discovery of new ontological subjects and discovery and verification of new associations between OSs. Over time, new nodes and associations will show their importance by leading to the growth of the newly discovered node or of other nodes, and by yielding verified associations that are valuable to other contributors or of commercial interest to commercial entities and ventures.

The system may also expand each OS into its constituent OS components and form a more expanded OS spectrum for the document. In this way we can form an almost distinguishable OS spectrum for each document. The document OS spectrum bears important information about the value of the text composition, its novelty, and its main points. Peaks and valleys may be used to analyze the content in terms of its novelty and as an indication of possible new knowledge. For instance, from the document spectrum we may select the highest-amplitude OS as the main or principal subject of the text, then look at the next several highest-amplitude OSs and form an abbreviated or abstracted spectrum of the text. We then compare this abstracted spectrum with the spectrum of the main OS already stored in the database. If there is a strong correlation between the abbreviated spectrum of the text and the principal OS spectrum in the database, chances are that the content of the text does not bear much new information. For further checking, however, one may look at the kinds of statements and frames that are used in the text to connect the components of the document spectrum to the main OS and compare them with the existing database of known relations between these OSs. Generally, there are more ways known in the art of spectral and signal analysis to evaluate the correctness and novelty of the text using the mentioned OS spectrum. When there are distinguishable peaks in the document spectrum for which the system does not have a record of verified relations, the system marks them as novel and worthy of investigation and can compose a series of questions or suggestions to explain their relationship. It may also zoom in on lower-amplitude OSs and question or suggest a relationship between a high-amplitude OS and a lower-amplitude one. All this information is available both to the editors of each shop and to the creator of the content. The system or the editor of each shop can present such unknowns to the public and solicit contributions to the solution.
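
The comparison and novelty marking described above may be sketched in Python as follows. The use of cosine similarity as the correlation measure, the representation of verified relations as unordered OS pairs, the wording of the generated question, and all function names are assumptions made for illustration.

```python
def abbreviated(spectrum, top_n=3):
    """Split a document spectrum into its principal OS and next top_n peaks."""
    principal = max(spectrum, key=spectrum.get)
    rest = sorted((k for k in spectrum if k != principal),
                  key=spectrum.get, reverse=True)[:top_n]
    return principal, {k: spectrum[k] for k in rest}


def cosine(a, b):
    """Cosine similarity between two sparse spectra given as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = sum(v * v for v in a.values()) ** 0.5
    norm_b = sum(v * v for v in b.values()) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def assess(spectrum, stored_av, verified_pairs, top_n=3):
    """Compare a document with the stored AV of its principal OS and flag
    peaks that have no verified relation to that principal OS."""
    principal, abstract = abbreviated(spectrum, top_n)
    similarity = cosine(abstract, stored_av.get(principal, {}))
    novel = [k for k in abstract
             if frozenset((principal, k)) not in verified_pairs]
    questions = [f"How is '{k}' related to '{principal}'?" for k in novel]
    return {"principal": principal, "similarity": similarity,
            "novel": novel, "questions": questions}
```

A high similarity together with an empty list of novel peaks would suggest, under these assumptions, that the document adds little to what is already stored for its principal OS.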

The strength of such a knowledge discovery system lies in its systematic processes, large number of potential participants, limitless subject matters, and its vast databases that are not readily available to individuals. The potential value of the system also lies in that the method enables measuring and quantification of one's contribution, both implicitly and explicitly to the advancement of the knowledge database.

To represent such knowledge to the public, the system uses publishing/broadcasting shops as mentioned above. The system receives information content in the form of text, audio, video, or any combination of them that is in general related to one or more subjects or categories, whether solicited or not. The content received is tagged with a unique reference and with authenticated submitter information, such as digital signatures, biometric information, IP address, etc., or any other means appropriate to make sure that the content being submitted is uniquely tagged and owned by a real single entity, such as one or more individuals, agents, legal entities, and the like.

The subject or category can be identified by a computer program, by the creator/s of the content, by people other than the creator of the content, or in general by any combination of these three groups. The system then, with or without the help of the shop administrator/s, qualifies the content of the submission as described above in terms of its merit, novelty, importance, and impact. The system may further adjust the overall merit of a submission by looking at the rank and credit of the submitters and their affiliations.

The system finds the authorities, or experts, in the subject, again either automatically by computer programs or by humans. The content is then sent to one or more of these authorities, whom we call reviewers, and they are asked to evaluate, comment, make suggestions, give opinions, and provide feedback via an online communication channel such as email and the like.

The reviewers are asked to evaluate the information content of the creator/s and to recommend that the content be included in the data or knowledge repository of the system for use by other users or clients, be rejected for inclusion, or be included after a revision by the creator/s, subject to the satisfaction of the reviewer/s.

If the reviewers recommend the content for inclusion or online publishing/broadcasting conditionally, then the content and the comments or questions are sent to the creator/s, who are given a period of time to send the revised content. The revised content, along with the answers to the reviewers' comments or questions, can be sent to the reviewers again with a request for their recommendation either for inclusion in the data/knowledge base of the system or for rejection. The creator/s are then informed of the final decision.

The subject matter is basically limitless as long as qualified reviewers can be found by human assistance or by an automatic program (a program that finds the authorities and ranks them based on an algorithm, which we can call “Ranked Subject Matter Authorities” or RSMA). If the system cannot find qualified authorities, the content can still be published under a different collection, which is marked as non-reviewed content. Since the reviewed publications are peer or expert reviewed, that collection is citable and can be used to the credit of the creator of the content.
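
The review cycle described above may be summarized, purely as a sketch, by the following Python fragment. The rule used to combine several reviewers' recommendations (any rejection rejects, otherwise any revision request triggers a revision) and the status strings are hypothetical; the disclosure does not specify how multiple recommendations are combined.

```python
from enum import Enum


class Decision(Enum):
    ACCEPT = "accept"
    REVISE = "revise"
    REJECT = "reject"


def combine(decisions):
    """Hypothetical combination rule for several reviewers' decisions."""
    if Decision.REJECT in decisions:
        return Decision.REJECT
    if Decision.REVISE in decisions:
        return Decision.REVISE
    return Decision.ACCEPT


def final_status(first_round, second_round=None):
    """Follow one submission through review, revision, and final decision."""
    if not first_round:
        # No qualified reviewers were found for this subject matter.
        return "published (non-reviewed collection)"
    outcome = combine(first_round)
    if outcome is Decision.REVISE:
        if not second_round:
            return "awaiting revision"
        # After revision the reviewers recommend inclusion or rejection;
        # anything other than acceptance is treated as rejection here.
        outcome = combine(second_round)
    if outcome is Decision.ACCEPT:
        return "published (reviewed collection)"
    return "rejected"
```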

Revenue for each shop and for the system may be generated by paid subscriptions to one or a number of shops, by selling copies of content, by advertisement, and by all the known revenue sources of electronic commerce. Moreover, the system can be mandated by an entity to make an effort to find a solution to a challenging problem that is important to that entity. The system then splits the proceeds among all the contributing parties according to a predefined contract.

The commercial success of the system is mostly based on the substance of the content published or broadcast and the value of its service to the users. Therefore the system, in one aspect of this invention, shares its success with its contributors. Over time, depending on the success of a content in terms of its popularity and importance, a creator accumulates credit points and at some point can claim those credits in some form of monetary compensation, rewards, prizes, profit sharing, ownership, etc. There is provided a method to quantify the importance of one's contribution to the art. The more a submitted content generates further ontological subjects and grows its node, the higher the rank of importance and contribution of the content will be. Ranking algorithms for linked databases, such as PageRank, can also be applied to evaluate the importance and impact of content over time.
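
As one illustration of applying a PageRank-style ranking to published content, the following Python sketch runs the standard power iteration over a small graph in which each content item lists the items it refers to. The graph representation, the damping factor of 0.85, and the fixed number of iterations are assumptions of this sketch.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank items of a link graph given as {item: [items it refers to]}."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Share this item's rank among the items it refers to.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new[target] += share
            else:
                # An item with no outgoing links spreads its rank evenly.
                for target in nodes:
                    new[target] += damping * rank[node] / n
        rank = new
    return rank
```

The resulting scores sum to one, so they could, for example, serve as relative weights when credit points are allocated among contributors, although the disclosure leaves the exact allocation rule open.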

Considering that each shop's title is also a node in the Ontological Subjects database, it is also possible to evaluate the overall rank and importance of the shops in a similar fashion. The success of a shop is measured by its popularity, the importance and impact of its subject, and the revenue that the shop or its owner has generated. The system allows shop owners, with or without the help of the system, to generate income by, for example, displaying other entities' advertisements, banners, etc., or by any other means appropriate and accepted by law. The system in turn benefits from such income based on the predefined agreements with each shop owner.

The present invention provides a system and method for faster and more efficient universal knowledge discovery, firstly, by providing and presenting the worthwhile and important subjects to explore and work on; secondly, by building the map of ontological subjects of the universe and using it to assist and guide users to explore, to assess their work, and to discover new knowledge of the subjects as fast as possible; and thirdly, by providing an environment for rapid, expert-reviewed circulation and publication of new or filtered knowledge that is more credible and rigorous than non-reviewed material published over the Internet. The described system and method do not impose any limit on the number of subjects or the number of contents being received, thereby enabling exploration of all possible subject matters of interest and importance to the public and science while maintaining the desirable standards of the published content. This will bring the cost of useful knowledge discovery down significantly.

Moreover the invention provides a system and method that allows people to get a fast assessment of their work universally and have a rapid access to the authorities' comment on a creation that they have worked on. The described method further provides access to the most updated, yet assessed, ideas and state of the universal knowledge.

Among the advantages of the present disclosure, in a preferred embodiment, is having a central system that allows all qualified experts to launch their own publication/broadcasting ventures with the least amount of investment and overhead for commercial gain. Furthermore, this invention provides a method for rewarding the contributors by measuring the impact of their contributions and sharing with them the commercial success or profit of the system accordingly, thereby encouraging the brightest to participate in the advancement of the state of the art and economics.

It is another object of this invention to build an upper or universal knowledge repository, or ontological subject map, that can address all queries while it expands over time. Such a map can help users confidently navigate the state-of-the-art knowledge of the universe and effectively guide them in their research, leading to new discoveries.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1: Illustrates one simplified exemplary architecture of the knowledge discovery and publication/broadcasting method according to the present invention.

FIG. 2: An exemplary schematic of building the repository of subject matters and corresponding authorities with their ranking and contacts information using general databases.

FIG. 3: Another exemplary schematic of building the repository of subject matters and corresponding authorities with their ranking and contacts information using targeted databases.

FIGS. 4A and 4B: Show the content of the basic databases of publication/broadcasting shops available for users and subscribers; 4A is provided by the automatic method according to the invention, and 4B is the list of user-suggested shops according to the invention.

FIG. 5: Shows exemplary building blocks and process flow of publication/broadcasting method according to the invention.

FIG. 6: Shows one exemplary graphical representation of the ontological subject association database (or Ontological Subject Map, OSM) according to the invention.

FIG. 7: Shows one flow diagram of the process of building the ontological subject association database (i.e., OSM).

FIGS. 8a and 8b: Show exemplary representations of an OS versus its associated OS (constituent OS): A: the constituent OSs ordered by their association value to the OS1 and B: The Association Value (AV or AVD) function representation of an OS or a document in relation to universal OS axis (domain).

FIG. 9: Document Association Value (AVD) function of a document, after all or some of constituent OSs were expanded.

FIG. 10: One exemplary flow diagram of extracting, indexing, and updating the database of association statement/frame and scoring the merit of an input.

FIG. 11: Another exemplary representation of an OS association graphs indicating that each shop is considered as node and shows there could exist some unknown nodes and relations.

FIG. 12: One exemplary flow diagram of extracting, indexing, and updating the database of association statements/frames and scoring the merit of an input. Shows how the OS database is created and updated over time: associations are updated, new associations are established, and new nodes are added as the knowledge base grows.

DETAILED DESCRIPTIONS

The invention is now described in detailed disclosure accompanied by several exemplary embodiments of the system and its building blocks.

Without restriction intended for any form of electronic contents such as text, audio, video, pictures and the like, we start by describing the embodiments with regards to inputs that are in the form of text. However, for other forms of electronic content the present methodology and process can be used once one considers that all types of electronic contents are different realization of semantic representation of universe. Therefore a semantic or knowledge representation transformation will make the current description applicable to all forms of electronic contents submitted to the system.

To be clear throughout this description, let us define “Subject Matter (SM)” and “Ontological Subject (OS)” at the beginning. Generally, any string of characters can be a “Subject Matter (SM)” or “Ontological Subject (OS)” according to the definitions of this invention. Less generally, they could be any word or combination of words. Therefore SMs and OSs have in principle the same characteristics and are not distinguishable from each other. More specifically, a subject matter (SM) is a word or combination of words that shows a repeated pattern in many documents and that people, or some groups of people, come to recognize. Nouns and noun phrases, and verbs and verb phrases, with or without adjectives, are examples of subject matters. For instance, the word “writing” could be a subject matter, and the phrase “Good Writing” is also a subject matter. A subject matter can also be a sentence or any combination of a number of sentences. We define “Ontological Subjects (OS)” as subject matters worthy of knowing about. They are mostly related, but not limited, to nouns, noun phrases, entities, and things, real or imaginary.

Referring to FIG. 1, there is shown one brief and simplified schematic block diagram of the system of knowledge discovery and publication/broadcasting method. The system consists of one or more databases and one or more publishing/broadcasting shops. Computer software programs are provided for delivering the services to the users. As shown, the system first receives content through a communications medium such as the Internet, whereupon it authenticates the submission and tags it with the desired tagging information. The submission is then passed to the content admin. The content admin's job is to find and assign the right shop in which the content should be considered for publishing or broadcasting, to find the experts related to the subject of the content from the database, and, once the content analysis and revision are complete, to send the content to the corresponding shop for inclusion in its database, accessible by other users through communication means such as the Internet. The content admin also passes the content to the content analyzer. The content analyzer's role is to evaluate the merit of the submission in terms of its credibility, its informative statements, the existence of new knowledge, and any other criteria that might be related to the value of a submission. In doing so, the content analyzer consults the pre-built knowledge database that contains the indexed Ontological Subjects (OS) and their relations.

The important step in building the system proposed in FIG. 1 is to build a repository of subject matters of importance and interest.

In FIG. 2, the flow diagram of identifying and finding subject matters of interest for discussion, research, and further investigation for inclusion in the online system is illustrated. FIG. 2 shows the process of finding the subject matters and potential titles for e-pub/broadcast shops. This can be done by feeding a list of concepts from a primary knowledge repository such as a lexicon (e.g., WordNet), a semantic frames list (e.g., FrameNet), a universal ontology (e.g., SUMO, the Suggested Upper Merged Ontology), or any other such list of subjects assembled automatically or manually, to a Searching Agent (SA). The SA can search the Internet and look for specific information, such as the number of documents on the Internet dealing with a specific term or concept, find a relation between a concept and the proper noun entities who have contributed to that subject, or perform any other desirable task. Such searching agents, also called intelligent search agents or web robots, can vary in their tasks. An article by G. M. Youngblood entitled “Web hunting: Design of a Simple Intelligent Web Search Agent,” which appeared in the ACM Crossroads Student Magazine (summer 1999), provides the basic elements of intelligent agents that are used for the construction of intelligent Web search agents. The article describes the basic principles of composing such web robots to do a variety of tasks by searching through the databases on the Internet. By Internet database is meant all forms of data that can be found, from a single web page, to more structured databases such as domain-specific databases of published material, to the whole database of a search engine company such as Google, Yahoo, MSN, and the like.

In particular, a web robot can be employed to search through a search engine and find the approximate total number (count) of web pages containing a word or a phrase, or the count of co-occurrences of each two OSs. Furthermore, it can be programmed in languages such as Perl, Python, AWK, and many others such as C, C++, C#, and the like, to look for specific textual patterns or co-occurrences of words within certain proximities, and basically to extract any type of character string that is desired from a text. Those familiar with Natural Language Processing (NLP) and Computational Linguistics (CL) can readily use such languages to write scripts and programs to extract different types of textual information from a text. In principle it is possible to parse sentences, simplify compound sentences, rephrase text, summarize, find lexical elements such as noun phrases, extract proper nouns or named entities, replace synonyms, analyze a text syntactically and semantically, make lists, build databases, manipulate strings of characters, and generally execute any algorithm that is designed for a specific goal. A good introduction to the subject of NLP and CL can be found on the web site of the “American Association for Artificial Intelligence” (www.aaai.org). Another good source is the book entitled “Text Mining Application Programming” by Manu Konchady, published by Charles River Media, Boston, Mass., 2006. The book provides information, teaching, training, and many accompanying application programs to perform the tasks mentioned here.
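
As a small example of the kind of script contemplated above, the following Python sketch counts co-occurrences of two single-word terms within a given proximity in a text. The tokenization rule, the default window of five tokens, and the restriction to single-word terms are simplifying assumptions of this sketch.

```python
import re


def cooccurrence_count(text, term_a, term_b, window=5):
    """Count pairs of occurrences of two single-word terms that lie
    within `window` tokens of each other."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    pos_a = [i for i, tok in enumerate(tokens) if tok == term_a.lower()]
    pos_b = [i for i, tok in enumerate(tokens) if tok == term_b.lower()]
    return sum(1 for i in pos_a for j in pos_b
               if i != j and abs(i - j) <= window)
```

Counts of this kind, accumulated over many documents, are one possible source of the association values between OSs.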

Referring to FIG. 2 again, a searching software agent (SSA), which includes an SA, is employed to search, gather, and analyze the information available on the Internet for specific purposes. One primary function of the SSA in this configuration is to find, from the whole available repository of human knowledge such as the Internet, the subjects of interest to society together with their importance or rank. The second important function of the software agent in FIG. 2, which also includes an SA, is to find the names of the real entities, individuals or agents, considered experts in each of these subject matters, and to extract their affiliations and contact addresses.

There are a number of ways of doing this task. One simple way to find and list the important subject matters of interest is to use a search engine and look at the number of web pages that contain a given term or phrase. The term or phrase to be fed to the SSA can come from any list of words, such as dictionaries, ontologies, lists of proper names, or any list of words and phrases that exists or may exist. Search engines usually show web counts (or hits) that can be used as an indication of the importance of a term. The web count that a search engine shows indicates the level of attention that society pays to a subject matter, though it is not an exact indication of the intrinsic value of that subject matter. In particular, when a search for a general noun such as Science, Physics, or Biology, or a combination such as “Biophysics” or “Biochemical machine”, returns a large number of documents containing that term, this indicates strong human interest in the term and hence suggests its importance in human life. More sophisticated rules, algorithms, and criteria may be devised to find important subject matters.
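As a minimal sketch of the web-count approach just described, the following Python fragment ranks a list of candidate terms by the count reported for each term. The function name `rank_by_web_count` and the `count_fn` callback are hypothetical and are not part of the disclosure; the callback stands in for whatever search engine query the SSA would actually issue, and the counts shown are made-up placeholders.

```python
from typing import Callable, Iterable, List, Tuple


def rank_by_web_count(
    terms: Iterable[str], count_fn: Callable[[str], int]
) -> List[Tuple[str, int]]:
    """Rank terms by descending web count; ties are broken alphabetically."""
    counts = {term: count_fn(term) for term in terms}
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Made-up counts standing in for the hits a search engine would report.
sample_counts = {"Physics": 900, "Biology": 900, "Biophysics": 120}
ranking = rank_by_web_count(sample_counts, lambda term: sample_counts[term])
```

Keeping the count lookup behind a callback leaves the choice of search engine, corpus, or cached count table open, which matches the statement that any list of words and any source of counts may be used.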

FIG. 3 shows a more effective way to find important subject matters and the names and addresses of the authorities. In this configuration the SSA is provided with the addresses (URLs) of sites that hold rich repositories of subject matters and terms of interest and that also contain the names, affiliations, and addresses of a large number of experts. For instance, the SSA can be used to extract a subject matter, and the name and address of the individual associated with that subject matter, by searching the web sites of universities (whose addresses usually contain “.edu”), scientific organizations such as “ieee.org”, online content stores such as “amazon.com”, and many other online content collections. These collections contain the titles, the expert names, and other necessary information that can readily be extracted by the searching agent or agents. For example, in the paper entitled “White Page Construction from Web Pages for Finding People on the Internet”, by Hsin-Hsi Chen and Guo-Wei Bian, which appeared in Computational Linguistics and Chinese Language Processing, vol. 3, no. 1, February (1998), the authors describe a method of finding the names and extracting the contact addresses of individuals from the Internet.

Consequently, the system shown in FIGS. 2 and 3 will create a list of subject matters, find for each an appropriate title that reflects the essence of the subject matter, and put them in a list of Subject Matters, i.e. SMs. The list of SMs may or may not be hierarchical. The system in FIGS. 2 and 3 will further create a list of individual experts considered authorities for each subject matter. The list of authorities may also be ranked according to certain metrics, for example the number of quality contributions to each subject, how many times others have referred to them or their work, or how many important sources have referred to them. Different algorithms can be used to rank the subject matters and authorities.

FIG. 4A shows the list of the titles or subject matters, their corresponding authorities, the list of shops with available titles, and the list of qualified people who are eligible candidates for running a shop. Titles are not necessarily the same as subject matters, but it is preferable that they reflect their corresponding subject matter. FIG. 4B shows that such a list may also be proposed and referred by users, in addition to the list that the system has built. The list is available to any interested user who wants to publish an online journal or run a broadcasting shop. Users who want to establish their own online publishing/broadcasting shop may then apply to subscribe to or buy one or more online publishing/broadcasting shop titles, either from among the available topics and titles or by suggesting their own to the system. Interested users can include individuals, legal entities, and groups of individuals, as well as computer agents. The system (which may be called the main host) may also publish and administer as many shops as it desires under its own administration. The system (main host) will grant the privilege of establishing an online publishing/broadcasting shop according to the system's predetermined standards. The system, however, need not be physically located in one place; different parts of the system, such as servers, databases, and storage, or even control and administration, may be placed or carried out in different places.

Once the application is approved and a title of a publishing/broadcasting shop is assigned to the user, the owner of the shop can use the service of the system and start soliciting or accepting contents, and providing the service to her/his/its group of people interested in that subject matter. The system or the administrators of the publishing/broadcasting system may also invite certain individuals to administer one or more of the publishing shops and act as editor or promoter of the journal (publishing/broadcasting shop). For instance, a computer program identifies subjects of interest by searching and analyzing the available information, e.g. by automatically searching the Internet, finds associations between a subject of interest and the authorities in that subject, and invites them to establish and administer their own online journal using certain rules and protocols that are provided by the host publisher (the main publishing site and system). New subjects can be introduced or proposed by a user, and once the user's authenticity and credit are established the user can also establish her/his/its own shop with the proposed title or subject.

The subject matter is basically limitless as long as qualified reviewers can be found, by human assistance or by an automatic program (a program that finds the authorities and ranks them based on an algorithm, which we can call “Ranked Subject Matter Authorities” or RSMA). If the system cannot find qualified authorities, the content can still be published under a different collection which is marked as non-reviewed content. Since the publications are peer or expert reviewed, the reviewed collection is citable and can be used to the credit of the creator of the content.

FIG. 5 shows another embodiment of the online publishing system according to the present invention, in more detail than FIG. 1. The system is composed of N online shops (N being an integer). The shops have been established by real individuals, other entities, and/or even computer agents, each of which administers a publishing shop or journal. The system receives contents from creators through, for instance, its webpage or any other means of communication. The system initially tags the received content with the required and desired information, such as the date and time of submission, the IP address of the submitting computer, and the like. The system also allows interested people to subscribe to one or more shops through an online registration process, as is customary in e-business. The creator may or may not be a registered subscriber or member of the system or of any of its shops. The readers and contributors (creators) can usually search the system to find their shop of interest in which to read, or to which to submit their content or manuscripts. If the creator does not specify a shop, the system will assign a shop to consider the submitted content for possible online publication/broadcasting. The system or the administrators of the publishing/broadcasting shops may also invite certain individuals to administer one or more of the shops and act as editor or promoter, or to provide reviewing services.

In the preferred embodiment, the content is submitted through the main publishing host, and therefore each submitted content receives a submission date that can be used for crediting the contributor/s or as an indication of priority.

As shown in FIG. 5, once an information content is received by the system and its subject or main semantic is assigned, the system will find the authorities expert in that subject, again either automatically by computer programs or by humans. The content is then sent to one or more of these authorities, whom we call reviewers, who are asked to evaluate it, comment on it, and give their opinion and feedback via an online communication channel such as email and the like.

The reviewers are asked to evaluate the information content of the creator/s and give their feedback, recommending that the content be included in the data or knowledge repository of the system for use by other users or clients, be rejected for inclusion, or be included after a revision by the creator/s to the satisfaction of the reviewer/s.

If the reviewers recommend the content for inclusion or online publishing conditionally, then the content and the comments or questions are sent to the creator/s, who are given a certain period of time to send the revised content. The revised content, along with the answers to the reviewers' comments or questions, can be sent to the reviewers again with a request for their recommendation, either for inclusion in the data/knowledge base of the system or for rejection. The creator/s will then be informed of the final decision. It should be mentioned that the reviewers, in general, could be intelligent expert agents in that subject matter. After final acceptance the content will be included in the repository of the corresponding shop. The accepted content can then be published immediately and made available to the users (readers), readable by special software for viewing such materials such as Zinio's digital publishing software (www.zinio.com), and/or be collected and released periodically in the form of a magazine or in any other format that is desired and available based on the capabilities of the state of the art at the time of publishing.

Referring to FIG. 5 again, there is a block that initially assesses the merits of the content being submitted. The block consults the knowledge databases (KDB, OS map) and extracts the knowledge in the content. It also assists the creators, and users in general, by providing the analysis results and guiding them to enrich their content. The knowledge database shown in FIG. 5 contains an index of ontological subjects.

FIG. 6 shows a layered, indexed repository of universal knowledge that is built by indexing all related existing concepts and subjects, nouns, proper nouns, compound nouns, named entities, or in general all such conceivable entities, which are called ontological subjects (OS) in this invention, as defined earlier. As seen, a node in an open two-dimensional tree-like graph may represent an OS. The graph is called the “Ontological Subject Map”, or OSM for short, in this invention.

FIG. 7 shows one preferred exemplary algorithm for building the index in FIG. 6. The index in FIG. 6 is built by starting from one or a number of the most popular ontological subjects and searching the available databases to find all other ontological subjects associated with each of them, ordered by their association ranks (e.g. counts). One simple way is to use a search engine to search for a combined pair of OSs and look at the web count figure. Each ontological subject is then indexed with a desired number of other ontological subjects, i.e. its associated set of ontological subjects, in each layer, ordered by their association ranking. Once this layer is constructed and indexed, the procedure is repeated to find the OSs most related to each member of this layer. The index consists of several index frames that can uniquely identify each OS on the OSM. As an example, the indexing frame can be a multi-digit frame that can accommodate the desired or predefined maximum number of associations with an OS. For example, a three-digit hexadecimal number (a 12-bit frame) can uniquely identify up to 4096 OSs in connection with their upper-layer node. In one exemplary embodiment of the OSM shown in FIG. 6, the indexing is done as follows: the number of index frames shows the layer that the OS is in; the values of the index frames, excluding the last one, point to the OS in the layer above with which it is associated; and the value of the last index frame indicates its association rank with that OS node above it. More indexing frames can be added or defined for other purposes.
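The frame-based indexing described above can be illustrated with a short Python sketch. It assumes, purely for illustration, that an index is stored as a string with three hexadecimal digits per frame and that frame values run from 0 to 4095; the function names (`encode_index`, `decode_index`, `layer_of`, `parent_of`, `rank_of`) are hypothetical and do not come from the disclosure.

```python
from typing import List

FRAME_HEX_DIGITS = 3                      # 3 hex digits = one 12-bit frame
FRAME_CAPACITY = 16 ** FRAME_HEX_DIGITS   # 4096 distinct values per frame


def encode_index(frames: List[int]) -> str:
    """Concatenate frame values into one index string, 3 hex digits each."""
    if not frames:
        raise ValueError("an index needs at least one frame")
    for value in frames:
        if not 0 <= value < FRAME_CAPACITY:
            raise ValueError(f"frame value {value} does not fit in 12 bits")
    return "".join(f"{value:0{FRAME_HEX_DIGITS}x}" for value in frames)


def decode_index(index: str) -> List[int]:
    """Split an index string back into its frame values."""
    if not index or len(index) % FRAME_HEX_DIGITS != 0:
        raise ValueError("index length must be a positive multiple of 3")
    return [
        int(index[i:i + FRAME_HEX_DIGITS], 16)
        for i in range(0, len(index), FRAME_HEX_DIGITS)
    ]


def layer_of(index: str) -> int:
    """The number of frames gives the layer of the OS."""
    return len(index) // FRAME_HEX_DIGITS


def parent_of(index: str) -> str:
    """All frames but the last point to the node above ('' stands for OS0)."""
    return index[:-FRAME_HEX_DIGITS]


def rank_of(index: str) -> int:
    """The last frame holds the association rank under the parent node."""
    return int(index[-FRAME_HEX_DIGITS:], 16)
```

For example, under these assumptions `encode_index([1, 10, 3])` yields the string `00100a003`, which denotes a layer-3 node whose parent is `00100a` and whose association rank under that parent is 3.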

In FIG. 6, however, for ease of depiction only the value of each frame is shown. Accordingly, for example, OS1 . . . OSM belong to layer “1” (one), and OSxyz represents an OS in layer “3” (because it has 3 indices) which is the zth highest associate of OSxy in layer “2”. OS0 is not counted as a layer, and while it can basically be any Ontological Subject (as the starting point), we consider it to be “the whole information that there is in the internet”; layer 1 in FIG. 6 therefore consists of the most popular Ontological Subjects (OSs) on the Internet. Although not necessary, in searching for the OSs of layer 1 we may, exceptionally, want to exclude proper names in order to find the most substantiated OSs for layer 1 (one).

Referring to FIG. 6 again, each node is therefore connected only to the OS node above it and to a number of other nodes below it. In each layer there are two types of nodes, namely Dormant and Non-Dormant (growing). A node in a layer is dormant if the corresponding OS is already growing in an upper layer or in the same layer. According to one exemplary embodiment, if an OS is found to be associated with more than one upper node and it is not growing in an upper layer, then it will become Non-Dormant only under the single node, immediately above it, with which it has the highest-ranking association. In this manner each ontological subject grows only once in the whole index. Therefore each non-dormant node is connected to one node above it and to a number of nodes below it, while each dormant node is connected only to the node immediately above it. Dormant nodes are also tagged with information that points to their open position (growing place) in the database. Moreover, if the desired number of associated OSs is not found for a node, extra nodes are added and marked as unknown. The desired number of associated OSs for each node can be selected based on predefined criteria. For instance, one criterion might be to ensure that a certain number of growing nodes exist under each OS. For simplicity, however, we may choose a constant number of associations for each node and require a certain minimum portion of them to be non-dormant. Also, in practice one may choose or define other indexing formats and methods as long as the OSs and their association information are uniquely indexed in the database.
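The dormant and non-dormant rule can be made concrete with a small layer-by-layer sketch. The Python fragment below is only one possible reading of the exemplary embodiment: the association table `assoc` (each OS mapped to a list of associate and score pairs), the function name `build_osm`, and the tuple-based index paths are all assumptions introduced for illustration, and the padding of short association lists with unknown nodes is omitted.

```python
from typing import Dict, List, Tuple

Assoc = Dict[str, List[Tuple[str, float]]]
Path = Tuple[int, ...]


def build_osm(assoc: Assoc, root: str, depth: int):
    """Grow an OS map layer by layer.

    Returns (nodes, growing): nodes is a list of (path, name, is_dormant),
    and growing maps each OS name to the single path where it grows.
    """
    growing: Dict[str, Path] = {root: ()}
    nodes: List[Tuple[Path, str, bool]] = []
    frontier: List[Tuple[str, Path]] = [(root, ())]

    for _ in range(depth):
        # Collect every associate of every growing node in the layer above.
        candidates: List[Tuple[Path, str, float]] = []
        for parent, parent_path in frontier:
            ranked = sorted(assoc.get(parent, []), key=lambda a: -a[1])
            for rank, (name, score) in enumerate(ranked, start=1):
                candidates.append((parent_path + (rank,), name, score))

        # An OS not yet growing above becomes growing once in this layer,
        # under the parent with which it has the strongest association.
        best: Dict[str, Tuple[Path, float]] = {}
        for path, name, score in candidates:
            if name in growing:
                continue
            if name not in best or score > best[name][1]:
                best[name] = (path, score)

        frontier = []
        for path, name, _score in candidates:
            is_growing = name in best and best[name][0] == path
            nodes.append((path, name, not is_growing))
            if is_growing:
                frontier.append((name, path))
        for name, (path, _score) in best.items():
            growing[name] = path

    return nodes, growing
```

In this sketch the `growing` mapping plays the role of the tag on a dormant node: looking up a dormant node's name in it gives the position where that OS is growing.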

Referring now to FIG. 8a, we may represent an OS with a discrete, spectrum-like function whose horizontal axis is the associated OSs and whose vertical axis is the ranked or weighted value of each associate. In this way an Association Value (AV) function is defined and stored in the database for each OS for later use. In FIG. 8a, the AV function is depicted versus the constituent OSs in its lower layer as indexed in FIG. 6, which are numbered starting from the strongest association and decline towards the higher-numbered indices. In FIG. 8b, however, the AV function is depicted versus the constituent OSs of the whole OS association database (universal OS map). That is, in FIG. 8b the horizontal axis covers all the existing OSs and is universal. The association value (AV) function can be considered a signature spectrum of an OS. Using signal-processing techniques such as cross-correlation, autocorrelation, the Fourier transform, or the Discrete Fourier Transform (DFT), one is able to extract information and find hidden relationships between OSs. For instance, using the concept of power spectral density one may define and measure the power or energy of an OS as a sign of its importance, or for approximate reasoning applications, comparison, or the like. For instance, in FIG. 8a or 8b we can define an energy function (i.e. the integral over the power spectral density) for the OS, and in selecting the desired number of constituent OSs we may choose enough constituent OSs that they account for 98% of the total energy of the OS node.
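The 98% energy criterion can be sketched as follows. This fragment assumes, for illustration only, that the energy of a discrete AV function is the sum of the squared association values; the function name `constituents_for_energy` and the sample values are hypothetical.

```python
from typing import Sequence


def constituents_for_energy(av: Sequence[float], fraction: float = 0.98) -> int:
    """Return the smallest number of strongest associates whose summed
    squared values reach the requested fraction of the total energy."""
    ranked = sorted(av, reverse=True)           # strongest association first
    total = sum(value * value for value in ranked)
    if total == 0:
        return 0
    accumulated = 0.0
    for count, value in enumerate(ranked, start=1):
        accumulated += value * value
        if accumulated >= fraction * total:
            return count
    return len(ranked)


# Made-up association values for one OS node.
needed = constituents_for_energy([10, 5, 1, 1])
```

With these sample values the squared amplitudes are 100, 25, 1, and 1, so the two strongest associates already carry 125 of the 127 units of energy, which exceeds 98% of the total.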

Concurrently with or after the indexing of OS associations, another software agent will look for the kind of association between each OS and its associates by searching through databases such as WordNet, FrameNet, the whole Internet, or any such database in which a relation between an OS and its components is expressed in natural language. The agent will look for patterns of explicitly expressed statements, such as subject-verb-object (SVO) sentences, to establish the kind of relationship between each pair of OSs. The agent may also use natural language processing (NLP) methods and algorithms, such as text simplification, to find such an association pattern. However, since there is a vast amount of textual data available, chances are that the agent will be able to find the explicitly expressed and verified statements, composed by humans, that it is looking for. The verification of relations is done by statistical analysis of the database: the diversity of sources and the number of times that a statement expressing a relation between two OSs is repeated lead to the verification of that statement. These statements, expressing a relationship between an OS and its components, are also stored and indexed for further reference.
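A minimal sketch of such statistical verification is given below. It assumes that relation statements have already been extracted as subject, verb, object triples together with the source in which each was found; the thresholds `min_count` and `min_sources`, the function name `verified_relations`, and the sample observations are hypothetical choices made for illustration.

```python
from collections import defaultdict
from typing import Iterable, Set, Tuple

Observation = Tuple[str, str, str, str]   # (subject, verb, object, source)
Relation = Tuple[str, str, str]


def verified_relations(
    observations: Iterable[Observation],
    min_count: int = 3,
    min_sources: int = 2,
) -> Set[Relation]:
    """Keep relations that are repeated often enough and that come from
    a sufficiently diverse set of sources."""
    counts = defaultdict(int)
    sources = defaultdict(set)
    for subject, verb, obj, source in observations:
        relation = (subject, verb, obj)
        counts[relation] += 1
        sources[relation].add(source)
    return {
        relation
        for relation in counts
        if counts[relation] >= min_count
        and len(sources[relation]) >= min_sources
    }


# Made-up observations: the first relation is repeated across two sources,
# the second is repeated only within a single source.
sample = [
    ("sun", "emits", "uv", "site-a"),
    ("sun", "emits", "uv", "site-b"),
    ("sun", "emits", "uv", "site-b"),
    ("moon", "emits", "uv", "site-c"),
    ("moon", "emits", "uv", "site-c"),
    ("moon", "emits", "uv", "site-c"),
]
accepted = verified_relations(sample)
```

Requiring both a repetition count and a number of distinct sources prevents a statement that is repeated many times by a single source from being treated as verified.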

This database is then used to assess textual documents or any electronic content, such as audio, video, pictures, graphs, curves, and the like, whose information can be converted to a textual format. The system first extracts the ontological subjects of a document and forms an OS spectrum for the document, using predetermined rules for the weighting coefficients. For example, a coefficient is assigned to each OS depending on its position in the text and the number of times it occurs. Also, for instance, one may partition a document into a desired number of parts, such as chapters, pages, paragraphs, or sentences, and apply the process of OS mapping to these smaller collections of content in the same way that an OSM is made from a larger collection of contents, i.e. by finding the associations and co-occurrence counts of each pair of OSs.
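As a hedged illustration of forming a document OS spectrum, the sketch below counts occurrences of known single-word OSs and gives extra weight to occurrences in the first sentence. The weighting rule, the `lead_weight` parameter, the function name `document_spectrum`, and the restriction to single-word OSs are assumptions made for the example; the disclosure leaves the weighting rules open.

```python
import re
from collections import Counter
from typing import Dict, Iterable


def document_spectrum(
    text: str, known_os: Iterable[str], lead_weight: float = 2.0
) -> Dict[str, float]:
    """Weight each known OS by its occurrence count, counting occurrences
    in the first sentence lead_weight times instead of once."""
    vocabulary = {term.lower() for term in known_os}
    spectrum: Counter = Counter()
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    for position, sentence in enumerate(sentences):
        weight = lead_weight if position == 0 else 1.0
        for token in re.findall(r"\w+", sentence.lower()):
            if token in vocabulary:
                spectrum[token] += weight
    return dict(spectrum)


# Made-up document and OS list.
sample_text = "Cancer risk rises with age. Sun exposure affects cancer. Age matters."
spectrum = document_spectrum(sample_text, ["cancer", "age", "sun"])
```

The same function can be applied to each chapter, page, or paragraph separately when a document is partitioned into smaller parts.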

FIG. 8b shows that an AV function of an OS may equally represent the OS spectrum of a document. In this case it is called a Document Association Value (DAV) function, as shown in FIG. 8b. In one simple aspect of the invention, the system can select an OS as the principal OS of the document and compare the document spectrum with the spectrum stored in the database (OSM) for further analysis. The analysis includes, but is not limited to, the discovery of new ontological subjects and the discovery and verification of new associations between OSs. For instance, one can subtract the document spectrum of the principal OS (made from the document) from the universal spectrum of the same OS, and then observe peaks and valleys that might correspond to new relations or new nodes, or that might fill one of the unknown nodes in the universal OSM. Other sophisticated analyses can also be applied without departing from the spirit of this disclosure.
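The subtraction of a document spectrum from the universal spectrum can be sketched as below. The sketch assumes that both spectra are first normalized so that their values sum to one, which is an illustrative choice rather than something stated in the disclosure; the function names and the sample spectra are likewise hypothetical.

```python
from typing import Dict, List


def normalize(spectrum: Dict[str, float]) -> Dict[str, float]:
    """Scale a spectrum so that its values sum to one."""
    total = sum(spectrum.values())
    if total == 0:
        return {name: 0.0 for name in spectrum}
    return {name: value / total for name, value in spectrum.items()}


def spectrum_difference(
    document: Dict[str, float], universal: Dict[str, float]
) -> Dict[str, float]:
    """Universal spectrum minus document spectrum, over all OSs in either.

    Strongly negative entries mark OSs that the document emphasizes more
    than the stored spectrum does; positive entries mark the opposite.
    """
    doc, uni = normalize(document), normalize(universal)
    names = sorted(doc.keys() | uni.keys())
    return {name: uni.get(name, 0.0) - doc.get(name, 0.0) for name in names}


def candidate_new_associates(
    document: Dict[str, float], universal: Dict[str, float]
) -> List[str]:
    """OSs present in the document spectrum but absent from the stored one."""
    return sorted(
        name for name, value in document.items()
        if value > 0 and name not in universal
    )


# Made-up spectra of one principal OS.
difference = spectrum_difference({"sun": 2, "age": 2}, {"sun": 3, "uv": 1})
new_names = candidate_new_associates({"sun": 2, "age": 2}, {"sun": 3, "uv": 1})
```

In this example the OS "age" appears only in the document, so it shows up as a valley in the difference and as a candidate for a new association or for filling an unknown node.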

FIG. 9 shows that the system may also expand the spectrum of each OS or each document into its constituent OS components and form a more expanded OS spectrum for the document. In this way we can form an almost distinctive OS spectrum for each document. The expansion might be done several times, depending on the need and objective of the analysis. The document spectrum bears important information about the value of the text composition, its novelty, and its main points. Peaks and valleys may be used to analyze the content in terms of its novelty and as an indication of possible new knowledge. For instance, from the document spectrum we may select the highest-amplitude OS as the main or principal subject of the text, then take the next several highest-amplitude OSs and form an abbreviated or abstracted spectrum of the text. This abstracted spectrum is then compared with the spectrum of the main OS already stored in the database. If there is a strong correlation between the abbreviated spectrum of the text and the principal OS spectrum in the database, chances are that the content of the text does not bear much new information. For further checking, however, one may also look at the kinds of statements and frames that have been used in the text to connect the components of the document spectrum to the main OS, or to each other, and compare them with the existing database of known relations between these OSs.
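The comparison of an abbreviated document spectrum with a stored spectrum can be sketched as follows. Cosine similarity is used here as one simple stand-in for the correlation measure, which the disclosure does not fix; the function names and the sample spectrum are hypothetical.

```python
import math
from typing import Dict, Tuple


def principal_and_abbreviated(
    spectrum: Dict[str, float], n: int
) -> Tuple[str, Dict[str, float]]:
    """Pick the highest-amplitude OS as the principal subject and keep the
    next n highest-amplitude OSs as the abbreviated spectrum."""
    if not spectrum:
        raise ValueError("the spectrum is empty")
    ranked = sorted(spectrum.items(), key=lambda item: (-item[1], item[0]))
    return ranked[0][0], dict(ranked[1:n + 1])


def cosine_similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine of the angle between two sparse spectra (0.0 if either is empty)."""
    dot = sum(a[name] * b[name] for name in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up document spectrum.
principal, abbreviated = principal_and_abbreviated(
    {"cancer": 9, "sun": 4, "age": 3, "diet": 1}, n=2
)
```

A similarity close to one between the abbreviated spectrum and the stored spectrum of the principal OS would suggest, under this reading, that the text mostly restates known associations, while a low value would call for the closer inspection of statements described above.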

FIG. 10 shows how the knowledge database of OS associations and relational statements can be used to evaluate the merits of a content being submitted to the system, as the initial evaluation shown in FIG. 5. The submitted content is simplified by natural language processing (NLP) techniques and algorithms in order to extract its Ontological Subjects, along with the statements of fact about the OSs and the associations of the OSs in the document as stated by the creator of the content.

The resultant OS spectrum of the document, and the corresponding associating relationships between the OSs of the document, are compared with the internal knowledge database of the system as shown in FIGS. 6-9, and are also checked against knowledge databases outside the system, e.g. the Internet, for further assessment. Overall, the system assigns a score of merit based on the verified statements, the novel statements, the novelty of the content in comparison with the indexed OSs and their stored relationships in the system and with what is already known in the outside KDB, and the affiliation and rank of the creator/s. If the score is above a predefined threshold, which depends on the system's internal criteria, the system then considers the content for review by authorities as explained earlier.

Generally, there are further ways known in the art of spectral and signal analysis to evaluate the correctness and novelty of a text using the mentioned OS spectrum. When there are distinguishable peaks in the document spectrum for which the system has no record of verified relations, the system marks them as novel and worthy of investigation, and it can compose a series of questions or suggestions to explain their relationship. It may also zoom in on lower-amplitude OSs and question or suggest a relationship between a high-amplitude OS and a lower-amplitude one. All this information is available both to the editors of each shop and to the creator of the content. The system or the editor of each shop can present such unknowns to the public and solicit contributions towards a solution.

FIG. 11 shows another representation of an OS, expanded one or more times into its constituent OSs, in which the existence of other OSs and of novel and unknown relationships has been detected. Each OS association in FIG. 11 that is unknown to the system can be considered a topic of discussion, or possibly worthy of having a shop of its own. The existence of possible novel relationships can also guide the editors or administrators, as well as the users or creator/s of the content, to places for further focus and closer investigation.

For instance, assume that in FIG. 11 the main OS and topic is “skin cancer”, and that the system, by spectral expansion and analysis, has detected or been led to the existence of some unknown OSs that are possibly associated with known OSs such as health, aging, physical exercise, genome, parents, the age of the earth, the age of the sun, eating, children, etc. The system then poses questions such as: what is the relation between age and skin cancer, what is the relation between the age of the sun and skin cancer, what is the relation between the number of children and skin cancer, and so on. Once these questions are answered and verified by the process explained in FIGS. 6 and 10, more nodes will be added to the OS database and the association database, and there will then be more questions to ask. The process thus leads to finding verified answers and statements that establish new OSs and their association information in the Knowledge Database (KDB).

FIG. 12 shows one flow diagram of a software agent which proposes the existence of new OSs and topics for further research, validates the proposed associative statements of the input content with regard to a subject matter or OS, and updates the knowledge database of the system. The software agent in FIG. 12 further saves the information about the creator of each such novel association, or the implicit or explicit discoverer of new OSs.

Over time, new nodes and associations will show their importance by leading to the growth of the newly discovered node or of other nodes, and by yielding verified associations that are valuable to other contributors or of interest to commercial entities and ventures.

The system is then able to rank the importance of a contribution over time, universally or in each domain, based on an algorithm that quantifies the intrinsic value of the newly found associations or nodes. For instance, the value of a contribution over time can be evaluated by a software agent that shows how many other contributions have been built upon one's original contribution following its submission.
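One simple way to count how many contributions have been built upon an original contribution is to traverse a record of which contribution builds on which. The Python sketch below assumes such a record is available as a mapping from each contribution to the earlier contributions it builds on; the mapping format, the function name `downstream_count`, and the sample identifiers are hypothetical.

```python
from collections import defaultdict, deque
from typing import Dict, List


def downstream_count(builds_on: Dict[str, List[str]], contribution: str) -> int:
    """Count the contributions that build, directly or indirectly, on the
    given contribution."""
    # Invert the record: earlier contribution -> later ones that build on it.
    followers = defaultdict(set)
    for later, earlier_list in builds_on.items():
        for earlier in earlier_list:
            followers[earlier].add(later)

    seen = set()
    queue = deque([contribution])
    while queue:
        current = queue.popleft()
        for follower in followers.get(current, ()):
            if follower != contribution and follower not in seen:
                seen.add(follower)
                queue.append(follower)
    return len(seen)


# Made-up record: B builds on A, C builds on B, D builds on A and C.
record = {"B": ["A"], "C": ["B"], "D": ["A", "C"]}
impact_of_a = downstream_count(record, "A")
```

Counting indirect followers as well as direct ones is a design choice of this sketch; a system could equally weight direct followers more heavily or restrict the count to a given domain.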

It should also be emphasized that each OS is placed in the map uniquely and in the universal context, not in a domain-specific context. Therefore, one of the applications and advantages of the system is to show users and creators the route to navigate, and to give them direction, in their exploration of a subject matter through the OSM. In other words, it can be used as a searching tool to quickly get a hint as to which are the most important subjects or issues related to their subject matter of interest. In most cases users do not know what they do not know in relation to a subject matter. Moreover, they might not be informed enough to recognize the importance of other subjects to their subject matter of interest. Therefore, by using the OSM, users can prioritize their efforts and get direction to navigate their exploration in search of useful new knowledge.

It should also be noted that the system can be, and preferably is, realized in a distributed manner and need not operate in a single physical location. Basically, each part of the system can be placed anywhere in the world and connected to the others by communication means, while still yielding the same function and providing the desired service to its users.

The system can sustain its service by several methods of generating revenue and profit. Paid subscriptions to one or a number of shops, sales of copies of contents, advertisement, and all the known sources of electronic commerce revenue may generate revenue for each shop and for the system. Moreover, the system can be mandated by an entity to make an effort to find a solution to a challenging problem that is important for that entity. The system then splits the proceeds among all the contributing parties according to a predefined contract.

Additionally, fresh and timely contributions can be sold online to other researchers interested in that research content to keep them up to date. There could be enough interest from peer researchers in obtaining the results. The price of a content download can be decreased over time in a certain fashion, and of course the contributor/s can get a reward and share the profit from the sale of their contribution. Revenue can also be generated from targeted advertising fees. Since the shops become specialized, the advertisements in each shop are more relevant to the readers of that publishing/broadcasting shop, and the revenue from targeted ads in each shop will be shared by the owner of the shop and the publishing host. Each shop can arrange its own real or virtual face-to-face meetings, organize conferences, or hold gatherings and other events.

The commercial success of the system is mostly based on the substance of the contents published or broadcast and the value of its service to the users. Therefore the system, in one aspect of this invention, will share its success with its contributors. Over time, depending on the success of a content in terms of its popularity and importance, a creator accumulates credit points, and at some point the creator can claim these credits in some form of monetary compensation, rewards, prizes, profit sharing, ownership, or the like as allowed by law. There is provided a method to quantify the importance of one's contribution to the art. For instance, the more a submitted content generates further ontological subjects and grows its node, the higher the rank of importance and contribution of the content will be. Ranking algorithms for linked databases, such as PageRank, can also be applied to evaluate the importance and impact of the content over time.

Considering that each shop's title is also a node in the Ontological Subjects database, it is also possible to evaluate the overall rank and importance of the shops in a similar fashion. The success of a shop is measured by its popularity, the importance and impact of its subject, and the revenue that the shop or its owner has generated. The system allows shop owners, with or without the help of the system, to generate income by, for example, displaying other entities' advertisements or banners, or by any other means appropriate and accepted by law. The system in turn benefits from such income based on the predefined agreements with each shop owner.

The system can have its own rules or protocols to ensure its profitability and its competitiveness. For instance, while many of the shops are identified, set up, and established by the system, an opportunity is provided for qualified users to establish their own shops through the system as well. There could of course be parallel and competing online shops, but their owners will be given a certain time to generate enough interest and online traffic, by writing content or soliciting content from others, to keep their shops open. If they fail to do so, the system can stop giving them service or move them to a lower-rank or lower-tier shop. It is possible for the administrator of a demoted shop to upgrade to a higher tier over time by producing higher-quality contents. Contributors who have made significant and valuable contributions may have privileges and advantages in establishing their own shops.

In conclusion, the disclosed system and method will help to accelerate the rate of knowledge discovery for everyone's benefit by providing subject matters of intrinsic value for exploration, tools for knowledge navigation and content evaluation, and rapid circulation and communication, and by providing incentives for all the contributors, such as content creators, editors, shop owners, and administrators. The system and method can thus help to improve the quality of life and increase economic growth and prosperity.

It is understood that the preferred or exemplary embodiments and examples described herein are given to illustrate the principles of the invention and should not be construed as limiting its scope. Various modifications to the specific embodiments could be introduced by those skilled in the art without departing from the scope and spirit of the invention as set forth in the following claims.

Claims

1. A method of knowledge discovery and publication comprising:

a. identifying at least one subject matter for exploration;
b. designating at least one online shop, for publishing and broadcasting electronic contents, corresponding to at least one of said at least one subject matter;
c. finding at least one person having expertise in one of said at least one subject matter;
d. receiving at least one electronic content from at least one creator, related to at least one of said at least one subject matter;
e. providing a first computer implemented routine, having access to human knowledge repositories, configured to screen all the received at least one electronic content, and initially assess the merits of the at least one electronic content for being reviewed by at least one of a second computer implemented routine and at least one of said at least one expert, for inclusion in at least one of said at least one online shop;
f. providing incentive to at least some of all contributors, including the at least one creator and said at least one person expert for administering the at least one online shop and reviewing said at least one electronic content, according to a predefined arrangement; and
g. publishing the reviewed at least one electronic content, in the designated at least one online shop, corresponding to the at least one subject matter of said at least one electronic content, for viewing by other creators and users, whereby to accelerate the rate of knowledge discovery and substantially increase the amount of credible and useful knowledge.

2. The method of claim 1, further comprising a searching software agent for identifying the subject matters for exploration and related titles by searching through predetermined corpuses, thereby being able to provide a large number of the subject matters for exploration and the related titles for the corresponding online publishing shops.

3. The method of claim 1, further comprising a searching software agent for identifying names and contact information of experts and authorities having expertise and credentials related to the subject matters, thereby identifying, for each subject matter, a number of experts and authorities, for acting as one or more role of a reviewer, editor, administrator, and shop owner of one or more publication shop having the subject matter for exploration related to their expertise.

4. The method of claim 1, further comprising selling at least one of the publishing shops with predetermined conditions, each said shop having the title related to at least one of the subject matter for exploration, to at least one person interested in owning and operating at least one of the publishing shops, wherein each said shop has the right to use the first computer implemented routine for screening and assessing the electronic content having a subject matter related to that shop.

5. A computer implemented method of indexing ontological subjects comprising:

a. providing a plurality of ontological subjects;
b. providing access to at least one collection of electronic content by way of electronic communication;
c. evaluating numerically an association value for a selected number of pairs of ontological subjects, using data information of co-occurrence of each of said pairs in said at least one collection of electronic content;
d. selecting a first set of ontological subjects, having at least one member, and selecting first associated sets of ontological subjects, each of said first associated sets having at least one member, wherein each of the first associated sets corresponds to a member of the first set, and wherein each member of each of said first associated sets has an association value greater than a predetermined threshold with at least one member of the first set;
e. indexing the members of the first set and the members of said first associated sets in a multilayer index with indices, by an indexing method comprising:
i. indexing the members of the first set, wherein indices are configured to show places of the members in the first layer of said multilayer index;
ii. indexing the members of the first associated sets in the next layer of the index, wherein indices are configured to show places of the members in that layer and to indicate the association of a member of the first set to the members of the corresponding first associated set;
iii. marking an ontological subject in its current layer of the index as dormant if said ontological subject has been indexed in a previous layer, and, if an ontological subject is not indexed in a previous layer but is a member of more than one of the first associated sets, marking said ontological subject as dormant everywhere that it appears in its current layer of the index except in one place in its current layer of the index;
f. selecting a second set of ontological subjects, having at least one member, from the ontological subjects of the first associated sets which are not marked as dormant, i.e. non-dormant, in the index, and selecting second associated sets, each of said second associated sets having at least one member, wherein each of the second associated sets corresponds to a member of the second set, and wherein each member of each of said second associated sets has an association value greater than a predetermined threshold with at least one member of the second set;
g. indexing the members of the second associated sets in the next layer of the index, wherein indices are configured to show places of the members in their current layer and to indicate the association of a member of the second set to the members of the corresponding second associated set, and marking an ontological subject in its current layer of the index as dormant if said ontological subject has been indexed in one of the previous layers, and, if the ontological subject is not indexed in one of the previous layers but is a member of more than one of the second associated sets, marking said ontological subject as dormant everywhere that it appears in its current layer of the index except in one place in its current layer of the index;
h. repeating step f to form a next set of ontological subjects from the not marked as dormant, i.e. non-dormant, members of the last associated sets of ontological subjects and to form next associated sets, and indexing the ontological subjects of said next associated sets by repeating step g, wherein the term "the second" is replaced with "the third", "the fourth", and so on, whereby to uniquely index the desired number of ontological subjects in the context of said at least one collection of electronic content, and wherein each ontological subject is positioned once in the index as non-dormant; and
i. storing the index of the ontological subject on a storage device.
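
For illustration only, the layered indexing with dormant marking recited in steps d through h might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the names `Entry` and `build_index`, the dictionary of pairwise association values, and the rule that the single non-dormant place goes to the parent with the strongest association are all hypothetical choices made for the example. Storing the index (step i) is omitted.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional, Sequence


@dataclass
class Entry:
    subject: str
    layer: int
    position: int           # place of the entry within its layer
    parent: Optional[str]   # member of the previous set it is associated with
    dormant: bool


def build_index(subjects: Sequence[str],
                assoc: Dict[FrozenSet[str], float],
                first_set: Sequence[str],
                threshold: float,
                max_layers: int = 5) -> List[Entry]:
    """Hypothetical helper: build a multilayer index with dormant marking."""

    def value(a: str, b: str) -> float:
        # Association values are assumed symmetric; missing pairs count as 0.
        return assoc.get(frozenset((a, b)), 0.0)

    # First layer: the members of the first set, all non-dormant.
    index = [Entry(s, 0, i, None, False) for i, s in enumerate(first_set)]
    placed = set(first_set)   # subjects that already have a non-dormant place
    current = list(first_set)

    for layer in range(1, max_layers):
        # Associated set of each current member: values above the threshold.
        candidates = [(s, parent)
                      for parent in current
                      for s in subjects
                      if s != parent and value(parent, s) > threshold]
        # Assumption: a subject not indexed before keeps exactly one
        # non-dormant place, under the parent with the strongest association.
        best: Dict[str, str] = {}
        for s, parent in candidates:
            if s in placed:
                continue
            if s not in best or value(parent, s) > value(best[s], s):
                best[s] = parent
        for pos, (s, parent) in enumerate(candidates):
            index.append(Entry(s, layer, pos, parent, best.get(s) != parent))
        placed.update(best)
        current = list(best)  # next set: the non-dormant members of this layer
        if not current:
            break
    return index


# Made-up association values for five illustrative subjects.
subjects = ["physics", "energy", "mass", "light", "optics"]
assoc = {
    frozenset(("physics", "energy")): 0.9,
    frozenset(("physics", "mass")): 0.8,
    frozenset(("energy", "mass")): 0.7,
    frozenset(("energy", "light")): 0.6,
    frozenset(("mass", "light")): 0.65,
    frozenset(("light", "optics")): 0.9,
}
index = build_index(subjects, assoc, first_set=["physics"], threshold=0.5)
non_dormant = [e for e in index if not e.dormant]
```

In this toy data every subject ends up with exactly one non-dormant entry, and every other appearance of it is kept in the index but marked dormant.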

6. The method of claim 5, wherein said one place, for indexing the non-dormant ontological subject, is chosen in its current layer such that the non-dormant ontological subject has the highest association value with its corresponding member of the previous layer's set of ontological subjects.

7. The method of claim 5, wherein the index provides necessary information in indexing so that the marked dormant ontological subjects can point to the place in the index that said marked ontological subject is a non-dormant ontological subject.

8. The method of claim 5, wherein at least one of the associated sets of ontological subject contains at least one non-dormant member.

9. The method of claim 5, wherein number of members of each of said associated sets is a predetermined number.

10. The method of claim 5, wherein the plurality of ontological subject contains all existing ontological subjects of the universe to that time.

11. The method of claim 5, wherein the at least one collection of electronic content includes collection of all electronic contents in the Internet.

12. The method of claim 5, wherein the at least one collection of electronic content includes contents of an Internet search engine database.

13. The method of claim 5, wherein said association value of the pair of ontological subject is evaluated using the count information by querying at least one Internet search engine and getting the co-occurrence count of the pair from the search engine.
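
As a hedged illustration of deriving an association value from search engine counts, the sketch below normalizes the co-occurrence count by the number of results that mention either subject. The normalization is an assumption chosen for the example, since the claim does not fix a formula. `hit_count` is a hypothetical callable standing in for a query to a search engine, and no real search API is implied; here it is backed by a small dictionary of made-up counts.

```python
from typing import Callable, Dict


def association_value(a: str, b: str,
                      hit_count: Callable[[str], int]) -> float:
    """Hypothetical helper: co-occurrence count normalized to the range 0..1."""
    n_a = hit_count(a)
    n_b = hit_count(b)
    n_ab = hit_count(f'"{a}" "{b}"')   # results that mention both subjects
    either = n_a + n_b - n_ab          # results that mention at least one
    return n_ab / either if either > 0 else 0.0


# Stand-in for a search engine: made-up result counts per query string.
fake_counts: Dict[str, int] = {
    "energy": 1000,
    "mass": 800,
    '"energy" "mass"': 300,
}


def fake_hit_count(query: str) -> int:
    return fake_counts.get(query, 0)


value = association_value("energy", "mass", fake_hit_count)
unknown = association_value("energy", "optics", fake_hit_count)
```

A pair that never co-occurs in the collection gets an association value of zero, so it would fall below any positive threshold used in the indexing steps.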

14. The method of claim 5, wherein the ontological subject index is represented with a corresponding ontological subject map, wherein each indexed ontological subject is shown by a node in the map, configured to demonstrate if the corresponding ontological subject is marked dormant, and wherein each node corresponding to a not marked dormant, i.e. non dormant, ontological subject, is uniquely positioned in the ontological subject map, whereby providing a tool for visual navigation and focusing on a desired particular place of the map.

15. The method of claim 5, wherein the associated set of the ontological subject is represented by at least one form of spectral graph having ontological subjects in one axis and showing the association value of the members of the set in another axis.

16. The method of claim 5, wherein the ontological subject index is used for identifying at least one of: a related subject, the most important subjects related to another subject, and an indirect relation of two or more subjects, whereby to increase efficiency in searching and acquiring new knowledge.

17. The method of claim 5, wherein the ontological subject index is used to guide and show to a user routes for exploration in search of new knowledge, thereby assisting the user in knowledge discovery.

18. The method of claim 5, wherein said ontological subject index is updated periodically or continually.

19. The method of claim 5, further comprising:

a. finding explicit form of relations of association between each of selected pairs of ontological subjects by searching through the at least one collection of contents; and
b. recording and storing the said explicit forms of relations in a database configured for easy retrieval.

20. The method of claim 19, wherein said database of explicit form of relations of association is updated periodically or continually.

21. The method of claim 1, further comprising:

a. providing a reference ontological subject index built from a selected list of ontological subjects and a selected collection of content;
b. building an ontological subject index for the received electronic content, from the ontological subjects contained in the received content, wherein the index has at least one layer and has at least one ontological subject at its first layer;
c. comparing the ontological subject index of the received electronic content with the reference ontological subject index by matching the at least one first-layer ontological subject to the position of the same at least one ontological subject in the reference index;
d. scoring the merit of the received content by a predetermined formula, wherein at least one of the variables in the formula is the number of similar ontological subjects in the associated set of the at least one first-layer ontological subject; and
e. passing the content for reviewing if the score is higher than a predetermined criterion, whereby to screen a large number of received electronic contents automatically before reviewing.
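
A minimal sketch of the comparison and scoring in steps c through e is given below, for illustration only. It assumes each index is reduced to a mapping from a first-layer ontological subject to its associated set, and it uses the fraction of shared associated subjects as the score. That formula, the function names `merit_score` and `passes_screening`, and the threshold value are hypothetical; the claim only requires that the number of similar ontological subjects be one variable of a predetermined formula.

```python
from typing import Dict, Set


def merit_score(content_index: Dict[str, Set[str]],
                reference_index: Dict[str, Set[str]]) -> float:
    """Hypothetical helper: share of the content's associated subjects that
    also appear in the reference associated set of the same subject."""
    shared = 0
    total = 0
    for subject, associated in content_index.items():
        reference_set = reference_index.get(subject)
        if reference_set is None:
            continue  # subject has no matching position in the reference index
        shared += len(associated & reference_set)
        total += len(associated)
    return shared / total if total else 0.0


def passes_screening(score: float, criterion: float) -> bool:
    # Content is passed on for review only if its score exceeds the criterion.
    return score > criterion


# Made-up reference index and received-content index.
reference = {"energy": {"mass", "light", "work", "heat"}}
content = {"energy": {"mass", "light", "dark matter"}}

score = merit_score(content, reference)
accepted = passes_screening(score, criterion=0.5)
```

A fuller formula could weigh the unmatched subjects separately, for example as a signal of novelty, but that would be a further assumption beyond this sketch.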

22. The method of claim 5, wherein the ontological subject index is used as reference for scoring the merit of an electronic content in terms of validity, novelty and importance.

23. The method of claim 21 or 22, wherein an ontological subject map is also used for performing said method.

24. A system for knowledge discovery and publication of contents comprising:

a. a database of a plurality of subject matters for exploration;
b. a plurality of online publishing/broadcasting shop each having a title related to at least one of said subject matters;
c. at least one reference indexed database representing an ontological subject map that contains a desired number of ontological subjects related to said subject matters, wherein said ontological subject map indicates the association of each ontological subject to all other desired number of ontological subjects, ordered and spread in the map by a predefined method based on their association value to each other, and wherein said database of the map can be derived by having access to a collection of contents having at least one content;
d. means for receiving at least one electronic content from creators, assigning at least one possible publishing shop for said at least one electronic content, and building at least one ontological subject map using said received at least one electronic content;
e. at least one software program configured to automatically measure at least one merit of said received at least one electronic content, by comparing predefined parameters and functions of said at least one ontological subject map of said content with said at least one reference ontological subject map, and passing the content, if the at least one merit meets a predetermined criteria, for publication to said at least one of assigned shop; and
f. means for making said received at least one electronic content available for public access; whereby to publish a large amount of credible and useful knowledge in a short period of time.

25. The system of claim 24, further including: a database of persons' names and contact information, each person having expertise or authority in at least one of said subject matters.

26. The system of claim 25, further including: means for communicating with the authorities and the creators, thereby reviewing the received electronic content, by at least one authority from the said database of authorities, and assisting the creator by informing about the assessment and status of the received electronic content.

27. The system of claim 24, wherein each said online publishing shop has at least one editor, being one of a human or an intelligent software agent, configured to review an electronic content and to select some of the received contents for publishing according to predetermined criteria.

28. The system of claim 24, further including software means configured to rank the importance of subject matters and online publishing shops by several predetermined factors, at least one of the factors is selected from popularity, position in the reference ontological map, members of the associated set, and number of non-dormant members in the associated set, of the ontological subject corresponding to the subject matter.

29. The system of claim 24, wherein said at least one reference index being updated periodically or continually, and said system further including at least one database, also being updated periodically or continually, having indexed explicit form of relations of association between each of selected pairs of ontological subjects.

30. The system of claim 29, further including a software artifact configured to perform:

a. detecting and recognizing:
i. inclusion of at least one new ontological subject in said updated databases or inclusion of a new ontological subject in the associated set of an ontological subject;
ii. position changing or displacement of an ontological subject in said updated databases; and
b. ranking the importance of the said new ontological subject and re-ranking the importance of the displaced ontological subject.
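
For illustration, the detection in step a might be sketched as a comparison of two snapshots of the reference index. This is a sketch under assumptions: each snapshot is taken to map an ontological subject to its (layer, position) place, the name `diff_indexes` is hypothetical, and only new subjects and displaced subjects are covered, not changes inside an associated set. The ranking of step b is left out.

```python
from typing import Dict, List, Tuple

Place = Tuple[int, int]  # assumed representation: (layer, position)


def diff_indexes(old: Dict[str, Place],
                 new: Dict[str, Place]) -> Tuple[List[str], List[str]]:
    """Hypothetical helper: return (newly included, displaced) subjects."""
    added = sorted(s for s in new if s not in old)
    displaced = sorted(s for s in new if s in old and new[s] != old[s])
    return added, displaced


# Made-up snapshots of a reference index before and after an update.
old_snapshot = {"physics": (0, 0), "energy": (1, 0), "mass": (1, 1)}
new_snapshot = {"physics": (0, 0), "energy": (1, 0), "mass": (2, 0),
                "dark energy": (1, 1)}

added, displaced = diff_indexes(old_snapshot, new_snapshot)
```

The two returned lists are the inputs that a ranking or notification step would then work from.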

31. The system of claim 30, wherein said at least one new ontological subject, having an importance factor higher than a threshold, is included in the database of subject matters for exploration, and an online shop is designated for it based on predefined criteria.

32. The system of claim 30, further including a software artifact configured to inform at least some of the users and creators about the inclusion of at least one new ontological subject in the at least one reference index, or the displacement of an ontological subject in the reference ontological subject index, thereby notifying the users about an important potential discovery.

33. The system of claim 24, further including a platform for sale configured to sell the publishing shops with predetermined conditions, each said shop having the title related to at least one of the subject matter for exploration, to at least one person interested in owning and operating at least one of the publishing shops.

34. The system of claim 28, further including a platform for sale configured to sell the publishing shops with predetermined conditions, each said shop having the title related to at least one of the subject matter for exploration, to at least one person interested in owning and operating at least one of the publishing shops, wherein the price depends on the importance rank of the subject matter.

35. The system of claim 24, further including a compensation platform configured to reward and compensate at least some of the persons and entities who contributed to providing the contents and operating the system.

36. The system of claim 24 and in accordance with claims 26, 27, 33, 34, and 35, wherein said compensation platform rewards and compensates those who contributed, including, if applicable, at least some of the creators whose electronic content is published by at least one of the online publishing shops, at least some of the authorities who reviewed at least one electronic content, at least some of the administrators, at least some of said editors, and at least some of the owners of the online publishing shops.

37. The system of claim 24, wherein the system is distributed and at least one part of the system is physically located in, or operates from, a different location from the rest of the system.

Patent History
Publication number: 20090030897
Type: Application
Filed: Jul 24, 2008
Publication Date: Jan 29, 2009
Applicant: (Thornhill)
Inventor: Hamid Hatami-Hanza (Thornhill)
Application Number: 12/179,363
Classifications
Current U.S. Class: 707/5; Query Processing For The Retrieval Of Structured Data (epo) (707/E17.014)
International Classification: G06F 7/10 (20060101); G06F 17/30 (20060101);