Assisted Knowledge Discovery and Publication System and Method
A system and method are presented for knowledge discovery that incorporate both humans and computers to index, process, communicate, and share knowledge and electronic content. The system also provides a platform for launching an unlimited number of qualified, content-reviewed publishing/broadcasting ventures. It assists individuals in discovering or creating new and useful knowledge and valuable artistic content faster and more efficiently. It also provides incentives to the owners of the ventures and a method for rewarding or compensating all contributors.
The present application claims priority to Canadian patent application No. CA 2,595,541, filed on Jul. 6, 2007 entitled “Assisted knowledge discovery and publication system and method” by the same applicant.
FIELD OF INVENTION
This invention generally relates to knowledge discovery, content creation, and content sharing using people, computer systems, software program agents, and databases.
BACKGROUND OF THE INVENTION
The Internet has provided a long-awaited tool for connecting people around the world and letting them communicate. One of its most important applications and implications is its use in enhancing ideas and enabling rapid information exchange between people or groups of people with similar interests. This growing interest has created many applications and systems for group discussion and question answering, such as Yahoo ask, Wikipedia, search engines, photo and video sharing, numerous portals, discussion groups, and the like. These systems and applications have accelerated knowledge discovery, the creation of artistic content, the production of novel and useful inventions, and in general the advancement of our understanding of the universe around us.
However, since most of this shared knowledge and these contributions are arbitrarily qualified, it takes time for the general public to come to a robust and lasting understanding of a subject, or appreciation of a content. Therefore, the vast amount of data that is generated daily has to be filtered over a relatively long period of time by the collective wisdom of the public before it can be used. While in most subject matters of general public interest ordinary people may contribute to the subject and let the facts and the best solution be found over time, this unsupervised growth of public understanding lacks the rigor and credibility that are needed for a real advancement of public well-being. The rigor and credibility only come after a relatively long period of time. Most information available through the Internet needs further verification and research by the consumer, which can be time consuming and frustrating.
The process of peer-reviewed scientific publication, on the other hand, has the rigor and substance, and therefore the credibility, that is needed for true advancement of human knowledge. Nevertheless it is a very slow process and does not offer the speed and ease of accessibility that are necessary to tap into the vast potential of the general public's brain power and knowledge. Editors and reviewers of scientific journals do not have much incentive to serve unknown ordinary contributors. Moreover, they naturally do not have the resources or expertise to find and cover all the subject matters of importance and to assess and investigate all submitted contents.
Therefore there is a need in the art for a system that can, automatically or semi-automatically, assist both the publication/broadcasting administration and the contributors in screening and assessing all submitted contents in terms of their intrinsic value and substance before they are viewed or used by the public, without imposing the above-mentioned constraints. It is also desirable to have a system that can systematically guide users through their research to discover, innovate, create, and make valuable contributions. It is also advantageous to have a central system that allows all qualified experts to launch their own publication/broadcasting ventures with the least amount of investment and overhead for commercial gain, thereby accelerating the rate of knowledge discovery, knowledge distillation, and economic growth.
SUMMARY OF THE INVENTION
In this application a system and method are presented for knowledge sharing and discovery by analyzing the content of online repositories, building an association database of ontological subjects, and soliciting electronic contents in the form of text, audio, video, or any combination of them. The system and method can assist and guide users and creators, regardless of their level of knowledge, in making valuable contributions, while shortening the research and creation time significantly. The shared knowledge is peer reviewed by authorities in each subject so that its quality and substance are more reliable than the arbitrarily qualified contents presently available on the Internet.
The system comprises information processing units in the form of hardware and software that are connected to the Internet by communication means. The processing units can comprise electronic hardware such as CPUs (central processing units) and memories, and software in the form of specialized programs, algorithms, and intelligent agent programs, in any applicable computer language.
In building the system, software agents are used to find important subject matters or fields of interest. An agent looks up a list of subjects gathered from various sources such as lexicons, ontologies, dictionaries, and special dictionaries, searches the Internet, and ranks the importance of each subject by counting the number of documents containing that subject or by any other ranking method for concepts. At the same time the software agent looks for proper names, affiliations, and addresses that are associated with the subject and ranks them according to their level of authority. Alternatively the system finds the subjects of importance and interest and the associated experts by directly searching readily available databases in which the desired information can be found, such as university URLs, specialized professional associations, who's who listings, and all available online publication collections.
The system then assigns appropriate names or titles to such subject matters and makes a list of available subjects and titles as candidate names for publication/broadcasting shops that users can subscribe to and run. In the preferred embodiment, the system further provides one or more online publishing/broadcasting formats for each subject matter in the form of online journals, knowledge sharing groups, interactive conferences, broadcasting templates, and the like, each of which is called a publishing/broadcasting shop in this application. The system further contains a database of authoritative experts in each subject matter for consultation and reviewing.
Users who want to establish their own online publishing/broadcasting shop may then apply to subscribe to or buy an online publishing/broadcasting shop's title/s from among the topics and titles available. Alternatively the system accepts suggestions from interested users or subscribers to open a shop with their own suggested title or name. Interested users can include individuals, legal entities, groups of individuals, as well as computer agents. The system grants the privilege of establishing an online publication/broadcasting shop according to the system's predetermined standards. Once the application is approved and a title of a publishing/broadcasting shop is assigned to the user, the owner of the online shop can use the service of the system and start soliciting contents and providing the service to her/his group of people interested in that subject matter.
To assist the editors and contributors, reviewers, and users, the system has a distilled universal repository of human knowledge that is called Ontological Subject Map (OSM) in this application. The OSM is used to screen, evaluate, guide and assist, and measure the value of a submitted content, its novelty, and overall merit of a contribution. By consulting the OSM the system can pose useful questions and make intelligent suggestions and guides for further research or clarification.
The OSM is a layered, indexed repository of universal knowledge that is built by indexing all related existing concepts and subjects, nouns, proper nouns, compound nouns, named entities, or in general all such conceivable entities and concepts, which we call Ontological Subjects (OS) in this invention. The layered index or database is built by starting from one or a number of the most popular ontological subjects and searching the available databases to find all other ontological subjects associated with each of them, ordered by their association ranks (e.g. counts). Each ontological subject is then indexed with a desired number of other ontological subjects in each layer, ordered by their association ranking. Once this layer is constructed and indexed, we repeat the procedure to find the OSs most related to each member of this layer. Each OS may be represented by a node in an open, two-dimensional, tree-like graph. Each node can therefore only be connected to the OS node above it and to a number of other nodes below it. In each layer there are two types of nodes, namely Dormant and Non-Dormant (growing). A node in a layer is dormant if the corresponding OS is already growing in an upper layer or in the same layer. According to one exemplary embodiment, if an OS is found associated with more than one upper node, and it is not growing in an upper layer, then it becomes Non-Dormant only under the single node to which it has the highest-ranking association. In this manner each ontological subject grows only once in the whole index. Therefore each non-dormant node is connected to one node above it and to a number of nodes below it, while a dormant node is connected only to the node immediately above it. If the desired number of associated OSs is not found for a node, then we add extra nodes and mark them as unknown. The desired number of associated OSs for each node can be selected arbitrarily. However, for simplicity we may choose a constant number of associations for each node.
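The layer-by-layer construction described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation of the invention: the names `Node` and `build_osm`, the dictionary format of `associations` (subject mapped to a list of `(associated subject, count)` pairs), and the choice to treat "unknown" placeholder nodes as never expanded are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(eq=False)
class Node:
    subject: str
    layer: int
    parent: Optional["Node"] = None
    score: float = 0.0          # association rank with the parent node
    dormant: bool = False       # dormant nodes are never expanded
    children: List["Node"] = field(default_factory=list)


def build_osm(root: str,
              associations: Dict[str, List[Tuple[str, float]]],
              fanout: int = 2,
              depth: int = 2) -> Node:
    """Build a layered tree in which every subject grows at most once."""
    root_node = Node(root, 0)
    growing = {root}            # subjects that already grow somewhere
    frontier = [root_node]      # non-dormant nodes of the previous layer
    for layer in range(1, depth + 1):
        candidates: List[Node] = []
        for parent in frontier:
            ranked = sorted(associations.get(parent.subject, []),
                            key=lambda pair: -pair[1])[:fanout]
            for subject, score in ranked:
                child = Node(subject, layer, parent, score)
                parent.children.append(child)
                candidates.append(child)
            # Pad with "unknown" nodes when too few associates are found.
            for _ in range(fanout - len(ranked)):
                parent.children.append(
                    Node("unknown", layer, parent, dormant=True))
        # Among duplicates in this layer, only the highest-ranked one grows.
        best: Dict[str, Node] = {}
        for child in candidates:
            if child.subject in growing:
                child.dormant = True
            elif (child.subject not in best
                  or child.score > best[child.subject].score):
                best[child.subject] = child
        for child in candidates:
            if not child.dormant and best[child.subject] is not child:
                child.dormant = True
        growing.update(best)
        frontier = list(best.values())
    return root_node


# Toy association counts (made up for illustration only).
toy = {
    "science": [("physics", 90), ("biology", 80), ("art", 5)],
    "physics": [("energy", 70), ("science", 60)],
    "biology": [("energy", 50)],
}
osm = build_osm("science", toy, fanout=2, depth=2)
```

In this toy run "energy" appears under both "physics" and "biology" in the second layer; it grows only under "physics", where its count is higher, and is dormant under "biology".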
Furthermore, we may represent an OS with a discrete, spectrum-like function whose horizontal axis is the associated OSs and whose vertical axis is the value of each associate. In this way an Association Value (AV) function is defined and stored in the database for each OS for further use. The AV function can be considered a signature spectrum of an OS. Using signal-processing techniques such as cross-correlation, autocorrelation, the Fourier Transform (FT), and the Discrete Fourier Transform (DFT), one is then able to extract information and find hidden relationships between OSs. For instance, using the concept of power spectral density, one may define and measure the power of an OS as a sign of its importance, or for approximate reasoning applications, etc.
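As a rough illustration of treating an AV function as a signal, the sketch below stores each spectrum as a dictionary from associated OS to value and computes two simple quantities. Both definitions are assumptions chosen for illustration: `power` is the sum of squared association values (the signal energy, which by Parseval's theorem equals the summed squared DFT magnitudes up to normalization), and `correlation` is the normalized zero-lag cross-correlation of two spectra placed on a common axis.

```python
import math
from typing import Dict, List

Spectrum = Dict[str, float]  # associated OS -> association value


def spectrum_vector(av: Spectrum, axis: List[str]) -> List[float]:
    """Lay an AV function out on a shared, ordered axis of subjects."""
    return [float(av.get(subject, 0.0)) for subject in axis]


def power(av: Spectrum) -> float:
    """Sum of squared association values (one possible 'power' of an OS)."""
    return sum(value * value for value in av.values())


def correlation(av_a: Spectrum, av_b: Spectrum) -> float:
    """Normalized zero-lag cross-correlation between two AV spectra."""
    axis = sorted(set(av_a) | set(av_b))
    a = spectrum_vector(av_a, axis)
    b = spectrum_vector(av_b, axis)
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Made-up association values for two subjects.
physics = {"energy": 3.0, "mass": 4.0}
chemistry = {"energy": 3.0, "molecule": 4.0}
```

A correlation near 1 suggests that two subjects share most of their strongest associates, while a value near 0 suggests little overlap.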
At the same time as, or after, the indexing of OS associations is completed, another software agent looks for the kind of association between each OS and its associates by searching through databases such as WordNet, FrameNet, the whole Internet, or any database in which a relation between an OS and its components is expressed in natural language. The agent looks for patterns of explicitly expressed statements or semantic frames, as defined by the FrameNet project in Berkeley, to establish the kind of relationship between each pair of OSs. The agent may also use natural language processing (NLP) methods and algorithms, such as text simplification, to find such an association pattern. However, since there is a vast amount of data available, the chances are that the agent will be able to find the explicitly expressed and verified statement or frame, composed by humans, that it is looking for. The verification of relations is done by statistical analysis of the database. The diversity of sources and the number of times that a statement is repeated to express a relation between two OSs lead to the verification of that statement. These statements, or semantic frames, expressing a relationship between an OS and its components are also stored and indexed for further reference.
This database is then used to assess textual documents or any electronic content, such as audio, video, pictures, graphs, curves, etc., whose information has been transferred to textual format. The system first extracts the ontological subjects of a document and forms an OS spectrum, or associated set, for the document using predetermined weighting-coefficient rules. In one simple aspect of the invention, the system can then select an OS as the principal OS of the document and compare the document spectrum with the spectrum of that principal OS stored in the database, for further analysis. Alternatively one may partition a document into a number of parts and apply the process of OS mapping to this collection of smaller contents in the same way that an OSM is made from a larger collection of contents.
The analysis includes, but is not limited to, the discovery of new ontological subjects and the discovery and verification of new associations between OSs. Over time, new nodes and associations will show their importance by leading to the growth of the newly discovered node or of other nodes, and by yielding verified associations that are valuable to other contributors or are of commercial interest to commercial entities and ventures.
The system may also expand each OS into its constituent OS components and form a more expanded OS spectrum for the document. In this way we can form an almost distinguishable OS spectrum for each document. The document OS spectrum bears important information about the value of the text composition, its novelty, and its main points. Peaks and valleys may be used to analyze the content in terms of its novelty and as an indication of possible new knowledge. For instance, from the document spectrum we may select the highest-amplitude OS as the main or principal subject of the text, then take the next several highest-amplitude OSs and form an abbreviated or abstracted spectrum of the text. This abstracted spectrum is then compared with the spectrum of the main OS already stored in the database. If there is a strong correlation between the abbreviated spectrum of the text and the principal OS spectrum in the database, chances are that the content of the text does not bear much new information. However, for further checking one may look at the kinds of statements and frames that have been used in the text to connect the components of the document spectrum to the main OS and compare them with the existing database of known relations between these OSs. Generally there are more ways known in the art of spectral and signal analysis to evaluate the correctness and novelty of the text using the mentioned OS spectrum. When there are distinguishable peaks in the document spectrum for which the system does not have a record of verified relations, the system marks them as novel and worthy of investigation and can compose a series of questions or suggestions to explain their relationship. It may also zoom in on lower-amplitude OSs and question or suggest a relationship between a high-amplitude OS and a lower one, etc. All this information is available both to the editors of each shop and to the creator of the content.
The system or the editor of each shop can present such unknowns to the public and solicit contributions to the solution.
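The comparison of an abstracted document spectrum with the stored spectrum of its principal OS might look like the following sketch. It is a minimal illustration under assumptions: spectra are plain dictionaries, the similarity measure is a cosine (normalized zero-lag correlation), verified relations are modelled as a set of unordered subject pairs, and the names `abstract_spectrum`, `cosine`, and `assess` as well as the toy data are hypothetical.

```python
import math
from typing import Dict, FrozenSet, List, Set, Tuple

Spectrum = Dict[str, float]


def abstract_spectrum(doc: Spectrum, top_k: int = 3) -> Tuple[str, Spectrum]:
    """Pick the highest peak as principal OS and keep the next top_k peaks."""
    principal = max(doc, key=doc.get)
    others = sorted((s for s in doc if s != principal),
                    key=lambda s: -doc[s])[:top_k]
    return principal, {s: doc[s] for s in others}


def cosine(a: Spectrum, b: Spectrum) -> float:
    """Normalized correlation of two spectra on their common axis."""
    dot = sum(a[s] * b[s] for s in a if s in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def assess(doc: Spectrum,
           stored: Dict[str, Spectrum],
           verified: Set[FrozenSet[str]],
           top_k: int = 3) -> Tuple[str, float, List[str]]:
    """Return the principal OS, its similarity to the stored spectrum,
    and questions about peaks that have no verified relation on record."""
    principal, abstract = abstract_spectrum(doc, top_k)
    similarity = cosine(abstract, stored.get(principal, {}))
    novel = [s for s in abstract
             if frozenset((principal, s)) not in verified]
    questions = [f"How is '{s}' related to '{principal}'?" for s in novel]
    return principal, similarity, questions


# Made-up example data.
doc_spectrum = {"graphene": 5.0, "battery": 3.0, "seaweed": 2.0}
stored_spectra = {"graphene": {"battery": 10.0, "carbon": 8.0}}
verified_pairs = {frozenset(("graphene", "battery"))}
principal, similarity, questions = assess(
    doc_spectrum, stored_spectra, verified_pairs)
```

Here the unverified peak "seaweed" is the one the system would flag and ask about, while a similarity close to 1 would suggest that the document mostly restates known associations.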
The strength of such a knowledge discovery system lies in its systematic processes, large number of potential participants, limitless subject matters, and its vast databases that are not readily available to individuals. The potential value of the system also lies in that the method enables measuring and quantification of one's contribution, both implicitly and explicitly to the advancement of the knowledge database.
To present such knowledge to the public, the system uses publishing/broadcasting shops as mentioned above. The system receives information content in the form of text, audio, video, or any combination of them that is in general related to one or more subjects or categories, whether solicited or not. The content received is tagged with a unique reference and with authenticated submitter information such as digital signatures, biometric information, IP address, etc., or by any other appropriate means of making sure that the submitted content is uniquely tagged and owned by a real single entity, whether individual/s, agents, legal entities, or the like.
The subject or category can be identified by a computer program, by the creator/s of the content, by people other than the creator of the content, or in general by any combination of these three groups. The system then, with or without the help of the shop administrator/s, qualifies the submitted content as described above in terms of its merit, novelty, importance, and impact. The system may further refine the overall merit of a submission by looking at the rank and credit of the submitters and their affiliations.
The system finds the authoritative experts in the subject, again either automatically by computer programs or with human help. The content is then sent to one or more of these authorities, whom we call reviewers, and they are asked to evaluate, comment, make suggestions, give opinions, and give feedback via an online communication channel such as email and the like.
The reviewers are asked to evaluate the information content of the creator/s and to give their feedback, recommending that the content be included in the data or knowledge repository of the system for use by other users or clients, be rejected for inclusion, or be included after a revision by the creator/s subject to the satisfaction of the reviewer/s.
If the reviewers recommend the content for inclusion or online publishing/broadcasting conditionally, then the content and the comments or questions are sent to the creator/s, who are given a period of time to send the revised content. The revised content, along with the answers to the reviewers' comments or questions, can be sent to the reviewers again with a request for their recommendation either for inclusion in the data/knowledge base of the system or for rejection. The creator/s will then be informed of the final decision.
The subject matter is basically limitless as long as qualified reviewers can be found by human assistance or by an automatic program (a program which finds the authorities and ranks them based on an algorithm which we can call “Ranked Subject Matter Authorities” or RSMA). If the system cannot find qualified authorities, then the content can still be published under a different collection, which is marked as non-reviewed content. Since the publications are peer or expert reviewed, the collection is citable and can be used to the credit of the creator of the content.
Paid subscriptions to one or a number of shops, the sale of copies of contents, advertisements, and all the known revenue sources of electronic commerce may generate revenue for each shop and for the system. Moreover, the system can be mandated by an entity to make an effort to find a solution to a challenging problem that is important to that entity. The system then splits the proceeds among all the contributing parties according to a predefined contract.
The commercial success of the system is mostly based on the substance of the contents published or broadcast and the value of its service to the users. Therefore the system, in one aspect of this invention, will share its success with its contributors. Over time, depending on the success of a content in terms of its popularity and importance, a creator accumulates credit points and at some point can claim those credits in some form of monetarily valuable compensation, rewards, prizes, profit sharing, ownership, etc. There is provided a method to quantify the importance of one's contribution to the art. The more a submitted content generates further ontological subjects and grows its node, the higher the rank of importance and contribution of the content will be. Ranking algorithms for linked databases, such as PageRank, can also be applied to evaluate the importance and impact of content over time.
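As one hedged illustration of applying a link-based ranking to contents, the sketch below runs a basic PageRank power iteration over a small graph in which an edge from one content to another means "builds on" or "cites". The graph, the damping factor of 0.85, and the idea of reading the resulting scores as credit points are assumptions made for illustration; the text does not fix any of them.

```python
from typing import Dict, List


def pagerank(links: Dict[str, List[str]],
             damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Basic PageRank by power iteration; `links` maps a content to the
    contents it cites. Contents without outgoing links spread their rank
    evenly over all contents."""
    nodes = sorted(set(links) | {t for targets in links.values()
                                 for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Contents "b" and "c" both build on content "a" (made-up example).
citations = {"a": [], "b": ["a"], "c": ["a"]}
scores = pagerank(citations)
```

In this toy graph content "a" receives the highest score because later contributions build on it, which matches the idea that a contribution gains importance as it leads to further growth.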
Considering that each shop's title is also a node in the Ontological Subjects database, it is also possible to evaluate the overall rank and importance of the shops in a similar fashion. The success of a shop is measured both by the popularity, importance, and impact of its subject and by the revenue that the shop or its owner has generated. The system allows shop owners, with or without the help of the system, to generate income by, for example, displaying other entities' advertisements, banners, etc., or by any other means that is appropriate and accepted by law. The system in turn benefits from such income based on the predefined agreements with each shop owner.
The present invention provides a system and method for faster and more efficient universal knowledge discovery: firstly, by providing and presenting the worthwhile and important subjects to explore and work on; secondly, having built the map of ontological subjects of the universe, by assisting and guiding users to explore, to assess their work, and to discover new knowledge of the subjects as fast as possible; and thirdly, by providing an environment for rapid, expert-reviewed circulation and publication of new or filtered knowledge that is more credible and rigorous than non-reviewed material published over the Internet. The described system and method do not impose any limit on the number of subjects or the number of contents received, thereby enabling exploration of all possible subject matters of interest and importance to the public and to science while maintaining the desirable standards for the published contents. This will bring the cost of useful knowledge discovery down significantly.
Moreover the invention provides a system and method that allows people to get a fast assessment of their work universally and have a rapid access to the authorities' comment on a creation that they have worked on. The described method further provides access to the most updated, yet assessed, ideas and state of the universal knowledge.
Among the advantages of the present disclosure, in a preferred embodiment, is having a central system that allows all qualified experts to launch their own publication/broadcasting ventures with the least amount of investment and overhead for commercial gain. Furthermore this invention provides a method for rewarding the contributors by measuring the impact of their contributions and sharing with them the commercial success or profit of the system accordingly, thereby encouraging the brightest to participate in the advancement of the state of the art and economics.
It is another object of this invention to build an upper or universal knowledge repository, or ontological subject map, that can address all queries while it expands over time. Such a map can help users navigate confidently through the state-of-the-art knowledge of the universe and can effectively guide them in their research, leading to new discoveries.
The invention is now described in detailed disclosure accompanied by several exemplary embodiments of the system and its building blocks.
Without restriction intended for any form of electronic contents such as text, audio, video, pictures and the like, we start by describing the embodiments with regards to inputs that are in the form of text. However, for other forms of electronic content the present methodology and process can be used once one considers that all types of electronic contents are different realization of semantic representation of universe. Therefore a semantic or knowledge representation transformation will make the current description applicable to all forms of electronic contents submitted to the system.
To be clear throughout this description, let us define “Subject Matter (SM)” and “Ontological Subject (OS)” at the beginning. Generally, any string of characters can be a “Subject Matter (SM)” or “Ontological Subject (OS)” according to the definitions of this invention. Less generally, they could be any word or combination of words. Therefore SMs and OSs have in principle the same characteristics and are not distinguishable from each other. More specifically, a subject matter (SM) is a word or combination of words that shows a repeated pattern in many documents and that people, or some groups of people, come to recognize. Nouns and noun phrases, and verbs and verb phrases, with or without adjectives, are examples of subject matters. For instance, the word “writing” could be a subject matter, and the phrase “Good Writing” is also a subject matter. A subject matter can also be a sentence or any combination of a number of sentences. We define “Ontological Subjects (OS)” as subject matters worthy of knowing about. They are mostly related, but not limited, to nouns, noun phrases, entities, and things, real or imaginary.
Referring to
The important step in building such a system proposed in
In
In particular, a web robot can be employed to search through a search engine and find the approximate total number (count) of web pages containing a word or a phrase, or the count of co-occurrences of each pair of OSs. Furthermore it can be programmed in languages such as Perl, Python, AWK, and many others like C, C++, C# and the like, to look for specific textual patterns and co-occurrences of words within certain proximities, and basically to extract any type of character string that is desired from a text. Those familiar with Natural Language Processing (NLP) and Computational Linguistics (CL) can readily use such languages to write scripts and programs that extract different types of textual information from a text. In principle it is possible to parse sentences, simplify compound sentences, rephrase text, summarize, find lexical elements such as noun phrases, extract proper nouns or named entities, replace synonyms, analyze a text syntactically and semantically, make lists, build databases, manipulate strings of characters, and generally execute any algorithm that is designed for a specific goal. A good introduction to the subject of NLP and CL can be found on the web site of the “American Association for Artificial Intelligence” (www.aaai.org). Another good source is the book entitled “Text Mining Application Programming” by Manu Konchady, published by Charles River Media, Boston, Mass., 2006. The book provides information, teaching, training, and many accompanying application programs to perform the tasks mentioned here.
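Counting co-occurrences of words within a certain proximity, as mentioned above, takes only a few lines in a scripting language. The following Python sketch is illustrative only: it assumes single-word subjects, a simple regular-expression tokenizer, and a hypothetical `window` parameter giving the maximum distance in tokens.

```python
import re
from collections import Counter
from itertools import combinations
from typing import Iterable, Tuple


def cooccurrence_counts(text: str,
                        subjects: Iterable[str],
                        window: int = 5) -> Counter:
    """Count pairs of distinct subjects occurring within `window` tokens."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    wanted = {s.lower() for s in subjects}
    positions = [(i, tok) for i, tok in enumerate(tokens) if tok in wanted]
    counts: Counter = Counter()
    for (i, a), (j, b) in combinations(positions, 2):
        if a != b and abs(i - j) <= window:
            pair: Tuple[str, str] = tuple(sorted((a, b)))
            counts[pair] += 1
    return counts


sample = ("Quantum physics relies on mathematics. Biology is far away "
          "from this sentence about physics.")
pairs = cooccurrence_counts(sample, ["physics", "mathematics", "biology"],
                            window=3)
```

With a window of three tokens, "physics" and "mathematics" are counted once and so are "mathematics" and "biology", while "physics" and "biology" are too far apart to be counted.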
Referring to
There are a number of ways of doing this task. One simple way to find and list the important subject matters of interest is to use a search engine and look at the number of web pages that contain a given term or phrase. The term or phrase to be fed to the SSA can come from any list of words, such as dictionaries, ontologies, lists of proper names, or any list of words and phrases that exists or may exist. Search engines usually show web counts (or hits) that can be used as an indication of the importance of a term. The web counts that a search engine shows indicate the level of society's obsession with a subject matter and its importance to society, though they are not an exact indication of its intrinsic value. In particular, searching for the web count of general nouns such as Science, Physics, Biology, or combinations of them such as “Biophysics” or “Biochemical machine”, and seeing a large number of documents containing that term, is an indication of human obsession with that term and hence of its importance in human life. More sophisticated rules, algorithms, and criteria may be devised to find important subject matters.
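The count-based ranking can be sketched as follows. Because no particular search engine interface is specified, the document-count lookup is passed in as a function; the toy corpus and the names `rank_subjects` and `toy_count` are hypothetical stand-ins for a real web-count query.

```python
from typing import Callable, Iterable, List, Tuple


def rank_subjects(candidates: Iterable[str],
                  count_documents: Callable[[str], int],
                  top_n: int = 10) -> List[Tuple[str, int]]:
    """Rank candidate terms by the number of documents containing them."""
    counts = {term: count_documents(term) for term in candidates}
    # Highest count first; ties are broken alphabetically.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# A toy corpus standing in for the web (made up for illustration).
TOY_CORPUS = [
    "physics and biology meet in biophysics",
    "physics of the early universe",
    "biology of the cell",
    "science policy and physics funding",
]


def toy_count(term: str) -> int:
    """Number of toy documents containing the term as a whole word."""
    term = term.lower()
    return sum(1 for doc in TOY_CORPUS if term in doc.lower().split())


top_two = rank_subjects(["science", "physics", "biology", "biophysics"],
                        toy_count, top_n=2)
```

In practice `toy_count` would be replaced by whatever hit-count source is available, and more sophisticated criteria could be swapped in by changing the sort key.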
Consequently the system shown in
Once the application is approved and a title of a publishing/broadcasting shop is assigned to the user, the owner of the shop can use the service of the system and start soliciting, or being open to receive, contents, and providing the service to her/his/its group of people interested in that subject matter. The system or the administrators of the publishing/broadcasting system may also invite certain individuals to administer one or more of the publishing shops and act as editor or promoter of the journal (publishing/broadcasting shop). For instance, a computer program identifies subjects of interest by searching and analyzing the available information, e.g. by automatically searching the Internet, finds associations between a subject of interest and the authorities in the subject, and invites them to administer and establish their own online journal using certain rules and protocols that are provided by the host publisher (the main publishing site and system). New subjects can be introduced or proposed by a user, and once the user's authenticity and credit are established the user can also establish her/his/its own shop with the proposed title or subject.
The subject matter is basically limitless as long as qualified reviewers can be found by human assistance or by an automatic program (a program which finds the authorities and ranks them based on an algorithm which we can call “Ranked Subject Matter Authorities” or RSMA). If the system cannot find qualified authorities, then the content can still be published under a different collection, which is marked as non-reviewed content. Since the publications are peer or expert reviewed, the collection is citable and can be used to the credit of the creator of the content.
Referring to
In the preferred embodiment, the content is submitted through the main publishing host, and therefore each submitted content receives a submission date that can be used for crediting the contributor/s or as an indication of priority.
As shown in the
The reviewers are asked to evaluate the information content of the creator/s and to give their feedback, recommending that the content be included in the data or knowledge repository of the system for use by other users or clients, be rejected for inclusion, or be included after a revision by the creator/s subject to the satisfaction of the reviewer/s.
If the reviewers recommend the content for inclusion or online publishing conditionally, then the content and the comments or questions are sent to the creator/s, who are given a period of time to send the revised content. The revised content, along with the answers to the reviewers' comments or questions, can be sent to the reviewers again with a request for their recommendation either for inclusion in the data/knowledge base of the system or for rejection. The creator/s will then be informed of the final decision. It should be mentioned that the reviewers, in general, could be intelligent agents that are expert in that subject matter. After final acceptance, the content is included in the repository of the corresponding shop. The accepted content can then be published immediately and made available to the users (readers), readable by special software for viewing such materials such as Zinio's digital publishing software (www.zinio.com), and/or collected and released periodically in the form of a magazine or any other format that is desired and available based on the capabilities of the state of the art at the time of publishing.
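The review cycle described above (accept, reject, or accept after revision, with a final accept-or-reject decision on the revised content) can be sketched as a small decision function. The aggregation rule used here, where any rejection rejects and any revision request triggers a revision round, is purely an assumption for illustration; the text does not specify how multiple recommendations are combined.

```python
from enum import Enum
from typing import Iterable


class Recommendation(Enum):
    ACCEPT = "accept"
    REVISE = "revise"
    REJECT = "reject"


class Status(Enum):
    ACCEPTED = "accepted"
    REVISION_REQUESTED = "revision requested"
    REJECTED = "rejected"


def decide(recommendations: Iterable[Recommendation],
           revised: bool = False) -> Status:
    """Combine reviewer recommendations into one decision.

    Assumed rule: any REJECT rejects, otherwise any REVISE asks for a
    revision. On a revised submission only ACCEPT or REJECT is allowed.
    """
    recs = list(recommendations)
    if not recs:
        raise ValueError("at least one reviewer recommendation is required")
    if revised and Recommendation.REVISE in recs:
        raise ValueError("revised content must be accepted or rejected")
    if Recommendation.REJECT in recs:
        return Status.REJECTED
    if Recommendation.REVISE in recs:
        return Status.REVISION_REQUESTED
    return Status.ACCEPTED


first_round = decide([Recommendation.ACCEPT, Recommendation.REVISE])
second_round = decide([Recommendation.ACCEPT, Recommendation.ACCEPT],
                      revised=True)
```

A shop could substitute its own policy, for example a majority vote, without changing the surrounding workflow.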
Referring to
In
Referring to
Referring to
Concurrently with, or after, the indexing of OS associations is completed, another software agent looks for the kind of association between each OS and its associates by searching through databases such as WordNet, FrameNet, the whole Internet, or any database in which a relation between an OS and its components is expressed in natural language. The agent looks for patterns of explicitly expressed statements, such as SVO sentences, to establish the kind of relationship between each pair of OSs. The agent may also use natural language processing (NLP) methods and algorithms, such as text simplification, to find such an association pattern. However, since there is a vast amount of textual data available, the chances are that the agent will be able to find the explicitly expressed and verified statements, composed by humans, that it is looking for. The verification of relations is done by statistical analysis of the database. The diversity of sources and the number of times that a statement is repeated to express a relation between two OSs lead to the verification of that statement. These statements, expressing a relationship between an OS and its components, are also stored and indexed for further reference.
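A simple way to picture the statistical verification step is to count how often each extracted statement appears and from how many distinct sources. The sketch below assumes that statements have already been extracted as (subject, relation, object, source) tuples; the thresholds `min_count` and `min_sources` and the sample observations are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Statement = Tuple[str, str, str]              # (subject, relation, object)
Observation = Tuple[str, str, str, str]       # statement plus its source


def verify_relations(observations: Iterable[Observation],
                     min_count: int = 3,
                     min_sources: int = 2) -> Set[Statement]:
    """Keep statements repeated often enough across diverse sources."""
    counts: Dict[Statement, int] = defaultdict(int)
    sources: Dict[Statement, Set[str]] = defaultdict(set)
    for subject, relation, obj, source in observations:
        key = (subject, relation, obj)
        counts[key] += 1
        sources[key].add(source)
    return {key for key in counts
            if counts[key] >= min_count and len(sources[key]) >= min_sources}


# Made-up observations: the second statement is repeated by one source only.
observed = [
    ("graphene", "conducts", "electricity", "site-a"),
    ("graphene", "conducts", "electricity", "site-b"),
    ("graphene", "conducts", "electricity", "site-b"),
    ("moon", "is made of", "cheese", "site-c"),
    ("moon", "is made of", "cheese", "site-c"),
    ("moon", "is made of", "cheese", "site-c"),
]
verified = verify_relations(observed)
```

Requiring several independent sources keeps a single site that repeats itself from verifying a statement on its own.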
This database is then used to assess textual documents or any electronic content, such as audio, video, pictures, graphs, curves, and the like, whose information can be converted to textual format. The system first extracts the ontological subjects of a document and forms an OS spectrum for the document according to predetermined weighting-coefficient rules. For example, a coefficient is assigned to each OS depending on its position in the text and its count. Also, for instance, one may partition a document into a desired number of parts, such as chapters, pages, paragraphs, or sentences, and repeat the process of OS mapping on these smaller collections of content in the same way that an OSM is made from a larger collection of contents, i.e., by finding the association and co-occurrence counts of each two OSs.
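One possible way to form such a spectrum and the per-part co-occurrence counts is sketched below. The sketch assumes single-word ontological subjects and a simple position rule in which occurrences in the first part of the text weigh more. The names `os_spectrum` and `cooccurrence_counts` and the parameters `early_fraction` and `early_weight` are hypothetical choices, not rules specified in this disclosure.

```python
import re
from collections import Counter, defaultdict
from itertools import combinations


def _tokens(text):
    # Lowercased alphanumeric word tokens.
    return re.findall(r"[a-z0-9]+", text.lower())


def os_spectrum(text, subjects, early_fraction=0.2, early_weight=2.0):
    """Weighted count per subject; early occurrences weigh more (assumed rule)."""
    tokens = _tokens(text)
    cutoff = len(tokens) * early_fraction
    wanted = {s.lower() for s in subjects}
    spectrum = defaultdict(float)
    for i, tok in enumerate(tokens):
        if tok in wanted:
            spectrum[tok] += early_weight if i < cutoff else 1.0
    return dict(spectrum)


def cooccurrence_counts(parts, subjects):
    """Count in how many parts (e.g. sentences) each pair of subjects co-occurs."""
    wanted = {s.lower() for s in subjects}
    pairs = Counter()
    for part in parts:
        present = sorted(wanted & set(_tokens(part)))
        pairs.update(combinations(present, 2))
    return pairs
```

For example, `cooccurrence_counts` can be called with the sentences of a document as `parts` to obtain pair counts at sentence granularity, or with its chapters to obtain coarser counts.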
The resultant OS spectrum of the document and corresponding associating relationship between the OSs of the document, is compared both with the internal knowledge database of the system as shown in
Generally, there are further ways known in the art of spectral and signal analysis to evaluate the correctness and novelty of the text using the mentioned OS spectrum. When there are distinguishable peaks in the document spectrum for which the system has no record of verified relations, the system marks them as novel and worthy of investigation and can compose a series of questions or suggestions to explain their relationship. It may also zoom in on lower-amplitude OSs and question or suggest a relationship between a high-amplitude OS and lower-amplitude ones, etc. All this information is available both to the editors of each shop and to the creator of the content. The system or the editor of each shop can present such unknowns to the public and solicit contributions toward a solution.
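A minimal sketch of the novelty marking described above follows. It assumes the document spectrum is a mapping from OS to amplitude and that verified relations are stored as unordered pairs. The function name `flag_novel_pairs` and the `peak_ratio` rule for deciding what counts as a distinguishable peak are hypothetical.

```python
def flag_novel_pairs(spectrum, verified_pairs, peak_ratio=0.5):
    """Return pairs of peak subjects with no verified relation on record.

    spectrum: dict mapping subject -> amplitude.
    verified_pairs: set of frozenset({a, b}) for pairs with a verified relation.
    A subject counts as a peak when its amplitude is at least
    `peak_ratio` times the largest amplitude (an assumed rule).
    """
    if not spectrum:
        return []
    top = max(spectrum.values())
    peaks = sorted(s for s, amp in spectrum.items() if amp >= peak_ratio * top)
    return [
        (a, b)
        for i, a in enumerate(peaks)
        for b in peaks[i + 1:]
        if frozenset((a, b)) not in verified_pairs
    ]
```

Lowering `peak_ratio` corresponds to zooming in on lower-amplitude OSs, so that pairs of a high-amplitude OS with weaker ones are also questioned.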
For instance assume in
Over time, new nodes and associations will show their importance by leading to the growth of the newly discovered node or other nodes, and by yielding verified associations that are valuable to other contributors or of interest to commercial entities and ventures.
The system is then able to rank the importance of a contribution over time, universally or in each domain, based on an algorithm that quantifies the intrinsic value of the newly found associations or nodes. For instance, the value of a contribution over time can be evaluated by a software agent that shows how many other contributions have been built upon the original contribution following its submission.
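As a hedged illustration of the software agent mentioned above, the sketch below counts how many later contributions build, directly or indirectly, on a given contribution. The `builds_on` mapping and the function name `count_derived` are assumptions introduced for this example; the disclosure does not specify how such links are recorded.

```python
from collections import deque


def count_derived(builds_on, contribution):
    """Count contributions that build, directly or indirectly, on `contribution`.

    builds_on: dict mapping each contribution id to the ids it builds upon.
    """
    # Invert the map: for each contribution, the later ones built on it.
    children = {}
    for child, parents in builds_on.items():
        for parent in parents:
            children.setdefault(parent, set()).add(child)

    # Breadth-first traversal over the derived contributions.
    seen = set()
    queue = deque([contribution])
    while queue:
        node = queue.popleft()
        for child in children.get(node, ()):
            if child not in seen and child != contribution:
                seen.add(child)
                queue.append(child)
    return len(seen)
```

Evaluating this count at different times gives one simple measure of how the value of a contribution develops after its submission.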
It should also be emphasized that each OS is placed in the map uniquely and in the universal context, not in a domain-specific context. Therefore, one of the applications and advantages of the system is to show users and creators the route to navigate and to give direction to their exploration of a subject matter through the OSM. In other words, it can be used as a search tool to quickly get a hint as to the most important subjects or issues related to the subject matter of interest. In most cases, users do not know what they do not know in relation to a subject matter. Moreover, they might not be informed enough to recognize the importance scores of other subjects relative to their subject matter of interest. Therefore, by using the OSM, users can prioritize their effort and get direction in navigating their exploration in search of useful new knowledge.
It should also be noted that the system can be, and preferably is, realized in a distributed manner and need not operate in a single physical location. Each part of the system can be placed anywhere in the world and connected to the others by communication means, while still yielding the same function and providing the desired service to its users.
The system can sustain its service by several methods of generating revenue and profit. Paid subscriptions to one or more shops, sales of copies of contents, advertisements, and all the known sources of electronic commerce revenue may generate revenue for each shop and the system. Moreover, the system can be mandated by an entity to make an effort to find a solution to a challenging problem that is important to that entity. The system then splits the proceeds among all the contributing parties according to a predefined contract.
Additionally, fresh and timely contributions can be sold online to other researchers interested in that research content to keep them up to date. There could be enough interest from peer researchers in obtaining the results. The price of a content download can be decreased over time in a certain fashion, and the contributor(s) can be rewarded and share the profit from the sale of their contribution. Revenue can also be generated from targeted advertising fees. Since the shops are specialized, the advertisements in each shop are more relevant to the readers of that publishing/broadcasting shop, and the revenue from targeted ads in each shop will be shared by the owner of the shop and the publishing host. Each shop can arrange its own real or virtual face-to-face meetings, organize conferences, or hold gatherings and other events.
The commercial success of the system is mostly based on the substance of the contents published or broadcast and the value of its service to the users. Therefore the system, in one aspect of this invention, shares its success with its contributors. Over time, depending on the success of a content in terms of its popularity and importance, a creator accumulates credit points and at some point can claim these credits in some form of monetary compensation, rewards, prizes, profit sharing, ownership, or the like, as allowed by law. A method is provided to quantify the importance of one's contribution to the art. For instance, the more a submitted content generates further ontological subjects and grows its node, the higher the rank of importance and contribution of the content will be. Ranking algorithms for linked databases, such as PageRank, can also be applied to evaluate the importance and impact of the content over time.
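As one illustration of applying a linked-database ranking algorithm to contents, the sketch below is a plain power-iteration form of PageRank. It assumes contents are linked by "builds on" or "refers to" relations supplied as a dictionary. The function name, the damping factor of 0.85, and the fixed iteration count are conventional choices made for this example, not parameters specified in this disclosure.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank.

    links: dict mapping each node to the list of nodes it links to.
    Returns a dict mapping node -> score; scores sum to 1.
    """
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    if n == 0:
        return {}
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Every node receives the uniform "teleport" share.
        new = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            targets = links.get(v, [])
            if targets:
                share = damping * rank[v] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # Dangling node: spread its rank uniformly over all nodes.
                share = damping * rank[v] / n
                for t in nodes:
                    new[t] += share
        rank = new
    return rank
```

Under this sketch, a content that many other contents link to receives a higher score than the contents that merely link to it.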
Considering that each shop's title is also a node in the Ontological Subjects database, it is also possible to evaluate the overall rank and importance of the shops in a similar fashion. The success of a shop is measured by its popularity, the importance and impact of its subject, and the revenue that the shop or its owner has generated. The system allows shop owners, with or without the help of the system, to generate income by, for example, displaying other entities' advertisements or banners, or by any other means appropriate and accepted by law. The system in turn benefits from such income based on the predefined agreements with each shop owner.
The system can have its own rules or protocols to ensure its profitability and competitiveness. For instance, while many of the shops are identified, set up, and established by the system, qualified users are also given an opportunity to establish their own shops through the system. There could of course be parallel and competing online shops, but their owners will be given a certain time to write or solicit content from others so that they can produce enough interest and online traffic to keep their shops open. If they fail to do so, the system can stop providing them service or place them in a lower rank or lower tier of shops. A demoted shop's administrator can upgrade to a higher tier over time by producing higher-quality contents. Contributors who have made significant and valuable contributions may have privileges and advantages in establishing their own shops.
In conclusion, the disclosed system and method will help accelerate the rate of knowledge discovery for everyone's benefit by providing subject matters of intrinsic value for exploration, tools for knowledge navigation and content evaluation, rapid circulation and communication, and incentives for all contributors, such as content creators, editors, shop owners, and administrators. The system and method can thus help to improve the quality of life and increase economic growth and prosperity.
It is understood that the preferred or exemplary embodiments and examples described herein are given to illustrate the principles of the invention and should not be construed as limiting its scope. Various modifications to the specific embodiments could be introduced by those skilled in the art without departing from the scope and spirit of the invention as set forth in the following claims.
Claims
1. A method of knowledge discovery and publication comprising:
- a. identifying at least one subject matter for exploration;
- b. designating at least one online shop, for publishing and broadcasting electronic contents, corresponding to at least one of said at least one subject matter;
- c. finding at least one person having expertise in one of said at least one subject matter;
- d. receiving at least one electronic content from at least one creator, related to at least one of said at least one subject matter;
- e. providing a first computer implemented routine, having access to human knowledge repositories, configured to screen all the received at least one electronic content, and initially assess the merits of the at least one electronic content for being reviewed by at least one of a second computer implemented routine and at least one of said at least one expert, for inclusion in at least one of said at least one online shop;
- f. providing incentive to at least some of all contributors, including the at least one creator and said at least one person expert for administering the at least one online shop and reviewing said at least one electronic content, according to a predefined arrangement; and
- g. publishing the reviewed at least one electronic content, in the designated at least one online shop, corresponding to the at least one subject matter of said at least one electronic content, for viewing by other creators and users, whereby to accelerate the rate of knowledge discovery and substantially increase the amount of credible and useful knowledge.
2. The method of claim 1, further comprising a searching software agent for identifying the subject matters for exploration and related titles by searching through predetermined corpuses, thereby being able to provide a large number of the subject matters for exploration and the related titles for the corresponding online publishing shops.
3. The method of claim 1, further comprising a searching software agent for identifying names and contact information of experts and authorities having expertise and credentials related to the subject matters, thereby identifying, for each subject matter, a number of experts and authorities, for acting as one or more role of a reviewer, editor, administrator, and shop owner of one or more publication shop having the subject matter for exploration related to their expertise.
4. The method of claim 1, further comprising selling at least one of the publishing shops with predetermined conditions, each said shop having the title related to at least one of the subject matter for exploration, to at least one person interested in owning and operating at least one of the publishing shops, wherein each said shop has the right to use the first computer implemented routine for screening and assessing the electronic content having a subject matter related to that shop.
5. A computer implemented method of indexing ontological subjects comprising:
- a. providing a plurality of ontological subjects;
- b. providing access to at least one collection of electronic content by way of electronic communication;
- c. evaluating numerically an association value for a selected number of pairs of ontological subjects, using data information of co-occurrence of each of said pairs in said at least one collection of electronic content;
- d. selecting a first set of ontological subject, having at least one member, and selecting a first associated sets of ontological subject, each of said first associated set having at least one member, wherein each of the first associated sets is corresponded to a member of the first set, wherein each member of each set of said first associated sets having association value greater than a predetermined threshold with at least one member of the first set;
- e. indexing the members of the first set and the members of said first associated sets in a multilayer index with indices, by an indexing method comprising; i. indexing the members of the first set wherein indices are configured to show places of the members in the first layer of said multilayer index; ii. indexing the members of the first associated sets in next layer of the index, wherein indices are configured to show places of the members in that layer and to indicate the association of a member of the first set to the members of the corresponded first associated set; iii. marking an ontological subject in its current layer of the index as dormant, if said ontological subject has been indexed in previous layer and if an ontological subject is not indexed in previous layer but is a member of more than one of the first associated sets, mark said ontological subject as dormant everywhere that appear in its current layer of the index except in one place in its current layer of the index;
- f. selecting a second set of ontological subject, having at least one member, from the ontological subjects of the first associated sets which are not marked as dormant, i.e. non-dormant, in the index, and select a second associated sets, each of said second associated sets having at least one member, wherein each of the second associated sets is corresponded to a member of the second set, wherein each member of each set of said second associated sets having association value greater than a predetermined threshold with at least one member of the second set;
- g. indexing the members of the second associated sets in next layer of the index wherein indices are configured to show places of the members in their current layer and to indicate the association of a member of the second set to the members of corresponded second associated set, and marking an ontological subject in its current layer of the index as dormant, if said ontological subject has been indexed in one of the previous layers and if the ontological subject is not indexed in one of the previous layers but is a member of more than one of the second associated sets, mark the said ontological subject as dormant everywhere that appear in its current layer of the index except in one place in its current layer of the index; and
- h. repeating the step f to form a next set of ontological subjects from the not marked as dormant, i.e. non-dormant, members of the last associated sets of ontological subject and to form a next associated sets and indexing the ontological subject of the said next associated sets by repeating the step g, wherein the term the second is replaced with the third, the fourth and so on, whereby to uniquely index the desired number of ontological subjects in the context of said at least one collection of electronic contents, and wherein each ontological subject positioned once in the index as non dormant; and
- i. storing the index of the ontological subject on a storage device.
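By way of illustration only, and without limiting the claim, the layered indexing with dormant marking recited in claim 5 can be sketched as follows. The sketch assumes association values are already available as a dictionary keyed by unordered pairs and that the first set contains distinct subjects. The function name `build_layered_index`, the entry fields, and the `max_layers` cap are hypothetical. Where a subject appears in several associated sets of the same layer, the sketch keeps it non-dormant at the place with the highest association value to its parent, as in claim 6.

```python
def build_layered_index(assoc, seeds, threshold, max_layers=10):
    """Build a multilayer index of ontological subjects.

    assoc: dict mapping frozenset({a, b}) -> association value.
    seeds: list of distinct subjects forming the first set (first layer).
    Returns a list of layers; each layer is a list of entries of the form
    {"name": subject, "parent": subject or None, "dormant": bool}.
    """
    def value(a, b):
        return assoc.get(frozenset((a, b)), 0.0)

    subjects = sorted({s for pair in assoc for s in pair})
    layers = [[{"name": s, "parent": None, "dormant": False} for s in seeds]]
    indexed = set(seeds)      # subjects that already hold a non-dormant place
    frontier = list(seeds)    # non-dormant members of the last layer

    while frontier and len(layers) < max_layers:
        entries = []
        for parent in frontier:
            for s in subjects:
                if s != parent and value(parent, s) > threshold:
                    # Dormant if already indexed in a previous layer.
                    entries.append(
                        {"name": s, "parent": parent, "dormant": s in indexed}
                    )
        if not entries:
            break
        # A new subject in several associated sets stays non-dormant only
        # at its strongest place within the current layer.
        best = {}
        for e in entries:
            if e["dormant"]:
                continue
            cur = best.get(e["name"])
            if cur is None or value(e["parent"], e["name"]) > value(cur["parent"], cur["name"]):
                best[e["name"]] = e
        for e in entries:
            if not e["dormant"] and best[e["name"]] is not e:
                e["dormant"] = True
        layers.append(entries)
        frontier = [e["name"] for e in entries if not e["dormant"]]
        indexed.update(frontier)
    return layers
```

In the returned structure each subject holds exactly one non-dormant place, while its other appearances remain in the index as dormant entries.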
6. The method of claim 5, wherein said except one place, for indexing the non-dormant ontological subject, is chosen in its current layer such that the non-dormant ontological subject has the highest association value with its previous layer member of the ontological subject set.
7. The method of claim 5, wherein the index provides necessary information in indexing so that the marked dormant ontological subjects can point to the place in the index that said marked ontological subject is a non-dormant ontological subject.
8. The method of claim 5, wherein at least one of the associated sets of ontological subject contains at least one non-dormant member.
9. The method of claim 5, wherein number of members of each of said associated sets is a predetermined number.
10. The method of claim 5, wherein the plurality of ontological subject contains all existing ontological subjects of the universe to that time.
11. The method of claim 5, wherein the at least one collection of electronic content includes collection of all electronic contents in the Internet.
12. The method of claim 5, wherein the at least one collection of electronic content includes contents of an Internet search engine database.
13. The method of claim 5, wherein said association value of the pair of ontological subject is evaluated using the count information by querying at least one Internet search engine and getting the co-occurrence count of the pair from the search engine.
14. The method of claim 5, wherein the ontological subject index is represented with a corresponding ontological subject map, wherein each indexed ontological subject is shown by a node in the map, configured to demonstrate if the corresponding ontological subject is marked dormant, and wherein each node corresponding to a not marked dormant, i.e. non dormant, ontological subject, is uniquely positioned in the ontological subject map, whereby providing a tool for visual navigation and focusing on a desired particular place of the map.
15. The method of claim 5, wherein the associated set of the ontological subject is represented by at least one form of spectral graph having ontological subjects in one axis and showing the association value of the members of the set in another axis.
16. The method of claim 5, wherein the ontological subject index is used for identifying at least one of, related subject, most important subjects related to another subject, indirect relation of two or more subjects, whereby to increase efficiency in searching and acquiring new knowledge.
17. The method of claim 5, wherein the ontological subject index is used to guide and show to a user routes for exploration in search of new knowledge, thereby assisting the user in knowledge discovery.
18. The method of claim 5, wherein said ontological subject index is updated periodically or continually.
19. The method of claim 5, further comprising:
- a. finding explicit form of relations of association between each of selected pairs of ontological subjects by searching through the at least one collection of contents; and
- b. recording and storing the said explicit forms of relations in a database configured for easy retrieval.
20. The method of claim 19, wherein said database of explicit form of relations of association is updated periodically or continually.
21. The method of claim 1, further comprising:
- a. providing a reference ontological subject index built from a selected list of ontological subjects and a selected collection of content;
- b. building an ontological subject index for the received electronic content, from the ontological subjects contained in the received content; wherein the index has at least one layer and has at least one ontological subject at its first layer;
- c. comparing the ontological subject index of the received electronic content with the reference ontological subject index by matching the first layer at least one ontological subject to position of the same at least one ontological subjects in the reference index;
- d. scoring the merit of the received content by a predetermined formula, wherein at least one of the variables in the formula is the number of similar ontological subjects in the associated set of the first level at least one ontological subject; and
- e. passing the content for reviewing if the score is higher than a predetermined criteria, whereby to screen a large number of received electronic content automatically before reviewing.
22. The method of claim 5, wherein the ontological subject index is used as reference for scoring the merit of an electronic content in terms of validity, novelty and importance.
23. The method of claim 21, or 22, wherein ontological subject map is also used for performing said methods.
24. A system for knowledge discovery and publication of contents comprising:
- a. a database of a plurality of subject matters for exploration;
- b. a plurality of online publishing/broadcasting shop each having a title related to at least one of said subject matters;
- c. at least one reference indexed database representing an ontological subject map that contains a desired number of ontological subjects related to said subject matters, wherein said ontological subject map indicating the association of each ontological subject to all other desired number of ontological subjects ordered and spread in the map by a predefined method based on their association value to each other, and wherein said database of the map can be derived by having access to a collection of contents having at least one content;
- d. means for receiving at least one electronic content from creators, assigning at least one possible publishing shop for said at least one electronic content, and building at least one ontological subject map using said received at least one electronic content;
- e. at least one software program configured to automatically measure at least one merit of said received at least one electronic content, by comparing predefined parameters and functions of said at least one ontological subject map of said content with said at least one reference ontological subject map, and passing the content, if the at least one merit meets a predetermined criteria, for publication to said at least one of assigned shop; and
- f. means for making said received at least one electronic content available for public access; whereby to publish a large number of credible and useful knowledge in short period of time.
25. The system of claim 24, further including: a database of persons' names and contact information, each person having expertise or authority in at least one of said subject matters.
26. The system of claim 25, further including: means for communicating with the authorities and the creators, thereby reviewing the received electronic content, by at least one authority from the said database of authorities, and assisting the creator by informing about the assessment and status of the received electronic content.
27. The system of claim 24, wherein each said online publishing shop having at least one editor from one of human or intelligent software agent, configured to be able to review an electronic content, to select some of the received contents for publishing according to predetermined criteria.
28. The system of claim 24, further including software means configured to rank the importance of subject matters and online publishing shops by several predetermined factors, at least one of the factors is selected from popularity, position in the reference ontological map, members of the associated set, and number of non-dormant members in the associated set, of the ontological subject corresponding to the subject matter.
29. The system of claim 24, wherein said at least one reference index being updated periodically or continually, and said system further including at least one database, also being updated periodically or continually, having indexed explicit form of relations of association between each of selected pairs of ontological subjects.
30. The system of claim 29, further including software artifact configured to perform
- a. detecting and recognizing: i. inclusion of at least one new ontological subject in the said updated databases or inclusion of a new ontological subject in the associated set of an ontological subject; ii. position changing or displacement of an ontological subject in said updated databases; and
- b. ranking the importance of the said new ontological subject and re-ranking the importance of the displaced ontological subject.
31. The system of claim 30, wherein said at least one new ontological subject, having importance factor higher than a threshold, is included in the database of subject matters for exploration and designating an online shop based on a predefined criteria.
32. The system of claim 30, further including a software artifact configured to inform at least some of users and creators, about the inclusion of at least one new ontological subject to the at least one reference index, or displacement of an ontological subject in the reference ontological subject index; thereby notifying the users about an important potential discovery.
33. The system of claim 24, further including a platform for sale configured to sell the publishing shops with predetermined conditions, each said shop having the title related to at least one of the subject matter for exploration, to at least one person interested in owning and operating at least one of the publishing shops.
34. The system of claim 28, further including a platform for sale configured to sell the publishing shops with predetermined conditions, each said shop having the title related to at least one of the subject matter for exploration, to at least one person interested in owning and operating at least one of the publishing shops, wherein the price depends on the importance rank of the subject matter.
35. The system of claim 24, further including a compensation platform configured to reward and compensate at least some of persons and entities who contributed to provide the contents and operate the system.
36. The system of claim 24 and in accordance with claims 26, 27, 33, 34, and 35, wherein said compensation platform to reward and compensate said those contributed including, if applicable, at least some of creators whose electronic content being published by at least one of the online publishing shop, at least some of authorities who reviewed at least one electronic content, at least some of administrators, at least some of said editors, and at least some of the owners of the online publishing shops.
37. The system of claim 24, wherein the system is distributed and at least one part of the system is physically located in, or performs from, different location from the rest of the system.
Type: Application
Filed: Jul 24, 2008
Publication Date: Jan 29, 2009
Applicant: (Thornhill)
Inventor: Hamid Hatami-Hanza (Thornhill)
Application Number: 12/179,363
International Classification: G06F 7/10 (20060101); G06F 17/30 (20060101);