Stateful, Real-Time, Interactive, and Predictive Knowledge Pattern Machine

- Birdview Films, LLC

This disclosure describes a knowledge pattern machine that is distinct from and goes beyond a traditional artificially intelligent predictive knowledge system employing simple domain-specific numerical regression models. Rather than generating purely quantitative projections within a static set of parameters and data, the disclosed pattern machine uses various layers of artificial intelligence to recognize and derive dynamically evolving predictive patterns and correlations among quantitative and/or qualitative information pertaining to one or multiple domains. The pattern machine extracts knowledge items, including various signals, events, properties, and correlations therebetween, and predicts future trends and evolvements of related knowledge items to automatically and intelligently answer user queries. The generated predictive answers are rendered as reports updated in real-time without user intervention as the underlying data sources evolve over time and are sharable among different users at various levels. The various knowledge items are timestamped and used to further yield a stateful pattern machine.

Description
CROSS REFERENCE

This application is a continuation application and claims the benefit of priority to U.S. Pat. Application No. 17/685,261, filed on Mar. 2, 2022, which is a continuation-in-part application of and claims priority to PCT International Patent Application No. PCT/US2021/047166, filed on Aug. 23, 2021, which is based on and claims priority to U.S. Pat. Application Serial No. 17/090,625 filed on Nov. 5, 2020, U.S. Pat. Application Serial No. 17/159,707 filed on Jan. 27, 2021, and U.S. Pat. Application Serial No. 17/320,876 filed on May 14, 2021. These applications are herein incorporated by reference in their entireties.

TECHNICAL FIELD

This disclosure relates to intelligent interactive data analytics in a real-time predictive knowledge pattern machine.

BACKGROUND

Existing artificially intelligent predictive methods process confined, numerical, and historical datasets within a given domain and static set of parameters. The resulting predictions are limited to the initial data sources and metrics.

BRIEF DESCRIPTION OF THE DRAWINGS

For a more complete understanding of this disclosure, reference is made to the following description and accompanying drawings.

FIG. 1 illustrates an example computer network platform for implementing a stateful, real-time, and predictive knowledge pattern machine.

FIG. 2 illustrates example functional and/or structural blocks of an example knowledge pattern machine.

FIG. 3 illustrates example functional and/or structural blocks for query processing.

FIG. 4 illustrates example functional and/or structural blocks for query refinement.

FIG. 5 illustrates example functional and/or structural blocks for signal processing.

FIG. 6 illustrates example functional and/or structural blocks for prediction based on signals and events.

FIG. 7 illustrates example functional and/or structural blocks for answer generation.

FIG. 8 illustrates example functional and/or structural blocks for query recommendation.

FIG. 9 illustrates example functional and/or structural blocks for data scraping.

FIGS. 10-14 illustrate various interactive graphical user interfaces.

FIG. 15 illustrates an example computing device that can be employed in any above component of the pattern machine.

DETAILED DESCRIPTION

The following description and drawing set forth certain illustrative implementations of this disclosure in detail, which are indicative of several example manners in which the various principles of the disclosure may be carried out. The illustrated examples, however, are not exhaustive of the many possible embodiments of the disclosure. Other objects, advantages and novel features of the disclosure are set forth in the following detailed description when considered in conjunction with the drawings.

Introduction

Current predictive artificially intelligent systems are limited to generating a static numerical output using traditional statistical algorithms and regression models. They are generally incapable of dynamically adjusting the parameters of the machine-learning model and real-time inputs to generate continuous updates and responses and are incapable of automatically tracking, detecting and identifying changing correlative relationships and features within new datasets. They are further incapable of combining semantic and syntactic processing and lack architecture and components for automatic delineation of input information into qualifiable and/or quantifiable metrics (herein referred to as “signals”) and discrete singular or repeated occurrences (herein referred to as “events”) and tracking their effects and relationships in real-time.

The disclosure below describes an interactive knowledge pattern machine that goes beyond such traditional artificially intelligent systems to predictively answer any user query by intelligently and automatically detecting and identifying signals and events, features associated with the signals and events, and/or other correlative relationships relevant to the user’s query to generate real-time and dynamic predictive responses in a quantitative and/or qualitative manner within a domain, across multiple independent domains, and/or multiple intersecting domains. The knowledge pattern machine goes beyond domain-specific statistical regression and can combine semantic and mathematical information to render more complete predictions and is capable of identifying and analyzing predictive trends and patterns across multiple domains and topics that may or may not intersect.

In addition, the knowledge pattern machine described in this disclosure may dynamically process both quantitative and/or qualitative data/information to continuously generate predictions of future trends, evolvements, and/or characteristics of signals, discrete events, correlative relationships, effects, features, conditions, and/or other properties/information. The knowledge pattern machine may use various layers of artificial intelligence to discover such correlations among the quantitative and qualitative information to derive and recognize predictive patterns that may dynamically evolve.

The knowledge pattern system may be configured to intelligently analyze an input query with or without supplemental information, which may be supplied from a user either voluntarily or interactively (e.g., in response to options generated and prompted by the pattern machine when the original input query is determined by the system as lacking information) to transform the original input query into a more complete and formalized query. The pattern machine may be further configured to automatically identify relevant information entities in the formalized query and expand to additional relevant information entities and correlations, or a lack thereof, between all the different relevant information entities based on one or more natural language understanding module(s), one or more query database(s), one or more knowledge graph(s), signals database(s), and/or events database(s). The knowledge graphs, signals databases, and events databases may be precomputed but automatically and continuously updated.

These various information items may be further processed by one or more predictive models automatically selected from a model library to perform a prediction. Both quantitative information such as metrical and numerical datasets and/or qualitative information such as textual input, events, conditions of events, and other situational and conditional information may be processed by the predictive models. A predictive machine-learning engine may correlate events, signals, and precomputed information items to generate quantitative and qualitative and continuously updating predictions, and further interpret the outcome of variables linked to the formalized query of the user.

The resulting predictive output may be further formulated into an easy-to-understand visual form in a graphical user interface presented to the user as an answer to the original query. A report may be further generated. The report may combine visual information for answering the input query, including but not limited to graphs, charts, spreadsheets, animations, textual descriptions, and the like. These visual information reports may depict, for example, evolutions, trajectories, and interrelationships between the signals and events extracted from the input query. The format of the report may be determined intelligently, and the report may be automatically and constantly updated while new queries are being ingested and as various datasets underlying the answer to the queries change over time. The queries, the answers, and the various steps along the query-to-answer process are all stored/logged and time stamped as historical data and can be used by the various components of the pattern machine in its data analytics, information extraction, query formalization, and prediction processes.

The various databases and knowledge graphs are automatically and dynamically kept up to date as any of the underlying data sources and analytics models evolve over time. The user may be further provided with an interactive user interface to input, modify, and supplement the query, to select system-suggested options, and to view intermediate data analytics results during various stages of the query processing and answer/report generation process by the pattern machine. Previous formalized queries and answers may be accumulated and incorporated in identification of new data and new predictive knowledge items in response to future user queries, thus providing a stateful pattern machine. The queries, answers, and intermediate datasets may be shared among different users at various levels.

As described in more detail below, the knowledge pattern machine integrates various levels of artificial intelligence to provide predictive data analytics that significantly improve relevancy and accuracy in identifying quantitative as well as qualitative futuristic analytics with respect to free-form input queries. The pattern machine intelligently and automatically conducts predictive data analytics to generate qualitative and/or quantitative answers and trends based on user queries. As described above, the pattern machine may be organized by domains (such as education, politics, engineering, transportation, chemistry, and the like) and may be integrated so that it is applicable to multiple domains and can be used to perform cross-domain predictions.

The underlying data for the pattern machine may be publicly available and/or may be from private/personal data sources. Publicly available data, for example, may include data scraped or crawled from the internet (web) across various domains.

Overall Architecture

FIG. 1 shows an example network system 100 for implementing a knowledge pattern machine. System 100 includes one or more knowledge pattern servers and databases 106 and 108, and data sources 102 and 104. The knowledge pattern servers and databases 106 and 108 may be accessed by an individual or a group of users 122, 124, 126, and 128 via their computing devices 112, 114, 116, and 118. The computing devices 112-118, knowledge pattern servers and databases 106 and 108, and data sources 102 and 104 may be connected via public or private communication networks 101. The knowledge pattern servers and databases 106 and 108 and data sources 102 and 104 may be centralized or may alternatively be distributed across various geographic regions. The knowledge pattern servers and databases 106 and 108 and data sources 102 and 104 may be implemented as dedicated computers. Alternatively, the knowledge pattern servers and databases 106 and 108 and data sources 102 and 104 may be implemented as virtual machines in, for example, a cloud computing environment. Data within the databases 106 and 108 may be organized as one or more knowledge graphs, relational datasets, and/or the like. The computing devices 112-118 may be implemented as any electronic devices capable of accessing the knowledge pattern servers and databases 106 and 108 and data sources 102 and 104 via the communication network 101. The access may be provided by means of webpages accessible in web browsers running on the computing devices 112-118 or may be provided alternatively via dedicated client application programs running on the computing devices 112-118. Such access may be associated with a user account and may be permissioned via user password protection. Alternatively, user access to the webpages or application interface may also be publicly accessible to those with internet connectivity.

FIG. 2 illustrates various subsystems of an overall system functional architecture of an example pattern machine system. The pattern machine system 100 of FIG. 1 may be configured to include a combination of any of a query processing subsystem 210, a signal/event processing subsystem 220, a prediction engine 230, an answer generator subsystem 240, and a query recommendation subsystem 270. These subsystems may be configured in tandem as shown in FIG. 2. A graphical interface 250 in various forms may be further provided for users 260 to interactively and visually communicate with the pattern machine system throughout the querying process and answer presentation process. For example, the users 260 may input and interactively refine an initial query with the query processing subsystem 210. Optionally, the users 260 may further interact with the signal processing subsystem 220 to assist in signal identification for consumption by the prediction engine 230. In addition, the users 260 may interact with the answer generator subsystem 240 to view various optional forms of the predictive answers and to modify the format of the system’s response to their query. Furthermore, the users 260 may interact with the query recommendation subsystem 270 via the interactive user interface 250 to view and receive recommendations for intelligent follow-up query recommendations after reviewing the answers.

Query Processing

As shown in 210 of FIG. 2, the query processing subsystem may be configured to interactively receive and process a free-form user input query and optionally supplemental information associated with the query to generate a formalized question. In some example implementations, the query processing subsystem 210 may include receiving a free-form user input query as a question, analyzing, refining, supplementing, expanding, filtering, and reformatting the question into a formalized form that can be efficiently processed by the signal/event processing subsystem 220.

The query processing subsystem 210 may include various artificial intelligence components configured to understand and contextualize the user query in, for example, the form of a free-form question that is submitted by the user. The various circuitry configured to implement the query processing subsystem 210 may include, for example, various Natural Language Understanding (NLU) models and other data analytics components and models that are adapted to generate a formalized question. Such a formalized question may at least include one or more quantifiable metrics, the prediction of which are to be generated by the signal/event processing subsystem 220 and/or the prediction engine 230. The formalized question including the metrics as identified/determined/extracted/derived by the query processing subsystem 210, for example, may be passed on to the signal/event processing subsystem 220, which may be connected to some pre-existing or pre-developed and dynamically updatable databases or one or more knowledge graphs to generate relevant signals and major events associated with the one or more metrics. Both the signals and events are entities that may be characterized by multiple features and associated with quantitative and/or qualitative information. A signal, for example, may be a quantifiable variable that evolves as a function over time. A signal may be associable with various properties including features and quantitative and/or qualitative information. The values and/or dependence of the values of the signal on time may differ when the signal is associated with different properties. For example, atmospheric temperature may be a signal. Atmospheric temperature may be quantifiable and vary over time. Atmospheric temperature and/or its time evolution may be different in different geolocations (e.g., geolocation may be one of the properties or features of the atmospheric temperature signal). An event, for example, may either occur or not occur, or occur in part, depending on its various associable features, conditions, and probabilistic likelihoods. An event may be quantified by an occurrence probability and may be further characterized by features and/or various conditions for occurrence, causality, and/or conditionality. Probability of occurrence of an event may be related to conditions involving these features or properties. For example, legislation of an environmental law may be an event. It may be likely or unlikely to pass based on certain conditions associated with the legislation. Signals may be correlated with events. For example, the atmospheric temperature may be correlated with the legislation of an environmental law. A signal may be correlated with many events and likewise, an event may be correlated with many signals. The signals and major events, once intelligently identified/extracted/derived/determined, may then be used at least as part of the input to the prediction subsystem 230 for performing data analytics to generate predicted metrics in, e.g., textual or numerical or graphical or visual form, as an answer to the user question embedded in the user query.
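
Merely as a non-limiting editorial illustration, the following Python sketch shows one way the signal and event entities described above could be represented as data structures; the class fields, field names, and example values are assumptions for illustration only and are not part of the disclosed implementation.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional, Tuple

@dataclass
class Signal:
    """A quantifiable variable that evolves over time, e.g. atmospheric temperature."""
    name: str
    # Properties/features (e.g. geolocation) that condition the signal's time evolution.
    properties: Dict[str, str] = field(default_factory=dict)
    # Timestamped observations of the signal as (timestamp, value) pairs.
    series: List[Tuple[datetime, float]] = field(default_factory=list)

@dataclass
class Event:
    """A discrete occurrence characterized by a likelihood and conditions,
    e.g. passage of an environmental law."""
    name: str
    occurrence_probability: float = 0.0
    conditions: List[str] = field(default_factory=list)
    features: Dict[str, str] = field(default_factory=dict)
    timestamp: Optional[datetime] = None

# Example: a temperature signal (per geolocation) correlated with a legislative event.
temperature = Signal(
    "atmospheric_temperature",
    properties={"geolocation": "Phoenix, AZ"},
    series=[(datetime(2021, 8, 1), 41.2), (datetime(2021, 8, 2), 42.0)],
)
environmental_law = Event(
    "environmental_law_passage",
    occurrence_probability=0.4,
    conditions=["committee approval", "majority floor vote"],
)
```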

The main functionalities of the query processing subsystem 210, for example, may include but are not limited to any combination of:

(1). Receiving a free-form input query via the graphical user interface 250 of FIG. 2.

(2). Parsing the input user query into concepts or variables.

(3). Determining a domain of the user query (e.g., knowledge category and field).

(4). Performing analytics to understand the query and identify whether a question embedded therein is answerable.

(5). Performing query expansion to augment the user query in order to ensure that relevant signals and/or events can be identified in the pre-existing or predeveloped signal and events databases in the identified knowledge domain and field.

(6). Identifying and blocking out-of-scope/unethical/biased topics.

(7). Identifying and clarifying ambiguity in the original query, and interactively obtaining supplemental information and/or clarification from the user via the graphical user interface 250 of FIG. 2. In some implementations, the query subsystem 210 may generate options based on the original user query for interactive refinement and selection by the user during the query analysis process in order to resolve ambiguity, narrow the scope of the query, and more accurately identify user intention and variables, timelines, and metrics associated with the original user query.

(8). Identifying typos and other errors in the user query and perform corrective operations.

(9). Recommending an alternative query that may be more appropriate based on the input query and historical processed queries and their metadata (this function may be performed in conjunction with the answer generation subsystem 240, as described in further detail below).

The functionalities above for the query processing subsystem 210 form a core of the query entry portion of the pattern machine architecture, in which the input query is extracted, cleansed, expanded, supplemented, and analyzed to generate one or more formalized questions with well-defined signals/events and predictable metrics and variables in a single domain or in a combination of multiple domains. For example, the query processing subsystem 210 maps metrics in an input query, after expansion, supplementation, and/or cleansing, to related signals of the input query that are pre-computed and constantly updated. Alternatively, historical processed queries may be recommended. A more detailed example pipeline for the query processing subsystem 210 is shown in 300 of FIG. 3.

As shown in FIG. 3, for example, the query processing pipeline may include but is not limited to query preprocessing 302, query type processing 304, query refinement 306, query deduplication 308, query analytics/extraction 310 and query embedding 312. The user 320 interacts with the query processing pipeline 300 by inputting an initial query as shown by 330 and by optionally assisting with query refinement, as shown by 350. These components/processes are configured to intelligently parse, preprocess, supplement, and analyze the user query to extract information in order to match the query with the pre-computed signals. Merely as non-limiting examples, the various components of the query processing pipeline 300 above are described in further detail below.

User Query/Input 330: The initial user query may be input as a free-form question in, for example, a textual format, for which the user 320 wants the pattern machine to provide a predictive answer. The user 320 may type the query in the graphical user interface. In case more information (e.g., supplemental information, optional information) is needed from the user, the pattern machine system may first try to automatically supplement the missing information based on, for example, the contextual information of the query and the historical data associated with the user or similar queries, and then as the next resort, may prompt the user to provide more information. For example, the user may be prompted to provide some additional input to the system either in text format or by selecting suggested options by the pattern machine from the graphical user interface. In some implementations, the user query and/or the supplemental information may be entered by the user via audio or voice recognition.

Preprocessing 302: Preprocessing of the user query and input may form one of the essential steps for the query processing pipeline 300 as it may be configured to ensure that the user query is in a state that is useful for the downstream components of the pattern machine described above with relation to FIG. 2. Example techniques that can be adopted for this preprocessing component 302 may include but are not limited to text normalization, stemming/lemmatization (finding the linguistic root of each word), tokenization (breaking down the sentences into words), text cleaning (removing unwanted characters, emojis, non-ascii characters, etc.) and typo correction.
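
Merely as a non-limiting illustration, a minimal preprocessing sketch along the lines described above might look as follows; the regular expressions and the lemmatization hook are editorial assumptions, and a production component would typically delegate to an established text-analytics library.

```python
import re
import unicodedata

def preprocess_query(query):
    """Normalize, clean, and tokenize a free-form user query.  Typo correction and
    stemming/lemmatization would be plugged in at the marked step with a suitable
    library (e.g. an edit-distance spell checker and a WordNet lemmatizer)."""
    # Text normalization: fold accents and drop non-ASCII artifacts such as emojis.
    text = unicodedata.normalize("NFKD", query).encode("ascii", "ignore").decode()
    text = text.lower()
    # Text cleaning: remove punctuation and other unwanted characters.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    # Tokenization: break the sentence into words.
    tokens = text.split()
    # Stemming/lemmatization hook (left out here to keep the sketch dependency-free).
    return tokens

print(preprocess_query("Will the avg. Temperature in Phoenix rise by 2°C by 2030?"))
# ['will', 'the', 'avg', 'temperature', 'in', 'phoenix', 'rise', 'by', '2c', 'by', '2030']
```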

Query Type Processing 304: The query processing pipeline 300 may include the query type processing module as shown in 304 for identifying specific types of queries to assist the pattern machine in generating more accurate predictive answers. In some implementations, the query type processing module 304 may be configured to classify user queries into, for example, three categories of regular, out of scope (OOS), and biased queries.

For example, OOS type may be used to represent queries or questions that are unethical to answer. Ethical boundaries may be set based on topic-dependent parameters. This type of question would mean that the entire question or most of the question is unethical. In some implementations, a model for determining unethical questions may be established by starting from sample out-of-scope topics, using both language models and vector embedding to identify similar queries. For biased queries, determination of the bias in the queries may vary depending on the topical domain. Certain opinionated language can typically indicate strong bias. For example, some input queries may contain bias and/or profanity in the form of strong opinion towards the entities in the queries or are sarcastic. These queries may be first identified and then further processed to remove the bias. Various models may be designed and trained based on labeled datasets (pre-classified user queries) for classifying an input query into these various example query types. For example, supervised/reinforcement learning may be employed. The underlying models may be based on neural networks. A bias detector may be built for each domain. A bias detector may also combine a general bias detection component supplemented by domain bias detecting components. Bias may be automatically removed once identified.
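
As a hedged illustration only, the classification into regular, out-of-scope, and biased queries could be prototyped with a simple supervised text classifier; the tiny labeled set, the TF-IDF features, and the logistic-regression model below are editorial stand-ins for the larger labeled datasets and neural-network models contemplated above.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny illustrative labeled set; a deployed classifier would be trained per domain
# on a much larger corpus of pre-classified queries.
queries = [
    "Will electric vehicle sales grow next year?",           # regular
    "What will average rainfall be in Ohio in 2030?",        # regular
    "How do I build an untraceable weapon?",                 # out of scope (OOS)
    "Help me cheat on my medical licensing exam",            # out of scope (OOS)
    "When will that corrupt idiot finally lose the vote?",   # biased
    "Will the useless city council ever fix anything?",      # biased
]
labels = ["regular", "regular", "oos", "oos", "biased", "biased"]

query_type_model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
query_type_model.fit(queries, labels)

print(query_type_model.predict(["Will the new environmental bill pass this session?"]))
```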

In some implementations, regular queries (queries that do not contain bias or out-of-scope content) may be directed to the query refinement component 306. OOS queries, on the other hand, may be redirected to a fallback answer component in which users 320 may be provided with one or more sample fallback answers to indicate to the user that the pattern machine system is unable to provide a predictive answer to the question. As for the biased queries, once determined, they may be further redirected to a bias type detection component for further determination of the type of bias in the user query followed by a fallback answer or user refinement of the query in 306. In some implementations, the bias types may be predefined and for each of them, it may be predetermined whether a fallback answer should be provided. Otherwise, the query may be passed to the query refinement component 306 for refinement and for removal of the detected bias.

Additionally, in the query processing section, all improper words may be automatically removed from the query. Once the system removes the phrases indicating bias and/or an improper phrase, the system then proceeds to downstream processing. If there is enough of the query left, then the system may proceed to formalize the query and answer the question. If there is not enough of a query after bias is removed, then the user may receive a multiple-choice option page to select further topics suggested by the system to continue with a proper and formalized question, or alternatively receive a fallback response indicating their question will not be answered.

Query Refinement 306: Query refinement component 306 of the query processing pipeline 300 may be optionally configured to ensure that the user query is complete and in a format that is usable for identifying related events and signals in the subsequent data analytics by the signal processing component 220. In some implementations, the query refinement process 306 may be configured to identify and match one or more predefined question templates that are similar to the user query, and in case there is missing information in the user query according to the identified template, to prompt the user to select supplemental and classifying information from a set of choices intelligently suggested by the system. For example, the query refinement component 306 may be configured to suggest entity options to the user based on the matched query templates and the information already present in the initially filtered user query. The query refinement component 306 may use the information in the knowledge graph to identify the most plausible entity options, which may include items such as but not limited to, events, signals, time periods, and/or locations, and to suggest them to the user for selection.
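
The following sketch illustrates, under editorial assumptions, how template matching and missing-information detection in the query refinement component might be prototyped; the template strings, slot names, and the hard-coded suggestions that stand in for a knowledge-graph lookup are hypothetical.

```python
import re

# Hypothetical signal-focused question templates with entity placeholders.
QUERY_TEMPLATES = [
    "will {signal} in {location} increase by {time_period}",
    "what will {signal} in {location} be in {time_period}",
]

def missing_slots(extracted_entities, template):
    """Return the template slots not yet filled by entities extracted from the query."""
    slots = re.findall(r"{(\w+)}", template)
    return [slot for slot in slots if slot not in extracted_entities]

# Entities already recognized in the (filtered) user query; 'time_period' is absent.
extracted = {"signal": "atmospheric temperature", "location": "Phoenix"}
unfilled = missing_slots(extracted, QUERY_TEMPLATES[0])

# In the full system the options for each unfilled slot would be looked up in the
# knowledge graph; here they are hard-coded purely for illustration.
suggested_options = {slot: ["by 2030", "by 2040", "over the next decade"] for slot in unfilled}
print(unfilled, suggested_options)
```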

In some implementations, in order to provide a better experience for the user, the query refinement component 306 may be optionally adapted to provide the most similar query templates to the user query as selectable options to the users so that the users can identify a compatible query template with their query in the options provided by the pattern machine with minimal effort. The query refinement component 306 may further make use of a knowledge graph in order to find the most similar question templates as well as entity options to the user that are not only related but also semantically and contextually meaningful. The query refinement component 306 thus may help disambiguate the user queries and complete queries that are otherwise ambiguous and/or incomplete.

Query templates may be domain dependent. Query templates may be signal based or focused. Alternatively, query templates may be event based or focused. Within these templates, subcategories of question types may correspond to different query templates. These templates may be generated by a trained model.

Query expansion techniques may be further provided in the query refinement component 306 to add syntactically similar words in the query (using, for example, any type of pretrained syntactical expansion models) or semantically similar words in the query (using, for example, pre-trained word-embedding models) to facilitate matching the user query to precomputed signals and events. In other words, the purpose of the query refinement is to automatically complete the user query and identify relevant information based on one or more question templates to provide sufficient data items to the later query information extraction and/or query embedding components (see, for example, components 310 and 312 below). Optionally, the user may be prompted to supplement information or may be prompted with options for selection via the graphical user interface in some instances when the query refinement component 306 is less certain about the expansion, as shown by 350.
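
Merely as a non-limiting illustration, query expansion could be prototyped as below; the hand-written synonym map stands in for the pretrained syntactic-expansion or word-embedding models mentioned above.

```python
# Hand-written synonym map standing in for nearest neighbours returned by a
# pretrained word-embedding or syntactic-expansion model.
SYNONYMS = {
    "temperature": ["warming", "heat"],
    "rise": ["increase", "grow"],
}

def expand_query(tokens):
    """Append syntactically/semantically similar words so the query matches more
    of the precomputed signals and events downstream."""
    expanded = list(tokens)
    for token in tokens:
        expanded.extend(SYNONYMS.get(token, []))
    return expanded

print(expand_query(["will", "temperature", "rise", "in", "phoenix"]))
# ['will', 'temperature', 'rise', 'in', 'phoenix', 'warming', 'heat', 'increase', 'grow']
```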

A specific example logic/data flow for the query refinement component 306 is illustrated as 400 in FIG. 4. The query refinement logic/data flow 400 may include, for example, question or query type matching process 402 for matching the user input query 401 (preprocessed and filtered, e.g., output of the query type processing component 304 of FIG. 3, which is either a regular query or a biased query which the pattern machine system can provide answers to) to one or more sets of predefined and stored question/query templates, the analytics process 406 to determine whether information for the matched question type is complete in the user input query at 401, and a recommendation process 408 for generating suggested supplemental information to the user or optionally prompting the user to enter supplemental information when it is determined that the information in the input query (even after automatic supplementation) is still not complete, as shown by 409. The query refinement component 400 generates refined user query 410. The set of predefined and pre-stored question templates may cover most of the queries users may ask. The question templates may include some textual information as well as placeholders for specific entities such as signals and events, described in further detail below.

The output refined user query from 410 may contain sufficient information for the subsequent data analytics in query information extraction and query embedding. Interactive user involvement in the query refinement data/logic flow is merely optional and as needed to ensure that the refined user query contains sufficient information for further processing. In some implementations, the user may be provided with selectable options (e.g., dropdown menus) for supplemental query information, alternatively to or in addition to, being prompted to enter free text information in particular aspects of the original user query. The selectable options in, e.g., a dropdown menu, may be intelligently determined based on information available in but not limited to the events database, signals database, and knowledge graph (triplets containing various entities of information and their relationships) further described below.

Query de-duplication 308: This component of the query processing pipeline 300 of FIG. 3 may be configured to identify duplicate queries in a historical user query database which may be exactly or to a large extent the same as the current user query. In particular, historical user queries may be stored alongside relevant metadata (including the answer to the query) in, for example, a NoSQL or other type of database for all the users of the pattern machine system. Once a query is entered into the system, the query de-duplication function may be configured to identify duplicate queries in the database in order to obtain part or all of an answer already previously generated in answering the new query. Alternatively, the system might use at least a portion of a previously generated answer to configure a new prediction. If a duplicate query is found in the query database, then the pre-stored predictive answer to the duplicate query may be provided to the user without having to pass the query through the rest of the pipeline, thereby enhancing the user experience by reducing the time it takes to provide the answer to the user, even when the query may still be processed by the prediction engine (for real-time prediction and an updated answer). Detection of duplicate queries may be implemented by training a classification model which can determine whether two queries are considered the same query or not. The trained model may be developed such that it is capable of not only finding out if queries have the same intent, but also determining whether they have the exact same entities and/or whether the answer to them would be the same.
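
As a hedged illustration, duplicate detection could be approximated with an embedding-similarity threshold as sketched below (using TF-IDF vectors for simplicity); the disclosure contemplates a trained classification model rather than a fixed threshold, so this is only a stand-in.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def find_duplicate(new_query, historical_queries, threshold=0.9):
    """Return the index of a stored query judged to be the same question, or None.
    A trained classifier (as described above) would replace this fixed threshold."""
    vectorizer = TfidfVectorizer().fit(historical_queries + [new_query])
    similarities = cosine_similarity(
        vectorizer.transform([new_query]),
        vectorizer.transform(historical_queries),
    )[0]
    best = similarities.argmax()
    return int(best) if similarities[best] >= threshold else None

history = ["will gas prices rise in 2025", "will the environmental bill pass"]
print(find_duplicate("will gas prices rise in 2025", history))  # 0 -> reuse stored answer
```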

Query information extraction 310: The query information extraction component 310 of the query processing pipeline 300 of FIG. 3 may be configured to extract relevant information from the preprocessed, refined, and de-duplicated query in order to better understand the user intent. A wide range of information extraction techniques may be implemented based on the types of information to be extracted from the query. These information extraction techniques may include but are not limited to keyword/keyphrase extraction, named entity recognition (NER), entity mention detection (EMD), semantic role labelling (SRL), intent recognition, and the like. Each of these example techniques may be implemented in different manners using either classical or neural network-based approaches. Additionally, each of these techniques may be implemented in different manners using vector embedding and pre-trained language databases or models. Each of these techniques may be further customized for each domain to better fit domain-specific vocabulary. The query information extraction component 310 may be configured to perform a query expansion in order to identify the entities in the query and either disambiguate or augment these entities. In addition, the query information extraction component 310 may be further configured to extract relevant information from the user query in order to facilitate matching it with pre-computed signals by creating a detailed and structured format of the query which can be more conveniently used to perform matching with the information available in the pre-computed signals.
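
Merely as a non-limiting illustration, named entity recognition and keyword extraction on a refined query could be prototyped with an off-the-shelf NLP library such as spaCy (assuming the en_core_web_sm model is installed); EMD, SRL, and intent recognition would be layered on in a similar fashion, and the example query and expected labels are editorial assumptions.

```python
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Will unemployment in California fall below 4% by January 2026?")

# A structured representation of the query for matching against precomputed signals.
structured_query = {
    "entities": [(ent.text, ent.label_) for ent in doc.ents],
    "keywords": [tok.lemma_ for tok in doc if tok.pos_ in ("NOUN", "PROPN", "VERB")],
}
print(structured_query)
# entities such as ('California', 'GPE'), ('4%', 'PERCENT'), ('January 2026', 'DATE')
```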

Query information extraction may be customized to a particular domain. Such domain-specific query extraction may facilitate recognition of vocabulary that may be difficult for general language models and databases to recognize. As such, query information extraction may be based on a domain-specific language database. For example, in a domain of politics, there might be proper nouns (people, places, bills, political movements) that are new, niche, and/or changing. For NER, EMD, or SRL to properly identify these named entities, a language database that records these entities may be relied upon. These language databases may be preestablished. Alternatively, they may be trained and updated in real-time by scraping data resources such as news and other political records. Additionally, in cases where these entities differ over time and/or by location, such information may also be accommodated and recorded. Similar tools may be used for query information extraction, with the exception that query extraction in a specific domain might be linked to specific language databases.

Query embedding 312: The query embedding component 312 of the query processing pipeline 300 of FIG. 3 may be configured as an alternative or additional component to the query information extraction component 310 for performing matching of the input query with pre-computed signals. Query embedding constitutes an alternative way to represent or embed relevant information contained in the user query in a vector-space representation in contrast to and thus supplementing the structured format used by the query information extraction component 310. Query embedding may be generally implemented as one of two different example methodologies: syntactic embedding and semantic embedding. In some example implementations of the syntactic embedding techniques, relevant syntactic information may be extracted from the user query (as processed by the various preceding components described above) and then may be converted to a vector representation using various embedding techniques. On the other hand, semantic embedding techniques may be used to focus more on semantic information (i.e. the meaning of the words) and embedding that information (e.g. using pre-trained language models to find the query embedding). The matching of the input query and the pre-computed signal may be based on the syntactic embedding, the semantic embedding, or a combination (or hybrid) thereof. The term “hybrid” is used to refer to employing both techniques together in one pipeline. For instance, both syntactic and semantic embedding could be used simultaneously, and the better result may be adopted or combined. Alternatively, both embedding techniques could be run sequentially. The embedding of a processed input query may be generated by an embedding model (either a syntactic embedding model, a semantic embedding model, or a hybrid model). Such a model may be trained using a natural language model and is capable of providing embedding vectors for any input query.
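
As one possible hedged illustration, semantic query embedding and matching against precomputed signal embeddings could be sketched as follows; the sentence-transformers model named below and the example signal names are assumptions, as the disclosure leaves the choice of pretrained language model open.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
query_vec = model.encode("will atmospheric temperature in phoenix rise by 2030")
signal_vecs = model.encode(["atmospheric temperature", "sea level", "crude oil price"])

# Cosine similarity of the query embedding against precomputed signal embeddings.
scores = signal_vecs @ query_vec / (
    np.linalg.norm(signal_vecs, axis=1) * np.linalg.norm(query_vec)
)
print(int(scores.argmax()))  # index of the best-matching precomputed signal
```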

In the example implementations above, the query information extraction component 310 and query embedding component 312 are configured in parallel with each other and both of them generate some representation of the user query. The output of either one or both of these two components forms a formalized query or queries, or “formalized question or questions,” as input(s) to various downstream components of the pattern machine system (e.g. the signal processing subsystem 220 of FIG. 2), as will be described in further detail below. The two forms (either structured query or embedding) may be complementary to each other in some cases and may collectively provide more accurate input to the subsequent processing components.

As described above and illustrated in FIG. 3, the example query processing pipeline 300 may be configured with the following characteristics:

Data analytics Implementations: The various components of the query processing pipeline 300 may employ any combination of data processing procedures performed on the input query and intermediary forms of the input query. For example, the query preprocessing 302 may involve one or more of: text normalization, text stemming/lemmatization, text tokenization, and/or text cleaning (removing unwanted characters, emojis, non-ascii characters, etc.). Various existing and custom-designed text analytics libraries may be employed. For another example, the query type processing 304 may perform profanity filtering using any combination of custom-built and/or open-source pre-trained models and may detect out-of-scope or biased questions/queries using custom-built and trained models that may or may not be based on various publicly or privately available libraries or other custom libraries. Query expansion as part of query information extraction 310 may be performed, for example, based on adding syntactically similar words to the query using various public, private, and/or custom libraries, and/or based on adding semantically similar words to the query using any pre-trained word-embedding models. The query de-duplication 308 and the query embedding component 312, for example, may be performed also based on functions and models provided via various libraries. The query extraction component 310, for example, may be performed based on various Keyword and Keyphrase data extraction analytics such as NER, EMD, SRL, Open/Closed information extraction (OIE/CIE), Intent recognition, and/or the like. Various information extraction libraries/tools may be used individually or in combination to implement these procedures.

Query Processing Pipeline Input: The input to the query processing pipeline 300 constitutes a free-form user query and other supplemental user inputs as described above through the interactive graphical user interface. The initial user query, for example, may be input in text format or alternatively may be converted into text format in case the user provides the information through a selection of suggested values or options provided in the interactive user graphical interface or through other forms (such as voice input).

Query Processing Pipeline Output: The output of the query processing pipeline 300 may be either one of or a mix of outputs of the query information extraction component 310 and the query embedding component 312. As described above, the output of these two components forms the formalized question as input to the downstream components. When used in the mixed and/or hybrid manner, they form complementary inputs to the downstream components of the pattern machine system.

Signal Processing

FIG. 5 shows an example overall architecture of a signal processing component 500 of the signal processing portion of the signal/event processing subsystem 220 of FIG. 2. The signal processing component 500 may include a signal matching engine 502 and a metadata retriever 504 for processing the query information extraction output 510 and query embedding output 520 from the query processing data/logic flow 300 to generate retrieved signals and their metadata 530.

As such, the input to the signal matching pipeline is part of or all of the formalized output of the query processing component 300 above which may include either one or a combination of the extracted information from the expanded user query in a structured format (output of the query information extraction component 310) and/or also in a vectorized format (output of the query embedding component 312).

The signal matching engine 502 may constitute the main component for matching primary entities mentioned in the formalized user query to one or more pre-computed signals. Matching signals among the pre-computed signals may be identified by querying a signal database and a knowledge graph.

The knowledge graph may be configured to store information with regard to different signals, events, features, and/or other relevant information entities, and the relationships therebetween. The knowledge graph thus may contain signals, events, features, relationships, and other entities that are related or connected to one another in various degrees. The signals, events, features, and other information entities may constitute various nodes in the knowledge graph whereas the relationships may constitute the edges. As such, given a particular signal in the input formalized query, a set of related signals may be identified by querying the knowledge graph. The related signals may include primary signals, secondary signals, additional signals, and any number of properties, features, and/or conditions thereof. The primary signals may include signals within the signal knowledge graph that are already contained in the formalized user query, whereas the secondary signals may include signals that are related to the primary signals according to the knowledge graph. These related secondary signals with respect to the signal being queried may be connected to the queried entity directly or indirectly in the knowledge graph such that a connection may be described by a degree of connection and/or confidence interval. For example, an indirect connection between two signals via other entities (or nodes in the knowledge graph) may be a connection of less confidence compared with a direct connection between the two signals. A direct connection between two signals in the knowledge graph corresponds to a known edge, which together with the two signals forms a triple in the knowledge graph. Additionally, the knowledge graph may store multiple connections to varying degrees to form nodes and the relationships between them by various degrees of relevancy. The knowledge graph thus may include a collection of known signals, events, and other entities and their properties and various relationships and known triples.
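
Merely as a non-limiting illustration, the retrieval of primary and secondary signals with a confidence that decays with the degree of connection could be sketched over a toy knowledge graph as follows; the graph contents, the networkx representation, and the 1/degree confidence heuristic are editorial assumptions rather than the disclosed scoring method.

```python
import networkx as nx

# Toy knowledge graph: nodes are signals/events/entities, edges are labeled relationships.
kg = nx.Graph()
kg.add_edge("atmospheric_temperature", "environmental_law", relation="correlated_with")
kg.add_edge("environmental_law", "carbon_tax_rate", relation="affects")
kg.add_edge("atmospheric_temperature", "energy_demand", relation="drives")

def related_entities(primary_signal, max_degree=2):
    """Entities reachable from a primary signal, scored so that confidence decays
    with the degree of (in)direct connection."""
    degrees = nx.single_source_shortest_path_length(kg, primary_signal, cutoff=max_degree)
    return {node: 1.0 / degree for node, degree in degrees.items() if degree > 0}

print(related_entities("atmospheric_temperature"))
# {'environmental_law': 1.0, 'energy_demand': 1.0, 'carbon_tax_rate': 0.5}
```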

In some implementations, the knowledge graph may be constructed to store pre-computed entities and their relationships. Such a knowledge graph may be constructed and updated separately for each domain or may be constructed integrally for multiple domains in various optional forms as explained in further detail below. The data sources for generating and updating the knowledge graph may include public and/or private data collections. As described in further detail below, the public and private/personal data sources may be scraped periodically or in real-time to keep the knowledge graph up to date. In some implementations, the knowledge graph may be configured to self-expand, self-correct, and to dynamically categorize information over time. The knowledge graph may optionally contain a layer of machine learning and/or be configured as an artificially intelligent network to be predictive in capacity, to autonomously update, to dynamically re-structure the knowledge graph itself, and/or to self-correct. In this manner, the knowledge graph may identify or predict previously unknown entities and/or new connections/relationships between known or unknown nodes over time.

Once various primary and secondary signals associated with the user query are identified by the signal matching engine 502 from the formalized query and/or the knowledge graph, the metadata retriever 504 may be invoked for obtaining meta-information attached to each matched signal according to the signal database. Such meta-information or metadata may be used by the prediction engine component (230 of FIG. 2) to build a set of datasets for prediction. Such metadata may include but is not limited to tabular datasets (e.g. csv files) which contain information about specific signals, textual information related to the signals, and the like. In some implementations, a metadata item or dataset may be associated with or correlated to auxiliary information of a signal such as a timestamp, a location (e.g., geographical location), and the like. The corresponding prediction model may contain temporal and/or location models that consume such timestamp and location metadata information.

In some implementations, the meta-information database may be implemented as any type of relational or non-SQL database, or a combination thereof. For example, tabular meta-information or metadata may be stored as XML or CSV files or the like. Unstructured datasets such as text, image, JSON files, and the like that are associated with the signals may also be stored in non-SQL formats.

The output of the signal/entity processing component 500 may include the primary and secondary related signals as well as their metadata, represented by 530 in FIG. 5. The output entities/signals and their metadata 530 may then be processed further by the prediction engine 230 in FIG. 2 described above and in further detail below in relation to FIG. 6.

Event Processing

An example overall architecture of an event processing component of the event processing portion of the signal/event processing subsystem 220 of FIG. 2 is further described below. The event processing component may include an event matching engine and an event information retriever for processing the query information extraction output 310 and query embedding output 312 of the query processing subsystem 300 of FIG. 3 to generate retrieved related events and their various characteristics or properties.

As such, the input to the event matching pipeline is part of or all of the formalized output of the query preprocessing data flow and logic 300 above, which includes the extracted information from the expanded user query in both or either a structured format (output of the query information extraction component 310) and/or also in a vectorized format (output of the query embedding component 312).

The event matching engine may constitute the main component for matching event information contained in the formalized user query to the knowledge graph and to an event database containing pre-computed events. Matching events among the pre-computed events in the event database and the knowledge graph may be identified by querying the event database and the knowledge graph. The knowledge graph may be the same precomputed knowledge graph that contains entities including both the signals and events and their relationships.

Again, the knowledge graph may be configured to store information with regard to different signals, events, features, other entities and the relationships therebetween. The knowledge graph thus contains events in addition to signals and other entities that are related to one another. As such, given a particular event or signal in the input formalized query, a set of related events in addition to related signals may be identified by querying the knowledge graph. The related events may include primary events and secondary events. The primary events may include the events within the knowledge graph that are already contained in the formalized user query, whereas the secondary events may include the events that are related to the primary events, primary signals, secondary signals, event features, or signal features. These related secondary events may be connected to the queried primary events, primary signals or secondary signals directly or indirectly, such that a connection may be described by a degree of connection and/or confidence interval. For example, an indirect connection between a primary event and another event or signal via other entities/relationships in the knowledge graph may be a connection of less confidence compared to a direct connection between the primary events and other entities. A direct connection in the knowledge graph may be represented by an edge, which, together with the two related entities, forms a triple in the knowledge graph. Additionally, the knowledge graph may store multiple connections of varying degrees to form nodes and the relationships between them by various degrees of relevancy. The knowledge graph thus includes a collection of known entities and their features, properties, and/or conditions and relationships, and known triples.

Again, the knowledge graph may be constructed to store pre-computed signals, events, other information entities, and their relationships. Such a knowledge graph may be constructed and updated separately for each domain or may be constructed integrally for multiple domains in various optional forms. The data sources for the knowledge graph may include public and/or private data collections. As described in further detail below, the public and private/personal data sources may be scraped and updated periodically or in real-time to keep the knowledge graph up to date. In some implementations, the knowledge graph may be configured to self-expand, self-correct, and dynamically categorize information over time. The knowledge graph may optionally contain a layer of machine learning and/or be configured as an artificially intelligent network to be predictive in capacity, to autonomously update and dynamically re-structure the knowledge graph itself, and/or to self-correct. In this manner, the knowledge graph may identify or predict previously unknown new relationships between known or new unknown entities over time.

Once various events associated with the user query are identified by the event matching engine from the knowledge graph, the event information retriever may be invoked for obtaining or deriving characteristic, property, and/or conditional information attached to each matched event and stored in an event database. Such characteristic information may be used by the prediction engine component (230 of FIG. 2) to build a set of datasets in addition to the signal datasets 530 of FIG. 5 for training the prediction model. Such characteristic event information may include but is not limited to tabular datasets (e.g. csv files) which may contain information about specific events, textual information related to the events, and the like. In some implementations, a characteristic information item or dataset for an event may include event impact information, timeline and timestamp information, likelihood of the event, and the like. Additionally, such characteristic information for an event may include conditions, features, and other properties that may indicate causality, conditionality, or necessity for an event to occur in part and/or in entirety, along with information that may assist in measuring the probability of such occurrences.
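
As a hedged illustration only, an event information retriever could be sketched as below; the event-database schema, the example characteristics, and the naive assumption that condition probabilities are independent are all editorial assumptions rather than the disclosed method.

```python
# Hypothetical event-database rows keyed by event identifier.
EVENT_DB = {
    "environmental_law_passage": {
        "impact": {"atmospheric_temperature": -0.3},
        "timeline": "2025-Q2",
        "base_likelihood": 0.35,
        "conditions": {"committee_approval": 0.8, "majority_floor_vote": 0.55},
    },
}

def retrieve_event_info(event_ids):
    """Fetch characteristics of matched events and fold condition probabilities into
    a naive overall likelihood (conditions treated as independent for illustration)."""
    results = {}
    for event_id in event_ids:
        info = dict(EVENT_DB[event_id])
        likelihood = info["base_likelihood"]
        for probability in info["conditions"].values():
            likelihood *= probability
        info["occurrence_likelihood"] = round(likelihood, 3)
        results[event_id] = info
    return results

print(retrieve_event_info(["environmental_law_passage"]))
```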

In some implementations, the event database may be implemented as any type of relational or non-SQL database, or a combination thereof. For example, tabular event information may be stored as XML or CSV files or the like. Unstructured datasets for events such as text, image, JSON files and the like may be stored in non-SQL format.

In some implementations, the event database and the signal database may be implemented as separate databases. In some other implementations, the event database and the signal database may be integrated as a single database.

In some implementations, the event information, e.g., likelihood information, may be predictive, and may be obtained and pre-computed using separate prediction data analytics associated with the event information retriever. The prediction may be based on changing or evolving data scraped or crawled from various public, private, and personal data sources. The event information and metadata may be stored in the event database. The event database may thus be automatically updated in real-time or at least periodically.

The output of the event processing component as part of the signal/event processing subsystem 220 of FIG. 2 may include related events. The output may additionally include impact, timeline and likelihood of future occurrences, and/or the like of the related events and other relevant event-related information. The output events and their characteristic information may then be further processed along with the signal output 530 of FIG. 5 by the prediction engine 230 in FIG. 2 described above and in further detail below in relation to FIG. 6.

Prediction Engine

FIG. 6 shows an example implementation 600 of the prediction engine 230 of FIG. 2 for generating a prediction in response to the user query. The prediction engine 600, which may constitute the convergence point of the pattern machine architecture, uses the information from all the upstream components described above that are related to the real-time updated and precomputed signals and/or events matched with the input user query to intelligently generate a prediction for the input user query. The prediction engine 600, for example, may be configured to run one or more simulation models and/or other AI models in order to forecast multiple outcomes in the future.

In the example implementation of FIG. 6, the input to the prediction engine 600 may include the signals 602 and events 604 generated by the signal processing component 500 and event processing component of the pattern machine, respectively, as well as, optionally, any other relevant information extracted from the knowledge graph, as described above. The prediction engine 600 may include a prediction component 610 and a simulation engine 620. The prediction component 610, for example, may be configured to generate signal predictions by using various models selected by a model fetching component 630 from a model database 640 to process the input signals 602. The prediction of the signal and/or signals may include qualitative and/or quantitative predictions of the signal. For example, the prediction of the signal and/or signals may include predicted future values, numerical trends, and contextual information of the signals. The simulation engine 620 may process the predictive output from the prediction component 610 and the events output from the event processing component described above to conduct a simulation of the signals and the events and generate one or more simulation outputs 660, e.g., the most probable outcome (occurrence) of the predictions associated with the one or more extracted signals and/or events. This may include the most probable outcome of the impact on and among the one or more signals and/or events together.
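
Merely as a non-limiting illustration, model fetching and signal prediction could be prototyped as follows; the two-entry model library, the RMSE-based selection rule, and the example data are editorial assumptions standing in for the richer model database 640 and selection analytics described herein.

```python
import numpy as np

# Two-entry stand-in for the model library; each entry fits a model to signal history.
MODEL_LIBRARY = {
    "linear_trend": lambda t, y: np.poly1d(np.polyfit(t, y, 1)),
    "quadratic_trend": lambda t, y: np.poly1d(np.polyfit(t, y, 2)),
}

def fetch_and_predict(timestamps, values, horizon):
    """Pick the library model with the lowest in-sample RMSE (a stand-in for the
    model fetching analytics) and extrapolate the signal over the horizon."""
    t, y = np.asarray(timestamps, float), np.asarray(values, float)
    best_name, best_model, best_rmse = None, None, np.inf
    for name, fit in MODEL_LIBRARY.items():
        model = fit(t, y)
        rmse = np.sqrt(np.mean((model(t) - y) ** 2))
        if rmse < best_rmse:
            best_name, best_model, best_rmse = name, model, rmse
    future_t = np.arange(t[-1] + 1, t[-1] + 1 + horizon)
    return best_name, list(zip(future_t, np.round(best_model(future_t), 2)))

print(fetch_and_predict([2018, 2019, 2020, 2021], [30.1, 30.4, 30.9, 31.2], horizon=3))
```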

The model database 640 may include a library of pre-computed models containing various prediction models that may be selected by the model fetching component 630 for use by the prediction component 610. The model fetching component 630 may determine appropriate models and/or algorithms and/or mathematical equations by performing analytics on the input signal 602. The fetched models may be used by the prediction component 610 for qualitative prediction or to quantify the future values and predicted evolution of the signal. The model library 640 may further contain separate or integrated models/equations/algorithms for estimating margins of error, confidence intervals, Root-Mean-Square Error (RMSE), and the like for the signal prediction. The model library 640 may additionally include models/equations/algorithms for estimating occurrence probability and likelihood of events, and these models/equations/algorithms may be invoked by the simulation engine 620 for determining the predicted most probable outcome of the event(s) 604 extracted from the input query. The various models/equations/algorithms above may include but are not limited to neural networks, regression algorithms, probabilistic formulae, statistical models, and the like.
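
As a purely illustrative sketch (not the disclosed implementation), the interplay between the model database 640, the model fetching component 630, and the prediction component 610 might be approximated as follows; the function names, the in-memory dictionary standing in for the model database, and the example signal data are assumptions for illustration only.

```python
# Illustrative sketch of model fetching and signal prediction (names and data
# are hypothetical; the disclosure does not prescribe specific libraries).
import numpy as np
from sklearn.linear_model import LinearRegression

# A toy "model database": pre-computed models keyed by signal name.
MODEL_DATABASE = {}

def precompute_model(signal_name, timestamps, values):
    """Train and store a simple trend model for a signal (stand-in for a
    pre-computed model in the model database 640)."""
    model = LinearRegression()
    model.fit(np.asarray(timestamps).reshape(-1, 1), np.asarray(values))
    MODEL_DATABASE[signal_name] = model

def fetch_model(signal_name):
    """Stand-in for the model fetching component 630."""
    return MODEL_DATABASE.get(signal_name)

def predict_signal(signal_name, future_timestamps):
    """Stand-in for the prediction component 610: quantitative future values."""
    model = fetch_model(signal_name)
    if model is None:
        return None
    return model.predict(np.asarray(future_timestamps).reshape(-1, 1))

# Example: precompute a model for a hypothetical signal and forecast it.
precompute_model("carbon_emissions", [2018, 2019, 2020, 2021], [36.0, 36.4, 34.8, 36.3])
print(predict_signal("carbon_emissions", [2025]))
```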

In some implementations, the prediction engine above may be used for precomputation of signal predictions and/or event predictions for the signals and events stored in the knowledge graph or signal and/or event databases in real time. Such predictions may be qualitative or quantitative. Such predictions may be associated with various properties of the signals or events. In other words, predictions of quantitative or qualitative future values or behaviors of the signals or events may depend on their properties. The predictions may be performed in real-time by automatically selecting and executing appropriate models from the model library 640. The predictions may be updated with timestamps in the signal and/or event databases.

The simulation engine 620 processes the input event information 604 and the predicted signal information from the prediction component 610 to generate the simulated output 660. The simulation engine may overlay the semantic connections and qualitative information that has been identified as relevant to the initial formalized query with the numerical predictions and quantitative information that has been identified and calculated as relevant to the initial formalized query (e.g., by the prediction component 610). Thus, the simulation output may be a projection rendered by any combination of the various signals, events, features, conditions, timelines, locations, relationships between different entities, and/or any other information items relevant to the formalized query. The prediction from the prediction component 610 as an input to the simulation engine may contain any combination of quantitative and/or qualitative predictions. For example, the simulation engine may generate a projection that indicates an impact of a signal on certain events through an evaluation of the properties, features, and conditions of the events as affected by the signal and the signal's properties, features, and conditions. A margin of error may be additionally generated. Although the simulation may generate a list of possible future scenarios, the final result presented to the user may be the most likely future scenario or scenarios with the best confidence rating of the prediction. In some optional implementations, the prediction output and/or the simulation results may be interactively provided to the user so the user can interactively modify them, if needed, to generate the final prediction output, as shown by 670 in FIG. 6. The simulation engine may also generate the simulation output 660 in the format of various real-time reports in the forms of, e.g., spreadsheets, data tables, graphs, charts, and the like. The system may contain a database of formats and may be configured to intelligently select or suggest the optimal formats for the reports based upon the nature of the prediction output by the simulation engine 620. The graphical user interface may be configured to provide the user with the capability to download and/or store the reports. The system may be configured to update the predictive answer and report automatically with changing circumstances. Examples of changing circumstances include but are not limited to change of any of the underlying datasets in the pipelines above for generating the answer and report from the input query, updates in the various databases (such as the events database and/or the signals database) and knowledge graphs, and updates of various analytics models used in the entire pattern machine pipelines. The update of the answers and reports may be automatically triggered by these changes or may be performed periodically or in real-time. In some implementations, the triggering may be configured to effectuate an update, for example, after a threshold amount of changes of circumstances has been detected.
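
The overlaying of event likelihoods onto a quantitative signal prediction described above can be sketched, under simplifying assumptions, as enumerating candidate scenarios and keeping the most probable one. The additive impact model, field names, and example figures below are hypothetical and only illustrate the general idea.

```python
# Minimal, illustrative sketch of a simulation step that combines a signal
# prediction with event likelihoods to rank future scenarios.
from itertools import product

signal_prediction = {"name": "carbon_emissions", "value": 35.1, "margin_of_error": 0.8}

# Candidate events with likelihoods and an assumed additive impact on the signal.
events = [
    {"name": "new_regulation", "likelihood": 0.7, "impact": -1.5},
    {"name": "economic_downturn", "likelihood": 0.3, "impact": -0.9},
]

scenarios = []
for occurrence in product([True, False], repeat=len(events)):
    prob, value = 1.0, signal_prediction["value"]
    for happened, ev in zip(occurrence, events):
        prob *= ev["likelihood"] if happened else (1.0 - ev["likelihood"])
        if happened:
            value += ev["impact"]
    scenarios.append({"events": occurrence, "probability": prob, "projected_value": value})

# Keep the most probable scenario as the simulation output shown to the user.
best = max(scenarios, key=lambda s: s["probability"])
print(best, "+/-", signal_prediction["margin_of_error"])
```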

In some general implementations, the formalized output from the query processing flow may contain information about one of the signals, events, or both. The simulation engine above may be configured to automatically react, depending on the scope of the output from the query processing flow to generate the answer.

Answer Generation

FIG. 7 illustrates an example implementation 700 of the answer generator subsystem 240 of FIG. 2. The answer generation component 700 of the pattern machine may be configured to summarize the prediction and simulation results from the prediction engine 600 to generate an output to the user via the graphical user interface that constitutes a human readable answer (such as any combination of a textual and/or verbal natural language answer, charts, graphics, other illustrative visuals, and the like). The output answer may be optionally formatted based on user requirements, preferences, or selections. In addition to a contextual answer to the query, the answer generation component 700 may also generate various real-time reports, in the forms of, e.g., spreadsheets, data tables, graphs, charts, and the like. The graphical user interface may be configured to provide the user with the capability to download and/or store the reports. Importantly, the answer generation component 700 may be configured to update the answer and the report automatically with changing circumstances. As described above, examples of changing circumstances include but are not limited to change of any of the underlying datasets in the pipelines above for generating the answer and report from the input query, updates in the various databases (such as the events database and/or the signals database) and knowledge graphs, and updates of various analytics models used in the entire pattern machine pipelines. The update of the answer and reports may be automatically triggered by these changes or may be performed periodically or in real-time. In some implementations, the triggering may be configured to effectuate an update, for example, after a threshold amount of changes of circumstances has been detected.

As shown in the example in FIG. 7, the answer generation component 700 may include an answer formatting engine 710 for processing the prediction results 702 and the simulation results 704 generated by the prediction engine 600 of FIG. 6 to generate a formatted answer and report with respect to the user query and to provide the user with the formatted answer and report via the graphical user interface 720.

The answer generation component 700 may further include a natural language generation (NLG) component 730 for processing the prediction 702 and for generating an intermediate answer 740 in natural language form prior to being formatted by the answer formatting engine 710. For example, the NLG component 730 may be configured to convert the prediction results 702 into a natural language answer. NLG techniques may fall into template-based and/or model-based categories. Template-based NLG techniques may utilize some natural language templates 750 and inject the prediction results into the templates to create the intermediate answer 740, whereas model-based NLG techniques may utilize NLP (Natural Language Processing) models to generate the intermediate answer 740.
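
A minimal sketch of the template-based NLG path described above is shown below; the template text, dictionary keys, and example values are illustrative assumptions rather than the actual templates 750.

```python
# Illustrative template-based NLG step: inject prediction results into a
# natural language template to form the intermediate answer 740.
TEMPLATES = {
    "signal_forecast": "Based on current data, {signal} is projected to reach "
                       "{value:.1f} {unit} by {year} (confidence: {confidence:.0%})."
}

def generate_intermediate_answer(prediction, template_id="signal_forecast"):
    """Fill the selected template with the prediction results."""
    return TEMPLATES[template_id].format(**prediction)

print(generate_intermediate_answer({
    "signal": "carbon emissions", "value": 33.6, "unit": "Gt CO2",
    "year": 2025, "confidence": 0.82,
}))
```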

The answer formatting engine 710 may be configured to generate the final answer and report in a format that is convenient for the user to view and to download. Examples of the answer and/or report format for the user may include PDF and Word documents. Users may also specify their desired format as long as it is supported by the answer formatting engine 710. The answer formatting engine 710 may create a user interface or device-compatible format of the answer that can be passed to the graphical user interface (e.g., JSON format). The various reports generated by the answer formatting engine 710 may include but are not limited to, for example, Excel tables, Tableau graphs, and charts.

Query Recommendation

Returning to the query recommendation subsystem 270 of FIG. 2, an example data/logic flow of such a subsystem is illustrated as 800 in FIG. 8. The query recommendation data/logic flow 800, for example, may be configured to recommend follow-up queries to the user based on inputs such as the user's input query, the user's historical queries, historical queries from all the users, the knowledge base, and optionally based on the answer generated by the answer generation subsystem described above. Query recommendation provides the user with optional suggested follow-up queries for exploring additional information related to the original query and/or the answer to the original query. The user may choose from the suggested queries listed, for example, as part of the answering page on the graphical user interface, to run a new query. The selected new query may then be processed by the knowledge pattern machine to generate further predictive answers and reports.

The recommendation component 806, for example, may be configured to recommend queries of further interest to users. As described above, this component may generate one or more types of query recommendations to the user. The recommendation component 806 may be configured to identify historical user queries from the historical query database 808 that are either syntactically or semantically similar to the current query and recommend them to the user to explore. It may also create new queries by adding variations to the current user query as well as by generating relevant queries based on the answer of the current query. The query database 808 may include historical queries as well as their metadata. The knowledge graph 842 may be utilized by the recommendation component 806 to identify relevant entities and relationships based on the user query to facilitate the identification of similar historical or new queries for recommendation. The recommendation component 806 may further utilize the query/question templates 804 to ensure that it recommends queries that are compatible with the templates and in a completed, formalized format so that no additional user refinement may be necessary. As such, the queries presented by the recommendation system may be considered already formalized, and as such may result in an immediate predictive answer once selected by the user, if the user does not select to optionally modify the recommended query.

In some specific implementations, the query recommendation data/logic flow 800 may be configured to provide the following example types of query recommendations.

Based on the user's initial query, the query database (the formalized historical queries in the database, which may or may not be embedded in a vector model), and optionally, the events and signals databases and the knowledge graph, the recommended queries may include new suggested queries that: A) are relevant to one specific focus within the initial question (i.e., narrow the scope of the original query), B) are relevant for expanding in some way upon the specific topic of the initial query, C) are related to past queries a single user has entered, if the user has a saved profile or has selected this option within their user profile settings, and D) are popular queries that are currently trending among multiple users within the PM system as a whole. The system may automatically recommend these new queries to the user.

(1) Given the input user query, the system may provide additional queries to the user based on the user’s initial query, the query database (the formalized historical queries in the database which may or may not be embedded in a vector model), and optionally, the events and signals databases and the knowledge graph. Recommended queries may include historical queries that are either semantically or syntactically similar to the user query and/or queries that address a different aspect of the initial query;

(2) Given the input user query, the system may provide more focused/narrower queries to the user based on the knowledge graph and signals and events databases;

(3) The system may recommend further queries to the user by adding a variation/expansion to the current query. Variations can be implemented by selecting different signals, signal features, events, event features, or any other related entity that may be considered relevant to yet different from the initial user query. Additionally, variations may be implemented by changing verbs/adjectives in the query or any other technique that may help to create a valid variation of the user query.

(4) The system may recommend queries based on the answer to the initial input user query. These recommendations are relevant to the answer generated by the pattern machine system and may help users further explore the initial answer by asking more relevant queries.

(5) The system may also recommend queries previously saved in the user profile and/or popular queries that are trending within the pattern machine as a whole.

The query recommendation process 800 as depicted in FIG. 8 may use any of the formalized query 802 (output of the query processing data/logic flow 300 as described above) and the answer 803 generated by the answer generator subsystem 240 of FIG. 2 or 700 of FIG. 7 as input to generate recommended queries 810 using a recommendation component 806 that communicates with question templates 804, the historical query database 808, and knowledge graph 842 to pull information that is either generic or user specific. Additionally, the signals and events databases may be used. For example, for recommendation Types (1)-(3) above, a combination of semantic embedding and the knowledge graph, query database, signals database, and events database may be used to identify and suggest new queries that offer deeper insight into the topic of the original query. Semantic embedding may be used to identify correlative phrases or entire queries from within the embedding space. The knowledge graph 842 allows the system to identify and recommend new queries based on temporally relevant concepts and patterns within the same topic of the original question. For recommended queries that expand on a specific topic of the initial query, the system may identify relevant aspects of the signal(s) and/or event(s) present in the initial query and create a new query that is narrower in scope categorically. The opposite type of query may be additionally recommended, as a recommended query may also broaden the categorical scope of the initial query by relating the signal(s) and/or event(s) of the initial query to categorically broader signal(s) and/or event(s) via information extracted from the signals database, events database, and/or the knowledge graph. For another example, for recommendation Type (5) above, the query database 808 may be consulted. Specifically, the pattern machine system may store all query entries from all users, which allows the system to identify overall "trending" queries and patterns. This can be determined by popularity across different domains and topics.
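
For illustration only, the similarity-based portion of recommendation Type (1) might be sketched as below, using TF-IDF cosine similarity as a simple stand-in for whatever semantic embedding the pattern machine actually uses; the example queries and function are hypothetical.

```python
# Illustrative sketch: rank historical queries by similarity to the current
# formalized query and return the top candidates as recommendations.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

historical_queries = [
    "What regulations will reduce carbon emissions in the United States by 2025?",
    "How will electric vehicle adoption affect oil demand by 2030?",
    "Which renewable energy sources will grow fastest in Europe?",
]

def recommend(current_query, candidates, top_k=2):
    vectorizer = TfidfVectorizer().fit([current_query] + candidates)
    query_vec = vectorizer.transform([current_query])
    candidate_vecs = vectorizer.transform(candidates)
    scores = cosine_similarity(query_vec, candidate_vecs)[0]
    ranked = sorted(zip(candidates, scores), key=lambda x: x[1], reverse=True)
    return [query for query, _ in ranked[:top_k]]

print(recommend("What policies will lower carbon emissions by 2025?", historical_queries))
```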

The recommendation may be shown to the user via graphical user interface 820. Using the graphical user interface 820, the user may select any one of the suggested/recommended queries to run. The query recommendation user interface 820 may form part of the answer interface (e.g., it may be integrated in the same UI 720 of FIG. 7). The user may be allowed to optionally modify the recommended query via selection from pre-suggested multiple-choice options to narrow, broaden, and/or specify the scope of the recommended query. If the user does not modify the recommended query and merely proceeds to view the result of the recommended query, the resulting page will report the predictive answer to the recommended query. The user may further be allowed to modify configuration parameters related to the recommender system, including but not limited to the number of recommendations suggested to the user. In some implementations, the user may be provided first with recommended options of types/topics/categories of questions for selection and then provided with specific recommended questions in a subsequent query recommendation interface after a specific type/topic/category is selected by the user. In some implementations, the user may opt, via a preference setting, to have the system provide any of the recommendation types (1) through (5) and A) through D) above.

In some implementations, the recommended queries may be directly generated in a formalized form. As such, after user selection for further exploration, the selected query recommendation may not need to be preprocessed and formalized by the query preprocessing flow 300, reducing the time needed by the knowledge pattern machine to provide a follow-up answer.

Additional Implementations

Data Scraper

Data scraping, although not depicted explicitly in the various drawings above, may constitute an important component of the pattern machine. It may be used by the various components described above that rely on various datasets. For example, data scraping may be responsible for scraping publicly or privately available data from specific sources in various domains. The data sources, for example, may include various URLs on the Internet. The data sources may include other databases accessible by the pattern machine system. The scraped data from the various sources may be stored in formats that are usable for other components of the pattern machine. The scraped data can be used for building the various databases and knowledge graphs described above, and for training, testing, and/or validating the various intelligent models used in the various data analytics components described above. For example, the knowledge graph, the SQL and/or non-SQL databases, the events database, and the signals database may all be based on the datasets generated by the data scraper.

Data scraping may be performed periodically, in real-time, or may be triggered by changes of data in relevant data sources. In some implementations, it may be important to ensure that the data scraping is updated frequently. In some implementations, it may also be important for the system to automatically ascertain the reliability of the data sources and to perform fact checking before the data is scraped and used by the pattern machine system.

In some implementations, the data scraper may be implemented as a web data scraper. The web data scraper may be configured to provide the following functionalities:

  • Given a URL, crawling content of the URL and returning all identifiable data in a machine-readable format (e.g., JSON).
  • Using existing web crawl datasets to determine whether the data scraper needs to crawl a particular URL or use the crawled data from the existing datasets.
  • Parsing information from the crawled datasets and updating the various knowledge graphs and databases above.
  • Controlling the crawling frequency to ensure that the most up-to-date datasets are available to the pattern machine according to the application requirements and characteristics of particular domains.
  • Extracting and retrieving metadata from datasets and data sources, including information such as timestamps and location.

FIG. 9 shows an example architecture for a web scraper 900. In accordance with the description of its example functionalities above, the web scraper 900 may include a data crawler (or web crawler) 910 and parser 920. The data crawler 910 may be in communication with data sources 930 (represented by URLs). The data crawler 910 may return results in a machine-readable format which can be used by the parser 920. Also, it may be able to use the existing crawled datasets to fetch other known datasets, if available. The crawled datasets may represent a repository containing an existing crawl of the web data. It may contain datasets already crawled from billions of URLs that are updated, for example, every month. The crawled datasets can be used when the URLs to be scraped have already been crawled. The scraped data may then be processed by the parser 920 to transform the data into formats that are suitable for further processing into the various NoSQL and SQL databases 950 and knowledge graphs, as described above. The parsed datasets from the data crawler may be stored in various SQL and NoSQL databases 950 and knowledge graphs. In addition, any type of object storage techniques may be used for storing the raw files (images, videos, voices, etc.). The choice of databases used for storage may be based on the type of the dataset (e.g., whether the data is structured or unstructured).
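
A toy crawl-and-parse sketch consistent with the functionality list above is shown below, assuming the requests and BeautifulSoup libraries; a production scraper would additionally handle crawl scheduling, robots.txt, reuse of existing crawl repositories, and storage into the SQL/NoSQL databases 950.

```python
# Illustrative sketch: fetch a URL and return identifiable data plus metadata
# (including a timestamp) in machine-readable JSON form.
import json
from datetime import datetime, timezone

import requests
from bs4 import BeautifulSoup

def crawl(url):
    response = requests.get(url, timeout=10)
    soup = BeautifulSoup(response.text, "html.parser")
    record = {
        "url": url,
        "title": soup.title.string if soup.title else None,
        "text": soup.get_text(separator=" ", strip=True),
        "links": [a.get("href") for a in soup.find_all("a", href=True)],
        "crawled_at": datetime.now(timezone.utc).isoformat(),  # metadata timestamp
    }
    return json.dumps(record)

# Example (network access assumed): print(crawl("https://example.com"))
```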

Fact-Checking

The scraped data may be further automatically fact checked. A fact-checking component may be used at any point where scraped or new information is being input into the various databases (the knowledge graph, the signal database, and/or the event database). The fact-checking component may validate and improve reliability of the information being stored in these databases.

Fact-checking may be performed automatically on one or more various levels. For example, fact-checking may be performed according to data sources and/or datasets. In some implementations, data sources may be characterized by various reliability levels (e.g., by confidence interval, and/or degree of confidence and accuracy through vector embedding and/or clustering). Unreliable sources may be filtered out first. The filtering may be rule-based and/or may be based on trained models. Such filtering may be based on, e.g., a neural network model and/or vector embedding and/or clustering network that may evolve to accommodate new data sources. For example, the reliability of a data source may be evaluated based on factors including but not limited to the privacy profile of the source, the security profile, anonymity, and/or profile/credibility of the author of the source, citation of the source by others, the reliability of other sources cited by the data source, and the copyright profile of the data source. Various underlying models may be used to extract/quantify these factors. These models may be based on various language, regression, and/or numerical algorithms. These factors may be aggregated to form a clustering result signifying a reliability level of the data source.

For another example, fact-checking may be performed on the information contained in a data source itself. Such information may likewise be characterized in various reliability levels and/or embedded in a clustering model with confidence interval scores. Completely unreliable information may be further filtered out from the datasets. The filtering may be rule-based or may be based on trained models. Such filtering may be based on, e.g., a neural network model that may evolve to accommodate new information and/or vector embedding and/or a clustering model. For example, the reliability of a particular information item may be evaluated based on factors including but not limited to the frequency of the information item’s appearance across various reliable sources, the formality of the information item (grammar, use of language, and the like), the amount of supporting citations, and a flag of bias. Various underlying models may be used to extract/quantify these factors. These models may be based on various language, regression, and/or numerical algorithms. These factors may be aggregated to form a clustering result signifying a reliability level of the particular information item.

The various models above for fact-checking or determination/quantification of the various factors may be based on a supervised learning model using, e.g., a text classifier from labeled training data. Labeled data might have indices and features that facilitate the categorization of the information or sources as described above.

In some implementations, a clustering algorithm may be used to determine the reliability of a source or a particular information item as described above. For example, each information item that appears more than once with conflicting values may be compared by source, and calculations may be made to determine the reliability of one piece of the information item over another based on the frequency of mention and the confidence score of the information piece and its source. If there is conflicting information about a particular data item that is present across multiple trusted sources, then it may be recorded into the databases, but the data that is more frequently considered accurate across different mentions may be more likely to be validated over the data that is less frequently considered accurate. The clustering process may be applied to all data items for populating the signal database, the event database, and the knowledge graph.
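
For illustration, the frequency- and confidence-weighted resolution of conflicting values described above might be sketched as follows; the weighting scheme, data structure, and example values are assumptions and not the disclosed clustering algorithm.

```python
# Illustrative conflict resolution for one data item: each mention votes with a
# weight equal to its source reliability, so frequently repeated, well-sourced
# values accumulate higher scores and are more likely to be validated.
from collections import defaultdict

mentions = [
    {"value": "2025", "source_reliability": 0.9},
    {"value": "2025", "source_reliability": 0.7},
    {"value": "2027", "source_reliability": 0.4},
]

def resolve(mentions):
    scores = defaultdict(float)
    for mention in mentions:
        scores[mention["value"]] += mention["source_reliability"]
    validated_value = max(scores, key=scores.get)
    confidence = scores[validated_value] / sum(scores.values())
    return validated_value, confidence

print(resolve(mentions))  # ('2025', 0.8)
```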

Other measures may be taken for validating the accuracy and reliability of the information scraped by the data scraper or otherwise presented to the knowledge pattern machine. For example, the information may be processed through margin of error analysis and further confidence interval testing.

Fact-checking may be implemented continuously and automatically, even for data initially validated and populated into the various databases and knowledge graph, as data sources and other datasets related to these data items change or evolve over time.

Contextual Information Extraction and Knowledge Graph Creation

The creation of the knowledge graph above is described in more detail using the following example implementations. The knowledge graph, as described above, may be created for the following functionalities:

  • Store relevant information entities and their relationships, including but not limited to information related to entities, signals, events, signal features and/or other properties, event features and/or other properties, and their relationships in a format on which queries can be run with reasonable response time by the various components of the pattern machine as described above.
  • Update the existing knowledge graph based on new information as well as domain-specific ontologies, in real-time, periodically, or as triggered by changes in data sources. The update may be performed at different levels depending on knowledge graph type, domain, and/or types and domains within the knowledge graph.
  • Provide information by query about signals and events and their relationships as well as their metadata.
  • Identify correlations between all relevant triples (a triple is formed by two nodes and the relationship between them).
  • Match/find signals and events by query based on mathematical and semantic information.
  • Establish timestamps to maintain time-stamped versions of the knowledge graph, such that the nodes, relationships, and triples are all associated with timestamps, where pertinent.
  • Identify and detect patterns within the relating information present within the knowledge graph, and additionally identify and detect temporal patterns based on the chronological information that can be detected and generated via time-stamped metadata.

FIG. 12 shows an example architecture which can be used to build and update the knowledge graph. The creation of the knowledge graph may be based on a contextual information extraction engine to extract relevant information from trusted data sources before processing the data and storing it in the form of a knowledge graph.

Contextual data may be used as data sources for building the knowledge graph. The contextual data, for example, may include but is not limited to domain-specific data gathered from crawling multiple sources of data (e.g., web, documents), using, for example, the data scraper or crawler described above in relation to FIG. 9. The contextual data may be validated and fact checked at both the source level and the data content level, as discussed above. Also as described above, the contextual data as crawled and parsed may include data organized in multiple formats such as tables, text, images, videos, and graphs. Taking textual information as a non-limiting example, the contextual data may be fed into, e.g., various parallel (alternative or additional) processing flows for information extraction and for building the knowledge graph thereafter.

For example, one of the various parallel processing flows may include preprocessing the contextual information followed by entity recognition, and further followed by relation extraction. These processing steps may be responsible for extracting entities from the contextual data and the relationships therebetween. In some example implementations, the preprocessing of the contextual data may be similar to the pre-processing of the input user query by the component 302 in the query processing pipeline 300 of FIG. 3, and may be configured to apply some preprocessing techniques on the contextual data in textual format before the text is fed to the downstream components in creating or updating the knowledge graph. In some example implementations, the entity recognition step may be similar to the information extraction component 310 in the query processing pipeline 300 of FIG. 3. The entity recognition step, for example, may be configured to extract specific entities/signals and/or events from the preprocessed contextual data. The relation extraction step may be configured to extract relationships between different entities, signals, and/or events in the text after their recognition. The combination of these steps as one of the parallel processing flows may facilitate the extraction of information from the textual contextual data and before the information is used to build/update the knowledge graphs.
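
As a sketch only, the entity recognition and relation extraction steps of this flow might look like the following, using spaCy's pretrained pipeline as a stand-in (this assumes the en_core_web_sm model has been downloaded); the same-sentence co-occurrence heuristic is merely a placeholder for a real relation-extraction model.

```python
# Illustrative entity recognition and (placeholder) relation extraction that
# turns textual contextual data into candidate triples for the knowledge graph.
from itertools import combinations

import spacy

nlp = spacy.load("en_core_web_sm")

def extract_triples(text):
    doc = nlp(text)
    entities = [(ent.text, ent.label_) for ent in doc.ents]
    triples = []
    for sent in doc.sents:
        sent_entities = [ent.text for ent in sent.ents]
        # Placeholder relation: any two entities in the same sentence are
        # linked by a generic "related_to" edge.
        for a, b in combinations(sent_entities, 2):
            triples.append((a, "related_to", b))
    return entities, triples

text = "The United States announced new emission rules in 2023 affecting automakers."
print(extract_triples(text))
```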

Another example of the various parallel processing flows may include preprocessing of the contextual information followed by information extraction using transformer-based models.

A third example of the various parallel processing flows may include preprocessing of the contextual information followed by open/closed information extraction for distilling information from the textual contextual data. For example, open information extraction (OIE) techniques may be used to extract subject-predicate-object (SPO) triples from the processed textual contextual data, whereas closed information extraction (CIE) techniques may be used to extract a predefined set of entities and relationships from the textual contextual data as preprocessed.

In some implementations, the extracted information via any one of or any combination of the example parallel information extraction processes above may then be relied upon for building the knowledge graph. The knowledge graph may be built and/or updated by ingesting the extracted information. Building of the knowledge graph, for example, may further utilize other existing general or domain specific knowledge graphs and also domain-specific ontologies as a basis or starting point to build upon and to create a more complete and accurate knowledge graph for particular domains for the knowledge pattern machine to use. In some implementations, the knowledge graph, once being built, may be continuously updated based on the information extracted from the contextual data.

In some specific example implementations, existing knowledge graphs of external sources may be combined with domain specific ontologies for building an initial version of the knowledge graph suitable for the knowledge pattern machine. Other accessible knowledge graphs built on top of other information can also be used to facilitate the identification of events. Once the initial knowledge graph is built, it may be updated in real-time relying on newly scraped data and on updates of the previously scraped data.

The knowledge graph or knowledge graphs for the pattern machine may be generated in various formats and may be based on a knowledge graph platform using RDF format or labeled property graphs.
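
A small sketch of storing extracted triples in RDF form using the rdflib library is shown below; the namespace, predicate names, and example entities are illustrative assumptions.

```python
# Illustrative RDF storage of knowledge graph triples, including a timestamped
# relationship between a signal node and an event node.
from rdflib import Graph, Literal, Namespace, URIRef

PM = Namespace("http://example.org/pattern-machine/")
g = Graph()

signal = URIRef(PM["signal/carbon_emissions"])
event = URIRef(PM["event/new_regulation"])
g.add((signal, PM.affectedBy, event))
g.add((event, PM.observedAt, Literal("2023-06-01T00:00:00Z")))

# Query the graph for everything that affects the signal.
for subject, predicate, obj in g.triples((signal, PM.affectedBy, None)):
    print(subject, predicate, obj)
```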

The pre-processing step for the contextual data as described above may be configured based on the downstream flows of contextual information extraction. Various libraries may be used for preprocessing the contextual data in a manner similar to those used for query preprocessing 302 described above and in FIG. 3. The entity recognition step and relation extraction step may be implemented using, for example, named entity recognition techniques. Particularly for relation extraction, a combination of pattern-based techniques as well as neural-network-based techniques may be implemented. Models for joint entity and relation extraction may also be implemented.

In some implementations, the transformer-based information models referred to above, for example, may utilize one or more transformer-based language models (LMs). Pretrained LMs may be utilized to extract information from documents. The LMs may be further tuned based on custom relations and entities to be extracted. Other techniques may be used to build transformer-based information models.

In some implementations, the open/closed information extraction processes referred to above may be configured to extract entities and relationships from the query. Other various techniques and libraries/tools may be utilized to build this component. For more accurate extraction, custom models can be trained and tuned that are able to extract specific entities and relationships.

The building and updating processes for the knowledge graph may be configured to achieve at least two main tasks. The first task is to combine existing knowledge graphs to create an initial knowledge graph for use by other components of the pattern machine, as described above. The second task is to update the information in the knowledge graph using the extracted information from the contextual data crawled from the data sources, in real-time, periodically, or as triggered. The knowledge graph building and updating processes may be configured to use multiple information extraction techniques to reliably extract entities and relationships from the contextual dataset and add them to the knowledge graph. The process may additionally use the existing ontologies to populate the knowledge graph with domain-specific information.

The knowledge graph, either initially created or updated in real-time, may then be used by the various components of the knowledge pattern machine as described above.

For example, the knowledge graph may be used in any of the natural language processing components to identify whether or not a query is complete and to create recommended multiple-choice options for the user if the query is incomplete, as described above in the query refinement process of FIG. 4. This may be implemented through mapping the identified entities of a freeform user-typed query to associated/related entities within the knowledge graph and then suggesting certain entities that might be helpful to complete the query. For example, if the user asks “What regulations will be effective in reducing carbon emissions?” and it may be helpful to narrow the scope by identifying a timeline and location, the knowledge graph could be used to suggest timelines and locations relevant to the query for the user’s selection, such as “by 2025” and “in the United States.”

For another example, the knowledge graph may also be used to later identify relevant and related entire queries for further recommendation to the user, as described above in relation to the query recommendation process of FIG. 8. For example, once the user has received the answer to a question, the knowledge graph may be used to identify further related follow-up questions that may be of interest to the user.

For another example, the knowledge graph may be used to match the relevant signals and events to one another and to those extracted from the query, as described above in the signal processing flow of FIG. 5 and the event processing flow described above. The knowledge graph may identify signals and/or events and corresponding relevant signal/event features, properties, and/or conditions as nodes and identify relevant correlations between these different entities of information as the relationships between the nodes.

For yet another example, within the backend preprocessing of the knowledge pattern machine system, the knowledge graph may also be used as one of the sources for building the signal and events databases (SQL or non-SQL) used by various components of the knowledge pattern machine system. Specifically, when the scraped data is processed, the knowledge graph may be updated and, at the same time, the knowledge graph may be used in conjunction with various parameters to identify events and signals and the corresponding relevant features, conditions, and other properties, which are then matched with corresponding data and timestamps.

The knowledge graph may be built as a stateful knowledge base and may optionally be configured with a machine learning layer. For example, the knowledge graph may be built and configured to operate on a time series and be predictive in nature. In other words, the knowledge graph components may be associated with different versions with different time stamps. This means that the knowledge graph may be configured to record entities and their relationships at different points in time. As such, the knowledge graph would correlate information across time and thus correlate patterns over time (and thus become stateful). This design accommodates a history of updates in information recorded in the knowledge graph over time so that the categorization and structural information recorded in the knowledge graph can be dynamic. Once the knowledge graph has sufficiently developed over a large enough timescale, the patterns of correlating entities over time may be discovered and be used for the knowledge graph to be predictive, e.g., for numerical values of signals at future times and under hypothetical conditions. For another example, the knowledge graph may be configured to self-expand to predict new nodes, entities, and relationships based on time information. For example, when events or signals are represented by nodes, then the knowledge graph may be able to predict future signals and events through the prediction and creation of future nodes. In some implementations, for different time-versions, the knowledge graph may be stored as a series of versioned or timestamped knowledge graphs. Alternatively, the time information may be integrated as time series properties for the nodes. In some implementations, the knowledge graph may be constructed or configured so that the nodes themselves might function as different timestamps and the other corresponding information identified with a given timestamp might be related to the timestamp nodes.
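
A minimal sketch of the stateful, time-stamped aspect described above is given below: every assertion is versioned by timestamp so that the graph can be queried "as of" a point in time. The class, method names, and example data are hypothetical.

```python
# Illustrative time-stamped triple store: assertions are versioned so that the
# state of an entity/relationship can be recovered at any point in time.
class TemporalKnowledgeGraph:
    def __init__(self):
        # (subject, predicate) -> list of (timestamp, object), kept sorted.
        self._history = {}

    def assert_triple(self, subject, predicate, obj, timestamp):
        versions = self._history.setdefault((subject, predicate), [])
        versions.append((timestamp, obj))
        versions.sort(key=lambda version: version[0])

    def value_at(self, subject, predicate, timestamp):
        """Return the latest asserted value at or before the given timestamp."""
        latest = None
        for ts, obj in self._history.get((subject, predicate), []):
            if ts <= timestamp:
                latest = obj
            else:
                break
        return latest

kg = TemporalKnowledgeGraph()
kg.assert_triple("carbon_emissions", "annual_value_gt", 36.4, "2019-12-31")
kg.assert_triple("carbon_emissions", "annual_value_gt", 34.8, "2020-12-31")
print(kg.value_at("carbon_emissions", "annual_value_gt", "2020-06-30"))  # 36.4
```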

Pre-computed Models

As shown in FIG. 6, a model database 640 may be established for providing various intelligent models that may be used by the various components of the knowledge pattern machine. These models may be pre-computed or pretrained and stored in the model database 640. These pre-computed models may be capable of, for example, predicting the value for a target signal in FIG. 6 and thus be used by the prediction component 610. These models may be created using, for example, tabular datasets crawled from the web by the data scraper described above for each signal. The pre-computed or pre-trained models and other components associated with training these models may perform functionalities including but not limited to:

  • Extract/identify relationships between signals and events; aggregate and organize the signals related to a target signal. This may add quantifiable mathematics to semantics and identify training features for the pre-computed models.
  • Aggregate datasets related to a signal and combine them in a way that is useful for training the pre-computed models.
  • Run auto machine learning (AutoML) pipelines to build pre-computed machine learning models. The pipelines could include feature engineering, feature selection, model hyperparameter tuning, model evaluation, and deployment.

The final pre-computed models may capture various features and characteristics including mathematical and structural information and other correlations between and within datasets, signals, and events. The term "pre-computed" is used to indicate that the various models may be pre-trained.

In some implementations, the pre-computed model building process may start with the scraped data described above and end with storing the pre-computed model(s) in the model database 640 of FIG. 6.

The scraped data may first be extracted and organized according to various signals. Such extraction and organization may be based on information from the knowledge graph. Alternatively or additionally, a predefined set of signals and events may be used via API. This process can relate signals and events together and may help to identify features that can be used to train the various machine learning models. The extracted signals may be stored in the signals database.

The extracted signals and events may be organized into, for example, a tabular data format and time stamped. The time stamping can associate the signals and events with time in order to capture time features associated with the signals and events by the final pre-computed models. Each signal could have multiple related tabular datasets, which may be processed, combined, and converted into a format that can be used as training datasets for building/training the pre-computed models. A dataset builder may be configured to direct the extraction of information from the knowledge graph for generating the training datasets. In particular, the knowledge graph can help to identify relevant signals and events and may also provide information about how to guardrail specific signals. Guard-railing of events and signals may particularly help to avoid predicting unrealistic values for the events and signals.
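
For illustration, a dataset builder step of the kind described above might align the target signal and related signals on timestamps and apply a simple guardrail, as sketched below with pandas; the column names and figures are assumptions.

```python
# Illustrative dataset builder: merge time-stamped tabular datasets for a
# target signal and a related signal into one training set, then guardrail it.
import pandas as pd

target = pd.DataFrame({
    "timestamp": pd.to_datetime(["2019-12-31", "2020-12-31", "2021-12-31"]),
    "carbon_emissions": [36.4, 34.8, 36.3],
})
related = pd.DataFrame({
    "timestamp": pd.to_datetime(["2019-12-31", "2020-12-31", "2021-12-31"]),
    "energy_consumption": [580, 557, 595],
})

# Combine related tabular datasets keyed on timestamp; the result becomes the
# training dataset for the pre-computed model of the target signal.
training_set = target.merge(related, on="timestamp", how="inner")

# Simple guardrail: drop rows with physically implausible (negative) values.
training_set = training_set[(training_set.drop(columns="timestamp") >= 0).all(axis=1)]
print(training_set)
```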

The training dataset(s) generated by the dataset builder may then be used for generating machine learning models. Specifically, an automated machine learning (AutoML) component may be configured to perform, for example:

  • Data preprocessing: Pre-process the tabular data related to the signals and apply required changes to the data. Examples of data preprocessing techniques include converting signals, signals related to the target signal, and signal features to the correct type (e.g., numeric, categorical), filling in N/A values, and finding outliers. The features refer to additional properties (informational in both quantitative and/or qualitative respects) and/or additional signals related to the target signal that might accompany a target signal and/or be mapped to a specific signal within the knowledge graph(s).
  • Feature engineering and feature selection: Feature engineering and feature selection are two techniques that can augment the features in the dataset as well as select the most informative features to be used by the models for predictions. Feature engineering can be done both via full automation using AutoML feature generation solutions and via customization and manual feature generation. As for feature selection, different statistical methods as well as model-based techniques may be used to select the most important features for the predictions. Given enough computation power, this stage can be fully automated using AutoML solutions.
  • Model selection and hyperparameter tuning: Model selection and hyperparameter tuning may help to select the best model architecture that can predict the target signal values as well as fine-tune the parameters of that model to yield the best performance possible. Model architecture selection may heavily depend on the type of task. For forecasting the value of the target signals given all other related features and/or signals, regression models as well as recurrent models to capture the temporal aspect of the datasets, for example, may be selected. Hyperparameter tuning in machine learning (ML) refers to the process of finding the parameters of a specific model to make sure it has optimized performance. Techniques such as grid search, random search, and cross-validation methods may be used. In some implementations, Bayesian models may be used to create probabilistic models that not only predict the value of the signals but also provide uncertainty levels for the predictions. Libraries may be used for training Bayesian models. The margin of error for prediction(s) may be calculated based on uncertainty levels from the Bayesian models.
  • Model evaluation: Model evaluation is the process in which the performance of trained models is evaluated using several metrics. These metrics may vary based on the task type. Some metrics used in regression tasks may include but are not limited to MAE, RMSE, and R2.

The output of the AutoML component may be a trained model for each signal. The trained models may be stored in the model database 640.
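
The model selection and hyperparameter tuning stage can be sketched, under simplifying assumptions, with scikit-learn's GridSearchCV standing in for a fuller AutoML pipeline; the candidate models, parameter grids, and synthetic data below are illustrative only.

```python
# Illustrative model selection and hyperparameter tuning over two candidate
# architectures for a target signal, scored by mean absolute error.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))          # engineered features for a target signal
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)

candidates = [
    (Ridge(), {"alpha": [0.1, 1.0, 10.0]}),
    (RandomForestRegressor(random_state=0), {"n_estimators": [50, 100]}),
]

best_model, best_score = None, float("inf")
for estimator, grid in candidates:
    search = GridSearchCV(estimator, grid, cv=3, scoring="neg_mean_absolute_error")
    search.fit(X, y)
    score = -search.best_score_
    if score < best_score:
        best_model, best_score = search.best_estimator_, score

print(type(best_model).__name__, "MAE:", round(best_score, 3))
# The selected, tuned model would then be stored in the model database keyed by signal.
```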

Event Extraction

The various event information stored, for example, in the event database described above and used as event data input 604 to the simulation engine 620 of FIG. 6, may be extracted when data sources are scraped or crawled. The event extraction may be configured to detect/predict/extract major events (whether singular in occurrence or repeated/repeating occurrences) that could affect the predictive and simulation models. Events and signals have a many-to-many relationship in the sense that a specific event could cause/affect many signals and/or be caused/affected by signals, and a signal could be caused/affected by many events and/or a signal could cause/affect many events. Each event may be associated with a type of impact on or from various signals.

The event extraction from data scraping may be configured to perform the following functionalities:

  • Extract events from the crawled data based on predefined sets/types of events.
  • Identify relevant signals and events from the data source by using the event database(s), signal database(s), and knowledge graph(s).
  • Estimate the impact of each event on the related signal(s) and/or the impact of each signal on the related event(s) using intelligent models. Such impact information may be further used to obtain a relationship between events and signals and use them accordingly for the modeling.
  • Predict the likelihood of an event as well as the timeline in which it could happen in the future as properties or part of the metadata of the extracted events.
  • Store the events and their metadata in the event database so that they can be easily updated/queried.

In some implementations, the example event extraction process may begin with the scraped data and end with extracted events and their metadata stored in the event database.

The scraped data may be first processed by an event extraction engine based on predefined event types. The event types may be predefined, for example, by subject matter expert(s) for particular domains. The extracted events (raw events) may then be stored in the event database. The extracted events may be further processed by a relevant event/signal extraction component, which may contain various intelligent models and draw on the knowledge graph and the event and signal databases to identify signals related to the extracted events based on the relationship between the extracted raw events and the signals or other information as embedded in the knowledge graph, the event database, and the signal database. Such related signal(s), once identified, may be used by other downstream components of the pattern machine, as described elsewhere in this disclosure. Further, based on the identified relationship of the events and the signals, the impact of the extracted events on the signals and/or the impact of the signals on the extracted events may be further estimated (using an event impact estimator) and the likelihood of the events occurring in the future may be predicted (using an event likelihood predictor). Such event impact and likelihood estimation and predictions may then be associated with the extracted events and added/stored in the event database.

The event impact estimator may be particularly configured to quantitatively estimate the impact of an event on a specific signal and thus allow for identification of the most impactful events for a particular signal such that the selected impactful events may be focused on in the subsequent modeling. In some implementations, the causal relationship might be reversed and the event impact estimator might be configured to quantitatively estimate the impact of a signal on a specific event.

The event likelihood predictor may particularly be configured to include models that are able to predict the likelihood of an event happening in the future as well as create a timeline for the event. This component may rely on historical datasets from a past timeline and use statistical models for predicting the likelihood of an event based on the past data as well as finding patterns in the timeline of the event for future timeline prediction.
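
A minimal sketch of such an event likelihood predictor is shown below, with logistic regression standing in for the statistical models mentioned above; the feature names and historical observations are hypothetical.

```python
# Illustrative event likelihood predictor fitted on historical data: predicts
# the probability of an event occurring given current conditions.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Historical observations: [related_signal_value, months_since_last_occurrence]
X = np.array([[35.0, 6], [36.5, 18], [34.0, 3], [37.0, 24], [33.5, 2], [36.0, 12]])
y = np.array([1, 0, 1, 0, 1, 0])  # 1 = the event occurred within the next year

model = LogisticRegression().fit(X, y)

# Predicted likelihood of the event occurring under current conditions.
current_conditions = np.array([[34.5, 5]])
print("likelihood:", model.predict_proba(current_conditions)[0, 1])
```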

The knowledge graph may identify relevant signals and events and may also provide information about how to guardrail specific signals and events. As described above, guard-railing of events and signals may particularly help to avoid predicting unrealistic values for the events and signals. The signals and events may also contain other features. For a purely hypothetical and illustrative example, the knowledge graph might map a given event as containing a feature that identifies certain conditions which must also occur to cause the event. This feature might also contain further specific information about these conditions and a time series indicating a timeline. This information accompanying the event might also contain mathematical data that allows for calculations of the probabilities of the event occurring under certain conditions. When the extracted event and its accompanying feature(s) are processed, the machine learning models could potentially generate the probabilities of different series of conditions and events occurring over different future timelines. The guard-railing of this event would use the information accompanying the event and its features to filter the least probable situations from the rest of the modeled data so that the resulting pre-computed model would predict the most probable outcomes.

Real-time Aspects of the Knowledge Pattern Machine

As described above and throughout this disclosure, the knowledge pattern machine system is configured to be dynamic and real-time in various aspects. For example, data scraping is performed continuously and in real-time, capable of catching new data and updating the old data as it becomes available. The knowledge graph and the various signal and event databases are also updated continuously and in real-time as the underlying data changes. In addition, the predictive answers and reports generated for user queries and stored in the answer/report database are also updated continuously and in real time as the underlying data, including the information in the knowledge graph and the event/signal databases, changes over time. Further, the knowledge graph may be configured to be dynamic in that it records changing information entities and relationships/correlating information on a time series, as described in more detail above.

Example Graphical User Interface

Example graphical user interfaces (GUIs) described above (e.g., 260 of FIG. 2, 720 of FIG. 7, and 820 of FIG. 8) are provided below. The various GUIs may be designed to be user friendly, convenient, and capable of visually explaining the prediction process. For example, motion graphics (for instance, particles and intersecting lines that resemble DNA) connecting each component of the question to its related answer may be depicted. Rather than loading independent static pages, the GUI may contain greater animation and visually connect each user interaction point to the next to enhance the explainability of the predictive analytics process.

FIG. 10 shows the main elements of an example home landing page GUI of the knowledge pattern machine. Users can type their free-form query in the User Query search bar 1002 and, once the Predict button 1004 is clicked, the PM system will process the query and provide optional intermediate interactive GUIs and/or an answer to the user query, as described above and in the other graphical user interfaces below.

FIG. 11 illustrates an example "What's New page" GUI. Users can be provided with the trending queries and topics (1102) on this page, as well as new queries recommended by the knowledge pattern machine system (1104) and answers to the new query or queries (1106). Users can select new queries and answers and see the content for each. The knowledge pattern machine system can also create a news UI page to cover a variety of interesting topics and present these updating reports on the page.

FIG. 12 shows an example user profile page. Users may need to log in in order to access their previously answered queries and their corresponding reports. On this page, users will be able to select any of their past reports from the report list 1202 and see what updates have been made to them. The reports here are dynamic and are automatically updated as new data related to the query and answer becomes available in the system. Users will also have the option to collaborate and create reports within a group and/or to share the predicted answers and reports with other users on various levels by using the "groups" button 1204. Users may upload their own data to ask questions about the information they have entered by using the "Upload Data" button 1206.

FIG. 13 shows an example Query Refinement UI page. Users may be redirected to this page after they type their query in the home page of FIG. 10. Here the system may match the initial user query to one of the query templates that the knowledge pattern machine system supports by asking the user to provide more context and information for their query, if needed. As an example, the knowledge pattern machine may determine that the current user query lacks information about a signal or event, or the time and/or location associated with the signal or event, and thus the query refinement component may be configured to detect the missing information and provide some suggestions (1302) to the user for each of the missing pieces. Users may be able to select an information entity from a dropdown menu (1304), and once the user provides this additional information, the knowledge pattern machine system can use the refined query as a formalized query to generate and provide an answer to the user. In refining the initial query, users may be able to provide further selections of scope and clarify their initial question without having to type anything new.

FIG. 14 shows an example Answer Page UI. Here, users are provided their refined query alongside the results of the simulation engine. The simulation engine results may consist of a visual/graphical representation of the simulation outcome (1402 and 1404), as well as the textual representation of the graphics/visuals (1406) explaining the prediction result to the user in sentence form. Users may be able to modify the format of the report and edit the visual/graphical representation of the predictive answer. Users may also return to the refinement selection to change or modify the question by using the "Refine Query" button 1408 or the "Edit" button 1414. If users are satisfied with the results, they can save the answer as a report in their profile for future reference by using the "Save" button 1412. Note that users may need to have an account in the PM system in order to save their results. Through the query recommendation system and by using the example "Explore" button 1410, other queries will be generated and presented to the user as further recommended queries in an exploration section (not shown), and the users may then be able to explore other recommended queries that are relevant to their initial query, to the answer provided to them, to other topics relevant to that individual user, or to topics that are trending within the pattern machine system as a whole.

Image and Visual Pattern Recognition and Visual Entity Database

In some applications, an image and visual pattern recognition component may be implemented independently or as part of the knowledge pattern machine described above. This component may be configured to perform intelligent recognition, classification, and organization of visual data items. These visual data items, as described in further detail below, may be cataloged in a visual entity database (analogous to the database configuration of the signal and event databases described above), which may be used independently for assisting in processing a “visual query” or in combination with the knowledge pattern machine to provide a predictive “visual answer” in addition to the normal answer described above. Such a “visual answer” may be generated in the form of, for example, predictive 3D visual content, animation, virtual reality, interactive motion graphics, and augmented reality, as described in further detail below.

In some implementations, the image and visual pattern recognition component may be configured to process various input visual data into representation entities or component parts. The input visual data items may be of various formats. For example, input visual data items may include images, videos, animations, presentations, slides, illustrations, charts, figures, and the like. Input visual data may be scraped from various public and private data sources for building the visual entity database, similar to the data scraping process for textual data, numeric data, and other information described above. The representation entities or component parts may be generated using an intelligent visual data recognition module of the visual pattern recognition component. The representation entities or component parts may include visual characteristic information entities such as dimension, form, color, material, size, pixelation, texture, lighting, scale, distance, depth, focus, foreground, background, and content information entities such as content classes, categories, visual features, textual descriptions, topics, headlines, people, places, and the like. The set of representation entities may be predefined and may also be made automatically expandable.
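
For illustration only, a representation entity could be organized as a simple record such as the following Python sketch; the field names and example values are hypothetical and merely mirror the characteristic and content entities listed above:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class RepresentationEntity:
    """One recognized component part of an input visual data item.

    Field names here are illustrative stand-ins for the characteristic and
    content entities listed above (dimension, color, content class, etc.).
    """
    source_item: str                 # e.g., file name or URL of the image/video
    region: Optional[tuple] = None   # (x, y, w, h) if recognized from part of the item
    characteristics: dict = field(default_factory=dict)  # color, texture, lighting, ...
    content: dict = field(default_factory=dict)          # classes, topics, people, places, ...

entity = RepresentationEntity(
    source_item="dog_photo.jpg",
    region=(120, 40, 300, 280),
    characteristics={"color": "golden", "lighting": "outdoor daylight"},
    content={"class": "dog", "breed": "Golden Retriever", "place": "beach"},
)
```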

The visual recognition module may be based on a combination of trained and untrained machine learning models and neural networks. The information related to the visual pattern recognition component, including the input and the output representation entities, may be maintained in the visual entity database, similar to the language model databases and the signal and event databases described above.

For example, the recognition of these representation entities from input visual data items may be based on machine learning models that are pretrained. Domain-specific models may be trained for recognizing different types of representation entities. The representation entities may be recognized or extracted from part of a visual dataset (e.g., part of an image) or may be extracted from a visual dataset as a whole. For example, representation entities may be recognized from an entire video, a section of the video, a frame of the video, or a portion of the frame. Each visual data item may be extracted into one or more representation entities. For example, an image may be recognized as any of “people”, “sports”, “New York”, “sunset”, etc. As another example, a picture of a dog could be identified not only as a picture of a dog, but also as a picture of a certain breed of dog with specific features such as a certain type of nose, eyes, and ears, as well as color and type of fur and proportions of its body parts. The background of the picture might be further recognized and distinguished from the foreground, allowing the system to identify from a number of pictures of the same dog what type of environment the dog lives in. The models used in the visual recognition module for recognizing the representation entities may be independent of the knowledge pattern machine above or may also be trained jointly, feeding into and relying upon the knowledge graph of the knowledge pattern machine described above.
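
For illustration only, the routing of a visual data item through several domain-specific recognizers could be sketched as follows; the Recognizer type, the domain names, and the toy lambda recognizers are placeholders for actual trained models:

```python
from typing import Callable

# Hypothetical domain-specific recognizers; in practice each could wrap a
# pretrained vision model (object detector, scene classifier, etc.).
Recognizer = Callable[[bytes], dict]

def recognize(image_bytes: bytes, recognizers: dict) -> list:
    """Run every domain model over the item and collect the entities it yields."""
    entities = []
    for domain, model in recognizers.items():
        result = model(image_bytes)          # e.g., {"breed": "Golden Retriever"}
        if result:
            entities.append({"domain": domain, **result})
    return entities

# Toy recognizers standing in for trained models.
recognizers = {
    "animal": lambda img: {"class": "dog", "breed": "Golden Retriever"},
    "scene":  lambda img: {"background": "park", "foreground": "dog"},
}
print(recognize(b"...raw image bytes...", recognizers))
```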

The input to the visual pattern recognition component may further include metadata in addition to the content of the input visual data items. The various machine learning models thus may be developed to process the metadata and to correlate the metadata with the visual content for more expansive and more accurate visual pattern recognition. The metadata for visual data items may include, for example, time stamps, GPS location, textual description, tags, URLs, file directory path, and the like. The processing of the metadata may be an integral part of the visual content recognition model. Alternatively, the metadata processing may be performed separately. The output of such metadata processing may be further analyzed and correlated with the visual content recognition results using other machine learning models.
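
For illustration only, one way of correlating item metadata with content recognition output is sketched below; the metadata keys and the fuse() helper are hypothetical examples rather than a prescribed interface:

```python
# A minimal sketch of fusing item metadata with content recognition output;
# the metadata keys shown are examples from the list above.
def fuse(content_entities: dict, metadata: dict) -> dict:
    fused = dict(content_entities)
    if "gps" in metadata:
        fused["place_hint"] = metadata["gps"]         # strengthens place recognition
    if "timestamp" in metadata:
        fused["observed_at"] = metadata["timestamp"]  # supports later time-series use
    if "tags" in metadata:
        fused["topics"] = sorted(set(fused.get("topics", [])) | set(metadata["tags"]))
    return fused

print(fuse({"class": "dog", "topics": ["pets"]},
           {"gps": (40.78, -73.97), "timestamp": "2021-08-23", "tags": ["park", "pets"]}))
```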

Further, in some implementations, each input visual data item may be divided into parts and separately processed, and each part may be recognized as being associated with one or more representation entities. These visual data item parts may be used as smaller units for the organization of representation entities in the visual entity database. The processed visual data and its parts may also be embedded in a vector model or assigned labels.

In some implementations, the generation of the representation entities from scraped input visual data items with metadata may be implemented as a formalization process, analogous to the formalization of queries described above. Such a formalization process, for example, may generate standardized representation entities from the input data items. The recognized representation entities, i.e., the formalized visual input data, may then populate the visual entity database.
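
For illustration only, populating a visual entity database from raw recognition output could be sketched as follows; the single-table schema and the formalize_and_store() helper are hypothetical and do not reflect any particular database design:

```python
import sqlite3, json

# A minimal sketch of a visual entity database; the real schema is not
# specified in this disclosure.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE visual_entities (
    source_item TEXT, entity TEXT, properties TEXT, timestamp TEXT)""")

def formalize_and_store(source_item, raw_entities, timestamp):
    """Normalize raw recognition output into standardized rows."""
    for raw in raw_entities:
        entity = raw.get("class", "unknown").lower()   # standardized label
        props = json.dumps({k: v for k, v in raw.items() if k != "class"})
        db.execute("INSERT INTO visual_entities VALUES (?, ?, ?, ?)",
                   (source_item, entity, props, timestamp))
    db.commit()

formalize_and_store("dog_photo.jpg",
                    [{"class": "Dog", "breed": "Golden Retriever"}],
                    "2022-03-02T12:00:00Z")
```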

In some other implementations, these recognized representation entities may also populate the knowledge graph of the knowledge pattern machine described above. Specifically, while the visual pattern recognition component may be implemented as a general module independent of the knowledge pattern machine system above, it may be tailored to be compatible with the knowledge pattern machine, particularly the knowledge graph, and with the three-dimensional (3D) experience builder and virtual reality/augmented reality builder(s) described in more detail below. The representation entities or component parts of the visual data, regardless of format, may be generated in a manner that is compatible with the knowledge graph, the 3D experience builder, and the virtual reality/augmented reality builder(s).

This processed information, including the representation entities, their properties, features, and the corresponding visual data items or portions of visual data items, may be stored in the visual entity database. As the processed output generated by the visual pattern recognition component may be further compatible with the knowledge pattern machine above, these output data items, including the representation entities, their properties, features, and visual data items, may accordingly be used to populate the Knowledge Graph, and the formalized, processed visual data may also be stored in an “Image Database.”

Once the visual data is processed and formalized and is compatible with the knowledge graph and the rest of the pattern machine system, the visual data items may become associated among themselves and with different signals, events, and other features. For instance, the processed information about the picture of the dog might generate a signal for “Golden Retriever,” which could be connected to another signal for “Dog,” and could be related to a location signal that was identified in the background of the picture. Similarly, an event in an image data item could also be identified and it may relate to other signals or events in the knowledge graph. This same process would work for a video, which could be broken down by frame, or any other format of visual information. Within the Knowledge Graph, signals, events, features, and relationships populated and/or generated from the processed visual information may also be mapped together with the other signals, events, features, and relationships within the Knowledge Graph, which may have been derived from other types of sources, such as numerical or textual data. In other words, the data items recognized or extracted by the visual pattern recognition component may be integrated and fused with, and related to other signals and events in the knowledge graph described above.
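
For illustration only, the fusion of visual entities with existing signals in the knowledge graph could be sketched with a generic graph library as follows; the node names, relation labels, and use of networkx are illustrative assumptions:

```python
import networkx as nx

# A minimal sketch of fusing visual entities with signals already in the
# knowledge graph; node and relation names are illustrative.
kg = nx.DiGraph()
kg.add_node("Dog", kind="signal")                      # pre-existing signal
kg.add_node("Golden Retriever", kind="visual_entity")  # from the dog photo
kg.add_node("Central Park", kind="signal")             # location recognized in the background

kg.add_edge("Golden Retriever", "Dog", relation="is_a")
kg.add_edge("Golden Retriever", "Central Park", relation="observed_at")

# Downstream components can now traverse from a textual signal ("Dog")
# to related visual entities and their source images.
print(list(kg.predecessors("Dog")))  # ['Golden Retriever']
```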

AI Visual Information Generation and Visual Data Prediction

A visual prediction component may further be configured to generate predictive visual items. Similar to the textual information described above, the visual data items processed by the visual pattern recognition component may, for example, be time-stamped to form historical visual data on a time series. Such time-stamped visual data items may thus be used for predicting visual data items for a future time. Such predictions may be performed by a trained AI model. For example, such a model may process an input visual data item and then predict how that visual data item would evolve in the future, similar to the prediction of the signals and events based on historical time series described above. This predictive capability can be separate from the signal and event prediction above. The predicted visual items may be pre-computed and stored in the visual entity database. The precomputed predictive visual data items may include newly generated visual items based on historical visual data items stored in the visual entity database.
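
For illustration only, predicting how a time-stamped visual entity property may evolve could be sketched with a simple linear extrapolation standing in for the trained AI model described above; the plant-height example and predict_property() helper are hypothetical:

```python
import numpy as np

# A toy stand-in for the trained prediction model: given time-stamped values
# of one visual entity property (e.g., measured plant height across photos),
# extrapolate the property to a future time with a linear fit.
def predict_property(timestamps, values, future_t):
    slope, intercept = np.polyfit(timestamps, values, deg=1)
    return slope * future_t + intercept

# Historical observations (days since first photo -> measured height in cm).
t = [0, 30, 60, 90]
h = [12.0, 18.5, 24.0, 30.5]
print(round(predict_property(t, h, future_t=120), 1))  # projected height at day 120
```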

3D Experience Builder and Virtual Reality/Augmented Reality Builder

As a further example, predicted visual content would then be generated from the various visual entities, including the predicted visual entities, via artificial intelligence algorithms. Such visual content may include 2D or 3D visual content/media that can be generated according to the user’s question. In particular, the visual entities (e.g., predictive visual entities created as described above) that may be used for 2D or 3D content generation may be determined by processing the user query to generate signals, events, and other information entities that link to visual entities via the knowledge graph. Different identifying features and relationships would accordingly be used to build such visual content to accurately represent the answer to the user’s query. Different formats of visual content would be compatible but would require different components to generate. An example AI image generator may be constructed to build still images, pictures, illustrations, etc. A time-series generator might be implemented to build videos, motion graphics, moving illustrations, etc. A 3D builder may be further configured to generate virtual and augmented reality (VR and AR) outputs. VR and AR outputs, for example, may be displayed using appropriate VR/AR equipment/accessories.
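
For illustration only, dispatching the requested output format to a corresponding builder could be sketched as follows; the builder functions and format keys are placeholders, not an actual API of the system:

```python
# A minimal dispatch sketch: route the requested output format to the builder
# responsible for it. Builder names are placeholders.
def build_image(entities):
    return f"still image from {len(entities)} entities"

def build_video(entities):
    return f"video from time series of {len(entities)} entities"

def build_vr(entities):
    return f"VR/AR scene from {len(entities)} entities"

BUILDERS = {
    "image": build_image,
    "video": build_video,
    "vr": build_vr,
    "ar": build_vr,
}

def generate_visual_content(fmt, entities):
    try:
        return BUILDERS[fmt](entities)
    except KeyError:
        raise ValueError(f"Unsupported visual format: {fmt}")

print(generate_visual_content("video", ["Golden Retriever", "Central Park"]))
```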

“Visual Answer” Generation

As such, the visual pattern recognition component and the visual item prediction component may function separately or independently from the knowledge pattern machine above to answer a “visual query” for a user. In other words, the user may input a visual item and the system would predict and generate visual content and otherwise build 3D content (e.g., virtual or augmented reality) as an answer to the query.

In some other implementations, because the various visual input data have been processed into representation entities or component parts and have been interrelated as well as related to other signals and events in the knowledge graph, a normal freeform textual query may be processed in a “visual pipeline” that generates predictive visual output via the overlap between information extracted from the input query and the representation entities or component parts generated by the visual pattern recognition component from historical input visual data. This pipeline would function separately from the normal prediction/simulation engine described above for the knowledge pattern machine.
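
For illustration only, the overlap-based lookup of the “visual pipeline” could be sketched as follows; the visual_index mapping and the ranking heuristic are hypothetical simplifications of the intelligent matching described above:

```python
# A minimal sketch of the "visual pipeline" lookup: intersect the entities
# extracted from the textual query with the representation entities indexed
# from historical visual data, then return the matching visual items.
visual_index = {                       # entity -> visual items it was recognized in
    "golden retriever": ["dog_photo.jpg", "park_video.mp4"],
    "central park":     ["park_video.mp4"],
    "sunset":           ["beach_photo.jpg"],
}

def visual_candidates(query_entities):
    hits = {}
    for entity in query_entities:
        for item in visual_index.get(entity.lower(), []):
            hits[item] = hits.get(item, 0) + 1     # count overlapping entities
    # rank visual items by how many query entities they share
    return sorted(hits, key=hits.get, reverse=True)

print(visual_candidates(["Golden Retriever", "Central Park"]))
```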

In some other implementations, a freeform user query may be processed along both the normal textual data pipeline described above for the knowledge pattern machine and the “visual” processing pipeline. As such, both a normal predictive answer and a visual predictive answer may be generated independently using these two pipelines.

In yet some other implementations, the two pipelines above may be cascaded. For example, a free-form input query may first be processed by the normal knowledge pattern machine pipeline to generate a predictive answer via the prediction engine and simulation engine described above for the knowledge pattern machine. The predictive answer may then be processed by the “visual pipeline” to generate visual predictions for the user to visually “see” the answer. For example, predictive visual information such as videos, images, and 3D media may be rendered along with the normal predictive answer. Visual information may be maintained in the answer database by itself or along with the normal answer of the knowledge pattern machine described above.

In yet some other implementations, an input freeform query may be processed by the query pipeline to generate a formalized query as described above. One or more signals and/or events may be further extracted from the formalized query. These signals and events may be passed on to both the normal answer generation pipeline (using the signal event prediction and simulation) and the visual information processing pipeline, which may be configured to turn the extracted signals and/or events into visual information items using an intelligent model by using the correlation between signals/events and the visual representation entities and their properties as stored in the knowledge graph and the visual entity database.

In some implementations, the predictive visual information, whether independently generated or also generated based on the normal predictive answer, may be generated as a predictive time sequence which may further form an animation or video with a prediction timeline. As such, a user may visually view the evolution of the answer to the query in video/animation format. Each frame of the video/animation may be predictively generated for that particular time point in the timeline.
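
For illustration only, assembling a predictive timeline of frames could be sketched as follows; frame_for() is a placeholder for the per-time-point visual generation described above, and the dates and step size are arbitrary:

```python
from datetime import date, timedelta

# A minimal sketch: one predictively generated frame per future time point,
# assembled into an ordered animation timeline.
def frame_for(t):
    return {"time": t.isoformat(), "content": f"predicted scene at {t.isoformat()}"}

def build_prediction_timeline(start, days, step_days=7):
    return [frame_for(start + timedelta(days=d)) for d in range(0, days + 1, step_days)]

timeline = build_prediction_timeline(date(2023, 3, 21), days=28)
for frame in timeline:
    print(frame["time"])   # each frame corresponds to one point on the prediction timeline
```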

Like the various components of the knowledge pattern machine, the visual pattern recognition component and the visual information prediction and generation components may be configured to operate in real-time. For example, as the input visual data changes over time, the corresponding visual data time sequences would evolve. The visual entity database and/or the knowledge graph may be updated in real-time, similar to the signal and event databases described above. The precomputed prediction of the visual representation entities and their properties may also update in real-time and accordingly, the “visual answers” to previous user queries and the visual information generated along with the normal answer to the previous user queries may also be updated.

Examples: Query-Answer Process Combining the Knowledge Pattern Machine and Visual Information Prediction and Generation

As one example, and separate from the question-answer flow of the normal knowledge pattern machine, but within the same graphical user interface of the knowledge pattern machine where the user may enter input data, the user may also have the option to enter visual information as an input. As such, the visual processing pipeline may be invoked to process the visual input to generate the representation entities. These entities may then be linked to other information entities, signals, and events in the knowledge graph, which may then be used to generate the normal answer (meaning the report and textual answer normally generated through the knowledge pattern machine system described above) using the signal/event prediction and simulation pipeline. A visual answer may be additionally and/or optionally generated in parallel. Alternatively, a visual answer may be generated based on the normal answer in series.

As another example, the user may ask a question (e.g., free form textual query), and proceed through the regular question-answer flow of the knowledge pattern machine. If the user enables AI-generation of visual content as part of the report format, then the user would receive the answer to their question through visual representation. The visual representation format may be selected by the user. For example, the user may select from video, animation, virtual reality, augmented reality, synthetic media, pictures, illustration, motion graphics, and the like. According to the format selected by the user, the visual pipeline is invoked to correlate the answer to related visual items through the simulation of precomputed visual items to generate visual content in the desired format, so that the user can “visualize” the answer.

Accordingly, the user would be able to “see” the answer to their question, as the system would in effect visually predict and illustrate the future. For example, an image might accompany predicted statistics to more clearly depict the answer; a video could provide a demonstration of the predictive answer and show an event occurring; augmented reality might allow a user to see how an entity or object might fit into their current life; virtual reality could allow the user to enter the experience of the answer to their question. The user could receive more interactive guidance to assist with strategizing and planning, more vivid visual forecasting, and/or a more immersive user experience. For instance, if a user inputs his/her phone’s camera roll and other data such as their calendar and email, enables VR reporting, and asks “What will my day tomorrow look like?”, the user might be able to experience a 3-dimensional simulation of their life the following day through a VR headset in answer to the query.

Finally, in FIG. 15, an example computing architecture 1500 that may be implemented in any of the processing and storage component above is shown, including communication interfaces 1502, system circuitry 1504, input/output (I/O) interfaces 1506, storage 1509, and display circuitry 1508 that generates machine interfaces 1510 (such as the user interfaces described above) locally or for remote display, e.g., in a web browser running on a local or remote machine. The machine interfaces 1510 and the I/O interfaces 1506 may include GUIs, touch sensitive displays, voice or facial recognition inputs, buttons, switches, speakers and other user interface elements. Additional examples of the I/O interfaces 1506 include microphones, video and still image cameras, headset and microphone input/output jacks, Universal Serial Bus (USB) connectors, memory card slots, and other types of inputs. The I/O interfaces 1506 may further include magnetic or optical media interfaces (e.g., a CDROM or DVD drive), serial and parallel bus interfaces, and keyboard and mouse interfaces.

The communication interfaces 1502 may include wireless transmitters and receivers (“transceivers”) 1512 and any antennas 1514 used by the transmitting and receiving circuitry of the transceivers 1512. The transceivers 1512 and antennas 1514 may support Wi-Fi network communications, for instance, under any version of IEEE 802.11, e.g., 802.11n or 802.11ac. The communication interfaces 1502 may also include wireline transceivers 1516. The wireline transceivers 1516 may provide physical layer interfaces for any of a wide range of communication protocols, such as any type of Ethernet, data over cable service interface specification (DOCSIS), digital subscriber line (DSL), Synchronous Optical Network (SONET), or other protocol. Computers using the computing architecture 1500 may communicate with one another via the communication interfaces 1502 and the communication network 101 as shown in FIG. 1.

The storage 1509 may be used to store various initial, intermediate, or final data or model for implementing the functionalities of the knowledge pattern machine and the various other computing components described above. The storage 1509 may be centralized or distributed. For example, the storage 1509 may be hosted remotely by a cloud computing service provider.

The system circuitry 1504 may include hardware, software, firmware, or other circuitry in any combination. The system circuitry 1504 may be implemented, for example, with one or more systems on a chip (SoC), application specific integrated circuits (ASIC), microprocessors, discrete analog and digital circuits, and other circuitry. The system circuitry 1504 is part of the implementation of any desired functionality related to the knowledge pattern machine. As just one example, the system circuitry 1504 may include one or more instruction processors 1518 and memories 1520. The memories 1520 may store, for example, control instructions 1524 and an operating system 1522. In one implementation, the instruction processors 1518 may execute the control instructions 1524 and the operating system 1522 to carry out any desired functionality related to the functionalities of the knowledge pattern machine described above.

The methods, devices, processing, and logic described above may be implemented in many different ways and in many different combinations of hardware and software. For example, all or parts of the implementations may be circuitry that includes an instruction processor, such as a Central Processing Unit (CPU), microcontroller, or a microprocessor; an Application Specific Integrated Circuit (ASIC), Programmable Logic Device (PLD), or Field Programmable Gate Array (FPGA); or circuitry that includes discrete logic or other circuit components, including analog circuit components, digital circuit components or both; or any combination thereof. The circuitry may include discrete interconnected hardware components and/or may be combined on a single integrated circuit die, distributed among multiple integrated circuit dies, or implemented in a Multiple Chip Module (MCM) of multiple integrated circuit dies in a common package, as examples.

The circuitry may further include or access instructions for execution by the circuitry. The instructions may be stored in a tangible storage medium that is other than a transitory signal, such as a flash memory, a Random Access Memory (RAM), a Read Only Memory (ROM), an Erasable Programmable Read Only Memory (EPROM); or on a magnetic or optical disc, such as a Compact Disc Read Only Memory (CDROM), Hard Disk Drive (HDD), or other magnetic or optical disk; or in or on another machine-readable medium. A product, such as a computer program product, may include a storage medium and instructions stored in or on the medium, and the instructions when executed by the circuitry in a device may cause the device to implement any of the processing described above or illustrated in the drawings.

The implementations may be distributed as circuitry among multiple system components, such as among multiple processors and memories, optionally including multiple distributed processing systems. Parameters, databases, and other data structures may be separately stored and managed, may be incorporated into a single memory or database, may be logically and physically organized in many different ways, and may be implemented in many different ways, including as data structures such as linked lists, hash tables, arrays, records, objects, or implicit storage mechanisms. Programs may be parts (e.g., subroutines) of a single program, separate programs, distributed across several memories and processors, or implemented in many different ways, such as in a library, such as a shared library, which may store instructions that, when executed by the circuitry, perform any of the processing described above or illustrated in the drawings.

In general, terminology may be understood at least in part from usage in context. For example, terms, such as “and”, “or”, or “and/or,” as used herein may include a variety of meanings that may depend at least in part on the context in which such terms are used. Typically, “or” if used to associate a list, such as A, B or C, is intended to mean A, B, and C, here used in the inclusive sense, as well as A, B or C, here used in the exclusive sense. In addition, the term “one or more” as used herein, depending at least in part upon context, may be used to describe any feature, structure, or characteristic in a singular sense or may be used to describe combinations of features, structures or characteristics in a plural sense. Similarly, terms, such as “a,” “an,” or “the,” may be understood to convey a singular usage or to convey a plural usage, depending at least in part upon context. In addition, the term “based on” may be understood as not necessarily intended to convey an exclusive set of factors and may, instead, allow for existence of additional factors not necessarily expressly described, again, depending at least in part on context.

Reference throughout this specification to features, advantages, or similar language does not imply that all of the features and advantages that may be realized with the present solution should be or are included in any single implementation thereof. Rather, language referring to the features and advantages is understood to mean that a specific feature, advantage, or characteristic described in connection with an embodiment is included in at least one embodiment of the present solution. Thus, discussions of the features and advantages, and similar language, throughout the specification may, but do not necessarily, refer to the same embodiment.

Furthermore, the described features, advantages and characteristics of the present solution may be combined in any suitable manner in one or more embodiments. One of ordinary skill in the relevant art will recognize, in light of the description herein, that the present solution can be practiced without one or more of the specific features or advantages of a particular embodiment. In other instances, additional features and advantages may be recognized in certain embodiments that may not be present in all embodiments of the present solution.

Claims

1. A knowledge pattern system, comprising:

a real-time-updated signal database comprising quantifiable and/or qualifiable metrics and signal properties thereof;
a real-time-updated event database comprising singular or repeated occurrences and event properties thereof;
a real-time-updated knowledge graph comprising information entities including a set of signals, a set of events, properties thereof, and relationships between the set of signals, the set of events, and the properties;
a model library comprising a set of trained artificial intelligence models; and
a circuitry configured to:
automatically update the set of artificial intelligence models;
automatically update the signals, signal properties, events, and event properties in the knowledge graph and time-stamp the update;
compute and timestamp in real-time future-time values of the signals in the real-time-updated knowledge graph by automatically selecting, fetching, and running artificially intelligent models from the model library;
automatically store the future-time values of the signals in the signal database;
execute a user query process in combination with the real-time updated knowledge graph, the signal database, and the event database to process a free-form user query to extract at least one signal with corresponding future-time values and/or at least one event;
automatically generate a predictive answer to the free-form user query via a simulation process based on the future-time values corresponding to the extracted at least one signal, the predictive answer comprising a future occurrence probability of the at least one event; and
automatically convert the predictive answer into a report by selecting a reporting format according to the predictive answer.

2. The system of claim 1, wherein the circuitry is configured to automatically perform data crawling from at least one data source to obtain crawled data items to generate, populate, update, and/or time-stamp the real-time-updated signal database and the real-time updated event database.

3. The system of claim 2, wherein the circuitry is configured to automatically update the signals, events, properties thereof, and relationships therebetween in the real-time updated knowledge graph from the crawled data items and/or the real-time updated signal database and/or the real-time updated event database.

4. The system of claim 3, wherein the real-time updated knowledge graph associates the information entities to time-stamps or timelines to operate as a time series and to identify evolving patterns of the information items over time.

5. The system of claim 4, wherein the real-time updated knowledge graph is capable of self-expansion and the self-expansion is performed by an auto machine learning layer within the real-time-updated knowledge graph.

6. The system of claim 5, wherein the real-time-updated knowledge graph is configured to self-expand to discover previously unknown information items including qualitative or quantitative signals, events, or relationships therebetween.

7. The system of claim 1, wherein to compute in real-time the future-time values of the signals comprises:

to identify at least one signal;
to identify a corresponding property associated with the at least one signal;
to select at least one artificial intelligence model from the model library;
to generate test datasets based on the at least one signal and the corresponding property; and
to generate the future-time values of the at least one signal associated with the corresponding property using the at least one artificial intelligence model and the test datasets.

8. The system of claim 1, wherein the model library contains at least one pre-computed artificial intelligence model that is real-time updated when underlying training data of the at least one pre-computed artificial intelligence model changes.

9. The system of claim 1, wherein to execute the user query process comprises:

to receive the free-form user query by a user from a graphical user interface; and
to automatically process the free-form user query to extract the at least one signal and/or the at least one event and corresponding signal properties and/or event properties based on the free-form query, the real-time-updated signal database, the real-time updated event database, and the real-time-updated knowledge graph.

10. The system of claim 9, wherein to automatically generate the predictive answer to the free-form user query via the simulation process based on signal predictions and/or event predictions comprises:

to automatically identify correlations between the future-time values of the at least one signal and the future occurrence probability of the at least one event; and
to automatically generate a simulation of the correlation as the predictive answer to the free-form user query.

11. The system of claim 9, wherein to automatically process the free-form user query further comprises:

to preprocess the free-form user query to determine at least one domain for the free-form user query and to perform at least one of text normalization, stemming/lemmatization, tokenization, and text cleansing; and
at least one of: determine a type of the free-form user query; refine the free-form user query; and deduplicate the free-form user query based on a historical query database; and
wherein to automatically process the free-form user query, the circuitry is configured to extract one or more entities based on pre-trained semantic or syntactic models and to generate one or more embedding vectors.

12. The system of claim 11, wherein the circuitry is configured to refine the free-form user query by:

automatically determining a completeness of the free-form user query based on the preprocessed free-form query and the type of free-form query; and
in response to determining that the free-form user query is incomplete, automatically generating an interactive prompt through the graphical user interface to obtain supplemental information from the user to complete the free-form user query.

13. The system of claim 11, wherein to determine the type of the free-form user query comprises configuring the circuitry to detect the free-form user query as one of normal, out of scope, and biased types.

14. The system of claim 11, wherein the circuitry is configured to extract the at least one signal and/or the at least one event by performing an automatic signal matching and an automatic event matching based on the one or more entities, the one or more embedding vectors, and the real-time-updated signal database and the real-time-updated event database.

15. The system of claim 11, wherein the circuitry is further configured to automatically generate and display at least one recommended follow-up query based on the free-form user query, the predictive answer, and the historical query database.

16. The system of claim 1, wherein the circuitry is configured to generate the report based on a predefined report template automatically selected based on at least one domain of the free-form user query, the at least one signal, and/or the at least one event.

17. The system of claim 1, wherein the report comprises an automatically generated textual information item summarizing the predictive answer to the free-form user query and/or visual presentation comprising graphs, tables, charts, spreadsheets, images, animations, videos, 3D models, virtual reality images/videos, and/or augmented/virtual reality images/videos of the future-time values associated with the at least one signal associated with the at least one event.

18. The system of claim 1, wherein the knowledge pattern system further comprises a visual entity database and the circuitry is further configured to:

generate a set of visual entities from a set of visual data items;
link the visual entities to the set of signals and events via the real-time-updated knowledge graph;
execute an intelligent visual information processing pipeline to generate predicted visual entities;
extract a subset of visual entities and predicted visual entities based on the at least one signal, the at least one event, or the predictive answer; and
generate predictive visual content in at least one form of virtual reality, augmented reality, interactive graphics, images, or videos, based on the subset of visual entities to supplement the predictive answer.

19. The system of claim 1, wherein the model library comprises a set of pre-trained semantic or syntactic models to generate one or more embedding vectors for preprocessing the free-form user query, for computing the future-time values, for generating the predictive answer, and/or for converting the predictive answer into the report.

20. A method for generating and processing knowledge patterns performed by a system comprising a real-time-updated signal database comprising quantifiable and/or qualifiable metrics and signal properties thereof, a real-time-updated event database comprising singular or repeated occurrences and event properties thereof, a real-time-updated knowledge graph comprising information entities including a set of signals, a set of events, properties thereof, and relationships between the set of signals, the set of events, and the properties, a model library comprising a set of trained artificial intelligence models, a memory for storing computer instructions, and at least one processor, the method comprising:

automatically updating the set of artificial intelligence models;
automatically updating the signals, signal properties, events, and event properties in the knowledge graph and time-stamping the update;
computing and timestamping in real-time future-time values of the signals in the real-time-updated knowledge graph by automatically selecting, fetching, and running artificially intelligent models from the model library;
automatically storing the future-time values of the signals in the signal database;
executing a user query process in combination with the real-time updated knowledge graph, the signal database, and the event database to process a free-form user query to extract at least one signal with corresponding future-time values and/or at least one event;
automatically generating a predictive answer to the free-form user query via a simulation process based on the future-time values corresponding to the extracted at least one signal, the predictive answer comprising a future occurrence probability of the at least one event; and
automatically converting the predictive answer into a report by selecting a reporting format according to the predictive answer.
Patent History
Publication number: 20230306284
Type: Application
Filed: Mar 21, 2023
Publication Date: Sep 28, 2023
Applicant: Birdview Films, LLC (Malibu, CA)
Inventor: Isabella Tappin (Malibu, CA)
Application Number: 18/187,447
Classifications
International Classification: G06N 5/022 (20060101); G06F 16/248 (20060101); G06F 3/0482 (20060101); G06F 40/30 (20060101); G06F 16/2453 (20060101);