MACHINE LEARNING APPLICATIONS FOR DYNAMIC, QUANTITATIVE ASSESSMENT OF HUMAN RESOURCES

Info

Publication number: 20170262809
Type: Application
Filed: Mar 14, 2017
Publication Date: Sep 14, 2017
Applicant: PreSeries Tech, SL (Valencia)
Inventors: Francisco J. Martin (Corvallis, OR), Luis Javier Placer Mendoza (Madrid), Alvaro Otero Perez (Valencia), Javier S. Alperte Pérez Rejón (Barcelona), Xavier Canals Orriols (Valencia), Francisco J. Garcia Moreno (Valencia), Jim Shur (Corvallis, OR), Candido Zuriaga Garcia (Valencia)
Application Number: 15/458,796

Abstract

Systems and methods that include characterizing individual team members in terms of specified “character elements” or attributes, and combining the individual characterizations into an aggregate characterization of the team. The team evaluation does not merely sum individual attributes; rather, it analyzes the composition of the team relative to predetermined metrics, taking into account what combinations of individual team member attributes are more likely to lead to success of the team.

Description

PRIORITY

This application is a non-provisional of and claims priority benefit to U.S. provisional patent application 62/307,918, filed Mar. 14, 2016, and U.S. provisional patent application 62/308,095, filed Mar. 14, 2016, both of which are incorporated herein by reference in their entirety.

COPYRIGHT NOTICE

© 2016 PreSeries Tech, SL. A portion of the present disclosure may contain material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the present disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.

TECHNICAL FIELD

This disclosure pertains to software-driven machine learning and, more specifically, to systems and methods that utilize machine learning technologies to quantitatively evaluate multiple types of data about individuals and teams to predict the likelihood a corresponding enterprise will be successful.

BACKGROUND

Early-stage investors are frequently characterized as following a “gut-driven”, “term driven”, or “lemming-like” approach. In summary, gut-driven investors primarily follow their instincts about specific companies in making investment decisions. Term driven investors focus on maximizing potential returns by focusing on companies that offer better financial terms than others. Lemming-like investors let others identify promising opportunities and follow them, frequently co-investing in companies that others feel are promising. It would be advantageous to have a technical tool for evaluating investment opportunities to identify those entities that are likely to be more successful than others.

SUMMARY OF THE INVENTION

We disclose a system and methods that include characterizing individual team members in terms of specified “character elements” or attributes, and combining the individual characterizations into an aggregate characterization of the team. The team evaluation does not merely sum individual attributes; rather, it analyzes the composition of the team relative to predetermined metrics, taking into account what combinations of individual team member attributes are more likely to lead to success of the team. The system and methods described here use machine learning technologies to evaluate multiple types of data about individuals and teams to predict the likelihood the company will be successful and therefore be a good investment. These individual and team characterizations can be combined with other measures of company performance relevant to predicting whether the company will succeed or fail. This brief summary is not intended to limit the scope of the more detailed description that follows, nor does it limit the scope of the claims. It is provided as a convenience to the reader.

BRIEF DESCRIPTION OF THE DRAWINGS

The included drawings are for illustrative purposes and serve to provide examples of possible structures and operations for the disclosed inventive systems, apparatus, methods and computer-readable storage media. These drawings in no way limit any changes in form and detail that may be made by one skilled in the art without departing from the spirit and scope of the disclosed implementations.

FIG. 1A is a simplified conceptual diagram of a system and method for applying machine learning to objectively and systematically evaluate a team of individuals.

FIG. 1B illustrates details of a feature vector of the system of FIG. 1A.

FIG. 2 is a simplified conceptual diagram illustrating examples of personal characteristics of an individual team member that can be used in evaluation of a team.

FIG. 3 is a conceptual diagram illustrating a computer-implemented process for generating a team character score [aggregate or team profile] based on combining character scores across a plurality of individual team members.

FIG. 4 is a simplified conceptual diagram illustrating a computer-implemented process for generating an overall team score based on combining team member character components and assessing the team composition based on corresponding predetermined character component distribution data.

FIG. 5 is a simplified conceptual diagram illustrating analysis of a team profile relative to a preferred distribution.

FIG. 6 is a simplified flow diagram of an illustrative process consistent with aspects of the present disclosure.

DETAILED DESCRIPTION

Examples of systems, apparatus, computer-readable storage media, and methods according to the disclosed implementations are described in this section. These examples are being provided solely to add context and aid in the understanding of the disclosed implementations. It will thus be apparent to one skilled in the art that the disclosed implementations may be practiced without some or all of the specific details provided. In other instances, certain process or method operations, also referred to herein as “blocks,” have not been described in detail in order to avoid unnecessarily obscuring the disclosed implementations. Other implementations and applications also are possible, and as such, the following examples should not be taken as definitive or limiting either in scope or setting.

As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module” or “system.” Furthermore, aspects of the present invention may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.

Aspects of the present invention are described below with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which are implemented via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.

The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which are implemented on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.

In the following detailed description, references are made to the accompanying drawings, which form a part of the description and in which are shown, by way of illustration, specific implementations. Although these disclosed implementations are described in sufficient detail to enable one skilled in the art to practice the implementations, it is to be understood that these examples are not limiting, such that other implementations may be used and changes may be made to the disclosed implementations without departing from their spirit and scope. For example, the blocks of the methods shown and described herein are not necessarily performed in the order indicated in some other implementations. Additionally, in some other implementations, the disclosed methods may include more or fewer blocks than are described. As another example, some blocks described herein as separate blocks may be combined in some other implementations. Conversely, what may be described herein as a single block may be implemented in multiple blocks in some other implementations. Additionally, the conjunction “or” is intended herein in the inclusive sense where appropriate unless otherwise indicated; that is, the phrase “A, B or C” is intended to include the possibilities of “A,” “B,” “C,” “A and B,” “B and C,” “A and C” and “A, B and C.”

FIG. 1A is a simplified conceptual diagram of a system and method for applying machine learning to objectively and systematically evaluate a team of individuals. We refer to a “team” in the general sense of a group of individuals who work together for a common purpose. More specifically, this disclosure uses a business, especially a start-up entity, as an example of a common purpose. This system infers an entrepreneurial character (EC) of an individual (team member), and of a group of individuals working as a team, using dispersed and heterogeneous sources of data. The system can be fully automated.

In FIG. 1A, an example of a system 100 is shown. The system comprises a set of software components executable on one or more processors. Preferably, the processors may be provisioned locally or in a scalable network of computers or processors, aka “the cloud.” The system 100 is configured to input or receive data from data sources 102. For example, data used can be extracted from web pages 104, structured forms 106, or documents 108 in different formats (pdf, doc, etc). Other data formats may include CSV, ARFF, JSON, etc. Data sources may be public, paid, or private domain. These examples of data sources are merely illustrative and not limiting. Details of collecting electronic data, for example, over a network, are known. In some cases, automated processes may be used to search the web and collect potentially useful data. The data preferably is collected over an extended period of time, at more or less regular intervals, although the frequency and number of data collections is not critical.

A feature synthesizer component 110 is arranged to process the received data from sources 102. Such processing may include identifying data fields, field types, and corresponding data values, etc. The feature synthesizer may determine which fields of data to import into a dataset, and which to ignore. The processing may include cleaning or “scrubbing” the data to remove errors or anomalies. In some cases, text analysis may be applied to text fields, for example, tokenizing, stop word processing, stemming, etc. to make the data more usable. Further, various types of input data besides text can be used; for example, sound or image files. As a simple example, “wears a tie” can be synthesized from an image of a person wearing a tie, such as from a profile posted in social media.
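By way of illustration only, the text analysis mentioned above (tokenizing, stop word processing, stemming) might be sketched as follows. The stop-word list, the suffix rules, and the function names are hypothetical, and the crude suffix stripping merely stands in for a real stemming algorithm.

```python
import re

# Hypothetical stop-word list and suffix rules, chosen only for illustration.
STOP_WORDS = {"a", "an", "the", "and", "or", "of", "in", "at", "to", "is", "was"}
SUFFIXES = ("ing", "ed", "s")


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def crude_stem(token: str) -> str:
    """Strip one common suffix; a stand-in for a real stemmer."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text: str) -> list[str]:
    """Tokenize, drop stop words, then stem the remaining tokens."""
    return [crude_stem(t) for t in tokenize(text) if t not in STOP_WORDS]


tokens = preprocess("Presented at the conferences and filed patents in 2015")
```

The resulting token list could then be stored as a text-derived field in the corresponding dataset.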

More importantly, the feature synthesizer 110, although illustrated as a single entity for simplicity, actually comprises N individual feature synthesizer components. Each individual feature synthesizer is arranged to provide data, derived from the input data sources 102, and store it in a corresponding Dataset 112 for use with a corresponding specialized model builder 120. The system is initialized or configured for processing a given set of attributes of interest.

A feature synthesizer for a given attribute is configured to recognize, and extract from the input data, information that is indicative of the attribute of interest. Some examples are given in Table 1 below. It then stores the extracted data in the corresponding dataset 112. As discussed below, the process is repeated periodically over time. To illustrate, a feature synthesizer directed to technology understanding, for example, might look for data on a person's education, technical degrees, patents, and work experience. It may collect what degrees were earned at what schools, and when. It might even look for grade reports or special awards or designations such as cum laude. It may evaluate technical publications of which the person was an author. All of this data is collected into a dataset for the technology understanding attribute. As another example, a feature synthesizer for the attribute “attention to detail” may collect writings authored by the person of interest, and determine a frequency of misspellings or grammatical errors in those writings. Or, inconsistencies within the same writing may be an indicator of lack of attention to detail. Again, the corresponding feature synthesizer component gleans data relevant to its task from the input data sources and stores it in a dataset.
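As a minimal sketch of the attention-to-detail example, a feature synthesizer might compute the frequency of misspellings in a writing as the fraction of words not found in a reference vocabulary. The tiny vocabulary and the function name below are assumptions made for illustration; a practical system would presumably rely on a full dictionary or spell checker.

```python
import re

# Hypothetical reference vocabulary; a real system would use a full dictionary.
KNOWN_WORDS = {"we", "built", "a", "scalable", "platform", "for", "our", "customers"}


def misspelling_rate(text: str, vocabulary: set[str]) -> float:
    """Return the fraction of word tokens that are absent from the vocabulary."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    unknown = sum(1 for w in words if w not in vocabulary)
    return unknown / len(words)


# Two of the eight words ("bilt", "platfrom") are not in the vocabulary.
rate = misspelling_rate("We bilt a scalable platfrom for our customers", KNOWN_WORDS)
```

The resulting rate is one candidate field for the dataset associated with the attention-to-detail attribute.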

The dataset must also include an assessment or score for the particular attribute or variable of interest, at least for some of the records. In some cases, this evaluation may be conducted programmatically. In other cases, records may be evaluated by an expert with regard to the attribute of interest, and the evaluation results input to the dataset in association with the records reviewed. The evaluation may be expressed as a binary result (detail oriented or not detail oriented; high level of technical understanding, or not). In some embodiments, these evaluations may take the form of an analog value, say between 0 and 1.

Referring again to FIG. 1A, a plurality (N) of specialized model builder components 120₁ ... 120_N are provided. Each model builder is arranged to create a machine-usable model of the data in its dataset 112₁-112_N. For each attribute of interest #1-#N, some data in the corresponding dataset that includes an evaluation of the attribute of interest may be used to train the model. In this way, the model 130 can then be used to evaluate additional data in the corresponding dataset that does not include an explicit assessment of the attribute or variable of interest. The trained model can then predict that assessment and provide a score for the corresponding attribute. In FIG. 1A, each score output from an individual model provides a corresponding character element 142. The character element scores #1-#N together form an individual team member character vector or profile. As noted, Boolean or analog values may be used. Analog values preferably are normalized in the normalizer 140, and then the vector 150 stored in memory. Details of building machine learning models such as classifiers are known. For example, see Francisco J. Martin et al., U.S. Pat. No. 9,269,054 incorporated herein by this reference.
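The train-then-predict flow described above might be sketched, for a single attribute, as follows. A simple nearest-centroid classifier is used here purely as a stand-in; the disclosure does not specify this technique, and the feature names, record values, and function names are hypothetical.

```python
from statistics import mean

# Each record: (feature values, expert label or None when not yet evaluated).
# The features here are hypothetical: (years of tech experience, patents filed).
records = [
    ((10.0, 4.0), True),
    ((8.0, 2.0), True),
    ((1.0, 0.0), False),
    ((2.0, 0.0), False),
    ((7.0, 3.0), None),   # unlabeled: to be scored by the trained model
    ((2.5, 1.0), None),
]


def build_model(labeled):
    """Train: compute one centroid per label from the labeled records."""
    centroids = {}
    for label in (True, False):
        rows = [x for x, y in labeled if y is label]
        centroids[label] = tuple(mean(col) for col in zip(*rows))
    return centroids


def score(model, x):
    """Predict a 0-to-1 score: relative closeness to the positive centroid."""
    def dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5

    d_pos, d_neg = dist(x, model[True]), dist(x, model[False])
    if d_pos + d_neg == 0:
        return 0.5
    return d_neg / (d_pos + d_neg)


model = build_model([(x, y) for x, y in records if y is not None])
scores = [score(model, x) for x, y in records if y is None]
```

Because the score is already confined to the range 0 to 1, no further normalization is applied in this sketch; other model types may require a separate normalizing step.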

Example types of information (attributes) about team members that could be included in an entrepreneurial team member's profile could include background and experience data, such as that shown below.

TABLE 1. Example attributes and data tending to support them.

1. Perseverance - Has finished a long-term endeavor? (e.g. 5-year degree plus Ph.D.)
2. Adaptability - Has lived in more than one city, country, and for how long?
3. Competitiveness - Has won a significant prize or award? Involved in other competitive activities like sports?
4. Creativity - Has invented something special?
5. Communicativeness - Has presented at many conferences?
6. Detail Orientation - Misspellings in the resume, paragraphs indented irregularly, etc.
7. Market Understanding - How many years of experience?
8. Technology Understanding - Holds a tech degree? Practical tech experience?
9. Other Experience - Business experience? Startup experience?
10. Network - Number of connections? MBA?
11. Customer Orientation - Held a sales role?
12. Design Orientation - Attended design school? Practical design experience?

Other information that could be included in a profile might address character attributes such as “nonconformist?”, “dissenter?”, or “maverick?”, or aggregate attributes such as “rebel” for the preceding distinct attributes. Suitable feature synthesizers can be configured to collect the data for model building.

In some systems, data may be collected for a mature organization, as distinguished from a startup. Here we mean an entity that has reached an “outcome” indicative of success or failure (conveniently, a binary variable). Preferably, such data may be collected from thousands of organizations so that it is statistically meaningful. Further, detailed information for each such entity may include attribute data for each team member in that entity, such as described herein. That data may be processed, and the actual outcomes included in appropriate datasets. This information may be used to further train or “tune” the attribute models by taking into account the eventual outcomes of actual companies.

Referring again to FIG. 1A, a feature vector 150 is stored in a memory, the feature vector comprising a series of elements or fields numbered 1 to N for illustration, each element storing one or more values for a corresponding attribute or feature for an individual team member as discussed. It is not required that there be a value for every element in the vector. In some cases there may be insufficient data in the dataset for analysis. The number of features N is not critical; generally a greater number of features analyzed will tend to generate more reliable results. Tens of features may be sufficient, while hundreds may provide more accurate scoring, both for comparing individual team members, and for comparing aggregate team scores.

FIG. 2 is a simplified conceptual diagram illustrating examples of personal characteristics or attributes of an individual team member that can be used in evaluation of the team member and the team. The attributes shown are merely illustrative and not limiting. Each of these (and other) attributes can be the target of a corresponding model, built from the datasets as described above or equivalent kinds of data. The example attributes listed are positive, that is, generally desirable attributes. Other data can be acquired and analyzed to consider generally negative attributes, say felony convictions or bankruptcy filings. This data too can be quantified, and the influence used to reduce overall team member scores. Both positive and negative attributes, i.e., a combination, can be analyzed in a system of the type disclosed.

FIG. 1B illustrates an example of a feature vector in more detail. The feature vector 150 may comprise a plurality of individual character elements or features 152, again numbered 1-N. In some embodiments, each individual feature can be either of type Boolean (preferably with a confidence level associated), or of type Numeric. Feature field 154 is expanded to show a Boolean field 156, along with a corresponding confidence field 158. In addition, the feature 154 may include a mutability field 160. Preferably, the mutability field 160 comprises a pair of values: 1) a number that represents the level of mutability of the character and 2) a sign. They are respectively used to indicate to what extent the value associated with an individual feature is expected to change over time (mutability) and in which direction (positively or negatively). The level of mutability may conveniently be scaled, for example, to 0-1.

In other embodiments, mutability may be a single Boolean value (indicating mutable or not). For example, whether a person (team member) speaks English might take a Boolean value, or it may have a scaled value from 0 (not at all) to 1 (fluent). Referring again to FIG. 1B, another feature 166 is expanded to illustrate a numeric field type 168. In an embodiment, a numeric type of feature may have a value normalized within a certain scale. This feature (attribute) 166 also may include a mutability value pair 170 as just described.
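One possible in-memory representation of the feature vector of FIG. 1B is sketched below. All class and field names are hypothetical; the sketch assumes a Boolean or normalized numeric value per feature, an optional confidence level, and a mutability pair consisting of a level scaled from 0 to 1 and a sign.

```python
from dataclasses import dataclass, field
from typing import Optional, Union


@dataclass
class Mutability:
    level: float   # 0 (not expected to change) to 1 (highly mutable)
    sign: int      # +1 if expected to change positively, -1 if negatively


@dataclass
class Feature:
    name: str
    value: Union[bool, float]            # Boolean or normalized numeric value
    confidence: Optional[float] = None   # preferably present for Boolean values
    mutability: Optional[Mutability] = None


@dataclass
class CharacterVector:
    member_id: str
    timestamp: str                       # collection time label, e.g. "t1"
    features: list[Optional[Feature]] = field(default_factory=list)


vector = CharacterVector(
    member_id="TM1",
    timestamp="t1",
    features=[
        Feature("speaks_english", True, confidence=0.9,
                mutability=Mutability(level=0.2, sign=+1)),
        Feature("technology_understanding", 0.75,
                mutability=Mutability(level=0.5, sign=+1)),
        None,  # an element may be empty when the dataset lacks sufficient data
    ],
)
```

Allowing an element to be empty reflects the statement that a value is not required for every element in the vector.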

Referring again to FIG. 1A, the first character vector 150 is stored in memory, associated with a first time t₁. Before doing so, a determination of mutability may be made by a mutability component 144, so that mutability data is included in the character vector. At a later time t₂, additional data is collected from the data sources 102, and processed by the feature synthesizers 110 for the same team member TM1. The new (later) data is processed by the same models 130 and a new feature vector formed, as before. Line 132 represents new data being modeled to create a new feature vector without modifying the existing models. The new feature vector is added to memory, labeled as 160. Subsequently, additional input data can be acquired and processed as described, at times t₃, t₄, etc. These data collections may be periodic or more random. Each of them is stored in memory, illustrated as vectors 160 and 170. The number is not critical.

The same process is repeated for each team member, or any selected subset of a team. Thus, the feature synthesizer, as part of collecting raw data, will identify the other team members of interest, and collect data associated with each of them. Accordingly, a dataset may include records for each team member of interest, or separate datasets may be provisioned. Details of the data storage are a matter of design choice. FIG. 1A shows a first set of vectors 150, 160 and 170 at times t₁, t₂, t₃, respectively, all for a first team member TM1. A second set of vectors 180, 182, 184 is also collected at different times for a second team member (TM2). The collection times t₁, t₂, etc. may or may not be the same for each team member. Finally, a third set of vectors illustrated at 190, 192, 194 is shown for a third team member (TM3). All of this data is input to a multiple character combiner component 200 to develop a team score.

Individual team member profiles may be combined by formal mathematical rules into an aggregate profile for the team as represented in FIG. 3. The figure shows a simplified flow diagram illustrating a computer-implemented process for generating a team character score [aggregate or team profile] 310 based on combining character scores across a plurality of individual team members. Individual profiles may be cast as elements of an abstract algebraic structure (e.g. a group, ring, module, field, or vector space) in those cases where the profile and rules for combining them have sufficient structure. They could also be characterized and combined in a more ad hoc fashion. In FIG. 3, a Team Member Character Score #1 labeled 302 is combined with Team Member Character Scores #2 (304) through #M (306) to form the Team Character Score 310. Each individual team member score, for example, 302, comprises a plurality of elements #1-#N. The team score may comprise a feature vector as described above.

Each vector may correspond to a vector such as those described with regard to FIG. 1A for a given team member. For example, the vectors in FIG. 3 may correspond to vectors 150, 180 and 190 in FIG. 1A each associated with a different team member. For each vector, an aggregate score 312, 314, 316 is determined by combining the individual attribute values in the corresponding vector. The aggregate scores may be determined by any suitable operation such as arithmetic sum, mean value, etc. Operations may be applied to combine numeric as well as Boolean values. These aggregate scores for each team member can be used to compare or rank team members. A reporting component (not shown) can generate these results to a user interface or API.
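A minimal sketch of determining and ranking the aggregate scores follows, assuming that Boolean values are counted as 1 or 0 and that empty elements are skipped. The example vectors and the function name are hypothetical.

```python
from statistics import mean

# Hypothetical character vectors: Booleans count as 1.0 / 0.0, None means no value.
team = {
    "TM1": [True, 0.8, 0.6, None],
    "TM2": [False, 0.9, 0.4, 0.7],
    "TM3": [True, 0.5, None, 0.3],
}


def aggregate(vector, how="mean"):
    """Combine the attribute values of one vector, skipping empty elements."""
    values = [float(v) for v in vector if v is not None]
    if not values:
        return 0.0
    return sum(values) if how == "sum" else mean(values)


# Aggregate score per team member, then rank from highest to lowest.
scores = {member: aggregate(vec) for member, vec in team.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```

Using the mean rather than the sum keeps members with missing elements comparable to members with complete vectors; either operation is consistent with the description above.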

An EC (character score) represents, and quantifies objectively, whether or to what extent an individual is appropriate to start or continue leading a company and if the character is predicted to evolve positively or negatively. More specifically, the mutability metrics stored in feature vectors such as 160, 170 can be acquired and analyzed over time in the vectors from T=0 to T=M. With these metrics, average values, rates of change, and other statistical measures can be used to assess and predict where each attribute is moving for those that are mutable. Increasing values of a positive attribute may contribute to a higher overall team member score 312, 314, 316 and to a higher team score 333.
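For example, a rate of change for one mutable numeric attribute might be estimated from the values stored at successive collection times. The sketch below uses an ordinary least-squares slope, which is only one of many possible statistical measures; the observation values and the one-step projection are assumptions made for illustration.

```python
def slope(times, values):
    """Least-squares slope of values against times (rate of change per unit time)."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    denom = sum((t - t_mean) ** 2 for t in times)
    if denom == 0:
        return 0.0
    return sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values)) / denom


# Hypothetical observations of one numeric attribute for TM1 at four collection times.
times = [0.0, 1.0, 2.0, 3.0]
values = [0.40, 0.50, 0.60, 0.70]

rate = slope(times, values)

# Project one time step ahead and clamp to the normalized 0-to-1 range.
projected = min(1.0, max(0.0, values[-1] + rate * 1.0))
```

A positive rate for a positive attribute would tend to raise the team member score, and a negative rate would tend to lower it.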

As described above with regard to FIG. 3, Team Member Character Score #1 (302) is combined with Team Member Character Scores #2 (304) through #M (306) to form the team score 333. A mutability function 330 may be used to access the mutability metrics stored in the vectors and apply them to the team score 333 to provide a final team character score 310.

The combiner 200 (FIG. 1A) used to compute the overall EC of a team “TCS” 310 can be adapted to reflect the type of company the team will operate. The same applies to other attributes about a company (market, stage, funding, etc.). Actually, a combiner for an individual EC or team EC may be a combination of combiners.

Distribution of Character Components

Some character components or attributes are generally positive for every individual in which they are found, for example, hard working or well educated, and they remain positive when these attributes are found from the input data to exist across multiple members of a team. In a sense, they may be considered additive contributions to the overall team score. In some cases, attributes such as assertiveness, strong leader, or authoritarian may be positive for an individual, but may not be positive where found in multiple members on the same team. For this reason, our system may implement a preferred distribution (or composition) in assessing a team. For some attributes, a very small number of instances (team members) may be preferred. For other attributes, the more team members that exhibit the attribute, the better for overall team function. To that end, we create a preferred distribution for each character component. Then the process assesses how closely the distribution for a given attribute matches the preferred distribution. Mathematically, this can be done in various ways, for example, summing the differences between the actual distribution and the preferred distribution, or using a sum of squares, etc. In some embodiments, correlation coefficients may be used to assess this “closeness” or deviation from the preferred distribution. Preferred distributions may be created (or inferred) based on historical data that describes teams that were successful.
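The closeness measures mentioned above might be sketched as follows, assuming that the actual and preferred distributions are each represented as a list of numbers of equal length. The example values and function names are hypothetical, and the Pearson coefficient is used as one example of a correlation coefficient.

```python
from math import sqrt


def sum_abs_diff(actual, preferred):
    """Sum of absolute differences; 0 means a perfect match."""
    return sum(abs(a - p) for a, p in zip(actual, preferred))


def sum_sq_diff(actual, preferred):
    """Sum of squared differences; penalizes large deviations more heavily."""
    return sum((a - p) ** 2 for a, p in zip(actual, preferred))


def pearson(actual, preferred):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(actual)
    ma, mp = sum(actual) / n, sum(preferred) / n
    cov = sum((a - ma) * (p - mp) for a, p in zip(actual, preferred))
    va = sum((a - ma) ** 2 for a in actual)
    vp = sum((p - mp) ** 2 for p in preferred)
    return cov / sqrt(va * vp) if va and vp else 0.0


# Hypothetical: fraction of team members exhibiting each of four attributes.
actual = [1.0, 0.5, 0.25, 0.75]
preferred = [1.0, 0.25, 0.25, 1.0]

abs_deviation = sum_abs_diff(actual, preferred)
sq_deviation = sum_sq_diff(actual, preferred)
correlation = pearson(actual, preferred)
```

Lower deviation values and a correlation closer to 1 both indicate a closer fit to the preferred distribution.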

FIG. 4 is a simplified conceptual diagram illustrating a computer-implemented process for generating an overall team score based on combining team member character components and assessing the combined team member character components based on corresponding predetermined character component distribution data for each component. For example, in FIG. 4, a first row of elements 402 may comprise the attribute values for a selected attribute #1 across all M team members, that is TM1 value #1 through TMM value #1. These values are combined by applying a selected operator 410, to form an overall score for the team for that attribute #1. In the second row 404, the element values are collected for a second attribute #2, again for all M team members. A second operator 412 is applied to this data to form the team result 422. Similarly, additional operators are applied for each element or attribute, across all team members, finally determining the last team attribute TM#N at 426. The team scores TM#1-TM#N may be combined to form an overall team score T shown at 430.

The operators 410, 412, 414 may be selected according to the specific attribute of interest. To illustrate, if the team is going to work together in the English language, it would be important for all members of the team to speak English. Here, we will use English language skill for attribute #1, and assume it is a Boolean variable. Thus we apply the Boolean AND operator for operator 410 so that the team result at 420 will be true (Boolean 1) only if all team members speak English.

As another example, suppose the team is going to build a web application for consumers to use. It would be important for at least one team member to be skilled at building user interfaces (UX). Here, we will use UX skill for attribute #2, and again assume it is a Boolean variable (the skill is present or it is absent in each team member, as ascertained from the input data by a corresponding feature synthesizer and model). Assuming that one person skilled in UX is enough, we apply the Boolean OR as operator 412 in the drawing, to determine the team result 422. If one or more team members have that UX skill, the result 422 will be true.

Suppose that attribute #N is a strong leader and authoritarian. It would be helpful to have exactly one person on the team with that attribute. Again, for now, we assume it is a Boolean variable. For the operator 414 we apply the Boolean XOR operator across the team members, taken here in the “exactly one” sense (a pairwise XOR chained across more than two members would also be true for any odd number of members having the attribute). If exactly one team member has that attribute, the output at 426 will be true. In general, Boolean logic can be applied to realize any desired composition of the team. Further, compound expressions can be used in forming the team values for a given attribute. A compound expression here refers to a Boolean operation where at least one of the operands is itself a Boolean function of the team member's data.

The results at 420, 422, 426, that is, the Boolean output for the team for each attribute, together form a team profile, namely a vector of Boolean values. The number of “ones” can be counted to form a team score. This score will improve in proportion to the number of elements or attributes for which the team “fits” the preferred distribution. This score can be used to compare teams or subsets of a team quite readily. Different sets of attributes can be used by creating a desired or paradigm distribution and processing the data with correspondingly selected operators. Comparison of the team's resulting profile to the paradigm distribution will immediately identify where the team misses the mark. As explained above, some attributes are not simply input data from the input data sources. Rather, some attributes must be inferred, or estimated, by the feature synthesizer and model building processes described above.
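The per-attribute operators of FIG. 4 and the resulting team score can be illustrated with a short sketch. The attribute names and values are hypothetical. The “exactly one” requirement is implemented here as a count equal to one, because a pairwise XOR folded across three or more members is true for any odd count.

```python
# Hypothetical Boolean attribute values, one entry per team member (TM1..TM3).
team_attributes = {
    "speaks_english": [True, True, True],
    "ux_skill": [False, True, False],
    "strong_leader": [True, False, False],
}


def exactly_one(values):
    """True only when a single member has the attribute."""
    return sum(bool(v) for v in values) == 1


# One operator per attribute, selected according to the desired team composition.
operators = {
    "speaks_english": all,         # Boolean AND: every member must have it
    "ux_skill": any,               # Boolean OR: at least one member has it
    "strong_leader": exactly_one,  # exactly one member has it
}

# Team profile: one Boolean result per attribute.
team_profile = {name: operators[name](vals) for name, vals in team_attributes.items()}

# Team score: the number of attributes for which the team fits the desired composition.
team_score = sum(team_profile.values())
```

Keeping the operators in a mapping keyed by attribute makes it straightforward to substitute a different paradigm composition for a different type of team.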

We have discussed several examples of Boolean attributes. Other attributes, or some of the same attributes, may have numeric values, for example, in a range of 0 to 1. For example, English language proficiency or UX programming skills can be assessed on a numeric scale. A team can be evaluated using these metrics as well. FIG. 5 is a simplified conceptual diagram illustrating analysis of a team profile relative to a preferred numeric distribution. Here, numeric values are scaled and quantized from 0 to 1. A team EC (profile) 510 shows values (from 0 to 1) for each attribute a, b, c, etc. For example, if an attribute of interest is years of formal education, the average or median number of years of education across the team members can be scaled and indicated in a vector. Other attributes, such as language skills, can likewise be represented as numeric data types. The team attribute values may be collected in a vector 520, where we illustrate the values graphically like a histogram. A preferred or paradigm distribution for the same set of attributes can be provided, shown as histogram 530. The preferred distribution may be generated by analysis of a large collection of data, for example, data that reflects startup entities' teams and their success or failure several years after they started. The team vector 520 may be compared to the preferred distribution vector 530. Here, we see that attribute 522, for example, in the team vector 520 has the same value as the corresponding value 532 in the preferred distribution 530. The attribute 524 in the team vector has a lower corresponding value 534 in the preferred distribution 530. The attribute with value 526 has a higher corresponding value 536 in the preferred distribution, etc. These differences or “deltas” are illustrated as a delta histogram 540. The closeness or “fit” of the team vector 520 to the preferred distribution 530 can be quantified by the delta data. In an embodiment, an area of the rectangles 542, 544, and 546 can be calculated to determine a team score 550.
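One possible way to compute the delta data and an area-based figure is sketched below, for illustration only. The disclosure does not specify how the rectangle areas map to the team score 550, so the sketch assumes bars of unit width, takes the absolute value of each delta, and treats a smaller total area as a closer fit. The numeric values are hypothetical.

```python
from typing import List


def attribute_deltas(team: List[float], preferred: List[float]) -> List[float]:
    """Per-attribute difference between team value and preferred value."""
    if len(team) != len(preferred):
        raise ValueError("vectors must cover the same attributes")
    return [t - p for t, p in zip(team, preferred)]


def delta_area(deltas: List[float], bar_width: float = 1.0) -> float:
    """Total area of the delta rectangles, assuming equal-width bars."""
    return sum(abs(d) * bar_width for d in deltas)


# Hypothetical values on the 0-to-1 scale: the first attribute matches,
# the second is above the preferred value, the third is below it.
team_vector = [0.5, 0.75, 0.25]
preferred_vector = [0.5, 0.5, 0.5]

deltas = attribute_deltas(team_vector, preferred_vector)
area = delta_area(deltas)
print(deltas, area)
```

The signed deltas preserve the direction of each departure for reporting, while the unsigned area gives a single number for comparison between teams.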

The team score can be used for comparison to other teams. Importantly, the delta data can quickly identify where the team attributes depart from the preferred values. Further, the size of those departures can be reported to help to build a better team.

In viewing and using these metrics, the mutability values discussed above may be taken into consideration. Where a team score is relatively low, but the attributes that contribute to lowering the score are mutable in a positive direction, the score may improve over time. On the other hand, where the mutability values are low or negative, improvement over time is less likely.
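For illustration, the mutability values might be taken into account when reporting problematic attributes along the lines of the following sketch. It assumes that mutability is a signed number per attribute, that an attribute with no recorded mutability is treated as immutable, and that a hypothetical threshold of 0.1 separates “low” values from values that suggest improvement; none of these specifics is prescribed by the disclosure.

```python
from typing import Dict, List


def improvement_outlook(
    problem_attributes: List[str],
    mutability: Dict[str, float],
    threshold: float = 0.1,  # hypothetical cutoff for "low" mutability
) -> Dict[str, str]:
    """Label each score-lowering attribute by its prospect of improving."""
    outlook = {}
    for attr in problem_attributes:
        # Unknown mutability is treated as 0.0 (an assumption).
        value = mutability.get(attr, 0.0)
        if value > threshold:
            outlook[attr] = "may improve"
        else:
            outlook[attr] = "improvement less likely"
    return outlook


# Hypothetical attributes that lower the team score, with mutability values.
result = improvement_outlook(
    ["language_skill", "formal_education"],
    {"language_skill": 0.4, "formal_education": 0.0},
)
print(result)
```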

FIG. 6 is a simplified flow diagram that summarizes the processes described above in one embodiment. Identify data sources and configure data collection, block 602. Refer to sources 102 in FIG. 1A for example. Upload raw data for a team member, block 604. Process the data to synthesize feature data, block 606. Use the feature data to populate a dataset, block 610. The dataset may correspond to a dataset 112 in FIG. 1A. At decision 612, determine whether prediction models were previously provisioned. If so, apply the models to the dataset to generate a score for each attribute or character element for the current team member, block 620. If not, provide the data to a specialized model builder for each attribute, block 616, generate or update the models, block 618, and then apply them, block 620.
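The branch at decision 612 can be sketched, purely as an example, as a lookup in a registry of per-attribute models that falls back to a model builder when no model exists. The threshold “model” below is a deliberately trivial stand-in for a specialized model builder, and the function names, the attribute name, and the training rows are all hypothetical.

```python
from typing import Callable, Dict, List

Row = Dict[str, float]
Model = Callable[[Row], float]


def build_threshold_model(attr: str, training_rows: List[Row]) -> Model:
    """Stand-in model builder: scores 1.0 at or above the training mean."""
    mean = sum(row[attr] for row in training_rows) / len(training_rows)
    return lambda row: 1.0 if row[attr] >= mean else 0.0


def score_member(
    member_row: Row,
    attributes: List[str],
    models: Dict[str, Model],
    training_rows: List[Row],
) -> Dict[str, float]:
    """Apply one model per attribute, building any model that is missing."""
    scores = {}
    for attr in attributes:
        if attr not in models:  # decision 612: no model provisioned yet
            # Blocks 616 and 618: build and register the model.
            models[attr] = build_threshold_model(attr, training_rows)
        scores[attr] = models[attr](member_row)  # block 620
    return scores


# Hypothetical training data and one team member's feature data.
models: Dict[str, Model] = {}
training = [{"ux": 0.2}, {"ux": 0.8}]
member_scores = score_member({"ux": 0.9}, ["ux"], models, training)
print(member_scores)
```

Because the registry is passed in and updated in place, a second team member processed with the same `models` dictionary reuses the model rather than rebuilding it.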

Further with regard to FIG. 6, apply a mutability analysis to add mutability metrics to the data, block 622. For some cases, the mutability may be predetermined. For example, a date of birth or the grant of a bachelor's degree is immutable, whereas a language skill may improve over time. In some embodiments, mutability may be inferred from changes in the team member data over time, as data is collected repeatedly (see decision 642 and FIG. 1A). Store the team member feature vector in memory, block 624. If there are more members not yet processed, decision 630, loop back to block 604 to collect data for the next team member. After all team members are processed, proceed via path 640 to a decision 642 whether to update the input data. In some embodiments, the input data may be updated periodically or according to some other schedule. If an update is indicated, return to block 604 to repeat the foregoing steps and acquire new data for each team member.
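A minimal sketch of the mutability analysis follows, offered as one example among many. It assumes that a mutability metric is a signed number equal to the average change in an attribute score per collection interval, and that predetermined immutable attributes are listed by name; the attribute names and score histories are hypothetical.

```python
from typing import List, Set

# Hypothetical attributes whose mutability is predetermined to be zero.
PREDETERMINED_IMMUTABLE: Set[str] = {"date_of_birth", "bachelor_degree"}


def mutability(attribute: str, score_history: List[float]) -> float:
    """Signed average change per collection interval; 0.0 if immutable."""
    if attribute in PREDETERMINED_IMMUTABLE or len(score_history) < 2:
        return 0.0
    intervals = len(score_history) - 1
    return (score_history[-1] - score_history[0]) / intervals


# Scores for one team member collected on three successive occasions.
improving = mutability("language_skill", [0.25, 0.5, 0.75])
declining = mutability("language_skill", [0.5, 0.25])
fixed = mutability("date_of_birth", [0.1, 0.9])
print(improving, declining, fixed)
```

The sign of the result indicates the direction of change and the magnitude indicates its size, so the value can be stored alongside the corresponding attribute score.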

Otherwise, proceed to block 644 to combine team member feature vectors to form a team (aggregate) feature vector. Next, compare the team vector to a preferred distribution or composition, block 646, as described in more detail above. The differences between the team vector and the preferred composition may be assessed, block 650, which may include generating an overall team score for ready comparison to other teams. Finally, results reporting, block 652, may include a final team score, problematic attributes, a mutability assessment, and other metrics that can be used to predict the success of the team and to improve its composition. The process concludes at terminator 660.

One of skill in the art will recognize that the concepts taught herein can be tailored to a particular application in many other ways. In particular, those skilled in the art will recognize that the illustrated examples are but one of many alternative implementations that will become apparent upon reading this disclosure. It will be obvious to those having skill in the art that many changes may be made to the details of the above-described embodiments without departing from the underlying principles of the invention. The scope of the present invention should, therefore, be determined only by the following claims.

Claims

1. A computer-implemented method comprising:

uploading digital input data associated with a member of a team from at least one data source;

synthesizing feature data from the input data for each one of a plurality of selected attributes to form at least one dataset of character feature data associated with the team member;

provisioning a plurality of specialized model builder components, each model builder component configured to build a prediction model for a corresponding one of the plurality of attributes;

training at least some of the specialized model builder components based at least in part on the synthesized feature data to form respective prediction models for the corresponding attributes;

separately applying each of the trained prediction models to at least a portion of the character feature data associated with the team member to form a corresponding score for each one of the selected attributes for the team member;

storing the team member scores to form a first feature vector;

combining the first feature vector with a plurality of additional feature vectors, each of which is associated with a different member of the same team, so as to form an aggregate team score vector;

deriving a team score based on the aggregate score vector; and

reporting the team score.

2. The method of claim 1 wherein deriving the team score includes comparing the aggregate team score vector to a predetermined preferred distribution.

3. The method of claim 1 wherein combining the team member feature vectors includes, separately for each attribute, combining the corresponding individual team member scores based on a predetermined Boolean operator selected to reflect a desired composition of the team.

4. The method of claim 1 wherein each of the attribute scores for a team member comprises one of a numerical value and a Boolean value.

5. The method of claim 4 wherein the attribute score includes a Boolean value and a confidence value.

6. The method of claim 1 including:

identifying attributes of the team aggregate score vector that deviate from the preferred distribution; and

reporting which attributes of the team aggregate score vector deviate from the preferred distribution.

7. The method of claim 1 and further including assessing a mutability metric for at least one of the selected attributes for a team member, based on changes in the corresponding attribute score over time.

8. The method of claim 7 wherein the mutability metric comprises a numeric value and a sign indicating a direction of change of the corresponding scores.

9. The method of claim 7 and further comprising including the mutability metric in the corresponding attribute score in the first feature vector.

10. The method of claim 1 and further including:

repeating the uploading and synthesizing steps to acquire new data;

separately applying each of the trained prediction models to the new data to form new scores for each one of the selected attributes for the team member; and

comparing the new scores to the first feature vector to determine mutability of at least one of the selected attributes.

11. A system comprising:

a network interface to acquire input data over a network;

a plurality of feature synthesizer components, each configured to process input data including extracting input data relevant to a corresponding attribute of a team member;

a plurality of specialized prediction models, each prediction model configured to generate a corresponding attribute score based on the input data extracted by the corresponding feature synthesizer component;

a datastore arranged to collect the attribute scores from the prediction models for a given team member to form a feature vector associated with the team member;

control logic to cause the system to acquire additional input data associated with different team members, process the additional input data in the feature synthesizer components, generate additional attribute scores based on the additional input data for each of the different team members, and add the additional attribute scores to the datastore to form additional feature vectors for each of the different team members, and combine the feature vectors for the team member and the different team members to form an aggregate team score vector.

12. The system of claim 11 and further including control logic to acquire later input data and to generate and store in the datastore updated team member feature vectors based on the later data.

13. The system of claim 12 and further comprising:

a mutability component arranged to assess changes in the team member feature vectors over time, based on comparing the feature vectors to the updated team member feature vectors, and to store mutability metrics in the corresponding vectors based on the assessment.

14. The system of claim 13 wherein: the control logic to combine the feature vectors to form the aggregate team score vector utilizes a selected Boolean operator to combine the individual team member scores for a given attribute.

15. The system of claim 11 and further including comparing the aggregate team score vector to a preferred distribution, and reporting a result of the comparison for evaluating the team.

16. A non-volatile machine-readable memory device storing a series of instructions executable on a processor to cause the processor to:

upload digital input data associated with a member of a team from at least one data source;

synthesize feature data from the input data for each one of a plurality of selected attributes to form at least one dataset of character feature data associated with the team member;

provision a plurality of specialized model builder components, each model builder component configured to build a prediction model for a corresponding one of the plurality of attributes;

train at least some of the specialized model builder components based at least in part on the synthesized feature data to form respective prediction models for the corresponding attributes;

separately apply each of the trained prediction models to at least a portion of the character feature data associated with the team member to form a corresponding score for each one of the selected attributes for the team member;

store the team member scores to form a first feature vector;

combine the first feature vector with a plurality of additional feature vectors, each of which is associated with a different member of the same team, so as to form an aggregate team score vector;

derive a team score based on the aggregate score vector; and

report the team score via an interface.

17. The device of claim 16 wherein the stored instructions are arranged to further cause the processor to: compare the aggregate team score vector to a predetermined preferred distribution; and

output a result of the comparison.

18. The device of claim 16 wherein the stored instructions are arranged to further cause the processor to: in forming the team score, separately for each attribute, combine the corresponding individual team member scores based on a predetermined Boolean operator selected to reflect a desired composition of the team with regard to the corresponding attribute.

19. The device of claim 18 wherein a Boolean AND operator is applied to combine the corresponding individual team member scores for an attribute that is required to be true for all team members according to a predetermined preferred distribution.

20. The device of claim 18 wherein a Boolean XOR operator is applied to combine the corresponding individual team member scores for an attribute that is required to be true for only one team member according to a predetermined preferred distribution.