Instant Lending Decisions

Info

Publication number: 20210272195
Type: Application
Filed: May 18, 2021
Publication Date: Sep 2, 2021
Applicant: Intuit Inc. (Mountain View, CA)
Inventors: Siddharth Ram (Menlo Park, CA), Richard N. Preece (San Diego, CA), Joesph Timothy Callinan, JR. (Campbell, CA), Kathy Tsitovich (Mountain View, CA), Eva Diane Chang (Mountain View, CA)
Application Number: 17/323,879

Abstract

A method including training a machine learning algorithm by iteratively adjusting, by a computer processor, adjusted matching parameters to increase a correlation between approval statistics of lending decisions and risk profiles. The risk profiles represent probabilities of businesses defaulting on a loan. The probabilities are derived from usage statistics of a business management application (BMA) used by the businesses. Iteratively adjusting continues until reaching a threshold correlation between the approval statistics and the lending decisions and the risk profiles. Training generates an updated machine learning algorithm. An updated risk score for a business entity is generated using a number of logins to the BMA made by the business entity.

Description

RELATED APPLICATIONS

This application is a continuation application of U.S. application Ser. No. 16/198,599, filed Nov. 21, 2018, now U.S. Pat. No. ______; which is a continuation application of U.S. application Ser. No. 13/956,281 filed Jul. 31, 2013; all of which are hereby incorporated by reference.

BACKGROUND

Banks often have trouble lending to a small business because they do not have an effective approach to assess the quality of a small business, and often default to using the small business proprietor's credit scores.

SUMMARY

In general, in one aspect, the one or more embodiments relate to a method. The method includes training a machine learning algorithm by iteratively adjusting, by a computer processor, adjusted matching parameters of the machine learning algorithm to increase a correlation between approval statistics of lending decisions and risk profiles. The risk profiles represent probabilities of businesses defaulting on a loan. The probabilities are derived from usage statistics of a business management application (BMA) used by the businesses. The lending decisions are received from a computing device of a first lender and represent decisions made by the first lender whether to extend the loan to the businesses based on the risk profiles. Iteratively adjusting continues until reaching a threshold correlation between the approval statistics and the lending decisions and the risk profiles. Training generates an updated machine learning algorithm. The method also includes updating a risk score of a risk profile for a business entity in the businesses to generate an updated risk score. The risk score of the risk profile for the business entity is updated using a number of logins to the BMA made by the business entity.

The one or more embodiments also relate to a system for generating a risk profile of a business entity. The system includes a computer processor. The system also includes a business management application (BMA) configured to obtain and store usage statistics of businesses that use the BMA. The system also includes memory storing instructions executable by the processor. The instructions include a risk profile generator configured to update a risk score of a risk profile for a business entity in the businesses to generate an updated risk score. The risk score of the risk profile for the business entity is updated using a number of logins to the BMA made by the business entity. The instructions include a machine learning algorithm configured to be trained by iteratively adjusting adjusted matching parameters of the machine learning algorithm to increase a correlation between approval statistics of lending decisions and risk profiles. The risk profiles represent probabilities of business entities defaulting on a loan. The probabilities are derived from usage statistics of a business management application (BMA) used by the business entities. The lending decisions are received from a computing device of a lender and represent decisions made by the lender whether to extend the loan to the businesses based on the risk profiles. Iteratively adjusting continues until reaching a threshold correlation between the approval statistics and the lending decisions and the risk profiles. The system also includes a repository configured to store the trained machine learning algorithm.

The one or more embodiments also provide for a non-transitory computer readable medium storing instructions which, when executed by a computer processor, perform functionality. The functionality includes training a machine learning algorithm by iteratively adjusting, by a computer processor, adjusted matching parameters of the machine learning algorithm to increase a correlation between approval statistics of lending decisions and risk profiles. The risk profiles represent probabilities of businesses defaulting on a loan. The probabilities are derived from usage statistics of a business management application (BMA) used by the businesses. The lending decisions are received from a computing device of a first lender and represent decisions made by the first lender whether to extend the loan to the businesses based on the risk profiles. Iteratively adjusting continues until reaching a threshold correlation between the approval statistics and the lending decisions and the risk profiles. Training generates an updated machine learning algorithm. The functionality also includes updating a risk score of a risk profile for a business entity in the businesses to generate an updated risk score. The risk score of the risk profile for the business entity is updated using a number of logins to the BMA made by the business entity.

Other aspects of the invention will be apparent from the following description and the appended claims.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 shows a block diagram of a system in accordance with one or more embodiments of the invention.

FIG. 2 shows a flow chart of a method in accordance with one or more embodiments of the invention.

FIG. 3 shows an example in accordance with one or more embodiments of the invention.

FIG. 4 shows a computer system in accordance with one or more embodiments of the invention.

FIGS. 5A, 5B, 5C, 5D, 5E, 5F, 5G, 5H, 5I, 5J, 5K, and 5L show Table 1 in accordance with one or more embodiments.

FIGS. 6A, 6B, 6C, 6D, 6E, 6F, 6G, 6H, 6I, 6J, 6K, 6L, 6M, and 6N show Table 2 in accordance with one or more embodiments.

DETAILED DESCRIPTION

Specific embodiments of the invention will now be described in detail with reference to the accompanying figures. Like elements in the various figures are denoted by like reference numerals for consistency.

In the following detailed description of embodiments of the invention, numerous specific details are set forth in order to provide a more thorough understanding of the invention. However, it will be apparent to one of ordinary skill in the art that the invention may be practiced without these specific details. In other instances, well-known features have not been described in detail to avoid unnecessarily complicating the description.

In general, embodiments of the invention provide a method, system, and computer readable medium to generate a risk profile of a small business (SMB) based on accounting data and other third party business management application (BMA) data of the SMB. In particular, the accounting data and other third party BMA data are retrieved from a business management application (e.g., accounting application, payroll application, tax preparation application, personnel application, etc.) provided as software-as-a-service (SaaS) and used by the SMB. Specifically, the risk profile represents the likelihood that the SMB will be delinquent and/or default on a loan. In one or more embodiments, the risk profile is provided to a lender for making an expedient lending decision with respect to the SMB. In one or more embodiments, statistics of lenders' lending decisions based on provided risk profiles are analyzed to generate a correlation. Accordingly, the algorithm(s) used to generate the risk profile from the accounting data and other third party BMA data are adjusted to maximize the correlation.

FIG. 1 shows a block diagram of a system (100) for generating a risk profile based on third party BMA data for instant lending decisions in accordance with one or more embodiments of the invention. Specifically, the system (100) includes business entities (e.g., business entity A (101a)), lenders (e.g., lender X (102x)), a BMA (105) used by the business entities, and a risk profile generation tool (160) that are coupled via a computer network (110). In one or more embodiments of the invention, the risk profile generation tool (160), or a portion thereof, may be integrated with the BMA (105). In one or more embodiments of the invention, one or more of the modules and elements shown in FIG. 1 may be omitted, repeated, and/or substituted. Accordingly, embodiments of the invention should not be considered limited to the specific arrangements of modules shown in FIG. 1.

In one or more embodiments of the invention, the computer network (110) may include a cellular phone network, a wide area network, a local area network, a public switched telephone network (PSTN), or any other suitable network that facilitates the exchange of information from one part of the network to another. In one or more embodiments, the computer network (110) is coupled to or overlaps with the Internet.

In one or more embodiments, each of the business entities (e.g., business entity A (101a), business entity M (101m), business entity N (101n)), the lenders (e.g., lender X (102x), lender Y (102y)), the BMA (105), and the risk profile generation tool (160) may include any computing device configured with computing, data storage, and network communication functionalities. In one or more embodiments, the BMA (105) may be an accounting application, a tax preparation application, a payroll application, a personnel application, or any business management application. In one or more embodiments, the BMA (105) is provided by an application service provider, such as a software as a service (SaaS). For example, the BMA (105) may be operated by the application service provider (ASP) and accessed by the business entities (e.g., business entity A (101a), business entity M (101m), business entity N (101n)) on a subscription basis.

In one or more embodiments, BMA data (e.g., BMA data (105b) including user entered data (105c) and usage statistics (105d) of the business entity A (101a)) is generated in response to the business entities accessing the BMA (105). For example, the user entered data (105c) may include profile/configuration information specified by the business entity A (101a). In particular, such profile/configuration information may be entered into the BMA (105) by a user associated with the business entity A (101a), who may be an employee, a consultant, a business owner, etc. of the business entity A (101a). In one or more embodiments, at least a portion of the user entered data (105c) represents a measure of business activities performed by the business entity A (101a). In addition, the usage statistics (105d) may include statistics or other behavioral information representing how the BMA (105) is used by the business entity A (101a). Examples of the BMA data (105b) are shown in TABLE 1 and TABLE 2 below. In particular, TABLE 1, shown in FIGS. 5A, 5B, 5C, 5D, 5E, 5F, 5G, 5H, 5I, 5J, 5K, and 5L, lists a number of example BMA data each corresponding to a category of BMA data items. TABLE 2, shown in FIGS. 6A, 6B, 6C, 6D, 6E, 6F, 6G, 6H, 6I, 6J, 6K, 6L, 6M, and 6N, provides definitions of each BMA data item. Although the BMA data (e.g., BMA data (105b)) is shown in FIG. 1 as stored within the BMA (105), in one or more embodiments, the BMA data (e.g., BMA data (105b)) may not persist within the BMA (105). In one or more embodiments, the user entered data (105c) and usage statistics (105d) of the business entity A (101a) are stored in a repository (123) of the risk profile generation tool (160) as the user entered data A (140a) and usage statistics A (141a). 
Similarly, the BMA data (105b) of the business entity M (101m) and business entity N (101n) may also be stored in the repository (123) as the user entered data M (140m)/usage statistics M (141m) and user entered data N (140n)/usage statistics N (141n), respectively. For example, information stored in the user entered data A (140a)/usage statistics A (141a), user entered data M (140m)/usage statistics M (141m), and user entered data N (140n)/usage statistics N (141n) may be retrieved and used by the risk profile generation tool (160), as needed, instead of persisting within the BMA (105).

As shown in FIG. 1, the risk profile generation tool (160) includes a risk profile generator (107), an adaptive matching analyzer (108), and the repository (123) storing information used and/or generated by the risk profile generator (107) and the adaptive matching analyzer (108).

In one or more embodiments, the risk profile generator (107) is configured to obtain the BMA data (105b) from the BMA (105) for storing in the repository (123). For example, the user entered data (105c)/usage statistics (105d) included in the BMA data (105b) may be stored as the user entered data A (140a) and usage statistics A (141a) in the repository (123). Similarly, other BMA data (105b) associated with the business entity M (101m) and business entity N (101n) may be stored as the user entered data M (140m)/usage statistics M (141m) and user entered data N (140n)/usage statistics N (141n), respectively in the repository (123).

In one or more embodiments, the user entered data A (140a)/usage statistics A (141a), user entered data M (140m)/usage statistics M (141m), and user entered data N (140n)/usage statistics N (141n) are analyzed by the risk profile generator (107) to generate the risk profile A (142a) of the business entity A (101a), the risk profile M (142m) of the business entity M (101m), and the risk profile N (142n) of the business entity N (101n), respectively. Specifically, the risk profile A (142a), risk profile M (142m), and risk profile N (142n) represent a predicted probability of the business entity A (101a), business entity M (101m), and business entity N (101n), respectively, to be delinquent on any loan payment or to default on a loan. In one or more embodiments, the risk profile (e.g., risk profile A (142a), risk profile M (142m), and risk profile N (142n)) includes one or more of a probability of default, a probability of non-default, a probability of delinquency, a probability of non-delinquency, a probability of loan approval, and a probability of loan declination, each represented by a number score, a percentage score, a letter score, or other suitable type of score. For example, payment delinquency (i.e., late payment) and/or loan default (i.e., late payment exceeding a pre-determined duration and/or frequency) may occur when the loan is serviced by one of the lenders (e.g., lender X (102x), lender Y (102y)) or a loan service entity associated with these lenders.

In one or more embodiments, the risk profiles (e.g., the risk profile A (142a), risk profile M (142m), risk profile N (142n)) are generated by the risk profile generator (107) using an adaptively-determined matching algorithm such that the risk profiles correlate with actual occurrences of payment delinquency and/or loan default by the corresponding business entities (e.g., business entity A (101a), business entity M (101m), business entity N (101n)) as borrowers, for example during a particular time period. Accordingly, these risk profiles also indicate probabilities that future payment delinquency and/or loan default by the corresponding business entities may occur. Generally, actual occurrences of payment delinquency and/or loan default by the borrowers are tracked and compiled by lenders (e.g., lender X (102x), lender Y (102y)) as loan delinquency statistics. In one or more embodiments, these loan delinquency statistics are obtained by the risk profile generator (107) and stored in the repository (123) as loan default statistics A (144a), loan default statistics M (144m), and loan default statistics N (144n) corresponding to the business entity A (101a), business entity M (101m), and business entity N (101n), respectively. Note that the loan default statistics A (144a), loan default statistics M (144m), and loan default statistics N (144n) may be compiled over the same time period for some business entities (e.g., business entity M (101m), business entity N (101n)) and over different time periods for other business entities (e.g., business entity A (101a)).

In one or more embodiments, the aforementioned adaptively-determined matching algorithm includes a machine learning algorithm, such as a rule ensemble algorithm known to those skilled in the art. For example, the risk profile A (142a) may be generated by the risk profile generator (107) using the machine learning algorithm that has been trained based on risk-profile-to-loan-default correlation of other business entities. As shown in FIG. 1, the risk profile M (142m), risk profile N (142n), loan default statistics M (144m), and loan default statistics N (144n) are generated/obtained prior to generating the risk profile A (142a) and are used as part of a training data set (140) for iteratively adjusting the machine learning algorithm before generating the risk profile A (142a) therewith. “Iteratively adjusting” is referred to as “training” in the context of machine learning algorithms. In one or more embodiments, the risk profile generator (107) is configured to iteratively adjust (i.e., train) the adaptively-determined matching algorithm during a training phase by at least (i) providing, during an initial iteration of the training phase, the risk profile M (142m) and risk profile N (142n), among other risk profiles in the training data set (140), to one or more lenders (e.g., lender X (102x), lender Y (102y)) for making lending decisions (e.g., approved or declined), such as represented by the loan approval status M (143m), loan approval status N (143n), etc. with respect to the respective business entity M (101m), business entity N (101n), etc., (ii) obtaining the loan default statistics M (144m), loan default statistics N (144n), etc. in response to these lending decisions leading to an approval and initiation of the loans for the business entity M (101m), business entity N (101n), etc., (iii) analyzing the loan default statistics M (144m), loan default statistics N (144n), etc. in relationship to the risk profile M (142m), risk profile N (142n), etc. to generate a risk-profile-to-loan-default correlation, and (iv) adjusting, prior to a subsequent iteration of the training phase, the matching parameters (143) of the adaptively-determined matching algorithm to increase (e.g., optimize or maximize) the risk-profile-to-loan-default correlation for the subsequent iteration of the training phase.

In one or more embodiments, the training data set (140) may further include the corresponding user entered data, usage statistics, and loan approval statistics. In one or more embodiments, in response to a pre-determined result of iteratively adjusting (i.e., training) the adaptively-determined matching algorithm based on the training data set (140), the risk profile generator (107) is configured to analyze the user entered data A (140a) and the usage statistics A (141a), using the adjusted adaptively-determined matching algorithm, to generate the risk profile A (142a) of the business entity A (101a). For example, the pre-determined result may include an incremental change in the risk-profile-to-loan-default correlation between two contiguous iterations of the training phase being less than a pre-determined amount (e.g., less than 0.1% of the final risk-profile-to-loan-default correlation). In other words, the matching parameters (143) may be iteratively adjusted until any incremental percentage improvement of the risk-profile-to-loan-default correlation is less than 0.1% before the adaptively-determined matching algorithm is used to analyze the user entered data A (140a) and the usage statistics A (141a) for generating the risk profile A (142a) of the business entity A (101a).
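As an illustration only, the stopping rule described above may be sketched as follows. The callback `run_iteration`, the safety cap `max_iterations`, and the toy correlation sequence are hypothetical and are not part of the described embodiments; the sketch assumes the incremental change is measured relative to the most recent correlation value.

```python
from typing import Callable, List


def train_until_converged(
    run_iteration: Callable[[int], float],
    rel_tolerance: float = 0.001,  # 0.1%, the example threshold given in the text
    max_iterations: int = 1000,    # safety cap; an assumption, not from the text
) -> List[float]:
    """Run training iterations until the correlation gain drops below the tolerance.

    run_iteration(i) is a hypothetical callback that adjusts the matching
    parameters for iteration i and returns the resulting
    risk-profile-to-loan-default correlation.
    """
    history = [run_iteration(0)]
    for i in range(1, max_iterations):
        current = run_iteration(i)
        history.append(current)
        previous = history[-2]
        # Incremental change, measured relative to the latest correlation.
        if current != 0 and abs(current - previous) / abs(current) < rel_tolerance:
            break
    return history


def toy_iteration(i: int) -> float:
    """Toy stand-in for a training step: the correlation approaches 0.8."""
    return 0.8 * (1 - 0.5 ** (i + 1))


history = train_until_converged(toy_iteration)
print(len(history), history[-1])
```

With the toy sequence, each iteration halves the remaining gap to 0.8, so the loop stops once the relative gain between two contiguous iterations falls under 0.1%.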

In one or more embodiments, once generated, the risk profile A (142a) is provided by the risk profile generator (107) to the business entity A (101a). Accordingly, the business entity A (101a) may submit the risk profile A (142a) to one or more lenders (e.g., lender X (102x), lender Y (102y)) to apply for a loan. If such loan application is approved and initiated, the corresponding loan servicing history may be tracked for compiling the payment delinquency and/or default statistics to generate the loan default statistics A (144a) associated with the business entity A (101a). In one or more embodiments, the user entered data A (140a), the usage statistics A (141a), the risk profile A (142a), the corresponding loan approval status A (143a), and the resultant loan default statistics A (144a) may be further included in the training data set (140) to generate an updated version of the training data set (140). Subsequently, this updated version of the training data set (140) may be used to generate additional risk profiles for other business entities and/or to update existing risk profiles (e.g., the risk profile A (142a), risk profile M (142m), risk profile N (142n), etc.) as references for future loan applications.

In one or more embodiments, the matching parameters (143) of the adaptively-determined matching algorithm are further adjusted to maximize the correlation between the risk profiles (e.g., the risk profile A (142a), risk profile M (142m), risk profile N (142n), etc.) and the corresponding loan approval status (e.g., loan approval status A (143a), loan approval status M (143m), loan approval status N (143n)). In one or more embodiments, the adaptive matching analyzer (108) is configured to analyze approval statistics in relationship to the risk profiles to generate a risk-profile-to-loan-approval correlation, which is maximized during the training phase of the adaptively-determined matching algorithm by adjusting the matching parameters (143).
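For illustration, one way to quantify a risk-profile-to-loan-approval correlation is a Pearson correlation between a predicted probability of loan approval and the actual binary approval status. The text does not specify the correlation measure, so the choice of Pearson correlation, the function name, and the sample data below are assumptions.

```python
import math
from typing import Sequence


def pearson_correlation(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: predicted probability of approval from each risk profile,
# and the lender's actual decision (1 = approved, 0 = declined).
predicted_approval = [0.9, 0.8, 0.3, 0.2]
approved = [1, 0 + 1, 0, 0]
corr = pearson_correlation(predicted_approval, approved)
print(round(corr, 3))
```

A training iteration that adjusts the matching parameters (143) could then be judged by whether this value increases.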

Returning to the discussion of the risk profile generator (107), in one or more embodiments, the risk profile generator (107) is further configured to generate a loan proposal based on similar risk profiles shared by a group of business entities. Such loan proposal may then be sent to one or more lenders that may be interested in initiating loans based on the anticipated risk/return characteristics represented by such loan proposal. Details of generating the loan proposal based on similar risk profiles shared by a group of business entities are described in reference to FIG. 2 below.

In one or more embodiments, the risk profile generator (107) is further configured to identify a group of business entities matching a target risk profile requested by a lender. Details of identifying business entities matching a target risk profile are described in reference to FIG. 2 below.

FIG. 2 shows a flow chart for generating a risk profile based on third party business management application data for instant lending decision in accordance with one or more embodiments of the invention. In one or more embodiments of the invention, the method of FIG. 2 may be practiced using the system (100) described in reference to FIG. 1 above. In one or more embodiments of the invention, one or more of the steps shown in FIG. 2 may be omitted, repeated, and/or performed in a different order than that shown in FIG. 2. Accordingly, the specific arrangement of steps shown in FIG. 2 should not be construed as limiting the scope of the invention.

Initially in Step 201, business management application (BMA) data of business entities is obtained from the BMA. In one or more embodiments, the BMA may be an accounting application, a tax preparation application, a payroll application, a personnel application, or any business management application. In one or more embodiments, the BMA is provided by an application service provider, such as a software as a service (SaaS). For example, the BMA may be operated by the application service provider (ASP) and accessed by the business entities on a subscription basis. In one or more embodiments, the BMA data include user entered data and usage statistics described in reference to TABLE 1 above.

In Step 202, loan approval status and loan default statistics of the business entities are obtained from lenders providing loans to the business entities. Generally, business entities apply for business loans from such lenders who may approve or decline the loan application. For those loan applications that are approved, actual occurrences of loan payment delinquency and loan default are tracked and compiled by the lenders as loan default statistics. In one or more embodiments, the loan approval status and loan default statistics of the business entities are obtained from the lenders based on certain business agreements. For example, the business entities may have the ability to opt-in as part of the loan application to release such information to business partners of the lenders.

In Step 203, an adaptively-determined matching algorithm is iteratively adjusted to match risk profiles of the business entities to the corresponding loan approval status and loan default statistics. In one or more embodiments, the risk profile includes one or more of a probability of default, a probability of non-default, a probability of delinquency, a probability of non-delinquency, a probability of loan approval, and a probability of loan declination, each represented by a number score, a percentage score, a letter score, or other suitable type of score.

In one or more embodiments, the risk profiles are modeled as a function of the BMA data of the business entities using the adaptively-determined matching algorithm. In other words, the adaptively-determined matching algorithm is used to analyze the BMA data and generate the corresponding risk profiles. In one or more embodiments, the adaptively-determined matching algorithm includes a machine learning algorithm, such as a rule ensemble algorithm known to those skilled in the art. For example, the training data set of the machine learning algorithm includes the BMA data, loan approval statistics, and loan default statistics of the business entities. Accordingly, various parameters of the machine learning algorithm are iteratively adjusted during a training phase to match the modeled risk profile (e.g., predicted loan approval/declination, predicted loan delinquency, and predicted loan default) to the actual loan approval status and actual loan default statistics in the training data set. Iteratively adjusting the parameters of the machine learning algorithm is referred to as “training” the machine learning algorithm. For example, training the machine learning algorithm may be as described in reference to the risk profile generator (107) depicted in FIG. 1 above.

In Step 204, subsequent to the training phase of the adaptively-determined matching algorithm, the adaptively-determined matching algorithm is used to generate the risk profile of a particular business entity based on the BMA data of the particular business entity. In one or more embodiments, this particular business entity is one of the business entities whose BMA data are included in the training data set of the adaptively-determined matching algorithm. In such embodiments, the risk profile generated in Step 204 is an updated version of a previous risk profile of this particular business entity that was used as part of the training data set in Step 203. In one or more embodiments, this particular business entity is separate from those other business entities whose BMA data are included in the training data set of the adaptively-determined matching algorithm.

In Step 205, a determination is made as to whether the particular business entity uses the risk profile to apply for a loan. If the determination is YES, i.e., the particular business entity submits a loan application based on the risk profile generated in Step 204, the method returns to Step 202 where the loan approval status and any subsequent loan default statistics are added to the training data set of the adaptively-determined matching algorithm. If the determination is NO, i.e., the particular business entity has not submitted any loan application based on the risk profile generated in Step 204, the method proceeds to Step 206.

In Step 206, a loan proposal is generated based on similar risk profiles of a group of business entities. In one or more embodiments, a cluster of similar risk profiles is extracted from a risk profile collection using a pre-determined clustering algorithm and based on a pre-determined similarity measure. Accordingly, a loan proposal is generated based on the cluster of similar risk profiles. For example, the loan proposal may include a range of loan amounts, interest rate terms, maturity time period, borrower covenants, and other conventional financial parameters of a loan. In one or more embodiments, a statistical return for a lender is computed for the loan proposal based on characteristics (e.g., probability of default, probability of non-default, etc. each represented by a number score, a percentage score, a letter score, etc.) of the similar risk profiles in the cluster. For example, an effective average rate of return for a simple example loan proposal may be computed by taking the anticipated interest collection (a non-defaulted loan amount multiplied by a simple fixed rate and the probability of non-default over the maturity time period) and deducting from it a defaulted loan amount multiplied by the probability of default.
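The simple example computation above may be illustrated with the following sketch. The function names, the `defaulted_fraction` parameter (the share of principal assumed lost on default), the annualization of the result, and the sample figures are assumptions made for illustration only.

```python
def expected_net_return(
    principal: float,
    annual_rate: float,
    years: float,
    p_default: float,
    defaulted_fraction: float = 1.0,  # assumption: full principal lost on default
) -> float:
    """Expected interest on non-defaulted loans minus expected default loss."""
    if not 0.0 <= p_default <= 1.0:
        raise ValueError("p_default must be a probability in [0, 1]")
    p_non_default = 1.0 - p_default
    # Simple (non-compounding) fixed-rate interest over the maturity period.
    expected_interest = principal * annual_rate * years * p_non_default
    expected_loss = principal * defaulted_fraction * p_default
    return expected_interest - expected_loss


def effective_average_rate(
    principal: float, annual_rate: float, years: float, p_default: float
) -> float:
    """Expected net return per unit of principal per year."""
    net = expected_net_return(principal, annual_rate, years, p_default)
    return net / (principal * years)


# Hypothetical proposal: 100,000 at 10% simple interest over 2 years, with a
# 5% probability of default taken from the cluster of similar risk profiles.
net = expected_net_return(100_000, 0.10, 2, 0.05)
rate = effective_average_rate(100_000, 0.10, 2, 0.05)
print(net, rate)
```

In this example the expected interest is 19,000 and the expected loss is 5,000, giving a net expected return of 14,000, or about 7% per year on the principal.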

In one or more embodiments, the loan proposal is presented to one or more lenders and the group of business entities corresponding to the cluster of similar risk profiles. For example, a lender may decide to offer a loan program based on the loan proposal. In another example, the group of business entities may jointly request a loan program from a lender based on the loan proposal.

In Step 207, a target risk profile specified by one or more lenders may be matched to business entities sharing similar risk profiles. In one or more embodiments, one or more clusters of similar risk profiles are extracted from a risk profile collection using a pre-determined clustering algorithm and based on a pre-determined similarity measure. In addition, at least one of these clusters is selected as being similar to the target risk profile. Accordingly, a list of business entities corresponding to the selected at least one cluster is presented to the one or more lenders. For example, a lender may decide to offer a loan program based on the target risk profile and market the loan program to the business entities on the list.
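As a minimal sketch of this matching step, the example below uses k-means as the clustering algorithm and squared Euclidean distance as the similarity measure. Both choices are assumptions, since the text leaves the clustering algorithm and similarity measure open; the entity names and profile values are hypothetical, and each profile is assumed to be a pair (probability of default, probability of delinquency).

```python
from typing import Dict, List, Sequence, Tuple

Vector = Tuple[float, ...]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points: List[Vector], k: int, iterations: int = 20) -> Tuple[List[Vector], List[int]]:
    """Plain k-means; the first k points seed the centroids (deterministic)."""
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, labels


def match_target(profiles: Dict[str, Vector], target: Vector, k: int) -> List[str]:
    """Return the entities in the cluster whose centroid is nearest the target."""
    names = list(profiles)
    centroids, labels = k_means([profiles[n] for n in names], k)
    best = min(set(labels), key=lambda c: squared_distance(target, centroids[c]))
    return [n for n, lab in zip(names, labels) if lab == best]


profiles = {
    "entity_A": (0.02, 0.05),
    "entity_B": (0.03, 0.06),
    "entity_C": (0.20, 0.35),
    "entity_D": (0.22, 0.40),
}
matched = match_target(profiles, target=(0.03, 0.05), k=2)
print(matched)
```

The returned list corresponds to the list of business entities that would be presented to the lender for the given target risk profile.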

FIG. 3 shows an example flow (300) of generating a risk profile based on third party business management application data for instant lending decision in accordance with one or more embodiments of the invention. Specifically, the flow (300) uses business management application (BMA) data to build a model (303) to predict delinquent behavior with a training data set. As shown in FIG. 3, the flow (300) uses both user-entered data and usage/behavioral data of the BMA data (301) to predict whether a company has defaulted on a loan or has been past due at some point during the life of the loan. The training data set includes a large number (e.g., hundreds) of companies for which the historical delinquent status (302) on a loan is known. Further, a large number (e.g., over one hundred) of user-entered and usage/behavioral data items are included for each company in the training data set.

A rule ensemble algorithm is used to build the predictive model (303) that is used to score a company on its likelihood of exhibiting delinquent behavior. A “rules ensemble” is a particular form of the machine learning methodology referred to as “ensembling,” where multiple simple models (base learners) are combined into one complex model to improve accuracy. This type of model can be described as an additive expansion of the form F(x)=a₀+a₁*b₁(x)+a₂*b₂(x)+ . . . +a_M*b_M(x) where the b_j(x)'s are the base-learners and x is a vector [x₁, x₂, . . . x_N] representing the BMA data items (301). As noted above, N is a large number, such as a number over one hundred.

In the case of a rules ensemble, the b_j(x) terms are conjunctive rules of the form “if x₁>22 and x₂>27 then 1 else 0” or linear functions of a single variable—e.g., b_j(x)=x_j. Using base-learners of this type is efficient because they constitute easily interpretable statements about attributes x_j. They also preserve the desirable characteristics of Decision Trees such as efficient handling of categorical attributes, robustness to outliers in the distribution of x, etc.
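As a minimal sketch of the additive expansion described above, the following evaluates F(x) for one conjunctive rule and one linear term. The coefficient values, the helper names, and the input vectors are hypothetical; only the form of F(x) and the example rule come from the description. Note that the code uses zero-based indices, so x₁ is x[0].

```python
def rule(conditions):
    """Conjunctive rule: 1 if every x[index] > threshold test holds, else 0."""
    return lambda x: 1 if all(x[i] > t for i, t in conditions) else 0

def linear(j):
    """Linear base learner of a single variable: b_j(x) = x_j."""
    return lambda x: x[j]

def F(x, a0, terms):
    """Additive expansion F(x) = a0 + sum of a_j * b_j(x)."""
    return a0 + sum(a * b(x) for a, b in terms)

# Illustrative coefficients only; a fitted model would supply these.
terms = [
    (0.5, rule([(0, 22), (1, 27)])),  # "if x1 > 22 and x2 > 27 then 1 else 0"
    (0.125, linear(2)),               # linear term in the third variable
]

fired = F([30, 40, 1.0], a0=0.25, terms=terms)      # rule fires
not_fired = F([10, 40, 1.0], a0=0.25, terms=terms)  # x1 fails the test
print(fired, not_fired)
```

Because each term is either a readable rule or a single variable, the contribution of every term to the final value can be inspected directly.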

The example rules ensemble used in the flow (300) builds a model (303), represented as F(x), in a three-step process:

- a. Build a tree ensemble (one where the b_j(x)'s are decision trees),
- b. Generate candidate rules from the tree ensemble, and
- c. Fit coefficients a_j via regularized regression.
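The three steps above can be sketched in a few dozen lines. This is a simplified illustration under stated assumptions, not the model (303) itself: depth-1 trees (one per variable) stand in for the tree ensemble, an L2 (ridge) penalty fitted by gradient descent stands in for the unspecified regularized regression, and the two-variable training data is invented.

```python
def best_split(X, y, j):
    """Threshold on variable j that minimizes squared error, or None."""
    best_t, best_sse = None, float("inf")
    values = sorted({row[j] for row in X})
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        sse = 0.0
        for side in (True, False):
            ys = [yi for row, yi in zip(X, y) if (row[j] <= t) == side]
            mean = sum(ys) / len(ys)
            sse += sum((yi - mean) ** 2 for yi in ys)
        if sse < best_sse:
            best_t, best_sse = t, sse
    return best_t

def build_stumps(X, y):
    """Step a (simplified): one depth-1 tree per variable."""
    stumps = []
    for j in range(len(X[0])):
        t = best_split(X, y, j)
        if t is not None:
            stumps.append((j, t))
    return stumps

def rules_from_stumps(stumps):
    """Step b: each leaf of each stump becomes a 0/1 candidate rule."""
    rules = []
    for j, t in stumps:
        rules.append(lambda x, j=j, t=t: 1.0 if x[j] <= t else 0.0)
        rules.append(lambda x, j=j, t=t: 1.0 if x[j] > t else 0.0)
    return rules

def fit_coefficients(B, y, penalty=0.1, rate=0.1, steps=2000):
    """Step c: gradient descent on squared error plus an L2 penalty on a_j."""
    n, m = len(B), len(B[0])
    a0, a = 0.0, [0.0] * m
    for _ in range(steps):
        resid = [a0 + sum(aj * bj for aj, bj in zip(a, row)) - yi
                 for row, yi in zip(B, y)]
        a0 -= rate * sum(resid) / n
        for k in range(m):
            grad = sum(r * row[k] for r, row in zip(resid, B)) / n + penalty * a[k]
            a[k] -= rate * grad
    return a0, a

def train(X, y):
    """Run steps a to c and return the fitted F(x)."""
    rules = rules_from_stumps(build_stumps(X, y))
    B = [[r(x) for r in rules] for x in X]
    a0, a = fit_coefficients(B, y)
    return lambda x: a0 + sum(aj * r(x) for aj, r in zip(a, rules))

# Invented data: [current ratio, logins per month]; 1 = was delinquent.
X = [[0.5, 2], [0.7, 3], [0.6, 20], [2.0, 25], [2.5, 30], [1.8, 4]]
y = [1, 1, 1, 0, 0, 0]
F = train(X, y)
print(F([0.6, 3]), F([2.2, 28]))
```

A production rules ensemble would typically use deeper trees, many more candidate rules, and a sparsity-inducing penalty so that most coefficients shrink to zero; the sketch keeps only the structure of the three steps.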

The BMA data items are categorized into several types of variables, which are evaluated to see which are most predictive of default risk. These variable types include:

- a. Raw QBO user-entered data (e.g., transactions, number of customers, . . . ),
- b. BMA usage behavior (e.g., browser used, number of logins, length of time a QBO customer, . . . ),
- c. Computed financial-health variables (e.g., net worth, EBITDA, inventory days turnover, . . . ), and
- d. Summary data (e.g., total capital dollar amount coming in to the company, total dollar amount going out of the company, number of distinct vendors paid in last 12 months, . . . ).

For example, the following BMA data items are selected from the above variable types as having the most predictive power (based on the training data set):

- a. Current ratio (current assets/current liabilities),
- b. Year-over-year sales growth,
- c. Number of online banking automatic downloads in a given month,
- d. Number of transactions with money leaving the company (e.g., bills paid) in a given month,
- e. Whether the company is a current BMA subscriber or not, and
- f. Whether the company is a customer for financial supplies (e.g., checks, accounting forms, etc.) or not.
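As a small illustration of how the first two items in the list above might be computed from BMA records, consider the following sketch. The record layout and field names are hypothetical, year-over-year growth is assumed to mean the relative change from the prior year, and the denominators are assumed to be nonzero.

```python
def current_ratio(current_assets, current_liabilities):
    """Current ratio = current assets / current liabilities."""
    return current_assets / current_liabilities

def yoy_sales_growth(sales_this_year, sales_last_year):
    """Relative change in sales versus the prior year (assumed definition)."""
    return (sales_this_year - sales_last_year) / sales_last_year

# Hypothetical company record; the keys are illustrative, not a BMA schema.
company = {
    "current_assets": 150000,
    "current_liabilities": 100000,
    "sales_this_year": 120000,
    "sales_last_year": 100000,
    "bank_downloads_this_month": 12,
    "outgoing_transactions_this_month": 45,
    "is_current_subscriber": True,
    "buys_financial_supplies": False,
}

# Input vector x for the model, in the order of items a to f above.
x = [
    current_ratio(company["current_assets"], company["current_liabilities"]),
    yoy_sales_growth(company["sales_this_year"], company["sales_last_year"]),
    company["bank_downloads_this_month"],
    company["outgoing_transactions_this_month"],
    1 if company["is_current_subscriber"] else 0,
    1 if company["buys_financial_supplies"] else 0,
]
print(x)
```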

The output result of the model (303) includes a risk score (313) from 0 to 1 that may be interpreted as the probability that the company may default on a loan, the probability that the company may be delinquent for one or more payments, and/or the probability the company may be approved by a particular lender. Specifically, the risk score (313) of a particular company is generated by using the numerous BMA data items (311) of the particular company as input variables of the model (303). The risk score (313) may be used in a number of ways:

- a. Kept in its raw, continuous format to be used in conjunction with other data to make a lending decision by a lender,
- b. Converted into categories using break points. By trading off the relative “cost” of incorrectly categorizing a business as risky when it is not against the cost of incorrectly categorizing a business as not risky when it is, a break point may be determined such that a company scoring above that point is categorized as risky and a company scoring below it is categorized as not risky. Similarly, a number of break points may be determined to create tiers for low, medium, and high risk companies.
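The cost trade-off in item b can be sketched as a search over candidate break points. The scores, outcomes, and cost values below are invented, and the function names are hypothetical; the sketch only assumes that each kind of misclassification carries a fixed cost and that a score above the break point means "risky".

```python
def total_cost(scores, defaulted, cut, cost_false_risky, cost_false_safe):
    """Sum the misclassification costs when scores above `cut` are risky."""
    cost = 0.0
    for score, did_default in zip(scores, defaulted):
        flagged = score > cut
        if flagged and not did_default:
            cost += cost_false_risky   # sound business labelled risky
        elif not flagged and did_default:
            cost += cost_false_safe    # risky business labelled not risky
    return cost

def best_break_point(scores, defaulted, cost_false_risky, cost_false_safe):
    """Pick the observed score that minimizes total cost as the break point."""
    candidates = sorted(set(scores))
    return min(candidates, key=lambda cut: total_cost(
        scores, defaulted, cut, cost_false_risky, cost_false_safe))

def tier(score, low_cut, high_cut):
    """Two break points give low, medium, and high risk tiers."""
    if score > high_cut:
        return "high"
    if score > low_cut:
        return "medium"
    return "low"

# Invented historical scores and outcomes (True = defaulted).
scores = [0.1, 0.2, 0.4, 0.6, 0.7, 0.9]
defaulted = [False, False, True, False, True, True]

# Missing a risky business is assumed to cost five times a false alarm.
cut = best_break_point(scores, defaulted, cost_false_risky=1, cost_false_safe=5)
print(cut, tier(0.15, 0.2, 0.6), tier(0.5, 0.2, 0.6), tier(0.9, 0.2, 0.6))
```

Raising the assumed cost of a missed risky business pushes the break point lower, so more companies are categorized as risky; raising the cost of a false alarm has the opposite effect.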

The risk score (313) may be given to a lender directly or given to the particular company as a borrower and used at the borrower's discretion when applying for a loan from the lender. In addition, the risk score (313) may be dynamically updated in real time during the life of the loan as leverage for the borrower to negotiate better terms with the lender if the borrower's business is doing well. Further, the risk score (313) may be dynamically updated in real time during the life of the loan for the lender to measure the ongoing risk of the loan with respect to the borrower's business as reflected by the BMA data of the borrower.

Embodiments of the invention may be implemented on virtually any type of computing system regardless of the platform being used. For example, the computing system may be one or more mobile devices (e.g., laptop computer, smart phone, personal digital assistant, tablet computer, or other mobile device), desktop computers, servers, blades in a server chassis, or any other type of computing device or devices that includes at least the minimum processing power, memory, and input and output device(s) to perform one or more embodiments of the invention. For example, as shown in FIG. 4, the computing system (400) may include one or more computer processor(s) (402), associated memory (404) (e.g., random access memory (RAM), cache memory, flash memory, etc.), one or more storage device(s) (406) (e.g., a hard disk, an optical drive such as a compact disk (CD) drive or digital versatile disk (DVD) drive, a flash memory stick, etc.), and numerous other elements and functionalities. The computer processor(s) (402) may be an integrated circuit for processing instructions. For example, the computer processor(s) may be one or more cores, or micro-cores of a processor. The computing system (400) may also include one or more input device(s) (410), such as a touchscreen, keyboard, mouse, microphone, touchpad, electronic pen, or any other type of input device. Further, the computing system (400) may include one or more output device(s) (408), such as a screen (e.g., a liquid crystal display (LCD), a plasma display, touchscreen, cathode ray tube (CRT) monitor, projector, or other display device), a printer, external storage, or any other output device. One or more of the output device(s) may be the same or different from the input device. The computing system (400) may be connected to a network (412) (e.g., a local area network (LAN), a wide area network (WAN) such as the Internet, mobile network, or any other type of network) via a network interface connection (not shown). 
The input and output device(s) may be locally or remotely (e.g., via the network (412)) connected to the computer processor(s) (402), memory (404), and storage device(s) (406). Many different types of computing systems exist, and the aforementioned input and output device(s) may take other forms.

Software instructions in the form of computer readable program code to perform embodiments of the invention may be stored, in whole or in part, temporarily or permanently, on a non-transitory computer readable medium such as a CD, DVD, storage device, a diskette, a tape, flash memory, physical memory, or any other computer readable storage medium. Specifically, the software instructions may correspond to computer readable program code that when executed by a processor(s), is configured to perform embodiments of the invention.

Further, one or more elements of the aforementioned computing system (400) may be located at a remote location and connected to the other elements over a network (412). Further, embodiments of the invention may be implemented on a distributed system having a plurality of nodes, where each portion of the invention may be located on a different node within the distributed system. In one embodiment of the invention, the node corresponds to a distinct computing device. Alternatively, the node may correspond to a computer processor with associated physical memory. The node may alternatively correspond to a computer processor or micro-core of a computer processor with shared memory and/or resources.

While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of this disclosure, will appreciate that other embodiments can be devised which do not depart from the scope of the invention as disclosed herein. Accordingly, the scope of the invention should be limited only by the attached claims.

Claims

1. A method, comprising:

training a machine learning algorithm by iteratively adjusting, by a computer processor, adjusted matching parameters of the machine learning algorithm to increase a correlation between approval statistics of a plurality of lending decisions and a plurality of risk profiles, wherein: the plurality of risk profiles represent probabilities of a plurality of businesses defaulting on a loan, the probabilities derived from usage statistics of a business management application (BMA) used by the plurality of businesses, the plurality of lending decisions are received from a computing device of a first lender and represent decisions made by the first lender whether to extend the loan to the plurality of businesses based on the plurality of risk profiles, iteratively adjusting continues until reaching a threshold correlation between the approval statistics and the plurality of lending decisions and the plurality of risk profiles, and training generates an updated machine learning algorithm; and

updating a risk score of a risk profile for a business entity in the plurality of businesses to generate an updated risk score, wherein the risk score of the risk profile for the business entity is updated using a number of logins to the BMA made by the business entity.

2. The method of claim 1, further comprising:

executing the updated machine learning algorithm, taking as input the updated risk score, and generating as output a probability that the business entity will default on a loan.

3. The method of claim 1, wherein:

the usage statistics comprises at least one category selected from the group consisting of business statistics, business financial data, online banking usage statistics, accounting software trial details, marketing interaction data, general setup statistics, payroll setup statistics, customer support data, firmographics, product usage, subscription details, subscription billing details, payroll processing details, attrition details, customer statistics, pattern changes, transaction statistics, chargebacks statistics, and age statistics, and

the machine learning algorithm comprises a rule ensemble algorithm.

4. The method of claim 1, further comprising:

obtaining loan default statistics of the plurality of businesses;

analyzing the loan default statistics in relationship to the plurality of risk profiles to generate a second correlation; and

adjusting the machine learning algorithm to increase the second correlation.

5. The method of claim 1, further comprising:

providing the risk profile to the business entity,

wherein the business entity submits the risk profile to a second lender to apply for a loan.

6. The method of claim 1, further comprising:

extracting, using a pre-determined clustering algorithm and based on a pre-determined similarity measure, a cluster of similar risk profiles from the plurality of risk profiles, wherein the cluster of similar risk profiles corresponds to a subset of the plurality of businesses;

generating a loan proposal based on the cluster of similar risk profiles; and

presenting the loan proposal to at least one entity selected from the group consisting of the first lender, a second lender, and the subset of the plurality of businesses.

7. The method of claim 1, further comprising:

obtaining a target risk profile from a second lender;

extracting, based on the target risk profile, a cluster of similar risk profiles from the plurality of risk profiles, wherein the cluster of similar risk profiles corresponds to a subset of the plurality of businesses; and

presenting the cluster of similar risk profiles and the subset of the plurality of businesses to the second lender,

wherein the second lender offers a loan program to the subset of the plurality of businesses.

8. A system for generating a risk profile of a business entity, comprising:

a computer processor;

a business management application (BMA) configured to obtain and store a plurality of usage statistics of a plurality of businesses that use the BMA;

memory storing instructions executable by the processor, wherein the instructions comprise: a risk profile generator configured to update a risk score of a risk profile for a business entity in the plurality of businesses to generate an updated risk score, wherein the risk score of the risk profile for the business entity is updated using a number of logins to the BMA made by the business entity; and a machine learning algorithm configured to be trained by iteratively adjusting adjusted matching parameters of the machine learning algorithm to increase a correlation between approval statistics of a plurality of lending decisions and a plurality of risk profiles, wherein: the plurality of risk profiles represent probabilities of a plurality of business entities defaulting on a loan, the probabilities derived from usage statistics of a business management application (BMA) used by the plurality of business entities, the plurality of lending decisions are received from a computing device of a lender and represent decisions made by the lender whether to extend the loan to the plurality of businesses based on the plurality of risk profiles, iteratively adjusting continues until reaching a threshold correlation between the approval statistics and the plurality of lending decisions and the plurality of risk profiles, and

a repository configured to store the trained machine learning algorithm.

9. The system of claim 8, wherein:

the usage statistics comprises at least one category selected from the group consisting of business statistics, business financial data, online banking usage statistics, accounting software trial details, marketing interaction data, general setup statistics, payroll setup statistics, customer support data, firmographics, product usage, subscription details, subscription billing details, payroll processing details, attrition details, customer statistics, pattern changes, transaction statistics, chargebacks statistics, and age statistics, and

the machine learning algorithm comprises a rule ensemble algorithm.

10. The system of claim 8, wherein the risk profile generator is further configured to:

obtain loan default statistics of the plurality of businesses;

analyze the loan default statistics in relationship to the plurality of risk profiles to generate a second correlation; and

adjust the machine learning algorithm to increase the second correlation.

11. The system of claim 8, wherein the risk profile generator is further configured to:

provide the risk profile to the business entity, wherein the business entity submits the risk profile to a second lender to apply for a loan.

12. The system of claim 8, wherein the risk profile generator is further configured to:

extract, using a pre-determined clustering algorithm and based on a pre-determined similarity measure, a cluster of similar risk profiles from the plurality of risk profiles, wherein the cluster of similar risk profiles corresponds to a subset of the plurality of businesses;

generate a loan proposal based on the cluster of similar risk profiles; and

present the loan proposal to at least one entity selected from the group consisting of the first lender, a second lender, and the subset of the plurality of businesses.

13. The system of claim 8, wherein the risk profile generator is further configured to:

obtain a target risk profile from a second lender;

extract, based on the target risk profile, a cluster of similar risk profiles from the plurality of risk profiles, wherein the cluster of similar risk profiles corresponds to a subset of the plurality of businesses; and

present the cluster of similar risk profiles and the subset of the plurality of businesses to the second lender,

wherein the second lender offers a loan program to the subset of the plurality of businesses.

14. The system of claim 8, further comprising:

an adaptive matching analyzer configured to execute the updated machine learning algorithm, taking as input the updated risk score, and generating as output a probability that the business entity will default on a loan.

15. A non-transitory computer readable medium storing instructions which, when executed by a computer processor, comprise functionality for:

training a machine learning algorithm by iteratively adjusting, by a computer processor, adjusted matching parameters of the machine learning algorithm to increase a correlation between approval statistics of a plurality of lending decisions and a plurality of risk profiles, wherein: the plurality of risk profiles represent probabilities of a plurality of businesses defaulting on a loan, the probabilities derived from usage statistics of a business management application (BMA) used by the plurality of businesses, the plurality of lending decisions are received from a computing device of a first lender and represent decisions made by the first lender whether to extend the loan to the plurality of businesses based on the plurality of risk profiles, iteratively adjusting continues until reaching a threshold correlation between the approval statistics and the plurality of lending decisions and the plurality of risk profiles, and training generates an updated machine learning algorithm; and

updating a risk score of a risk profile for a business entity in the plurality of businesses to generate an updated risk score, wherein the risk score of the risk profile for the business entity is updated using a number of logins to the BMA made by the business entity.

16. The non-transitory computer readable medium of claim 15, wherein the instructions further comprise functionality for:

executing the updated machine learning algorithm, taking as input the updated risk score, and generating as output a probability that the business entity will default on a loan.

17. The non-transitory computer readable medium of claim 15, wherein the instructions further comprise functionality for:

obtaining loan default statistics of the plurality of businesses;

analyzing the loan default statistics in relationship to the plurality of risk profiles to generate a second correlation; and

adjusting the machine learning algorithm to increase the second correlation.

18. The non-transitory computer readable medium of claim 15, wherein the instructions further comprise functionality for:

providing the risk profile to the business entity,

wherein the business entity submits the risk profile to a second lender to apply for a loan.

19. The non-transitory computer readable medium of claim 15, wherein the instructions further comprise functionality for:

extracting, using a pre-determined clustering algorithm and based on a pre-determined similarity measure, a cluster of similar risk profiles from the plurality of risk profiles, wherein the cluster of similar risk profiles corresponds to a subset of the plurality of businesses;

generating a loan proposal based on the cluster of similar risk profiles; and

presenting the loan proposal to at least one entity selected from the group consisting of the first lender, a second lender, and the subset of the plurality of businesses.

20. The non-transitory computer readable medium of claim 15, wherein the instructions further comprise functionality for:

obtaining a target risk profile from a second lender;

extracting, based on the target risk profile, a cluster of similar risk profiles from the plurality of risk profiles, wherein the cluster of similar risk profiles corresponds to a subset of the plurality of businesses; and

presenting the cluster of similar risk profiles and the subset of the plurality of businesses to the second lender,

wherein the second lender offers a loan program to the subset of the plurality of businesses.