AUTOMATED MINING AND PROCESSING OF DATA ASSOCIATED WITH REAL ESTATE
Computer-based processes are disclosed for efficiently mining and analyzing information associated with particular mortgage loans, borrowers and properties. The disclosed processes use property-related data aggregated from multiple sources and jurisdictions to, among other tasks, identify properties owned by an individual.
This application is a continuation of U.S. application Ser. No. 12/876,581, filed Sep. 7, 2010.
TECHNICAL FIELD
The present disclosure relates to data processing methods for automatically mining and analyzing information associated with mortgages, property owners, and properties.
BACKGROUND
Financial institutions and other entities involved in the mortgage industry often enter into mortgage-related transactions without accurate ownership data that would allow them to properly investigate the accuracy of representations made by the other entities involved. As one example, a business entity that is considering granting a loan to an individual, or purchasing or investing in such a loan, will typically have access to little or no data with which to analyze whether the individual already owns one or more other properties and what the individual's equity positions are. Instead, the business entity may simply rely on a credit report with limited data or the representations of the borrower or, in the case of a securitized loan, the issuer or seller of the loan. As a result, many in the industry have made risky investments that have resulted in significant monetary losses due to decreased profitability or material misrepresentation related to defaulted loans.
SUMMARY
Computer-based processes are disclosed for efficiently mining and analyzing information associated with particular mortgage loans, borrowers and properties. These processes may be incorporated into one or more tools or applications that are accessible via a web site or other interface to lenders, financial institutions, and/or other types of entities. The disclosed processes preferably use property-related data aggregated from multiple sources and jurisdictions to, among other tasks, assess property ownership.
One disclosed process uses information collected from various data sources to automatically identify properties owned by an individual, such as a potential borrower. This process may, as one example, be used by a lender to assess whether a loan applicant has accurately disclosed the real estate properties he or she owns. In one embodiment, multiple databases are initially searched for property addresses corresponding to the name and social security number of the individual. These databases may, for example, include a database of credit header data (provided by one or more of the credit bureaus), and one or more databases of mortgage loan application data. A database of aggregated public recorder data is also used to determine the ownership status of each identified property. In addition, to search for other properties possibly owned by the individual, a search of aggregated public assessor data is conducted to search for other properties for which the mailing address is that of a property owned by the individual.
Another disclosed process uses aggregated data to build a profile of a securitized mortgage loan offered to investors. Typically, the issuer or seller of a securitized loan will only disclose high level information regarding the loan (e.g., the loan amount, loan date, and property zip code) to the potential investors. The issuer or seller will also commonly provide warranties and representations regarding other characteristics of the loan, such as whether the loan is for an owner-occupied property. In one embodiment, the disclosed process uses aggregated public recorder data, and/or other sources of data, to attempt to match the high level loan information to a specific property and borrower. The process also searches for other properties owned by the identified borrower, such as by searching for other properties whose mailing address matches an address associated with the identified borrower. The results of the process may be used to generate one or more reports that can be used by the investors or others to assess the accuracy of the warranties and representations.
Neither this summary nor the following detailed description purports to define the invention. The invention is defined by the claims.
Specific, non-limiting embodiments will now be described with reference to the drawings. Nothing in this description is intended to imply that any particular feature, component or step is essential. The inventive subject matter is defined by the claims.
I. SYSTEM OVERVIEW (FIG. 1)
As illustrated, analytics applications 22 use a set of data repositories 30-36 to perform various types of analytics tasks, including tasks associated with risk assessments. In the illustrated embodiment, these data repositories 30-36 include a database of credit bureau header data 30, a database of loan data 32 (preferably aggregated/contributed from multiple lenders, as described below), a nationwide database of aggregated public recorder data 34, and a nationwide database of aggregated public assessor (tax roll) data 36. Although depicted as separate databases, some of these data collections may be merged into a single database or distributed across multiple distinct databases. Further, additional databases containing other types of information may be maintained and used by the analytics applications 22. As shown in
The credit bureau header database 30 contains header data obtained from one or more of the U.S. credit bureaus (Experian, Equifax, TransUnion, Teletrack, Credco, SafeRent, etc.). As is known in the art, credit header data (also referred to as “credit bureau header data”) is the non-financial data commonly included at the top of an individual's credit report. The credit header data for an individual typically includes some or all of the following information regarding the individual: name, social security number, date of birth, current and previous addresses, and AKAs (“also known as” names). The AKAs are based on the name variations in the data reported to the credit bureaus by lenders and/or other institutions. For example, if an individual applies for a loan using the name “Jonathan R. Banks,” and later applies for a credit card using the name “John Banks,” both variations may show up in the individual's credit header data. Credit header data is generally available from the credit bureaus and from data aggregators such as CoreLogic, Merlin, and LexisNexis. Because no trade lines or other credit information is included in the credit header data, the data is not governed by the Fair Credit Reporting Act.
The database of loan data 32 preferably includes aggregated mortgage loan data collected by lenders from mortgage loan applications of borrowers. The analytics provider may obtain the loan application data in various ways. For example, lenders and other customers 26 of the analytics system 20 may supply such data to the system 20 in the course of using the analytics applications 22. The customers may supply such data according to an agreement under which the analytics provider and system can persistently store the data and re-use it for generating summarized analytics to provide to the same and/or other customers 26. Such a database is maintained by CoreLogic, Inc. As another example, the analytics provider may obtain such loan data through partnership agreements. As yet another example, the analytics provider may itself be a mortgage lender, in which case the loan data may include data regarding its own loans. Loan data obtained by the analytics provider from lenders is referred to herein as “contributed loan data.”
The public recorder database 34 depicted in
The public assessor database 36 shown in
As further shown in
The analytics applications 22 also include a “securitized loan assessment” application or application component 44. As explained in section III below, this application or component 44 uses some or all of the data sources described above to build profiles of securitized loans given high level information provided by the security's issuer or seller. These profiles may be used to check the accuracy of warranties and representations made by the issuer or seller of the securitized loan.
II. PROCESS FOR IDENTIFYING PROPERTIES OWNED BY AN INDIVIDUAL (FIG. 2)
As depicted by block 50 of
As shown in block 52 of
To address these deficiencies in the credit header data, one or more additional data sources are preferably included in the scope of the search in block 52. The one or more additional data sources may, for example, include the database of loan data 32 (
As will be apparent, data sources other than those identified above may additionally or alternatively be used to conduct the search in block 52. For example, publicly aggregated bankruptcy or tax lien records can be used to obtain address information for the individual. Thus, the particular types of data sources discussed above are not critical.
The result of step 52 is a list of addresses associated with the individual's social security number, as determined (preferably) from credit header data and one or more sources of loan data aggregated across multiple lenders. If any duplicates exist in the list (as a result of overlap between the searched databases), they are reduced to a single entry. The following is an example of such a list prior to de-duplication, with revisions to the actual addresses to protect privacy.
The “source” column specifies the source of each entry: credit header data, contributed loan data, or MERS registry. In this example, entries 5-8 are duplicates of other entries, and can therefore be eliminated. PO Box addresses are also eliminated at this point. As reflected by the “ownership” column, the ownership of the properties has not yet been determined at this stage of the process. In the present embodiment, this step is especially useful for locating additional addresses associated with an individual that were not already included in the individual's credit history (credit header data). For example, the property at 3100 Palm Dr. would not have been associated with the individual if aggregated loan application data had not been added as an additional address query.
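The de-duplication and PO Box filtering described above can be sketched as follows. This is a minimal illustration only: the function names, the abbreviation table, and every sample address other than 3100 Palm Dr. are assumptions introduced here, and a production system would likely rely on a full address-standardization step rather than this short lookup.

```python
import re

# Hypothetical abbreviation table (an assumption for this sketch); real
# address standardization is considerably more involved.
_ABBREVIATIONS = {"drive": "dr", "street": "st", "avenue": "ave", "road": "rd"}
_PO_BOX = re.compile(r"^\s*p\.?\s*o\.?\s*box\b", re.IGNORECASE)


def normalize_address(address: str) -> str:
    """Lowercase, strip punctuation, and abbreviate common street suffixes."""
    tokens = re.sub(r"[^\w\s]", "", address.lower()).split()
    return " ".join(_ABBREVIATIONS.get(tok, tok) for tok in tokens)


def dedupe_addresses(entries):
    """Drop PO Box entries and collapse duplicates, keeping each source.

    `entries` is an iterable of (address, source) pairs. The result is a
    list of dicts in first-seen order, one per distinct street address.
    """
    merged = {}
    for address, source in entries:
        if _PO_BOX.match(address):
            continue  # PO Boxes cannot be owned real property
        key = normalize_address(address)
        record = merged.setdefault(key, {"address": address, "sources": []})
        if source not in record["sources"]:
            record["sources"].append(source)
    return list(merged.values())


if __name__ == "__main__":
    sample = [
        ("3100 Palm Dr.", "contributed loan data"),
        ("3100 Palm Drive", "MERS registry"),
        ("PO Box 12", "credit header data"),
        ("12 Oak St", "credit header data"),
    ]
    for row in dedupe_addresses(sample):
        print(row["address"], row["sources"])
```

Keeping the list of contributing sources on each merged entry preserves the information carried by the “source” column after duplicates are removed.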
As depicted by blocks 54 and 56 of
To determine the current or past ownership status of each property, the individual's name and AKAs, as received in step 50, are compared to the buyer (or borrower) and seller names included in the retrieved transaction data for each property. A conventional and/or proprietary name matching algorithm may be used for this purpose, and will account for minor name variations (e.g., Steve versus Stephen). In one embodiment, a conventional name matching algorithm, such as the Jaro-Winkler algorithm, measures the weighted sum or percentage of matched characters in a name. In addition, data cleansing is performed (e.g., to remove titles such as “MR” or “MRS”), and names are compared in different orders to handle first and last name swaps in input data. The results of block 56 are shown below for the example scenario described above.
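As a rough sketch of the name comparison described above, the following implements the standard Jaro-Winkler similarity together with simple title removal and a swapped-order comparison. The title list and the helper names are assumptions made for illustration; the source does not specify the cleansing rules or any acceptance threshold, so none is hard-coded here.

```python
import re

# Hypothetical title list (an assumption); the source names only MR and MRS.
_TITLES = {"mr", "mrs", "ms"}


def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if not len1 or not len2:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1, matched2 = [False] * len1, [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order // 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, scaling: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to four characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * scaling * (1 - base)


def clean_name(name: str) -> list:
    """Lowercase, drop punctuation, and remove titles."""
    tokens = re.sub(r"[^\w\s]", " ", name.lower()).split()
    return [tok for tok in tokens if tok not in _TITLES]


def name_similarity(a: str, b: str) -> float:
    """Best score over the given token order and the reversed token order."""
    ta, tb = clean_name(a), clean_name(b)
    if not ta or not tb:
        return 0.0
    forward = jaro_winkler(" ".join(ta), " ".join(tb))
    swapped = jaro_winkler(" ".join(ta), " ".join(reversed(tb)))
    return max(forward, swapped)


if __name__ == "__main__":
    print(name_similarity("MR John Banks", "Banks, John"))
    print(name_similarity("Jonathan R. Banks", "John Banks"))
```

A caller would compare the returned score against a tuned cutoff and would run the comparison once for the primary name and once for each AKA. Note that character-level similarity alone does not capture nickname pairs that share few letters, so a nickname table may be needed in addition.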
In block 58 of
If any additional properties are located via this mailing address search of block 58, the public recorder data for each such property is retrieved and is used to confirm the individual's ownership. As in block 56, this involves comparing the individual's name (preferably using a name matching algorithm) to the buyer (or borrower) and seller names listed in the property's transaction history as maintained by the relevant public recorder. If an additional owned property is identified, step 58 may optionally be repeated to search for other properties whose mailing address matches that of this newly identified property. In other words, multiple iterations of step 58 may be executed.
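The optional repetition of step 58 amounts to a breadth-first expansion over mailing addresses. The sketch below assumes two stand-ins that are not part of the source: a dictionary mapping each mailing address to the property addresses whose tax bills are sent there (in place of the aggregated assessor data), and a callable that performs the recorder-based ownership check.

```python
from collections import deque


def find_owned_properties(seed_addresses, properties_by_mailing, is_owned):
    """Expand a set of owned properties via repeated mailing address searches.

    seed_addresses: addresses already confirmed as owned.
    properties_by_mailing: dict mapping a mailing address to the property
        addresses billed at that address (stand-in for assessor data).
    is_owned: callable(address) -> bool, standing in for the name-matching
        check against public recorder data.
    """
    owned = list(dict.fromkeys(seed_addresses))  # de-duplicate, keep order
    seen = set(owned)
    queue = deque(owned)
    while queue:
        mailing = queue.popleft()
        for candidate in properties_by_mailing.get(mailing, []):
            if candidate in seen:
                continue
            seen.add(candidate)
            if is_owned(candidate):
                owned.append(candidate)
                queue.append(candidate)  # triggers another pass of the search
    return owned


if __name__ == "__main__":
    assessor = {"A": ["C", "X"], "C": ["D"]}
    confirmed = {"A", "C", "D"}
    print(find_owned_properties(["A"], assessor, lambda addr: addr in confirmed))
```

The `seen` set guarantees termination even when two properties list each other as mailing addresses. A pass limit could be added if only a fixed number of iterations is desired.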
The following table illustrates the results of step 58 for the example scenario. In this example, entry 5 is an additional property located in the mailing address search. The individual's ownership of this property was confirmed from the public recorder data for this property.
As will be apparent, the mailing address search of block 58 can additionally or alternatively be performed using other sources of mailing address information. For example, reverse phone data queries or data provided by rental data aggregators could be used as supplemental or alternate mailing address sources.
As depicted in block 60, the resulting list of owned properties (three in the above example), together with an indication of the source of each address (e.g., “credit header data” or “public assessor mailing address search”), is stored in computer storage in association with the received name and social security number for subsequent use. This information may also be output for display to the customer 26 via the customer interface 24 together with information about the owned properties.
As depicted by block 62, the “owned properties” list may also be used as an input to other analytics applications 22. For example, an owner occupancy application 22 may analyze the characteristics of each property (e.g., homeowners and tax exemptions, whether the mailing address matches the property address, etc.) to identify the property most likely occupied by the individual. As another example, the total number and dollar amount of the liens on the owned properties may be compared to the property value and used to assess a risk level associated with the individual.
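The lien comparison mentioned above can be illustrated with a short aggregate calculation. The record layout and the function name are assumptions, and the source does not say how the resulting figures map to a risk level, so this sketch stops at the raw totals and ratio.

```python
def lien_exposure(properties):
    """Summarize liens across a list of owned properties.

    Each property is a dict with 'value' (estimated property value) and
    'liens' (a list of lien dollar amounts). Both keys are hypothetical.
    """
    lien_count = sum(len(p["liens"]) for p in properties)
    total_liens = sum(sum(p["liens"]) for p in properties)
    total_value = sum(p["value"] for p in properties)
    ratio = total_liens / total_value if total_value else float("inf")
    return {
        "lien_count": lien_count,
        "total_liens": total_liens,
        "total_value": total_value,
        "lien_to_value": ratio,
    }


if __name__ == "__main__":
    owned = [
        {"value": 400_000, "liens": [200_000, 50_000]},
        {"value": 100_000, "liens": [50_000]},
    ]
    print(lien_exposure(owned))
```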
III. PROCESS FOR ASSESSING SECURITIZED LOANS (FIG. 3)
An automated process will now be described for determining information regarding securitized loans, including the identities of the borrowers, and the addresses of any properties owned by these borrowers. As mentioned above, this process may be embodied in a “securitized loan assessment/due diligence” application or application component 44 (hereinafter “application 44”) that provides tools and reports for assisting customers in assessing securitized loans and the associated warranties and representations.
By way of background, when a mortgage loan or group of mortgage loans is made available to investors as a security (e.g., a bond), the issuer or seller of the security typically does not provide the investors with the property addresses and borrower names associated with the loan(s). Instead, the security issuer or seller typically provides the investors with generalized information about each loan, typically via a prospectus. This generalized information typically includes the loan amount, zip code, and loan date, and may also include other details such as the lender name and/or sales price. The issuer also ordinarily provides warranties and representations to the investors regarding other characteristics of the securitized loan(s). The following are examples of the types of warranties and representations that may be provided in connection with a security that includes multiple loans: (1) all (or a specified minimum percentage) of the loans are for owner-occupied properties; (2) all of the borrowers have FICO scores between X and Y; and (3) all of the loans have loan-to-value ratios falling in the range of X to Y.
In recent years, securities investors have realized severe losses in connection with their investments in securitized loans. In some cases, the investors can recoup such losses by establishing that the warranties and representations associated with the security were materially false. For example, for a group of loans represented as “at least 80% owner occupied,” the investor may be able to show that significantly less than 80% of the loans actually involved owner-occupied properties.
The first step in assessing whether the securitized loan was misrepresented is to identify the property and borrower associated with the loan, to assess the specifics of the borrower, and to identify any other properties owned concurrently by that borrower at the time the security was offered.
As illustrated in block 70 of
In block 72, the application performs a search of the aggregated public recorder data 34 (
If a single match is found in block 72, the borrower name and property address are obtained from the matching mortgage transaction record or grant deed. In one embodiment, if the process does not find a matching mortgage transaction, it searches for a matching grant deed from which to obtain this information. The grant deed will typically include information about the associated transaction, including the property and mailing address and the recording date. If multiple matches are found in block 72, additional information about the securitized loan, if provided by the issuer or seller, may be used to attempt to resolve the search to a single mortgage transaction and property. For example, if the information from the security issuer identifies the lender or sales price associated with the securitized loan, one or both of these pieces of information may be used to select one of the matching mortgage transactions.
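The matching and disambiguation logic described above might be sketched as follows. The record type, field names, and the `date_tolerance_days` parameter are assumptions introduced for illustration; in particular, the source does not state whether the loan date must equal the recording date exactly.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class MortgageTransaction:
    """Hypothetical shape of a recorded mortgage transaction."""
    borrower: str
    property_address: str
    zip_code: str
    loan_amount: int
    recording_date: date
    lender: Optional[str] = None
    sale_price: Optional[int] = None


def match_loan(transactions, loan_amount, loan_date, zip_code,
               lender=None, sale_price=None, date_tolerance_days=0):
    """Return candidate transactions for a securitized loan.

    Matches on amount, date, and zip code first. If more than one
    candidate remains, the optional lender and sale price are used to
    narrow the list. A one-element result means the loan was resolved.
    """
    candidates = [
        t for t in transactions
        if t.zip_code == zip_code
        and t.loan_amount == loan_amount
        and abs((t.recording_date - loan_date).days) <= date_tolerance_days
    ]
    if len(candidates) > 1 and lender is not None:
        narrowed = [t for t in candidates if t.lender == lender]
        candidates = narrowed or candidates  # keep prior list if filter empties it
    if len(candidates) > 1 and sale_price is not None:
        narrowed = [t for t in candidates if t.sale_price == sale_price]
        candidates = narrowed or candidates
    return candidates


if __name__ == "__main__":
    recorded = [
        MortgageTransaction("J. Banks", "3100 Palm Dr.", "92618", 350_000,
                            date(2006, 5, 1), lender="Lender A"),
        MortgageTransaction("A. Lee", "12 Oak St", "92618", 350_000,
                            date(2006, 5, 1), lender="Lender B"),
    ]
    hits = match_loan(recorded, 350_000, date(2006, 5, 1), "92618",
                      lender="Lender B")
    print([(t.borrower, t.property_address) for t in hits])
```

Returning the full candidate list, rather than a single record, lets the caller distinguish an unresolved multi-match from a clean single match and handle each case separately.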
As depicted by block 74 of
In block 78 of
As an alternative or supplement to the nationwide search of block 78, a search may be conducted for the borrower Social Security Number (SSN) associated with the securitized loan; if this search is successful, the SSN-based process of
As indicated in block 78, if the mailing address search reveals one or more additional owned properties, a second pass of the search is preferably performed in an attempt to locate additional owned properties. For instance, continuing the example above, if the borrower is determined to have owned property C, a second-pass search would be performed for additional properties (other than A, B and C) for which the borrower received tax bills at property C. If, for example, this second-pass search reveals that the borrower received tax bills for property D at property C, the recorded transaction history of property D would be used to determine whether the borrower owned property D during the relevant time period. The second pass of the search may alternatively be omitted, or may be supplemented with one or more additional passes.
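Determining whether the borrower owned a property during the relevant time period can be sketched by deriving ownership intervals from the recorded transaction history. The tuple layout and the `is_same_person` callable (a stand-in for the name matching step) are assumptions made for this example.

```python
from datetime import date


def ownership_intervals(transactions, is_same_person):
    """Derive (start, end) ownership intervals for one person.

    transactions: iterable of (recording_date, buyer_name, seller_name).
    is_same_person: callable(name) -> bool, standing in for name matching.
    An interval with end=None means no later sale by the person was found.
    """
    intervals = []
    start = None
    for recorded, buyer, seller in sorted(transactions, key=lambda t: t[0]):
        # Close an open interval first so a same-day transfer to oneself
        # (e.g., into a trust under the same name) reopens it below.
        if start is not None and is_same_person(seller):
            intervals.append((start, recorded))
            start = None
        if start is None and is_same_person(buyer):
            start = recorded
    if start is not None:
        intervals.append((start, None))
    return intervals


def owned_on(transactions, is_same_person, when):
    """True if the person held the property on the given date."""
    return any(
        start <= when and (end is None or when < end)
        for start, end in ownership_intervals(transactions, is_same_person)
    )


if __name__ == "__main__":
    history = [
        (date(2009, 6, 15), "Raj Patel", "John Banks"),
        (date(2005, 3, 1), "John Banks", "Ann Lee"),
    ]
    same = lambda name: name == "John Banks"
    print(ownership_intervals(history, same))
    print(owned_on(history, same, date(2007, 1, 1)))
```

Passing the security's offering date as `when` answers the concurrent-ownership question for a single property. Running it across all located properties yields the set owned at that time.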
In block 80, additional data regarding the located property or properties (and possibly the borrower) may be collected from various sources and used to perform one or more types of analytics. For example, if the borrower concurrently owned two or more properties when the securitized loan was issued, the application 44 (or a different analytics application) may determine which, if any, of these properties was most likely the primary residence of the borrower. This determination may be made based on various criteria, such as (1) whether the mailing and property addresses match, and (2) the tax and homeowners exemptions associated with the properties.
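One simple way to combine the two criteria listed above is an additive score. The equal weighting, the dictionary keys, and the rule of returning no answer on a tie are all assumptions of this sketch; the source names the criteria but not how they are weighed.

```python
def likely_primary_residence(properties):
    """Pick the property most likely to be the borrower's primary residence.

    Each property is a dict with hypothetical keys 'address',
    'mailing_address', and 'homeowner_exemption'. Returns the address of
    the top-scoring property, or None when no property shows any signal
    or the top score is tied.
    """
    def score(prop):
        mailing_matches = prop["mailing_address"] == prop["address"]
        return int(mailing_matches) + int(bool(prop["homeowner_exemption"]))

    ranked = sorted(properties, key=score, reverse=True)
    if not ranked or score(ranked[0]) == 0:
        return None
    if len(ranked) > 1 and score(ranked[0]) == score(ranked[1]):
        return None  # ambiguous; leave for manual review
    return ranked[0]["address"]


if __name__ == "__main__":
    owned = [
        {"address": "A", "mailing_address": "C", "homeowner_exemption": False},
        {"address": "C", "mailing_address": "C", "homeowner_exemption": True},
    ]
    print(likely_primary_residence(owned))
```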
Finally, in block 82, the results of the preceding steps are incorporated into one or more electronic reports that can be used, among other purposes, to assess whether the warranties and representations associated with the securitized loan were accurate. For example, if multiple concurrently-owned properties were found, a concurrent ownership report may be generated; this report may, for example, include the following and other types of information for each owned property: address, purchase and sale dates to/from the borrower, sale prices, whether the property likely served as the borrower's primary residence, property value, square footage, mortgage details, loan-to-value ratio, number and dollar amount of any liens, and tax exemption status. The generated report or reports may also include other information, such as the borrower's FICO score at the security origination date and the amount of any valuation misrepresentation. In some cases, the auto-generated reports may be manually reviewed and modified by human personnel before they are made available to the customer.
In some use cases, the customer/user 26 of the analytics system may already know the details of the loan or loans at issue, and may be able to provide such details to the system. In these scenarios, the matching process of blocks 72 and 74 of
As will be apparent, the process shown in
All of the processes and process steps described above (including those of
Thus, all of the methods and tasks described herein may be performed and fully automated by a programmed or specially configured computer system. The computer system may, in some cases, include multiple distinct computers or computing devices (e.g., physical servers, workstations, storage arrays, etc.) that communicate and interoperate over a network to perform the described functions. Each such computing device typically includes a processor (or multiple processors) that executes program instructions or modules stored in a memory or other computer-readable storage medium. Where the system includes multiple computing devices, these devices may, but need not, be co-located.
With further reference to
Components and functions of the analytics system 20 that are not important to an understanding of the inventive subject matter are omitted from the drawings. For example, the analytics system 20 may also include components for handling such tasks as authenticating customers 26, charging customers for services, and otherwise managing customer accounts.
The foregoing description is intended to illustrate, and not limit, the inventive subject matter. The scope of protection is defined by the claims. In the following claims, any reference characters are provided for convenience of description only, and not to imply that the associated steps must be performed in a particular order.
Claims
1. A computer-implemented process of identifying properties owned by an individual, the process comprising:
- (a) receiving input specifying a name and social security number of the individual;
- (b) searching a plurality of data repositories for properties associated with the name and social security number, said plurality of data repositories including a data repository of credit header data and a data repository of aggregated loan data, said credit header data representing non-financial information maintained by one or more credit bureaus, said aggregated loan data representing mortgage loans offered by a plurality of lenders; and
- (c) for each property identified in step (b), assessing, based at least partly on public recorder data that comprises property sales transaction data, whether the property is currently owned by the individual, said assessing comprising comparing the name of the individual to buyer and seller names included in the sales transaction data associated with the property to assess the individual's past and present ownership of the property;
- wherein steps (a)-(c) are performed by a computerized analytics system that comprises one or more computing devices.
2. The process of claim 1, wherein the aggregated loan data comprises data supplied to the analytics system by at least some of said lenders.
3. The process of claim 1, wherein the data repository of aggregated loan data is a loan registry with which a plurality of lenders register loans.
4. The process of claim 1, further comprising, for at least a first property identified in step (b), executing a mailing address search for other properties whose mailing address matches an address of the first property, to thereby search for other properties potentially owned by the individual.
5. The process of claim 4, wherein executing the mailing address search comprises searching aggregated tax assessor data obtained from tax assessors in multiple jurisdictions.
6. The process of claim 4, further comprising, when an additional property is identified via the mailing address search, assessing, based at least partly on the public recorder data, whether the additional property is currently owned by the individual.
7. The process of claim 1, wherein step (c) comprises using a name matching algorithm that accounts for name variations to compare the name of the individual to the buyer and seller names included in the sales transaction data.
8. The process of claim 1, wherein step (c) further comprises comparing at least one AKA (also known as) name associated with the individual to the buyer and seller names included in the sales transaction data.
9. An analytics system, comprising:
- a data repository of credit header data of a plurality of individuals, said credit header data consisting of non-financial data maintained by one or more credit bureaus;
- a data repository of aggregated loan data, said aggregated loan data representing mortgage loans granted by multiple lenders;
- a data repository of aggregated public recorder data, said aggregated public recorder data including property-related data maintained by each of a plurality of public recorders of each of a plurality of jurisdictions; and
- a computing system programmed to search for properties owned by an individual by searching at least the data repositories of credit header data and aggregated loan data for properties associated with a name and social security number of the individual, and by using the data repository of aggregated public recorder data to assess the individual's ownership of said properties, said computing system comprising one or more computing devices, wherein the computing system is programmed to use the data repository of aggregated public recorder data to assess the individual's ownership of said properties by a process that comprises retrieving, from the data repository of aggregated public recorder data, sales transaction data associated with a property, and comparing a name of the individual to buyer and seller names included in the sales transaction data.
10. The analytics system of claim 9, wherein the data repository of aggregated loan data comprises loan data supplied to the analytics system by at least some of the lenders.
11. The analytics system of claim 9, further comprising a data repository of public assessor data aggregated across multiple tax assessors, wherein the computing system is additionally programmed to use mailing addresses stored in the data repository of public assessor data to search for additional properties owned by the individual.
12. The analytics system of claim 9, wherein the computing system is programmed to use a name matching algorithm that accounts for name variations to compare the individual's name to the buyer and seller names.
13. The analytics system of claim 9, wherein the computing system is additionally programmed to compare at least one AKA (also known as) name associated with the individual to the buyer and seller names included in the sales transaction data.
14. A computer-implemented process of generating a profile of a securitized loan, the process comprising:
- (a) receiving information specifying a loan amount, a loan date, and a property zip code associated with the securitized loan;
- (b) conducting a search of at least aggregated public recorder data and/or aggregated loan data for a mortgage transaction that matches the loan amount, loan date, and property zip code;
- (c) in response to identifying a matching mortgage transaction in step (b), determining a borrower name and property address associated with the securitized loan from recorded information regarding the matching mortgage transaction;
- (d) looking up a mailing address associated with said property address; and
- (e) using said mailing address to search for additional properties owned by the borrower;
- wherein steps (a)-(e) are performed by a computerized analytics system that comprises one or more computing devices.
15. The process of claim 14, wherein step (e) comprises searching aggregated tax assessor data for an additional property whose mailing address matches the mailing address identified in step (d), said aggregated tax assessor data comprising data maintained by respective tax assessors in each of a plurality of jurisdictions.
16. The process of claim 14, wherein step (e) comprises, when the mailing address looked up in step (d) is different from the property address associated with the securitized loan, determining, based on the public recorder data, whether the mailing address looked up in step (d) is that of an additional property owned by said borrower.
17. The process of claim 14, further comprising generating a concurrent ownership report that identifies a plurality of properties identified via steps (a)-(e) as owned by said borrower.
Type: Application
Filed: May 16, 2013
Publication Date: Sep 26, 2013
Applicant: CoreLogic Solutions, LLC (Irvine, CA)
Inventors: Dianna L. Serio (Irvine, CA), Felice J. Kesselring (El Dorado Hills, CA)
Application Number: 13/896,053
International Classification: G06Q 40/02 (20120101);