INTERACTIVELY ENTERING DATA INTO THE DATABASE
A system and method for facilitating the accurate entry of information into a highly structured database by initially extracting information from a plurality of nonuniformly formatted source data streams, e.g., documents/files, and subsequent interactions with users before storing the accepted and/or modified information into the database. Embodiments of the present invention provide an interactive path for each user (e.g., the author of the source document/file) to interactively modify the extracted data, e.g., according to the source document/file. Preferably, this interactive path is provided via the Internet and the extracted information can be modified by editing and/or selectively copying portions of the source documents/files to supplement and/or modify the extracted information.
This application is a continuation of, and claims priority to, U.S. application Ser. No. 13/107,699, filed on May 13, 2011, which claims priority to and is a continuation of U.S. application Ser. No. 11/191,898, filed on Jul. 28, 2005, which claims priority to and is a continuation in part of U.S. application Ser. No. 09/019,948, filed on Feb. 6, 1998, which claims the benefit of U.S. Provisional Application No. 60/068,404, filed on Dec. 21, 1997, all of which are incorporated herein by reference in their entirety.
BACKGROUND OF THE INVENTION
The present invention relates to data processing systems for entering information into and accessing information from large structured databases and in particular to those systems which allow multiple independent users to enter information from nonuniformly formatted documents/files and to interact with the system to assure the accuracy of the database entries.
The use of databases for storing data records which can be readily searched is well known. A typical application of large structured databases would be a system for matching jobs and applicants. When used in conjunction with a search engine, i.e., a program that can search for matches between inquiry data and data stored within the database, such a system significantly reduces the manual effort required to match the needs of employers (job providers) and applicants (job seekers). In order to enter applicant data into the database, source documents/files (typically, nonuniformly formatted resumes) can be used. Since the format of text data contained within a resume is typically not standardized, text data extraction software is used to retrieve data for entry into the database. Typical of such data extraction software is that described in U.S. Pat. Nos. 5,164,899 and 5,197,004.
SUMMARY OF THE INVENTION
The present invention is directed to a system for facilitating the accurate transfer of information from a source data stream, e.g., a document/file, to a highly structured database and more particularly to such systems capable of accepting nonuniformly formatted documents, e.g., text documents such as resumes, advertisements, and medical records, from a plurality of users via a remote communication interface, e.g., the Internet, and for extracting information therefrom via a procedure which includes user participation to assure the transfer of appropriate entries into the database.
Embodiments of the present invention provide an interactive path for a user (typically, the author of the source document/file) to interactively modify the extracted information. In a preferred embodiment, this interactive path is provided via the Internet and the extracted information can be altered by editing and/or selectively copying portions of the source document/file to supplement and/or modify the extracted information.
A preferred system for facilitating the accurate transfer of information from each of a plurality of nonuniformly formatted source data streams into a structured database comprises (1) means for supplying digital data representing each of a plurality of source data streams from a plurality of users, each source data stream containing data corresponding to multiple discernible source data strings, (2) data extraction means for extracting selected ones of the source data strings and generating related target data strings, (3) means for displaying a structured form comprised of multiple fields, each field capable of accommodating a data string and wherein one or more of the fields have the target data strings inserted within, (4) means for enabling each user to modify the target strings inserted within the displayed form corresponding to the source data stream originating from the user before accepting the form, and (5) means for storing data corresponding to the data strings from the form fields into the database.
In a further aspect of the present invention, the providing means uses a remote communication interface, preferably using the Internet, to supply the source document/file to the data extraction means and, subsequently, to return the form having target data strings within its fields.
An additional embodiment of the present invention also comprises a means for providing one or more supplemental inquiry forms to a user, receiving data strings in response to the supplemental inquiry forms, and for providing the data strings back to the user along with the target data strings in a structured form.
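As an illustration only, the combination of extracted target data strings and supplemental responses into one structured form might be sketched as follows. The function name, the field names, and the rule that a supplemental response takes precedence over an extracted string for the same field are assumptions made for this sketch and are not taken from the specification.

```python
from typing import Dict, List


def build_structured_form(
    field_names: List[str],
    target_strings: Dict[str, str],
    supplemental_strings: Dict[str, str],
) -> Dict[str, str]:
    """Fill a form's fields from extracted and supplemental data strings.

    Fields with no data stay empty so the user can complete them later.
    Assumption: a supplemental response overrides an extracted string
    when both name the same field.
    """
    form = {name: "" for name in field_names}
    for strings in (target_strings, supplemental_strings):
        for name, value in strings.items():
            if name in form:  # ignore strings that match no form field
                form[name] = value
    return form


# Hypothetical field names and values, for illustration only.
example_form = build_structured_form(
    ["name", "school", "willing_to_relocate"],
    {"name": "J. Smith", "school": "UCLA"},
    {"willing_to_relocate": "yes"},
)
```

The resulting dictionary stands in for the structured form that is returned to the user for review.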
In a further aspect of the present invention, the providing means enables a user to submit digital data in the form of an audio stream. Data processing includes the conversion of the audio stream to a text string. The text string is then processed in the same manner as a user submitted source string containing text.
In an alternative embodiment of the present invention, the providing means uses traditional mail to supply the source document/file to the data extraction means. Alternatively, the data extraction means, structured form generating means, supplemental inquiry form generator, and structured form editing means are supplied to the user's computer as a self executing piece of software.
The novel features of the invention are set forth with particularity in the appended claims. The invention will be best understood from the following description when read in conjunction with the accompanying drawings.
The present invention is directed to a system for facilitating the accurate transfer of information from a source data stream, e.g., a document/file, to a highly structured database and more particularly to such systems capable of accepting nonuniformly formatted documents, e.g., text documents such as resumes, from a plurality of users via a remote communication interface, e.g., the Internet, and for extracting information therefrom via a procedure which includes user participation to assure the transfer of appropriate entries into the database.
In a preferred embodiment of the present invention, the system is used to place nonuniformly formatted advertisements into a structured database. In an additional embodiment of the present invention, the system is used to place nonuniformly formatted medical records into a structured database. Embodiments of the present invention provide an interactive path for a user (typically, the author of the source document/file) to interactively modify the extracted information, e.g., according to the source document/file. In a preferred embodiment, this interactive path is provided via the Internet and the extracted information can be altered by editing and/or selectively copying portions of the source document/file to supplement and/or modify the extracted information.
As shown in
As shown in
Table I shows an exemplary partial list of definitions of the information stored in the data fields 28 of the database 14 of
First, the data extractor 22 extracts source data strings, e.g., text strings 24a-2d, from the resume 12. Optionally, the text format of one or more of the source text strings 24 is then altered by the data extractor 22 to generate target data strings, e.g., text strings 32, of a standardized format. For example, a date text string could be standardized (e.g., Mar. 12, 1993 could be changed to 3/12/93). Otherwise, the stored target text string 32 is essentially identical to the source text string 24. As described further below, each target text string 32 preferably directly corresponds to one of the data fields 28 in the database 14 (e.g., the target string 32 corresponding to source text string 24a corresponds to 28j) and thus, following the modification/acceptance process described below, target text strings 32 are stored via path 34 into the database 14 (following any conversions required by the format of the database 14 and its fields 28).
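The date standardization mentioned above can be illustrated with a minimal sketch. The regular expression, the month table, and the function name are assumptions chosen for this example; the specification does not describe how the data extractor 22 performs the conversion.

```python
import re

# Hypothetical month lookup keyed on the first three letters of the month.
MONTHS = {
    "jan": 1, "feb": 2, "mar": 3, "apr": 4, "may": 5, "jun": 6,
    "jul": 7, "aug": 8, "sep": 9, "oct": 10, "nov": 11, "dec": 12,
}

# Matches strings such as "Mar. 12, 1993" or "March 12, 1993".
DATE_PATTERN = re.compile(r"^([A-Za-z]{3})[a-z]*\.?\s+(\d{1,2}),\s*(\d{4})$")


def standardize_date(source_string: str) -> str:
    """Return M/D/YY for a recognized date; otherwise return the input unchanged."""
    match = DATE_PATTERN.match(source_string.strip())
    if match is None:
        return source_string
    month = MONTHS.get(match.group(1).lower())
    if month is None:
        return source_string
    day = int(match.group(2))
    two_digit_year = int(match.group(3)) % 100
    return f"{month}/{day}/{two_digit_year:02d}"
```

Under these assumptions, a source string that is not recognized as a date passes through unchanged, which mirrors the case where the target text string is essentially identical to the source text string.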
However, due to the lack of structure of the resume 12, the data extractor 22 (also referred to as a natural language processor) is susceptible to making incomplete or erroneous correlations. Accordingly, the present invention provides an interactive path 36 that enables the applicant 16, generally the individual most acquainted with the contents of the resume 12, to modify the target text strings 32 to best correspond to the resume 12 and, thus, enhance the accuracy of the data stored in the database 14.
Preferably, each user site 38 is comprised of the computer 40, e.g., a personal computer, having a display control output 54 that drives a display monitor 56 to generate a displayed output 58 and a data entry device, e.g., a keyboard/mouse 60, that directs operation of the computer 40 via control path 62. In contrast, while the database service provider site 42 may typically also include a monitor and a keyboard/mouse, it only requires a computer 64 that interfaces to the Internet 46.
Initially, the user 16 at user site 38 generates the source document/file, i.e., resume 12, at step 66 of
As a next step, the user 16 requests a first web page form (step 68) via the Internet 46 to begin the process of interactively transferring the resume 12 to the database 14. The first web page form 68 (see
Next, STEP TWO of the process commences by the computer 64 at the database service provider site 42 sending a second web page form 78 (see
STEP THREE of the process commences by the third web page form generator 90 at the database service provider site 42 generating a third web page form 92 (see
The user 16 can now view the displayed form 92 to determine its accuracy. If the displayed data, including target text strings 32 and supplemental text strings 86, are accurate, the user 16 sends back form 92 to the database service provider site 42 where the accepted text strings are extracted in block 98 and stored in database 14. However, as previously discussed, the displayed data is not always accurate. Accordingly, the user 16 can edit data supplied in the third web page form 92 (preferably including using the supplied resume 12) to cause the fields 94 of form 92 to more accurately represent the applicant's resume information. Using features of the web browser 50, the user 16 can in step 100 edit fields 94 and/or paste information from resume 12′ (now part of form 92) to modify the data fields 94. The user in step 102 then sends the modified form 92 back to the database service provider site 42 where accepted text strings 104 from fields 94 are stored in the database 14 in step 98.
As an example of the modification process, it is noted that field 94n corresponding to the third “Company” under “Experience” has been filled in with the target text string 32 “Los Angeles”. This is inaccurate since the data extractor 22 has apparently missed the company name, i.e., Nordstroms, and instead extracted the city name as the target text string 32. Therefore, the user/applicant 16 can identify this inaccuracy and either (1) edit the field 94n by typing in the correct entry or (2) select the source text string 24′ from the copy 12′ of resume 12 included on the third web page form and paste the proper text (Nordstroms) into field 94n. Accordingly, the user/applicant 16 has been given the opportunity to verify and correct the data before entering it into the database 14, thus assuring the accurate transfer of information into the database 14.
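The two correction options in this example, typing a new entry or pasting a selection from the resume copy, might be modeled as in the following sketch. The class name, the method names, the field key, and the sample resume text are hypothetical and serve only to illustrate the idea; an actual embodiment relies on the editing features of the web browser 50.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class StructuredForm:
    """Hypothetical stand-in for form 92: editable fields plus a resume copy."""
    fields: Dict[str, str]
    resume_copy: str

    def edit_field(self, name: str, value: str) -> None:
        """Option (1): replace a field's contents with typed text."""
        if name not in self.fields:
            raise KeyError(f"unknown field: {name}")
        self.fields[name] = value.strip()

    def paste_from_resume(self, name: str, start: int, end: int) -> None:
        """Option (2): copy the selected span of the resume copy into a field."""
        self.edit_field(name, self.resume_copy[start:end])


# Sample data invented for this sketch; only "Nordstroms" and "Los Angeles"
# come from the example in the text.
form = StructuredForm(
    fields={"experience_3_company": "Los Angeles"},
    resume_copy="Sales Associate, Nordstroms, Los Angeles",
)
selection_start = form.resume_copy.index("Nordstroms")
form.paste_from_resume(
    "experience_3_company", selection_start, selection_start + len("Nordstroms")
)
```

After the paste operation, the field that held the city name holds the company name, and the form can be sent back for storage.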
Once the information has been stored in the database 14, a search engine 106, preferably a software program that executes on the computer 64 at the database service provider site 42, can be used to match inquiries, e.g., from one or more employer sites 108 (preferably via the remote communication interface 44) to look for applicants 16 with specific attributes. For example, since the highly structured database 14 contains fields 28 corresponding to the schools attended by each applicant 16, the search engine 106 can, in response to a request from the employer site 108, search for applicants 16 who graduated from specific schools or any other criteria stored in the fields 28 of the database 14.
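A minimal sketch of such a field-based search follows. It assumes, purely for illustration, that each applicant record is a dictionary keyed by field name and that matching is a case-insensitive exact comparison; the search engine 106 is not described at this level of detail, and the records shown are invented.

```python
from typing import Dict, List


def search_applicants(
    records: List[Dict[str, str]], **criteria: str
) -> List[Dict[str, str]]:
    """Return the records whose fields match every criterion (case-insensitive)."""
    return [
        record
        for record in records
        if all(
            record.get(field_name, "").lower() == wanted.lower()
            for field_name, wanted in criteria.items()
        )
    ]


# Hypothetical database contents.
applicant_records = [
    {"name": "A. Lee", "school": "UCLA", "degree": "BS"},
    {"name": "B. Cruz", "school": "USC", "degree": "BA"},
    {"name": "C. Diaz", "school": "ucla", "degree": "MS"},
]

ucla_graduates = search_applicants(applicant_records, school="UCLA")
```

Because every stored value sits in a named field, an inquiry from an employer site reduces to comparing criteria against fields rather than scanning free text.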
In another embodiment of the current invention, the user 16 at user site 38 generates an audio file to be used as the source file, i.e., resume 12, at step 66 of
Once the audio file is received by the database service provider site 42, a speech to text conversion program is used to convert the audio file attached to first web page form 68 into a text file 12. The text file 12 is also stored in resume storage 74. The text file is then used in the same way as a user generated text file 12, as described above, to generate an entry in database 14.
In another embodiment of the invention (see
Initially, the user 16 at user site 38 generates the source document/file, i.e., resume 12, at step 66 of
As a next step, the user 16 requests a first web page form and the associated software 69 (step 68) via the Internet 46 to begin the process of interactively transferring the resume 12 to the database 14. The first web page form 68 (see
The user 16 then preferably provides the existing resume 12 into the software 69 either by entering it directly or via a pasting operation used in conjunction with the web browser 50.
Next, STEP TWO of the process commences by the software on the user's computer 40 which generates a second form 78 (see
STEP THREE of the process commences by the software 69 on the user's computer 40 generating a third form 92 (see
The user 16 can now use the software 69 to view the displayed form 92 to determine its accuracy. If the displayed data, including target text strings 32 and supplemental text strings 86 are accurate, the user 16 sends back form 92 using the software to the database service provider site 42 via the Internet, where the accepted text strings are extracted in block 98 and stored in database 14. However, as previously discussed, the displayed data is not always accurate. Accordingly, the user 16 can edit data supplied in the third form 92 (preferably including using the supplied resume 12) to cause the fields 94 of form 92 to more accurately represent the applicant's resume information. Using features of the web browser 50, the user 16 can in step 100 edit fields 94 and/or paste information from resume 12′ (now part of form 92) to modify the data fields 94. The user in step 102 then uses their browser to send the modified form 92 back to the database service provider site 42 using the software where accepted text strings 104 from fields 94 are stored in the database 14 in step 98.
In another embodiment of the invention, the text extractor, structured form generator, supplemental question page generator, and structured form editor are supplied to the user's computer as a self executing piece of software 69 by the database service provider. In this embodiment, the user would not need to have an Internet connection at all.
The user contacts the database service provider using for example e-mail, telephone or traditional mail requesting the software 69. The software 69 is sent to the user on portable storage media through traditional mail and is executable as a stand alone program on the user's computer 40.
The functionality is similar to the above embodiments except that once the process is complete the user is prompted to save the completed resume 12 to portable storage media. The user then sends the storage media to the database service provider using traditional mail. Once received, the database service provider takes the resume 12 off of the portable storage media and places the resume contents into the database 14.
Although the present invention has been described in detail with reference only to the presently-preferred embodiments, those of ordinary skill in the art will appreciate that various modifications can be made without departing from the invention. For example, while a job search environment has been primarily described, the present invention can be useful in other environments where the source document is essentially unstructured relative to a highly structured database. Accordingly, the invention is defined by the following claims.
Claims
1. A method for facilitating via an interactive path a transfer of resume data to a service provider, the method comprising:
- (a) receiving, by a service provider via an interactive path with a job applicant, digital data comprising a resume of the job applicant, the interactive path provided via a remote communication interface;
- (b) extracting, by the service provider, a plurality of data strings from the resume;
- (c) sending, by the service provider via the interactive path to the job applicant, an inquiry form asking the job applicant a supplemental question;
- (d) receiving, by the service provider via the interactive path from the job applicant, a data string responsive to the supplemental question; and
- (e) sending, by the service provider via the interactive path to the job applicant, a structured form comprising a plurality of fields, a first field of the plurality of fields accommodating an extracted data string from the resume and a second field of the plurality of fields accommodating the data string responsive to the supplemental question.
2. The method of claim 1, wherein step (a) further comprises receiving, by the service provider via the interactive path, digital data representing a source data stream from the job applicant, the source data stream containing data corresponding to multiple discernable data strings.
3. The method of claim 1, wherein step (a) further comprises receiving, by the service provider, a request from the job applicant for a web page to begin a process via the interactive path of interactively transferring the resume to the service provider.
4. The method of claim 2, further comprising sending, by the service provider via the interactive path, the web page for display by a browser used by the job applicant.
5. The method of claim 1, wherein step (a) further comprises storing, by the service provider, the resume to a database.
6. The method of claim 1, wherein step (b) further comprises extracting, by the service provider, the plurality of data strings according to syntactical rules.
7. The method of claim 1, wherein step (b) further comprises storing, by the service provider, the extracted plurality of data strings to a database.
8. The method of claim 1, wherein step (c) further comprises sending, by the service provider via the interactive path, a web page comprising the inquiry form.
9. The method of claim 1, wherein step (d) further comprises receiving, by the service provider via the interactive path from the job applicant, a filled-in inquiry form.
10. The method of claim 1, wherein step (e) further comprises sending, by the service provider via the interactive path to the job applicant, a web page comprising the structured form.
11. The method of claim 1, wherein step (e) further comprises receiving, by the service provider via the interactive path from the job applicant, one of acceptance or modification of data of the plurality of fields of the structured form.
12. The method of claim 1, wherein step (e) further comprises receiving, by the service provider via the interactive path from the job applicant, modification of data in the one or more fields of the plurality of fields of the structured form.
13. A system for facilitating via an interactive path a transfer of resume data to a service provider, the system comprising:
- a remote communication interface of a service provider receiving digital data comprising a resume of a job applicant, the remote communication interface providing an interactive path with the job applicant;
- an extractor extracting a plurality of data strings from the resume;
- a form generator sending via the interactive path to the job applicant, an inquiry form asking the job applicant a supplemental question; and
- wherein the service provider receives via the interactive path from the job applicant, a data string responsive to the supplemental question; and
- wherein the form generator sends via the interactive path to the job applicant, a structured form comprising a plurality of fields, a first field of the plurality of fields accommodating an extracted data string from the resume and a second field of the plurality of fields accommodating the data string responsive to the supplemental question.
14. The system of claim 13, wherein the service provider receives via the interactive path digital data representing a source data stream from the job applicant, the source data stream containing data corresponding to multiple discernable data strings.
15. The system of claim 13, wherein the remote communication interface receives a request from the job applicant for a web page to begin a process via the interactive path of interactively transferring the resume to the service provider.
16. The system of claim 15, wherein the form generator sends via the interactive path the web page for display by a browser used by the job applicant.
17. The system of claim 13, further comprising a database to store the resume.
18. The system of claim 13, wherein the extractor extracts the plurality of data strings according to syntactical rules.
19. The system of claim 13, wherein a database stores the extracted plurality of data strings.
20. The system of claim 13, wherein the form generator sends, via the interactive path, a web page comprising the inquiry form.
21. The system of claim 13, wherein the service provider receives via the interactive path from the job applicant a filled-in inquiry form.
22. The system of claim 13, wherein the form generator sends via the interactive path to the job applicant a web page comprising the structured form.
23. The system of claim 13, wherein the service provider receives via the interactive path from the job applicant, one of acceptance or modification of data of the plurality of fields of the structured form.
24. The system of claim 13, wherein the service provider receives via the interactive path from the job applicant, modification of data in the one or more fields of the plurality of fields of the structured form.
Type: Application
Filed: Apr 25, 2016
Publication Date: Nov 24, 2016
Inventors: David S. de Hilster (Long Beach, CA), Alan G. Porter (Huntington Beach, CA), John Reese (Los Angeles, CA)
Application Number: 15/137,436