SYSTEM AND METHOD FOR ANALYZING A RESUME AND DISPLAYING A SUMMARY OF THE RESUME

A computer implemented method for generating a summary of one or more resume from one or more of resumes to analyze insights of the one or more resume is provided. The computer implemented method includes (i) processing a first input that includes a first indication to select a first resume from one or more of resumes, (ii) extracting, from the first resume, a first information, (iii) obtaining, from the first resume, a second information, (iv) generating a first table based on the first information and the second information, and (v) generating a first summary based on the first table, wherein the first summary indicates a first correlation between (i) the one or more events associated with the first section and (ii) the one or more events associated with the second section over years.

Description
CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to Indian patent application no. 350/CHE/2012 filed on Jan. 30, 2012, the complete disclosure of which, in its entirety, is herein incorporated by reference.

BACKGROUND

1. Technical Field

The embodiments herein generally relate to a resume summarizer tool, and more particularly, to a system and method for summarizing one or more resumes using the resume summarizer tool using natural language processing and weighted formal concept analysis (wFCA).

2. Description of the Related Art

Recruitment is the process of attracting, screening and selecting a qualified person for a job. Irrespective of its size, every organization needs the right candidate who suits its needs. The process of recruitment is not at all an easy task. It has always been a challenge for any organization due to the high number of candidate resumes coming in for a specific job description.

At present, a recruiter has to manually check each candidate's resume for its relevancy with respect to the job description. Thus, for preliminary screening of the candidate, one has to manually check the resume. Usually, this process is time consuming and also increases labor costs.

Further, there are many existing job portals that provide recruiters a way of searching for candidates in their databases. The recruiters can search for the resumes using the keywords associated with the job. The job portal retrieves the resumes that match the keywords. The recruiters need to download and analyze each resume to identify a best resume. The recruiters may also be interested in knowing the key insights of each resume. Accordingly, there remains a need for analyzing one or more resumes to identify key insights in the resume.

SUMMARY

In view of the foregoing, the embodiment herein provides a computer implemented method for generating a summary of one or more resume from one or more of resumes to analyse insights of the one or more resume. The computer implemented method includes (i) processing a first input that includes a first indication to select a first resume from the one or more of resumes, (ii) extracting, from the first resume, a first information, the first information includes (a) a first section, (b) a second section, (c) one or more events associated with the first section, and (d) one or more events associated with the second section, (iii) obtaining, from the first resume, a second information, (iv) generating a first table based on the first information and the second information, and (v) generating a first summary based on the first table, the first summary indicates a first correlation between (i) the one or more events associated with the first section and (ii) the one or more events associated with the second section over years. In one embodiment, the first information further includes (a) one or more first date range of the first section, (b) one or more second date range of the second section, (c) one or more third date range of the one or more events associated with the first section, and (d) one or more fourth date range of the one or more events associated with the second section. In another embodiment, the one or more first date range, the one or more second date range, the one or more third date range and the one or more fourth date range each includes a start date and an end date, the start date and the end date include a year, a date or a month.
The second information may include (a) a first period associated with the first section, (b) a second period associated with the second section, (c) a third period associated with the one or more events associated with the first section, and (d) a fourth period associated with the one or more events associated with the second section.

In yet another embodiment, (a) the one or more third date range of the one or more events associated with the first section and (b) the one or more fourth date range of the one or more events associated with the second section overlap with each other. The first summary may include a first graphical representation that illustrates the first correlation. The first graphical representation may illustrate (a) the one or more events associated with the first section using a first color and (b) the one or more events associated with the second section using a second color. The first graphical representation may display at least one of the first information when a cursor moves over the first graphical representation. The computer implemented method may further include (i) processing the first input that includes a second indication to select a second resume from the one or more of resumes, (ii) extracting, from the second resume, a third information, the third information includes (a) a third section, (b) a fourth section, (c) one or more events associated with the third section, and (d) one or more events associated with the fourth section, (iii) obtaining, from the second resume, a fourth information, (iv) generating a second table based on the third information and the fourth information, and (v) generating a second summary based on the second table, the second summary indicates a second correlation between (i) the one or more events associated with the third section and (ii) the one or more events associated with the fourth section over years, the first summary and the second summary indicate the insights of the first resume and the second resume.

In one aspect, a non-transitory program storage device that is readable by a computer and that includes a program of instructions executable by the computer to generate a summary of one or more resume from one or more of resumes is provided. The method includes (i) processing a first input that includes a first indication to select a first resume from the one or more of resumes, (ii) extracting, from the first resume, a first information, the first information includes (a) a first section, (b) a second section, (c) one or more events associated with the first section, and (d) one or more events associated with the second section, (iii) obtaining, from the first resume, a second information, (iv) generating a first table based on the first information and the second information, and (v) generating a first summary, the first summary includes a first graphical representation that is generated based on the table. The first graphical representation illustrates a first correlation between (i) the one or more events associated with the first section and (ii) the one or more events associated with the second section over years. The first summary may further include a second graphical representation that is generated based on the table. The second graphical representation illustrates an overall summary of the first section and the second section.

In one embodiment, the computer implemented method further includes (i) processing a second input that includes a second indication to select one or more sections from the first section and the second section, and (ii) generating a second summary for the one or more sections selected by the second input, the second summary includes a third graphical representation that is generated based on the table. The third graphical representation illustrates a third correlation between the one or more events associated with the one or more sections selected by the second input over years. The method may further include (i) processing a third input that includes an indication to select at least one of (a) the first resume and (b) the second resume based on a comparison between the first summary of the first resume and the second summary of the second resume, and (ii) identifying a duration within at least one of (i) the first section and (ii) the second section of at least one of (a) the first resume and (b) the second resume, wherein the duration does not comprise an event.

In another aspect, a system for summarizing one or more resume from one or more of resumes is provided. The system includes (a) a memory unit that stores a database and a set of modules, the database stores the one or more of resumes, (b) a display unit, and (c) a processor that executes the set of modules. The set of modules includes (i) a content extracting module that extracts, from the one or more resume, a first information and a second information, and (ii) a report generation module that generates a summary based on the first information and the second information, the summary is displayed in the display unit. The first information includes (a) a first section, (b) a second section, (c) one or more events associated with the first section, (d) one or more events associated with the second section, (e) one or more first date range of the first section, (f) one or more second date range of the second section, (g) one or more third date range of the one or more events associated with the first section and (h) one or more fourth date range of the one or more events associated with the second section. The second information may include (a) a first period associated with the first section, (b) a second period associated with the second section, (c) a third period associated with the one or more events associated with the first section, and (d) a fourth period associated with the one or more events associated with the second section.

In one embodiment, the set of modules further includes (i) a table generating module that generates a table based on the first information and the second information, the summary is generated based on the table, (ii) a duration determining module that extracts one or more of date ranges from the one or more resume, and (iii) a boundary annotation module that determines, from the one or more of date ranges, (a) the one or more first date range of the first section, (b) the one or more second date range of the second section, (c) the one or more third date range of the one or more events associated with the first section, and (d) the one or more fourth date range of the one or more events associated with the second section.

These and other aspects of the embodiments herein will be better appreciated and understood when considered in conjunction with the following description and the accompanying drawings. It should be understood, however, that the following descriptions, while indicating preferred embodiments and numerous specific details thereof, are given by way of illustration and not of limitation. Many changes and modifications may be made within the scope of the embodiments herein without departing from the spirit thereof, and the embodiments herein include all such modifications.

BRIEF DESCRIPTION OF THE DRAWINGS

The embodiments herein will be better understood from the following detailed description with reference to the drawings, in which:

FIG. 1 illustrates a system view of users communicating with a user system for summarizing one or more resumes using a resume summarizer tool according to an embodiment herein;

FIG. 2 illustrates an exploded view of the user system with a memory storage unit for storing the resume summarizer tool of FIG. 1 and an external database according to an embodiment herein;

FIG. 3 is an exploded view of the resume summarizer tool of FIG. 1 illustrating a process of analysing the one or more resumes according to an embodiment herein;

FIG. 4 illustrates a user interface view of the content collection module of FIG. 3 of the resume summarizer tool of FIG. 1 according to an embodiment herein;

FIG. 5 illustrates a user interface view of a resume provided to the resume summarizer tool of FIG. 1 according to an embodiment herein;

FIG. 6 illustrates an exploded view of the content annotation module of FIG. 3 of the resume summarizer tool of FIG. 1 according to an embodiment herein;

FIG. 7 illustrates a table that is generated using the table generating module of the resume summarizer tool of FIG. 3 according to an embodiment herein;

FIG. 8 illustrates a graphical representation generated using the report generation module of FIG. 3 according to an embodiment herein;

FIG. 9 illustrates a user interface view of intent selection by the users of FIG. 1 according to an embodiment herein;

FIG. 10 illustrates a graphical representation of the event line of FIG. 8 that indicates one or more overlapping events according to an embodiment herein;

FIG. 11A is a user interface view illustrating a comparison of two or more resumes using the resume comparison module of FIG. 3 according to an embodiment herein;

FIG. 11B is the user interface view illustrating a first graphical representation associated with the resume R1 and a second graphical representation associated with the resume R5 according to an embodiment herein;

FIG. 12 is a flow diagram illustrating a method for generating a summary of at least one resume from a set of resumes to analyze insights of the at least one resume using the resume summarizer tool of FIG. 1 according to an embodiment herein; and

FIG. 13 illustrates a schematic diagram of a computer architecture used in accordance with the embodiments herein.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

The embodiments herein and the various features and advantageous details thereof are explained more fully with reference to the non-limiting embodiments that are illustrated in the accompanying drawings and detailed in the following description. Descriptions of well-known components and processing techniques are omitted so as to not unnecessarily obscure the embodiments herein. The examples used herein are intended merely to facilitate an understanding of ways in which the embodiments herein may be practiced and to further enable those of skill in the art to practice the embodiments herein. Accordingly, the examples should not be construed as limiting the scope of the embodiments herein.

As mentioned, there remains a need for a resume summarizer tool that analyzes resume(s) and generates a summary of the resumes in graphical representation to illustrate the key insights of the resumes. The resume summarizer tool performs overall resume summarization, compares one or more resumes irrespective of their formats, extracts the events and provides key insights. Referring now to the drawings, and more particularly to FIGS. 1 through 13, where similar reference characters denote corresponding features consistently throughout the figures, there are shown preferred embodiments.

FIG. 1 illustrates a system view of users 102A-B communicating with a user system 104A-N for summarizing one or more resumes using a resume summarizer tool 106 according to an embodiment herein. The user system 104A-N may be a personal computer (PC) 104A, a tablet 104B and/or a smart phone 104N. A user 102A is one or more recruiters and a user 102B is one or more job seekers. The user system 104A-N includes the resume summarizer tool 106 that summarizes the one or more resumes. The resume summarizer tool 106 receives one or more resumes from the user 102B, in one example embodiment. The resume summarizer tool 106 may (i) obtain the one or more resumes from one or more job portals, and/or (ii) fetch the one or more resumes from emails, etc., in another example embodiment.

FIG. 2 illustrates an exploded view of the user system 104A-N with a memory storage unit 202 for storing the resume summarizer tool 106 of FIG. 1 and an external database 216 according to an embodiment herein. The user system 104A-N includes a memory storage unit 202, a bus 204, a communication device 206, a processor 208, a cursor control 210, a keyboard 212 and a display 214. The memory storage unit 202 stores the resume summarizer tool 106. The resume summarizer tool 106 includes one or more software modules to perform various functions on an input content and assists one or more recruiters 102A in choosing the right candidate for a given job description. The external database 216 includes a knowledge base 218 that is populated with a set of categories based on one or more concepts of linked data. The set of categories correspond to various keywords.

FIG. 3 is an exploded view of the resume summarizer tool 106 of FIG. 1 illustrating a process of analyzing the one or more resumes according to an embodiment herein. The resume summarizer tool 106 includes a database 302, a content collection module 304, a content parsing/extraction module 306, a content cleaning module 308, a content annotation module 310, an extracting module 312 (e.g. a content extracting module), a lattice construction module 314, a table generating module 316, a report generation module 318, and a report comparison module 320.

The database 302 stores the one or more resumes that are uploaded in the resume summarizer tool 106. The content collection module 304 collects content or text associated with the one or more resumes. The one or more resumes may be obtained from the user 102B (one or more candidates who are interested in seeking a job), in one example embodiment. The one or more resumes may be in a .doc format, a .pdf format, an .rtf format and/or obtained from a Uniform (or universal) resource locator (URL), etc. The content parsing/extraction module 306 extracts the content and/or text from the one or more resumes. Further, the content parsing/extraction module 306 parses HTML content when the one or more resumes are obtained from the URL. The content cleaning module 308 cleans the content before sending it to the content annotation module 310. Cleaning may include removal of junk characters, new lines that are not useful, application specific symbols (e.g., MS Word bullets), and/or non-Unicode characters, etc. In one embodiment, specific parts of the document (e.g., a header and/or a footer) may be excluded.

The content annotation module 310 annotates the content of the one or more resumes for useful information. The useful information may include sentences, keywords, tokens, new lines, one or more sections (e.g., objectives, a work experience, education, circular activities, and/or personal information, etc.), durations (e.g., a first date range such as 2010-2012 that is associated with the one or more sections of the resume, and a second date range 2012-2013 that is associated with the one or more events of the one or more sections), durations within the sections, sentences associated with sections, and sentences associated with duration of the resume associated with a candidate. The one or more sections may include one or more events (e.g., the candidate has 2 years of experience in Java, C and C++).

Once the annotations are done in the content annotation module 310, the extracting module 312 extracts one or more artifacts (e.g., sentences, keywords, tokens, new lines, sections such as objectives, a work experience, education, circular activities, and/or personal information, etc., durations such as one or more date ranges, durations within the sections, sentences associated with sections, and sentences associated with duration of the one or more resumes associated with the one or more candidates). The extracting module 312 extracts a name, an email address, a phone number, and any other contact details that are mentioned in the resume, etc. In addition, the extracting module 312 may also identify and extract a second set of information (e.g., second information, fourth information). The second set of information includes (a) a first period that corresponds to at least one section in the one or more resumes and (b) a third period that corresponds to at least one event associated with the at least one section in the one or more resumes.

The lattice construction module 314 disambiguates one or more keywords in the resume to compute the context in which the keywords from the one or more resumes are used. The lattice construction module 314 constructs a lattice based on a weighted Formal Concept Analysis (wFCA) using the one or more keywords from the one or more resumes as objects, and their corresponding categories as attributes.

The table generating module 316 generates a table (e.g., a first table and a second table) for each of the one or more resumes using the one or more artifacts extracted in the extracting module 312. The report generation module 318 generates a summary (e.g., a first summary and a second summary) based on the table generated in the table generating module 316 for each of the one or more resumes. In one embodiment, the report generation module 318 generates one or more graphical representations based on the table. The report comparison module 320 compares the one or more resumes and assigns a weight for each of the one or more resumes based on the one or more keywords from the one or more resumes as objects, and the one or more resume as attributes.

FIG. 4 illustrates a user interface view of the content collection module 304 of FIG. 3 of the resume summarizer tool 106 of FIG. 1 according to an embodiment herein. The user interface view of the content collection module 304 includes a header 402, a text field 404, an upload button 406, a URL text field 408, a fetch button 410, a drag and drop field 412, an upload a file button 414, a task status table 416, a task progress field 418, and a proceed button 420. The header 402 displays a logo, a welcome message, and the status of an application. The user 102B (e.g., a job seeker) may upload the resume in one or more formats. Through the text field 404, the user 102B can provide details of the resume in the form of plain text and click on the upload button 406 to upload the plain text provided in the text field 404 to a remote server.

The plain text may also be provided as a URL in the URL text field 408, and the resume associated with the URL is crawled using the fetch button 410. The drag and drop field 412 helps the user 102B to drag and drop his/her resume(s) to be uploaded. Through the upload a file button 414, the user 102B can browse his/her resume(s) to be uploaded. The task status table 416 displays the uploaded resume as plain text, the URL, and/or the resume. The task progress field 418 notifies the user 102B about the progress of analyzing the resume. The user 102B is redirected to a next page when he/she clicks on the proceed button 420.

FIG. 5 illustrates a user interface view 500 of a resume 502 provided to the resume summarizer tool 106 of FIG. 1 according to an embodiment herein. The resume 502 may be either submitted by a recruiter or by a candidate, in one example embodiment. The resume 502 may be obtained in the form of a document, a URL, and/or a plain-text. In one embodiment, the content in the resume 502 is parsed/extracted (e.g., using the content parsing/extraction module 306 of FIG. 3). In another embodiment, the resume 502 may be fed as a URL (e.g., www.abc.com/xyz-resume.html).

The content cleaning module 308 cleans the content obtained from the content parsing/extraction module 306 before sending it for annotation. Cleaning the content is required to remove junk characters, new lines that are not useful, application specific symbols (word processing bullets, etc.), and/or non-Unicode characters, etc. In one embodiment, the content from the resume 502 itself is a cleaned text (e.g., a ready text).
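
By way of a non-limiting illustration, the cleaning step may be sketched as follows in Python. The function name clean_content and the BULLETS character set are hypothetical and are not part of the embodiments herein; the sketch assumes that control and private-use characters are junk, and that a line consisting only of white space or bullets is a new line that is not useful.

```python
import re
import unicodedata

# Hypothetical set of visible bullet glyphs to strip from the start of a line.
BULLETS = "•▪●◦■"

def clean_content(text: str) -> str:
    """Return text without junk characters, bullets, and empty lines."""
    text = unicodedata.normalize("NFKC", text)
    # Normalise line endings and tabs before dropping control characters.
    text = text.replace("\r\n", "\n").replace("\r", "\n").replace("\t", " ")
    # U+FFFD is what a decoder emits for bytes it could not decode.
    text = text.replace("\ufffd", "")
    # Drop control, format, surrogate, private-use and unassigned characters
    # (Unicode categories starting with "C"), but keep line breaks.
    text = "".join(
        ch for ch in text
        if ch == "\n" or not unicodedata.category(ch).startswith("C")
    )
    cleaned_lines = []
    for line in text.split("\n"):
        line = line.strip().lstrip(BULLETS).strip()
        line = re.sub(r" +", " ", line)  # collapse runs of spaces
        if line:  # skip lines that carry no content
            cleaned_lines.append(line)
    return "\n".join(cleaned_lines)

raw = "\uf0b7 PhD at MIT\x00 Media Lab\n\n\n• Working   Experience\r\n"
print(clean_content(raw))
```

In this sketch the private-use code point U+F0B7, which some word processors emit for bullets, is removed together with the control character, and the three consecutive new lines collapse into one line break.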

FIG. 6 illustrates an exploded view of the content annotation module 310 of FIG. 3 of the resume summarizer tool 106 of FIG. 1 according to an embodiment herein. The content annotation module 310 includes a token annotations module 602, a sentence annotations module 604, a stem annotations module 606, a forced new lines, paragraphs and indentations computing module 608, a parts of speech (POS) tag token annotations module 610, a POS line annotations module 612, a duration determining module 614, a section annotations module 616, and a boundary annotations module 618. The dotted lines of FIG. 6 represent internal dependencies among the various modules. The solid lines represent the flow of the annotation process. The content annotation module 310 annotates a cleaned content obtained from the content cleaning module 308 for useful information. The useful information may include sentences, keywords, tokens, new lines and a first set of information (e.g., a first information, a third information). The first set of information includes one or more sections (a first section, a second section, a third section, a fourth section), durations (e.g., a first date range, such as 2010-2012 that is associated with the one or more sections of the resume, and a second date range 2012-2013 that is associated with the one or more events of the one or more sections), durations within the sections, sentences associated with sections, and sentences associated with duration of the resume associated with a candidate. The one or more sections include objectives, a work experience, education, circular activities, and/or personal information, etc. The one or more sections may include one or more events (e.g., the candidate has 2 years of experience in Java, C and C++).

The cleaned content obtained from the content cleaning module 308 is annotated by performing various levels of annotations using the modules of the content annotation module 310 of FIG. 3. The sentence annotations module 604 extracts each and every sentence from the cleaned content. For example, the first sentence of the cleaned content obtained from the resume 502 that is extracted by the sentence annotations module 604 is

    • “PhD at MIT Media Lab, Massachusetts Institute of Technology.”

Similarly, the sentence annotations module 604 extracts all the sentences from the resume 502.

The token annotations module 602 determines each and every token in the extracted sentences. For example, “PhD”, “at”, “MIT”, “Media”, “Lab”, “,”, “Massachusetts”, “Institute”, “of”, “Technology” are all tokens in the first line of the cleaned content of the resume 502. The stem annotations module 606 computes the root word for each and every token identified by the token annotations module 602.
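
By way of a non-limiting illustration, the token determination may be sketched in Python using a regular expression that treats each run of word characters, and each punctuation mark, as a separate token. The pattern and the function name tokenize are assumptions made for this sketch; the embodiments herein do not prescribe a particular tokenizer, and stemming is not shown.

```python
import re
from typing import List

# A token is either a run of word characters or one punctuation character.
TOKEN_PATTERN = re.compile(r"\w+|[^\w\s]")

def tokenize(sentence: str) -> List[str]:
    """Split a sentence into word tokens and punctuation tokens."""
    return TOKEN_PATTERN.findall(sentence)

first_sentence = "PhD at MIT Media Lab, Massachusetts Institute of Technology."
print(tokenize(first_sentence))
```

With this pattern the comma after “Lab” and the final full stop each become a token of their own, which is consistent with the comma being listed among the tokens above.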

The POS token annotations module 610 generates one or more part of speech (POS) tags such as noun, and/or verb, etc. for each token in the sentences such that each token annotation has an associated POS tag. The forced new lines, paragraphs and indentations computing module 608 determines white spaces like new lines that are forced (e.g., pressed enter, list of items that are not proper sentences), paragraphs, and/or indentations, etc. Further, the POS line annotations module 612 tags each token in the extracted new lines as a noun, and/or a verb, etc. In addition, new lines are also useful for section extraction because section names may not be proper sentences. For example, in the resume content 502, “education” and “working experience” are not proper sentences but a word, and a fragment of two words respectively. These are captured as a new line (e.g., using the section annotations module 616) because they occur in a separate line.

The duration determining module 614 extracts one or more duration(s) wherever they occur in the content of the resume 502. For example, it extracts duration(s) like “2008 to current”, “2006 to current”, etc. The section annotations module 616 determines a group of sentences that form a section that has a heading. To determine a start point and an end point of the section, various heuristics are used, such as lookup for well known sections, sentence construction based on parts of speech, relevance with respect to surrounding text, exclusion terms, term co-occurrence, etc.
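
By way of a non-limiting illustration, the duration extraction may be sketched in Python with a regular expression. The pattern below is an assumption made for this sketch: it accepts a four-digit year starting with 19 or 20, followed by “to” or a hyphen, followed by another such year or the word “current” or “present”. The embodiments herein do not specify the pattern actually used by the duration determining module 614.

```python
import re
from typing import List, Tuple

# Start year, a separator ("to" or "-"), then an end year or an open-ended word.
DURATION_PATTERN = re.compile(
    r"\b((?:19|20)\d{2})\s*(?:to|-)\s*((?:19|20)\d{2}|current|present)\b",
    re.IGNORECASE,
)

def extract_durations(text: str) -> List[Tuple[str, str]]:
    """Return (start, end) pairs for every duration found in the text."""
    return [
        (match.group(1), match.group(2).lower())
        for match in DURATION_PATTERN.finditer(text)
    ]

line = "PhD at MIT Media Lab, Massachusetts Institute of Technology 2008 to current"
print(extract_durations(line))
print(extract_durations("stands 1st out of 2000 people"))
```

Because a separator and an end point are required, a bare year-like number that merely appears in running text is not reported as a duration in this sketch.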

The boundary annotations module 618 associates related text with the duration identified by the duration determining module 614. Most often, there may be information that is associated with the duration but is not mentioned in the same line where duration occurs. The boundary annotations module 618 assigns a right boundary and a left boundary to identify exact information associated with the duration. For example,

    • “PhD at MIT Media Lab, Massachusetts Institute of Technology 2008 to current; Massachusetts Institute of Technology; CPA 5.0/5.0
    • Master of Science at MIT Media Lab, Massachusetts Institute of Technology 2006 to current; Media Arts and Sciences; Massachusetts Institute of Technology; CPA 4.9/5.0
    • Master of Design at IDC, IIT Bombay 2003 to 2005; Industrial Design Centre, Indian Institute of Technology, Bombay; CPA 4.9/5.0
    • Bachelor of Computer Engineering at Gujarat University 1999 to 2003; Nirma Institute of Technology; Gujarat University; CPA 4.7/5.0
    • Working Experience”

In the example, the text shown is selected from the education section and a new section (“working experience”) of the cleaned content of the resume 502. The duration determining module 614 determines the periods such as “2008 to current”, “2006 to current”, “2003 to 2005” and “1999 to 2003”. The section annotations module 616 determines “working experience” as a new section. The boundary annotations module 618 assigns the left boundary and the right boundary for each of the identified durations. The left boundary for the duration “2008 to current” is “PhD at MIT Media Lab, Massachusetts Institute of Technology”. The right boundary is “Master of Science at MIT Media Lab, Massachusetts Institute of Technology”. Both these lines, left and right of the duration annotation, are considered as possible associations with the duration “2008 to current”. Similarly, left and right boundaries are assigned for each of the durations. The right boundary for the last duration “1999 to 2003” is a new section (“working experience”). Therefore, the boundary annotations module 618 computes that the right boundary for the last duration is not associated with the context of that duration. Further, the resume summarizer tool 106 understands the section and the context in which year-like numbers occur and includes or excludes them based on the context. For example, if a candidate's resume states that the “person stands 1st out of 2000 people who have all attended the interview”, then the resume summarizer tool 106 correctly identifies that 2000 is not part of the duration.

Further, the boundary annotations module 618 uses a simple heuristic to determine the best possible association for the entire section. The heuristic counts the number of left and right associations for the entire section. In the above example, the number of left associations is greater than the number of right associations since the last duration annotation does not have any line covered by the right boundary. Since the left associations outnumber the right associations, the boundary annotations module 618 will consider the left association as the best possible association. Thus the duration “2008 to current” is associated with the “PhD at MIT Media Lab, Massachusetts Institute of Technology”.
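
By way of a non-limiting illustration, the counting heuristic may be sketched in Python. The sketch simplifies a section into an ordered list of (kind, value) items, where kind is “text”, “duration” or “section”; this representation, the function name associate_durations, and the choice of the left side when the counts are equal are assumptions made for this sketch.

```python
from typing import List, Tuple

Item = Tuple[str, str]  # (kind, value); kind is "text", "duration" or "section"

def associate_durations(items: List[Item]) -> List[Tuple[str, str]]:
    """Pair each duration with its left or right neighbour, by majority."""
    left: List[Tuple[str, str]] = []
    right: List[Tuple[str, str]] = []
    for i, (kind, value) in enumerate(items):
        if kind != "duration":
            continue
        # A neighbour counts as an association only if it is ordinary text,
        # so a section heading on either side is not counted.
        if i > 0 and items[i - 1][0] == "text":
            left.append((value, items[i - 1][1]))
        if i + 1 < len(items) and items[i + 1][0] == "text":
            right.append((value, items[i + 1][1]))
    # Assumption: prefer the left side when both counts are equal.
    return left if len(left) >= len(right) else right

education = [
    ("text", "PhD at MIT Media Lab, Massachusetts Institute of Technology"),
    ("duration", "2008 to current"),
    ("text", "Master of Science at MIT Media Lab, Massachusetts Institute of Technology"),
    ("duration", "2006 to current"),
    ("text", "Master of Design at IDC, IIT Bombay"),
    ("duration", "2003 to 2005"),
    ("text", "Bachelor of Computer Engineering at Gujarat University"),
    ("duration", "1999 to 2003"),
    ("section", "Working Experience"),
]
for duration, text in associate_durations(education):
    print(duration, "->", text)
```

For the education section above the sketch counts four left associations and three right associations, because the item to the right of “1999 to 2003” is a section heading, and therefore selects the left associations for the whole section.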

Once the annotations are done, the extracting module 312 extracts one or more artifacts (e.g., sentences, keywords, tokens, new lines, sections such as objectives, a work experience, education, circular activities, and/or personal information, etc., durations such as one or more date ranges, durations within the sections, a gap duration (e.g., a duration within the sections that does not include an event), sentences associated with sections, and sentences associated with duration of the one or more resumes associated with the one or more candidates).
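
By way of a non-limiting illustration, a gap duration may be computed as sketched below in Python. The sketch assumes that date ranges have already been reduced to (start year, end year) pairs of integers and that an event ending in a given year leaves no gap before an event starting in the same year; the function name find_gaps is hypothetical.

```python
from typing import List, Tuple

YearRange = Tuple[int, int]  # (start year, end year)

def find_gaps(section: YearRange, events: List[YearRange]) -> List[YearRange]:
    """Return the parts of the section range that no event covers."""
    start, end = section
    gaps: List[YearRange] = []
    covered_until = start
    for event_start, event_end in sorted(events):
        if event_start > covered_until:
            # Nothing covers the span between the last event and this one.
            gaps.append((covered_until, min(event_start, end)))
        covered_until = max(covered_until, event_end)
        if covered_until >= end:
            break
    if covered_until < end:
        gaps.append((covered_until, end))
    return gaps

print(find_gaps((1999, 2012), [(1999, 2003), (2003, 2005), (2008, 2012)]))
```

In this usage the years 2005 to 2008 are reported as a gap because none of the three events covers them.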

The keywords in the one or more artifacts are extracted based on the part-of-speech (POS) tags generated by the POS modules using the token annotations module 602 and the forced new lines, paragraphs and indentations computing module 608. For example, a noun is very likely to be a keyword in a sentence. Similarly, co-occurring nouns and their derivatives are also keywords. A keyword chunker is used to obtain these keywords and keyword phrases depending on the noun and related tags. The extracting module 312 extracts keywords (e.g., 3 keywords) using the POS tags generated by the POS token annotations module 610 and the POS line annotations module 612. The extracted keywords are:

    • PhD—POS Tag says that it is a noun
    • MIT media lab—POS Tag says that it is a noun
    • Massachusetts Institute—POS Tag says that it is a noun.
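A noun-based keyword chunker of the kind described above can be sketched as follows. The POS tags are hand-assigned Penn Treebank style tags rather than the output of a real tagger, and the function name noun_phrases and its linking parameter are assumptions. Note that this plain rule also emits “Technology” as a separate keyword, which the listed example does not include.

```python
def noun_phrases(tagged, linking=frozenset()):
    """Group consecutive noun tokens into keyword phrases.

    tagged  -- list of (word, tag) pairs; tags starting with "NN" are nouns
    linking -- prepositions allowed to join two noun runs (assumption)
    """
    phrases, current = [], []
    for i, (word, tag) in enumerate(tagged):
        next_is_noun = i + 1 < len(tagged) and tagged[i + 1][1].startswith("NN")
        if tag.startswith("NN"):
            current.append(word)
        elif word.lower() in linking and current and next_is_noun:
            current.append(word)  # keep the preposition inside the phrase
        else:
            if current:
                phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return phrases


# Hand-tagged tokens for the example line (tags are assumed, not tagger output).
TAGGED = [
    ("PhD", "NNP"), ("at", "IN"), ("MIT", "NNP"), ("Media", "NNP"),
    ("Lab", "NNP"), (",", ","), ("Massachusetts", "NNP"),
    ("Institute", "NNP"), ("of", "IN"), ("Technology", "NNP"),
]

if __name__ == "__main__":
    print(noun_phrases(TAGGED))
    print(noun_phrases(TAGGED, linking={"at", "of"}))
```

Passing a set of linking prepositions merges noun runs across those prepositions, which yields longer phrases such as “PhD at MIT Media Lab”.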

Once these keywords are identified and extracted, they are disambiguated to find the right meaning. To disambiguate, the resume summarizer tool 106 determines the different disambiguated terms for the extracted keywords and their related categories using the lattice construction module 314. Further, the resume summarizer tool 106 uses the knowledge base 218 stored in the external database 216 for obtaining the categories for the extracted keywords. Each keyword is queried separately against the knowledge base 218 and the corresponding categories are obtained. For example, for the above keywords, the categories obtained are:

    • PhD—{Education, Qualifications, Academic Degrees, Doctoral Degrees, Doctor of Philosophy}
    • MIT Media Lab—{Education, Educational Organizations, Educational Institutions, Academic Institutions, Universities and Colleges, Universities and Colleges by Country, Universities and Colleges in the United States, Universities and Colleges in Massachusetts, Massachusetts Institute of Technology}
    • Massachusetts Institute—{Education, Educational Organizations, Educational Institutions, Academic Institutions, Universities and Colleges, Universities and Colleges by Country, Universities and Colleges in the United States, Universities and Colleges in Massachusetts}
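The per-keyword lookup can be sketched with an in-memory dictionary standing in for the knowledge base 218. The data structure, the function name categorize, and the rule that a keyword with more than one category path needs disambiguation are assumptions; only the category names are taken from the example above.

```python
# Stand-in for the knowledge base 218: keyword -> list of category paths.
UNIVERSITY_PATH = (
    "Education", "Educational Organizations", "Educational Institutions",
    "Academic Institutions", "Universities and Colleges",
    "Universities and Colleges by Country",
    "Universities and Colleges in the United States",
    "Universities and Colleges in Massachusetts",
)
KNOWLEDGE_BASE = {
    "PhD": [("Education", "Qualifications", "Academic Degrees",
             "Doctoral Degrees", "Doctor of Philosophy")],
    "MIT Media Lab": [UNIVERSITY_PATH + ("Massachusetts Institute of Technology",)],
    "Massachusetts Institute": [UNIVERSITY_PATH],
}


def categorize(keywords, knowledge_base):
    """Query each keyword separately; keep matches and flag ambiguous ones."""
    matched, ambiguous = {}, []
    for keyword in keywords:
        paths = knowledge_base.get(keyword, [])
        if not paths:
            continue  # no match: the keyword is not kept
        if len(paths) > 1:
            ambiguous.append(keyword)  # assumed trigger for disambiguation
        matched[keyword] = paths
    return matched, ambiguous


if __name__ == "__main__":
    found, unclear = categorize(
        ["PhD", "MIT Media Lab", "Massachusetts Institute", "Unlisted term"],
        KNOWLEDGE_BASE,
    )
    print(sorted(found), unclear)
```

In this example every matched keyword has exactly one category path, so nothing is flagged for disambiguation.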

These keywords are either nouns or noun phrases. The resume summarizer tool 106 also allows certain prepositions, for example “in”, when determining the keywords. For example, if the preposition “in” is considered, the keywords extracted will include “PhD at MIT Media Lab” and “Massachusetts Institute of Technology”. These keywords are then queried against the knowledge base 218. If a match is found, they are included in the set of keywords. Here, no disambiguations are found: all the extracted keywords are unique in the context of the right meaning.

FIG. 7 illustrates a table 700 that is generated using the table generating module 316 of the resume summarizer tool 106 of FIG. 3 according to an embodiment herein. The table generating module 316 generates the table 700 for the resume 502. The table 700 is generated based on the information extracted by the extracting module 312. The information includes the one or more artifacts (e.g., sentences, keywords, tokens, new lines, one or more sections such as education, work experience, etc., durations such as one or more date ranges, durations within the sections, sentences associated with sections, and sentences associated with durations of the resume 502). The table 700 includes one or more columns 702A-E. The first data series column 702A includes the one or more sections (e.g., education, experience) of the resume 502. The second data series column 702B includes the one or more events (e.g., Bachelor of computer engineering, Project Manager at company A) associated with each of the one or more sections of the resume 502. The third data series column 702C includes at least one duration corresponding to each event. For example, a candidate has completed Bachelor of computer engineering (which is an event) in the duration 1999 to 2003. The table 700 is generated for a current year of 2012, in one example embodiment. The fourth data series column 702D includes at least one period associated with each event. For example, the period in which the candidate has completed Bachelor of computer engineering is 4 years. Here, for the most recent event, the period is calculated based on the current year. For example, in FIG. 7, 2012 is considered as the current year. The fifth data series column 702E includes a period associated with each section. For example, the candidate has completed his/her education (e.g., Bachelor of computer engineering, Master of Design, Master of Science and PhD) in 13 years (e.g., 1999 to 2012).
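The period columns of the table 700 can be reproduced with a short sketch. The Row structure and the function names are assumptions; the two rows use only dates stated in the description (Bachelor of computer engineering from 1999 to 2003, and the PhD from 2008 to current, with 2012 as the current year).

```python
from dataclasses import dataclass
from typing import Optional

CURRENT_YEAR = 2012  # the current year used in the example of FIG. 7


@dataclass
class Row:
    section: str
    event: str
    start: int
    end: Optional[int]  # None stands for "current"


def event_period(row, current_year=CURRENT_YEAR):
    """Years spanned by one event; an open end uses the current year."""
    end = row.end if row.end is not None else current_year
    return end - row.start


def section_period(rows, section, current_year=CURRENT_YEAR):
    """Years from the earliest start to the latest end within a section."""
    own = [r for r in rows if r.section == section]
    first = min(r.start for r in own)
    last = max(r.end if r.end is not None else current_year for r in own)
    return last - first


def build_table(rows, current_year=CURRENT_YEAR):
    """Return tuples: section, event, duration, event period, section period."""
    return [
        (r.section, r.event, f"{r.start} to {r.end or 'current'}",
         event_period(r, current_year),
         section_period(rows, r.section, current_year))
        for r in rows
    ]


ROWS = [
    Row("Education", "Bachelor of computer engineering", 1999, 2003),
    Row("Education", "PhD", 2008, None),
]

if __name__ == "__main__":
    for line in build_table(ROWS):
        print(line)
```

With these rows the Bachelor event spans 4 years and the education section spans 13 years, matching the figures given above.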

FIG. 8 illustrates a graphical representation 800 generated using the report generation module 318 of FIG. 3 according to an embodiment herein. The graphical representation 800 is generated for the resume 502 based on the table 700. The graphical representation 800 includes an event line 802 (e.g., a first graphical representation), an events distribution 804, and a sections span 806 (e.g., a second graphical representation) generated based on the table 700 using the report generation module 318. The event line 802 indicates a correlation between the one or more events extracted from the resume 502. The users 102A-B can view the one or more events and key insights for the entire resume 502 in the graphical representation 800. The event line 802 also indicates the one or more events that overlap with each other, which are useful for obtaining key insights (e.g., what the candidate was doing from a skills perspective while studying (Education) or working (Experience)). The event line 802 also indicates the gaps in a profile of the one or more resumes. One or more colour codes may be used to indicate an overlap between the one or more events, in one example embodiment. Other techniques may be used to indicate the overlap between the one or more events, in another example embodiment.
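The overlap that the event line 802 highlights can be computed with a simple interval test. In this sketch the years are treated as inclusive, only events from different sections are paired, and the experience events and their dates are hypothetical; all three choices are assumptions made for illustration.

```python
from itertools import combinations


def overlapping_pairs(events):
    """Return (event_a, event_b, from_year, to_year) for cross-section overlaps.

    events -- list of (section, name, start_year, end_year), years inclusive
    """
    pairs = []
    for a, b in combinations(events, 2):
        different_sections = a[0] != b[0]
        intersects = a[2] <= b[3] and b[2] <= a[3]
        if different_sections and intersects:
            pairs.append((a[1], b[1], max(a[2], b[2]), min(a[3], b[3])))
    return pairs


EVENTS = [
    ("Education", "PhD", 2008, 2012),
    ("Experience", "Internship", 2009, 2009),                    # hypothetical
    ("Experience", "Project Manager at company A", 2005, 2007),  # hypothetical dates
]

if __name__ == "__main__":
    print(overlapping_pairs(EVENTS))
```

Each returned pair could then be drawn in a distinct colour on the event line.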

Through the events distribution 804, the users 102A-B may view the number of events within each section for a specific year. This distribution gives an insight into the area in which the user 102B has been very active for a particular time period. For example, the events distribution 804 indicates that the user 102B has been very active from the year 2007 till 2010. The sections span 806 indicates an overall distribution of the one or more events with respect to the one or more sections (e.g., education, experience, and awards) in the resume 502. Each of the event line 802, the events distribution 804, and the sections span 806 may indicate a gap/duration in the resume 502 (the gap is indicated based on the gap duration that is extracted by the extracting module 312). For example, a candidate may have pursued his/her education (e.g., an engineering degree in computer science) from August 2002 till June 2006, and has work experience from January 2008 till December 2009. The resume summarizer tool 106 identifies the gap between the education and the work experience. The gap indicates a duration (e.g., July 2006 till December 2007) during which the candidate has (i) not pursued an education, (ii) no work experience, and/or (iii) not performed any events/activities (e.g., pursuing an internship in an institute or organization) with respect to his/her career.
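The gap in the example above can be derived with month-level arithmetic. This is a minimal sketch: the (year, month) representation and the function names are assumptions, and the input holds only the two intervals from the example.

```python
def to_index(year_month):
    """Map (year, month) to a running month number."""
    year, month = year_month
    return year * 12 + (month - 1)


def from_index(index):
    """Inverse of to_index."""
    return (index // 12, index % 12 + 1)


def find_gaps(intervals):
    """Return the uncovered (start, end) month ranges between intervals.

    intervals -- list of ((year, month), (year, month)) pairs, inclusive
    """
    spans = sorted((to_index(s), to_index(e)) for s, e in intervals)
    if not spans:
        return []
    gaps = []
    covered_until = spans[0][1]
    for start, end in spans[1:]:
        if start > covered_until + 1:
            gaps.append((from_index(covered_until + 1), from_index(start - 1)))
        covered_until = max(covered_until, end)
    return gaps


INTERVALS = [
    ((2002, 8), (2006, 6)),   # education: August 2002 till June 2006
    ((2008, 1), (2009, 12)),  # work experience: January 2008 till December 2009
]

if __name__ == "__main__":
    print(find_gaps(INTERVALS))
```

For these two intervals the single gap runs from July 2006 till December 2007, as stated above.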

FIG. 9 illustrates a user interface view 900 of intent selection by the users 102A-B of FIG. 1 according to an embodiment herein. The user interface view of an intent selection includes the header 402, one or more resumes 902A-N stored in the database 302, a create folder(s) or organize content button 904, an intent analytics field 906, and an intent selection field 908. The one or more resumes 902A-N stored in the database 302 may be listed and/or displayed as a scrollable list (e.g., a left to right scrollable list and/or a right to left scrollable list).

The create folder(s) or organize content button 904 is used to organize these resumes and to create new folders. The users 102A-B can drag and drop one or more resumes from the scrollable list to a specific folder to organize them. The intent analytics field 906 displays the analysis details of the one or more resumes selected from the scrollable list using the report generation module 318 of FIG. 3. Such analysis details include a graphical representation of the one or more resumes 902A-N. The graphical representation includes an event line 802 with years on the left side of the section and corresponding details adjacent to it. The intent selection field 908 provides one or more options to specify various intents (e.g., analysis details) around which one or more reports are to be generated. For example, the users 102A-B can select at least one section from the one or more sections of the one or more resumes 902A-N and may summarize the content around a selected section. In one embodiment, the one or more resumes 902A-N include the resume 502. The users 102A-B may select one section, either education or experience, for the resume 502 to summarize the content around the selected section.

FIG. 10 illustrates a graphical representation 1000 of the event line 802 of FIG. 8 that indicates one or more overlapping events according to an embodiment herein. The graphical representation 1000 of the event line 802 indicates (i) the key insights, and (ii) the one or more overlapping events of the resume 502 upon receiving an input (e.g., a mouse-hover, and/or a cursor when moved on a particular event line). For example, the users 102A-B may view the key insights during the year 2009 such as an education event 1002, and an experience event 1004.

The education event 1002 indicates information that is associated with an education for a particular event line (e.g., the event line 802 such as Media Arts and Sciences; Massachusetts Institute of Technology; GPA 4.9/5.0) when the users 102A-B hover the mouse over that particular event line 802. Similarly, the experience event 1004 indicates experience information when an input is received (e.g., when a mouse-hover occurs on the event line 802). Since this event overlaps with education, it gives a clear insight about the experience earned as an internship while the user 102B was pursuing his/her one or more degrees.

FIG. 11A is a user interface view 1100 illustrating a comparison of two or more resumes using the resume comparison module 320 of FIG. 3 according to an embodiment herein. The user interface view of resume comparison includes the header 402, a resume display field 1102 which displays the one or more resumes (R1, R2 . . . , RN), a resume analytic field 1104, a compare button 1106, and a cancel button 1108. The resume display field 1102 may include one or more check boxes 1110 which allow the users 102A-B to select the one or more resumes displayed in the resume display field 1102. The one or more resumes may also be selected by using (i) a control option and (ii) a shift option from a keyboard interface, or any other method that includes selection of two or more resumes (e.g., a drag and drop method, a touch interface, etc.). For example, a first resume (R1) may be selected using the control option, and a second resume (R5) using the shift option, etc. The resume analytic field 1104 displays analysis details (e.g., one or more events from R1 are compared with one or more events from R5 using the event line 802, the events distribution 804, and the sections span 806) when an input is received on the compare button 1106. The user interface view 1100 further includes a text field 1112 which displays information (e.g., a name of a first candidate, a name of a second candidate, an email address of the first candidate, an email address of the second candidate, etc.) of the one or more resumes (R1 and R5) that are compared.

With reference to FIG. 11A, FIG. 11B is the user interface view illustrating a first graphical representation associated with the resume R1 and a second graphical representation associated with the resume R5 according to an embodiment herein. The resume summarizer tool 106 extracts one or more keywords from the resumes R1 and R5. The resume comparison module 320 compares the one or more keywords in the resumes R1 and R5 using weighted Formal Concept Analysis (wFCA). The resume comparison module 320 computes a weight for each of the resumes R1 and R5 based on the wFCA, using the resumes R1 and R5 as objects and the keywords extracted from them as attributes.

For example, the resumes R1 and R5 may have the following set of keywords:

    • R1: {Languages, Functional, Scala}
    • R5: {Languages, Functional, Scala, Erlang}

In this case the objects are resumes and the attributes are keywords. The hierarchical FCA gives the following result:

    • First Level Concepts
      Concept 1: [R1, R5]: [Languages]
      Both resumes have the keyword “Languages”.
    • Second Level Concepts
      Concept 2: [R1, R5]: [Functional]
      Both resumes have the keyword “Functional”, i.e., a functional programming language (e.g., similarity).
    • Third Level Concepts
      Concept 3: [R1, R5]: [Scala]
      [R5]: [Scala, Erlang]

Both the resumes R1 and R5 have the keyword ‘Scala’, but the resume R5 has the keyword ‘Erlang’ in addition to ‘Scala’. The resume comparison module 320 assigns a weight (e.g., 8.5) to the resume R5 that is higher than the weight assigned to the resume R1 (e.g., 5.5). Further, the other categories “Languages”, “Functional”, “Scala” are common to both the resumes. The resume analytic field 1104 displays the analysis details such as the event line 802 for the resumes R1 and R5 along with the weights for the resumes R1 and R5.
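The concept structure behind this comparison can be illustrated with plain (unweighted) formal concept analysis, taking the resumes as objects and the keywords as attributes. This is a brute-force sketch suitable only for small contexts; the function name formal_concepts is an assumption. The sketch does not reproduce the hierarchical levels listed above, and it does not derive the weights 8.5 and 5.5, since the weighting scheme of the wFCA is not specified in this description.

```python
from itertools import combinations


def formal_concepts(context):
    """Enumerate all formal concepts of a small context by brute force.

    context -- dict mapping each object to its set of attributes
    Returns a set of (extent, intent) pairs of frozensets.
    """
    objects = list(context)
    all_attributes = set().union(*context.values())
    concepts = set()
    for size in range(len(objects) + 1):
        for group in combinations(objects, size):
            # Intent: the attributes shared by every object in the group.
            intent = set(all_attributes)
            for obj in group:
                intent &= context[obj]
            # Extent: every object that carries all attributes of the intent.
            extent = {o for o in objects if intent <= context[o]}
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


CONTEXT = {
    "R1": {"Languages", "Functional", "Scala"},
    "R5": {"Languages", "Functional", "Scala", "Erlang"},
}

if __name__ == "__main__":
    for extent, intent in sorted(formal_concepts(CONTEXT), key=lambda c: -len(c[0])):
        print(sorted(extent), sorted(intent))
```

For this context the sketch finds two concepts: one shared by both resumes with the three common keywords, and one containing only R5 with the additional keyword “Erlang”.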

When one or more events in R1 do not match with one or more events in R5, the resume summarizer tool 106 indicates/displays a message (e.g., R1 and R5 are not comparable). For example, the users 102A-B select a resume R2 and the resume R5 from the one or more resumes displayed in the resume display field 1102. The resume summarizer tool 106 extracts at least one keyword from the resumes R2 and R5. For example, the keywords are as below:

    • R2: {Biotech, Genes, Plantation}
    • R5: {Languages, Functional, Scala, Erlang}

In this case, the resumes R2 and R5 are considered as objects, and the keywords are considered as attributes. Thus, based on the keywords, the resume summarizer tool 106 displays a message that indicates that R2 and R5 are not comparable.
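One simple way to arrive at this message is to check whether the two keyword sets share anything at all. The rule used here (no shared keyword means not comparable), the function names, and the wording of the second message are assumptions; the description does not state the exact criterion.

```python
def comparable(keywords_a, keywords_b):
    """Assumed rule: two resumes are comparable if they share a keyword."""
    return bool(set(keywords_a) & set(keywords_b))


def comparison_message(name_a, keywords_a, name_b, keywords_b):
    """Build the message shown to the user for a pair of resumes."""
    if comparable(keywords_a, keywords_b):
        return f"{name_a} and {name_b} are comparable"
    return f"{name_a} and {name_b} are not comparable"


R1 = {"Languages", "Functional", "Scala"}
R2 = {"Biotech", "Genes", "Plantation"}
R5 = {"Languages", "Functional", "Scala", "Erlang"}

if __name__ == "__main__":
    print(comparison_message("R2", R2, "R5", R5))
    print(comparison_message("R1", R1, "R5", R5))
```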

FIG. 12 is a flow diagram illustrating a method for generating a summary of at least one resume from a set of resumes to analyze insights of the at least one resume using the resume summarizer tool of FIG. 1 according to an embodiment herein. In step 1202, a first input is processed. The first input includes a first indication to select a first resume from one or more resumes. In step 1204, a first information is extracted from the first resume. The first information includes (a) a first section, (b) a second section, (c) one or more events associated with the first section, and (d) one or more events associated with the second section. In step 1206, a second information is obtained from the first resume. In step 1208, a first table is generated based on the first information and the second information. In step 1210, a first summary is generated based on the first table. The first summary indicates a first correlation between (i) the one or more events associated with the first section and (ii) the one or more events associated with the second section over years.

In one embodiment, the first information further includes (a) one or more first date ranges of the first section, (b) one or more second date ranges of the second section, (c) one or more third date ranges of the one or more events associated with the first section, and (d) one or more fourth date ranges of the one or more events associated with the second section.

In another embodiment, the one or more first date ranges, the one or more second date ranges, the one or more third date ranges and the one or more fourth date ranges each include a start date and an end date, and the start date and the end date include a year, a date or a month. The second information may include (a) a first period associated with the first section, (b) a second period associated with the second section, (c) a third period associated with the one or more events associated with the first section, and (d) a fourth period associated with the one or more events associated with the second section. In yet another embodiment, (a) the one or more third date ranges of the one or more events associated with the first section and (b) the one or more fourth date ranges of the one or more events associated with the second section overlap with each other. The first summary may include a first graphical representation that illustrates the first correlation. The first graphical representation may illustrate (a) the one or more events associated with the first section using a first color and (b) the one or more events associated with the second section using a second color. The first graphical representation may display at least one of the first information when a cursor moves over the first graphical representation.

The method may further include (i) processing the first input, which includes a second indication to select a second resume from the one or more resumes, (ii) extracting, from the second resume, a third information, the third information including (a) a third section, (b) a fourth section, (c) one or more events associated with the third section, and (d) one or more events associated with the fourth section, (iii) obtaining, from the second resume, a fourth information, (iv) generating a second table based on the third information and the fourth information, and (v) generating a second summary based on the second table, the second summary indicating a second correlation between (a) the one or more events associated with the third section and (b) the one or more events associated with the fourth section over years. The first summary and the second summary indicate the insights of the first resume and the second resume. A third input may be processed. The third input includes an indication to select at least one of (a) the first resume and (b) the second resume based on a comparison between the first summary of the first resume and the second summary of the second resume. A duration within at least one of (i) the first section and (ii) the second section of at least one of (a) the first resume and (b) the second resume may be identified. The duration does not include an event (e.g., one or more activities such as an internship).

The embodiments herein can take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment including both hardware and software elements. The embodiments that are implemented in software include but are not limited to, firmware, resident software, microcode, etc.

Furthermore, the embodiments herein can take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can comprise, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.

The medium can be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium. Examples of a computer-readable medium include a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk. Current examples of optical disks include compact disk-read only memory (CD-ROM), compact disk-read/write (CD-R/W) and DVD.

A data processing system suitable for storing and/or executing program code will include at least one processor coupled directly or indirectly to memory elements through a system bus. The memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution.

Input/output (I/O) devices (including but not limited to keyboards, displays, pointing devices, etc.) can be coupled to the system either directly or through intervening I/O controllers. Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks. Modems, cable modem and Ethernet cards are just a few of the currently available types of network adapters.

FIG. 13 illustrates a schematic diagram of a computer architecture used in accordance with the embodiments herein. The computer architecture includes one or more processor or central processing unit (CPU) 10. The CPUs 10 are interconnected via system bus 12 to various devices such as a random access memory (RAM) 14, read-only memory (ROM) 16, and an input/output (I/O) adapter 18. The I/O adapter 18 can connect to peripheral devices, such as disk units 11 and tape drives 13, or other program storage devices that are readable by the system. The system can read the inventive instructions on the program storage devices and follow these instructions to execute the methodology of the embodiments herein.

The computer architecture further includes a user interface adapter 19 that connects a keyboard 15, mouse 17, speaker 24, microphone 22, and/or other user interface devices such as a touch screen device (not shown) to the bus 12 to gather user input. Additionally, a communication adapter 20 connects the bus 12 to a data processing network 25, and a display adapter 21 connects the bus 12 to a display device 23 which may be embodied as an output device such as a monitor, printer, or transmitter, for example.

The foregoing description of the specific embodiments will so fully reveal the general nature of the embodiments herein that others can, by applying current knowledge, readily modify and/or adapt for various applications such specific embodiments without departing from the generic concept, and, therefore, such adaptations and modifications should and are intended to be comprehended within the meaning and range of equivalents of the disclosed embodiments. It is to be understood that the phraseology or terminology employed herein is for the purpose of description and not of limitation. Therefore, while the embodiments herein have been described in terms of preferred embodiments, those skilled in the art will recognize that the embodiments herein can be practiced with modification within the spirit and scope of the appended claims.

Claims

1. A computer implemented method for generating a summary of at least one resume from a plurality of resumes to analyze insights of said at least one resume, said method comprising:

processing a first input comprising a first indication to select a first resume from said plurality of resumes;
extracting, from said first resume, a first information, wherein said first information comprises (a) a first section, (b) a second section, (c) at least one event associated with said first section, and (d) at least one event associated with said second section;
obtaining, from said first resume, a second information;
generating a first table based on said first information and said second information; and
generating a first summary based on said first table, wherein said first summary indicates a first correlation between (i) said at least one event associated with said first section and (ii) said at least one event associated with said second section over years.

2. The computer implemented method of claim 1, wherein said first information further comprises (i) at least one first date range of said first section, (ii) at least one second date range of said second section, (iii) at least one third date range of said at least one event associated with said first section, and (iv) at least one fourth date range of said at least one event associated with said second section.

3. The computer implemented method of claim 2, wherein said at least one first date range, said at least one second date range, said at least one third date range and said at least one fourth date range each comprise a start date and an end date, wherein said start date and said end date comprise a year, a date or a month.

4. The computer implemented method of claim 2, wherein said second information comprises (i) a first period associated with said first section, (ii) a second period associated with said second section, (iii) a third period associated with said at least one event associated with said first section, and (iv) a fourth period associated with said at least one event associated with said second section.

5. The computer implemented method of claim 2, wherein (i) said at least one third date range of said at least one event associated with said first section and (ii) said at least one fourth date range of said at least one event associated with said second section overlaps with each other.

6. The computer implemented method of claim 1, wherein said first summary comprises a first graphical representation that illustrates said first correlation.

7. The computer implemented method of claim 6, wherein said first graphical representation illustrates (i) said at least one event associated with said first section using first color and (ii) said at least one event associated with said second section using second color.

8. The computer implemented method of claim 7, wherein said first graphical representation displays at least one of said first information when a cursor moves over said first graphical representation.

9. The computer implemented method of claim 1, further comprises:

processing said first input comprising a second indication to select a second resume from said plurality of resumes;
extracting, from said second resume, a third information, wherein said third information comprises (a) a third section, (b) a fourth section, (c) at least one event associated with said third section, and (d) at least one event associated with said fourth section;
obtaining, from said second resume, a fourth information;
generating a second table based on said third information and said fourth information; and
generating a second summary based on said second table, wherein said second summary indicates a second correlation between (i) said at least one event associated with said third section and (ii) said at least one event associated with said fourth section over years, wherein said first summary and said second summary indicates said insights of said first resume and said second resume.

10. The computer implemented method of claim 9, further comprises:

processing a third input comprising an indication to select at least one of (i) said first resume and (ii) said second resume based on a comparison between (a) said first summary of said first resume and (b) said second summary of said second resume; and
identifying a duration within at least one of (i) said first section and (ii) said second section of at least one of (a) said first resume and (b) said second resume, wherein said duration does not comprise an event.

11. A non-transitory program storage device readable by computer, and comprising a program of instructions executable by said computer to generate a summary of at least one resume from a plurality of resumes, said method comprising:

processing a first input comprising a first indication to select a first resume from said plurality of resumes;
extracting, from said first resume, a first information, wherein said first information comprises (a) a first section, (b) a second section, (c) at least one event associated with said first section, and (d) at least one event associated with said second section;
obtaining, from said first resume, a second information;
generating said first table based on said first information and said second information; and
generating a first summary, wherein said first summary comprises a first graphical representation that is generated based on said table, wherein said first graphical representation illustrates a first correlation between (i) said at least one event associated with said first section and (ii) said at least one event associated with said second section over years.

12. The non-transitory program storage device of claim 11, wherein said first summary further comprises a second graphical representation that is generated based on said table, wherein said second graphical representation illustrates an overall summary of said first section and said second section.

13. The non-transitory program storage device of claim 11, wherein said method further comprising:

processing a second input comprising a second indication to select at least one section from said first section and said second section; and
generating a second summary for said at least one section selected by said second input, wherein said second summary comprises a third graphical representation that is generated based on said table, wherein said third representation illustrates a third correlation between said at least one event associated with said at least one section selected by said second input over years.

14. A system for summarizing at least one resume from a plurality of resumes, said system comprising:

a memory unit that stores a database and a set of modules, wherein said database stores said plurality of resumes;
a display unit; and
a processor that executes said set of modules, wherein said set of modules comprising: a content extracting module that extracts, from said at least one resume, a first information and a second information, wherein said first information comprises: (a) a first section, (b) a second section, (c) at least one event associated with said first section, (d) at least one event associated with said second section and (e) at least one first date range of said first section, (f) at least one second date range of said second section, (g) at least one third date range of said at least one event associated with said first section and (h) at least one fourth date range of said at least one event associated with said second section; and a report generation module that generates a summary based on said first information and said second information, wherein said summary is displayed in said display unit.

15. The system of claim 14, further comprising:

a table generating module that generates a table based on said first information and said second information, wherein said summary is generated based on said table,
wherein said second information comprises (a) a first period associated with said first section, (b) a second period associated with said second section, (c) a third period associated with said at least one event associated with said first section, and (d) a fourth period associated with said at least one event associated with said second section.

16. The system of claim 14, wherein said content extracting module further comprises:

a duration determining module that extracts a plurality of date ranges from said at least one resume; and
a boundary annotation module determines, from said plurality of date ranges, (i) said at least one first date range of said first section, (ii) said at least one second date range of said second section, (iii) said at least one third date range of said at least one event associated with said first section, and (iv) said at least one fourth date range of said at least one event associated with said second section.
Patent History
Publication number: 20130198599
Type: Application
Filed: Jan 30, 2013
Publication Date: Aug 1, 2013
Applicant: FORMCEPT TECHNOLOGIES AND SOLUTIONS PVT LTD (Bangalore)
Inventors: Anuj Kumar (Bangalore), Suresh Srinivasan (Bangalore), FORMCEPT TECHNOLOGIES AND SOLUTIONS PVT LTD. (Bangalore)
Application Number: 13/753,550
Classifications
Current U.S. Class: Table (715/227)
International Classification: G06F 17/24 (20060101);