ELECTRONIC DEVICE AND METHOD FOR SEARCHING RELATED TERMS

A method for searching related terms first calculates a direct relationship between every two of a plurality of query terms to obtain a direct related matrix, and calculates a related score between every two of the query terms to obtain a related score matrix. The method further calculates an indirect relationship between every two of the query terms according to the direct relationship and the related score, and determines indirect terms of each query term according to the indirect relationship between every two of the query terms.

Description
CROSS-REFERENCE TO RELATED APPLICATION

This application is a continuation application of U.S. application Ser. No. 13/217,272, filed on Aug. 25, 2011.

BACKGROUND

1. Technical Field

Embodiments of the present disclosure relate to file searching technology, and particularly to an electronic device and method for searching related terms using the electronic device.

2. Description of Related Art

Related terms of preset query terms can be obtained using a natural language processing (NLP) method by calculating a relationship between every two of the preset query terms. However, the NLP technology only calculates a direct relationship between every two of the preset query terms, and generates only the related terms having the direct relationship with the preset query terms. That is to say, the NLP technology cannot calculate an indirect relationship between every two of the preset query terms to generate the related terms having the indirect relationship with the preset query terms, which limits the search results corresponding to the preset query terms.

For example, suppose that a query term is “baseball,” the query term “baseball” has a direct relationship with a first term “sport,” and the first term “sport” further has a direct relationship with a second term “basketball.” Thus, the query term “baseball” has an indirect relationship with the second term “basketball.” The NLP technology can determine the first term “sport” as the related term of the query term “baseball,” but cannot determine the second term “basketball” as the related term of the query term “baseball.” It is thus inefficient to implement a search operation according to the query term alone. Therefore, a more efficient method for searching related terms is desired.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of one embodiment of an electronic device including a related term search system.

FIG. 2 is a block diagram of one embodiment of the related term search system included in the electronic device of FIG. 1.

FIG. 3 is a flowchart of one embodiment of a method for searching related terms using the electronic device of FIG. 1.

FIG. 4 is a topological diagram of direct relationships between a plurality of query terms.

FIG. 5 is an example of a direct related matrix created from the topological diagram of FIG. 4.

FIGS. 6 and 7 are exemplary schematic diagrams of related score matrices obtained from FIG. 5.

FIG. 8 is an exemplary topological diagram of indirect relationships between a plurality of query terms.

FIG. 9 is an example of an indirect related matrix created from the topological diagram of FIG. 8.

DETAILED DESCRIPTION

All of the processes described below may be embodied in, and fully automated via, functional code modules executed by one or more general purpose electronic devices or processors. The code modules may be stored in any type of non-transitory readable medium or other storage device. Some or all of the methods may alternatively be embodied in specialized hardware. Depending on the embodiment, the non-transitory readable medium may be a hard disk drive, a compact disc, a digital video disc, a tape drive or other suitable storage medium.

FIG. 1 is a block diagram of one embodiment of an electronic device 2 including a related term search system 24. In the embodiment, the electronic device 2 further includes a display device 20, an input device 22, a storage device 23, and at least one processor 25. The related term search system 24 may be used to determine related terms having indirect relationships with a plurality of query terms stored in the storage device 23. A detailed description will be given in the following paragraphs.

The display device 20 may be used to display search results matched with the determined related terms, and the input device 22 may be a mouse or a keyboard used to input computer readable data.

FIG. 2 is a block diagram of one embodiment of the related term search system 24 in the electronic device 2. In one embodiment, the related term search system 24 may include one or more modules, for example, a first calculation module 201, a second calculation module 202, a third calculation module 203, a related term determining module 204, and a searching module 205. The one or more modules 201-205 may comprise computerized code in the form of one or more programs that are stored in the storage device 23 (or memory). The computerized code includes instructions that are executed by the at least one processor 25 to provide functions for the one or more modules 201-205.

FIG. 3 is a flowchart of one embodiment of a method for searching related terms using the electronic device 2. Depending on the embodiment, additional blocks may be added, others removed, and the ordering of the blocks may be changed.

In block S1, the first calculation module 201 calculates a direct relationship “Ri, j” between every two of a plurality of query terms, and obtains a direct related matrix “R” according to all the calculated direct relationships. In one embodiment, the query terms may be inputted by a user, or stored in the storage device 23 in advance. As shown in FIG. 4, a direct relationship from “Term1” to “Term2” is 2, but a direct relationship from “Term2” to “Term1” is 1. That is to say, the direct relationship between two terms depends on the order of the two terms. As shown in FIG. 5, “Ri, j” represents the direct relationship from “Termi” to “Termj”, which is referred to as Relation(termi, termj).
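
The construction of the direct related matrix “R” in block S1 may be sketched as follows. This is a minimal illustration only: it assumes the direct relationship values are already known, and the function name build_direct_matrix, the zero-based indices, and the matrix size of three are hypothetical. Only the values 2 and 1 for “Term1” and “Term2” come from the example above.

```python
def build_direct_matrix(n, edges):
    """Return an n x n matrix R where R[i][j] holds the direct
    relationship from term i to term j (0.0 when none is given)."""
    R = [[0.0] * n for _ in range(n)]
    for (i, j), value in edges.items():
        R[i][j] = value
    return R


# Index 0 stands for "Term1" and index 1 for "Term2".
# The relationship is directional: Term1 -> Term2 is 2, Term2 -> Term1 is 1.
edges = {(0, 1): 2.0, (1, 0): 1.0}
R = build_direct_matrix(3, edges)
```

Because R[0][1] and R[1][0] are stored separately, the matrix need not be symmetric, which matches the order-dependent relationship described for FIG. 4.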

In block S2, the second calculation module 202 calculates a related score between every two of the query terms, obtains a related score matrix according to all the calculated related scores, and stores the related score matrix in the storage device 23. In one embodiment, the related score between every two of the query terms is obtained by calculating a conditional probability between every two of the query terms. As shown in FIG. 6, each element “Pi, j” in the related score matrix “P” represents a conditional probability between “Termi” and “Termj”, where Pi, j=P((Termi∩Termj)|Termi). For example, assume that an occurrence number of a term “A” is 100, and an occurrence number of a term “B” is 30 given the occurrence of the term “A”. Thus, P((A∩B)|A)=30/100=0.3, that is, the related score from the term “A” to the term “B” is 30%.
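
The conditional-probability score of block S2 may be sketched as follows. The sketch assumes that an “occurrence” means the presence of a term in a document and that each document is represented as a set of terms; this counting scheme, the function name related_score_matrix, and the sample documents are hypothetical and are not specified by the disclosure.

```python
def related_score_matrix(terms, documents):
    """Return P where P[i][j] = (documents containing both term i and
    term j) / (documents containing term i)."""
    n = len(terms)
    P = [[0.0] * n for _ in range(n)]
    for i, a in enumerate(terms):
        docs_with_a = [d for d in documents if a in d]
        if not docs_with_a:
            continue  # term never occurs, so every score in row i stays 0.0
        for j, b in enumerate(terms):
            if i == j:
                continue
            both = sum(1 for d in docs_with_a if b in d)
            P[i][j] = both / len(docs_with_a)
    return P


# Term "A" occurs 100 times; term "B" occurs in 30 of those occurrences.
documents = [{"A", "B"}] * 30 + [{"A"}] * 70
P = related_score_matrix(["A", "B"], documents)
score_a_to_b = P[0][1]
```

With these sample counts the score from “A” to “B” is 30/100, matching the 30% of the example, while the score from “B” to “A” differs because the condition is on a different term.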

In other embodiments, the second calculation module 202 may calculate the related score using other methods to obtain the related score matrix. For example, assume that a direct relationship from the term “A” to the term “B” is 100, and a direct relationship from the term “B” to a term “C” is 300, where no other terms have a direct relationship with the term “B,” which is referred to as A→B→C. Thus, a total related value of the term “B” equals (100+300)=400, of which the term “A” accounts for 100 (i.e., 25%) and the term “C” accounts for 300 (i.e., 75%). That is to say, the related score between the term “B” and the term “C” equals 0.75, and an indirect relationship between the term “A” and the term “C” equals 100*0.75=75. Using this method, the second calculation module 202 may calculate the related score between every two terms of the query terms in FIG. 5, and obtain a related score matrix “P′,” which is shown in FIG. 7, according to the calculated related scores.
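
The arithmetic of the A→B→C example may be reproduced with the following sketch. It assumes that the total related value of a term is the sum of all direct relationships entering and leaving that term, which is one reading of the example above; the function name share_based_scores and the zero-based indices are hypothetical.

```python
def share_based_scores(R, k):
    """Return, for term k, the share of its total related value that
    each other term accounts for (assumption: incoming and outgoing
    direct relationships both count toward the total)."""
    n = len(R)
    weights = [R[k][m] + R[m][k] for m in range(n)]
    weights[k] = 0.0  # a term has no share in its own total
    total = sum(weights)
    return [w / total if total else 0.0 for w in weights]


A, B, C = 0, 1, 2
# Direct relationships: A -> B is 100, B -> C is 300, all others 0.
R = [
    [0, 100, 0],
    [0, 0, 300],
    [0, 0, 0],
]
scores_b = share_based_scores(R, B)      # shares of A, B, C in B's total of 400
indirect_a_c = R[A][B] * scores_b[C]     # 100 * 0.75
```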

In block S3, the third calculation module 203 calculates an indirect relationship “R′i, j” between every two of the query terms according to the direct relationship “Ri, j” and the related score “Pi, j” between every two terms, and stores the calculated indirect relationships in the storage device 23. In one embodiment, the indirect relationship “R′i, j” between every two terms of the query terms is calculated by the formula $R'_{i,j} = \sum_{k=1,\ k \neq i,j}^{n} R_{i,k} \cdot P_{k,j}$, where the variable “n” represents a total number of the query terms, for example, n=7 as shown in FIG. 4. FIG. 8 shows an exemplary topological diagram of the indirect relationship between “Term1” and other query terms. FIG. 9 shows an example of an indirect related matrix “R′” created from the topological diagram of FIG. 8, where each element “R′i, j” in the indirect related matrix “R′” represents an indirect relationship between “Termi” and “Termj”.
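
The summation of block S3 over intermediate terms k, with k different from both i and j, may be sketched as follows. The function name indirect_matrix is hypothetical, and leaving the diagonal entries at zero is an assumption, since the disclosure does not state a value for the case i=j. The sample matrices reuse the A→B→C values given above for block S2.

```python
def indirect_matrix(R, P):
    """Return R' where R'[i][j] is the sum of R[i][k] * P[k][j] over
    every intermediate term k other than i and j."""
    n = len(R)
    R_prime = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # assumption: no indirect relationship of a term with itself
            R_prime[i][j] = sum(
                R[i][k] * P[k][j] for k in range(n) if k != i and k != j
            )
    return R_prime


A, B, C = 0, 1, 2
R = [
    [0.0, 100.0, 0.0],   # A -> B is 100
    [0.0, 0.0, 300.0],   # B -> C is 300
    [0.0, 0.0, 0.0],
]
P = [
    [0.0, 1.0, 0.0],
    [0.25, 0.0, 0.75],   # B's related scores toward A and C
    [0.0, 1.0, 0.0],
]
R_prime = indirect_matrix(R, P)
```

Here the only nonzero path from “A” to “C” runs through “B”, so R_prime[A][C] reduces to 100*0.75.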

In block S4, the related term determining module 204 determines indirect terms of each query term according to the indirect relationship between every two terms of the query terms, and stores the determined indirect terms in the storage device 23 of the electronic device 2. Then, the searching module 205 performs a search operation according to the determined indirect terms to obtain search results from a data source, and displays the search results on the display device 20 of the electronic device 2. The data source may be the Internet, at least one database, or at least one file system. In one embodiment, the related term determining module 204 determines that a first term of the query terms is the indirect term of a second term of the query terms if the indirect relationship between the first term and the second term is greater than or equal to a preset value. The preset value may be 1.0. For example, as shown in FIG. 9, the indirect terms of “Term1” include “Term3,” “Term4,” “Term5,” and “Term7” whose indirect relationships are greater than 1.0.
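
The threshold test of block S4 may be sketched as follows. The function name indirect_terms and the sample row of values are hypothetical; the values are not those of FIG. 9 and were chosen only so that the selected terms match the example above. The sketch reads row i of the indirect related matrix “R′” as the indirect relationships of term i, which is an assumption about the matrix orientation.

```python
def indirect_terms(R_prime, i, preset=1.0):
    """Return the indices j whose indirect relationship with term i is
    greater than or equal to the preset value."""
    return [
        j for j, value in enumerate(R_prime[i])
        if j != i and value >= preset
    ]


# Hypothetical row for "Term1" (index 0) among seven query terms.
R_prime = [[0.0, 0.4, 2.5, 1.0, 3.2, 0.0, 1.7]] + [[0.0] * 7 for _ in range(6)]
selected = indirect_terms(R_prime, 0)
names = [f"Term{j + 1}" for j in selected]
```

A value exactly equal to the preset value is selected, consistent with the “greater than or equal to” condition.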

In one embodiment, if the term “A” has the direct relationship with the term “B,” and the term “B” further has the direct relationship with the term “C,” which is referred to as A→B→C, then the related term search system 24 determines that the term “A” has the indirect relationship with the term “C”, which is called a second-level relationship. In other embodiments, the system 24 may determine a third-level relationship or multi-level relationship using the above-mentioned method. For example, if the term “A” has the direct relationship with the term “B,” the term “B” further has the direct relationship with the term “C,” and the term “C” further has the direct relationship with a term “D,” which is referred to as A→B→C→D, then the system 24 determines that the term “A” has the indirect relationship with the term “D”, which is called the third-level relationship.
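
One possible way to obtain a third-level relationship is to apply the same summation again, using the second-level result in place of the direct related matrix. The disclosure does not spell out this step, so the repeated application shown here is an assumption, and the function name propagate and all numeric values other than the 100, 300, and 0.75 of the A→B→C example are hypothetical.

```python
def propagate(M, P):
    """Return M' where M'[i][j] is the sum of M[i][k] * P[k][j] over
    every intermediate term k other than i and j."""
    n = len(M)
    return [
        [
            0.0 if i == j else sum(
                M[i][k] * P[k][j] for k in range(n) if k != i and k != j
            )
            for j in range(n)
        ]
        for i in range(n)
    ]


A, B, C, D = 0, 1, 2, 3
# Direct relationships along the chain A -> B -> C -> D.
R = [
    [0.0, 100.0, 0.0, 0.0],
    [0.0, 0.0, 300.0, 0.0],
    [0.0, 0.0, 0.0, 200.0],
    [0.0, 0.0, 0.0, 0.0],
]
# Hypothetical related scores along the same chain.
P = [
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.75, 0.0],
    [0.0, 0.0, 0.0, 0.5],
    [0.0, 0.0, 0.0, 0.0],
]
second_level = propagate(R, P)            # A reaches C through B
third_level = propagate(second_level, P)  # A reaches D through B and C
```

In this sketch “A” has no second-level relationship with “D”, and the relationship between “A” and “D” only appears after the second application.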

It should be emphasized that the above-described embodiments of the present disclosure are merely possible examples of implementations, set forth for a clear understanding of the principles of the disclosure. Many variations and modifications may be made to the above-described embodiment(s) of the disclosure without departing substantially from the spirit and principles of the disclosure. All such modifications and variations are intended to be included herein within the scope of the present disclosure and protected by the following claims.

Claims

1. A method for calculating indirect relationships between a plurality of query terms using an electronic device, the method comprising:

obtaining the plurality of query terms from a storage device of the electronic device;
calculating a direct relationship “Ri, j” between every two of the query terms;
calculating a related score “Pi, j” between every two of the query terms; and
calculating an indirect relationship “R′i, j” between every two of the query terms according to the direct relationship “Ri, j” and the related score “Pi, j” between every two of the query terms.

2. The method according to claim 1, further comprising:

determining indirect terms of each query term according to the indirect relationship between every two of the query terms, and storing the determined indirect terms in the storage device of the electronic device.

3. The method according to claim 2, further comprising:

obtaining search results from a data source by performing a search operation according to the determined indirect terms, and displaying the search results on a display device of the electronic device.

4. The method according to claim 2, wherein the related score “Pi, j” between every two of the query terms is obtained by calculating a conditional probability between every two of the query terms.

5. The method according to claim 2, wherein the indirect relationship “R′i, j” between every two of the query terms is calculated by a formula $R'_{i,j} = \sum_{k=1,\ k \neq i,j}^{n} R_{i,k} \cdot P_{k,j}$, wherein the variable “n” represents a total number of the query terms.

6. The method according to claim 5, wherein a direct related matrix “R” is generated according to the direct relationships “Ri, j”, a related score matrix “P” is generated according to the related scores “Pi, j”, and the indirect relationship “R′i, j” is calculated using the direct related matrix “R” and the related score matrix “P” according to the formula.

7. The method according to claim 2, wherein the determining step comprises: determining that a first term of the query term is the indirect term of a second term of the query terms upon the condition that the indirect relationship between the first term and the second term is greater than or equal to a preset value.

8. The method according to claim 7, wherein the preset value is 1.0.

9. An electronic device, comprising:

a processor;
a storage device storing a plurality of instructions, which when executed by the processor, causes the processor to:
obtain a plurality of query terms from the storage device;
calculate a direct relationship “Ri, j” between every two of the query terms;
calculate a related score “Pi, j” between every two of the query terms; and
calculate an indirect relationship “R′i, j” between every two of the query terms according to the direct relationship “Ri, j” and the related score “Pi, j” between every two of the query terms.

10. The electronic device according to claim 9, wherein the plurality of instructions further comprise:

determining indirect terms of each query term according to the indirect relationship between every two of the query terms, and store the determined indirect terms in the storage device.

11. The electronic device according to claim 10, wherein the plurality of instructions further comprise:

obtaining search results from a data source by performing a search operation according to the determined indirect terms, and displaying the search results on a display device of the electronic device.

12. The electronic device according to claim 10, wherein the related score “Pi, j” between every two of the query terms is obtained by calculating a conditional probability between every two of the query terms.

13. The electronic device according to claim 10, wherein the indirect relationship “R′i, j” between every two of the query terms is calculated by a formula $R'_{i,j} = \sum_{k=1,\ k \neq i,j}^{n} R_{i,k} \cdot P_{k,j}$, wherein the variable “n” represents a number of the query terms.

14. The electronic device according to claim 10, wherein the instruction of determining indirect terms of each query term according to the indirect relationship between every two of the query terms comprises: determining that a first term of the query term is the indirect term of a second term of the query terms upon the condition that the indirect relationship between the first term and the second term is greater than or equal to a preset value.

15. A non-transitory storage medium having stored thereon instructions that, when executed by a processor of an electronic device, causes the processor to perform a method for calculating indirect relationships between a plurality of query terms, the method comprising:

obtaining the plurality of query terms from a storage device of the electronic device;
calculating a direct relationship “Ri, j” between every two of the query terms;
calculating a related score “Pi, j” between every two of the query terms; and
calculating an indirect relationship “R′i, j” between every two of the query terms according to the direct relationship “Ri, j” and the related score “Pi, j” between every two of the query terms.

16. The non-transitory storage medium according to claim 15, wherein the method further comprises:

determining indirect terms of each query term according to the indirect relationship between every two of the query terms, and storing the determined indirect terms in the storage device of the electronic device.

17. The non-transitory storage medium according to claim 16, wherein the method further comprises:

obtaining search results from a data source by performing a search operation according to the determined indirect terms, and displaying the search results on a display device of the electronic device.

18. The non-transitory storage medium according to claim 16, wherein the related score “Pi, j” between every two of the query terms is obtained by calculating a conditional probability between every two of the query terms.

19. The non-transitory storage medium according to claim 16, wherein the indirect relationship “R′i, j” between every two of the query terms is calculated by a formula $R'_{i,j} = \sum_{k=1,\ k \neq i,j}^{n} R_{i,k} \cdot P_{k,j}$, wherein the variable “n” represents a total number of the query terms.

20. The non-transitory storage medium according to claim 16, wherein the determining step comprises: determining that a first term of the query term is the indirect term of a second term of the query terms upon the condition that the indirect relationship between the first term and the second term is greater than or equal to a preset value.

Patent History
Publication number: 20130262456
Type: Application
Filed: May 31, 2013
Publication Date: Oct 3, 2013
Inventors: CHUNG-I LEE (NEW TAIPEI), CHIEN-FA YEH (NEW TAIPEI), CHIU-HUA LU (NEW TAIPEI), GEN-CHI LU (NEW TAIPEI)
Application Number: 13/906,380
Classifications
Current U.S. Class: Ranking Search Results (707/723); Ranking, Scoring, And Weighting Records (707/748)
International Classification: G06F 17/30 (20060101);