SEAMLESS INTEGRATION OF MODULES FOR SEARCH ENHANCEMENT
There is provided a system for converting a search query into an interpreted search query for searching a plurality of objects of an object-related dataset using a search engine, the system comprising: at least one hardware processor executing a code for: receiving, using a software interface, a search query provided by a user for at least one object of the plurality of objects, parsing the search query to identify a plurality of search terms, converting at least one of the plurality of search terms into at least one interpreted query term, creating an interpreted search query based on the at least one interpreted query term, and providing, using the software interface, the interpreted search query for execution by the search engine by matching the interpreted search query with at least one modeled object representation of each of the plurality of objects defined in an enhanced indexed object dataset.
This application claims the benefit of priority of U.S. Utility patent application Ser. No. 15/275,620 filed on Sep. 26, 2016, the contents of which are incorporated herein by reference in their entirety.
FIELD AND BACKGROUND OF THE INVENTION
The present invention, in some embodiments thereof, relates to web searches and, more specifically, but not exclusively, to product-related searches in e-commerce websites.
Known e-commerce product search services utilize keyword search engines. Such engines do not take into account relations between e-commerce concepts and frequently return unsuitable results, making the search for a desired product or for a list of suitable products cumbersome and inconvenient for a user, who is forced to perform many search iterations and/or may eventually fail to find what the user is looking for.
SUMMARY OF THE INVENTION
According to a first aspect, a system for converting a search query into an interpreted search query for searching a plurality of objects of an object-related dataset using a search engine, comprises: at least one hardware processor executing a code for: receiving, using a software interface, a search query provided by a user for at least one object of the plurality of objects, parsing the search query to identify a plurality of search terms, converting at least one of the plurality of search terms into at least one interpreted query term, creating an interpreted search query based on the at least one interpreted query term, and providing, using the software interface, the interpreted search query for execution by the search engine by matching the interpreted search query with at least one modeled object representation of each of the plurality of objects defined in an enhanced indexed object dataset.
According to a second aspect, a method of converting a search query into an interpreted search query for searching a plurality of objects of an object-related dataset using a search engine, comprises: receiving, using a software interface, a search query provided by a user for at least one object of the plurality of objects, parsing the search query to identify a plurality of search terms, converting at least one of the plurality of search terms into at least one interpreted query term, creating an interpreted search query based on the at least one interpreted query term; and providing, using the software interface, the interpreted search query for execution by the search engine by matching the interpreted search query with at least one modeled object representation of each of the plurality of objects defined in an enhanced indexed object dataset.
The systems and/or methods (e.g., code instructions stored in a storage device executed by one or more processors) described herein provide a technical solution to the technical problem of converting a search engine that searches for products in a product dataset (e.g., of an e-commerce site) into a search engine with enhanced search capability without necessarily reformatting the code of the search engine (apart from adapting the search engine for communication with the software interface). The enhanced search capabilities may include enhancing the ability of a search engine that is designed to receive text search queries to process images used as search queries, and/or to enhance the ability of the search engine to utilize natural language understanding (NLU) technology. The technical solution to the technical problem is based on a code interface (e.g., script, application programming interface (API), software development kit (SDK)) that interfaces with the search engine. The interface (e.g., API) receives the search query from the search engine, converts the search query into an interpreted search query understood by the search engine when searching using the enhanced indexed product dataset, and provides the interpreted search query to the search engine for searching within the product dataset. The API may be used by a certain search engine, and/or by different search engines (e.g., having different internal architectures and/or different internal implementations) that may search different product-related datasets, regardless of the implementation of the search engine, and optionally while maintaining additional search result ranking considerations.
In a first possible implementation of the system according to the first aspect or the method according to the second aspect, each of the modeled object representations includes at least one model element and hierarchic relations between model elements of a hierarchic object model.
In a second possible implementation of the system or the method according to the first implementation of the first aspect or the second aspect, the conversion of each of the plurality of search terms into at least one interpreted query term is based on matching at least one of: synonyms, similar terms, hierarchy, and element distance of the matching model element of the hierarchic object model.
In a third possible implementation of the system or the method according to the first or second implementation of the first or second aspect, each of the plurality of search terms is parsed by at least one of: matching to at least one model element of the hierarchic object model of the plurality of objects, and based on linguistic set-of-rules.
In a fourth possible implementation of the system or the method according to the first or second or third implementation of the first or second aspect, each search term is analyzed into a plurality of interpretations matching to a plurality of model elements of the hierarchic object model, and converted into a plurality of interpreted query terms based on the plurality of interpretations.
In a fifth possible implementation of the system or the method according to the first or second or third or fourth implementation of the first or second aspect, the converting at least one of the plurality of search terms into at least one interpreted query term is performed based on each matching model element of the hierarchic object model.
In a sixth possible implementation of the system or the method according to the first or second or third or fourth or fifth implementation of the first or second aspect, the hierarchic object model comprises at least one enriched attribute of the at least one model element, wherein the at least one enriched attribute is calculated based on at least one of: other attributes of a concept of the respective object and metadata of the respective object, wherein a concept is a model element representing a certain kind of objects or parts of objects and an attribute is a model element representing properties of the concept.
In a seventh possible implementation of the system or the method according to the sixth implementation of the first or second aspect, the metadata of the respective object includes one or more of the following: image, price, text, user information input, specification, title, description, overview, and review.
In an eighth possible implementation of the system or the method according to the sixth or seventh implementation of the first or second aspect, the enriched attribute is calculated based on at least one of: a formula representing domain expert knowledge of relationships between the attributes, and an automated machine learning process.
In a ninth possible implementation of the system or the method according to the sixth or seventh or eighth implementation of the first or second aspect, the enriched attribute is a Boolean value denoting at least one of a type and a suitability for at least one of a certain application and a certain use.
In a tenth possible implementation form of the system according to the first aspect as such or the method according to the second aspect as such, or according to any of the preceding forms of the first or second aspects, the search query comprises an input text provided by the user defining a natural language search query.
In an eleventh possible implementation form of the system according to the first aspect as such or the method according to the second aspect as such, or according to any of the preceding forms of the first or second aspects, the search query comprises an image.
In a twelfth possible implementation form of the system according to the first aspect as such or the method according to the second aspect as such, or according to any of the preceding forms of the first or second aspects, the software interface comprises an application programming interface (API) in communication with a search server hosting the search engine.
In a thirteenth possible implementation form of the system according to the twelfth implementation form of the first or second aspects, the API is hosted by an interpretation server located externally and remotely from the search server, wherein the interpretation server and the search server communicate over a network using the API.
In a fourteenth possible implementation form of the system according to the first aspect as such or the method according to the second aspect as such, or according to any of the preceding forms of the first or second aspects, the system further comprises code instructions executable by the at least one hardware processor and/or the method further comprises creating structured data for each of the plurality of objects, by: receiving using the software interface, object-related data for each of the plurality of objects; extracting features from the object-related data for each of the plurality of objects; creating structured data for each of the plurality of objects using the corresponding extracted features; and transmitting, using the software interface, the created structured data for each of the plurality of objects to a server associated with the search engine for creation of the enhanced indexed object dataset by integration of the structured data with an existing object dataset searched by the search engine.
In a fifteenth possible implementation form of the system according to the preceding fourteenth implementation form of the first or second aspects, the object-related data includes one or more members selected from the group consisting of: natural language object-related data, images, videos, content of links associated with the product, user queries, specifications, titles, descriptions, overviews, and reviews.
In a sixteenth possible implementation form of the system according to the first aspect as such or the method according to the second aspect as such, or according to any of the preceding forms of the first or second aspects, the code executed by the at least one hardware processor is external to the code of the search engine, wherein the code is implemented without modification to the code of the search engine apart from code associated with communication between the search engine and the software interface.
In a seventeenth possible implementation form of the system according to the first aspect as such or the method according to the second aspect as such, or according to any of the preceding forms of the first or second aspects, at least one of the plurality of search terms does not correspond to objects defined by the object-related dataset.
In an eighteenth possible implementation form of the system according to the first aspect as such or the method according to the second aspect as such, or according to any of the preceding forms of the first or second aspects, the interpreted search query is at least one of: formatted in a query language of the search engine, and formatted in a format that is designed for translation to a plurality of engine query formats of a corresponding search engine of a plurality of search engines.
In a nineteenth possible implementation form of the system according to the first aspect as such or the method according to the second aspect as such, or according to any of the preceding forms of the first or second aspects, the interpreted search query is associated with a mapping defining which of the plurality of search terms corresponds to which of the at least one interpreted query term.
In a twentieth possible implementation form of the system according to the first aspect as such or the method according to the second aspect as such, or according to any of the preceding forms of the first or second aspects, the at least one interpreted query term is associated with a data-element defining an extent to which the respective interpreted query term is prototypical denoting a quantitative measure of how much the respective interpreted query term conforms with a definition of a higher-level concept.
In a twenty-first possible implementation form of the system according to the first aspect as such or the method according to the second aspect as such, or according to any of the preceding forms of the first or second aspects, the at least one interpreted query term is associated with a data-element defining a negative intent of the respective interpreted query term denoting non-existence of the respective interpreted query term.
In a twenty-second possible implementation form of the system according to the first aspect as such or the method according to the second aspect as such, or according to any of the preceding forms of the first or second aspects, the at least one interpreted query term is associated with a distribution based on a value of at least one of: an attribute, a concept, a combination of attributes, and a combination of concepts associated with the at least one interpreted term.
In a twenty-third possible implementation form of the system according to the first aspect as such or the method according to the second aspect as such, or according to any of the preceding forms of the first or second aspects, the conversion is performed according to a preference for the at least one interpreted query term by the user entering the search query.
In a twenty-fourth possible implementation form of the system according to the first aspect as such or the method according to the second aspect as such, or according to any of the preceding forms of the first or second aspects, the interpreted search query includes instructions for execution by the search engine for matching and ranking the matching modeled object representation defined in the indexed object dataset.
Some embodiments of the invention are herein described, by way of example only, with reference to the accompanying drawings. With specific reference now to the drawings in detail, it is stressed that the particulars shown are by way of example and for purposes of illustrative discussion of embodiments of the invention. In this regard, the description taken with the drawings makes apparent to those skilled in the art how embodiments of the invention may be practiced.
In the drawings:
An aspect of some embodiments of the present invention relates to a system and/or method (e.g., code instructions stored in a storage device executed by processor(s)) that enhance a search engine using a software interface (e.g., API) that communicates with the search engine. The API extracts features from object-related data (that may include unstructured data), to create structured object data. The structured object data is transmitted to the search engine (and/or to another computing device associated with the search engine) using the API. The search engine (and/or another computing device associated with the search engine) creates an enhanced object dataset. The enhanced object dataset is created from an existing object dataset used by the search engine by integrating the structured object data received from the API. The enhanced object dataset may be created by code associated with the API, the search engine, and/or by other code. The enhanced structured object dataset is used by the search engine for searching using an interpreted search query received from the API. The interpreted search query is created by code associated with the API from a search query (e.g., a natural language query, an image) entered by a user for searching objects using the search engine. The interpreted query, when used by the search engine to search the enhanced indexed object dataset, enhances the search results returned by the search engine in comparison to a search engine executing the user entered query using standard methods on a standard object dataset. The interpreted query (and/or the search engine) may include additional instructions that define how the returned search results are processed by the search engine. For example, the search engine may return results that are more relevant to the search query and/or results that are more highly ranked, in comparison to results that would be returned using standard searching methods.
The search engine may dynamically adapt (based on the additional instructions) the processing of the results, for example, for the search query “chairs” to return only chair products, and for the search query “laptop” to return both laptop and laptop accessory products.
It is noted that as used herein, the term API denotes an exemplary but not necessarily limiting implementation of the software interface. Other software interfaces may be implemented. The term API used here may be substituted with other software interface implementations.
The structured product data of each product and interpreted search query enhance the search results returned by the search engine by reducing irrelevant results and/or providing the irrelevant results with lower rankings than additional relevant results and/or including relevant results which may not be found using standard methods. For example, when executed using standard search methods the query chair without wheels may result in chairs with wheels being ranked higher than chairs without wheels, because the terms chair and wheels both appear in product title/description/specs of chairs with wheels, whereas only chair appears in title/description/specs of chairs without wheels. The API described herein solves the described technical problem, expressing a non-existence intent with respect to wheels in the interpreted query, resulting in the search engine correctly returning and/or ranking chairs without wheels. In another example, using standard methods, if smartphone cases are clicked on much more frequently than smartphones, the search engine might lead to black cases presented before black phones as results for the query black iphone. The API solves this technical problem by interpreting the product type intent correctly.
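The chair-without-wheels example above can be illustrated with a minimal sketch. It is not the implementation described herein; the function names (keyword_score, interpreted_score), the product records, and the dictionary layout of the interpreted query terms are hypothetical and chosen only for illustration. The sketch contrasts a plain keyword count with a score that honours a non-existence intent attached to an interpreted query term.

```python
def keyword_score(title, query):
    """Baseline: count how many query words occur in the product title."""
    title_words = set(title.lower().split())
    return sum(1 for word in query.lower().split() if word in title_words)


def interpreted_score(product, interpreted_query):
    """Score a product against interpreted terms, honouring negative intent."""
    score = 0
    for term in interpreted_query:
        if term["kind"] == "concept":
            matched = product["concept"] == term["value"]
        else:  # kind == "part"
            present = term["value"] in product["parts"]
            # A negated term is satisfied when the part is absent.
            matched = (not present) if term["negated"] else present
        score += 1 if matched else 0
    return score


# Hypothetical product records.
products = [
    {"title": "Office chair with wheels", "concept": "chair", "parts": {"wheels"}},
    {"title": "Wooden dining chair", "concept": "chair", "parts": set()},
]

# Hypothetical interpreted form of the query "chair without wheels".
interpreted = [
    {"kind": "concept", "value": "chair", "negated": False},
    {"kind": "part", "value": "wheels", "negated": True},
]

by_keywords = max(products, key=lambda p: keyword_score(p["title"], "chair without wheels"))
by_intent = max(products, key=lambda p: interpreted_score(p, interpreted))
print(by_keywords["title"])  # the keyword baseline prefers the chair with wheels
print(by_intent["title"])    # the interpreted query prefers the chair without wheels
```

In this sketch the keyword baseline ranks the chair with wheels first because both "chair" and "wheels" appear in its title, whereas the interpreted query ranks the chair without wheels first.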
Some embodiments of the present invention provide a system and method for enhanced interpretation of product-related data, according to e-commerce expert knowledge (which may be manually entered and/or automatically learned by machine learning methods) implemented in a hierarchic product database model. The enhanced interpretation provided by some embodiments of the present invention facilitates transition of data from a website, e-commerce database or any other suitable data source, to an indexed model-based hierarchic database.
Arrangement of e-commerce website product data in such a model-based database facilitates interpretation of the database and the search terms according to the same model, and thus, for example, extraction of data that better suits the intentions of a user.
Some embodiments of the present invention provide a hierarchic database model representing comprehensive knowledge about cataloging and categorization of e-commerce products. The hierarchic model is used by a natural language analyzer (NLA) to provide enhanced indexing and search query interpretation, according to some embodiments of the present invention.
A hierarchic database model, according to some embodiments of the present invention, may include a hierarchy of concepts, i.e. database model nodes that represent product categories, rooted in a general super-concept of products such as e-commerce products.
A concept is a model element representing a certain kind of products or parts of products. For example, a concept “Chair” is the concept of all chair products, a concept “Laptop” is the concept of all laptop computer products, and the like. Concepts may include information about relations to other concepts, such as inheritance or meronomy (i.e. partonomy). For example, a concept “Seat” may represent a component of the concept “Chair” and therefore may include a meronomy relation to the Chair concept. Some concepts may represent products that may be sold as a separate product by themselves or as components of other products.
For example, a computer processor may be sold as a separate product or within a computer.
Each concept in the hierarchic model may have attributes representing properties of the concept and possible values of these attributes. For example, a Laptop concept may have a Weight attribute. For each attribute, the model defines the types of values that can be assigned to this attribute. Possible types of values may include, for example, a number, a string, a size (number+unit), and/or an instance of another concept.
The model may define a list of specific possible values that are allowed as values of the attribute, and/or a range of possible values. For example, the model may define a minimal and maximal numeric value allowed for the attribute. In some embodiments, each attribute includes information about distribution of values of this attribute.
An instance of a concept represents an entity whose type is the concept. For example, an instance of the concept “mobile phone” may include a certain brand, color, size, and/or any other property of the instance. An instance of the concept has specific values of some or all of its attributes, i.e. an instance contains an assignment of specific values for some or all of its concept's attributes, in accordance with the defined value-types of the attributes as well as the possible values or value range defined for the attribute.
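The concepts, attributes, and instances described above may be pictured with the following minimal sketch. It is one possible in-memory representation under stated assumptions, not the model of the embodiments; the class names (Attribute, Concept), the helper make_instance, and the example value ranges are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Attribute:
    """A property of a concept together with its permitted values."""
    name: str
    value_type: type
    allowed_values: Optional[set] = None  # explicit list of permitted values
    min_value: Optional[float] = None     # lower bound of a numeric range
    max_value: Optional[float] = None     # upper bound of a numeric range

    def accepts(self, value) -> bool:
        if not isinstance(value, self.value_type):
            return False
        if self.allowed_values is not None and value not in self.allowed_values:
            return False
        if self.min_value is not None and value < self.min_value:
            return False
        if self.max_value is not None and value > self.max_value:
            return False
        return True


@dataclass
class Concept:
    """A model element representing a kind of product, with an optional parent."""
    name: str
    parent: Optional["Concept"] = None
    attributes: dict = field(default_factory=dict)

    def all_attributes(self) -> dict:
        """Own attributes plus those inherited from ancestor concepts."""
        inherited = self.parent.all_attributes() if self.parent else {}
        return {**inherited, **self.attributes}


def make_instance(concept: Concept, **values) -> dict:
    """Create an instance, checking each value against the attribute definition."""
    attributes = concept.all_attributes()
    for key, value in values.items():
        if key not in attributes:
            raise ValueError(f"{concept.name} has no attribute {key!r}")
        if not attributes[key].accepts(value):
            raise ValueError(f"value {value!r} is not allowed for {key!r}")
    return {"concept": concept.name, "values": dict(values)}


# Hypothetical model fragment: Laptop inherits 'price' from Product.
product = Concept("Product", attributes={"price": Attribute("price", float, min_value=0.0)})
laptop = Concept(
    "Laptop",
    parent=product,
    attributes={"weight_kg": Attribute("weight_kg", float, min_value=0.1, max_value=10.0)},
)

instance = make_instance(laptop, weight_kg=1.8, price=999.0)
print(instance)
```

An instance may assign only some of the attributes, which matches the statement above that specific values are given for some or all of the concept's attributes.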
The interpretation system according to some embodiments of the present invention analyses texts and/or other information input about products, and produces a structured representation of the input's meaning, formed of concepts, attributes, relations, values and/or constraints of the model. The product-related data may be additional data which is not structured within a product dataset that is searched by the search engine when standard methods are used. The systems and/or methods described herein analyze and make available such information for searching products, by extracting features from the additional product-related data, as described herein. Examples of external information sources that include product-related data passed to the API for feature extraction include non-text sources, for example, images and videos, and text-based sources, for example, content of links associated with the product, user queries, specifications, titles, descriptions, overviews and/or reviews. The extracted features being analyzed may not carry any semantic meaning and/or be coordinated in advance with the search engine, for example, arbitrary codes that are produced by the code of the API (e.g., NLA) but are meaningless to the search engine may be used. For example, bar codes, QR codes, or other codes may be used. The codes may represent an encryption. For example, color=FX321435Y, blue=YY24wxd2. The arbitrary codes may be used internally by the code of the API (e.g., NLA), while remaining hidden from the user entering a query, and/or the search engine performing a search.
The extracted features may be distinct. The search engine may select a sub-set of the extracted features for use, and/or may treat different features in different ways. For example, for the search query blue leather chair, the corresponding interpreted search query (created as described herein) is [concept=chair, color=blue, material=leather]. The search engine may decide to use the concept as a filter, and return only chairs, and then to use the color and material to boost (i.e., show higher in the results) those which are blue and/or made of leather (allowing chairs of other colors and materials but not objects which are not chairs). The search engine may assign different significance to different extracted components.
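The filter-and-boost behaviour described above may be sketched as follows. This is an illustration under stated assumptions, not the behaviour of any particular search engine; the function name search, the catalog records, and the equal weighting of each boosted attribute are hypothetical.

```python
def search(products, interpreted_query):
    """Filter by concept, then boost products matching the remaining terms."""
    concept = interpreted_query["concept"]
    boosts = {key: value for key, value in interpreted_query.items() if key != "concept"}

    # The concept acts as a hard filter: non-chairs are never returned.
    candidates = [p for p in products if p.get("concept") == concept]

    def boost_score(product):
        # Each matching attribute adds one point (equal weights are an assumption).
        return sum(1 for key, value in boosts.items() if product.get(key) == value)

    # Higher boost scores are shown first; other chairs are still allowed.
    return sorted(candidates, key=boost_score, reverse=True)


# Hypothetical catalog.
catalog = [
    {"id": 1, "concept": "chair", "color": "red", "material": "leather"},
    {"id": 2, "concept": "chair", "color": "blue", "material": "leather"},
    {"id": 3, "concept": "sofa", "color": "blue", "material": "leather"},
    {"id": 4, "concept": "chair", "color": "green", "material": "plastic"},
]

# Interpreted form of the query "blue leather chair", as in the text above.
query = {"concept": "chair", "color": "blue", "material": "leather"}

results = search(catalog, query)
print([p["id"] for p in results])  # [2, 1, 4]
```

The blue leather sofa is excluded by the concept filter even though it matches both boosted attributes, while the green plastic chair is retained at the end of the list.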
According to some embodiments of the present invention, the interpretation system(s) and/or method(s) may execute a natural language analyzer (NLA) for interpreting terms in a context of product-related data according to the hierarchic product model.
The interpretation process may create a dedicated e-commerce database indexed and constructed according to the hierarchic product model by obtaining the NLA's term interpretations, and based on the interpretations indexing the product-related data into elements of the model-based database. Accordingly, a product-related data may be represented by elements of the model and the relations between them.
Further, the interpretation system provided by embodiments of the present invention may use the NLA to interpret search queries and create model-based representations of the search queries, thus enabling a search server to perform a model-based search in the indexed website database. A search server according to some embodiments of the present invention may match the model-based representations of the search queries against elements of the model-based website database, thus finding in the database suitable products to output as search results.
The provided model-based database, searchable by the model-based representations of search queries, is designed to improve the way e-commerce data is stored and retrieved from the database, and thus improves the database operation.
By using the hierarchic model and the NLA provided by some embodiments of the present invention, the system and method described herein solve the problem of inaccurate search results in e-commerce websites, and provide, in response to product-related searches, product suggestions that match the meaning of the search query as the user intended.
This is thanks to an indexed database based on a hierarchic model and implementation in the hierarchic model of common uses of terms and/or values and relations between terms and/or values, along with limited value-spaces and/or value distributions.
It will be appreciated that as used herein, a term may include one or more words.
An aspect of some embodiments of the present invention relates to systems and/or methods that use a software interface (e.g., API) to convert a search query into an interpreted search query for searching for objects by a search engine. The software interface converts the search query into the interpreted search query without necessarily modifying the internal code of the search engine, for example, the search engine transmits the search query received from a user to a computing device (e.g., server) using the software interface and executes the received interpreted search query which is received using the software interface. It is noted that the search engine's code might be adapted for communicating with the software interface (e.g., API) described herein. The search query may include one or more terms which are not necessarily meaningful to the search engine and/or that are unmatchable to one or more products when searched by the search engine, and/or that return irrelevant results by matching to irrelevant products (i.e., not intended to be searched by the user). In the case where the search engine executes the query, for example blue dress, there may be products which may be exactly matched, but cannot be retrieved by the search engine because the products are not understood. For example, the color may appear only in image(s) and not in text (the systems and/or methods described herein are able to identify from image(s) that the color is blue), the product may be wrongly listed in the shirts category (the systems and/or methods described herein are able to interpret that it's a dress regardless), and/or the color may be described as navy and the search engine doesn't know that navy is a form of blue (the systems and/or methods described herein are able to interpret that navy is a shade of blue).
It is noted that the search engine executing the interpreted search query may still return irrelevant results. Using the interpreted search query, the search engine may be able to improve ranking of the most relevant results, with the irrelevant results (that would otherwise be returned when searching using standard methods) being ranked lower.
For example, execution of the search query office chair with wheels using standard search methods often returns chairs, wheels, and wheel chairs. Furthermore, executing the search query office chair without wheels using standard search methods typically returns chairs with wheels first because the search engine is unable to interpret the significance of the term without. In another example, the search term computers for kids may not be entirely understood by the search engine, as the term for kids may not be matched to any products. The search engine may return products that are different than the user's intent, for example, returning books about teaching computers to kids, computer toys, and laptop covers with kids photos. Alternatively, the search engine may search for computers while ignoring for kids, thus returning computers that are irrelevant to kids, for example, expensive computers, computers with processing capabilities for advanced design work, and heavy desktop servers. Using the API and computing device described herein, the search engine may search using the interpreted search query, and return relevant results that are closer to the user's intent, for example, low cost computers, portable computers, computers designed for games, and ruggedly designed computers.
The computing device receives using the API, an input query entered by a user, for example, a text and/or image (e.g., typed, using a voice to text interface, selected from a list, uploaded, or other methods) defining the search query for one or more products. The search query is parsed into search terms. Parsing may be performed by matching each term to model element(s) of a hierarchic product model (created as described herein). For example, the term color (which may not be meaningful to the search engine) may be matched to one or more concepts that represent the attribute color in the hierarchic product model, and the term blue may be its value. Alternatively or additionally, the parsing may be performed by interpreting the search query based on linguistic set-of-rules (e.g., patterns), which may be stored, for example, as a set-of-rules, code, a look-up table, a mapping database, or other implementations. The linguistic set-of-rules may be, for example, manually programmed, and/or automatically learned using machine learning techniques. For example, for the search query smartphone under $1000, the term under may be interpreted as a quantitative inequality intent, which for the described example is based on price (i.e., price <1000). The term under may not necessarily be included in the hierarchic product model. In another example, for the search query cheap smartphone, the term cheap may be interpreted as price <700.
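The linguistic set-of-rules described above may be sketched, for the two price examples only, as a pair of manually programmed patterns. This is a minimal illustration, not the parser of the embodiments; the function name parse_query, the regular expressions, and the output layout are hypothetical, while the threshold of 700 for cheap is taken from the example above.

```python
import re

# Rule 1: "under $<number>" expresses a quantitative inequality on price.
UNDER_PRICE = re.compile(r"\bunder\s+\$?(\d+(?:\.\d+)?)", re.IGNORECASE)
# Rule 2: "cheap" is interpreted as price < 700, following the example in the text.
CHEAP = re.compile(r"\bcheap\b", re.IGNORECASE)
CHEAP_THRESHOLD = 700.0


def parse_query(query):
    """Split a query into plain terms and price constraints (hypothetical layout)."""
    constraints = []
    remaining = query

    match = UNDER_PRICE.search(remaining)
    if match:
        constraints.append(("price", "<", float(match.group(1))))
        # Remove the matched phrase so it is not treated as a product term.
        remaining = remaining[:match.start()] + remaining[match.end():]

    if CHEAP.search(remaining):
        constraints.append(("price", "<", CHEAP_THRESHOLD))
        remaining = CHEAP.sub("", remaining)

    return {"terms": remaining.split(), "constraints": constraints}


print(parse_query("smartphone under $1000"))
print(parse_query("cheap smartphone"))
```

The remaining plain terms (here, smartphone) would then be matched to model elements of the hierarchic product model as described above.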
Each of the search terms is converted into one or more interpreted query terms, which may be based on synonyms and/or similar terms and/or hierarchy and/or element distance of the matching model element. For example, the term “blue” is converted based on hierarchy into “navy” and/or other shades of blue.
Element distance may also be referred to herein as weighted similarity.
In terms of element distance, different close elements may be assigned scores that denote a “distance” to the intent of the search query. Distance may be computed for product types, attributes, and/or other parameters described herein. The distance may be a statistical distance, for example, computed based on a correlation, and/or a Euclidean distance within a space. In an example based on element distance between attributes, when a search is entered for a color of a product that is not available, for example the query “bright orange business suit”, the converted query may include instructions for the search engine to search for the nearest available color, for example, “light brown business suit”. In an example based on distance between products, when the search query is for sandals, the interpreted query may include instructions (optionally weighted instructions) to search for flip flops.
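The nearest-available-color example may be sketched using a Euclidean distance, as follows. The sketch assumes that colors are represented as RGB triples; the specific RGB values, the dictionary AVAILABLE_COLORS, and the function name nearest_available are illustrative assumptions, not part of the described model.

```python
import math

# Hypothetical RGB coordinates for the colors in stock (illustrative values).
AVAILABLE_COLORS = {
    "light brown": (196, 154, 108),
    "navy": (0, 0, 128),
    "black": (0, 0, 0),
    "grey": (128, 128, 128),
}

def nearest_available(requested_rgb, available):
    """Return the name of the available color closest to the requested one.

    Closeness is the Euclidean distance between RGB triples, which is one
    possible choice of the "space" mentioned in the text.
    """
    return min(available, key=lambda name: math.dist(requested_rgb, available[name]))

# Assumed RGB value for "bright orange".
BRIGHT_ORANGE = (255, 140, 0)
substitute = nearest_available(BRIGHT_ORANGE, AVAILABLE_COLORS)
```

With these assumed coordinates, the substitute for bright orange is light brown, matching the example above. A weighted variant could return every candidate together with its distance, so that the search engine may rank by it.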
In another example, the term “for kids” may be converted into a set of attributes that are defined as for kids, for example, weight, portability, cost, size, and ruggedness. In yet another example, the term “furniture” might be converted into a list of multiple possible furniture types, for example, chair, table, sofa, cabinet, and the like. In yet another example, the term “for kids” may be an existing Boolean enhanced attribute defined in the hierarchic model (as described herein). The interpreted search query is created based on the interpreted query terms, and transmitted to the search engine using the software interface. The search engine uses the interpreted search query to search an indexed product dataset (created as described herein).
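The conversion of search terms into an interpreted search query may be sketched as follows. The two lookup tables stand in for the hierarchic product model, and the output layout (an any_of list of alternatives plus a filters mapping) is an assumption chosen for the sketch; an actual search engine would define its own query structure.

```python
from typing import Dict, List

# Hypothetical stand-ins for the hierarchic product model.
HIERARCHY: Dict[str, List[str]] = {
    "blue": ["blue", "navy", "sky blue"],
    "furniture": ["chair", "table", "sofa", "cabinet"],
}
# Terms that map onto an existing Boolean enhanced attribute.
ENHANCED_ATTRIBUTES = {"for kids": ("for_kids", True)}

def build_interpreted_query(terms):
    """Expand each term and assemble a structured interpreted query."""
    interpreted = {"any_of": [], "filters": {}}
    for term in terms:
        if term in ENHANCED_ATTRIBUTES:
            # The term becomes an attribute filter instead of a keyword.
            name, value = ENHANCED_ATTRIBUTES[term]
            interpreted["filters"][name] = value
        else:
            # The term is expanded by hierarchy; unknown terms pass through.
            interpreted["any_of"].append(HIERARCHY.get(term, [term]))
    return interpreted
```

For example, build_interpreted_query(["furniture", "for kids"]) yields one group of alternative furniture types and one Boolean filter, rather than the literal keywords.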
The systems and/or methods (e.g., code instructions stored in a storage device executed by one or more processors) described herein provide a technical solution to the technical problem of converting a search engine that searches for products in a product dataset (e.g., of an e-commerce site) into a search engine with enhanced search capability without necessarily reformatting the code of the search engine (apart from adapting the search engine for communication with the software interface). The enhanced search capabilities may include enhancing the ability of a search engine that is designed to receive text search queries to process images used as search queries, and/or to enhance the ability of the search engine to utilize natural language understanding (NLU) technology. The technical solution to the technical problem is based on a code interface (e.g., script, application programming interface (API), software development kit (SDK)) that interfaces with the search engine. The interface (e.g., API) receives the search query from the search engine, converts the search query into an interpreted search query understood by the search engine when searching using the enhanced indexed product dataset, and provides the interpreted search query to the search engine for searching within the product dataset. The API may be used by a certain search engine, and/or by different search engines (e.g., having different internal architectures and/or different internal implementations) that may search different product-related datasets, regardless of the implementation of the search engine, and optionally while maintaining additional search result ranking considerations.
The systems and/or methods (e.g., code instructions stored in a storage device executed by one or more processors) described herein improve performance of a search engine executing on a search server, by extending the ability of the search engine to execute search queries, which may be written using natural language, and to retrieve relevant results based on the search query. The performance of the search engine is improved without necessarily considering the implementation and/or modifying the product-related dataset that is normally searched by the search engine, by enhancing the general search engine index (that would otherwise be used by the search engine) using the features extracted from the product-related dataset, creating an enhanced indexed product dataset that is searched by the search engine using the interpreted search query based on the search query.
It is noted that the indexed product dataset may include the original dataset from which the product data was obtained, an external dataset external to the search engine, a dataset integrated with the search engine (e.g., the dataset currently being used by the search engine prior to enhancement), and/or a previous version of the indexed product dataset that is updated with the new structured object data.
The API (or other interface) described herein more accurately interprets the intent of the user entering the search query, improving the computational efficiency of the search engine by more accurately matching the search query to the available products. The search for products in the product dataset by the search engine is improved, to more accurately identify products relevant to the user's search query, and/or to more completely identify additional products that are relevant to the user's query and would not be available without using the API. For example, improving the order of the result, and/or improving recall of objects that would be found without implementation of the API (many of the matches being irrelevant). The total search time it takes the user to find the object the user is interested in is thereby reduced, improving utilization of processor(s) of the search server, for example, by the user performing a single search that produces relevant results, rather than the user performing multiple searches using refined queries to try and locate the relevant results.
The systems and/or methods (e.g., code instructions stored in a storage device executed by one or more processors) described herein improve an underlying technical process within the technical field of search engines that search for products in a product-related dataset. The systems and/or methods (e.g., code instructions stored in a storage device executed by one or more processors) described herein may improve an underlying technical process within the technical field of NLP.
The systems and/or methods (e.g., code instructions stored in a storage device executed by one or more processors) described herein are tied to physical real-life components, including search servers, and data storage devices.
Accordingly, the systems and/or methods described herein are inextricably tied to computing technology and/or are network-centric to overcome an actual technical problem arising in processing of search queries by a search server.
Before explaining at least one embodiment of the invention in detail, it is to be understood that the invention is not necessarily limited in its application to the details of construction and the arrangement of the components and/or methods set forth in the following description and/or illustrated in the drawings and/or the Examples. The invention is capable of other embodiments or of being practiced or carried out in various ways.
The present invention may be a system, a method, and/or a computer program product. The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention.
The computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
A non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing.
A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network. The network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.
Computer readable program instructions for carrying out operations of the present invention may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Python, Javascript, Smalltalk, C++ or the like, and conventional procedural programming languages, such as the “C” programming language or similar programming languages.
The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present invention.
Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions.
These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus or other device to produce a computer implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s).
In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special purpose hardware and computer instructions.
As used herein, the term natural language search query means a search query entered using human understood words and terms, in any language. The terms may be found in a dictionary, or may be commonly understood terms not necessarily found in a dictionary, for example, slang words. It is noted that the natural language search query may not necessarily be grammatically correct, or follow conventional language structure. For example, either of the following natural language search queries may be entered by a user to mean a query for a dress with a v-style neck which is also blue: “blue v-neck dress”, or “dress v-neck blue”, which is not even proper English.
The term search query used herein may refer to natural language search query, and/or to a query extracted from an image (also referred to herein as query image). It is noted that the image query is not necessarily used to search for image similarity. Features (e.g., textual based features) may be extracted from the image query for searching for objects that are similar in the enhanced index object dataset.
As used herein, the terms product, listing, and object may be interchanged, for example, when the objects being indexed and/or searched are not necessarily products, but represent other items that are searched by the search engine.
As used herein, reference to the natural language analyzer (e.g., 68 in
Reference is now made to
Model 100 includes inheriting concepts 22 which inherit attributes 30 directly from Product super-concept 20 and optionally may include further attributes 32. An inheriting concept 22 may restrict a value space 40 of an attribute 30 to possible values that are relevant to the concept 22. For example, inheriting concept 22 may include Electronics concept 22a, Furniture concept 22b, Clothing concept 22c, and/or any other suitable concept that inherit directly from Product super-concept 20. A value space 40a of Electronics Brand attribute 30a of Electronics concept 22a, is restricted to electronics brands and may exclude, for example, the value ‘Zara’ that may belong to a value space 40c of Clothing Brand attribute 30c of Clothing concept 22c.
Several inheriting concepts 24 may inherit from an inheriting concept 22. For example, Camera concept 24a, Computer concept 24b, Audio System concept 24c and/or any other suitable concept may inherit from Electronics concept 22a. Each of these may have inheriting concepts as well, and so on. For example, a Laptop Computer concept 26 may inherit from Computer concept 24b.
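The inheritance of attributes and the restriction of value spaces may be sketched as follows. The class name Concept, its fields, and the sample brand values other than ‘Zara’ are assumptions made for illustration; the sketch does not reproduce the data structures of model 100.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set

@dataclass
class Concept:
    """Hypothetical node of a hierarchic product model."""
    name: str
    parent: Optional["Concept"] = None
    # attribute name -> allowed values; None means an unrestricted value space
    attributes: Dict[str, Optional[Set[str]]] = field(default_factory=dict)

    def all_attributes(self):
        """Merge inherited attributes; nearer concepts restrict farther ones."""
        merged = self.parent.all_attributes() if self.parent else {}
        merged.update(self.attributes)
        return merged

    def inherits_from(self, other):
        """True when `other` is this concept or one of its ancestors."""
        node = self
        while node is not None:
            if node is other:
                return True
            node = node.parent
        return False

    def accepts(self, attribute, value):
        """Check a value against the (possibly restricted) value space."""
        attrs = self.all_attributes()
        if attribute not in attrs:
            return False
        space = attrs[attribute]
        return space is None or value in space

product = Concept("Product", attributes={"Brand": None, "Price": None})
electronics = Concept("Electronics", product, {"Brand": {"Sony", "Intel"}})
clothing = Concept("Clothing", product, {"Brand": {"Zara"}})
computer = Concept("Computer", electronics)
laptop = Concept("Laptop Computer", computer)
```

In this sketch the Laptop Computer concept receives the Brand attribute through Computer and Electronics, so the value ‘Zara’ is rejected for a laptop while remaining valid for clothing.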
Some attributes may include instances of concepts, for example concepts with specific values. For example, a Computer Processor attribute 34a of Computer concept 24b may include an instance of a Processor concept 12, and/or a concept inheriting from the Processor concept.
The Computer Processor concept represents both a stand-alone product that may be sold separately and a component of a Computer concept 24b. Similarly, a Memory attribute 34b of Computer concept 24b may include an instance of a Memory concept 14, and/or a concept inheriting from the Memory concept. Such instances may have their own attributes, i.e. secondary attributes. An instance value in a Computer Processor attribute 34a may have secondary attributes such as, for example, Memory Capacity attribute 341, Clock Speed attribute 342, Manufacturer attribute 343, and/or any other suitable attribute of a computer processor. Each attribute may have a corresponding possible value space. For example, a Memory Capacity attribute 361 of instances in a Memory attribute 36a of Laptop concept 26 has a possible value space 46a, possibly stored along with the distribution of possible memory capacity values.
Some attributes are enriched attributes, i.e. attributes that are calculated based on other attributes, based on formulas representing domain expert knowledge and/or based on automated machine learning code about relationships between the attributes. For example, a Portability enriched attribute 36b may have a value space 46b of values indicating how portable the laptop is. The values are calculated based on various attributes such as weight, size, battery life and/or other suitable attributes.
For example, Suitability for Students enriched attribute 36c may have a value space 46c, the values of which may be calculated based on price, portability and/or other suitable attributes. Some attributes may have Boolean values, such as true or false. For example, Boolean value attributes may be attributes that define a type and/or suitability for certain application and/or use, for example, Waterproof (yes or no) or For Students (yes or no), and/or attributes that define inclusion of a certain feature, such as SD Slot (included or not) or USB Port (included or not), and/or other suitable attributes. When attributes are not explicitly defined, the API may deduce the exclusion of the attribute. For example, for a laptop for which a USB port is not explicitly defined, the API may deduce that no USB port exists and create the attribute USB Port=NO.
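The calculation of an enriched attribute and the deduction of missing Boolean attributes may be sketched as follows. The portability formula, its weights and ranges, and the attribute names are hypothetical; in practice such formulas may represent domain expert knowledge or be learned automatically, as described above.

```python
# Hypothetical list of Boolean attributes that default to "not included".
BOOLEAN_ATTRIBUTES = ("usb_port", "sd_slot", "waterproof")

def clamp(value):
    """Limit a score to the range 0.0 to 1.0."""
    return max(0.0, min(1.0, value))

def portability(weight_kg, battery_hours):
    """Assumed formula: lighter and longer-lasting means more portable."""
    weight_score = clamp((3.0 - weight_kg) / 2.0)   # 1 kg or less scores 1.0
    battery_score = clamp(battery_hours / 12.0)     # 12 hours or more scores 1.0
    return round(0.6 * weight_score + 0.4 * battery_score, 2)

def enrich(product):
    """Return a copy of the product with enriched and deduced attributes."""
    enriched = dict(product)
    enriched["portability"] = portability(product["weight_kg"], product["battery_hours"])
    for name in BOOLEAN_ATTRIBUTES:
        # An attribute that is not explicitly defined is deduced to be absent.
        enriched.setdefault(name, False)
    return enriched
```

For a laptop record that lists an SD slot but says nothing about a USB port, enrich keeps the SD slot value and adds usb_port set to False, corresponding to USB Port=NO.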
An NLA according to some embodiments of the present invention analyses, by using model 100, various texts about products, such as, for example, user queries, specifications, titles, descriptions, overviews and/or reviews. Based on the analysis, the NLA produces a structured representation of the texts' meaning, formed of concepts, attributes, relations, values and/or constraints of model 100.
Reference is now made to
System 300 includes an interpretation server 60 and a product model repository 62. Product model repository 62 may store model 100. Interpretation server 60 may include at least one processor 64, which may control and/or execute an indexing engine 66 and/or an NLA 68. As described herein, the API (or other software interface) provides an interface to the NLA 68. Optionally, as described herein, search server 90 accesses NLA 68 using the API, to provide the search query and receive the interpreted search query. The API may be implemented as the outermost layer of NLA 68.
It is noted that the NLA 68 described herein and/or model 100 described herein are exemplary, not necessarily limiting, implementations that are accessed using the API. Other and/or additional code instructions defining other implementations may be used, for example, code instructions that analyze images, and/or code instructions that analyze structured features (i.e., non-natural language representations) and/or other code instructions that extract features to create the structured product data described herein and/or other code instructions that convert the search query (e.g., user entered, text and/or image) into the interpreted query described herein.
Interpretation server 60 may receive query input, for example a string of text relating to an e-commerce product, interpret the terms in the query input by NLA 68 and based on model 100, and create a modeled representation of the query input by indexing engine 66, in accordance with model 100. The query input may include text input, and/or structured data for example an attribute (e.g., color=blue), and/or an image.
It is noted that in some implementations (e.g., as described herein) search server 90 may store code instructions for executing indexing engine 66 to create indexed product dataset 70, when the implementation is based on the API providing structured product data based on the product-related data for integration in the enhanced indexed product dataset 70.
As described in more detail herein, modeled representations of product-related data from a data source 80 may be stored as structured product data that is integrated into an enhanced indexed product dataset 70. As further described in more detail herein, interpretation server 60 may use NLA 68 to generate representations of the search query according to model 100, thus enabling a search server 90 to perform a search in indexed product dataset 70 according to the modeled representation of the search query.
For a term in the query input, as indicated in block 210, NLA 68 finds in model 100 stored in product model repository 62 one or more corresponding term appearances, which may include, for example, the same term, a similar term and/or a synonym. Each term appearance represents an interpretation of the term. A term appearance may be a concept, an attribute, or a value. Each term appearance has a location in the model, indicative of related concepts, attributes and/or values. It is noted that terms may be obtained from images, for example, by image processing code that converts images and/or portions of images to text and/or to attributes.
As indicated in block 220, NLA 68 may detect relations between appearances of the text input terms. Then, as indicated in block 230, NLA 68 may detect appearance of a term as a high-level concept in model 100. A high-level concept appearance may be a concept of high or highest hierarchy, for example relative to other concepts and/or appearances of the received query input. For example, a high-level concept appearance may be a concept that concept appearances of other terms of the same input inherit from directly and/or indirectly, and/or that attribute appearances of other terms are attributes of the high level concept or an inheriting concept.
Such a high level concept appearance may be used to limit the possible interpretations of other terms. An interpretation that is not related to the detected high level concept may be rejected by NLA 68. At this stage, as indicated in block 240, NLA 68 may detect possible interpretations of at least some of the terms, i.e. determine that a term appearance is possibly a correct interpretation of the term in the specific query input, for example, according to a relation of an interpretation to the high level concept.
Some concepts, attributes and/or values in model 100 may limit the possible interpretations to certain concepts, attributes, value spaces and/or values of the attributes. NLA 68 may filter the term appearances according to appearances of other terms of the same query input, in order to find the possible interpretations. That is, as indicated in blocks 250 and 260, NLA 68 may reject interpretations that are not consistent with the already determined interpretations, and determine further possible term interpretations, and so on, until no more interpretations can be determined and/or rejected.
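The repeated rejection of inconsistent interpretations may be sketched as follows. This is a simplified illustration under stated assumptions: the miniature model in PARENT, the term appearances in APPEARANCES, and the rule that two appearances are consistent when one owning concept lies on the ancestor chain of the other are all hypothetical, and the sketch does not reproduce the exact logic of blocks 210 to 260.

```python
# Hypothetical miniature model: child concept -> parent concept.
PARENT = {
    "Laptop": "Computer", "Computer": "Electronics", "Electronics": "Product",
    "Fruit": "Food", "Food": "Product",
}

# Hypothetical term appearances: term -> list of (kind, name, owning concept).
APPEARANCES = {
    "laptop": [("concept", "Laptop", "Laptop")],
    "apple": [("value", "Brand=Apple", "Electronics"), ("concept", "Apple", "Fruit")],
}

def ancestors(concept):
    """Return the concept followed by all of its ancestors."""
    chain = [concept]
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain

def related(a, b):
    """Two concepts are related when one is on the other's ancestor chain."""
    return a in ancestors(b) or b in ancestors(a)

def filter_interpretations(terms):
    """Reject appearances that conflict with already determined ones."""
    candidates = {t: list(APPEARANCES.get(t, [])) for t in terms}
    changed = True
    while changed:  # repeat until nothing more can be rejected
        changed = False
        # Concepts fixed so far: owners of terms with exactly one appearance.
        fixed = [c[0][2] for c in candidates.values() if len(c) == 1]
        for term, cands in candidates.items():
            kept = [c for c in cands if all(related(c[2], f) for f in fixed)]
            if kept and len(kept) < len(cands):
                candidates[term] = kept
                changed = True
    return candidates
```

For the query terms “apple” and “laptop”, the single appearance of “laptop” fixes the Laptop concept, so the fruit interpretation of “apple” is rejected and only the brand value under Electronics remains.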
In some embodiments of the present invention, system 300 may communicate with a user to request clarifications regarding the input query. For example, when there is more than one possible interpretation, system 300 may ask the user for additional information to decide between the possible interpretations.
In some embodiments of the present invention, NLA 68 may receive along with a query input an indication of a product and/or a certain product instance, to which the query input relates.
In some embodiments, NLA 68 may receive along with a query input an indication of a concept and/or an attribute to which the query input relates. For example, NLA 68 may receive values of a processor's attribute along with an indication that the values pertain to a Processor concept, and/or along with an indication of a certain laptop model that the values pertain to. Accordingly, NLA 68 may limit the search for term appearances to attributes of a processor.
For example, NLA 68 may find an appearance of a term as a value of a Processor attribute, a Processor's Manufacturer attribute, a Clock Speed attribute, or a value of any other suitable processor's attribute. Since the term processor may have appearances as a concept or as an attribute which is also a concept, NLA 68 may also receive an indication of a concept to which the instance of the Processor attribute relates, such as a computer, a laptop computer, or another suitable machine. For example, NLA 68 may receive a query input “Intel 2.2 GHz” along with an indication that this is a value of a processor of a laptop. NLA 68 may search model 100 under the Laptop concept and the Processor attribute, and may find an appearance of “Intel” as a value of a Processor's Manufacturer attribute, and/or an appearance of “2.2 GHz” as a value of a Processor's Clock Speed attribute. Accordingly, NLA 68 may create a modeled product representation, wherein each value is ascribed to a certain concept and/or attribute.
For example, a certain Laptop product is represented by a Laptop concept instance, for which “Intel” is a value ascribed to a Manufacturer attribute of a processor concept/attribute, and/or 2.2 GHz is a value ascribed to a Clock Speed attribute of a processor concept/attribute.
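The “Intel 2.2 GHz” example may be sketched as follows. The value space of manufacturers, the clock speed pattern, and the dictionary layout of the modeled representation are assumptions made for the sketch; the indication of the Laptop concept and Processor attribute is passed in as plain arguments.

```python
import re

# Hypothetical value space for the Manufacturer secondary attribute.
MANUFACTURERS = {"intel", "amd"}
# Assumed pattern for a clock speed value such as "2.2 GHz".
CLOCK_SPEED = re.compile(r"(\d+(?:\.\d+)?)\s*ghz", re.IGNORECASE)

def model_processor_input(text, concept="Laptop", attribute="Processor"):
    """Ascribe each value in the input to a secondary processor attribute."""
    representation = {"concept": concept, attribute: {}}

    # A number followed by GHz is ascribed to the Clock Speed attribute.
    speed = CLOCK_SPEED.search(text)
    if speed:
        representation[attribute]["Clock Speed"] = speed.group(1) + " GHz"
        text = text[:speed.start()] + text[speed.end():]

    # Remaining tokens are checked against the Manufacturer value space.
    for token in text.split():
        if token.lower() in MANUFACTURERS:
            representation[attribute]["Manufacturer"] = token
    return representation
```

Because the search is limited to attributes of a processor, the token “Intel” is not tested against unrelated value spaces such as clothing brands.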
In some embodiments of the present invention, interpretation server 60 uses instances of concepts that are implemented in the model. For example, NLA 68 may recognize that “Intel” is an instance of a “brand” or “manufacturer” concept, based on implementation of this knowledge in the model. In some embodiments of the present invention, instances of concepts are generated, for example based on the NLA's ability to recognize such instances in the query input. For example, interpretation server 60 may add to the model an instance of a concept “chair”, such as “safety chair”, in case the safety chair instance is needed for the interpretation process.
As mentioned above, system 300 may produce an indexed product dataset 70 based on product-related data, for example product-related text, obtained, for example received and/or extracted, from a product-related dataset stored in a data source 80. Data source 80 may be, for example, an e-commerce website or a dataset gathering knowledge about products presented in a website. More specifically, indexing engine 66, controlled by interpretation server 60, may use NLA 68 to create indexed product dataset 70, by indexing product-related data according to model 100.
Product dataset 70 may be dedicated and/or formed according to a specific data source or to a group of data sources. The product-related data may include a product specification, a table of a product's features, a table of values of a product's features, a product title, a product description, and/or any other suitable features about a product.
Further reference is now made to
Reference is now also made to
System 600 includes a computing device 606, which may be implemented, for example, as a server, a web server, and/or a computing cloud. Computing device 606 may correspond to and/or be integrated with one or more components described with reference to interpretation server 60 of
Computing device 606 stores (e.g., in a program store 608 and/or data storage device 610) interpretation code 608A that converts a search query into an interpreted search query for execution by search engine 602, which is accessible using an interface 608B, optionally an API. API 608B conforms to one or more communication protocols defining transmission of data over network 616, for example, HTTP (hypertext transfer protocol). API 608B may be hosted by computing device 606, located externally and remotely from the search server hosting search engine 602. Computing device 606 stores feature extraction code 608C that extracts features that may be added to an index generally created by the search engine to create enhanced indexed product dataset 70 (and/or 602A), for example, as described with reference to system 300 of
Interpretation code 608A executed by computing device 606 is located externally to the code of search engine 602, and is implemented without modification to the code of search engine 602. Search engine 602 accesses interpretation code 608A using API 608B.
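The separation between the search engine and the externally located interpretation code may be sketched as a request handler, as follows. The JSON field names query and interpreted_query, the function names, and the trivial tokenizing interpreter are assumptions made for illustration; the transport (e.g., HTTP) is omitted, and the sketch is not the actual interface of API 608B.

```python
import json

def interpret_query(query):
    """Placeholder for the interpretation code; here it only tokenizes."""
    return {"terms": query.lower().split()}

def handle_request(body):
    """Hypothetical handler: raw JSON bytes in, raw JSON bytes out.

    The search engine only needs to send its query and read the reply,
    so its own code is not modified by changes to the interpreter.
    """
    payload = json.loads(body.decode("utf-8"))
    interpreted = interpret_query(payload["query"])
    return json.dumps({"interpreted_query": interpreted}).encode("utf-8")

# Example exchange, as the search engine side might perform it.
reply = json.loads(handle_request(b'{"query": "White Table"}'))
```

Keeping the handler free of transport details allows the same function to be placed behind an HTTP endpoint or called in-process when the components are integrated on a single device.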
Alternatively, system 600 may be implemented based on other architectures. For example, components of computing device 606, optionally the code instructions (e.g., 608A-C) may be integrated with and/or stored on search server 604. The integration may be implemented as, for example, a single computing device executing interpretation code 608A-C (in which case API 608B may conform with one or more programming languages), different virtual machines executing the respective features of search server 604 and computing device 606, or two locally connected devices where network 616 is a point-to-point network or a cable (or short range wireless link) connecting the two devices.
Computing device 606 includes one or more processors 612 (e.g., corresponding to processor 64) that execute interpretation code 608A-C. Processor(s) 612 may be implemented, for example, as a central processing unit(s) (CPU), a graphics processing unit(s) (GPU), field programmable gate array(s) (FPGA), digital signal processor(s) (DSP), and application specific integrated circuit(s) (ASIC). Processor(s) 612 may include one or more processors (homogenous or heterogeneous), which may be arranged for parallel processing, as clusters and/or as one or more multi core processing units.
Program store 608 stores code instructions implementable by processor(s) 612, for example, a random access memory (RAM), read-only memory (ROM), and/or a storage device, for example, non-volatile memory, magnetic media, semiconductor memory devices, hard drive, removable storage, and optical media (e.g., DVD, CD-ROM).
Computing device 608 may include a data storage device 610 for storing data. Data storage device 610 may be implemented as, for example, a memory, a local hard-drive, a removable storage unit, an optical disk, a storage device, and/or as a remote server and/or computing cloud (e.g., accessed using a network connection). It is noted that code instructions executable by processor(s) 612 may be stored in data storage device 610, for example, with executing portions loaded into program store 608 for execution by processor(s) 612. Data storage device 610 may store, for example, product model repository 62, hierarchic object (e.g., product) model 610A, and/or the created indexed product dataset 70 (e.g., enhanced indexed object dataset 602A transmitted to search engine 602).
Computing device 606 may include a communication interface 614, optionally a network interface, for connecting to a network 616. Communication interface 614 includes API 608B that provides communication to search engine 602, as described herein. Communication interface 614 may be implemented as, for example, one or more of, a network interface card, a wireless interface to connect to a wireless network, an internal bus to communicate with other components of a computing device, a software interface to communicate with other processes executing on the computing device, a physical interface for connecting to a cable for network connectivity, a virtual interface implemented in software, network communication software providing higher layers of network connectivity, and/or other implementations.
Exemplary network(s) 616 include, for example, the internet, a private network, a wireless network, a cellular network, an ad-hoc network, and/or a virtual network.
Search server 604 may be implemented as, for example, a web server, a network server, a computing cloud, and a virtual server.
Search engine 602 is accessed by one or more client terminals 618 over network 616. Exemplary client terminals 618 include: a server, a computing cloud, a mobile device, a desktop computer, a thin client, a Smartphone, a Tablet computer, a laptop computer, a wearable computer, a glasses computer, and a watch computer. Each client terminal 618 is associated with a display (e.g., touch screen, screen) and/or other device (e.g., microphone, Braille output) that presents the search engine 602 (e.g., GUI, microphone that reads the search results), and includes a mechanism (e.g., keyboard, microphone with speech to text software) for a user to enter the search query. Search engine 602 may be implemented, for example, as an off-the-shelf product (e.g., Elasticsearch, Solr), or custom made.
It is noted that when the search query is provided to the interpretation code (via the API) for analysis (e.g., as text), the user using client terminal 618 does not necessarily need to enter the search query as text, for example, a GUI may be used to convert graphical or menu selections into text, a speech-to-text interface (e.g., code) may be used to generate the text version from spoken words recorded by a microphone, and/or an image entered as the query may be analyzed to interpret what is in the image and used as a search query (e.g., an image of a white table taken by user may be analyzed to generate the search query white table). When the query is entered as an image, the image may be converted into a structured query, optionally the search query.
Search engine 602 searches enhanced indexed product dataset 602A using the received interpreted search query, as described herein. The search results generated by the search engine are presented on client terminal 618, for example, as a list of products on a display, as a message presented using a messaging application following an interaction with a conversational bot, a text-to-speech interface that reads out the search results, or other implementation. Computing device 606 includes or is in communication with a physical user interface 620 that includes a mechanism for a user to enter data and/or view presented data. Exemplary user interfaces 620 include, for example, one or more of, a touchscreen, a display, a keyboard, a mouse, and voice activated software using speakers and microphone.
The method described with reference to
The API described herein receives product-related data, which may include natural language product-related data, and/or other information (e.g., images, codes) for each product defined in a product-related dataset of an e-commerce data source that is searched by the search engine. The product-related data is used to extract features which are later used to enhance an indexed product dataset and/or modeled product representations by the computing device (as described herein). The extracted features are transmitted to the search server (or other storage device associated with the search engine) over the network using the API. As described herein, the search engine transmits the received search query (e.g., received from a user using a client terminal to access the search engine) to the computing device over the network using the API, and receives over the network using the API the interpreted search query used to search for products.
As indicated in block 410, interpretation server 60 may receive product-related data from data source 80. The product-related data may be received by the computing unit using the API.
As indicated in block 420, interpretation server 60 finds by NLA 68 interpretations of terms of the product-related data according to model 100, as described in detail with reference to
As indicated in block 430, indexing engine 66 stores the term interpretations as a modeled representation of the query input in indexed product dataset 70, i.e. an interpretation indexed according to model 100. One or more interpreted query inputs that relate to the same product may form a modeled product representation in indexed product dataset 70.
Interpretation server 60 may communicate using the API with a search server 90, which receives search queries, obtains from interpretation server 60 their modeled interpretation indexed according to model 100, and finds search results in indexed product dataset 70, by matching the modeled search query interpretation to corresponding modeled product representations stored in indexed product dataset 70.
Further reference is now made to
As mentioned with reference to
As indicated in block 520, interpretation server 60 finds by NLA 68 interpretations of terms of the search query according to model 100, as described in detail with reference to
As indicated in block 540, search server 90 may perform search in indexed product dataset 70, by detecting modeled product representations that match the modeled representation of the search query. As indicated in block 550, the detected matching product representations may be outputted by search server 90 as search results and/or query terms suggestions.
Optionally, the search server executes code (e.g., code which may be downloaded from the computing device (e.g., 606) or from another server) that amalgamates multiple different search result ranking considerations into a single search engine query. Exemplary considerations include search result relevance, platform business considerations, and the like. The code executed by the search server may interface with the API of the computing device.
NLA 68 may be configured to filter term interpretations in a pre-defined manner when corresponding pre-defined expressions indicative of relations, thresholds, and/or restrictions are included in the query input. For example, NLA 68 may be configured to perform filtering according to relative expressions such as, for example, less than, more than, between, cheap, large, small, best for, and/or other suitable relative expressions. For example, NLA 68 may be configured to interpret an expression, for example “cheap” or another suitable expression, as a command to filter out all products with a price above a predetermined threshold value. For example, the threshold value may depend on an identified product category, i.e. a product type concept identified in the query input. Other expressions may be interpreted by NLA 68 as commands to filter out products that don't fit a certain value. In some cases, NLA 68 is instructed by the command to calculate a certain value and/or threshold value based on one or more attributes, and filter out products that don't fit the value and/or the threshold value. Accordingly, NLA 68 may be configured to interpret various relative expressions as respective filtering commands, which may depend on pre-defined thresholds and/or values, and/or on identified concepts, attributes and/or values in the query input.
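As a non-limiting illustration, the interpretation of relative expressions as filtering commands may be sketched as follows. The per-category threshold values, the function name interpret_relative, and the sample products are hypothetical and serve only to show how an expression such as cheap may resolve to a category-dependent price threshold.

```python
from typing import Callable, Optional, Tuple

# Hypothetical thresholds for the expression "cheap", keyed by product category.
CHEAP_THRESHOLDS = {"smartphone": 700.0, "laptop": 900.0}


def interpret_relative(
    expression: str,
    *,
    category: Optional[str] = None,
    values: Tuple[float, ...] = (),
) -> Callable[[float], bool]:
    """Return a predicate that keeps a product when its price passes the filter."""
    if expression == "cheap":
        # The threshold depends on the product category identified in the query.
        limit = CHEAP_THRESHOLDS[category]
        return lambda price: price <= limit
    if expression == "less than":
        limit = values[0]
        return lambda price: price < limit
    if expression == "more than":
        limit = values[0]
        return lambda price: price > limit
    if expression == "between":
        low, high = sorted(values[:2])
        return lambda price: low <= price <= high
    raise ValueError(f"unsupported expression: {expression!r}")


products = [("phone A", 350.0), ("phone B", 999.0)]
is_cheap = interpret_relative("cheap", category="smartphone")
cheap_phones = [name for name, price in products if is_cheap(price)]
```

In this sketch, only phone A remains after the filter for cheap is applied to the smartphone category.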
In some cases, NLA 68 may identify an attribute to which a value in a query input refers, according to a space of possible values of the attributes. For example, NLA 68 may receive a memory capacity value that may refer to a computer RAM or to a computer hard drive. However, NLA 68 may identify that the memory capacity value is included in the value space of the attribute Memory Capacity of the attribute/concept Computer RAM, and excluded from the value space of the attribute Memory Capacity of the attribute/concept Computer Hard Drive, or vice versa. Therefore, NLA 68 may ascribe the memory capacity value to the suitable attribute/concept.
In some embodiments of the present invention, NLA 68 may provide a ranking to the term interpretations. For example, NLA 68 may receive a memory capacity value that may refer to a computer RAM or to a computer solid-state drive (SSD). In case the value is included in both value spaces, NLA 68 may ascribe a higher ranking to the more common interpretation. As described herein, the value spaces may be stored along with a distribution of values, i.e. the extent in which each value is used. Accordingly, NLA 68 may identify in which value space the received memory capacity value is used more frequently, and ascribe a higher ranking to the interpretation that includes the corresponding attribute/concept.
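The use of value spaces and value distributions to ascribe a value to an attribute/concept may be sketched, in a simplified and non-limiting manner, as follows. The value spaces, the frequencies, and the function name rank_interpretations are illustrative assumptions and do not reflect actual data of model 100.

```python
# Hypothetical value spaces: memory capacity in GB mapped to the relative
# frequency with which that value is used for each attribute/concept.
VALUE_SPACES = {
    "Computer RAM": {4: 0.20, 8: 0.40, 16: 0.30, 32: 0.10},
    "Computer SSD": {32: 0.05, 256: 0.45, 512: 0.30, 1024: 0.20},
}


def rank_interpretations(value_gb: int) -> list:
    """Return (concept, frequency) pairs whose value space contains the value,
    ordered so that the more common interpretation is ranked first."""
    candidates = [
        (concept, space[value_gb])
        for concept, space in VALUE_SPACES.items()
        if value_gb in space
    ]
    return sorted(candidates, key=lambda item: item[1], reverse=True)


unambiguous = rank_interpretations(8)   # only the RAM value space contains 8
ambiguous = rank_interpretations(32)    # both value spaces contain 32
```

A value found in a single value space yields one interpretation, while a value found in both value spaces yields two interpretations ranked by how frequently the value is used in each.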
In some cases, NLA 68 may identify an attribute to which a value in a query input refers, according to statistical knowledge implemented in model 100, ascribing customary attributes to concepts. For example, NLA 68 may receive a television size value. Model 100 may define the diagonal length as a main size attribute of a television. A customary attribute of a certain concept may be assigned with a higher score in model 100 than other attributes which are less suitable to the concept. For example, the diagonal length attribute of the television concept is assigned with a higher score than other size attributes such as width.
Therefore, NLA 68 may ascribe the value to a television diagonal length and not, for example, to a width, height, or depth of a television. Accordingly, NLA 68 may filter products according to this value of television diagonal length. In some other embodiments, NLA 68 may ascribe higher ranking to an interpretation that includes a television diagonal length and a lower ranking to an interpretation that includes a width, height, or depth of a television.
In some cases, NLA 68 may identify to which of several possible attributes and/or concepts a query input refers, according to the attributes each of the possible attributes and/or concepts may have. NLA 68 may receive a query input that includes an attribute name, which may be an attribute of only some kinds of possible attributes and/or concepts. Therefore, NLA 68 may filter out the attributes and/or concepts that don't have the named attribute. For example, since only some camera concepts include a prism, a query input that includes the term Prism may cause NLA 68 to filter out attributes and/or concepts that don't have a Prism attribute.
The hierarchic structure of model 100 may facilitate search results and/or interpretations that suit better the user's intention. For example, by interpreting terms according to model 100, NLA 68 may interpret two terms of a search query/query input as a concept and an attribute of the concept, respectively, rather than interpreting the two terms as two stand-alone concepts. For example, NLA 68 may receive a search query or another query input that includes the terms Smartphone and Camera. As described herein, NLA 68 may identify in the query input a high-level concept, i.e. a concept of high hierarchy and/or from which interpretations of other terms inherit or to which interpretations of other terms constitute attributes.
For example, NLA 68 may find that Camera is an attribute of a Smartphone concept, and that Smartphone is not an attribute of a Camera concept. Therefore, NLA 68 may identify Smartphone as a high-level concept and Camera as an attribute of the Smartphone concept. Accordingly, NLA 68 may interpret the query input as Smartphone with Camera. In case the query input further includes the term Sharp, NLA 68 may find in model 100 that Sharp is an attribute of the Camera attribute/concept, and interpret the query input as Smartphone with Sharp Camera.
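The identification of a high-level concept and of its nested attributes may be sketched as follows. This is a minimal sketch under the assumption that the relevant fragment of the hierarchic model can be represented as a mapping from each concept/attribute to the attributes it may have; the mapping contents and the function names are hypothetical.

```python
# Hypothetical fragment of a hierarchic model: each key lists its attributes.
MODEL = {
    "Smartphone": {"Camera", "Screen"},
    "Camera": {"Sharp", "Prism"},
}


def describe(term: str, terms: list) -> str:
    """Render a term preceded by those query terms that are its attributes."""
    children = [t for t in terms if t in MODEL.get(term, set())]
    modifiers = " ".join(describe(child, terms) for child in children)
    return f"{modifiers} {term}".strip()


def interpret(terms: list) -> str:
    """Find the single term that is not an attribute of any other query term
    (the high-level concept) and attach the remaining terms as attributes."""
    roots = [t for t in terms if not any(t in MODEL.get(o, set()) for o in terms)]
    if len(roots) != 1:
        raise ValueError("expected exactly one high-level concept")
    root = roots[0]
    attributes = [describe(t, terms) for t in terms if t in MODEL[root]]
    return f"{root} with {' and '.join(attributes)}" if attributes else root


two_terms = interpret(["Smartphone", "Camera"])
three_terms = interpret(["Sharp", "Smartphone", "Camera"])
```

Under these assumptions, the two-term input is interpreted as Smartphone with Camera, and the three-term input as Smartphone with Sharp Camera, regardless of the order in which the terms are received.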
As described herein, model 100 may include enriched attributes of concepts, i.e. attributes that are calculated based on other attributes and/or calculated based on lack of other attributes (e.g., USB port attribute missing from a laptop) and/or from an image. In case a term in a query input received by NLA 68 appears as an enriched attribute, it may restrict the possible interpretations like any other attribute. For example, a Stool concept in model 100 may include a counter height suitability attribute, with values calculated based on a Height attribute. Accordingly, NLA 68 may filter out products having a false value for the counter height suitability attribute, in case a query input includes the terms Counter Height Stool and/or synonymous terms. Similarly, NLA 68 may filter out products having a false value for the Suitability for Students attribute, in case a query input includes, for example, the terms Laptop for Students and/or synonymous terms.
As described herein, Suitability for Students attribute may be an enriched attribute calculated based on, for example, a Portability attribute and a Price attribute. The portability attribute may be an enriched attribute calculated based on, for example, a Weight attribute, a Size attribute, and/or a Battery Life attribute.
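By way of a non-limiting sketch, the chained calculation of enriched attributes may be expressed as follows. The numeric thresholds, the Laptop record, and the function names are assumptions introduced for illustration only; the Size attribute is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class Laptop:
    name: str
    weight_kg: float
    battery_hours: float
    price: float


def is_portable(laptop: Laptop) -> bool:
    """Enriched Portability attribute derived from Weight and Battery Life.
    The thresholds are illustrative assumptions."""
    return laptop.weight_kg <= 1.8 and laptop.battery_hours >= 8


def suitable_for_students(laptop: Laptop) -> bool:
    """Enriched Suitability for Students attribute derived from Portability
    (itself an enriched attribute) and Price."""
    return is_portable(laptop) and laptop.price <= 800


laptops = [
    Laptop("ultrabook", weight_kg=1.2, battery_hours=12, price=750),
    Laptop("workstation", weight_kg=3.1, battery_hours=4, price=2400),
]
# A query such as "laptop for students" filters out products for which the
# enriched attribute evaluates to false.
student_laptops = [l.name for l in laptops if suitable_for_students(l)]
```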
Reference is now made to
At 750, product-related data is obtained from product information server(s) 622 by computing device 606 using API 608B (or other software interface), optionally over network 616. Product information server(s) 622 may be associated with (and/or integrated with, and/or equivalent to) search server(s) 604 and/or search engine 602, and/or may include other external and/or remote servers storing data, for example, publicly accessible databases. The product-related data may be obtained by API 608B accessing the product-related data from product information server(s) 622 (and/or other data sources), and/or by product information server(s) 622 transmitting the product-related data to API 608B.
Product-related data is received for multiple products.
The received product-related data may include raw product listings, and/or structured (i.e., formatted) product listings. The product-related data may include the dataset searched by search engine 602, other data from the website of the entity associated with search engine 602, and/or data from external sources that may or may not be associated with search engine 602. The product listing may include one or more of: natural language information of the product, images, videos, content of links associated with the product, user queries, specifications, titles, descriptions, overviews, and reviews.
Other exemplary data includes: product features (e.g., product type such as chair and furniture, product attributes such as color, dimensions, material, possible product uses such as for outdoor use, for kids, for sports, and the like), product metadata (e.g., brand, release date, standard identification codes, manufacturer, model number, and the like), and/or listing features (e.g., price, reviews, shipping destinations, and the like).
Optionally, code associated with the search server organizes the product-related data in accordance with the structure of the indexed product dataset.
At 752, features are extracted from the received product-related data using feature extraction code 608C. Examples of the features extracted for each product include attributes of the products and/or other features described herein.
Structured data (also referred to herein as features), may be extracted directly from listing data (and/or other product-related data), and/or inferred. For example, when the product-related data includes the properties “product type=chair” and “color=navy”, the extracted features may include the properties “product type=1. chair; 2. furniture”, “color=1. navy; 2. blue”. The features (which are organized into the structured product data) may be inferred from certain values and/or relationships within the product-related data. For example, the feature “portable=false” may be inferred based on the structured product data lacking any mention of wheels, or the feature “seating capacity=3” may be inferred based on the product-related data describing the width of a sofa falling in a specific range.
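The direct extraction and the inference of features may be sketched as follows. The parent mapping, the listing fields, and the rule that infers portability from the absence of any mention of wheels are simplified assumptions used only to mirror the chair and navy example above.

```python
# Hypothetical hierarchy: each value maps to its more general parent value.
PARENT = {"chair": "furniture", "sofa": "furniture", "navy": "blue"}


def with_ancestors(value: str) -> list:
    """Return the value followed by its increasingly general ancestors."""
    chain = [value]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def extract_features(listing: dict) -> dict:
    """Build structured features from a raw listing."""
    features = {
        "product type": with_ancestors(listing["product type"]),
        "color": with_ancestors(listing["color"]),
    }
    # Inferred feature: the listing is treated as not portable when its
    # description lacks any mention of wheels.
    description = listing.get("description", "").lower()
    features["portable"] = "wheels" in description
    return features


listing = {
    "product type": "chair",
    "color": "navy",
    "description": "Solid oak dining chair",
}
features = extract_features(listing)
```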
At 754, structured data is created for each product using the respective features extracted from the product-related data received for each respective product. The structured data is created by structuring the set of features for each product.
For example, for each product, the structured data may be stored as values of attributes using a suitable data structure, optionally based on the data structure implementing the current indexed data used by the search engine, for example, a table, a graph, a tree, and/or a map.
The structured data may include a product (and/or object) representation of the respective product (and/or object) in accordance with the model (sometimes referred to herein as modeled product representation), as described herein.
At 756, the structured product data of each product for which product-related data was received (at 750) is transmitted, using API 608B, optionally over network 616 (and/or within internal computer busses and/or direct computer-to-computer links and/or other methods) to search server 604 and/or another computing device.
Search server 604 (and/or another computing device and/or the search engine) creates and/or updates enhanced indexed product dataset 602A by integrating the received structured product data with the existing product dataset being searched by search engine 602. Indexed object dataset 602A is made accessible for searching by search engine 602 using the interpreted search queries.
It is noted that the enhanced indexed product dataset may include the original dataset from which the product data was obtained, an external dataset external to the search engine that may be stored locally on a computing device storing the search engine and/or stored remotely on a remote server, a dataset integrated with the search engine (e.g., the dataset currently being used by the search engine prior to enhancement), and/or a previous version of the indexed product dataset that is updated with the new structured object data.
It is noted that the creation and/or updating of enhanced indexed product dataset 602A may be performed by the computing unit, and transmitted using API 608B to search engine 602.
Referring now to
The search query is provided by a user using a client terminal accessing the search engine, for example, by the user typing the query as text, the user uploading an image used as a query image, the user typing using Braille, and/or the user speaking into a microphone (or phone). The search query may be received by the search server executing the search engine. The search query defines a search for product(s) by the search engine.
Optionally, at 704, the search query is parsed into search terms, for example, into words where each word is separated by a space. The search terms may be natural language search terms, portions of the image, and/or portions of the recorded audio.
Optionally, the parsing is performed by identifying search terms that match to one or more model elements of the hierarchic object model. For example, for the search query “blue furniture”, the term blue and the term furniture are each identified as model elements of the hierarchic object model. An exemplary interpretation is “color should be blue or navy or light blue or dark blue or . . . ”, “product type should be furniture or table or chair or sofa or . . . ”.
Alternatively or additionally, the parsing may be performed by interpreting the search query based on linguistic rules and/or patterns, which may be stored, for example, as a set-of-rules, code, a look-up table, a mapping database, or other implementations. For example, for the search query smartphone under $1000, the term under may be interpreted as a quantitative inequality intent, which for the described example is based on price (i.e., price <1000). The term under may not necessarily be included in the hierarchic product model. In another example, for the search query cheap smartphone, the term cheap may be interpreted as price <700.
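A linguistic rule of the kind described above may be sketched using a regular expression. The pattern, the operator table, and the function name parse_price_intent are illustrative assumptions; an actual set-of-rules may be considerably richer.

```python
import re

# Expressions treated as quantitative inequality intents on price.
OPERATORS = {"under": "<", "less than": "<", "more than": ">"}
PATTERN = re.compile(
    r"\b(under|less than|more than)\s+\$?(\d+(?:\.\d+)?)", re.IGNORECASE
)


def parse_price_intent(query: str):
    """Split a query into its remaining terms and an optional price constraint
    expressed as an (attribute, operator, amount) tuple."""
    match = PATTERN.search(query)
    if match is None:
        return " ".join(query.split()), None
    keyword = match.group(1).lower()
    amount = float(match.group(2))
    # Remove the matched expression and normalize the remaining whitespace.
    remaining = " ".join((query[:match.start()] + query[match.end():]).split())
    return remaining, ("price", OPERATORS[keyword], amount)


terms, constraint = parse_price_intent("smartphone under $1000")
```

Here the term under is consumed by the rule rather than matched against the hierarchic product model, leaving smartphone as the remaining search term.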
The search terms may be extracted directly from the search query, and/or may be inferred from the search query.
One or more of the search terms may not necessarily correspond to products of the product-related dataset, and/or may not necessarily correspond to parts of products defined by the product-related dataset. The search terms are potentially unmatchable with product(s), for example with free-text based matching, when searched by the search engine without the conversion into the interpreted term described herein. In the case where the search engine finds results for the search terms, the matching results may be irrelevant and/or partial results. For example, the search terms may represent attributes and/or concepts that do not correspond to products, and therefore generate an error, bad recall, or irrelevant results when searched by the search engine. For example, the term blue may not necessarily yield results when used by the search engine to perform a standard search, since the color of products may not be a feature defined in the product-dataset. In another example, the term blue may yield results that are not ranked according to the user's intent, such as not ranking by color. The term blue may return only some of the relevant results because only some of the blue products include a feature for blue in a way that is understood by the search engine searching using standard methods (i.e. without the API), for example, in a field called color. Other examples of search terms that may not be found using standard methods include: for students, for babies, lasts a long time, and easy to repair.
It is noted that the search query may include one or more terms that correspond to products and/or parts of products.
Alternatively or additionally, the search query is analyzed in parts or as a whole, without necessarily performing the parsing.
At 706, one or more of the search terms (which may be analyzed in combination with the terms that correspond to products and/or parts of products) are converted into one or more interpreted query terms. It is noted that not all terms are necessarily converted. Some terms may be ignored. Some terms may be interpreted in pairs or groups.
The conversion may be performed based on at least some of synonyms, similar terms, hierarchy, and/or element distance of the model elements of the hierarchic product model that are matched to the search terms. For example, the search term notebook may be converted to the synonym laptop. For example, the search term sandals may be converted to the similar term flipflops. For example, the search term blue may be matched to a model element having the following hierarchical terms: blue, navy, light blue, and dark blue. For example, the search term furniture may be matched to a model element having the following hierarchical terms: table, chair, and sofa.
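The conversion of search terms using synonyms and hierarchy may be sketched as follows. The lookup tables and the function name convert are hypothetical stand-ins for the model elements of the hierarchic product model; terms that match no model element are ignored in this sketch.

```python
# Hypothetical synonym table and hierarchy of model elements.
SYNONYMS = {"notebook": "laptop"}
HIERARCHY = {
    "blue": ("color", ["blue", "navy", "light blue", "dark blue"]),
    "furniture": ("product type", ["furniture", "table", "chair", "sofa"]),
    "laptop": ("product type", ["laptop"]),
}


def convert(search_terms: list) -> dict:
    """Map each search term to an attribute and the list of values it covers."""
    interpreted = {}
    for term in search_terms:
        term = SYNONYMS.get(term, term)       # normalize synonyms first
        if term in HIERARCHY:
            attribute, values = HIERARCHY[term]
            interpreted.setdefault(attribute, []).extend(values)
    return interpreted


blue_furniture = convert(["blue", "furniture"])
notebook = convert(["notebook"])
```

In this sketch, blue furniture becomes a set of acceptable color values together with a set of acceptable product type values, and notebook resolves to the product type laptop.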
Each interpreted query term is selected to match one or more products when the searching engine searches the indexed product dataset.
Additional details of methods for performing the conversion are described herein, for example, with reference to
The enriched attribute is calculated based on at least one formula representing domain expert knowledge of relationships between the attributes of the product, as described herein. For example, the enriched attribute for kids may be computed based on dimensions of furniture that is suitable for kids, and/or made out of materials suitable for kids, and/or user reviews such as “my kids loved it!”. For example, the enriched attribute for travelers may be computed based on common dimensions of compartments, such as a luggage compartment, based on rugged materials, and based on reviews, for example, “took this on my trip to Australia!”.
The enriched attribute may be dynamically computed during the conversion, and/or pre-computed and stored in association with the product. The enriched attribute may be a Boolean parameter having a value calculated based on the formula. The Boolean parameter may denote a type and/or suitability of the respective product for a certain application and/or certain use. For example, the enriched attribute may denote a true or false value for the Boolean parameters for kids and for travelers.
Each search term may be converted into multiple interpreted query terms, for example, to clarify areas of ambiguity. For example, a keyboard may mean a keyboard of a computer used for typing, or may mean a keyboard used to play music. The meanings of the keyboard may be expressed using synonyms of the keyboard. In another example, search terms may be converted into the multiple interpreted query terms based on weighted similarity and/or entity distance, for example, the search term sandals (optionally with a weight) may be converted into flip flops (optionally with a weight).
Optionally, the conversion is personalized to the user that provided the search query. The conversion may be performed according to a preference for the interpreted query term by the user entering the search query. The user preference may be monitored and/or computed, for example, by code executing on the client terminal of the user and/or on the search engine and/or the search server that recognize the user (e.g., cookie) and record a history of entered search queries and/or a history of user selection. For example, when out of 100 searches based on the term chair, a certain user selects sun chairs 95% of the time, the conversion may be performed accordingly. For example, for the certain user the prototypicality measure (described below in greater detail) of sun chairs may be increased significantly by the search engine. In another example, the term sun chair may be substituted for the term chair when a search query is received from the certain user.
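The substitution variant of the personalized conversion may be sketched as follows. The function name personalize and the minimum share of 90% required before a substitution is made are assumptions chosen for illustration.

```python
from collections import Counter


def personalize(term: str, selections: list, min_share: float = 0.9) -> str:
    """Substitute the term with the product type the user selected most often,
    provided that the selection dominates the recorded history."""
    if not selections:
        return term
    choice, count = Counter(selections).most_common(1)[0]
    return choice if count / len(selections) >= min_share else term


# History of selections recorded for 100 searches based on the term chair.
history = ["sun chair"] * 95 + ["dining chair"] * 5
personalized_term = personalize("chair", history)
```

For a user with no recorded history, or with no dominant selection, the original term is left unchanged.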
Alternatively or additionally, the conversion is performed according to an analysis of behavior of a cluster of users that share one or more features with the user that provided the search query. The behavior may be learned (e.g., by code instructions) based on an analysis of search behavior of the cluster of users, and/or using a survey to collect data. Exemplary features include age, gender, and geographic location. For example, the learnt behavior is that women over the age of 60 mean simple phone when searching for phone, while every other population means smartphone. The conversion of the search term phone into simple phone or smartphone is performed according to the features of the user.
At 708, an interpreted search query is created based on the at least one interpreted query term. The interpreted search query is created as described herein, for example, as described with reference to
It is noted that blocks 706 and 708 may be executed as a single process and/or executed simultaneously.
The interpreted search query is formatted according to a query language of the search engine. For example, Boolean operators may be inserted into the interpreted search query. The formatting in the query language of the search engine may include instructions that are used by the search engine in deciding which results to return, for example, specific results, or compliance with a set-of-rules.
The interpreted search query may be created by encoding the names and/or values of attributes of the interpreted terms, with the operations (e.g., Boolean operators) remaining un-encoded. The operators may be used in the integration of the interpreted search query into the enhanced indexed product dataset.
The interpreted search query may be created based on synonyms of the search term. Multiple terms that are synonyms of one another may be converted to a single (or smaller number of) interpreted term. For example, a search query for laptop or notebook may be converted to one interpreted term, laptop, based on laptop and notebook being synonyms.
The interpreted search query may be created based on entity distance (i.e., weighted similarity), for example, the search term sandal may be used to create the interpreted search query flip flops. The entity distance may be defined using a weight and/or distance that defines the maximum distance (e.g., the degree of correlation) between the terms.
Based on the example above using the search terms blue and furniture, an exemplary interpretation (optionally encoded) is “color should be blue or navy or light blue or dark blue or . . . ”, “product type should be furniture or table or chair or sofa or . . . ”.
Optionally, the interpreted search query is associated with a mapping between each of the search terms and the corresponding converted interpreted query term. The mapping may be stored as, for example, code instructions, pointers, a mapping matrix, or other data structures stored in association with the interpreted search query. It is noted that the mapping may be used by the search engine (or other code associated with the search engine) to re-format the interpreted search query, rather than to interpret the original received search query. The mapping may be used, for example, to re-write the interpreted search query (e.g., by the search engine) and/or to provide an interactive user experience (e.g., presenting to the user a breakdown of the search for products using the search query). For example, attributes of products used in the search are marked on the screen. For a search for a large briefcase, the mapping may be used by the search engine to highlight the dimensions of the presented returned results. For example, the mapping may be used to remove the terms without wheels from the query chair without wheels. A structured component denoting the absence of wheels may be included, for example, wheels=false. The search may be performed using only the originally received term chair, and/or may be performed based on the structured component denoting the absence of wheels. The removal of terms may be performed in cases where a keyword based search with the removed terms has a negative impact on the results (e.g., generating results for wheels, or just chairs).
Alternatively or additionally, the interpreted search query is associated with a data-element (e.g., code instructions, other data structure) that is not associated with explicit intent of the user expressed in the search query, which is based on behavioral patterns of the user, statistical computations based on external data, and/or other external data sources.
Alternatively or additionally, the interpreted search query is associated with a data-element (e.g., code instructions, other data structure) defining an extent to which the respective interpreted query term is prototypical, denoting a quantitative measure of how much the respective interpreted query term conforms with a definition of a higher-level concept. For example, dining chairs may be considered “more of a chair” than sun chairs. The data-element may instruct the search engine to provide matching dining chairs with a higher ranking than sun chairs, as potential search results to the query term chair. Prototypicality may be implemented based on the more that a product appears in the product-dataset, the more prototypical the respective product. For example, if 80% of chairs in the product-dataset are dining chairs, and only 5% of the chairs are sun chairs, the assigned prototypicality measures instruct the search engine to return results according to the relative percentages. Prototypicality may be implemented based on the more that a concept shares properties with some higher level concept, the more prototypical it is considered. For example, the concept dining chair has more shared attributes with the concept chair, instructing the search engine accordingly. Prototypicality may be implemented based on the more that a product's attributes are average or common, the more the product is considered prototypical. For example, if the average chair height is 1.5 feet, the search engine is instructed to increase dining chairs' prototypicality compared to sun chairs, since dining chairs are closer to the average chair height than sun chairs.
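The frequency-based implementation of prototypicality may be sketched as follows. The catalog contents and the function name prototypicality are assumptions; the sketch covers only the first of the three implementations described above, in which the measure of a product type is its share of the product-dataset.

```python
from collections import Counter


def prototypicality(product_types: list) -> dict:
    """Score each product type by its share of the product-dataset."""
    counts = Counter(product_types)
    total = sum(counts.values())
    return {product_type: n / total for product_type, n in counts.items()}


# Hypothetical dataset of chairs: 80% dining chairs and 5% sun chairs.
catalog = ["dining chair"] * 80 + ["sun chair"] * 5 + ["office chair"] * 15
scores = prototypicality(catalog)

# Product types ordered from most to least prototypical for the term chair.
ranked = sorted(scores, key=scores.get, reverse=True)
```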
Alternatively or additionally, the interpreted search query is associated with a data-element (e.g., code instructions, other data structure) defining a negative intent of the respective interpreted query term, denoting non-existence of the respective interpreted query term. The interpreted search query may be formatted according to the query language to include an indication of the negative intent of the interpreted query term. For example, the Boolean operator NOT may be used to exclude results matching the interpreted query term. For example, the search query chair without wheels is converted and formatted to clarify to the search engine that the search is for a chair that does not have wheels, rather than for both chairs and wheels. The data-element may include instructions for ranking of results by the search engine based on semantic meanings. For example, using standard methods (i.e., not based on the instructions of the data-element), the search engine may present chairs with wheels as results of the query chairs without wheels, simply because chairs and wheels represent two out of the three words in the query, and without may not be understood by the search engine. The data-element associated with the interpreted search query of chairs without wheels may define instructions that express the fact that without is a negative term, and may be used by the search engine to assign higher ranks to results whose data does not include the word wheels.
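The handling of negative intent may be illustrated by the following minimal sketch. It assumes a small, hypothetical list of negation words and a generic Boolean query syntax using AND and NOT; the names NEGATION_WORDS, interpret_negation, and to_boolean_query are illustrative only, and an actual embodiment would rely on the linguistic rules and query language described herein.

```python
# Hypothetical list of words that negate the term that follows them.
NEGATION_WORDS = {"without", "no", "not"}

def interpret_negation(query):
    """Split a query into (term, negated) pairs; a negation word flags the next term."""
    terms = []
    negate_next = False
    for word in query.lower().split():
        if word in NEGATION_WORDS:
            negate_next = True
            continue
        terms.append((word, negate_next))
        negate_next = False
    return terms

def to_boolean_query(terms):
    """Format the terms with Boolean operators, prefixing negated terms with NOT."""
    parts = [("NOT " + term) if negated else term for term, negated in terms]
    return " AND ".join(parts)

interpreted = interpret_negation("chair without wheels")
boolean_query = to_boolean_query(interpreted)
```

In this sketch the word without is consumed as an operator rather than matched as a keyword, so the term wheels is excluded instead of being counted as a match.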
Alternatively or additionally, the interpreted search query is associated with a data-element (e.g., code instructions, other data structure) defining a distribution based on a value of an attribute associated with the interpreted object. The distribution is based on a value of an attribute, a combination of attributes, a concept, and/or a combination of concepts, for example, product-types, utilizations, and the like. The search engine may be instructed on the extent to which an attribute's value affects boosting based on the distribution. For example, for the search query table $500, the data-element may include a predefined distribution (e.g., Gaussian) for the price attribute of tables (that are potential search results) according to which the tables are to be ranked by the search engine. For example, a table with a price of $500 is assigned the highest rank boosting (e.g., a 1.1 factor), another table with a price of $450 is assigned a slightly lower boost (e.g., a 1.08 factor), and yet another table with a price of $400 is assigned an even lower boost (e.g., a 1.02 factor). Boost distributions may be associated with each attribute, and/or associated with values (e.g., the distribution for the search query “table price should be around 500” may be different from the distribution for the search query “table price should be 400”).
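A distribution-based boost may be illustrated by the following minimal sketch, assuming a Gaussian centered on the requested price. The function name gaussian_boost and the parameters sigma and max_boost are hypothetical, and the sketch is not intended to reproduce the exemplary factors given above, which depend on the particular predefined distribution.

```python
import math

def gaussian_boost(value, target, sigma, max_boost=1.1):
    """Boost factor that peaks at max_boost when value equals target and decays toward 1.0."""
    weight = math.exp(-((value - target) ** 2) / (2 * sigma ** 2))
    return 1.0 + (max_boost - 1.0) * weight

# Hypothetical tables priced at $500, $450, and $400 for the query "table $500".
boosts = {price: gaussian_boost(price, target=500, sigma=100) for price in (500, 450, 400)}
```

A narrower sigma may be used when the query implies an exact value and a wider one when the query implies an approximate value, which corresponds to associating different distributions with different queries.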
Alternatively or additionally, the interpreted search query includes instructions for the search engine to attempt to search additional data that is not available to the computing device, for example, not stored in the hierarchical model. For example, prices of products may not be stored in the hierarchical model, however, the interpreted query may include instructions to search according to price, based on the conversion of the search query.
Based on one or more of the described associated elements, the interpreted search query includes instructions for execution by the search engine for matching and ranking modeled product representations defined in an indexed product dataset matching the interpreted search query.
At 710, the interpreted search query is provided using the API, for execution by the search engine. The results of the search (optionally ranked) may be presented to the user on the client terminal, for example, in a GUI. The results may be provided to another process (e.g., executing code) for further processing, for example, to automatically purchase the highest ranked product, to read the results to the user (e.g., using a speaker of the client terminal, and/or as a phone call), and/or to store the results for use in future searches.
Optionally, the search engine executes the interpreted search query by matching the interpreted search query with the modeled product representation of each of the products defined in the indexed product dataset. Each of the modeled product representations includes model element(s) and/or hierarchic relations between model elements of the hierarchic object model and/or other representations that may be matched with the interpretation (not necessarily with the hierarchical model). It is noted that the indexed product dataset stores modeled product representations of the products that may include enhanced attribute(s) of the respective product corresponding to interpreted query terms. The enhanced attributes may not be included in the product-related dataset that would otherwise be searched by the search engine (using standard methods, without the interpreted search query). The search ability of the search engine is thereby enhanced with the ability to search for terms corresponding to the enhanced attributes.
Examples (which are not necessarily limiting) of search queries and the corresponding interpreted search queries include:
Search query: “black office chair”; interpreted search query: “product type=Office Chair, color=Black”.
Search query: “cheap smartphone with camera”; interpreted search query: “product type=Smartphone, price <=$250, Camera exists”.
Search query: “iphone 7 new”; interpreted search query: “product type=Smartphone, series=iPhone, model=7, condition=new”.
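The first example above may be illustrated by the following minimal sketch, which assumes a hypothetical lexicon that maps surface phrases to attribute and value pairs and uses a greedy longest-match lookup. The names LEXICON, interpret, and render are illustrative only; an actual embodiment would use the hierarchic object model and linguistic rules described herein.

```python
# Hypothetical lexicon mapping surface phrases to (attribute, value) pairs.
LEXICON = {
    "office chair": ("product type", "Office Chair"),
    "black": ("color", "Black"),
    "smartphone": ("product type", "Smartphone"),
}

def interpret(query, lexicon=LEXICON, max_phrase_len=3):
    """Greedy longest-match lookup of query phrases in the lexicon."""
    words = query.lower().split()
    terms = {}
    i = 0
    while i < len(words):
        for n in range(min(max_phrase_len, len(words) - i), 0, -1):
            phrase = " ".join(words[i:i + n])
            if phrase in lexicon:
                attribute, value = lexicon[phrase]
                terms[attribute] = value
                i += n
                break
        else:
            i += 1  # skip words the lexicon does not know
    return terms

def render(terms):
    """Render terms as 'attribute=value' pairs, listing the product type first."""
    ordered = sorted(terms.items(), key=lambda item: item[0] != "product type")
    return ", ".join(f"{attribute}={value}" for attribute, value in ordered)

interpreted = interpret("black office chair")
rendered = render(interpreted)
```

The longest-match step keeps office chair together as a single product type rather than interpreting office and chair separately.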
Alternatively (i.e., instead of utilizing the interpreted query as part of the search engine query generation) or additionally, the search engine (and/or another search process) may use the received interpreted search query as a higher level filtering and/or sorting mechanism for a set of results that have already been fetched by the search engine. The first set of results may be fetched by the search engine using the originally entered user query. The search engine may filter and/or sort the results by applying the interpreted search query.
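The use of the interpreted search query as a filtering and/or sorting mechanism over already fetched results may be illustrated by the following minimal sketch. It assumes each fetched result is a dictionary of attribute values; the function name apply_interpretation, the strict flag, and the sample results are hypothetical.

```python
def apply_interpretation(results, terms, strict=False):
    """Filter and/or sort already fetched results using interpreted terms."""
    def matches(result):
        # Number of interpreted terms that the result satisfies.
        return sum(1 for attr, value in terms.items() if result.get(attr) == value)

    if strict:
        # Filtering: keep only results that satisfy every interpreted term.
        results = [r for r in results if matches(r) == len(terms)]
    # Sorting: sorted() is stable, so ties keep the engine's original order.
    return sorted(results, key=matches, reverse=True)

# Hypothetical first set of results fetched with the originally entered query.
fetched = [
    {"id": 1, "product type": "Office Chair", "color": "Red"},
    {"id": 2, "product type": "Desk", "color": "Black"},
    {"id": 3, "product type": "Office Chair", "color": "Black"},
]
terms = {"product type": "Office Chair", "color": "Black"}

sorted_ids = [r["id"] for r in apply_interpretation(fetched, terms)]
filtered_ids = [r["id"] for r in apply_interpretation(fetched, terms, strict=True)]
```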
The API (or other interface) may return search engine agnostic interpreted queries (i.e., that are translated to search engine queries by code associated with the search engine), and/or search engine specific interpreted queries (i.e., which are already in the search engine's query language used as is to fetch results, and/or consolidated into a more comprehensive search engine query, as described herein), and/or interpreted queries that may be executed by a family of search engines (e.g., search engines developed using a common development framework, common libraries, and other methods).
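The translation of a search engine agnostic interpreted query into search engine specific forms may be illustrated by the following minimal sketch. It assumes the agnostic form is a dictionary of field names and values, and shows two possible targets: a Lucene-style query string and an Elasticsearch-style bool/filter body expressed as a plain dictionary. The field names and the function names to_lucene_string and to_bool_filter are hypothetical, and real deployments would need escaping and field mapping that are omitted here.

```python
def to_lucene_string(terms):
    """Render the terms as a Lucene-style query string of quoted field clauses."""
    return " AND ".join(f'{field}:"{value}"' for field, value in terms.items())

def to_bool_filter(terms):
    """Render the terms as an Elasticsearch-style bool/filter body (a plain dict)."""
    return {"bool": {"filter": [{"term": {field: value}} for field, value in terms.items()]}}

# Hypothetical engine agnostic interpreted query.
agnostic = {"product_type": "Office Chair", "color": "Black"}

lucene_query = to_lucene_string(agnostic)
bool_query = to_bool_filter(agnostic)
```

Keeping the agnostic form separate from the translators allows the same interpretation to be reused across a family of search engines, with only the translation code differing.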
Optionally, the computing device includes code executable by the hardware processor for creating the indexed product dataset for use by the search engine to match the interpreted search query with modeled product representations defined in the indexed product dataset. The creation of the indexed product dataset is described herein, for example, with reference to
Optionally, processor(s) of the search server executes code instructions that incorporate the received interpreted search query with additional search instructions, for example, search result ranking instructions, such as shopper behavioral data (e.g., listing click rates, listing purchase rates, and the like), free text relevance enhancement logic, promotional result boosting, and the like. The code instructions that define handling of the received interpreted search query may be implemented, for example, as additional instructions stored as part of the interpreted search query, and/or as customized code instructions stored by the search server accessible by the search engine that define handling by the certain search engine.
The customized code instructions may define the significance of the interpreted search query according to, for example, each e-commerce platform's preference. The customized code instructions may change over time and/or according to different conditions and/or situations.
The interpreted search query may define the only ranking consideration for the search engine, in which case the search engine may execute the interpreted search query in its current form. The interpreted search query may be one of multiple ranking considerations, in which case the extent to which the interpreted search query affects the search result ranking may be adjusted by the customized code instructions. The interpreted search query may be ignored by the search engine. The interpreted search query may be broken down by the customized code into components. Each component may be assigned a certain weight, such that some components affect the search result ranking more than others.
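The weighting of components may be illustrated by the following minimal sketch, assuming an additive scoring scheme. The function name final_score, the parameter interpretation_weight, and the sample weights are hypothetical; actual customized code instructions may combine ranking considerations differently.

```python
def final_score(base_score, matched_components, component_weights, interpretation_weight=1.0):
    """Combine the engine's own score with weighted interpreted-query components."""
    interpretation_score = sum(component_weights.get(c, 0.0) for c in matched_components)
    return base_score + interpretation_weight * interpretation_score

# Hypothetical weights: the product type matters more than the color.
weights = {"product type": 2.0, "color": 0.5}
matched = ["product type", "color"]

full = final_score(1.0, matched, weights)                                # interpretation fully applied
partial = final_score(1.0, matched, weights, interpretation_weight=0.5)  # one of several considerations
ignored = final_score(1.0, matched, weights, interpretation_weight=0.0)  # interpretation ignored
```

Setting interpretation_weight to zero corresponds to the case in which the interpreted search query is ignored, and intermediate values correspond to the case in which it is one of multiple ranking considerations.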
Optionally, each interpreted query term (e.g., the names and/or values of interpreted term) of the interpreted search query matches one or more products (e.g., names and/or values of the products) of the product-related dataset and/or the indexed product dataset searched by the search engine.
Reference is now made to
At 852, catalogue product information (e.g., product-related data) is transmitted from a catalogue 870 (and/or product dataset, and/or object dataset, and/or the index of the search engine itself) to a computing device (e.g., server, code executing on a local computer) using API (or other software interface) 872, and the computing device extracts structured features from the product information. As used herein, the term listings means object-related data of the objects.
At 854, the structured features (also referred to herein as structured object data) are transmitted from the computing device (e.g., server, and/or code executing on a local computer) using API 872 to an indexing process associated with a search engine 874. The indexing process adds the structured features (i.e., structured object data) to the index of search engine 874 together with other information search engine 874 generally indexes in the search engine index to create an enhanced indexed product dataset.
At 856, a search query entered by a user using a front end device 876 (e.g., client terminal presenting a GUI) is received by search engine 874, and at 858 is transmitted to API 872 for conversion into an interpreted query.
At 860, the interpreted query (also referred to as interpretations) is transmitted from API 872 to search engine 874 for execution using the enhanced indexed product dataset.
At 862, the results obtained from executing the interpreted search query on the enhanced indexed product dataset are provided to the user, optionally to front end device 876 (e.g., for presentation on the GUI presented on the client terminal).
Reference is now made to
Three dataflow processes are depicted by the dataflow diagram. The dataflow processes correspond to methods and/or systems described herein. Call out numbers 902A-I denote the dataflow for creating the enhanced indexed product dataset. Call out numbers 904A-D denote the dataflow for converting the search query into the interpreted search query. Call out numbers 906A-E denote execution of the interpreted search query by the search engine using the enhanced indexed product dataset.
At 902A, product-related data is provided by a catalog repository 950 (and/or product-related dataset and/or object-related dataset, optionally of an e-commerce datasource) associated with a search server 952 (also referred to herein as E-commerce Platform Back-end). As used herein, the terms catalog, product-related dataset, and object-related dataset may be interchanged, and may refer to the index of the search engine used as the information source of the products. Catalog repository 950 may be external to search engine 960, may include the enhanced indexed product dataset, and/or may be integrated within search engine 960, for example, as the indexed dataset currently being searched by search engine 960.
The product-related data defines product elements of each product. The product-related data (which may be natural language information, structured data, free text, and/or non-textual information such as images and/or videos) is provided for objects (e.g., products) stored in catalog repository 950.
At 902B, a listing indexing module (e.g., code instructions executed by processor(s) of search server 952 and/or another server) optionally normalizes the product-related data, for example, processes the product-related data into a format for extraction of features, for example, organizes the product-related data into a defined data structure.
At 902C, the product-related data of the products (optionally the normalized information) is transmitted to code 956 executing on a computing device (e.g., a server, code executing on a local computer, optionally including a Natural Language Engine (NLE) for example, corresponding to server 606 and/or NLA 68 and/or interpretation server 60), optionally using an API and/or other software interface (as described herein).
At 902D, code 956 extracts features from the product-related data and computes structured product data using the extracted features for each of the products, as described herein. Briefly, the structured product data may be created by the following exemplary method: extracting product elements (i.e., features) from the product-related data, selecting model elements from a hierarchic product model which may include synonyms, image features (e.g., extracted by image processing) and/or similar terms of the product elements (i.e., features) of the product, extracting from the product model hierarchic relations between the model elements, creating a modeled product representation of the respective product by combining: the selected model elements and the hierarchic relations between the model elements, and creating the index product data using the modeled product representation.
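The exemplary method for creating a modeled product representation may be illustrated by the following minimal sketch. It assumes a tiny, hypothetical hierarchic model stored as a child-to-parent mapping and a hypothetical synonym table, and uses simple substring matching in place of the feature extraction described herein. The names PARENT, SYNONYMS, ancestors, relations_for, and model_product are illustrative only.

```python
# Hypothetical hierarchic model: each concept points to its parent concept.
PARENT = {"dining chair": "chair", "sun chair": "chair", "chair": "furniture"}
# Hypothetical synonym table mapping surface terms to model elements.
SYNONYMS = {"dinner chair": "dining chair", "deck chair": "sun chair"}

def ancestors(element):
    """Return the chain of broader concepts above a model element."""
    chain = []
    while element in PARENT:
        element = PARENT[element]
        chain.append(element)
    return chain

def relations_for(element):
    """Return the (child, parent) relations from an element up to the root."""
    pairs = []
    while element in PARENT:
        pairs.append((element, PARENT[element]))
        element = PARENT[element]
    return pairs

def model_product(listing_text):
    """Combine the selected model elements with their hierarchic relations."""
    text = listing_text.lower()
    # Select model elements whose names or synonyms appear in the listing.
    matched = {SYNONYMS.get(t, t) for t in list(PARENT) + list(SYNONYMS) if t in text}
    # Keep only the most specific elements; broader ones are implied by the relations.
    implied = {a for e in matched for a in ancestors(e)}
    elements = sorted(matched - implied)
    relations = [pair for e in elements for pair in relations_for(e)]
    return {"elements": elements, "relations": relations}

representation = model_product("Oak dinner chair, set of 2")
```

Because the representation carries the hierarchic relations, a query for chair or furniture may match this listing even though the listing text only mentions dinner chair.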
At 902E, the structured product data of the products is transmitted to search server 952 (and/or another computing device associated with search engine 960).
At 902F, a listing index module 954 (e.g., code executed by processor(s)) may perform additional processing on the structured product data.
At 902G, the structured product data may be combined with the product-related data stored in catalogue repository 950 to create the enhanced indexed product dataset, by converting at 902H the product-related data into a format based on the structured product data. Alternatively or additionally, an existing enhanced indexed product dataset is updated by integration with the received structured product data. Alternatively or additionally, the structured product data is integrated to create the enhanced indexed product dataset.
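The combination of the structured product data with the product-related data may be illustrated by the following minimal sketch. It assumes that catalog records and structured product data share a product identifier; the function names build_enhanced_index and update_enhanced_index, the field name structured, and the sample records are hypothetical.

```python
def build_enhanced_index(catalog_records, structured_data):
    """Merge each catalog record with its structured data, keyed by product id."""
    index = {}
    for record in catalog_records:
        entry = dict(record)  # copy so the catalog record is not mutated
        entry["structured"] = dict(structured_data.get(record["id"], {}))
        index[record["id"]] = entry
    return index

def update_enhanced_index(index, structured_data):
    """Integrate newly received structured data into an existing enhanced index."""
    for product_id, fields in structured_data.items():
        if product_id in index:
            index[product_id]["structured"].update(fields)
    return index

# Hypothetical catalog records and structured product data.
catalog = [{"id": "p1", "title": "Oak dinner chair"}, {"id": "p2", "title": "Glass vase"}]
structured = {"p1": {"product type": "Dining Chair"}}

enhanced = build_enhanced_index(catalog, structured)
enhanced = update_enhanced_index(enhanced, {"p1": {"material": "Oak"}})
```

The first function corresponds to creating the enhanced indexed product dataset, and the second corresponds to updating an existing enhanced indexed product dataset with received structured product data.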
At 902I, access to the enhanced indexed product dataset is provided to search engine 960.
At 904A, a user uses a search input interface (e.g., GUI, search engine web page presented on a display of a client terminal) 962 to provide an input text and/or image defining a search query for one or more products stored in catalog repository 950.
At 904B, the search query is transmitted to code 956 using the API, optionally by search logic module 958.
At 904C, the search query is converted to the interpreted search query by code 956 (as described herein).
At 904D, the interpreted search query is transmitted to server 952, using the API.
At 906A, the interpreted search query may be rewritten by search logic module 958 and/or server 952, for example, based on a set of customized rules, as described herein.
At 906B, the interpreted search query is provided to search engine 960.
At 906C, search engine 960 performs the search using the interpreted search query and the indexed product dataset, and optionally ranks the results based on instructions (e.g., received in association with the interpreted search query, and/or customized instructions).
At 906D, the search results are retrieved.
At 906E, the search results are transmitted to the client terminal of the user and presented in a search result page 964, and/or read to the user (e.g., using a speaker and/or as a phone call), and/or used by another process, and/or re-ranked by other code of the search engine.
Search input interface 962 and search result page 964 may define an E-commerce Platform Front-end 966, for example, a GUI, and/or a web page presented on a display of a client terminal.
The descriptions of the various embodiments of the present invention have been presented for purposes of illustration, but are not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application or technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
As used herein the term “about” refers to ±10%.
The terms “comprises”, “comprising”, “includes”, “including”, “having” and their conjugates mean “including but not limited to”. This term encompasses the terms “consisting of” and “consisting essentially of”.
The phrase “consisting essentially of” means that the composition or method may include additional ingredients and/or steps, but only if the additional ingredients and/or steps do not materially alter the basic and novel characteristics of the claimed composition or method.
As used herein, the singular form “a”, “an” and “the” include plural references unless the context clearly dictates otherwise. For example, the term “a compound” or “at least one compound” may include a plurality of compounds, including mixtures thereof.
The word “exemplary” is used herein to mean “serving as an example, instance or illustration”. Any embodiment described as “exemplary” is not necessarily to be construed as preferred or advantageous over other embodiments and/or to exclude the incorporation of features from other embodiments.
The word “optionally” is used herein to mean “is provided in some embodiments and not provided in other embodiments”. Any particular embodiment of the invention may include a plurality of “optional” features unless such features conflict.
Throughout this application, various embodiments of this invention may be presented in a range format. It should be understood that the description in range format is merely for convenience and brevity and should not be construed as an inflexible limitation on the scope of the invention. Accordingly, the description of a range should be considered to have specifically disclosed all the possible subranges as well as individual numerical values within that range. For example, description of a range such as from 1 to 6 should be considered to have specifically disclosed subranges such as from 1 to 3, from 1 to 4, from 1 to 5, from 2 to 4, from 2 to 6, from 3 to 6 etc., as well as individual numbers within that range, for example, 1, 2, 3, 4, 5, and 6. This applies regardless of the breadth of the range.
Whenever a numerical range is indicated herein, it is meant to include any cited numeral (fractional or integral) within the indicated range. The phrases “ranging/ranges between” a first indicate number and a second indicate number and “ranging/ranges from” a first indicate number “to” a second indicate number are used herein interchangeably and are meant to include the first and second indicated numbers and all the fractional and integral numerals therebetween.
It is appreciated that certain features of the invention, which are, for clarity, described in the context of separate embodiments, may also be provided in combination in a single embodiment.
Conversely, various features of the invention, which are, for brevity, described in the context of a single embodiment, may also be provided separately or in any suitable subcombination or as suitable in any other described embodiment of the invention. Certain features described in the context of various embodiments are not to be considered essential features of those embodiments, unless the embodiment is inoperative without those elements.
Although the invention has been described in conjunction with specific embodiments thereof, it is evident that many alternatives, modifications and variations will be apparent to those skilled in the art. Accordingly, it is intended to embrace all such alternatives, modifications and variations that fall within the spirit and broad scope of the appended claims.
All publications, patents and patent applications mentioned in this specification are herein incorporated in their entirety by reference into the specification, to the same extent as if each individual publication, patent or patent application was specifically and individually indicated to be incorporated herein by reference. In addition, citation or identification of any reference in this application shall not be construed as an admission that such reference is available as prior art to the present invention. To the extent that section headings are used, they should not be construed as necessarily limiting.
Claims
1. A system for converting a search query into an interpreted search query for searching a plurality of objects of an object-related dataset using a search engine, comprising:
- at least one hardware processor executing a code for: receiving, using a software interface, a search query provided by a user for at least one object of the plurality of objects; parsing the search query to identify a plurality of search terms; converting at least one of the plurality of search terms into at least one interpreted query term; creating an interpreted search query based on the at least one interpreted query term; and providing, using the software interface, the interpreted search query for execution by the search engine by matching the interpreted search query with at least one modeled object representation of each of the plurality of objects defined in an enhanced indexed object dataset.
2. The system of claim 1, wherein each of the modeled object representations includes at least one model element and hierarchic relations between model elements of a hierarchic object model.
3. The system of claim 2, wherein the conversion of each of the plurality of search terms into at least one interpreted query term is based on matching at least one of: synonyms, similar terms, hierarchy, and element distance of the matching model element of the hierarchic object model.
4. The system of claim 2, wherein each of the plurality of search terms is parsed by at least one of: matching to at least one model element of the hierarchic object model of the plurality of objects, and based on linguistic set-of-rules.
5. The system of claim 2, wherein each search term is analyzed into a plurality of interpretations matching to a plurality of model elements of the hierarchic object model, and converted into a plurality of interpreted query terms based on the plurality of interpretations.
6. The system of claim 2, wherein the converting at least one of the plurality of search terms into at least one interpreted query term is performed based on each matching model element of the hierarchic object model.
7. The system of claim 2, wherein the hierarchic object model comprises at least one enriched attribute of the at least one model element, wherein the at least one enriched attribute is calculated based on at least one of: on other attributes of a concept of the respective object and metadata of the respective object, wherein a concept is a model element representing a certain kind of objects or parts of objects and an attribute is a model element representing properties of the concept.
8. The system of claim 7, wherein the metadata of the respective object includes one or more of the following: image, price, text, user information input, specification, title, description, overview, and review.
9. The system of claim 7, wherein the enriched attribute is calculated based on at least one of: a formula representing domain expert knowledge of relationships between the attributes, and an automated machine learning process.
10. The system of claim 7, wherein the enriched attribute is a Boolean value denoting at least one of a type and a suitability for at least one of a certain application and a certain use.
11. The system of claim 1, wherein the search query comprises an input text provided by the user defining a natural language search query.
12. The system of claim 1, wherein the search query comprises an image.
13. The system of claim 1, wherein the software interface comprises an application programming interface (API) in communication with a search server hosting the search engine.
14. The system of claim 13, wherein the API is hosted by an interpretation server located externally and remotely from the search server, wherein the interpretation server and the search server communicate over a network using the API.
15. The system of claim 1, further comprising code instructions executable by the at least one hardware processor for creating structured data for each of the plurality of objects, the code for:
- receiving using the software interface, object-related data for each of the plurality of objects;
- extracting features from the object-related data for each of the plurality of objects;
- creating structured data for each of the plurality of objects using the corresponding extracted features;
- and transmitting, using the software interface, the created structured data for each of the plurality of objects to a server associated with the search engine for creation of the enhanced indexed object dataset by integration of the structured data with an existing object dataset searched by the search engine.
16. The system of claim 15, wherein the object-related data includes one or more members selected from the group consisting of: natural language object-related data, images, videos, content of links associated with the product, user queries, specifications, titles, descriptions, overviews, and reviews.
17. The system of claim 1, wherein the code executed by the at least one hardware processor is external to the code of the search engine, wherein the code is implemented without modification to the code of the search engine apart from code associated with communication between the search engine and the software interface.
18. The system of claim 1, wherein at least one of the plurality of search terms does not correspond to objects defined by the object-related dataset.
19. The system of claim 1, wherein the interpreted search query is at least one of: formatted in a query language of the search engine, and formatted in a format that is designed for translation to a plurality of engine query formats of a corresponding search engine of a plurality of search engines.
20. The system of claim 1, wherein the interpreted search query is associated with a mapping defining which of the plurality of search terms corresponds to which of the at least one interpreted query term.
21. The system of claim 1, wherein the at least one interpreted query term is associated with a data-element defining an extent to which the respective interpreted query term is prototypical denoting a quantitative measure of how much the respective interpreted query term conforms with a definition of a higher-level concept.
22. The system of claim 1, wherein the at least one interpreted query term is associated with a data-element defining a negative intent of the respective interpreted query term denoting non-existence of the respective interpreted query term.
23. The system of claim 1, wherein the at least one interpreted query term is associated with a distribution based on a value of at least one of: an attribute, a concept, a combination of attributes, and a combination of concepts associated with the at least one interpreted term.
24. The system of claim 1, wherein the conversion is performed according to a preference for the at least one interpreted query term by the user entering the search query.
25. The system of claim 1, wherein the interpreted search query includes instructions for execution by the search engine for matching and ranking the matching modeled object representation defined in the indexed object dataset.
26. A method of converting a search query into an interpreted search query for searching a plurality of objects of an object-related dataset using a search engine, comprising:
- receiving, using a software interface, a search query provided by a user for at least one object of the plurality of objects;
- parsing the search query to identify a plurality of search terms;
- converting at least one of the plurality of search terms into at least one interpreted query term;
- creating an interpreted search query based on the at least one interpreted query term; and
- providing, using the software interface, the interpreted search query for execution by the search engine by matching the interpreted search query with at least one modeled object representation of each of the plurality of objects defined in an enhanced indexed object dataset.
27. A system for enhancing a search of a plurality of objects of an object-related dataset using a search engine, comprising:
- at least one hardware processor executing a code for: receiving, using a software interface, object-related data for each of the plurality of objects of the object-related dataset; extracting features from the object-related data for each of the plurality of objects; creating structured data for each of the plurality of objects using the extracted features from the respective object of the plurality of objects; and transmitting, using the software interface, the created structured data for each of the plurality of objects to a server associated with the search engine for creation of an enhanced indexed object dataset by integration of the structured data with an existing object dataset searched by the search engine; receiving, using the software interface, a search query provided by a user for at least one object of the plurality of objects; parsing the search query to identify a plurality of search terms; converting at least one of the plurality of search terms into at least one interpreted query term; creating an interpreted search query based on the at least one interpreted query term; and providing, using the software interface, the interpreted search query for execution by the search engine using the enhanced indexed object dataset.
Type: Application
Filed: Jan 26, 2017
Publication Date: Mar 29, 2018
Inventors: Noa GANOT (Givataim), Tal Koren (Modi'in-Maccabim-Re'ut), Avishay Lavie (Tel-Aviv), Iddo Lev (Tel-Aviv), Eli Shalom (Tel-Aviv), Adi Avidor (Tel-Aviv)
Application Number: 15/416,337