DATA VISUALIZATION INTERFACE
Provided is a method of preparing a data-visualization interface. The method may include receiving a request to view data; retrieving a display configuration responsive to the request to view data, the display configuration identifying two or more context dimensions and three or more sub-display dimensions; retrieving data responsive to the request to view data; designating, in visualization data, a portion of the data as visible data and a portion of the data as cache data based on the display configuration; assigning, in the visualization data, positioning data to a plurality of sub-displays based on the two or more context dimensions, the sub-displays displaying data markers; assigning, in the visualization data, a portion of the cache data to groups of the sub-displays that are co-linear based on the context dimensions; and transmitting the visualization data.
The present application is a continuation of U.S. patent application Ser. No. 14/046,427, filed Oct. 4, 2013, which is a continuation of U.S. patent application Ser. No. 13/355,248, filed Jan. 20, 2012, which claims the benefit of U.S. Provisional Application 61/461,682, filed Jan. 22, 2011, each of which is incorporated by reference in its entirety for all purposes.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to a system and tools for data visualization and, more specifically, to a system and tools for generating a data visualization interface.
2. Description of the Related Art
This section is intended to introduce the reader to various aspects of art that may be related to various aspects of the present invention, which are described and/or claimed below. This discussion is believed to be helpful in providing the reader with background information to facilitate a better understanding of the various aspects of the present invention. Accordingly, it should be understood that these statements are to be read in this light, and not as admissions of prior art.
Businesses have become increasingly reliant upon information technology, and their employees, customers, and vendors generate large and increasing amounts of data during business operations. The data may include items such as data entered in a database by employees regarding customer relationships or sales leads, data entered by customers regarding new orders or feedback on previous transactions, or data entered by vendors regarding orders placed by the company. The data may also include manufacturing data, such as inventory numbers, pricing, quality-control data, equipment data logs, and data tracking work in progress.
Further, some data may reflect potentially complex interrelationships between other portions of the data. For example, a serial number may be related in a database to vendor data, inventory data, customer order data, sales lead data, and customer feedback data. Similarly, for example, data identifying an employee may be related to data generated in the course of work for which that employee is responsible.
Other types of data generated in various businesses include real-estate data, intellectual-property portfolio data, data regarding portfolios of assets generally, data regarding commercial property asset management, financial data, e.g., mortgages, accounts receivable, accounts payable, etc. These are just a few examples that illustrate the volume and complexity of data stored in many companies' databases.
The data often relates to the ongoing operation of the business, and analyzing the data in a timely fashion may allow business managers to make better business decisions. Frequently, however, the amount of data and complexity of relationships between various fields of data impedes efforts by those managing a business to quickly understand the significance of the data to their business. To this end, some businesses employ specialized data analysts or consultants whose sole function is querying databases and generating reports, for example using Microsoft Excel from Microsoft Corp. of Redmond, Wash. These reports are typically tedious to generate, offer relatively limited understanding of the data, and in the event that the reports inspire additional lines of inquiry by their audience, another round of querying and reports is often required, thereby slowing the process of investigating the data.
More sophisticated tools for analyzing business data than Excel exist, but these tools are often difficult to operate and are often implemented by employees or consultants with higher levels of training and greater labor costs than those limited to use of Excel. The more sophisticated tools often include a more detailed view of the data that can include additional dimensions of the data displayed simultaneously or accessible through user interaction with an analysis of the data. Users, however, often experience difficulties when selecting among a potentially large number of fields in a database to represent visually, e.g., in a graph or other form of data visualization, particularly when selecting which fields should be displayed on, or mapped to, which visual aspects of a data visualization to convey information precisely and concisely.
The above-mentioned aspects and other aspects of the present techniques will be better understood when the present application is read in view of the following figures in which like numbers indicate similar or identical elements:
While the invention is susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of example in the drawings and will herein be described in detail. The drawings may not be to scale. It should be understood, however, that the drawings and detailed description thereto are not intended to limit the invention to the particular form disclosed, but to the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the present invention as defined by the appended claims.
DETAILED DESCRIPTION OF CERTAIN EMBODIMENTS
One or more specific embodiments of the present invention will be described below. In an effort to provide a concise description of these embodiments, not all features of an actual implementation are described in the specification. It should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another. Moreover, it should be appreciated that such a development effort might be complex and time consuming but would nevertheless be a routine undertaking of design, implementation, and manufacture for those of ordinary skill having the benefit of this disclosure.
The transmitted data-visualization interface may include relatively high dimensional data visualizations, e.g., the visualization server 12 may transmit data visualizations with more than four dimensions of data, more than five dimensions of data, more than six dimensions of data, more than seven dimensions of data, or more than eight dimensions of data. Further, as explained in greater detail below, the data-visualization interface may include interactive features and may cache data on the receiving devices 18, 20, and 22 for use in responding to user interaction. In some embodiments explained below, the cache data is not initially displayed as part of the data visualization, but is stored locally on the receiving devices 18, 20, and 22 for responding to user interaction with the data visualization with relatively low latency, e.g., relative to the time the system 10 would take to respond to user interaction with the data-visualization interface by requesting additional data from the data source 14. Additionally, in some embodiments, the visualization server 12, as explained further below with reference to
The visualization server 12 may be implemented in various forms. For example, the visualization server 12 may be implemented in a rack server or a standalone tower. The visualization server 12 may be implemented as a virtual server or in hardware form.
The visualization server 12 may include code that is, or may itself be, recorded on a tangible machine-readable medium storing: 1) instructions for forming data visualization interfaces, such as those discussed with reference to
The data source 14 may include a database 24 and a server 26. The database 24 may be a structured database, such as a relational database, a graph database, or a flat file. Or, in some embodiments, the database 24 may be an unstructured database. The server 26 may be configured to receive requests for data, e.g., from the visualization server 12, read the requested data from the database 24, and return the requested data to the visualization server 12. The server 26 may also be configured to receive and process requests for metadata, such as table names, field names, data types, and the statistics about the data in the database 24, such as the cardinality of a field. In some embodiments, the data source 14 may receive requests from the visualization server 12 and send data to the visualization server 12 via a network, such as the Internet.
Network 16 may include any of a variety of different types of networks, in combination or individually, such as the Internet, an intranet (wired or wireless), a cellular network, and a telephone network. Network 16 may carry requests for data visualization from receiving devices 18, 20, and 22 to the visualization server 12 and may carry data visualizations prompted by the requests from the visualization server 12 to the receiving devices 18, 20, and 22.
The receiving devices 18, 20, and 22 are shown as an example of a Web server 18, an example of a smart phone 20, and an example of a personal computer 22. In some embodiments, all of the receiving devices may fall within one of these categories, or the receiving devices may include a variety of different types of receiving devices configured to connect to the visualization server 12. In some embodiments, the receiving devices 18, 20, and 22 may include a World Wide Web browser for depicting data visualization interfaces, presenting visualization configuration interfaces, and receiving and transmitting user interaction. While only three receiving devices 18, 20, and 22 are shown, visualization server 12 is not limited to only three receiving devices, and in some embodiments, may transmit data visualizations to substantially more receiving devices. As explained below with reference to
In this embodiment, the context dimensions 36 and 38 are mapped to orthogonal spatial dimensions of the data-visualization interface 28. In the embodiment of
Other embodiments may include additional context dimensions that correspond to the position of the sub-displays 30, 32, and 34. For example, a data visualization may include a third spatial dimension with an additional context dimension mapped to a vector normal to the context dimensions 36 and 38, e.g., the sub-displays 30, 32, and 34 may appear out of, or into, the page of
Sub-displays having the same value for one or more context dimensions 36 and 38 are said to be “co-linear,” and the sub-displays can be co-linear even if the context dimension for which they have the same value is not mapped to a spatial distance within the data-visualization interface 28, e.g., sub-displays having the same degree of transparency, where transparency is a context dimension of sub-displays, are co-linear. In the illustrated example, the sub-displays 30 and 34 are co-linear for the value 2 in the context dimension 36, and the sub-displays 32 and 34 are co-linear in the context dimension 38 for the listed person's name.
Thus, the coordinates for the context dimensions 36 and 38 may correspond to the information in the sub-displays 30, 32, and 34 that are co-linear along those coordinate values for each of the context dimensions 36 and 38. For example, the information presented in sub-displays 32 and 34 may relate to the individual listed in context dimension value 52 in context dimension 38. Thus, in the illustrated embodiment, each sub-display 30, 32, and 34 depicts a graphical representation of information about a combination of two orthogonal context dimension coordinates. In the pictured embodiment, context dimension 36 corresponds to each of the previous three months, and context dimension 38 corresponds to the names of salespersons. Thus, in the present embodiment, data-visualization interface 28 presents a sub-display 30, 32, and 34 for each salesperson in context dimension 38 for each of the last three months in context dimension 36.
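By way of illustration only, the positioning of sub-displays by two context dimensions and the grouping of co-linear sub-displays might be sketched in Python as follows. The function names `assign_positions` and `colinear_groups`, and the sample month and salesperson values, are hypothetical and are not drawn from the illustrated embodiment.

```python
from collections import defaultdict
from itertools import product


def assign_positions(columns, rows):
    """Give each (column value, row value) pair a grid cell, one sub-display per cell."""
    return {
        (col_value, row_value): (col_index, row_index)
        for (col_index, col_value), (row_index, row_value) in product(
            enumerate(columns), enumerate(rows)
        )
    }


def colinear_groups(positions):
    """Group sub-display keys that share a value in either context dimension."""
    groups = defaultdict(list)
    for col_value, row_value in positions:
        groups[("column", col_value)].append((col_value, row_value))
        groups[("row", row_value)].append((col_value, row_value))
    return dict(groups)


# Hypothetical context dimension values: months (horizontal), salespersons (vertical).
months = ["Month 1", "Month 2", "Month 3"]
salespersons = ["A. Smith", "B. Lee"]

positions = assign_positions(months, salespersons)
groups = colinear_groups(positions)
```

In this sketch, every sub-display belongs to exactly two co-linear groups, one per context dimension, which is the kind of grouping to which a portion of cache data could be assigned.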
In the embodiment of
Similarly, in the illustrated embodiment, dimensions having a lower cardinality than dimension 56 are mapped to the context dimensions 36 and 38. Context dimension 36 assumes one of three values, one for each of the last three months, and context dimension 38 assumes one of a finite number of values, one for each of the names listed.
As explained below with reference to
Further, in some embodiments, the min value 58 and the max value 60 may be selected by the visualization server 12 (shown in
In the depicted embodiment, sub-displays 30, 32, and 34 include a plurality of data markers 64 positioned within the sub-displays according to their data values mapped to the horizontal and vertical axes of the sub-displays. Legend 40 may also include a shape legend 62 indicating that the shape of data markers is mapped to a dimension, in the illustrated case, a dimension indicating whether the item identified by the data marker was modified more or less than four days ago. Finally, legend 40 includes a list of data marker colors and labels identifying which data marker color 66 is mapped to which dimension value 68.
Thus, as indicated by legend 40, each sub-display 30, 32, and 34 includes four attributes: the horizontal axis mapped to attribute 56, the vertical axis mapped to attribute 54, the shape of data markers 62, and the correspondence between color 66 and dimension value 68. Other embodiments may include additional attributes mapped to additional dimensions, for example, data marker size, transparency, background color, and shadow. In some embodiments, the data markers 64 may be animated, for example with an animated GIF (graphic-interchange format image), and movement, for example frequency of vertical or horizontal oscillation, may be mapped to a dimension. For instance, data markers 64 may vibrate if a threshold is exceeded for the dimension to which movement is mapped.
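As a non-limiting sketch, the mapping of four dimensions to marker position, shape, and color within a sub-display might look like the following Python. The `DataMarker` class, the `build_markers` function, the palette, the shape names, and the sample field names are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class DataMarker:
    x: float      # horizontal-axis dimension
    y: float      # vertical-axis dimension
    shape: str    # two-valued dimension, e.g., recently modified or not
    color: str    # low-cardinality categorical dimension


# Hypothetical palette; colors repeat if there are more values than entries.
PALETTE = ["red", "green", "blue", "orange"]


def build_markers(records, x_field, y_field, shape_field, color_field):
    """Map four fields of each record to four visual attributes of a marker."""
    color_values = sorted({record[color_field] for record in records})
    color_of = {
        value: PALETTE[index % len(PALETTE)]
        for index, value in enumerate(color_values)
    }
    return [
        DataMarker(
            x=float(record[x_field]),
            y=float(record[y_field]),
            shape="circle" if record[shape_field] else "square",
            color=color_of[record[color_field]],
        )
        for record in records
    ]


# Hypothetical sample records.
records = [
    {"amount": 1200, "probability": 0.4, "recently_modified": True, "stage": "proposal"},
    {"amount": 800, "probability": 0.9, "recently_modified": False, "stage": "closed"},
]
markers = build_markers(records, "amount", "probability", "recently_modified", "stage")
```

The same dictionary that assigns colors could also be used to build the legend entries pairing each color with its dimension value.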
Data-visualization interface 28 may be encoded in a variety of formats, including hypertext markup language (“HTML”), and the data-visualization interface 28 may be transmitted as a data visualization file that includes HTML code, scripting language code (e.g., JavaScript), and associated image files. The present technique is not limited to HTML and JavaScript, however, and a data visualization file may be encoded in other formats, e.g., as a graphical user interface in a stand-alone application executing on a personal computer.
In some embodiments, cache data (defined below) may be used to expedite interactions between a user and the data-visualization interface 28. Data-visualization interface 28 may be transmitted by the visualization server 12 (shown in
In some embodiments, this additional data, called “cache data,” is transmitted by the visualization server 12 (shown in
In some embodiments, cached data is displayed in response to a user interacting with the data-visualization interface 28 to indicate which cache data is desired.
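A minimal sketch of this idea, assuming each record carries a hypothetical `id` field that links a data marker to its cached detail, might be written in Python as follows. The function names and sample fields are illustrative only.

```python
def split_visible_and_cache(records, visible_fields):
    """Designate part of each record as visible data and the rest as cache data."""
    visible, cache = [], {}
    for record in records:
        # Visible data: what is drawn initially, plus the id used for lookups.
        visible.append(
            {"id": record["id"], **{field: record[field] for field in visible_fields}}
        )
        # Cache data: sent along with the visualization but not initially displayed.
        cache[record["id"]] = {
            field: value
            for field, value in record.items()
            if field != "id" and field not in visible_fields
        }
    return visible, cache


def on_marker_selected(marker_id, cache):
    """Answer a user interaction from the local cache.

    Returns None when the data is not cached, in which case a request to the
    data source would be needed.
    """
    return cache.get(marker_id)


# Hypothetical sample records.
records = [
    {"id": 1, "amount": 100, "notes": "call back"},
    {"id": 2, "amount": 250, "notes": "sent quote"},
]
visible, cache = split_visible_and_cache(records, ["amount"])
detail = on_marker_selected(2, cache)
```

Because the lookup is a local dictionary access, responding to the interaction does not require a round trip to the data source.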
In summary,
Sub-displays 136 may include data markers 138 that are mapped to additional dimensions as shown by a legend 140 of the data-visualization interface 130. Thus, in the illustrated embodiment, the radius of the position of the data markers 138 within the sub-displays is mapped to a priority designation and the angular position of the data markers 138 within the sub-displays 136 is mapped to the duration designation. In this embodiment, the data visualization 130 may be transmitted with cached data that is accessible by selecting designations for data markers 138, sub-displays 136, or context dimensions 132, 133, and 134.
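For illustration, a radial and angular mapping of this kind could be converted to drawing coordinates as in the Python sketch below. The linear scaling, the choice to measure the angle counterclockwise from the positive horizontal axis, and the name `polar_marker_position` are assumptions rather than details of the illustrated embodiment.

```python
import math


def polar_marker_position(priority, duration, max_priority, max_duration,
                          sub_display_radius=1.0):
    """Place a marker at a radius set by priority and an angle set by duration."""
    radius = sub_display_radius * priority / max_priority
    angle = 2 * math.pi * duration / max_duration
    # Convert to Cartesian offsets from the center of the sub-display.
    return radius * math.cos(angle), radius * math.sin(angle)


# Hypothetical marker: mid-range priority, one quarter of the maximum duration.
x, y = polar_marker_position(priority=5, duration=10, max_priority=10, max_duration=40)
```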
Next, the data associated with the request to view the data is retrieved, as indicated by block 148, and based on the display configuration, a portion of the retrieved data is designated as visible data and a portion is designated as cache data, as indicated by block 150. Based on two or more context dimensions identified by the display configuration, in a visualization file, positions are assigned to a plurality of sub-displays, as indicated by block 152. Assigning positions may include forming a rectangular matrix of sub-displays, such as illustrated in
Next, based on three or more sub-display dimensions identified by the display configuration, in the visualization file, data may be mapped to a plurality of the sub-display attributes in each sub-display, as indicated by block 154. Mapping may include assigning colors, shapes, and positions to data markers, such as those shown in
The process 142 may also include a step of associating a portion of the cached data with groups of co-linear sub-displays based on context dimensions, as illustrated by block 156, and as is shown in
The visualization-configuration interface 164 may be a webpage form, transmitted as a configuration-interface file, constructed with HTML, JavaScript, and associated image files. The illustrated visualization-configuration interface 164 includes fields 166 and 168 for naming and describing a data-visualization interface and a group of dimension-configuration fields 170 for mapping data fields to attributes of the data-visualization interface, such as context dimensions, and aspects of the sub-displays, e.g., data marker shape, color, position, etc. In some embodiments, the visualization-configuration interface 164 may present a list of menu options for dimension-configuration fields 170 when an individual field is selected, e.g., as a drop-down box of data field menu options among which a user may select. The process for populating menu options by the visualization-configuration interface 164 described below with reference to
The dimension-configuration fields 170 map data to visualization attributes having differing resolutions, or variety of visually distinguishable attribute states. For example, with reference to
In this embodiment, the process 196 begins with identifying a database containing data to be viewed, as indicated by block 198. Identifying the database may include correlating a user account with a database to which that user has access, or may include presenting a list of databases to which a user has access to a user and receiving a selection by the user of a database from among the list. In some embodiments, multiple databases may be identified.
Next, the identified database is queried for data indicative of the cardinality of fields (e.g., columns) of data in the database, as indicated by block 200. Querying the database may include querying the database for a list of tables within the database, a list of fields within each table, a list of data types associated with each field (for example integer, string, Boolean, date, etc.), a list of data values for a field, a maximum and a minimum value for a field, a number of characters in a field, or querying a cardinality of a field.
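One possible way to gather such metadata is sketched below in Python using the standard-library `sqlite3` module. SQLite is used here only as a convenient stand-in for the database, and the `field_cardinalities` function and the sample `orders` table are hypothetical.

```python
import sqlite3


def field_cardinalities(connection):
    """Return {(table, field): (declared type, number of distinct non-NULL values)}."""
    result = {}
    tables = [
        row[0]
        for row in connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
    ]
    for table in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        columns = connection.execute(f'PRAGMA table_info("{table}")').fetchall()
        for column in columns:
            field, data_type = column[1], column[2]
            (count,) = connection.execute(
                f'SELECT COUNT(DISTINCT "{field}") FROM "{table}"'
            ).fetchone()
            result[(table, field)] = (data_type, count)
    return result


# Hypothetical in-memory database for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (part_number TEXT, shipped INTEGER, comments TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("P-100", 1, "rush order"),
        ("P-100", 0, "customer asked for a callback"),
        ("P-200", 1, "none"),
    ],
)
cardinalities = field_cardinalities(conn)
```

Other database systems expose table and field metadata through different mechanisms, so the metadata queries in this sketch would need to be adapted accordingly.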
The process 196 further includes identifying menu option candidate fields having a cardinality below a visualization threshold, as indicated by block 202. Identifying fields having a cardinality below a visualization threshold may include eliminating from consideration as menu options high cardinality fields, such as fields with long strings of text including a narrative description or a large number of residential addresses. Such high cardinality fields may be identified based on a cardinality value itself, based on a description of the field including a keyword such as comments or narrative, or based on the field having a string data type and a character limit above some threshold, for example above 100 characters. The visualization threshold may be selected based on the highest resolution attribute within a data-visualization interface, e.g., the visualization threshold may be a multiple of, a fraction of, or approximately equal to the resolution of the highest resolution attribute. Thus, in some embodiments, the fields having a cardinality above the visualization threshold may be filtered out of the candidate fields for menu options, or in some embodiments.
The process 196 also includes designating menu option candidate fields identified in block 202 having a relatively high cardinality as menu options for relatively high resolution data visualization attributes, as indicated by block number 204. Designating the identified fields having a relatively high cardinality may include identifying fields having a cardinality above some threshold or identifying fields based on their position within a distribution of cardinalities, for example the highest 50% of cardinality fields. In certain embodiments, the identified fields may also be ranked for listing in order or reverse order of cardinality as menu options. In some embodiments, the visualization server 12 may designate fields by populating menu options of high-resolution attributes in a visualization configuration interface, such as options for the attribute 178 in the interface of
The process 196 further includes designating the menu option candidate fields having a relatively low cardinality as menu options for relatively low resolution data visualization attributes, as indicated by block number 206. Designating the identified fields having a relatively low cardinality may include identifying fields having a cardinality below some threshold or identifying fields based on their position within a distribution of cardinality, for example the lowest 50% of cardinality fields. In some embodiments, the low cardinality menu options may be created from higher cardinality fields by quantizing the data within that field, for example grouping date entries by month or rounding integer values to the nearest 100. Low cardinality fields may be identified based on data type, e.g., fields having a Boolean data type, or low cardinality fields may be identified by querying the database for all values in a field and counting the number of unique values. For instance, a part number field in a database may have a relatively high number of characters, but relatively few unique part numbers may occur, causing the field to have a low cardinality. Again, in certain embodiments, the identified fields may also be ranked for listing in order or reverse order of cardinality as menu options. In some embodiments, the visualization server 12 may designate fields by populating menu options of low-resolution attributes in a visualization configuration interface, such as options for the attribute 174 in the interface of
In some embodiments, cardinality may be calculated subject to other constraints on the data. For instance, cardinality of fields may be calculated based only on data acquired within some time period, e.g., within the last three months preceding a query.
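As a simple illustration of such a constraint, the Python sketch below counts unique values of a field only among records dated within a window preceding a reference date. Approximating three months as 90 days, and the names `windowed_cardinality`, `rep`, and `entered`, are assumptions made for this example.

```python
from datetime import date, timedelta


def windowed_cardinality(records, field, date_field, as_of, days=90):
    """Count unique values of a field among records dated within the window."""
    cutoff = as_of - timedelta(days=days)
    return len(
        {record[field] for record in records if cutoff <= record[date_field] <= as_of}
    )


# Hypothetical sample records.
records = [
    {"rep": "A", "entered": date(2016, 12, 1)},
    {"rep": "B", "entered": date(2016, 11, 15)},
    {"rep": "C", "entered": date(2016, 1, 5)},
    {"rep": "A", "entered": date(2016, 10, 20)},
]
recent = windowed_cardinality(records, "rep", "entered", as_of=date(2016, 12, 22))
```

Here the field has three unique values overall but only two within the window, so the constrained cardinality may make the field eligible for a lower resolution attribute.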
Finally, a visualization configuration interface may be presented to a user with the higher cardinality fields presented as menu options for higher resolution attributes and the lower cardinality fields presented as menu options for lower resolution attributes, as indicated by block 208. In some embodiments, some fields may be presented as menu options for data visualization attributes spanning a range of resolutions, e.g., a date field may be presented as an option for both the vertical axis of a sub-display and as an option for a horizontal context dimension. Presenting a visualization configuration interface may include automatically constructing HTML and JavaScript configured to form the web page of
In summary, the process 196 is believed to facilitate the configuration of data visualizations. By presenting menu options drawn from the fields presently available in a data source, in some embodiments, the user is able to configure a data visualization without themselves querying the database to determine which fields are available. And by limiting menu options based on the cardinality of the fields and the resolution of an attribute to which the menu option applies, in some embodiments, a user is able to map data to data visualization attributes more easily than in systems in which the user maps data fields to attributes without guidance.
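The filtering and designation steps of the process 196 might be approximated, under stated assumptions, by the following Python sketch. Splitting candidates at the median cardinality is only one of the options the description mentions, and the function name `designate_menu_options`, the threshold value, and the sample field names are hypothetical.

```python
from statistics import median


def designate_menu_options(cardinalities, visualization_threshold):
    """Filter out very high cardinality fields, then split the rest at the median."""
    candidates = {
        field: count
        for field, count in cardinalities.items()
        if count < visualization_threshold
    }
    if not candidates:
        return {"high_resolution": [], "low_resolution": []}
    midpoint = median(candidates.values())
    # Rank candidates from highest to lowest cardinality.
    ranked = sorted(candidates, key=candidates.get, reverse=True)
    return {
        # Options for attributes such as sub-display axes, highest cardinality first.
        "high_resolution": [f for f in ranked if candidates[f] >= midpoint],
        # Options for attributes such as shape or color, lowest cardinality first.
        "low_resolution": [f for f in ranked if candidates[f] < midpoint][::-1],
    }


# Hypothetical field cardinalities, e.g., as obtained by querying the data source.
cardinalities = {
    "comments": 5000,
    "amount": 400,
    "close_date": 90,
    "stage": 6,
    "is_open": 2,
}
options = designate_menu_options(cardinalities, visualization_threshold=1000)
```

In this example the free-text `comments` field is excluded from all menus, while the remaining fields are offered only for attributes whose resolution suits their cardinality.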
While the invention may be susceptible to various modifications and alternative forms, specific embodiments have been shown by way of example in the drawings and have been described in detail herein. However, it should be understood that the invention is not intended to be limited to the particular forms disclosed. Rather, the invention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the invention as defined by the following appended claims.
Claims
1. A method of configuring a data visualization interface and visualizing data with the configuration, the method comprising:
- causing a data visualization configuration interface to be presented on a user computing device, wherein: the data visualization configuration interface is operative to configure a data visualization including a plurality of subdisplays; the plurality of subdisplays are arranged in a rectangular grid in a display on the user computing device; and at least some of the subdisplays represent at least three dimensions of data with at least three respective attributes of data markers within the respective subdisplay;
- for at least some of the subdisplays, selecting, by a processor, a mapping of a given dimension of data to be visualized to a data marker attribute, wherein: the mapping is for the data marker attribute to represent the given dimension in at least some of the subdisplays; the selection is based, at least in part, on whether the given dimension is expressed in discrete units and a resolution of the data marker attribute; and the selection is presented in the data visualization configuration interface on the user computing device;
- receiving, via the data visualization configuration interface, a data visualization configuration including: the selected mapping; mappings of two other dimensions of the data to be visualized to data marker positions within subdisplays; and mappings of two context dimensions of the data to be visualized to horizontal and vertical positions of the subdisplays, respectively; and
- configuring a data visualization with the received data visualization configuration; and
- causing the data visualization to be presented on the user computing device.
2. The method of claim 1, wherein the selection is based, at least in part, on a cardinality of the given dimension.
3. The method of claim 1, wherein the selection is based on a data type of the given dimension.
4. The method of claim 1, wherein the selection is a mapping of the given dimension of data to be visualized to a color scheme.
5. The method of claim 1, wherein the selection is a mapping of the given dimension of data to be visualized to a data mark size.
6. The method of claim 1, wherein the selection is a mapping of the given dimension of data to be visualized to a data mark shape.
7. The method of claim 1, wherein the selection is based on steps for determining cardinality of a field in a database.
8. The method of claim 1, comprising:
- quantizing the data within a field of the data to be visualized to satisfy a cardinality constraint of an attribute of the data visualization.
9. The method of claim 1, wherein:
- the selection is presented as a user-adjustable option in the data visualization configuration interface.
10. The method of claim 1, wherein:
- the data visualization comprises a multi-touch interface.
11. The method of claim 1, wherein:
- data marker size is mapped to another given dimension of the data to be visualized.
12. The method of claim 1, wherein:
- data marker movement in an animated sequence is mapped to a dimension of the data to be visualized in the configuration.
13. The method of claim 1, wherein causing the data visualization to be presented on the user computing device comprises:
- steps for caching visualization data on the user computing device; and
- steps for presenting cached data responsive to user interaction with the data visualization.
14. The method of claim 1, wherein causing the data visualization to be presented on the user computing device comprises:
- sending content to a web browser that when rendered by the web browser causes the web browser to present the data visualization.
15. The method of claim 1, wherein the data visualization configuration interface is configured to present menu options by which dimensions of the data to be visualized are selected, and wherein the dimensions are arranged in the menu options based on an amount of unique values the respective dimensions can have based on a respective data type.
16. The method of claim 15, wherein the data visualization configuration interface is initialized with the dimensions arranged in the menu options based on the amount of unique values the respective dimensions can have based on the respective data type.
17. The method of claim 1, comprising:
- causing data to be stored on the user computing device before the data is presented in the data-visualization interface and before user interaction requesting presentation of the data; and
- wherein the presented data visualization is configured to detect a user interaction, retrieve the stored data, and present the stored data.
18. The method of claim 1, where the data to be visualized includes:
- data regarding customer relationships or sales leads,
- data entered by customers regarding new orders or feedback on previous transactions,
- data entered by vendors regarding orders, or
- manufacturing data.
19. The method of claim 1, wherein:
- the method is executed by the user computing device or a data visualization server.
20. A tangible, non-transitory, machine-readable medium storing instructions that when executed by one or more processors effectuate operations comprising:
- causing a data visualization configuration interface to be presented on a user computing device, wherein: the data visualization configuration interface is operative to configure a data visualization including a plurality of subdisplays; the plurality of subdisplays are arranged in a rectangular grid in a display on the user computing device; and at least some of the subdisplays represent at least three dimensions of data with at least three respective attributes of data markers within the respective subdisplay;
- for at least some of the subdisplays, selecting, by a processor, a mapping of a given dimension of data to be visualized to a data marker attribute, wherein: the mapping is for the data marker attribute to represent the given dimension in at least some of the subdisplays; the selection is based, at least in part, on whether the given dimension is expressed in discrete units and a resolution of the data marker attribute; and the selection is presented in the data visualization configuration interface on the user computing device;
- receiving, via the data visualization configuration interface, a data visualization configuration including: the selected mapping; mappings of two other dimensions of the data to be visualized to data marker positions within subdisplays; and mappings of two context dimensions of the data to be visualized to horizontal and vertical positions of the subdisplays, respectively; and
- configuring a data visualization with the received data visualization configuration; and
- causing the data visualization to be presented on the user computing device.
Type: Application
Filed: Dec 22, 2016
Publication Date: May 18, 2017
Inventor: Robert F. Jones (Austin, TX)
Application Number: 15/388,453