CommonLounge Archive

Geocoding, Granularity, and Time-Series

April 18, 2018

In this post, we will learn how to create maps in Tableau and display geocoded information. For this tutorial as well, we use the Sales Superstore data that can be downloaded from here.

Geocoding

First, a little primer on geocoded information and how it is used for data visualizations in Tableau. Geocoding simply means using Latitude and Longitude details to map locations. Tableau can use the geographical coordinates to plot locations accurately. It also follows conventional hierarchies such as Country→ State/Province → City. This tool understands postal codes in different formats and also allows adding new geographic roles to existing hierarchy. For further documentation, you can refer to Tableau in-depth documentation. We can even assign custom values atop our data for addresses that Tableau cannot map.

Why would we want to use Geocoded Information?

Whenever we wish to answer questions spatially or analyze data geographically, we will select a map based visualization and use geocoded data. Examples of a few spatial questions are:

  • State wide metrics such as company profits, average internet connectivity speeds, electoral votes, etc.
  • Region wide such as employment percentage for every city, obesity rate, or school dropout rate in every county
  • Statistics comparison for different countries such as per capita income, literacy rates, population totals, and happiness quotient.

Hands-On Practice

In this tutorial, we will be analyzing region wise trends in Profit and Discount from our Sales data. Since we are interested in geographical comparisons, our natural choice will be a map. Let’s get started.

Open Tableau Public application and import the Sales dataset in csv format. The screen should look as follows:

Open a new worksheet by clicking on the Sheet1 option on the bottom left.

Drag the field Postal Code from Dimensions to the center sheet. You will observe that Latitude and Longitude are automatically added to Rows and Columns respectively. Postal code has been added to the Details card in Marks panel. You should now see the following on the screen:

Click on the Latitude field and drag a copy next to itself in the Rows field. This will create an identical map on the same sheet as follows.

The reason we are creating two maps is that we are plotting two separate Measures: Profit and Discount. This is a simple step to follow when we wish to plot multiple graphs on the same worksheet. (Just drag fields next to each other. The fields themselves can be the same or different) This technique works even if we want to ultimately blend different maps together (as we will do later in this tutorial).

Notice how there are two Latitude fields on the Marks cards.

Since we want to modify the second visualization, we will un-collapse the second one using the down arrow to its left.

Drag Discount from Measures to the Color card in Marks panel. Expand the color selection by clicking on Edit colors.

Select Orange Blue Diverging option as below.

Now check the box that says Stepped Color and include number as 10 steps. Click on Apply and then press OK. The reason we are using stepped color here is to ensure distinct, easily distinguishable colors to represent how large or small the levied discounts are.

Drag Discount over to the Size card in Marks panel as well

These steps will result in a legend being created to the right of the visualization sheet as below.

The actual sheet should look as follows.

Repeat the process with the first Latitude card on Marks panel for Profit field from Measures.


Sheet will look as follows.

As seen above, the maps illustrate profit and discount ranges by region. We can determine that losses (red) are incurred on the coasts with profits (blue) in the midwest and north. We also ascertain that large discounts were given in the coastal areas with lower values in the midwest and south.

Based on our visualization, it becomes evident that large discounts did not translate to profits. This can provide us clues about important questions we should be asking, such as which categories successfully bolstered revenue through discounts and which did not. It can also hint at potential action items like lowering or recalibrating discounts based on sales volume and profit margins. This is where the visuals lend themselves very well to data analysis — they give us pointers to actionable insights.

Continuing further, on the second Latitude field in the Columns field, right click on the down arrow. Select Dual Axis option.

The screen automatically changes as follows. The reason why we use Dual Axis is to view the two parameters — Profit and Discount together. Pursuing our previous analysis, the combined chart helps to see if our hypothesis is correct. As evident here, the larger the discount, the higher the loss.

Rename the file as Mapping Discounts & Profits and save to Tableau Public.

Congratulations, your first Geocoded Tableau visualization is complete.

Granularity and Time Series

Now, we move on to another two important concepts: granularity and time series.

Granularity refers to the level of detail of the data stored in a table. In other words, this corresponds to what a single row represents. Tableau’s level of detail allows us to whittle down to the most granular level of a field. For instance, in Category, we can further probe into the next level of sub-category. Or in case of dates, in this dataset we can inspect in hierarchical details as Year > Quarter > Month > Day.

So, let’s get started. Here, we will analyze profits over time for different categories. We can learn about granularity and level of details as well in depth.

Hands-On Practice

Select a new sheet to open a new visualization. You can do this by clicking on the Plus icon next to Sheet 1 on the bottom panel. Select Order Date from Dimensions and add to Column section.

Automatically, Tableau adds the values at the level of years.

Click on the + symbol next to Year to extend the detailing to every Quarter.

Continue until the values do not get any more atomic. In this example, it is at the level of Day. In other words, the highest granularity you can get with this data is the Day level.

In our particular example, we can exclude quarters and days, choosing to focus only on Years and Months.

We can also choose other options. Clear the worksheet to do this by selecting the following in the top menu: Worksheet → Clear → Sheet

The worksheet will become blank again. Observe that only the sheet under consideration was reset and not our previous geocoded visualization.

Again, add Order date to Column section. Click on the Down arrow to view the different options and levels of granularity available.

We can retain Years here. Either add the Order Date again to Columns section or click the ‘+’ symbol on Year(Order Date). Both result in Quarter(Order Date) being added. Change it to Month(Order Date) using the options.

Drag Profit to the Rows section. Since, we want to analyze profits over time, we are including this in our visual. Change the default aggregation from Sum to Average.

Viz is created by default as:

From the Marks card, select Bar graph.

You can alternatively select Bar charts from the Show Me panel on the top right. This however requires swapping Rows and Columns later to view the viz similar to one given below. This is because Dates get mapped to Rows and Profit to Columns on using the Bar chart option.

We will now add details to this graph by adding Categories on the Colors card in Marks panel.

Next, assign Profit to Detail on the Marks card and Category to Color. Change the Measure from default aggregation of sum to average.

Repeat this step one more time. This time additionally choose the Percent Difference option from Quick table calculation in the menu. Refer to the Image below to see the selections.

The second average now shows difference from the previous month values when hovering over the different bars in the graph.

You can also observe how the second AVG detail shows the Delta symbol next to it. This indicates relative difference. Hover on the Delta symbol to view the explanation.

But now the descriptions look really cluttered and clunky. We can alter this by editing the tooltip. Click on tooltip card. We see the values being displayed as well as the variable names. The syntax followed is Variable Name : <Value>

Since the description of the %Avg variable is very long, we can edit it to be shorter and more concise as follows:

Upon selecting, our tooltip values change, and hovering action produces a neater display.

In the tooltip, also remove the Month and Year values. This way, only essential details are illustrated — reducing cognitive load on audiences.

We now add filters to this visualization to make it more interactive. Our first filter is Order Date. Drag this filter to the Filters card (above Marks card). A window appears with options for filtering criteria

Here, we select Range of Dates.

We get a new menu to specify the range. Check the Null values checkbox and click OK. This will include any records with dates being unspecified i.e. NULL.

The selection is applied and appears in the Filter section. But, we cannot use it at this point. Hence, click on the down arrow next to it and choose the option as ‘Show Filter’.

This creates the sliding filter on the right hand panel. Shift the range to observe the changing viz.

We can also view the differences day-wise: for instance, difference on weekends and weekdays. To do this, simply add another date filter by selecting Weekdays instead of Range of Dates.

Select All and click OK.

Select Show Filter in the down arrow menu and choose weekends i.e. Saturday and Sunday. Toggle to entire week by clicking on Filter option ‘All’

Let’s go over the insights we can gain from this viz.

It is clear that profits made from Office Supplies orders are reduced or non-existent over the weekends. Technology profits are high during the weekends whereas Furniture profits are low or even negative.

Comparisons like this can present action items such as investigating discount feasibility during weekends or during holiday months as in this example, in January. It can point us to probe further if January weekends show a trend in furniture losses or if the discounts levied in this month were causal. (In this dataset, it is possible to theorize that New Years sales and discounts are levied all through January. With two long weekends: New Years weekend and Martin Luther King Day, consumers used this to update their home furniture and hence, losses resulted.)

Filters along with Granularity and Level of Detail empower our data analysis as we can scrutinize different aspects and parameters.

We can also add a Quick Filter to the viz on Category Dimension as Analysis → Filters → Category. This will not only add the Filter but also show up in the right panel.

Rename the sheet as “Profit Metrics” and save the file again. This time, Tableau saves the updates to existing file as multiple tabs.

Great work! We covered a lot in this lesson, but the best learning will come from practice. If you are ready for advanced mapping visualizations, check this link.


© 2016-2022. All rights reserved.