How much should you pay for 1GB of data?

Martin Lippert
7 min read · May 20, 2021

Have you ever wondered whether you got the right deal for your data plan? The question becomes even more confusing when traveling abroad. Prices vary greatly across continents and even between countries! Here is an analysis of the average cost of 1GB of data per country around the world.

We are using a dataset and an existing visualization from Makeovermonday, and we try to get even more out of them to better understand what the dataset is telling us. For a detailed look at the source data, please have a look here.

The dataset was originally published here with the following visualization:

Original Visualization

What is immediately visible are the three most expensive countries. Further down, however, it becomes extremely difficult to distinguish between the flags and to compare the countries with each other.

In the following, we will build our own data presentation based on a defined problem statement and its underlying issue tree. Then, we will analyze how the solution is more easily accessible with the new approach and how it avoids other deficits of the original. Finally, we will list the potential limitations and biases within the data set.

Defining The Problem

Let’s first have a look at the data structure based on this excerpt:

Data Excerpt

The data really only consists of two kinds of information: a list of countries and their associated costs in USD per 1GB of data. This makes for a very clearly and narrowly defined problem statement. For this purpose, let’s assume we are carrying out this analysis for a group of travelers who need this information to plan their future trips:

Problem Statement

Which countries are the cheapest and which are the most expensive around the world and per continent, based on the cost of 1GB of data?

This is a clearly defined, actionable problem statement which is meaningful to our target group of travelers looking for the cheapest options around the world.

Issue Tree

The corresponding issue tree looks like this:

Issue Tree

The three sub-issues are clearly separated in their aim but are all in need of the same, limited input data. To answer all the questions, we need to engineer an additional column called ‘Region’ where we associate each country with its region. We do this with a lookup file provided by Makeovermonday within a separate project. Please find its source here.
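The region lookup described above can be sketched with pandas. This is only a minimal illustration: the column names (`Country`, `Cost_USD_per_GB`, `Region`) and the placeholder rows are assumptions, not the actual contents of the Makeovermonday files.

```python
import pandas as pd

# Placeholder rows with made-up values; the real dataset has one row per country.
costs = pd.DataFrame({
    "Country": ["Country A", "Country B", "Country C"],
    "Cost_USD_per_GB": [1.0, 2.0, 3.0],
})

# Hypothetical lookup table mapping each country to a region.
regions = pd.DataFrame({
    "Country": ["Country A", "Country B", "Country C"],
    "Region": ["Region X", "Region X", "Region Y"],
})


def add_region(costs: pd.DataFrame, regions: pd.DataFrame) -> pd.DataFrame:
    """Left-join the region lookup onto the cost table by country name."""
    # validate="m:1" raises an error if a country appears twice in the lookup,
    # which would otherwise silently duplicate rows in the result.
    return costs.merge(regions, on="Country", how="left", validate="m:1")


enriched = add_region(costs, regions)
```

A left join keeps every country from the cost table, so a country without a matching region ends up with an empty `Region` value instead of disappearing from the analysis.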

New Visualization as a Data Presentation

The newly formed problem statement allows us to dig deeper into the data and not only view the countries as a whole but also compare them at a more granular level. A suitable choice for this is a data presentation based on the Zoom-In story type, using maps to first look at the overall picture and then zoom in to smaller regions.

Please follow this link to have a look at the final data story published on Tableau Public.

In the following, we will have a look at two representative story tiles from the publication in order to describe the story approach and compare the visualization choices to the original.

Overall Structure

The structure of the visualization is as follows:

  • Simple Title Page.
  • World view dashboard with a map colored by the price of data and an associated bar chart comparing the top and bottom ranking countries.
  • 3 more slides with regional breakdowns following the same structure as the world view.
  • A final page for the viewer to interactively group countries together and compare their own set of countries for individual decision making.

With this structure, the problem statement is answered in all its facets, giving a clear picture of the cheapest and most expensive countries at the world level, at the level of a few exemplary regions, and in individual comparisons tailored to the target group’s travel plans.
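The ranking behind the bar charts, worldwide and per region, could be computed along the following lines. This is a sketch under assumptions: the column names, the helper `extremes`, and the made-up rows are purely illustrative, and the published story was built in Tableau rather than with this code.

```python
import pandas as pd

# Made-up placeholder data: eight countries in two hypothetical regions.
data = pd.DataFrame({
    "Country": ["A", "B", "C", "D", "E", "F", "G", "H"],
    "Region": ["X", "X", "X", "X", "Y", "Y", "Y", "Y"],
    "Cost_USD_per_GB": [1.0, 8.0, 3.0, 5.0, 2.0, 9.0, 4.0, 7.0],
})


def extremes(df: pd.DataFrame, n: int = 3, col: str = "Cost_USD_per_GB") -> pd.DataFrame:
    """Return the n cheapest and n most expensive rows, labeled by group."""
    cheapest = df.nsmallest(n, col).assign(Group="cheapest")
    priciest = df.nlargest(n, col).assign(Group="most expensive")
    return pd.concat([cheapest, priciest], ignore_index=True)


# World view: top and bottom 3 across all countries.
world = extremes(data, n=3)

# Zoom-in: the same ranking restricted to a single region.
region_x = extremes(data[data["Region"] == "X"], n=1)
```

Using the same helper for the world view and for each region mirrors the idea that every slide follows one structure and only the filter changes.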

World View

Tableau Public: Story Excerpt — Slide 1 (World View)

The main slides of the story all follow the structure mentioned above:

  • On the left, a big overview map colored by the cost of 1GB of data, with a few annotations highlighting some especially cheap or expensive countries.
  • On the right, a horizontal bar chart showing the top and bottom 3 countries based on their cost of data, delivering a visual contrast and a means of comparison.
  • For the colors of the map, an orange sequential scale was chosen due to the nature of the figures displayed (all-positive expenditures).
  • The colors were not repeated in the bar chart to avoid dual encoding. Here, it was more important to give a clear impression of how much cheaper the cheapest countries are, and length conveys this much more effectively than color. For a clear contrast to the map, blue was chosen as the complementary color, a combination that is also well suited for colorblind viewers.

This slide is the replacement for the original visualization, whereas the following slides are additional zoom-ins that further enhance the insights gained from the dataset. Therefore, it is worthwhile comparing this exact page to the original and listing the deficits it avoids:

  • We now have a map, making it easy to see how the differently priced countries are distributed across the world. While the most expensive countries are all in Africa, this is in no way an indication that Africa is expensive overall. On the contrary. This is not immediately clear in the original.
  • The new visualization shows the exact prices for each country in the tooltip, information that is also lost in the original visualization.
  • The original uses dual encoding for the price: the position on the y-scale as well as the size of the circles. This is more confusing than it is helpful for interpreting the data.
  • Colors should be used in a limited way that serves the interpretation of the data. The country flags make the visualization fuzzy and do not help the viewer in general, as we usually have limited knowledge of a country’s flag. Position on a world map solves that problem.
  • The cheapest and most expensive countries are now shown next to each other to be able to properly compare them as opposed to on opposite ends of the chart. Furthermore, bar charts are used instead of only pure numbers to visualize the difference even better.

Zoom-In Example: Europe

Tableau Public: Story Excerpt — Slide 2 (Europe)

This is one example of the zoomed-in views of the map using the regional data linked to the data set. This way, the regional element of the problem statement is addressed, and the viewer gets a clear impression of how greatly prices still differ within a limited region. The structure is exactly the same as for the world view, and the same arguments apply.

For example, it is interesting to see in the case of Europe that Greece and Italy lie at opposite ends of the price scale despite their cultural, economic and geographical proximity. A similar observation can be made for Japan and Korea in the case of Asia & Pacific.

Limitations and Biases

Nevertheless, there are some limitations and biases, some coming directly from the data and some introduced by the analysis. Let’s have a look at them here:

Data Collection Stage

  • The dataset is not exhaustive: some countries are missing. The missing countries are most probably a case of missing at random (MAR), potentially having to do with political limitations or other difficulties in retrieving the necessary data. Therefore, no impact on the available data is to be expected, except that we are dealing with undercoverage and may be missing even more extreme values that could not be captured when gathering the data.
  • The distribution of the cost of 1GB of data is skewed. There are a few outlier-like values towards the top of the scale. This is not a major issue for this analysis either, as we put a special focus on finding these extreme values. However, if the analysis were extended towards finding out why these differences exist, this bias would need to be investigated further.
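One way to check the skew and flag the outlier-like values is sketched below, using sample skewness and the common 1.5 × IQR rule. Both the rule and the made-up numbers are assumptions chosen for illustration; they are not taken from the original dataset or analysis.

```python
import pandas as pd

# Made-up placeholder prices with one extreme value at the top of the scale.
costs = pd.Series([1.0, 2.0, 2.5, 3.0, 3.5, 4.0, 30.0], name="Cost_USD_per_GB")

# Positive skewness indicates a long tail towards expensive values.
skewness = costs.skew()

# Flag values above the upper fence of the 1.5 * IQR rule.
q1, q3 = costs.quantile(0.25), costs.quantile(0.75)
upper_fence = q3 + 1.5 * (q3 - q1)
outliers = costs[costs > upper_fence]
```

For this analysis the flagged values are kept rather than dropped, since the extreme prices are exactly what the target group wants to see.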

Data Processing Stage

  • At the data processing stage, regions were joined with the original data set. Here, we rely entirely on the correctness of the joined data set and need to be aware of potential mismatches or countries without a matching region. For that reason, geographical knowledge needs to flow into the interpretation of the data when viewing the different zoom-ins.
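Countries without a matching region can be listed explicitly before any chart is built. The sketch below assumes hypothetical column names and placeholder rows, and it treats stray whitespace in country names as one plausible cause of a mismatch.

```python
import pandas as pd

# Placeholder cost table with made-up values.
costs = pd.DataFrame({
    "Country": ["Country A", "Country B", "Country C"],
    "Cost_USD_per_GB": [1.0, 2.0, 3.0],
})

# Hypothetical lookup: "Country B " has a trailing space, "Country C" is absent.
regions = pd.DataFrame({
    "Country": ["Country A", "Country B "],
    "Region": ["Region X", "Region Y"],
})


def unmatched_countries(costs: pd.DataFrame, regions: pd.DataFrame) -> list:
    """Return the countries from the cost table that have no region entry."""
    # indicator=True adds a "_merge" column that marks rows found only on the left.
    merged = costs.merge(regions, on="Country", how="left", indicator=True)
    return merged.loc[merged["_merge"] == "left_only", "Country"].tolist()


missing = unmatched_countries(costs, regions)

# Stripping whitespace from the lookup keys resolves the spelling mismatch.
regions_clean = regions.assign(Country=regions["Country"].str.strip())
still_missing = unmatched_countries(costs, regions_clean)
```

Whatever remains in `still_missing` after cleaning has to be assigned to a region by hand or kept in mind when reading the regional slides.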

Data Insights Stage

  • When seeing specific countries and their rank in prices, we run the risk of confirmation bias, especially because the reasons behind the price differences are not investigated.
  • Confounding variables that were not gathered may need to be kept in mind before drawing a clear conclusion as to which country to choose for the next trip, e.g. the number of phone companies and the variance between them, which may have influenced the data as well as the actual price you will end up paying for 1GB.

Overall, this new visualization can help you make your next choice if data plans are an important factor to you. Have fun drawing your own conclusions on the last page of the story.

What is your next travel destination going to be?

By: Martin Lippert | Data Source: Visual Capitalist, via Makeovermonday | Data Story created with Tableau
