Board Game Geek Top 10: The Past 15 Years

Lifestyle Culture

Explore the top 10 games in the Board Game Geek database over time.

Quinn Hargrove
2021-05-10

##Introduction

Board Game Geek is the largest database on the internet that collects information on board games. In addition to storing information on almost every game that has been released, BGG also offers the ability for each of its users to write reviews for each of the games that they’ve played. These reviews are displayed on the pages for each game, but they’re also aggregated into a list of the most popular games.

This isn’t a great metric for determining if a game is going to be fun to play, as everyone has different tastes, but it’s still interesting to look at what the general BGG user consensus thinks the best games are.

I wanted to examine the changes in these rankings over time using Plotly, a tool that was touched on in Math 241, but wasn’t fully explored.

##Data Collection and Wrangling

Thankfully, I didn’t need to trudge through Internet Archive’s backups of the Board Game Geek rankings to find their historical values, as BGG user/“Wii Music” fan/#1 geocacher in Belgium LordT has kindly recorded the top 50 entries from every month since December 2006, along with stats on the changes from month to month.

I manually entered the names of the top 10 from the first of January for each year from 2007 to 2021 into a spreadsheet, joining them with data on each game scraped from the database by Markus Shepherd.

library(tidyverse) #Provides read_csv(), the pipe, and the dplyr verbs used below.
library(plotly)

#Data from https://www.kaggle.com/mshepherd/board-games
BGG_data <- read_csv("data/BGG_data.csv")

#Data manually collected from https://www.boardgamegeek.com/geeklist/30543/bgg-top-50-statistics-meta-list
Top10OverTime <- read_csv("data/Top10OverTime.csv")

Before that join could be done, duplicate names needed to be removed from the scraped data in order to solve an issue that currently plagues all forms of media: sequels and remakes that are named exactly the same as the originals, as exemplified by the naming of “Doom” (1993, video game), “Doom” (2004, board game), “Doom” (2005, movie), “Doom” (2016, video game), and “Doom” (2016, board game).

Picking the correct one was as simple as keeping the oldest version and removing the rest. Thankfully, neither “Doom” board game has made it to the top 10, and BGG doesn’t collect data on video games; otherwise this might have been a bit harder.

BGG_data <- BGG_data %>%
  arrange(year) #Sorting oldest first means the later removal of duplicates keeps the older version of the game.

Top10Stats <- Top10OverTime %>%
  merge(BGG_data[!duplicated(BGG_data$name), ], #Drops any row that has exactly the same name but was released in a later year.
        by.x = "Game Title", by.y = "name") %>%
  rename("Released" = "year", "Title" = "Game Title",
         "Rank" = "BGG Top 10 position") #Plotly has even more trouble with spaces than ggplot, so renaming these variables is necessary.

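To see why sorting by release year before dropping duplicates keeps the original game, here is a minimal base R sketch on a few illustrative rows. The data frames `scraped` and `top10`, the placeholder title "Example Game", and all of the values are made up for the example (only the two “Doom” board game years come from the text above). The final step, listing hand-typed titles that matched nothing, is an optional sanity check and not part of the code above.

```r
# Illustrative rows only; column names mirror the ones used above.
scraped <- data.frame(
  name = c("Doom", "Example Game", "Doom"),
  year = c(2016, 2010, 2004),
  stringsAsFactors = FALSE
)
top10 <- data.frame(
  `Game Title` = c("Example Game", "Exmaple Game"),  # second title is a deliberate typo
  Year = c(2012, 2013),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Sort oldest first, so duplicated() flags the later releases.
scraped <- scraped[order(scraped$year), ]
deduped <- scraped[!duplicated(scraped$name), ]

# Inner join on title, mirroring the merge() call above.
joined <- merge(top10, deduped, by.x = "Game Title", by.y = "name")

# Hand-typed titles that found no match and were silently dropped by the join.
unmatched <- setdiff(top10$`Game Title`, deduped$name)

print(deduped)
print(unmatched)
```

Because `merge()` performs an inner join by default, a misspelled title simply disappears from the result, so a check like `unmatched` can be a cheap way to catch typos in a manually entered spreadsheet.
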
##Plotly

With the data wrangled, it was time to learn Plotly.

My goal was to make a plot that was interactive (tooltips on hover that give information on the games in question), showed the path each game took over time, and had animation.

This proved to be difficult. ggplotly was the first tool I tried, as I’m fairly confident in my ability to use ggplot2. Unfortunately, animated line plots that accumulate over time are not fully supported with Plotly (only geom_point() is complete), which caused lines to disappear, change colors, and move about in undesired ways.

The approach I eventually ended up using was to make a plot_ly object and add a static line plot and an animated scatterplot to it, using Plotly’s native method of creating visualizations.

While this code works correctly in RStudio, the graph extends behind the legend in-browser (on Firefox, at least). This is an unfortunately common issue in the examples given on Plotly’s website, but only for the Plotly.R package; it doesn’t appear to be an issue in the Plotly Python library.

Some other issues arose while working on the Plotly graph, but those were solvable. For example, adding the scatterplot makes games that were only in the top 10 for 1 year visible, and sorting the data by year keeps each line’s points in the correct order.
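
Plotly connects the points of a line in the order the rows arrive, which is why the data gets sorted by year before plotting. A small base R sketch of the problem, using made-up ranks for a single hypothetical game (the `ranks` data frame and its values are assumptions for illustration only):

```r
# Made-up ranks for one hypothetical game, stored out of order.
ranks <- data.frame(
  Title = "Example Game",
  Year  = c(2010, 2008, 2009),
  Rank  = c(3, 7, 5),
  stringsAsFactors = FALSE
)

# In row order a line would run 2010 -> 2008 -> 2009, doubling back on itself.
is.unsorted(ranks$Year)         # TRUE

# Sorting by Year within each Title puts the points in chronological order.
ranks_sorted <- ranks[order(ranks$Title, ranks$Year), ]
is.unsorted(ranks_sorted$Year)  # FALSE
```

The plotting code in this post does the same thing with `arrange(Year)` before any traces are added.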

##How to Use

There are two ways to use the graph in order to view the data.

With the year set to 0 (the leftmost setting on the slider), clicking a game’s title in the legend will show the path that game took while in the top 10.

Scrubbing along the timeline will allow you to exclusively look at the top 10 for any given year instead of looking at games over time.

Some interesting things to look at:

Toggling “Agricola,” “Gloomhaven,” “Pandemic Legacy: Season 1,” “Puerto Rico,” and “Twilight Struggle” will show the paths taken by every game that has held the #1 spot.

Hovering over a point on the line graph will show you the rank that game held in Jan 2021, along with the year that game was released.

Top10Stats <- Top10Stats %>%
  arrange(Year) #Unlike ggplot, plotly cannot infer the order of points, so they need to be arranged using something else beforehand.

fig <- plot_ly()

fig <- add_trace(
   fig,
   type = 'scatter',
   mode = 'lines',
   x = Top10Stats$Year,
   y = Top10Stats$Rank,
   frame = 0,
   color = Top10Stats$Title,
   alpha = 0.8,
   visible = "legendonly",
   hovertext = paste(
    Top10Stats$Title,
    "<br> Released: ", Top10Stats$Released,
    "<br> 2021 Rank: ", Top10Stats$rank))

fig <- add_trace(
   fig,
   type = 'scatter',
   mode = 'markers', #Set explicitly so the animated layer is drawn as points.
   x = Top10Stats$Year,
   y = Top10Stats$Rank,
   frame = Top10Stats$Year,
   color = Top10Stats$Title,
   hovertext = paste(
    Top10Stats$Title,
    "<br> Released: ", Top10Stats$Released,
    "<br> 2021 Avg Rating: ", Top10Stats$avg_rating))


fig <- fig %>% animation_opts(1000, easing = "quad-in-out", redraw = TRUE)
fig <- fig %>% layout(title = "BGG Top 10: 2007-2021", #Plot title.
    yaxis = list(title = "Rank as of Jan 1st",
                 zeroline = FALSE,
                 range = c(10.4, 0.6)), #Reversed range puts rank 1 at the top.
    xaxis = list(title = "Year",
                 zeroline = FALSE,
                 showgrid = TRUE))

fig

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".