Lightning Talk: Board Games + R

Board Game Geek

  • Website for board game geeks
  • Getting data out of the site

My commentary

In [1]:
# the beginnings of me doing a webscraping tutorial
# https://rpubs.com/Radcliffe/superbowl
library(rvest)
library(stringr)
library(tidyr)
Loading required package: xml2
In [2]:
url <- 'http://espn.go.com/nfl/superbowl/history/winners'
webpage <- read_html(url)
In [3]:
sb_table <- html_nodes(webpage, 'table')
sb <- html_table(sb_table)[[1]]
head(sb)
  X1   X2              X3                             X4
  Super Bowl Winners and Results (title row, repeated across all four columns)
  NO.  DATE            SITE                           RESULT
  I    Jan. 15, 1967   Los Angeles Memorial Coliseum  Green Bay 35, Kansas City 10
  II   Jan. 14, 1968   Orange Bowl (Miami)            Green Bay 33, Oakland 14
  III  Jan. 12, 1969   Orange Bowl (Miami)            New York Jets 16, Baltimore 7
  IV   Jan. 11, 1970   Tulane Stadium (New Orleans)   Kansas City 23, Minnesota 7
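A plausible next step (not shown in the original tutorial) is tidying the scraped table: the site's title and header rows come through as data, and the result column bundles both teams and scores. The snippet below sketches that cleanup with `tidyr::separate`; the real `sb` comes from `html_table()`, so a small stand-in data frame is used here to keep the example self-contained.

```r
library(tidyr)

# Stand-in for the scraped `sb` (after dropping the title/header rows):
sb <- data.frame(
  X1 = c("I", "II"),
  X2 = c("Jan. 15, 1967", "Jan. 14, 1968"),
  X3 = c("Los Angeles Memorial Coliseum", "Orange Bowl (Miami)"),
  X4 = c("Green Bay 35, Kansas City 10", "Green Bay 33, Oakland 14"),
  stringsAsFactors = FALSE
)

# Replace R's default X1..X4 names with descriptive ones
names(sb) <- c("number", "date", "site", "result")

# Split "Green Bay 35, Kansas City 10" into team and score columns;
# the lookahead " (?=[0-9]+$)" splits at the space before the trailing score
sb <- separate(sb, result, c("winner", "loser"), sep = ", ")
sb <- separate(sb, winner, c("winner", "winner_pts"), sep = " (?=[0-9]+$)")
sb <- separate(sb, loser,  c("loser",  "loser_pts"),  sep = " (?=[0-9]+$)")
```

With the full scraped table, the same two `separate` calls would give one column per team and per score, ready for grouping or plotting.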

PCA & MDS

  • Choosing variables
  • Signal/noise problem
  • Data Visualization is hard
  • More predictors, more chance they are correlated
  • Clustering is hard to visualize in more than 2 or 3 dimensions

What is dimension reduction?

  • Map from higher dimensions to 2D
  • Transform higher dimensional space to a new low dimensional space
  • New space: linear/nonlinear transformation of original data
  • Visualization/analysis can be performed on new space (transformed data)

PCA: Principal Component Analysis

  • Reduce p features to a smaller number of components
  • Find a hyperplane (low-dimensional linear subspace) that captures most of the variation
  • The optimal linear dimension reduction method: no other linear projection of the same dimension captures more variance

PCA Assumptions

  • Linearity: assumes data to be linear combinations of variables
  • Mean & Covariance: No guarantee that directions of maximum variance will contain good features for discrimination
  • Large variances = important: Assumes larger variance = interesting and lower variance = noise

Choosing Assumptions

  • Orthogonality: principal components are orthogonal to each other
  • "Explaining" X amount of variance
  • "Component Loadings"
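The points above map directly onto `prcomp()`. A minimal sketch using the built-in USArrests data (an illustrative choice, not from the talk); scaling matters because PCA's "large variance = important" assumption is unit-dependent:

```r
# PCA on USArrests, standardizing variables first
pca <- prcomp(USArrests, scale. = TRUE)

# Proportion of variance each component "explains"
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
round(var_explained, 3)

# Component loadings: each PC as a linear combination of the original variables
pca$rotation

# summary(pca) reports the same cumulative-variance breakdown
summary(pca)
```

Choosing how many components to keep usually comes down to the cumulative `var_explained`, e.g. retaining enough PCs to explain 80–90% of the variance.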

Batch effects on RNA sequencing data

MDS: Multidimensional Scaling

  • Visually represent proximities between predictors
  • Input is matrix of distances
  • Goal: find projections that preserve original distances in input matrix in lower dimensional space
  • Distances are preserved by optimizing a stress function
  • Non-linear
  • More on MDS

MDS in R

  • Options for calculating distance vary; Euclidean is the default
  • Classical/metric MDS: the dist() and cmdscale() functions in base R
    • mostly in terms of Euclidean distance
  • Non-metric MDS: isoMDS() and sammon() in the MASS package
    • minimize stress functions
  • Use cmdscale for continuous data
    • Use non-metric MDS for categorical/ordinal data
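A short sketch of both routes, again on USArrests (an illustrative dataset, not from the talk). The input to either function is a distance matrix, not the raw data:

```r
library(MASS)

# dist() defaults to Euclidean distance; scale first so no variable dominates
d <- dist(scale(USArrests))

# Classical (metric) MDS: project into k = 2 dimensions
fit <- cmdscale(d, k = 2)

# Non-metric MDS: isoMDS() optimizes a stress function instead
nm <- isoMDS(d, k = 2, trace = FALSE)
nm$stress   # stress: lower means the original distances are better preserved
```

Plotting the two columns of `fit` (or `nm$points`) gives the 2D map; comparing the maps and the stress value is a quick check on how much structure the reduction kept.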

Strengths of MDS

  • Works on dissimilarities
  • Good for proximities
  • Can start with distance, not raw data
  • Does not assume anything about nature of data

Weaknesses of MDS

  • Gives arbitrary maps
  • Slow
  • Numerical optimization
  • Not good with high dimensional settings
  • Picking the right stress function

Discussion

  • Output number of dimensions for MDS depends on what the data will be used for
  • PCA goal = trying to get at what components explain maximal variance
  • MDS goal = trying to break multiple dimensions down into a manageable number while preserving the original distances