Alaskan Outdoor Fatality Map
June 28, 2013

Data Sources

Tab-Separated Table (TSV File)
Spreadsheet (ODS Format)
Google Earth (KML File)
Geo Data (GeoJSON)

The data for this map was collected from akfatal.net. Tagline: "Many people have died in the Alaskan outdoors. Read their stories and learn from them."

The map itself was put together in CartoDB.

Data Smoothing

LibreOffice is a free software alternative to Microsoft Excel.

The original data is stored in an HTML table. It's pretty easy to import the data into Excel or LibreOffice, but after that there are a few inconsistencies to iron out.
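For anyone who would rather script the import than go through a spreadsheet, here is a minimal sketch using only Python's standard library. It assumes the page has a single, well-formed table with closing tags on every row and cell; the `TableRows` class and `parse_table` helper are names made up for this example, and the markup in the usage line is illustrative, not copied from akfatal.net.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_table(html):
    """Return the table as a list of rows (hypothetical helper)."""
    parser = TableRows()
    parser.feed(html)
    return parser.rows


# Illustrative markup only; the real page layout may differ.
sample = "<table><tr><th>Name</th><th>Date</th></tr><tr><td>Pete Egelzian</td><td>1920-04-28</td></tr></table>"
print(parse_table(sample))
```

The rows come back as plain lists of strings, so the date and place-name cleanup still has to happen afterwards.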

Time Formats

The date format is inconsistent across entries, but LibreOffice is pretty good at detecting alternate date formats. In some cases I found it easiest to save the spreadsheet as a CSV file and re-import it in order to clear out the leading apostrophes (') that LibreOffice puts on cells it has stored as text because it didn't recognize the format.
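The same cleanup can be sketched in Python by trying a handful of date formats in turn. The formats listed here are assumptions (the post doesn't say which ones actually show up in the table), and `normalize_date` is a hypothetical helper name.

```python
from datetime import datetime

# Assumed candidate formats; adjust to whatever the source table really contains.
CANDIDATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%B %d, %Y", "%d %B %Y")


def normalize_date(text):
    """Return an ISO 8601 date string, or None if no candidate format matches."""
    # Drop surrounding whitespace and any leading apostrophe left by the spreadsheet.
    cleaned = text.strip().lstrip("'")
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            continue
    return None


print(normalize_date("'04/28/1920"))
```

Anything that comes back as `None` still needs a look by hand, which is about where the spreadsheet approach ends up too.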

Geo-Referencing

CartoDB has a built-in georeferencing feature to extract latitude & longitude coordinates from place names. However, it's not very good at recognizing features like the rivers, glaciers, and peaks that make up most of the dataset.

My first thought to get around this limitation was to try and hack National Geographic's Topo software to spit out a list of coordinates. I was disappointed to find that not only did Topo not run on Linux, it had been discontinued.

Turning to Google, I happily discovered the USGS Geographic Names Information System (GNIS), which was probably where the Topo data came from anyway. I downloaded the Alaska dataset and imported it into a new sheet.

Accident Database Sample

Name | Date | Location | Activity | Cause of Death
Pete Egelzian | 1920-04-28 | Bird | Railroad | Avalanche
John Rudeen | 1920-04-28 | Bird | Railroad | Avalanche
R. Romero | 1920-04-28 | Bird | Railroad | Avalanche

USGS Database Sample

FEATURE_ID | FEATURE_NAME | FEATURE_CLASS | STATE_ALPHA | STATE_NUMERIC | COUNTY_NAME | COUNTY_NUMERIC | PRIMARY_LAT_DMS | PRIM_LONG_DMS | PRIM_LAT_DEC | PRIM_LONG_DEC | SOURCE_LAT_DMS | SOURCE_LONG_DMS | SOURCE_LAT_DEC | SOURCE_LONG_DEC | ELEV_IN_M | ELEV_IN_FT | MAP_NAME | DATE_CREATED | DATE_EDITED
247074 | Pacific Ocean | Sea | CA | 6 | Mendocino | 45 | 391837N | 1235041W | 39.3102778 | −123.8447222 | | | | | 0 | 0 | Mendocino | 01/19/1981 | 05/16/2011
1397640 | Cape Hinchinbrook | Cape | AK | 2 | Valdez-Cordova (CA) | 261 | 601405N | 1463830W | 60.2347222 | −146.6416667 | | | | | 0 | 0 | Cordova A-8 | 01/01/2000 | 02/05/2009
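The DMS and DEC columns carry the same position in two notations. As a quick illustration of how they relate, here is a small sketch that converts the packed degrees-minutes-seconds strings to decimal degrees. It assumes two-digit minutes and seconds followed by a hemisphere letter, which is what the sample rows show; `dms_to_decimal` is a hypothetical helper name.

```python
def dms_to_decimal(dms):
    """Convert a packed DMS string such as '391837N' or '1235041W' to decimal degrees."""
    hemisphere = dms[-1].upper()
    digits = dms[:-1]
    # The last four digits are minutes and seconds; whatever precedes them is degrees.
    degrees = int(digits[:-4])
    minutes = int(digits[-4:-2])
    seconds = int(digits[-2:])
    value = degrees + minutes / 60 + seconds / 3600
    # South and west are negative by convention.
    return -value if hemisphere in ("S", "W") else value


for dms in ("391837N", "1235041W", "601405N", "1463830W"):
    print(dms, round(dms_to_decimal(dms), 7))
```

Rounded to seven places, the results agree with the PRIM_LAT_DEC and PRIM_LONG_DEC values in the two sample rows, so only the decimal columns are needed for mapping.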

Accuracy

There's probably a better (regular-expression-y) way to look up the geo-coordinates, but I just used VLOOKUP(C2,Places.$B$2:$K$35126,9,0) and did a bit of manual correction. In some cases I had to rely on local knowledge to find the nearest fit for points that weren't found in the database. I'd say 95% of the points are accurate to within 20 miles or so, but a few errors definitely slipped through (largely due to repeated place names within the state).
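As a rough scripted equivalent of that lookup (not the method used for the map), here is a sketch that matches each location against the place names and flags names that occur more than once. If the sheet keeps the column order of the sample above, column 9 of the B:K range is PRIM_LAT_DEC, so this version pulls both the decimal latitude and longitude. `build_index` and `lookup` are hypothetical helpers, and the "Example Creek" rows are invented purely to show the duplicate case.

```python
from collections import defaultdict


def build_index(places_rows):
    """Map lower-cased feature name -> list of (lat, lon), keeping every duplicate."""
    index = defaultdict(list)
    for row in places_rows:
        key = row["FEATURE_NAME"].strip().lower()
        index[key].append((float(row["PRIM_LAT_DEC"]), float(row["PRIM_LONG_DEC"])))
    return index


def lookup(index, location):
    """Return (coords, ambiguous); coords is None when the name is not found."""
    matches = index.get(location.strip().lower(), [])
    if not matches:
        return None, False
    # Like an exact-match VLOOKUP, take the first hit, but report if there were others.
    return matches[0], len(matches) > 1


places = [
    {"FEATURE_NAME": "Cape Hinchinbrook", "PRIM_LAT_DEC": "60.2347222", "PRIM_LONG_DEC": "-146.6416667"},
    # Invented rows, only to demonstrate a repeated place name.
    {"FEATURE_NAME": "Example Creek", "PRIM_LAT_DEC": "0.0", "PRIM_LONG_DEC": "0.0"},
    {"FEATURE_NAME": "Example Creek", "PRIM_LAT_DEC": "1.0", "PRIM_LONG_DEC": "1.0"},
]
index = build_index(places)
for name in ("Cape Hinchinbrook", "Example Creek", "Bird"):
    print(name, lookup(index, name))
```

The ambiguity flag doesn't resolve anything on its own, but it gives a list of rows to check by hand, which is where repeated place names caused most of the errors.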