Alaskan Outdoor Fatality Map
June 28, 2013
Data Sources
Tab-Separated Table (TSV File)
Spreadsheet (ODS Format)
Google Earth (KML File)
Geo Data (GeoJSON)
The data for this map was collected from akfatal.net. Tagline: "Many people have died in the Alaskan outdoors. Read their stories and learn from them."
The map itself was put together in CartoDB.
Data Smoothing
LibreOffice is a free software alternative to Microsoft Excel.
The original data is stored in an HTML table. It's pretty easy to import the data into Excel or LibreOffice, but after that there are a few inconsistencies to iron out.
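For anyone who would rather script the import than go through a spreadsheet, here is a minimal sketch using only Python's standard library. The sample markup is hypothetical: the real page's HTML isn't reproduced here, so the tag layout and the `TableRows` and `table_rows` names are assumptions made for illustration.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (indentation, newlines) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            if self._cell is not None and self._row is not None:
                self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr":
            if self._row is not None:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_rows(html):
    """Return the table in `html` as a list of rows of cell text."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical markup standing in for the real page.
sample = """
<table>
  <tr><th>Name</th><th>Date</th><th>Location</th></tr>
  <tr><td>Pete Egelzian</td><td>1920-04-28</td><td>Bird</td></tr>
</table>
"""
rows = table_rows(sample)
```

Each row comes back as a plain list of strings, which is easy to hand to the `csv` module if a TSV or CSV file is the goal.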
Time Formats
The date format is inconsistent across entries, but LibreOffice is pretty good at detecting alternate date formats. In some cases I found it easiest to save the spreadsheet as a CSV file and re-import it in order to clear out the text delimiters (') for unrecognized formats.
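The same clean-up can be sketched in Python by trying a list of candidate formats until one parses. The formats below are assumptions for illustration, not a list of what actually appears in the source table, and `normalize_date` is a hypothetical helper name.

```python
from datetime import datetime

# Candidate formats are guesses for illustration; extend the tuple to
# match whatever actually shows up in the data.
CANDIDATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%B %d, %Y", "%d %b %Y")


def normalize_date(text):
    """Return the date as YYYY-MM-DD, or None if no format matches."""
    # Drop surrounding whitespace and a leading apostrophe, the marker a
    # spreadsheet puts in front of values it has stored as plain text.
    cleaned = text.strip().lstrip("'")
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            continue
    return None
```

Returning `None` instead of raising keeps the unparsed entries easy to list and fix by hand afterwards.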
Geo-Referencing
CartoDB has a built-in georeferencing feature to extract latitude & longitude coordinates from place names. However, it's not very good at recognizing features like the rivers, glaciers, and peaks that make up most of the dataset. My first thought to get around this limitation was to try and hack National Geographic's Topo software to spit out a list of coordinates. I was disappointed to find that not only did Topo not run on Linux, it had been discontinued. Turning to Google, I happily discovered the USGS Geographic Names Information System (GNIS), which was probably where the Topo data came from anyway. I downloaded the Alaska dataset and imported it into a new sheet.
Accident Database Sample
Name | Date | Location | Activity | Cause of Death |
Pete Egelzian | 1920-04-28 | Bird | Railroad | Avalanche |
John Rudeen | 1920-04-28 | Bird | Railroad | Avalanche |
R. Romero | 1920-04-28 | Bird | Railroad | Avalanche |
USGS Database Sample
FEATURE_ID | FEATURE_NAME | FEATURE_CLASS | STATE_ALPHA | STATE_NUMERIC | COUNTY_NAME | COUNTY_NUMERIC | PRIMARY_LAT_DMS | PRIM_LONG_DMS | PRIM_LAT_DEC | PRIM_LONG_DEC | SOURCE_LAT_DMS | SOURCE_LONG_DMS | SOURCE_LAT_DEC | SOURCE_LONG_DEC | ELEV_IN_M | ELEV_IN_FT | MAP_NAME | DATE_CREATED | DATE_EDITED |
247074 | Pacific Ocean | Sea | CA | 6 | Mendocino | 45 | 391837N | 1235041W | 39.3102778 | −123.8447222 | | | | | 0 | 0 | Mendocino | 01/19/1981 | 05/16/2011 |
1397640 | Cape Hinchinbrook | Cape | AK | 2 | Valdez-Cordova (CA) | 261 | 601405N | 1463830W | 60.2347222 | −146.6416667 | | | | | 0 | 0 | Cordova A-8 | 01/01/2000 | 02/05/2009 |
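As a rough sketch of reading a file shaped like this sample without a spreadsheet, the snippet below assumes a pipe-delimited text export with a header row. The inline `SAMPLE` keeps only six of the columns for brevity, and `to_float` and `load_places` are hypothetical helper names.

```python
import csv
import io

# Trimmed-down stand-in for the USGS file: same column names as the
# sample above, but only the six columns this sketch needs.
SAMPLE = (
    "FEATURE_ID|FEATURE_NAME|FEATURE_CLASS|STATE_ALPHA|PRIM_LAT_DEC|PRIM_LONG_DEC\n"
    "247074|Pacific Ocean|Sea|CA|39.3102778|\u2212123.8447222\n"
    "1397640|Cape Hinchinbrook|Cape|AK|60.2347222|\u2212146.6416667\n"
)


def to_float(text):
    """Parse a decimal coordinate, tolerating a typographic minus sign."""
    return float(text.strip().replace("\u2212", "-"))


def load_places(handle, state="AK"):
    """Read pipe-delimited rows and keep only features in `state`."""
    places = []
    for row in csv.DictReader(handle, delimiter="|"):
        if row["STATE_ALPHA"].strip() != state:
            continue
        places.append({
            "name": row["FEATURE_NAME"].strip(),
            "feature_class": row["FEATURE_CLASS"].strip(),
            "lat": to_float(row["PRIM_LAT_DEC"]),
            "lon": to_float(row["PRIM_LONG_DEC"]),
        })
    return places


places = load_places(io.StringIO(SAMPLE))
```

The state filter is there because, as the first sample row shows, a file can contain features tagged with another state. The minus-sign replacement only matters if the numbers were copied from a web page or a typeset table.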
Accuracy
There's probably a better (regular-expression-y) way to look up the geo-coordinates, but I just used VLOOKUP(C2,Places.$B$2:$K$35126,9,0) and did a bit of manual correction. In some cases I had to rely on local knowledge to find the nearest fit for points that weren't found in the database. I'd say 95% of the points are accurate to within 20 miles or so, but a few errors definitely slipped through (largely due to repeated place names within the state).
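In that formula the final 0 asks for an exact match, and 9 picks the ninth column of the B:K range, which would be PRIM_LAT_DEC if the Places sheet keeps the USGS column order shown earlier. A rough Python equivalent of the exact-match lookup, which also flags the repeated place names, might look like the sketch below. The `build_index` and `lookup` names are hypothetical, the case-folding is a choice made here, and the "Example Creek" rows are invented purely to show the ambiguous case.

```python
from collections import defaultdict


def build_index(places):
    """Map lower-cased place name -> list of (lat, lon), in file order."""
    index = defaultdict(list)
    for name, lat, lon in places:
        index[name.strip().lower()].append((lat, lon))
    return index


def lookup(index, location):
    """Return ((lat, lon) or None, status) for an exact name match.

    Like VLOOKUP, the first match wins, but the status records when a
    name is missing or shared by several features so those rows can be
    checked by hand.
    """
    matches = index.get(location.strip().lower(), [])
    if not matches:
        return None, "missing"
    status = "ok" if len(matches) == 1 else "ambiguous"
    return matches[0], status


# Cape Hinchinbrook comes from the USGS sample; the two "Example Creek"
# rows are made up to demonstrate a repeated name.
places = [
    ("Cape Hinchinbrook", 60.2347222, -146.6416667),
    ("Example Creek", 61.0, -150.0),
    ("Example Creek", 64.0, -147.0),
]
index = build_index(places)
result = lookup(index, "cape hinchinbrook")
```

Sorting the output by status puts the "missing" and "ambiguous" rows together, which is where the manual correction effort would go.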