Five Public Datasets, and Lots of Ideas for Exploring Them


The world is full of interesting datasets. But even though data is increasingly accessible, it’s sometimes hard to think up an interesting problem to analyze. Maybe there are just too many possible questions, maybe it’s a pain to set up analytical tools, or maybe it’s just too easy to get distracted by animal GIFs.

Whatever the case, we want to make it easier to start working on interesting problems right away. Here are five datasets, already loaded into Mode’s public database, that you can query, analyze, and visualize right now.

For each dataset, I’ve provided a link to the table in Mode’s public data warehouse. If you’re feeling lazy and only want to work with a tiny amount of data (as in, one row), I found the best single row of data from each dataset. And if you’re feeling ambitious—and want to get popular on the internet or explain some things—I added some ideas for turning these datasets into maps.

FEC Campaign Finance Data

The Federal Election Commission requires candidates to make their campaign expenditures public. This dataset includes over 200,000 campaign expenditures from the 2012 U.S. presidential campaign, and is full of fascinating discoveries. Like Herman Cain’s $150,000 expense on Herman Cain. And the $5,000 Mitt Romney spent at liquor stores, the most of any candidate by far. And Ron Paul’s and Romney’s addiction to fast food (and Obama’s clear preference for Subway).

Herman Cain be like:
treat yo self
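
If you want to hunt for discoveries like these yourself, one way in is to filter expenditures by a keyword in the recipient name and total them per candidate. Here's a minimal sketch in Python. The row layout (candidate, recipient, amount) and every value in it are made-up placeholders, not the actual columns or figures in the FEC table, so adjust to whatever the real table looks like.

```python
from collections import defaultdict

# Hypothetical rows: (candidate, recipient, amount_usd).
# Placeholder values only, not figures from the FEC dataset.
SAMPLE_EXPENDITURES = [
    ("Candidate A", "Main Street Liquor", 120.00),
    ("Candidate A", "Airline Tickets Inc", 450.00),
    ("Candidate B", "Corner Liquor Store", 80.50),
    ("Candidate B", "Sandwich Shop", 12.25),
    ("Candidate A", "Downtown LIQUOR Mart", 30.00),
]


def total_by_candidate(rows, keyword):
    """Sum amounts per candidate for recipients whose name contains keyword."""
    totals = defaultdict(float)
    needle = keyword.lower()
    for candidate, recipient, amount in rows:
        # Case-insensitive substring match on the recipient name.
        if needle in recipient.lower():
            totals[candidate] += amount
    # Largest spender first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


liquor_totals = total_by_candidate(SAMPLE_EXPENDITURES, "liquor")
for candidate, total in liquor_totals:
    print(f"{candidate}: ${total:,.2f}")
```

Swap the keyword for "pizza" or "subway" and you have the start of the fast food comparison.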

Crunchbase

Crunchbase is quickly becoming the dataset of record for the startup and venture capital communities. It can provide information on anything from what industries are hot (biotech) to the potential effects of founder experience or age. The dataset includes funding, investment, and acquisition data on over 40,000 companies.

UFO Sightings

Quandl, which provides millions of free datasets on a vast range of subjects, added data on UFO sightings to Mode. The data includes the number of reported sightings by month. Quandl gets the data from the National UFO Reporting Center (and in case you need to report a sighting, they have a hotline).
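
Since the data comes as sightings per month, two easy first questions are how the totals change by year and which month was the busiest. A rough sketch in Python, assuming each row is a ("YYYY-MM", count) pair. That format and the numbers below are placeholders for illustration, not the real table layout or real sighting counts.

```python
from collections import Counter

# Hypothetical (month, sightings) pairs; placeholder values only.
SAMPLE_MONTHLY = [
    ("2012-06", 10),
    ("2012-07", 25),
    ("2013-06", 12),
    ("2013-07", 30),
]


def sightings_by_year(rows):
    """Roll monthly counts up into yearly totals."""
    totals = Counter()
    for month, count in rows:
        year = month.split("-")[0]  # "2012-06" -> "2012"
        totals[year] += count
    return dict(totals)


def peak_month(rows):
    """Return the (month, count) pair with the highest count."""
    return max(rows, key=lambda row: row[1])


yearly = sightings_by_year(SAMPLE_MONTHLY)
peak = peak_month(SAMPLE_MONTHLY)
print(yearly)
print(peak)
```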

FiveThirtyEight

FiveThirtyEight, Nate Silver’s data journalism site, produces a lot of great analysis. For some articles, they publish the underlying data on GitHub. If you want to explore their data or expand on their analyses, we’ve uploaded most of their datasets. A few topics include classic rock radio plays, the ages of Congressional representatives, World Cup predictions, and surveys about defining U.S. geographic regions and international cuisine preferences.

Holidays all over the world

This dataset includes a list of all the holidays in the world over the next year. While this data is useful for analysis, it could be even more valuable for figuring out which parts of the world—and which of your customers—are on vacation.

Ideas for More?

Inspired to do something fun with one of these datasets? Send us a link to your project on Twitter or Facebook, and we’ll share some of the best work! And if you want to make a map, we’ll soon be publishing a quick tutorial on how to make one, but feel free to email us if you have any questions now.

 