Do you want to make sense of messy data? Google Refine may prove to be the right tool! It allows for cleaning up messy data, transforming it from one format into another, extending it with web services, and linking it to databases like Freebase.
It takes only 8 minutes to watch the following introductory video!
Do you want to learn more? The next video explains how to transform a wiki page like this into a table by isolating rows of text with a filter and transforming them in one shot with a single command.
If you still have time to spend learning about Google Refine, you may watch the following video. It explains how to augment a dataset with external data. In particular, it shows:
- how to obtain the latitude and longitude of an address by invoking the Google Geocoding web service and parsing/filtering the JSON results;
- how to group text fields written in multiple languages by language using the Google Language Detection tool;
- how to link text fields to Freebase IDs using the Freebase reconciliation service; and
- how to augment your data set with Freebase data.
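
To give a feel for the first item, here is a minimal Python sketch of the "parse the JSON results" step. It is not taken from the video, and it works outside Google Refine. The sample response below is made up for illustration, though it follows the general shape of a geocoding reply (a `status` field and a `results` list whose entries carry `geometry.location.lat` and `geometry.location.lng`). The function name `extract_lat_lng` is hypothetical.

```python
import json

# Illustrative response only: the address and coordinates are sample values,
# not real output from the service or from the video.
SAMPLE_RESPONSE = """
{
  "status": "OK",
  "results": [
    {
      "formatted_address": "Example Street 1, Example City",
      "geometry": {"location": {"lat": 37.4224, "lng": -122.0842}}
    }
  ]
}
"""


def extract_lat_lng(response_text):
    """Return (lat, lng) of the first result, or None if nothing usable came back."""
    data = json.loads(response_text)
    # Treat any non-OK status or an empty result list as "no coordinates".
    if data.get("status") != "OK" or not data.get("results"):
        return None
    location = data["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]


if __name__ == "__main__":
    print(extract_lat_lng(SAMPLE_RESPONSE))
```

Taking only the first result is a simplification. An ambiguous address may return several candidates, which is where the filtering step mentioned above comes in.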

