The Data Science Toolkit is a “collection of the best open data sets and open-source tools for data science, wrapped in an easy-to-use REST/JSON API with command line, Python and Javascript interfaces”.
Examples of the services provided by the toolkit are:
- Street Address to Coordinates conversion: calculates the latitude/longitude coordinates for a postal address. Currently restricted to the US and UK.
- File to Text conversion: extracts text from PDFs, Word documents and Excel spreadsheets. It also recovers text from JPEG, PNG or TIFF images of scanned documents.
- Coordinates to Political Areas conversion: returns the country, region, state, county, constituencies and neighborhood that contain a given point.
- GeoDict: pulls country, city and region names from unstructured English text and returns their coordinates.
- IP Address to Coordinates conversion: calculates the country, state, city and latitude/longitude coordinates for an IP address.
The toolkit also contains services for text analysis, such as the Text To People and the Text To Time services.
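As a rough illustration of how such a REST/JSON service might be called from Python, the sketch below builds a request URL and decodes the JSON reply. The base URL, the service path names (`ip2coordinates`, `street2coordinates`) and the `<base>/<service>/<argument>` URL layout are assumptions made for this example and are not taken from the text above. The helper names `build_url` and `call_service` are hypothetical, so check the toolkit's own documentation before relying on any of them.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: address of a running toolkit server; point this at your own instance.
DSTK_BASE_URL = "http://www.datasciencetoolkit.org"


def build_url(service, argument, base_url=DSTK_BASE_URL):
    """Build a GET URL of the assumed form <base>/<service>/<url-encoded argument>."""
    return f"{base_url.rstrip('/')}/{service}/{quote(argument, safe='')}"


def call_service(service, argument, base_url=DSTK_BASE_URL, timeout=10):
    """Send the request and decode the JSON body.

    Raises urllib.error.URLError (or a subclass) if the server is unreachable
    or answers with an HTTP error status.
    """
    url = build_url(service, argument, base_url)
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (requires network access and a reachable server):
# result = call_service("ip2coordinates", "8.8.8.8")
```

Keeping URL construction separate from the network call makes the request format easy to inspect and to adapt if the server uses a different layout.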
The latest version is marked as 0.35 and was released on April 17th, 2011. The Data Science Toolkit was assembled by Pete Warden, and the source code is available at http://github.com/petewarden/dstk
powered by a real-time sentiment analysis on Twitter and other social networks. Mombo provides an 

