Browsing Posts tagged API

The Data Science Toolkit is a “collection of the best open data sets and open-source tools for data science, wrapped in an easy-to-use REST/JSON API with command line, Python and Javascript interfaces”.

Examples of the services provided by the toolkit are:

  • Street Address to Coordinates conversion: calculates the latitude/longitude coordinates for a postal address.
    Currently restricted to the US and UK.
  • File to Text conversion: extracts text from PDFs, Word documents and Excel spreadsheets. It also recovers text from JPEG, PNG or TIFF images of scanned documents.
  • Coordinates to political areas conversion: returns the country, region, state, county, constituencies and neighborhood a point is inside.
  • GeoDict: pulls country, city and region names from unstructured English text, and returns their coordinates.
  • IP Address to Coordinates conversion: calculates country, state, city and latitude/longitude coordinates for IP addresses.

The toolkit also contains services for text analysis, such as the Text To People and the Text To Time services.
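As a rough illustration of how such REST/JSON services are typically called, the sketch below builds request URLs for the address and IP geocoding services listed above. The base URL and the endpoint names `street2coordinates` and `ip2coordinates` are assumptions made for this example, and the helper names are hypothetical; check the toolkit's own documentation before relying on them.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: address of a running Data Science Toolkit server.
DSTK_BASE = "http://www.datasciencetoolkit.org"


def street2coordinates_url(address, base=DSTK_BASE):
    """Build a GET URL asking the toolkit to geocode one postal address.

    The endpoint name is assumed; the address is percent-encoded so that
    spaces and commas survive as a single path segment.
    """
    return f"{base}/street2coordinates/{quote(address, safe='')}"


def ip2coordinates_url(ip, base=DSTK_BASE):
    """Build a GET URL asking the toolkit to locate one IP address (assumed endpoint)."""
    return f"{base}/ip2coordinates/{quote(ip, safe='')}"


def fetch_json(url, timeout=10):
    """Download a URL and decode the JSON body. Requires network access."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_json(street2coordinates_url("..."))` against a live server would then return the decoded JSON response, whose exact structure depends on the toolkit version.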

The latest version is 0.35, released on April 17, 2011. The Data Science Toolkit was assembled by Pete Warden, and the source code is available at http://github.com/petewarden/dstk

The ParkWhiz API gives developers access to ParkWhiz’s real-time parking and event data for major US cities, airports, venues, and events. The API deals with four types of objects:

  • Parking locations: a specific geographic location; it contains the description of the physical location of a parking spot.
  • Parking listings: pricing and availability information for a parking location.
  • Venues: points of interest, such as theatres, stadiums, or airports. A venue object describes the geographic location of the venue, as well as any events occurring at that venue.
  • Events: describe the start, end, and name of events occurring at a venue. Think of events as a pre-built search query for parking.

The API makes it possible to search for available parking at a specific location and time, and also to create a ParkWhiz reservation.
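To make the relationship between the four object types concrete, here is a minimal data-model sketch. Every class, field, and function name below is hypothetical and chosen for illustration only; none of them is taken from the actual ParkWhiz API. It shows in particular how an event can act as a pre-built parking search: the venue supplies the place and the event supplies the time window.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class ParkingLocation:
    """Physical description of where a parking spot is (hypothetical fields)."""
    name: str
    latitude: float
    longitude: float


@dataclass
class ParkingListing:
    """Pricing and availability for one parking location (hypothetical fields)."""
    location: ParkingLocation
    price_usd: float
    available_spots: int


@dataclass
class Event:
    """Name and time window of something happening at a venue."""
    name: str
    start: datetime
    end: datetime


@dataclass
class Venue:
    """A point of interest together with the events it hosts."""
    name: str
    latitude: float
    longitude: float
    events: List[Event] = field(default_factory=list)


def search_params_for_event(venue, event):
    """Turn a venue/event pair into search parameters: where and when to park.

    The parameter names are illustrative, not the real API's.
    """
    return {
        "lat": venue.latitude,
        "lng": venue.longitude,
        "start": event.start.isoformat(),
        "end": event.end.isoformat(),
    }
```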

API

Mombo is a movie recommendation and rating Web application powered by real-time sentiment analysis of Twitter and other social networks. Mombo provides an API that gives developers access to topical lists of movies (in theaters, most popular, coming soon) and to specific information about individual movies in the Mombo.com database.

Metropolitan Area API

The WMATA API provides access to Washington Metropolitan Area Transit Authority transparent data sets, including information about:

  • rail and bus stations
  • rail and bus lines
  • bus positions
  • arrival time estimates
  • incident information (rail, elevators, bus)

Google Refine


Do you want to make sense of messy data? Google Refine may prove to be the right tool! It allows for cleaning up messy data, transforming it from one format into another, extending it with web services, and linking it to databases like Freebase.

It takes only 8 minutes to watch the following introductory video!

Do you want to learn more? The next video explains how to transform a wiki page like this into a table by isolating rows of text with a filter and transforming them in one shot with a command.

If you still have time to spend learning about Refine, you may watch the following video. It explains how to augment a dataset with external data. In particular, it shows

Google Shopping API


Google announced the release of the Shopping API, a new set of Web Application Programming Interfaces (APIs) meant to replace the existing Google Base APIs. The new Shopping APIs have two main components: Content and Search. These components are part of a single CRUD infrastructure for product data management.

On one hand, the Content API enables retailers to upload their product data to Google, and to make incremental updates to frequently changing attributes like price and availability.

On the other hand, the Search API provides access to product data. After creating a new project in the APIs console, a developer can issue queries such as the following one:

https://www.googleapis.com/shopping/search/v1/public/products?key=key&country=US&q=digital+camera&alt=atom

This query returns a feed of products sold in the United States that match the keywords digital and camera. With a registered account, the new Google Shopping API has a default limit of 2,500 queries per day.
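The query above can also be assembled programmatically. The sketch below uses only the endpoint and the four parameters that appear in the example URL (`key`, `country`, `q`, `alt`); the function name is hypothetical, and the string `key` stands in for a real API key, as it does in the example.

```python
from urllib.parse import urlencode

# Endpoint taken from the example query above.
SEARCH_ENDPOINT = "https://www.googleapis.com/shopping/search/v1/public/products"


def build_search_url(api_key, keywords, country="US", alt="atom"):
    """Build a product search URL from an API key and free-text keywords.

    urlencode escapes each value and turns spaces into '+', so
    "digital camera" becomes "digital+camera".
    """
    params = [
        ("key", api_key),
        ("country", country),
        ("q", keywords),
        ("alt", alt),
    ]
    return SEARCH_ENDPOINT + "?" + urlencode(params)


# Placeholder key, as in the example URL; a real key comes from the APIs console.
url = build_search_url("key", "digital camera")
```

Keeping the parameters in an ordered list, rather than concatenating strings by hand, avoids escaping mistakes when the keywords contain spaces or punctuation.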

The API supports both structured and free-text search. Results can be ordered by relevance, novelty, or price. It is possible to increase the diversity of the products matching a query by using the API's crowding mechanism, which restricts the number of products that share the same value for a given property.

The Google Base API will be fully deactivated on June 1, 2011. Some non-shopping data types (such as jobs, real estate, events, and activities) won’t be supported anymore.

Mechanical Turk (Mturk) is a Web service where users, called turkers, are paid small rewards (a few cents) for short tasks called HITs (Human Intelligence Tasks). A contractor generates the HITs, posts them on Mturk and later downloads all the results.

TurKit is a Java/JavaScript API (developed by the Design Group at MIT) for running iterative tasks on Mechanical Turk. As of today, TurKit is the first example of an iterative task framework for Mturk: it allows users to perform incremental tasks by automatically generating HITs based on the results of previous HITs.

Many applications can benefit from this iterative paradigm: turkers can take turns improving a passage of text, verify each other’s work by voting on it, or implement the comparison function of an iterative sorting algorithm. In the context of SeCo, turkers can be employed, for instance, to evaluate the quality of a query response.
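The improve-then-vote pattern can be sketched in a few lines. This is not TurKit code and makes no Mechanical Turk calls: the `improve` and `vote` callables are hypothetical stand-ins for HITs that would be posted to workers, and the simulated workers at the end exist only to make the example runnable.

```python
def iterative_improve(text, improve, vote, rounds=3):
    """Alternate improvement and voting steps.

    improve(text) -> str   stands in for a HIT where one worker edits the text.
    vote(old, new) -> bool stands in for a HIT where workers vote on whether
                           the new version is better than the old one.
    A candidate replaces the current text only if the vote accepts it.
    Returns the final text and the list of accepted versions.
    """
    history = [text]
    for _ in range(rounds):
        candidate = improve(text)
        if vote(text, candidate):
            text = candidate
            history.append(text)
    return text, history


# Simulated workers: the "improver" appends a mark and the "voters"
# accept a new version only while it stays within five characters.
def fake_improve(text):
    return text + "!"


def fake_vote(old, new):
    return len(new) <= 5


final_text, versions = iterative_improve("abc", fake_improve, fake_vote, rounds=4)
```

Each round depends on the result of the previous one, which is what distinguishes this from posting a batch of independent HITs.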


© 2012 Search Computing Blog