How to use the Yelp API in Python
In last week’s post, I pulled data about local restaurants from Yelp to generate a dataset. I was happy to find that Yelp actually has a very friendly API. This guide will walk you through setting up some boilerplate code that you can then configure to your specific needs.
Step 1: Obtaining Access to the Yelp API
Before you can use the Yelp API, you need to submit a developer request. This can be done here. I’m not sure what the requirements are, but my guess is they approve almost everyone. After getting access, you will need to get your API keys from the Manage API access section on the site.
Step 2: Getting the rauth library
Yelp’s API uses OAuth authentication for API calls. Unless you want to do a lot of work, I suggest that you use a third-party library to handle the OAuth for you. For this tutorial I’m using rauth, but feel free to use any library of your choice. You can run easy_install rauth or pip install rauth to download the library.
Step 3: Write the code to query the Yelp API
You’ll first need to figure out what information you actually want to query. The API Documentation lists all of the parameters that you can specify and the correct syntax for each. For this example, we’re going to be doing some location-based searching for restaurants. If you store each of the search parameters in a dictionary, you can save yourself some formatting. Here’s a function that accepts a latitude and longitude and returns the search parameter dictionary:
def get_search_parameters(lat, long):
    # See the Yelp API documentation for more details
    params = {}
    params["term"] = "restaurant"
    params["ll"] = "{},{}".format(lat, long)
    params["radius_filter"] = "2000"
    params["limit"] = "10"
    return params
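To check what the API will actually receive, it can help to URL-encode the dictionary yourself. The sketch below is only an illustration and assumes Python 3: `build_search_parameters` is a hypothetical, slightly more general variant of the function above (the keyword arguments and their names are my own), and it uses the standard library's `urlencode` rather than anything Yelp-specific.

```python
from urllib.parse import urlencode


def build_search_parameters(lat, lng, term="restaurant", radius_m=2000, limit=10):
    """Return a search-parameter dictionary (hypothetical helper)."""
    return {
        "term": term,
        "ll": "{},{}".format(lat, lng),
        "radius_filter": str(radius_m),
        "limit": str(limit),
    }


params = build_search_parameters(39.98, -82.98)
# urlencode percent-escapes the comma in the "ll" value
query_string = urlencode(params)
print(query_string)
```

Printing the query string this way is a quick sanity check before you spend any of your daily API calls.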
Next we need to build our actual API call. Using the codes from the Manage API access page, we’re going to create an OAuth session. After we have a session, we can make an actual API call using our search parameters. Finally, we take that data and put it into a Python dictionary.
import rauth

def get_results(params):
    # Obtain these from Yelp's Manage API access page
    consumer_key = "YOUR_KEY"
    consumer_secret = "YOUR_SECRET"
    token = "YOUR_TOKEN"
    token_secret = "YOUR_TOKEN_SECRET"

    # Create an OAuth 1.0 session that signs each request with the four credentials
    session = rauth.OAuth1Session(
        consumer_key=consumer_key,
        consumer_secret=consumer_secret,
        access_token=token,
        access_token_secret=token_secret)

    request = session.get("http://api.yelp.com/v2/search", params=params)

    # Transform the JSON API response into a Python dictionary
    data = request.json()
    session.close()

    return data
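Hard-coding keys in the function is fine for a quick experiment, but it makes the script awkward to share. One option, sketched here, is to read them from environment variables. The variable names (`YELP_CONSUMER_KEY` and so on) and the `load_credentials` helper are my own invention, not something Yelp or rauth defines.

```python
import os

# Hypothetical environment variable names; pick whatever suits your setup.
CREDENTIAL_VARS = {
    "consumer_key": "YELP_CONSUMER_KEY",
    "consumer_secret": "YELP_CONSUMER_SECRET",
    "access_token": "YELP_TOKEN",
    "access_token_secret": "YELP_TOKEN_SECRET",
}


def load_credentials(environ=None):
    """Collect the four OAuth values, failing loudly if any is missing."""
    environ = os.environ if environ is None else environ
    missing = [var for var in CREDENTIAL_VARS.values() if not environ.get(var)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(sorted(missing)))
    return {name: environ[var] for name, var in CREDENTIAL_VARS.items()}
```

The keys of the returned dictionary match the keyword arguments passed to `rauth.OAuth1Session` above, so the result could be unpacked straight into it with `rauth.OAuth1Session(**load_credentials())`.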
Now we can put it all together. Since Yelp will only return a max of 40 results at a time, you will likely want to make several API calls if you’re putting together any sort of sizable dataset. Currently, Yelp allows 10,000 API calls per day, which should be way more than enough for compiling a dataset! However, when I’m making repeated API calls, I always make sure to rate-limit myself. Companies with APIs will almost always have mechanisms in place to prevent too many requests from being made at once. Often this is done by IP address: they may only handle X calls in Y time per IP, or X concurrent calls per IP, and so on. If you rate-limit yourself, you increase your chances of always getting back a response.
import time

def main():
    locations = [(39.98, -82.98), (42.24, -83.61), (41.33, -89.13)]
    api_calls = []
    for lat, long in locations:
        params = get_search_parameters(lat, long)
        api_calls.append(get_results(params))
        # Be a good internet citizen and rate-limit yourself
        time.sleep(1.0)

    # Do other processing

if __name__ == "__main__":
    main()
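The same loop can be written so that the pause is applied only between calls and the fetch function is passed in, which makes it easy to try out without touching the network. This is a sketch under my own assumptions: `fetch_all`, its `delay` argument, and the stand-in `fake_fetch` are hypothetical names, not part of the script above.

```python
import time


def fetch_all(params_list, fetch, delay=1.0, sleep=time.sleep):
    """Call fetch(params) for each entry, pausing `delay` seconds between calls."""
    results = []
    for index, params in enumerate(params_list):
        if index > 0:
            # Pause before every call except the first
            sleep(delay)
        results.append(fetch(params))
    return results


def fake_fetch(params):
    """Stand-in for a real API call; it just echoes the parameters."""
    return {"echo": params}


responses = fetch_all(
    [{"ll": "39.98,-82.98"}, {"ll": "42.24,-83.61"}],
    fake_fetch,
    delay=0.0,
)
print(len(responses))
```

In the real script you would pass `get_results` as `fetch` and leave `delay` at a second or so.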
At this point you have a list of dictionaries that represent each of the API calls you made. You can then do whatever additional processing you want to each of those dictionaries to extract the information you are interested in. When working with a new API, I sometimes find it useful to open an interactive Python session and actually play with the API responses in the console. This helps me understand the structure so I can code the logic to find what I’m looking for. You can get this complete script here. Every API is different, but Yelp is a friendly introduction to the world of making API calls through Python. With this skill you can construct your own datasets from any of the companies with public APIs.
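As an example of that kind of post-processing, here is a small sketch that flattens the list of responses into rows. It assumes each response dictionary has a "businesses" list whose entries carry "name" and "rating" keys; check a real response in the console before relying on that. The `summarize` function and the sample data are made up for illustration.

```python
def summarize(api_calls):
    """Flatten a list of search responses into (name, rating) tuples."""
    rows = []
    for response in api_calls:
        # .get() guards against responses without the assumed "businesses" key
        for business in response.get("businesses", []):
            rows.append((business.get("name"), business.get("rating")))
    return rows


# Made-up sample data shaped like the assumed response, not real Yelp output
sample_calls = [
    {"businesses": [{"name": "Example Diner", "rating": 4.0}]},
    {"businesses": []},
    {"error": {"text": "placeholder"}},
]
rows = summarize(sample_calls)
print(rows)
```

Using `.get()` throughout means a response that came back as an error simply contributes no rows instead of raising a `KeyError`.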