How to use the Yelp API in Python

In last week’s post, I pulled data about local restaurants from Yelp to generate a dataset. I was happy to find that Yelp actually has a very friendly API. This guide will walk you through setting up some boilerplate code that you can then configure for your specific needs.

Step 1: Obtaining Access to the Yelp API

Before you can use the Yelp API, you need to submit a developer request. This can be done here. I’m not sure what the requirements are, but my guess is they approve almost everyone. After getting access, you will need to get your API keys from the Manage API access section on the site.

Step 2: Getting the rauth library

Yelp’s API uses OAuth authentication for API calls. Unless you want to do a lot of work, I suggest that you use a third-party library to handle the OAuth for you. For this tutorial I’m using rauth, but feel free to use any library of your choice.

You can use easy_install rauth or pip install rauth to download the library.

Step 3: Write the code to query the Yelp API

You’ll first need to figure out what information you actually want to query. The API Documentation gives you all of the different parameters that you can specify and the correct syntax.

For this example, we’re going to be doing some location-based searching for restaurants. If you store each of the search parameters in a dictionary, you can save yourself some formatting. Here’s a method that accepts a latitude and longitude and returns the search parameter dictionary:

def get_search_parameters(lat, long):
	# See the Yelp API documentation for the full list of parameters
	params = {}
	params["term"] = "restaurant"
	# "ll" takes the latitude and longitude as a comma-separated pair
	params["ll"] = "{},{}".format(lat, long)
	params["radius_filter"] = "2000"
	params["limit"] = "10"

	return params
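
To see what the dictionary saves you from formatting by hand, here is a small sketch that encodes the same parameters into a query string with the standard library's urllib.parse.urlencode. It is only an illustration, since rauth builds the real request for you, and build_query_string is a hypothetical helper name that is not part of the original script.

```python
from urllib.parse import urlencode

def build_query_string(lat, long):
    # Same parameters as get_search_parameters, repeated here so the
    # example runs on its own.
    params = {
        "term": "restaurant",
        "ll": "{},{}".format(lat, long),
        "radius_filter": "2000",
        "limit": "10",
    }
    # urlencode percent-encodes each value and joins the pairs with "&"
    return urlencode(params)

query = build_query_string(39.98, -82.98)
print(query)
```

Note that the comma inside the "ll" value gets percent-encoded, which is exactly the kind of detail that is easy to get wrong when building the URL by hand.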

Next we need to build our actual API call. Using the codes from the Manage API access page, we’re going to create an OAuth session. After we have a session, we can make an actual API call using our search parameters. Finally, we take that data and put it into a Python dictionary.

import rauth

def get_results(params):

	# Obtain these from Yelp's Manage API access page
	consumer_key = "YOUR_KEY"
	consumer_secret = "YOUR_SECRET"
	token = "YOUR_TOKEN"
	token_secret = "YOUR_TOKEN_SECRET"

	session = rauth.OAuth1Session(
		consumer_key=consumer_key,
		consumer_secret=consumer_secret,
		access_token=token,
		access_token_secret=token_secret)

	# Send the signed GET request with the search parameters
	response = session.get("http://api.yelp.com/v2/search", params=params)

	# Transforms the JSON API response into a Python dictionary
	data = response.json()
	session.close()

	return data
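
A failed call still comes back as JSON, so get_results will happily return it. The sketch below shows one way to catch that early. It assumes the error shape that a reader posted in the comments (an "error" key holding "id", "text" and "description"); YelpAPIError and check_for_error are hypothetical names introduced for illustration, not part of rauth or the Yelp API.

```python
class YelpAPIError(Exception):
    """Raised when a response dictionary carries an "error" entry."""

def check_for_error(data):
    # Assumed error shape: {"error": {"id": ..., "text": ..., "description": ...}}
    error = data.get("error")
    if error is not None:
        raise YelpAPIError("{}: {}".format(error.get("id"), error.get("text")))
    return data

# A normal response passes through unchanged
ok = check_for_error({"total": 0, "businesses": []})

# An error response raises instead of flowing into later processing
try:
    check_for_error({"error": {"id": "INVALID_SIGNATURE",
                               "text": "Signature was invalid"}})
except YelpAPIError as exc:
    message = str(exc)
```

You could call something like this on the dictionary returned by get_results before appending it to your results.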

Now we can put it all together. Since Yelp will only return a max of 40 results at a time, you will likely want to make several API calls if you’re putting together any sort of sizable dataset. Currently, Yelp allows 10,000 API calls per day, which should be way more than enough for compiling a dataset! However, when I’m making repeated API calls, I always make sure to rate-limit myself.

Companies with APIs will almost always have mechanisms in place to prevent too many requests from being made at once. Often this is done by IP address. They may have some code in place to only handle X calls in Y time per IP, or X concurrent calls per IP, etc. If you rate-limit yourself, you increase your chances of always getting back a response.
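
As a rough sketch of the "X calls in Y time" idea from the client side, the class below enforces a minimum gap between calls. MinIntervalLimiter is a hypothetical helper written for illustration, not something rauth or Yelp provides, and the 0.05 second interval is only there to keep the demo fast.

```python
import time

class MinIntervalLimiter:
    """Blocks so that consecutive calls are at least min_interval seconds apart."""

    def __init__(self, min_interval):
        self.min_interval = min_interval
        self._last_call = None

    def wait(self):
        now = time.monotonic()
        if self._last_call is not None:
            # Sleep only for the part of the interval that has not yet elapsed
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                time.sleep(remaining)
        self._last_call = time.monotonic()

limiter = MinIntervalLimiter(0.05)
start = time.monotonic()
for _ in range(3):
    limiter.wait()
    # an API call would go here
elapsed = time.monotonic() - start
```

The main function below takes the simpler route of a fixed time.sleep after every call, which is plenty for a handful of requests.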

import time

def main():
	locations = [(39.98, -82.98), (42.24, -83.61), (41.33, -89.13)]
	api_calls = []
	for lat, long in locations:
		params = get_search_parameters(lat, long)
		api_calls.append(get_results(params))
		# Be a good internet citizen and rate-limit yourself
		time.sleep(1.0)

	# Do other processing with api_calls here

if __name__ == "__main__":
	main()

At this point you have a list of dictionaries that represent each of the API calls you made. You can then do whatever additional processing you want to each of those dictionaries to extract the information you are interested in.
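
As one example of that additional processing, here is a minimal sketch that flattens the responses into CSV text. The field names ("businesses", "name", "rating", "review_count") are taken from the sample response a reader posted in the comments, so treat them as assumptions about the response shape; businesses_to_csv is a hypothetical helper name.

```python
import csv
import io

def businesses_to_csv(api_calls, fields=("name", "rating", "review_count")):
    buffer = io.StringIO()
    # extrasaction="ignore" drops any business keys that are not in fields
    writer = csv.DictWriter(buffer, fieldnames=fields,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for response in api_calls:
        # Responses without a "businesses" key (e.g. errors) are skipped
        for business in response.get("businesses", []):
            writer.writerow(business)
    return buffer.getvalue()

sample_calls = [
    {"total": 876, "businesses": [
        {"name": "Foo Foo Tei", "rating": 4.0, "review_count": 1332,
         "phone": "6269376585"},
    ]},
    {"error": {"id": "INVALID_SIGNATURE"}},
]
csv_text = businesses_to_csv(sample_calls)
print(csv_text)
```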

When working with a new API, I sometimes find it useful to open an interactive Python session and actually play with the API responses in the console. This helps me understand the structure so I can code the logic to find what I’m looking for.
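
If the raw response is too long to read comfortably in the console, a small summary of its keys and types can help. The describe function below is a hypothetical helper sketched for this purpose; it only looks at the first item of each list, on the assumption that list items share a shape.

```python
def describe(value, depth=0, max_depth=2):
    """Return lines summarizing the keys and types in a nested response."""
    lines = []
    if isinstance(value, dict) and depth < max_depth:
        for key in sorted(value):
            lines.append("{}{}: {}".format("  " * depth, key,
                                           type(value[key]).__name__))
            lines.extend(describe(value[key], depth + 1, max_depth))
    elif isinstance(value, list) and value and depth < max_depth:
        # Only inspect the first element of a list
        lines.extend(describe(value[0], depth, max_depth))
    return lines

sample = {"total": 1, "businesses": [{"name": "x", "rating": 4.0}]}
summary = describe(sample)
print("\n".join(summary))
```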

You can get this complete script here. Every API is different, but Yelp is a friendly introduction to the world of making API calls through Python. With this skill you can construct your own datasets from any of the companies with public APIs.


10 thoughts on “How to use the Yelp API in Python”

  1. Harry

    When attempting this script – Python returns
    File "API1.py", line 31
    return data
    SyntaxError: 'return' outside function

    28 #Transforms the JSON API response into a Python dictionary
    29 data = request.json()
    30 session.close()
    31 return data

    Any thoughts?

  2. Nathan Burnham

    What kind of latency are you getting with the api?

    It is taking me something like 25 seconds to authenticate and retrieve the data.
    Does that sound right?

    1. Phillip Post author

      That doesn’t sound too extreme if you’re pulling a lot of data down. If you want to eliminate variables, you can always just use curl with your parameters to see if that’s any faster. You might also try to make a request that will return no data or a request that will intentionally 404 to see where the slow down is.

  3. Renato Utsch

    Hello,

    I am trying to make that work with my python script, but all my requests are failing with the following error:
    ———–
    {'error': {'description': 'Invalid signature. Expected signature base string: GET&http%3A%2F%2Fapi.yelp.com%2Fv2%2Fsearch&location%3Dlocation%26oauth_consumer_key%3D1sqhFxRdGZS1Ksy_tLcLxg%26oauth_nonce%3D80fec8017c0a0834ca225761c2c9246522b81192%26oauth_signature_method%3DHMAC-SHA1%26oauth_timestamp%3D1410534092%26oauth_token%3D6_ewtaaJU5YWM4MTiYBVUOoqmmKWqrOx%26oauth_version%3D1.0%26term%3Dsearchterm', 'id': 'INVALID_SIGNATURE', 'text': 'Signature was invalid'}}
    ————-

    Upon inspecting that more closely, I see that oauth_signature is not present in the request, and it is required by yelp’s API. Why is that?

    The code:
    ———
    params = {}
    params['term'] = urllib.parse.quote(query)
    params['location'] = urllib.parse.quote(self.city)

    session = rauth.OAuth1Session(consumer_key=self._consumer_key,
                                  consumer_secret=self._consumer_secret,
                                  access_token=self._token,
                                  access_token_secret=self._token_secret)

    response = session.get("http://api.yelp.com/v2/search", params=params)
    return response.json()
    ———————

    How can I make the request work?

    1. Phillip Post author

      This is most likely an issue with your keys. It means that the signature created by the OAuth library does not match what Yelp is expecting. Double-check all your keys and make sure that there are no extra special characters. Also compare the URL in the error message to the actual URL you are generating. That may help to point out the differences.
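
To make that comparison easier, here is a simplified sketch of how an OAuth 1.0 signature base string is put together: the HTTP method, the percent-encoded URL, and the percent-encoded, sorted parameters, joined with "&". It is an illustration only. It ignores details of the full specification such as query strings already present in the URL, and in a real request the oauth_* parameters from the library are part of the parameter set as well. signature_base_string is a hypothetical helper, not a rauth function.

```python
from urllib.parse import quote

def percent_encode(value):
    # OAuth 1.0 leaves only the RFC 3986 unreserved characters unescaped
    return quote(str(value), safe="-._~")

def signature_base_string(method, url, params):
    # Encode keys and values, then sort the encoded pairs
    pairs = sorted((percent_encode(k), percent_encode(v))
                   for k, v in params.items())
    normalized = "&".join("{}={}".format(k, v) for k, v in pairs)
    return "&".join([method.upper(),
                     percent_encode(url),
                     percent_encode(normalized)])

base = signature_base_string(
    "GET",
    "http://api.yelp.com/v2/search",
    {"term": "searchterm", "location": "location"},
)
print(base)
```

The printed string has the same layout as the one in the error message above, minus the oauth_* entries, so you can line the two up and look for parameters that differ.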

  4. Frank

    Hi Phillip,

    I’m currently using your script to query the Yelp API (thanks), and end up with a block of JSON embedded in a Python list. Like this:

    [{u'region': {u'span': {u'latitude_delta': 0.0215103900000031, u'longitude_delta': 0.024831400000010717}, u'center': {u'latitude': 34.00977745, u'longitude': -117.98871299999999}}, u'total': 876, u'businesses': [{u'is_claimed': True, u'distance': 3009.570851637409, u'mobile_url': u'http://m.yelp.com/biz/foo-foo-tei-hacienda-heights', u'rating_img_url': u'http://s3-media4.fl.yelpcdn.com/assets/2/www/img/c2f3dd9799a5/ico/stars/v1/stars_4.png', u'review_count': 1332, u'name': u'Foo Foo Tei', u'rating': 4.0, u'url': u'http://www.yelp.com/biz/foo-foo-tei-hacienda-heights', u'categories': [[u'Japanese', u'japanese']], u'is_closed': False, u'phone': u'6269376585', u'snippet_text': u"This place lives up to its hype!\n\nIf it's your first time here, you'd be surprised with that many ramen soup choices they offer off from menu. Def will try...", u'image_url': u'http://s3-media4.fl.yelpcdn.com/bphoto/sWEFkZw2u39YFSd4qmhlhg/ms.jpg', u'location': {u'city': u'Hacienda Heights', u'display_address': [u'15018 Clark Ave', u'Hacienda Heights, CA 91745'], u'postal_code': u'91745', u'country_code': u'US', u'address': [u'15018 Clark Ave'], u'state_code': u'CA'}, u'display_phone': u'+1-626-937-6585', u'rating_img_url_large': u'http://s3-media2.fl.yelpcdn.com/assets/2/www/img/ccf2b76faa2c/ico/stars/v1/stars_large_4.png', u'id': u'foo-foo-tei-hacienda-heights', u'snippet_image_url': u'http://s3-media3.fl.yelpcdn.com/photo/anEM2GKE1bl26QD1sq00BA/ms.jpg', u'rating_img_url_small': u'http://s3-media4.fl.yelpcdn.com/assets/2/www/img/f62a5be2f902/ico/stars/v1/stars_small_4.png'}]}]

    I’m rather new to APIs and the Python language but am trying to learn. Would you be able to provide any hints as to how one can instantiate some Python classes, store this raw output into a readable format, and perhaps even read the data into a dataframe of some sort? I’ve been stuck for quite some time now and am reaching out for some guidance if you could spare some.

    Thanks for the help,

    Frank Chen

    1. Phillip Post author

      Well, using json.loads you can easily convert JSON to Python objects (the .json() call in get_results already does this for you). Then you traverse the lists and dictionaries the same way you would any ordinary dictionary or list. In this example, you would probably loop through each of the “businesses” and create a new custom Business object for each item in the list.
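
A minimal sketch of that loop, assuming the response shape from the sample above. The Business class and its four fields are a hypothetical choice made for illustration; pick whichever fields you actually need.

```python
from dataclasses import dataclass

@dataclass
class Business:
    name: str
    rating: float
    review_count: int
    city: str

def parse_businesses(response):
    businesses = []
    for item in response.get("businesses", []):
        businesses.append(Business(
            name=item["name"],
            rating=item["rating"],
            review_count=item["review_count"],
            # "location" is a nested dictionary; fall back to "" if missing
            city=item.get("location", {}).get("city", ""),
        ))
    return businesses

sample = {"total": 876, "businesses": [{
    "name": "Foo Foo Tei",
    "rating": 4.0,
    "review_count": 1332,
    "location": {"city": "Hacienda Heights"},
}]}
parsed = parse_businesses(sample)
print(parsed[0].name, parsed[0].rating)
```

A list of objects like this can be handed to most tabular tools if you later want a dataframe.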
