How to use the Yelp API in Python

In last week’s post, I pulled data about local restaurants from Yelp to generate a dataset. I was happy to find that Yelp actually has a very friendly API. This guide will walk you through setting up some boilerplate code that you can then adapt to your specific needs.

Step 1: Obtaining Access to the Yelp API

Before you can use the Yelp API, you need to submit a developer request. This can be done here. I’m not sure what the requirements are, but my guess is they approve almost everyone. After getting access, you will need to get your API keys from the Manage API access section on the site.

Step 2: Getting the rauth library

Yelp’s API uses OAuth authentication for API calls. Unless you want to do a lot of work, I suggest that you use a third-party library to handle the OAuth for you. For this tutorial I’m using rauth, but feel free to use any library of your choice.

You can use easy_install rauth or pip install rauth to download the library.

Step 3: Write the code to query the Yelp API

You’ll first need to figure out what information you actually want to query. The API Documentation gives you all of the different parameters that you can specify and the correct syntax.

For this example, we’re going to be doing some location-based searching for restaurants. If you store each of the search parameters in a dictionary, you can save yourself some formatting. Here’s a method that accepts a latitude and longitude and returns the search parameter dictionary:

def get_search_parameters(lat, lng):
	# See the Yelp API documentation for all available parameters
	params = {}
	params["term"] = "restaurant"
	# "ll" takes a "latitude,longitude" string
	params["ll"] = "{},{}".format(lat, lng)
	# Search radius in meters
	params["radius_filter"] = "2000"
	params["limit"] = "10"

	return params
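
To see what that dictionary turns into, here is a small self-contained sketch. It repeats the function in a compact form (I call the second argument `lng` here, my own naming choice) and uses the standard library's `urllib.parse.urlencode` to show roughly what the query string looks like. The coordinates are just sample values, and the OAuth library will add its own parameters on top of these.

```python
from urllib.parse import urlencode


def get_search_parameters(lat, lng):
    """Build the Yelp search parameters for one latitude/longitude pair."""
    return {
        "term": "restaurant",
        "ll": "{},{}".format(lat, lng),
        "radius_filter": "2000",
        "limit": "10",
    }


params = get_search_parameters(39.98, -82.98)

# urlencode escapes the comma in "ll" and joins the pairs with "&"
query_string = urlencode(params)
print(query_string)
```

Printing the query string like this is a quick way to check your parameters before you spend any API calls on them.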

Next we need to build our actual API call. Using the keys and tokens from the Manage API access page, we’re going to create an OAuth session. Once we have a session, we can make the API call with our search parameters. Finally, we take the response and convert it into a Python dictionary.

import rauth

def get_results(params):

	# Obtain these from Yelp's Manage API access page
	consumer_key = "YOUR_KEY"
	consumer_secret = "YOUR_SECRET"
	token = "YOUR_TOKEN"
	token_secret = "YOUR_TOKEN_SECRET"

	session = rauth.OAuth1Session(
		consumer_key=consumer_key,
		consumer_secret=consumer_secret,
		access_token=token,
		access_token_secret=token_secret)

	response = session.get("http://api.yelp.com/v2/search", params=params)

	# Transforms the JSON API response into a Python dictionary
	data = response.json()
	session.close()

	return data

Now we can put it all together. Since Yelp will only return a max of 40 results at a time, you will likely want to make several API calls if you’re putting together any sort of sizable dataset. Currently, Yelp allows 10,000 API calls per day which should be way more than enough for compiling a dataset! However, when I’m making repeat API calls, I always make sure to rate-limit myself.

Companies with APIs will almost always have mechanisms in place to prevent too many requests from being made at once. Often this is done by IP address. They may only handle X calls in Y time per IP, or X concurrent calls per IP, and so on. If you rate-limit yourself, you increase your chances of always getting a response back.
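
If you want something a little more flexible than a fixed sleep, here is a minimal sketch of a self-imposed rate limiter. The `RateLimiter` class is my own illustration, not part of rauth or the Yelp API, and the one-second interval is an arbitrary choice rather than a documented Yelp limit.

```python
import time


class RateLimiter:
    """Allow at most one call every `min_interval` seconds."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        # clock and sleep are injectable so the class can be tested without waiting
        self._clock = clock
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        """Block until at least `min_interval` has passed since the last call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now


limiter = RateLimiter(min_interval=1.0)
limiter.wait()  # the first call never sleeps
```

Calling `limiter.wait()` right before each request only sleeps for whatever part of the interval is left, so slow requests are not delayed any further.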

import time

def main():
	locations = [(39.98, -82.98), (42.24, -83.61), (41.33, -89.13)]
	api_calls = []
	for lat, lng in locations:
		params = get_search_parameters(lat, lng)
		api_calls.append(get_results(params))
		# Be a good internet citizen and rate-limit yourself
		time.sleep(1.0)

	## Do other processing

if __name__ == "__main__":
	main()

At this point you have a list of dictionaries that represent each of the API calls you made. You can then do whatever additional processing you want to each of those dictionaries to extract the information you are interested in.
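
As a rough sketch of that processing step, the function below flattens a list of responses into simple (name, rating) tuples. The `businesses`, `name` and `rating` keys match what the search responses contain; the sample data is hand-made, and `extract_businesses` is just a name I picked for illustration.

```python
def extract_businesses(api_calls):
    """Flatten a list of search responses into (name, rating) tuples."""
    rows = []
    for response in api_calls:
        # A failed call returns an "error" payload with no "businesses" key
        for business in response.get("businesses", []):
            rows.append((business.get("name"), business.get("rating")))
    return rows


# Tiny hand-made sample shaped like the search responses
sample_calls = [
    {"total": 1, "businesses": [{"name": "Foo Foo Tei", "rating": 4.0}]},
    {"error": {"id": "INVALID_SIGNATURE"}},
]
print(extract_businesses(sample_calls))
```

Using `.get` with a default means one failed call does not crash the whole run.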

When working with a new API, I sometimes find it useful to open an interactive Python session and actually play with the API responses in the console. This helps me understand the structure so I can code the logic to find what I’m looking for.

You can get this complete script here. Every API is different, but Yelp is a friendly introduction to the world of making API calls through Python. With this skill you can construct your own datasets from any of the companies with public APIs.


28 thoughts on “How to use the Yelp API in Python”

  1. Harry

    When attempting this script – Python returns
    File “API1.py”, line 31
    return data
    SyntaxError: ‘return’ outside function

    28 #Transforms the JSON API response into a Python dictionary
    29 data = request.json()
    30 session.close()
    31 return data

    Any thoughts?

    1. Aturen

      You need to indent the lines, like in the code block. Python uses whitespace to mark what’s part of a function definition, much like C uses { }. This error is saying that it encountered a `return` but — to its knowledge — it wasn’t inside of a function at all.
      Tabbing in line 31 (and the lines above it) makes the `return data` line the last line of the `get_results` function definition.
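
To illustrate the indentation point, here is a small sketch that compiles two snippets with the built-in `compile` function. The snippets are simplified stand-ins for the tutorial code, not the real `get_results` function.

```python
# Indented body: `return` belongs to the function, so this compiles.
good_source = "def get_results():\n    data = 1\n    return data\n"
compile(good_source, "<good>", "exec")

# No indentation: `return` sits at module level and is rejected.
bad_source = "def get_results():\n    pass\ndata = 1\nreturn data\n"
try:
    compile(bad_source, "<bad>", "exec")
    outcome = "compiled"
except SyntaxError as error:
    outcome = error.msg
print(outcome)
```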

  2. Nathan Burnham

    What kind of latency are you getting with the api?

    It is taking me something like 25 seconds to authenticate and retrieve the data.
    Does that sound right?

    1. Phillip

      That doesn’t sound too extreme if you’re pulling a lot of data down. If you want to eliminate variables, you can always just use curl with your parameters to see if that’s any faster. You might also try to make a request that will return no data or a request that will intentionally 404 to see where the slowdown is.
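
One way to narrow down where the time goes is to wrap individual steps in a timer. This is a minimal sketch: `timed` is a hypothetical helper, and `fake_request` stands in for whatever function actually makes your API call.

```python
import time


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


# Stand-in for a network call; swap in your own request function
def fake_request():
    time.sleep(0.05)
    return {"businesses": []}


result, elapsed = timed(fake_request)
print("took {:.3f}s".format(elapsed))
```

Timing the session creation, the request and the JSON parsing separately should show which of them is responsible.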

  3. Renato Utsch

    Hello,

    I am trying to make that work with my python script, but all my requests are failing with the following error:
    ———–
    {‘error’: {‘description’: ‘Invalid signature. Expected signature base string: GET&http%3A%2F%2Fapi.yelp.com%2Fv2%2Fsearch&location%3Dlocation%26oauth_consumer_key%3D1sqhFxRdGZS1Ksy_tLcLxg%26oauth_nonce%3D80fec8017c0a0834ca225761c2c9246522b81192%26oauth_signature_method%3DHMAC-SHA1%26oauth_timestamp%3D1410534092%26oauth_token%3D6_ewtaaJU5YWM4MTiYBVUOoqmmKWqrOx%26oauth_version%3D1.0%26term%3Dsearchterm’, ‘id’: ‘INVALID_SIGNATURE’, ‘text’: ‘Signature was invalid’}}
    ————-

    Upon inspecting that more closely, I see that oauth_signature is not present in the request, and it is required by yelp’s API. Why is that?

    The code:
    ———
    params = { }
    params[‘term’] = urllib.parse.quote(query)
    params[‘location’] = urllib.parse.quote(self.city)

    session = rauth.OAuth1Session(consumer_key=self._consumer_key,
    consumer_secret=self._consumer_secret,
    access_token=self._token,
    access_token_secret=self._token_secret)

    response = session.get(“http://api.yelp.com/v2/search”, params=params)
    return response.json()
    ———————

    How can I make the request work?

    1. Phillip

      This is most likely an issue with your keys. It means that the signature created by the OAuth library does not match what Yelp is expecting. Double-check all your keys and make sure that there are no extra special characters. Also compare the URL in the error message to the actual URL you are generating. That may help to point out differences.
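
For the comparison, it can help to rebuild the "signature base string" from the error message yourself. The sketch below follows the general OAuth 1.0 recipe (method, encoded URL, encoded sorted parameters) in simplified form. It leaves out the `oauth_*` parameters that the library adds, so treat it as an aid for spotting encoding differences rather than a full signer. Both function names are my own.

```python
from urllib.parse import quote


def percent_encode(value):
    """Percent-encode everything except letters, digits, '-', '.', '_' and '~'."""
    return quote(str(value), safe="~")


def signature_base_string(method, url, params):
    """Join method, URL and sorted parameters the way OAuth 1.0 signs them."""
    pairs = sorted((percent_encode(k), percent_encode(v)) for k, v in params.items())
    normalized = "&".join("{}={}".format(k, v) for k, v in pairs)
    return "&".join([method.upper(), percent_encode(url), percent_encode(normalized)])


base = signature_base_string(
    "GET",
    "http://api.yelp.com/v2/search",
    {"term": "searchterm", "location": "location"},
)
print(base)
```

If the string you build differs from the one in the error message in anything other than the `oauth_*` values, the parameters are probably being encoded differently than you expect.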

  4. Frank

    Hi Phillip,

    I’m currently using your script to query the Yelp API (thanks), and end up with a block of JSON embedded in a Python list. Like this:

    [{u'region': {u'span': {u'latitude_delta': 0.0215103900000031, u'longitude_delta': 0.024831400000010717}, u'center': {u'latitude': 34.00977745, u'longitude': -117.98871299999999}}, u'total': 876, u'businesses': [{u'is_claimed': True, u'distance': 3009.570851637409, u'mobile_url': u'http://m.yelp.com/biz/foo-foo-tei-hacienda-heights', u'rating_img_url': u'http://s3-media4.fl.yelpcdn.com/assets/2/www/img/c2f3dd9799a5/ico/stars/v1/stars_4.png', u'review_count': 1332, u'name': u'Foo Foo Tei', u'rating': 4.0, u'url': u'http://www.yelp.com/biz/foo-foo-tei-hacienda-heights', u'categories': [[u'Japanese', u'japanese']], u'is_closed': False, u'phone': u'6269376585', u'snippet_text': u"This place lives up to its hype!\n\nIf it's your first time here, you'd be surprised with that many ramen soup choices they offer off from menu. Def will try...", u'image_url': u'http://s3-media4.fl.yelpcdn.com/bphoto/sWEFkZw2u39YFSd4qmhlhg/ms.jpg', u'location': {u'city': u'Hacienda Heights', u'display_address': [u'15018 Clark Ave', u'Hacienda Heights, CA 91745'], u'postal_code': u'91745', u'country_code': u'US', u'address': [u'15018 Clark Ave'], u'state_code': u'CA'}, u'display_phone': u'+1-626-937-6585', u'rating_img_url_large': u'http://s3-media2.fl.yelpcdn.com/assets/2/www/img/ccf2b76faa2c/ico/stars/v1/stars_large_4.png', u'id': u'foo-foo-tei-hacienda-heights', u'snippet_image_url': u'http://s3-media3.fl.yelpcdn.com/photo/anEM2GKE1bl26QD1sq00BA/ms.jpg', u'rating_img_url_small': u'http://s3-media4.fl.yelpcdn.com/assets/2/www/img/f62a5be2f902/ico/stars/v1/stars_small_4.png'}]}]

    I’m rather new to APIs and the Python language but am trying to learn. Would you be able to provide any hints as to how one can instantiate some Python classes, store this raw output into a readable format, and perhaps even read the data into a dataframe of some sort? I’ve been stuck for quite some time now and am reaching out for some guidance if you could spare some.

    Thanks for the help,

    Frank Chen

    1. Phillip

      Well using json.loads you can easily convert JSON to Python objects. Then you traverse the lists and dictionaries the same way you would any ordinary dictionary or list. In this example, you probably would loop through each of the “businesses” and create a new custom Business object for each item in the list.
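
Here is a minimal sketch of that idea. The `Business` class and `parse_businesses` function are hypothetical names, the fields are a small subset picked from the sample output above, and `json.loads` is only needed when you start from a raw JSON string (the dictionary from `request.json()` is already parsed).

```python
import json
from dataclasses import dataclass


@dataclass
class Business:
    """Hypothetical container for the fields we care about."""
    name: str
    rating: float
    review_count: int
    city: str


def parse_businesses(raw_json):
    """Turn a JSON search response into a list of Business objects."""
    response = json.loads(raw_json)
    return [
        Business(
            name=item["name"],
            rating=item["rating"],
            review_count=item["review_count"],
            city=item["location"]["city"],
        )
        for item in response.get("businesses", [])
    ]


# Hand-made JSON string shaped like the sample response
raw = json.dumps({"businesses": [{
    "name": "Foo Foo Tei", "rating": 4.0, "review_count": 1332,
    "location": {"city": "Hacienda Heights"},
}]})
businesses = parse_businesses(raw)
print(businesses[0])
```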

  5. M A

    I’m pretty new to Python, but immersing myself in various aspects. This is going to sound like a silly, dumb question but with those functions written… now what? How do I look at the data that they are returning?

    1. Phillip

      Well the simplest thing you can do is print them to the console. But usually when you use an API it’s because you want to do something more interesting with the data. If you just wanted to see the data, you could use the site directly. As far as what that interesting thing is, well, the sky’s the limit!
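
If a plain `print` is too hard to read, the standard library's `json.dumps` can indent the output. A minimal sketch, using a hand-made dictionary in place of a real API response:

```python
import json

# Hand-made stand-in for one element of the api_calls list
response = {"total": 1, "businesses": [{"name": "Foo Foo Tei", "rating": 4.0}]}

# indent=2 spreads the nesting over lines; sort_keys keeps the output stable
pretty = json.dumps(response, indent=2, sort_keys=True)
print(pretty)
```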

  6. Anushka

    Hi!

    Thanks for your code! I’m trying to use it exactly as given, but it does not return anything.
    What do I have to put in at the end to get the API calls returned as visible data in my python console? (using python3.4 and ipython notebook). I will then write it to a csv file, but for now I just want to see it in the output here. Thanks!

    1. Phillip Johnson Post author

      I removed the code from your comment because API keys should never be shared publicly. However, I was able to confirm that there is an error with your keys. I tried my keys and there was no problem. You might want to try regenerating your keys. Also make sure you are entering the correct key in the correct place in the code. Sorry I can’t offer more specific advice, but this error means there’s an issue with your token.

      {  
         "error":{  
            "id":"INVALID_PARAMETER",
            "field":"oauth_token",
            "text":"One or more parameters are invalid in request"
         }
      }

  7. Allen

    Thank you very much for the tutorial. It is really helpful. My question: what if I want to search all the restaurants in Toronto? Do I need to look up the lat and long every time, or is there an easier way? Thanks!

      1. Allen

        Thank you very much for your prompt reply! I think setting the location parameter to Toronto will work. My concern is that you mentioned the Yelp API will only return 40 results at a time. So if I send the same request again (in my case, requesting the list of restaurants in Toronto), will the API return the same 40 results? Thanks!

        1. Phillip Johnson Post author

          I wouldn’t rely on the API always returning the same exact 40 results. Yelp may do a number of things to control what results you get, such as if the restaurant is open, the rating, number of reviews, etc.
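
Since the results can shift between calls, one option is to merge responses by business `id` so that duplicates collapse into a single entry. This is a minimal sketch with hand-made data; `merge_by_id` is a name I made up.

```python
def merge_by_id(responses):
    """Combine search responses, keeping one entry per business id."""
    merged = {}
    for response in responses:
        for business in response.get("businesses", []):
            # Later responses overwrite earlier ones for the same id
            merged[business["id"]] = business
    return list(merged.values())


first = {"businesses": [{"id": "a", "rating": 4.0}, {"id": "b", "rating": 3.5}]}
second = {"businesses": [{"id": "b", "rating": 4.0}, {"id": "c", "rating": 5.0}]}
combined = merge_by_id([first, second])
print([business["id"] for business in combined])
```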

  8. abel

    Hello,
    I have already downloaded the Yelp data as a JSON file.
    My question is: how do I extract the longitude and latitude data points from the JSON file? I am using Python.
    Would you please advise me?

  9. Jeffrey

    Hi I’m trying to use your code but am getting a TypeError: __init__() got an unexpected keyword argument ‘access_token_secret’ in rauth’s OAuth1Service. I just got a fresh token and token secret and am still getting error. Any help would be greatly appreciated.

    1. Phillip Johnson Post author

      My example uses an OAuth1Session not an OAuth1Service. These are really similarly named, but a service uses URLs to obtain tokens and a session passes in tokens that you already have. Try changing to a session and see if that works. Happy programming!

  10. Marissa

    Thank you for this tutorial, it’s really helpful and worked quite well for me. In Yelp’s documentation they mention you can get up to 1000 results using the limit and offset parameters. I want to get the most restaurants for a specific city (around 800), and I believe you would need to update those parameters by looping through and updating them, however I am having a hard time getting started. Any advice?

    1. Phillip Johnson Post author

      Hi Marissa,

      I would probably do something like this:

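      limit = 100

      def make_api_call(offset):
          # Code here to make API call using limit and offset
          return request.json()

      def load_all_data():
          data = []
          offset = 0
          while True:
              businesses = make_api_call(offset)['businesses']
              # An empty page means there are no more results
              if not businesses:
                  break
              # Add what you want to data
              data.extend(businesses)
              offset = offset + limit
          return data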
      
  11. Blanco

    Hi, thank you for the tutorial. It is really helpful. Your code works correctly for me but returns only one restaurant :(. I changed only locations = [(45.4301928892447, -73.6253511424274)] in your code.
