Introduction to the requests library in Python
Python's requests library provides the tools to send HTTP requests over the Internet. The library allows you to perform requests such as GET and POST, which means that you can:
fetch the HTML of a website
call a third-party API to fetch data
send data to a server for processing
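The examples in this article all use GET. For the POST case in the last item, a minimal sketch might look like the following. The endpoint https://example.com/api/submit and the form fields are hypothetical placeholders, and the request is only prepared here, not sent, so that its contents can be inspected.

```python
import requests

# Hypothetical endpoint and payload, used only for illustration.
url = "https://example.com/api/submit"
payload = {"name": "Alice", "language": "python"}

# Build the POST request without sending it, so we can look inside it.
prepared = requests.Request("POST", url, data=payload).prepare()

print(prepared.method)                   # POST
print(prepared.body)                     # name=Alice&language=python
print(prepared.headers["Content-Type"])  # application/x-www-form-urlencoded

# Actually sending it would look like:
# r = requests.post(url, data=payload, timeout=10)
```

Passing a dictionary as data sends it as form-encoded fields, which is what the printed body and Content-Type header show.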
Fetching the HTML of a website
To fetch the HTML of a website:
import requests

# The URL of the desired site
url = "https://en.wikipedia.org/wiki/Python_(programming_language)"
r = requests.get(url)
print(r.text)
<!DOCTYPE html><html class="client-nojs" lang="en" dir="ltr"><head><meta charset="UTF-8"/><title>Python (programming language) - Wikipedia</title>...
Here, r.text is a string (type str).
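Because r.text is an ordinary string, the usual string tools apply to it. As a rough sketch, the helper names fetch_html and extract_title below are made up for this example, and the timeout of 10 seconds is an arbitrary choice. The sample string is a shortened copy of the HTML shown above, so the demonstration itself does not need a network connection.

```python
import re

import requests


def fetch_html(url):
    """Download a page and return its HTML as a string."""
    r = requests.get(url, timeout=10)  # timeout value is an arbitrary choice
    return r.text


def extract_title(html):
    """Return the contents of the first <title> tag, or None if there is none."""
    match = re.search(r"<title>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else None


# A shortened copy of the HTML from the output above.
sample = (
    '<!DOCTYPE html><html lang="en"><head><meta charset="UTF-8"/>'
    "<title>Python (programming language) - Wikipedia</title></head></html>"
)
print(extract_title(sample))  # Python (programming language) - Wikipedia

# With a live page: extract_title(fetch_html("https://en.wikipedia.org/wiki/Python_(programming_language)"))
```

A regular expression is enough for a quick check like this; for anything more involved, a proper HTML parser is likely to be more robust.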
Calling a third-party API
Let us call a free API hosted by GitHub that tells us how many projects there currently are for a given programming language:
import requests

# The free GitHub API
url = "https://api.github.com/search/repositories?q=javascript"
r = requests.get(url)
response_dictionary = r.json()

print("Status code:", r.status_code)
print("Response:", response_dictionary.keys())
Status code: 200
Response: dict_keys(['total_count', 'incomplete_results', 'items'])
Here, ?q=javascript is the query parameter that we've specified: we are only interested in counting the number of JavaScript projects.
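Instead of writing the query string by hand, the parameter can also be passed through the params argument. The sketch below only prepares the request, without sending it, to show that the resulting URL is the same one used above.

```python
import requests

base_url = "https://api.github.com/search/repositories"
params = {"q": "javascript"}

# Prepare the request without sending it, to see the final URL.
prepared = requests.Request("GET", base_url, params=params).prepare()
print(prepared.url)  # https://api.github.com/search/repositories?q=javascript

# Sending it would look like:
# r = requests.get(base_url, params=params, timeout=10)
```

This tends to be easier to read when there are several parameters, and the library takes care of URL-encoding the values.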
The GitHub API returns JSON, which we can extract using r.json(). In Python, a JSON object is parsed into a dictionary, so you can use all the usual properties and methods of dictionaries to interact with the data.
The status_code tells you whether or not your request was successful. A status code of 200 means that the request was successful and that no problem was encountered. Unsuccessful requests result in status codes starting with 4 for client errors (e.g. 400, 404) or 5 for server errors (e.g. 500).
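As an illustration of how status codes group by their first digit, here is a small sketch. The function name describe_status is made up for this example; requests.codes is the library's own lookup of named status codes.

```python
import requests


def describe_status(status_code):
    """Map an HTTP status code to a broad category."""
    if 200 <= status_code < 300:
        return "success"
    if 300 <= status_code < 400:
        return "redirection"
    if 400 <= status_code < 500:
        return "client error"
    if 500 <= status_code < 600:
        return "server error"
    return "other"


print(describe_status(200))  # success
print(describe_status(404))  # client error
print(describe_status(503))  # server error

# requests also exposes named constants for common codes.
print(requests.codes.ok)         # 200
print(requests.codes.not_found)  # 404
```

A response object also has an ok attribute, which is True when the status code is below 400, so a quick check can be written as if r.ok: ... without comparing numbers directly.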
In the final line, we've printed out all the keys in the response. The keys that you see here are specific to this GitHub API, so you won't see the same keys when you call other third-party APIs. In this case, we are just interested in extracting the value for the total_count key, which tells us how many projects there are in total for JavaScript:
response_dictionary["total_count"]
671940
So, there are over 670,000 JavaScript projects - that's a lot!
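The steps above can be wrapped into a small reusable function. This is only a sketch: the name count_projects and the 10-second timeout are choices made for this example, and the call to raise_for_status() (covered in the next section) is added so that a failed request does not silently produce a KeyError.

```python
import requests

SEARCH_URL = "https://api.github.com/search/repositories"


def count_projects(language):
    """Return the total_count reported by the GitHub search API for a query."""
    r = requests.get(SEARCH_URL, params={"q": language}, timeout=10)
    r.raise_for_status()  # stop early on a 4xx/5xx response
    return r.json()["total_count"]


# Example usage (makes network calls, and the numbers change over time):
# for language in ["javascript", "python"]:
#     print(language, count_projects(language))
```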
Raising an error
By default, no error will be raised when the status code of the response is 4xx or 5xx:
import requests

# Some invalid URL
url = "https://en.wikipedia.org/wiki/Python_(programming_langcccuafawefwaege)"
r = requests.get(url)
print(r.status_code)
404
If we want to raise an error instead, call r.raise_for_status() like so:
import requests

# Some invalid URL
url = "https://en.wikipedia.org/wiki/Python_(programming_langcccuage)"
r = requests.get(url)
r.raise_for_status()
print(r.status_code)
HTTPError: 404 Client Error: Not Found for url: https://en.wikipedia.org/wiki/Python_(programming_langcccuage)
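Once an error is raised, it can be caught like any other Python exception. The following is a minimal sketch of one way to do that; the helper name status_or_error is made up for this example.

```python
import requests


def status_or_error(response):
    """Return the status code, or an error message for a 4xx/5xx response."""
    try:
        response.raise_for_status()
    except requests.exceptions.HTTPError as err:
        return f"Request failed: {err}"
    return response.status_code


# Example usage (makes a network call):
# r = requests.get("https://en.wikipedia.org/wiki/Python_(programming_language)", timeout=10)
# print(status_or_error(r))
```

Catching requests.exceptions.HTTPError specifically, rather than a bare except, keeps unrelated problems such as typos in the code from being hidden.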