Requests in Python: Using Python to Request Web Pages

Requests is a popular Python module that lets you send HTTP/1.1 requests to web servers from your Python code.

Python 2.7 and 3.5+ were both officially supported when this post was written; newer releases of Requests support Python 3 only. Keep-Alive, connection pooling, sessions with persistent cookies, and browser-style SSL verification make it the preferred solution for many developers.

In this post, we’ll go over some of these features in greater detail and show you how to get started with the Python Requests module to construct web requests.

Installation of Requests in Python

Installing Requests is simple and straightforward. There are various ways to install a module in Python; in this tutorial, we will install it using pip.

Open your terminal or command prompt (if you’re using Windows) and enter the following command.

pip install requests
# Alternatively, if the first command does not work, use:
pip3 install requests

The requests module should now be installed on your device.
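
One quick way to confirm the installation is to import the module and print its version. This is a minimal check, assuming the pip command above ran against the same Python interpreter you use to run the script:

```python
import requests

# If the import succeeds, the module is installed for this interpreter.
installed_version = requests.__version__
print(installed_version)
```

If the import fails with ModuleNotFoundError, pip most likely installed the module for a different Python installation.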

Python Requests

To understand how the requests module works, you must first understand what happens when you browse the web and how the browser gets the data you were hoping to see.

When you click a link, your browser makes an HTTP (Hypertext Transfer Protocol) request to the server that hosts the requested page.

When the server receives the request, it returns the requested content to the browser.

The two most commonly used HTTP request methods are GET and POST.

Importing Requests:

import requests
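
To make the request side of this exchange concrete, the sketch below builds a GET request without sending it and prints what Requests would send. It uses a Session's prepare_request() so that the library's default headers are included; the httpbin.org URL is only an illustrative target.

```python
import requests

# Build the request object but do not send it, so no network access is needed.
session = requests.Session()
request = requests.Request("GET", "http://httpbin.org/get")
prepared = session.prepare_request(request)

print(prepared.method)  # the HTTP method
print(prepared.url)     # the URL the request would be sent to
for name, value in prepared.headers.items():
    print(f"{name}: {value}")  # default headers such as User-Agent and Accept
```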

What is a GET Request?

This method indicates that we are requesting the contents of our chosen URL from the server. So, let’s suppose we want to retrieve Amazon’s homepage using an HTTP request.

Then enter the code below:

import requests
my_req = requests.get("http://amazon.com")

What does this single line of code accomplish?

It uses the get() method to send an HTTP GET request to Amazon’s homepage, with the URL as the argument. The Response object is then saved in our ‘my_req’ variable.

The Response object parses the data returned by the server and stores it in the appropriate attributes, such as status_code, headers, text, and encoding.

Example:

import requests
my_req = requests.get("http://amazon.com")
# status_code holds the HTTP status code of the response.
# A successful request returns 200.
print(my_req.status_code)

# headers is a dictionary-like object containing the
# response headers as key-value pairs.
print(my_req.headers)

# text holds the body of the server's response (here, the page's HTML source).
print(my_req.text)

# We can also view or change the encoding that Requests uses
# to decode the response content.
print(my_req.encoding)
my_req.encoding = 'utf-8'

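Output (Amazon rejected this automated request, so the status code is 503 rather than 200 and the body is an error page):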

503
{'Server': 'Server', 'Content-Type': 'text/html', 'Content-Length': '1203', 'x-amz-rid': 'T4PK7QWJPVK62Z0FZM5M', 'Last-Modified': 'Fri, 03 Dec 2021 19:33:54 GMT', 'ETag': '"a6f-5d242fcc50c80-gzip"', 'Accept-Ranges': 'bytes', 'Content-Encoding': 'gzip', 'Strict-Transport-Security': 'max-age=47474747; includeSubDomains; preload', 'Permissions-Policy': 'interest-cohort=()', 'Date': 'Wed, 15 Dec 2021 02:21:45 GMT', 'Connection': 'keep-alive', 'Vary': 'Accept-Encoding'}
<!--
        To discuss automated access to Amazon data please contact [email protected].
        For information about migrating to our APIs refer to our Marketplace APIs at https://developer.amazonservices.com/ref=rm_5_sv, or our Product Advertising API at https://affiliate-program.amazon.com/gp/advertising/api/detail/main.html/ref=rm_5_ac for advertising use cases.
-->
<!doctype html>
<html>
<head>
  <meta charset="utf-8">
  <meta http-equiv="x-ua-compatible" content="ie=edge">
  <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
  <title>Sorry! Something went wrong!</title>
  <style>
  html, body {
    padding: 0;
    margin: 0
  }

  img {
    border: 0
  }

  #a {
    background: #232f3e;
    padding: 11px 11px 11px 192px
  }

  #b {
    position: absolute;
    left: 22px;
    top: 12px
  }

  #c {
    position: relative;
    max-width: 800px;
    padding: 0 40px 0 0
  }

  #e, #f {
    height: 35px;
    border: 0;
    font-size: 1em
  }

  #e {
    width: 100%;
    margin: 0;
    padding: 0 10px;
    border-radius: 4px 0 0 4px
  }

  #f {
    cursor: pointer;
    background: #febd69;
    font-weight: bold;
    border-radius: 0 4px 4px 0;
    -webkit-appearance: none;
    position: absolute;
    top: 0;
    right: 0;
    padding: 0 12px
  }

  @media (max-width: 500px) {
    #a {
      padding: 55px 10px 10px
    }

    #b {
      left: 6px
    }
  }

  #g {
    text-align: center;
    margin: 30px 0
  }

  #g img {
    max-width: 90%
  }

  #d {
    display: none
  }

  #d[src] {
    display: inline
  }
  </style>
</head>
<body>
    <a href="/ref=cs_503_logo"><img id="b" src="https://images-na.ssl-images-amazon.com/images/G/01/error/logo._TTD_.png" alt="Amazon.com"></a>
    <form id="a" accept-charset="utf-8" action="/s" method="GET" role="search">
        <div id="c">
            <input id="e" name="field-keywords" placeholder="Search">
            <input name="ref" type="hidden" value="cs_503_search">
            <input id="f" type="submit" value="Go">
        </div>
    </form>
<div id="g">
  <div><a href="/ref=cs_503_link"><img src="https://images-na.ssl-images-amazon.com/images/G/01/error/500_503.png"
                                        alt="Sorry! Something went wrong on our end. Please go back and try again or go to Amazon's home page."></a>
  </div>
  <a href="/dogsofamazon/ref=cs_503_d" target="_blank" rel="noopener noreferrer"><img id="d" alt="Dogs of Amazon"></a>
  <script>document.getElementById("d").src = "https://images-na.ssl-images-amazon.com/images/G/01/error/" + (Math.floor(Math.random() * 43) + 1) + "._TTD_.jpg";</script>
</div>
</body>
</html>

ISO-8859-1
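
Because a request can come back with an error status, as the 503 above shows, it is usually worth checking the status before using the body. The following is a minimal sketch, not the only way to do it: summarize() and fetch() are hypothetical helper names, and the 10-second timeout is an arbitrary choice.

```python
import requests


def summarize(response):
    """Return a short description of a Response based on its status code."""
    # response.ok is True when the status code is below 400.
    if response.ok:
        return f"{response.status_code}: success, {len(response.text)} characters"
    return f"{response.status_code}: request was not successful"


def fetch(url, timeout=10):
    """Send a GET request and summarize it, or report a connection problem."""
    try:
        response = requests.get(url, timeout=timeout)
    except requests.exceptions.RequestException as exc:
        # Covers DNS failures, refused connections, timeouts, and similar errors.
        return f"request failed: {type(exc).__name__}"
    return summarize(response)


# Example call (needs network access):
#   print(fetch("http://amazon.com"))
```

Calling my_req.raise_for_status() is another option when you would rather have an exception raised for 4xx and 5xx responses.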

Passing Arguments with the GET Method

Often, a bare GET request to a URL does not return all of the information we require, so we must send additional parameters with the request.

Parameters are key-value pairs, usually supplied as a dictionary (a list of tuples also works). We can send them using the params parameter of the get() method.

Example:

# Import the requests module
import requests
# Store the query parameters in a dictionary.
payload_val = {'key1': 'value1', 'key2': 'value2'}
# Send a GET request to the test URL, passing the dictionary via params.
my_req = requests.get('http://httpbin.org/get', params=payload_val)
# Print the body of the response
print(my_req.text)

Output:

{
  "args": {
    "key1": "value1", 
    "key2": "value2"
  }, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate", 
    "Host": "httpbin.org", 
    "User-Agent": "python-requests/2.23.0", 
    "X-Amzn-Trace-Id": "Root=1-61b95330-4018303804e1772e3f298164"
  }, 
  "origin": "34.86.125.144", 
  "url": "http://httpbin.org/get?key1=value1&key2=value2"
}
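
To see how the params argument ends up in the URL, as in the "url" field of the output above, the request can be prepared without being sent. This sketch also passes a list of tuples, which is one way to repeat a key; the ‘tag’ key is made up for illustration.

```python
import requests

# A dictionary of parameters is encoded into the query string.
from_dict = requests.Request(
    "GET", "http://httpbin.org/get", params={"key1": "value1", "key2": "value2"}
).prepare()
print(from_dict.url)  # http://httpbin.org/get?key1=value1&key2=value2

# A list of (key, value) tuples allows the same key to appear more than once.
from_list = requests.Request(
    "GET", "http://httpbin.org/get", params=[("tag", "python"), ("tag", "http")]
).prepare()
print(from_list.url)  # http://httpbin.org/get?tag=python&tag=http
```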

What is a POST Request?

In contrast to GET requests, an HTTP POST request usually carries a payload. Instead of retrieving data, this method is used to send data to a server. We can send a POST request using the post() method of the requests module.

Example:

# Import the requests module
import requests
# Store the payload to send in a dictionary.
payload_val = {'key_1': 'value_1', 'key_2': 'value_2'}
# Pass the URL and the payload dictionary to the post() method
# and store the response in another variable.
my_req = requests.post("https://httpbin.org/post", data=payload_val)
# Print the body of the response
print(my_req.text)

Output:

{
"args": {}, 
"data": "", 
"files": {}, 
"form": {
"key_1": "value_1", 
"key_2": "value_2"
}, 
"headers": {
"Accept": "*/*", 
"Accept-Encoding": "gzip, deflate", 
"Content-Length": "27", 
"Content-Type": "application/x-www-form-urlencoded", 
"Host": "httpbin.org", 
"User-Agent": "python-requests/2.23.0", 
"X-Amzn-Trace-Id": "Root=1-61cf20c7-4887ec7502308d5a0671e1ae"
}, 
"json": null, 
"origin": "35.245.205.160", 
"url": "https://httpbin.org/post"
}
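
The "form" field and the Content-Type header in the output above come from the data argument, which sends the payload form-encoded. Requests also accepts a json argument that sends the same dictionary as a JSON body instead. The sketch below prepares both variants without sending them so the difference is visible:

```python
import json
import requests

payload_val = {'key_1': 'value_1', 'key_2': 'value_2'}
url = "https://httpbin.org/post"

# data= encodes the dictionary as an HTML form body.
form_req = requests.Request("POST", url, data=payload_val).prepare()
print(form_req.headers["Content-Type"])  # application/x-www-form-urlencoded
print(form_req.body)                     # key_1=value_1&key_2=value_2

# json= serializes the dictionary to JSON and sets the matching Content-Type.
json_req = requests.Request("POST", url, json=payload_val).prepare()
print(json_req.headers["Content-Type"])  # application/json
print(json.loads(json_req.body))         # parses back to the original dictionary
```

Which one to use depends on what the server expects: HTML form handlers typically read form-encoded bodies, while many web APIs expect JSON.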

Python Advanced Request Features

GET and POST are the most fundamental and important HTTP methods. However, the requests module also supports other methods such as PUT, PATCH, and DELETE.
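
These methods follow the same pattern as get() and post(). As a rough sketch, the loop below prepares one request per method without sending it; https://httpbin.org/anything is assumed here as a test endpoint, and the payload is made up.

```python
import requests

url = "https://httpbin.org/anything"  # assumed test endpoint
payload = {"name": "example"}         # illustrative payload

prepared_methods = []
for method in ("PUT", "PATCH", "DELETE"):
    # Prepare the request only; nothing is sent over the network.
    prepared = requests.Request(method, url, json=payload).prepare()
    prepared_methods.append(prepared.method)
    print(prepared.method, prepared.url)

# To actually send them, use the matching helpers, for example:
#   requests.put(url, json=payload)
#   requests.patch(url, json=payload)
#   requests.delete(url)
```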

The main reason the ‘requests’ module is so popular among developers is its advanced features, such as:

Session objects: A Session persists cookies across multiple requests and reuses the underlying connection, which speeds up repeated requests to the same host.

SOCKS proxies: Requests can route traffic through a SOCKS proxy once an additional dependency (‘requests[socks]’) is installed. This can help when sending many requests, especially if the server rate-limits your IP.

SSL verification: Requests verifies SSL certificates for HTTPS URLs by default, which is equivalent to passing “verify=True” to the get() method. If the website’s certificate cannot be verified, the script raises an error.
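
As a rough illustration of the first and last points, the sketch below creates a Session, stores a cookie and a custom header on it, and prepares a request to show that both are attached automatically. The cookie name, header name, and URL are made up for the example, and nothing is sent over the network.

```python
import requests

session = requests.Session()

# Certificate verification is enabled by default on every session.
print(session.verify)  # True

# Values stored on the session are reused for every request made through it.
session.cookies.set("sessionid", "abc123")           # hypothetical cookie
session.headers.update({"X-Example": "demo-value"})  # hypothetical header

prepared = session.prepare_request(
    requests.Request("GET", "https://httpbin.org/cookies")
)
print(prepared.headers.get("Cookie"))     # sessionid=abc123
print(prepared.headers.get("X-Example"))  # demo-value

# session.get(...) would send the request over a pooled, reusable connection.
session.close()
```

In real code, a Session is often used as a context manager (with requests.Session() as session:) so that its connections are closed automatically.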