Session 12: HTTP protocol

Juan Gonzalez-Gomez edited this page Mar 8, 2020 · 43 revisions

  • Time: 2h
  • Date: Tuesday, March-10th-2020
  • Goals:
    • Learn about the HTTP protocol
    • Write our first web server using sockets

Introduction to the HTTP protocol

  • The HTTP protocol is the language spoken between a browser (client) and a web server
  • This is our general scenario, in which one client communicates with one server. As we already know, there are two kinds of sockets: one used only for listening for new connections on the server (red dot), and others for exchanging data between the client and the server (blue dots)

Requesting a web page

Let's understand what is happening when a browser connects to a web server for viewing a web page. This is the initial scenario:

The client is the browser running on our device (computer, mobile, tablet...). The server is running on another computer on the internet, waiting for clients to connect

Step 1: Connection establishment

When we type a URL in the browser, we are requesting a web page from the server. The client creates a socket and establishes a connection with the server. The server creates a new socket (clientsocket) for exchanging data with the client (in both directions). The original socket continues listening for new connections

Now the client and server can communicate by means of the "blue" sockets. When they write to the sockets, the data is sent. When they read from them, the data is received. There is a bidirectional communication channel established
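
The bidirectional channel can be tried out without a network. This is a minimal sketch, assuming that two already-connected sockets created with socket.socketpair() are an acceptable stand-in for the two "blue" sockets. The variable names are only illustrative:

```python
import socket

# socketpair() returns two sockets that are already connected to each other.
# Here they play the role of the client socket and the server's clientsocket.
client_side, server_side = socket.socketpair()

# The client writes and the server reads
client_side.sendall(b"ping")
received_by_server = server_side.recv(1024)

# The server writes and the client reads: same channel, opposite direction
server_side.sendall(b"pong")
received_by_client = client_side.recv(1024)

client_side.close()
server_side.close()

print(received_by_server, received_by_client)
```

In the real scenario the two ends live in different programs (browser and server), but reading and writing work the same way.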

Step 2: The client sends a request message for a web page

The client always takes the initiative: it sends a request message to obtain the web page that the user wants to see

Step 3: The server reads the page from the disk

The server receives the request message and reads the html file from the hard disk
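
Reading the page from the disk could look like the sketch below. The function name read_page is hypothetical, and it is assumed that the file is stored as UTF-8 text:

```python
from pathlib import Path


def read_page(filename):
    """Return the contents of an HTML file as a string."""
    # Assumption: the file is encoded in UTF-8
    return Path(filename).read_text(encoding="utf-8")
```

For example, read_page("index.html") would return the text of a file called index.html located in the server's working directory (that filename is just an example).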

Step 4: The server sends a response message

The server builds a response message, composed of different fields. The HTML contents are located at the end of the message

Step 5: The browser renders the page on the screen

The client receives the HTML content and shows it on the screen

HTTP messages

There are two types of messages in HTTP: request and response. They both have the same format: they consist of lines in plain text (strings), each one terminated by the two special characters '\r' and '\n' (in that order)

The lines are divided into two parts: the header and the body. A blank line separates both parts
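
As a rough illustration of this format, the sketch below splits a message into its header lines and its body. The function name split_message and the example message are assumptions made for the illustration:

```python
def split_message(msg):
    """Split an HTTP message into (header_lines, body)."""
    # The blank line shows up as two consecutive line endings
    head, _, body = msg.partition("\r\n\r\n")
    return head.split("\r\n"), body


example = "GET / HTTP/1.1\r\nHost: 127.0.0.1:8080\r\n\r\n"
lines, body = split_message(example)
print(lines)
print(repr(body))
```

In this example the body is an empty string, because nothing follows the blank line.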

Request messages

This is the format of the Request messages

The request line is the most important part. This is where the client tells the server which service it needs. It consists of three parts separated by single spaces:

  • Method: Command name. The three basic ones are GET, POST and HEAD (HTTP defines others as well, such as PUT and DELETE)
    • GET: Requests an object from the server. The client wants the server to send it an object. The object id is given in the Path argument
    • POST: The client wants to send data to the server. The data is placed in the message body
    • HEAD: Similar to GET, but only the object's headers are requested. The client uses it to find out whether the object has been modified without having to transfer the whole object
  • Path: The name of the object that the client wants to get from the server, or the object which will receive the data the client is sending
  • Version: The HTTP version used. The syntax is HTTP/x.y, where x and y are integer numbers

This is an example of a request line:

GET /directory/other/file.html HTTP/1.0
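
A request line like the one above can be broken into its three parts with a simple split. This is only a sketch (the function name parse_request_line is hypothetical), and it assumes a well-formed line with exactly two spaces:

```python
def parse_request_line(line):
    """Split a request line into (method, path, version)."""
    # strip() removes the line ending, if there is one
    method, path, version = line.strip().split(" ")
    return method, path, version


print(parse_request_line("GET /directory/other/file.html HTTP/1.0"))
```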

And this is an example of a real message:

In this example, there is no body (it is empty)

Response messages

This is the format of the response message. It is the same as for the request message

The status line consists of three parts separated by spaces

  • Version: HTTP version. The syntax is: HTTP/x.y
  • Status code: A number that indicates what happened with the request
    • 200 --> OK
    • 404 --> Not Found
    • 304 --> Not Modified
  • Status: Status information, in text format (readable)

Example of a status line:

HTTP/1.0 200 OK
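
A status line can be built from its three parts. In the sketch below the readable text is taken from Python's standard http.HTTPStatus enumeration. The function name build_status_line and the default version are assumptions for the example:

```python
from http import HTTPStatus


def build_status_line(code, version="HTTP/1.0"):
    """Build a status line such as 'HTTP/1.0 200 OK' (without line ending)."""
    # HTTPStatus(code).phrase is the standard readable text for the code
    return "{} {} {}".format(version, code, HTTPStatus(code).phrase)


for code in (200, 404, 304):
    print(build_status_line(code))
```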

This is an example of a response message:

Creating our first HTTP server

Let's create our first HTTP server, step by step, learning while doing

Starting point: The echo server

We start from a simple server, from the previous week, that just receives the request message and prints it on the console: the echo server. It does not generate a response yet

Create the Session 12 folder and the new python file echo-server.py. Copy & paste the following code

import socket
import termcolor


# -- Server network parameters
IP = "127.0.0.1"
PORT = 8080


def process_client(s):
    # -- Receive the request message
    req_raw = s.recv(2000)
    req = req_raw.decode()

    print("Message FROM CLIENT: ")
    termcolor.cprint(req, "green")


# -------------- MAIN PROGRAM
# ------ Configure the server
# -- Listening socket
ls = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# -- Optional: This is for avoiding the problem of Port already in use
ls.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)

# -- Set up the socket's IP and PORT
ls.bind((IP, PORT))

# -- Become a listening socket
ls.listen()

print("SEQ Server configured!")

# --- MAIN LOOP
while True:
    print("Waiting for clients....")
    try:
        (cs, client_ip_port) = ls.accept()
    except KeyboardInterrupt:
        print("Server Stopped!")
        ls.close()
        exit()
    else:

        # Service the client
        process_client(cs)

        # -- Close the socket
        cs.close()

First, let's check that our server is working fine. From the Linux console we send a message to the server using the printf and nc commands:

printf "Hello!" | nc 127.0.0.1 8080

We should see this message on the server's console, in green color
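
If nc is not available, the same test can be done from Python. This is a minimal sketch of a client that connects, sends one message and closes. The function name send_message is hypothetical:

```python
import socket


def send_message(ip, port, text):
    """Connect to the server, send the text and close the connection."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.connect((ip, port))
        # sendall() keeps writing until the whole message has been sent
        s.sendall(text.encode())
```

With the echo server running, calling send_message("127.0.0.1", 8080, "Hello!") should make the same green message appear on the server's console.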

Reading the browser's request message

Internet browsers (like Firefox or Chrome) speak the HTTP protocol. It means that they send a request message with the format we have already seen. Let's check it

Open a new tab in your browser and type this URL:

http://127.0.0.1:8080/

This is the URL of the main page of our server:

  • "http://": It means that we want to use the HTTP protocol
  • 127.0.0.1: Server's IP (in this case, the server is on our local machine)
  • :8080: The server's port. It is separated from the IP by the character :
  • /: This slash indicates that we want to access the server's main page
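
The same decomposition can be checked with Python's standard urllib.parse module. This short example is only an illustration and is not needed by the server:

```python
from urllib.parse import urlsplit

parts = urlsplit("http://127.0.0.1:8080/")

print(parts.scheme)    # http
print(parts.hostname)  # 127.0.0.1
print(parts.port)      # 8080
print(parts.path)      # /
```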

In the browser we will see something like this:

As our server does NOT speak HTTP yet, the browser connects to it but never gets a valid response. An error message is shown

But... our server has received the request messages from the browser. If we have a look at the server's console, we will see something like this:

Notice that many request messages appear (all the same). This is because we have not generated a response to the client's request messages. The browser re-sends the request message many times, until there is a timeout and the browser shows an error message

This is the request message received from the browser:

GET / HTTP/1.1
Host: 192.168.124.41:8089
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:65.0) Gecko/20100101 Firefox/65.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Upgrade-Insecure-Requests: 1
Cache-Control: max-age=0

Sending a simple response message

Our response message should have the following format:

  • Status line. We will inform the browser that everything went well. The typical status line is like this:
HTTP/1.1 200 OK\r\n
  • The header should contain at least two elements:
    • Content-Type: It indicates the type of content returned by the server. It will typically be text/html (but it can also be image/png when sending back an image in PNG format)
    • Content-Length: It indicates the total length, in bytes, of the information sent in the body of the response
  • The body with the contents we are sending to the browser
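
One detail about Content-Length is worth a small experiment: the value is the number of bytes of the encoded body, which can differ from the number of characters. The sketch below assumes UTF-8 encoding. The helper name content_length is hypothetical:

```python
def content_length(contents):
    """Return the value for the Content-Length header of a text body."""
    # The length is measured on the encoded bytes, not on the characters
    return len(contents.encode("utf-8"))


text = "¡Hola!"
print(len(text), content_length(text))
```

For plain ASCII text both numbers are the same. Here they differ because the character ¡ takes two bytes in UTF-8.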

In our server we will generate a simple response, whose content is the string: "Hello from my first server!"

import socket
import termcolor

IP = "192.168.124.41"
PORT = 8090
MAX_OPEN_REQUESTS = 5


def process_client(cs):
    """Process the client request.
    Parameters:  cs: socket for communicating with the client"""

    # Read client message. Decode it as a string
    msg = cs.recv(2048).decode("utf-8")

    # Print the received message, for debugging
    print()
    print("Request message: ")
    termcolor.cprint(msg, 'green')

    # Build the HTTP response message. It has the following lines
    # Status line
    # header
    # blank line
    # Body (content to send)

    contents = "Hello from my first server!"

    # -- Everything is OK
    status_line = "HTTP/1.1 200 OK\r\n"

    # -- Build the header
    header = "Content-Type: text/plain\r\n"
    header += "Content-Length: {}\r\n".format(len(str.encode(contents)))

    # -- Build the message by joining together all the parts
    response_msg = str.encode(status_line + header + "\r\n" + contents)
    # sendall() keeps sending until the whole message has been written
    cs.sendall(response_msg)

    # Close the socket
    cs.close()


# MAIN PROGRAM

# create an INET, STREAMing socket
serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Bind the socket to the IP and PORT
serversocket.bind((IP, PORT))

# Configure the server sockets
# MAX_OPEN_REQUESTS connect requests before refusing outside connections
serversocket.listen(MAX_OPEN_REQUESTS)

print("Socket ready: {}".format(serversocket))

while True:
    # accept connections from outside
    # The server is waiting for connections
    print("Waiting for connections at {}, {} ".format(IP, PORT))
    (clientsocket, address) = serversocket.accept()

    # Connection received. A new socket is returned for communicating with the client
    print("Attending connections from client: {}".format(address))

    # Service the client
    process_client(clientsocket)

Now we can see the answer in the browser! Our first mini-web server is working!!! :-)

Let's analyze the information we have in our console. We can see that we have received two requests. The first request message is:

GET / HTTP/1.1
Host: 192.168.124.41:8090
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:65.0) Gecko/20100101 Firefox/65.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Upgrade-Insecure-Requests: 1

The first line is the request line, the most important part. We can ignore the rest of the message. Our request line is this one:

GET / HTTP/1.1

It has three parts:

  • The method: The first word is called the method. It indicates the operation that the client needs. In this case it is a GET method. It means that the client wants to have access to some resource
  • The resource: The second word is the resource. The meaning of the "/" resource is: "I want to have access to your main page"
  • The HTTP version that is being used

In our case, the browser wants to get our main page with the first request

The second request message is this one:

GET /favicon.ico HTTP/1.1
Host: 192.168.124.41:8090
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:65.0) Gecko/20100101 Firefox/65.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive

Let's focus only on the request line:

GET /favicon.ico HTTP/1.1

The browser is asking for the resource /favicon.ico. The favicon is a small image file that stores the icon of the web page you are accessing. We are ignoring this request
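
The server in this session does not look at the requested resource. As a sketch of how it could, the functions below extract the resource from the whole request message and choose a status line. Both function names are hypothetical, and answering 404 to everything except "/" is only one possible policy, not something this session's server does:

```python
def requested_resource(msg):
    """Return the resource (path) found in the request line of a message."""
    request_line = msg.split("\r\n")[0]
    method, path, version = request_line.split(" ")
    return path


def choose_status_line(path):
    """Return a status line depending on the requested resource."""
    if path == "/":
        return "HTTP/1.1 200 OK\r\n"
    # Assumed policy: any other resource (e.g. /favicon.ico) is not available
    return "HTTP/1.1 404 Not Found\r\n"


msg = "GET /favicon.ico HTTP/1.1\r\nHost: 192.168.124.41:8090\r\n\r\n"
print(requested_resource(msg))
print(repr(choose_status_line(requested_resource(msg))))
```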

Response with HTML contents

Let's respond with our first web page written in HTML. We know nothing about HTML yet. It is the language used for creating web pages, and it describes the structure of the document

In our server we are changing the contents. Instead of responding with a plain string, we will send a message in HTML. It is important to change the Content-Type header from text/plain to text/html to indicate that we are sending HTML code instead of plain text

import socket
import termcolor

IP = "192.168.124.41"
PORT = 8080
MAX_OPEN_REQUESTS = 5


def process_client(cs):
    """Process the client request.
    Parameters:  cs: socket for communicating with the client"""

    # Read client message. Decode it as a string
    msg = cs.recv(2048).decode("utf-8")

    # Print the received message, for debugging
    print()
    print("Request message: ")
    termcolor.cprint(msg, 'green')

    # Build the HTTP response message. It has the following lines
    # Status line
    # header
    # blank line
    # Body (content to send)

    # These new contents are written in HTML
    contents = """
    <!DOCTYPE html>
    <html lang="en" dir="ltr">
      <head>
        <meta charset="utf-8">
        <title>Green server</title>
      </head>
      <body style="background-color: lightgreen;">
        <h1>GREEN SERVER</h1>
        <p>I am the Green Server! :-)</p>
      </body>
    </html>
    """

    # -- Everything is OK
    status_line = "HTTP/1.1 200 OK\r\n"

    # -- Build the header
    header = "Content-Type: text/html\r\n"
    header += "Content-Length: {}\r\n".format(len(str.encode(contents)))

    # -- Build the message by joining together all the parts
    response_msg = str.encode(status_line + header + "\r\n" + contents)
    # sendall() keeps sending until the whole message has been written
    cs.sendall(response_msg)

    # Close the socket
    cs.close()


# MAIN PROGRAM

# create an INET, STREAMing socket
serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Bind the socket to the IP and PORT
serversocket.bind((IP, PORT))

# Configure the server sockets
# MAX_OPEN_REQUESTS connect requests before refusing outside connections
serversocket.listen(MAX_OPEN_REQUESTS)

print("Socket ready: {}".format(serversocket))

while True:
    # accept connections from outside
    # The server is waiting for connections
    print("Waiting for connections at {}, {} ".format(IP, PORT))
    (clientsocket, address) = serversocket.accept()

    # Connection received. A new socket is returned for communicating with the client
    print("Attending connections from client: {}".format(address))

    # Service the client
    process_client(clientsocket)

Now we will see a different page in the browser:

HTML

HTML is a special language used for defining the structure and the contents of web pages. It consists of text inside tags. Most elements have an opening tag and a closing tag. This is the HTML code for the green server we used in the previous example (green-server.html)

<!DOCTYPE html>
<html lang="en" dir="ltr">
  <head>
    <meta charset="utf-8">
    <title>Green server</title>
  </head>
  <body style="background-color: lightgreen;">
    <h1>GREEN SERVER</h1>
    <p>I am the Green Server! :-)</p>
  </body>
</html>

  • HTML documents should always start with the special tag: <!DOCTYPE html>
  • The rest of the HTML code is inside the <html> and </html> tags
  • Every HTML document consists of two parts: the head and the body
  • The head contains information about the document for the browser
  • The actual content is located in the body
  • In this example there are two elements inside the body:
    • The heading: GREEN SERVER. It is shown in a bigger font
    • A paragraph: "I am the Green Server! :-)"
  • The background color of the elements in the body is set inside the style attribute
  • You can learn more about HTML by following these tutorials from W3Schools
  • You can also learn more HTML in these notes that I prepared for the CSAAI subject (in Spanish)
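
To see the tag structure of the green server page from Python, the standard html.parser module can list the tags it finds. This is just an exploratory sketch. The class name TagLister is hypothetical:

```python
from html.parser import HTMLParser


class TagLister(HTMLParser):
    """Collect the names of the opening and closing tags of a document."""

    def __init__(self):
        super().__init__()
        self.opened = []
        self.closed = []

    def handle_starttag(self, tag, attrs):
        self.opened.append(tag)

    def handle_endtag(self, tag):
        self.closed.append(tag)


page = """<!DOCTYPE html>
<html lang="en" dir="ltr">
  <head>
    <meta charset="utf-8">
    <title>Green server</title>
  </head>
  <body style="background-color: lightgreen;">
    <h1>GREEN SERVER</h1>
    <p>I am the Green Server! :-)</p>
  </body>
</html>"""

lister = TagLister()
lister.feed(page)
print(lister.opened)
print(lister.closed)
```

The meta tag appears only in the first list, because it is written without a closing tag. Every other tag in this page is both opened and closed.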

Exercises

All the exercises and experiments performed during this session should be stored in the Session-10 folder

Exercise 1

Convert the happy server into an echo server: a server that just responds with the same message sent by the client. It should start the response message with the string "ECHO: " and then add the client's message

  • Filename: Session-10/echo-server1.py
  • Description: Once the server is running, it will print the client's messages in the server's console in green. If we send 3 messages using the nc command, this is what we will see on the Linux console:

And this is what we should see on the Server's console:

Exercise 2

Modify the server from exercise 1 so that it counts the number of connections from the clients. It should also print each client's IP and port

  • Filename: Session-10/echo-server2.py
  • Description: This is an example of the output that you should see on the Server's console. In this example 4 clients have been connected to the server

Exercise 3

Modify the server of exercise 2 so that it stores the clients' (IP, PORT) tuples in a list. After 5 clients have connected, it will print the information of all the clients in the console and finish

  • Filename: Session-10/echo-server3.py
  • Description: This is an example of the output

Exercise 4

Write a client for testing the server of exercise 3. It should connect 5 times to the server, sending the message: message i, where i goes from 0 to 4. You must use the Client0 module developed in Practice 2. Use the method debug_talk() for sending the messages to the server

  • Filename: Session-10/client-test.py
  • Description: This is an example of the output in the client's console:

And this is what is shown in the server's console:

END of the session

The session is finished. Make sure, during this week, that everything in this list is checked!

  • You have all the items of the session 9 checked!
  • Your working repo contains the Session-10 Folder with the following files:
    • happy-server.py
    • echo-server1.py
    • echo-server2.py
    • echo-server3.py
    • client-test.py
  • All the previous files have been pushed to your remote GitHub repo

Credits

  • Alvaro del Castillo. He designed and created the original content of this subject. Thanks a lot :-)
