Socket Programming in Linux

Some time last year I wanted to learn more about network programming and multithreading, so I wrote a web server in C. A few days ago I decided that my education in network programming wasn't complete without writing a client as well, so I wrote one. It's now time to write a bit about how all of this works.

Programming sockets on Linux is like any other systems programming in C: mistakes are easy to make, so you have to be careful. It's hard to grasp the details if you're just copying and pasting code from Stack Overflow. The Linux Programming Interface has several chapters dedicated to socket programming and covers it very well, along with a lot of background on how networks work and an excellent, concise description of TCP. I highly recommend the book, and I might write more about it in the future.

So let's talk a bit about how to write a web server and a client for it in C.

Let's start with the server. The full code is in server.c, but I'll summarize the important points:

  • Create a socket with socket(). The code looks like this:
int sockfd = socket(AF_INET, SOCK_STREAM, 0);

SOCK_STREAM means that we're creating a stream socket (TCP socket).

  • Bind the socket to an address. The key lines are:
struct sockaddr_in serv_addr;
uint16_t port = 8000;
...
serv_addr.sin_family = AF_INET;
serv_addr.sin_port = htons(port);              // port in network byte order
serv_addr.sin_addr.s_addr = htonl(INADDR_ANY); // listen on all interfaces
...
if (bind(sockfd, (struct sockaddr *) &serv_addr, sizeof(serv_addr)) < 0) error("Couldn't bind");

We're creating an address, serv_addr, and binding it to the socket we created above. Note that we use INADDR_ANY as the IP address on which the socket listens. This is the so-called IPv4 wildcard address, and it binds the server to all network interfaces available on the machine. That's more useful for me than binding to INADDR_LOOPBACK (127.0.0.1), because I'm running my code in a VM and accessing the server from my MacBook. Run ifconfig (or ip addr on newer distributions) on your box to see which network interfaces are available.

  • Great, we have a socket and it's bound to an address. This socket can now be used as a passive socket (one that listens for connections) or as an active socket (one that connects to a peer socket). We want a passive socket, and we mark it as one by calling the listen function. The code looks like this:
if (listen(sockfd, SOMAXCONN) < 0) error("Couldn't listen");
  • The socket is now ready to accept connections. This can happen in a loop that looks like this:
struct sockaddr_in client_addr;

while (1) {
    // accept() overwrites the length, so reset it on every iteration.
    socklen_t cli_len = sizeof(client_addr);
    int newsockfd = accept(sockfd, (struct sockaddr *) &client_addr, &cli_len);
    if (newsockfd < 0) error("Error on accept");
    // newsockfd is a file descriptor. 
    // Handle the connection by reading from and writing to this descriptor.
    ...
}
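
The socket(), bind() and listen() steps above can be sketched as a single helper. This is only a minimal sketch: make_listener, its loopback_only flag and bound_port are hypothetical names that don't come from server.c, and setting SO_REUSEADDR is an assumption added so the server can be restarted without waiting for old connections to time out.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns a listening TCP socket, or -1 on failure.
 * Passing port 0 asks the kernel to pick a free port. */
int make_listener(uint16_t port, int loopback_only)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* Assumption: allow quick restarts while old connections sit in TIME_WAIT. */
    int yes = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(yes)) < 0) {
        close(fd);
        return -1;
    }

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(loopback_only ? INADDR_LOOPBACK : INADDR_ANY);

    if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0 ||
        listen(fd, SOMAXCONN) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Reports the port the socket is actually bound to (useful with port 0). */
uint16_t bound_port(int fd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    assert(fd >= 0);    /* caller must pass a valid descriptor */
    if (getsockname(fd, (struct sockaddr *) &addr, &len) < 0)
        return 0;
    return ntohs(addr.sin_port);
}
```

With something like this, make_listener(8000, 0) would give a descriptor that can go straight into the accept loop.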

The accept function blocks until a connection request arrives and then returns a new socket for that connection. After that, it's your call how to handle this socket. You can fork a new process, create a new thread, or, as in my case, use a pool of pre-created threads that pull these sockets from a queue and do whatever is necessary to handle a web server request.
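
To make the queue idea more concrete, here is a rough sketch of a blocking queue of descriptors shared between the accept loop and worker threads. It is not the code from server.c: fd_queue, worker_ctx, worker_main, the handle callback and the capacity of 64 are all hypothetical choices made for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <unistd.h>

#define QUEUE_CAP 64   /* assumed capacity, chosen arbitrarily */

/* Hypothetical fixed-size ring buffer of accepted socket descriptors. */
struct fd_queue {
    int fds[QUEUE_CAP];
    int head, tail, count;
    pthread_mutex_t lock;
    pthread_cond_t not_empty;
    pthread_cond_t not_full;
};

void queue_init(struct fd_queue *q)
{
    q->head = q->tail = q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    pthread_cond_init(&q->not_full, NULL);
}

/* Called by the accept loop; blocks while the queue is full. */
void queue_push(struct fd_queue *q, int fd)
{
    assert(fd >= 0);    /* only valid descriptors belong in the queue */
    pthread_mutex_lock(&q->lock);
    while (q->count == QUEUE_CAP)
        pthread_cond_wait(&q->not_full, &q->lock);
    q->fds[q->tail] = fd;
    q->tail = (q->tail + 1) % QUEUE_CAP;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Called by worker threads; blocks while the queue is empty. */
int queue_pop(struct fd_queue *q)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
        pthread_cond_wait(&q->not_empty, &q->lock);
    int fd = q->fds[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    pthread_cond_signal(&q->not_full);
    pthread_mutex_unlock(&q->lock);
    return fd;
}

/* Argument for each worker; `handle` stands in for the request handler. */
struct worker_ctx {
    struct fd_queue *queue;
    void (*handle)(int fd);
};

/* Worker body: pull descriptors forever and hand them to the handler. */
void *worker_main(void *arg)
{
    struct worker_ctx *ctx = arg;
    for (;;) {
        int fd = queue_pop(ctx->queue);
        ctx->handle(fd);    /* read the request, write the response */
        close(fd);
    }
    return NULL;
}
```

In this sketch the accept loop would call queue_push(&q, newsockfd) for each connection, and each worker would be started once at startup with pthread_create(&tid, NULL, worker_main, &ctx). Compile with -pthread.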

That's the server. Now let's look at the client. The full code is in client.c. It's a pretty simple client that only connects to localhost (127.0.0.1), so you can't use it if the server is running on another machine.

You start by creating a socket with socket() and then calling connect() and passing the address to which you want to connect:

int sockfd = socket(AF_INET, SOCK_STREAM, 0);
if (sockfd == -1) {
    error("socket");
}

uint16_t port = 8000;

struct sockaddr_in serv_addr;
memset(&serv_addr, 0, sizeof(serv_addr)); // clear unused fields (needs <string.h>)
serv_addr.sin_family = AF_INET;
serv_addr.sin_port = htons(port);
serv_addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

if (connect(sockfd, (struct sockaddr *)&serv_addr, sizeof(struct sockaddr_in)) == -1) {
    error("connect");
}
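
Because this client only ever targets the loopback address, one possible way to point it at another machine is to parse a dotted-quad string with inet_pton(). A small sketch, where fill_addr is a hypothetical helper that is not part of client.c:

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fills `out` from an IPv4 string such as "192.168.1.135".
 * Returns 0 on success, -1 if `ip` is not a valid IPv4 address. */
int fill_addr(const char *ip, uint16_t port, struct sockaddr_in *out)
{
    assert(ip != NULL && out != NULL);
    memset(out, 0, sizeof(*out));
    out->sin_family = AF_INET;
    out->sin_port = htons(port);
    /* inet_pton() returns 1 on success and 0 for a malformed address. */
    if (inet_pton(AF_INET, ip, &out->sin_addr) != 1)
        return -1;
    return 0;
}
```

After fill_addr("192.168.1.135", 8000, &serv_addr) succeeds, the connect() call stays exactly the same.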

If the connection is successful, you can just send your request to the server using the write() system call and read the response with the read() call.
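
One thing to keep in mind is that write() and read() on a stream socket may transfer fewer bytes than you asked for, so both usually end up in a loop. The following is a sketch rather than the code in client.c: write_all and read_response are hypothetical helpers, and read_response assumes the server closes the connection once it has sent the whole response.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes all `len` bytes, retrying on short writes. Returns 0 or -1. */
int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: try again */
            return -1;
        }
        buf += n;
        len -= (size_t) n;
    }
    return 0;
}

/* Reads until EOF or until `cap - 1` bytes are stored; always NUL-terminates.
 * Returns the number of bytes read, or -1 on error. */
ssize_t read_response(int fd, char *buf, size_t cap)
{
    size_t total = 0;
    assert(cap > 0);            /* need room for the terminator */
    while (total < cap - 1) {
        ssize_t n = read(fd, buf + total, cap - 1 - total);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            break;              /* peer closed the connection */
        total += (size_t) n;
    }
    buf[total] = '\0';
    return (ssize_t) total;
}
```

A client could then call write_all(sockfd, request, strlen(request)) followed by read_response(sockfd, response, sizeof(response)).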

So that's a simple web server and a client. The code still has a lot of bugs (the client will overflow its buffer if I pass a request string longer than 512 chars, I'm not closing the connection explicitly, etc.), but it gets the job done.
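
For the overflow in particular, one common fix is to build the request with snprintf() and check for truncation. This is a sketch under assumptions: build_request is a hypothetical helper, and the exact request line (a bare HTTP/1.0 GET) is a guess at what the client sends, not something taken from client.c.

```c
#include <assert.h>
#include <stdio.h>

#define REQ_CAP 512   /* the buffer size mentioned above */

/* Hypothetical helper: formats a GET request for `path` into `buf`.
 * Returns the request length, or -1 if it would not fit in `cap` bytes. */
int build_request(char *buf, size_t cap, const char *path)
{
    assert(buf != NULL && path != NULL && cap > 0);
    int n = snprintf(buf, cap, "GET %s HTTP/1.0\r\n\r\n", path);
    if (n < 0 || (size_t) n >= cap)
        return -1;      /* encoding error or truncated output */
    return n;
}
```

The client can then refuse to send anything when build_request() returns -1 instead of writing past the end of the buffer.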

A final interesting thing to do is to run Wireshark and see what packets are going between the client and the server. To do this, I started the server on my Ubuntu VM in VirtualBox (192.168.1.135) and ran curl from my MacBook (192.168.1.86). Here's what I saw in Wireshark:

[Screenshot: Wireshark capture]

You can see the three-way TCP handshake at the beginning (SYN, then SYN-ACK, then ACK), followed by the rest of the exchange. It's definitely something interesting to try out.
