Introduction
Hello, and welcome, reader. This is my first post here and I thought we'd start off with something simple: a basic HTTP server. I am still trying to figure out how to structure these posts, so feel free to comment with your suggestions.
Goals
Before we start building the server, I feel it's imperative to write down some exit criteria. The HTTP 1.1 spec is over 40 pages long, so we'll try to implement something barebones and then work our way up to a decent level. In this part specifically, I want to cover:
- creating a basic server shell – this is the scaffolding of the HTTP server we'll create, and it includes the flags and the flag parser.
- experimenting a little with TCP listeners – for the purpose of this exercise, we'll only use IPv4 listeners (but you're welcome to send in a PR if you'd like to add IPv6 support!). We'll handle requests concurrently, and we'll figure out a way to cleanly shut down the server (that is, after it has processed all the requests or has hit a hard timeout).
- implementing an HTTP parser – we'll implement a basic HTTP message parser. I use the word basic here to indicate that we'll leave out certain things – gzip, multi-value headers, and a few other things you'll see down below.
- responding to an HTTP GET request – without query strings, without a trie-based router. Just a simple request, and a basic response.
The Shell
I assure you, I do not mean bash. Or this. Let's start with the scaffolding:
package main

import (
	"context"
	"flag"
	"fmt"
	"log"
	"net"
	"os"
	"os/signal"
	"syscall"
)

var (
	port = flag.Int("p", 8080, "sets the port")
	host = flag.String("h", "", "sets the host")
)

func main() {
	// Parse the command-line flags before reading *host and *port.
	flag.Parse()

	listenOn := fmt.Sprintf("%s:%d", *host, *port)
	ctx, cancel := context.WithCancel(context.Background())
	var lc net.ListenConfig
	n, err := lc.Listen(ctx, "tcp", listenOn)
	if err != nil {
		log.Fatalf("error: listen: %s\n", err.Error())
	}

	// On SIGINT or SIGTERM, cancel the context and close the listener.
	c := make(chan os.Signal, 1)
	signal.Notify(c, syscall.SIGINT, syscall.SIGTERM)
	go func() {
		log.Printf("signal: %s received\n", <-c)
		cancel()
		if err := n.Close(); err != nil {
			log.Printf("error: listen: close: %s\n", err.Error())
		}
	}()
}
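An interesting thing to note here is that we use net.ListenConfig as opposed to the usual net.Listen. We are doing this so we can pass in a context that we cancel when we handle OS signals; internally, this is also what net.Listen calls, with the first parameter set to context.Background(). Note that the context only governs the listen call itself (it is used while resolving the address) and does not affect the returned listener, which is why the signal handler still calls n.Close() explicitly.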
Unfortunately, if we were to run this code block, it would exit immediately because lc.Listen just returns a TCPListener object and nothing stops main from returning.
TCP Listener
Now, we need to write a listener loop which will work something like the following:
- wait for an incoming connection – this would mean that we need to block the program from exiting somehow;
- on receiving a connection, process it – since we are making a concurrent server, we'll dispatch a goroutine to handle this connection, which means: read the request, parse the request, process the request, emit a response, and close the connection (we are not going to respect Keep-Alive right now).
Wait for an Incoming Connection
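done := false
for !done {
	select {
	case <-ctx.Done():
		log.Println("termination signal received, exiting listener loop")
		done = true
	default:
		conn, err := n.Accept()
		if err != nil {
			// net.ErrClosed is expected once the signal handler closes
			// the listener; the next iteration then sees ctx.Done().
			if !errors.Is(err, net.ErrClosed) {
				log.Printf("error: listen-loop: %s\n", err.Error())
			}
			continue
		}
		_ = conn // handed to a goroutine in the next section
	}
}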
net.ErrClosed is returned when we try to Accept() a connection from a closed TCPListener object (and we close it in the goroutine above).
Simple enough, let's move on!
Concurrently Process the Accepted Connection
We'll create a function which accepts a net.Conn object, and we'll dispatch it using the go keyword. Later on, we'll keep track of open connections in an HTTPServer construct we create, but for now this should suffice.
func handleConnection(conn net.Conn) {
	// Always close the connection once we are done with it.
	defer func() {
		if err := conn.Close(); err != nil {
			log.Printf("error: handle: close: %s: %s", conn.RemoteAddr(), err.Error())
		}
	}()

	// io.ReadAll blocks until the client closes its side of the connection.
	content, err := io.ReadAll(conn)
	if err != nil {
		log.Printf("error: handle: %s: %s", conn.RemoteAddr(), err.Error())
		return
	}

	// Echo the request data back to the client.
	if _, err := conn.Write(content); err != nil {
		log.Printf("error: write: %s: %s", conn.RemoteAddr(), err.Error())
		return
	}
}
Right now, we're reading everything using io.ReadAll(), which can expose us to DoS vectors using a large payload, and this simply echoes the request data sent to the server. We'll fix it later. Let's test it out.
HTTP Parser
With that aside, we can focus on writing the actual HTTP parser. We'll be following the HTTP/1.1 specification, and another RFC for the semantics.
Message Format
Let's see how the specification defines an HTTP message. We'll be reading Section 2.1 of RFC 9112 (the Spec.)
HTTP-message = start-line CRLF
               *( field-line CRLF )
               CRLF
               [ message-body ]
What you're reading is written in the Augmented Backus-Naur Form (which is a short and sweet 7-pager you can read quite quickly, and I highly recommend you do). We'll dissect this piece by piece.
start-line is either a request-line or a status-line, depending on whether the message is an HTTP request or a response, respectively. Here, we define a request-line as:
request-line = method SP request-target SP HTTP-version
where method is the HTTP method, request-target has a few forms but we'll use the origin-form, and HTTP-version will always be HTTP/1.1. (You're welcome to read the definition if you'd like.) And we define a status-line as:
status-line = HTTP-version SP status-code SP [ reason-phrase ]
Here, CRLF is a carriage return followed by a line feed, so: \r\n. The *() syntax in ABNF denotes 0 or more items, so the second line of the HTTP-message rule means 0 or more field-lines or, as they are colloquially called, headers. We then add one additional CRLF, after which we send the request data (a JSON string, for example, in the case of a POST request).
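To see the grammar in action, here is a minimal sketch that takes a complete message apart along those CRLF boundaries. The helper splitMessage is hypothetical, and it assumes the whole message is already in memory, which is not how the parser later in this post works (that one reads line by line).

```go
package main

import (
	"fmt"
	"strings"
)

// splitMessage is a hypothetical helper that splits a complete raw HTTP
// message into its start-line, field-lines and body. The final bool is
// false when the empty line that ends the headers is missing.
func splitMessage(raw string) (string, []string, string, bool) {
	// The empty line (CRLF CRLF) separates the head from the body.
	head, body, ok := strings.Cut(raw, "\r\n\r\n")
	if !ok {
		return "", nil, "", false
	}
	// Inside the head, every line ends with a single CRLF.
	lines := strings.Split(head, "\r\n")
	return lines[0], lines[1:], body, true
}

func main() {
	raw := "POST /items HTTP/1.1\r\nHost: www.example.com\r\nContent-Length: 2\r\n\r\n{}"
	start, fields, body, ok := splitMessage(raw)
	fmt.Printf("%q %q %q %v\n", start, fields, body, ok)
}
```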
Let's say we need to make a request to https://www.example.com; the HTTP message would look something like the following:
GET / HTTP/1.1\r\n
Host: www.example.com\r\n
\r\n
Note the Host header: the origin-form carries only the path, so the Host header is what tells the server which host the request is meant for, and HTTP/1.1 requires clients to send it.
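As a small illustration, the message above can be assembled in code. The helper buildRequest below is hypothetical and only handles body-less requests in origin-form.

```go
package main

import (
	"fmt"
	"strings"
)

// buildRequest is a hypothetical helper that serialises a body-less
// HTTP/1.1 request in origin-form, including the Host header.
func buildRequest(method, target, host string) string {
	var b strings.Builder
	b.WriteString(method + " " + target + " HTTP/1.1\r\n") // request-line
	b.WriteString("Host: " + host + "\r\n")                // field-line
	b.WriteString("\r\n")                                  // end of headers
	return b.String()
}

func main() {
	// %q makes the CR and LF characters visible.
	fmt.Printf("%q\n", buildRequest("GET", "/", "www.example.com"))
}
```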
Basic Parser
Let's start by creating an internal/http_parser.go file and putting in some basic structures to represent the messages:
package internal

type HTTPMethod int

const (
	GET HTTPMethod = iota
	POST
	PUT
	PATCH
	DELETE
	CONNECT
	OPTIONS
)

type HTTPRequest struct {
	Method  HTTPMethod
	Path    string
	Version string
}

type HTTPStatus struct {
	Status       int
	ReasonPhrase string
	Version      string
}

type HTTPMessage struct {
	Request HTTPRequest
	Status  HTTPStatus
	Headers map[string]string // filled in by the header loop further down
	Body    []byte
}
We'll write small functions to convert a string to an HTTPMethod and vice versa. For the latter, we'll implement the fmt.Stringer interface on the HTTPMethod type.
func (h HTTPMethod) String() string {
	m := map[HTTPMethod]string{
		GET:     "GET",
		POST:    "POST",
		PUT:     "PUT",
		PATCH:   "PATCH",
		DELETE:  "DELETE",
		CONNECT: "CONNECT",
		OPTIONS: "OPTIONS",
	}
	return m[h]
}

func HTTPMethodFromString(s string) (HTTPMethod, error) {
	m := map[string]HTTPMethod{
		"GET":     GET,
		"POST":    POST,
		"PUT":     PUT,
		"PATCH":   PATCH,
		"DELETE":  DELETE,
		"CONNECT": CONNECT,
		"OPTIONS": OPTIONS,
	}
	method, ok := m[s]
	if !ok {
		return HTTPMethod(-1), fmt.Errorf("error: method: parse: invalid method string %s", s)
	}
	return method, nil
}
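The two functions above rebuild their lookup map on every call. That is perfectly fine for this exercise, but as a possible variation, here is a sketch that builds the tables once at package level. The names Method, methodNames, methodsByName and MethodFromString are hypothetical and were chosen only to keep the sketch standalone; it also lists just three methods for brevity.

```go
package main

import "fmt"

// Method mirrors the HTTPMethod type from the post; the name is changed
// only to keep this sketch standalone.
type Method int

const (
	GET Method = iota
	POST
	PUT
)

// methodNames is indexed by Method, so String needs no map at all.
var methodNames = [...]string{GET: "GET", POST: "POST", PUT: "PUT"}

// methodsByName is built once at start-up rather than on every call.
var methodsByName = func() map[string]Method {
	m := make(map[string]Method, len(methodNames))
	for i, name := range methodNames {
		m[name] = Method(i)
	}
	return m
}()

// String implements fmt.Stringer and guards against out-of-range values.
func (m Method) String() string {
	if m < 0 || int(m) >= len(methodNames) {
		return "UNKNOWN"
	}
	return methodNames[m]
}

// MethodFromString is the reverse lookup.
func MethodFromString(s string) (Method, error) {
	m, ok := methodsByName[s]
	if !ok {
		return Method(-1), fmt.Errorf("invalid method string %q", s)
	}
	return m, nil
}

func main() {
	m, err := MethodFromString("POST")
	fmt.Println(m, err)
	fmt.Println(Method(42))
}
```

Because Method implements fmt.Stringer, fmt.Println prints the method name rather than the underlying integer.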
Perfect, now we are going to parse out an HTTP message by reading it line by line. The function signature will look something like the following:
type HTTPMessageType int

const (
	HTTPMessageRequest HTTPMessageType = iota
	HTTPMessageResponse
)

func ParseHTTPMessage(r io.Reader) (HTTPMessage, error) {
	reader := bufio.NewReader(r)
	m := HTTPMessage{}
	_ = reader
	return m, nil
}
We are going to start by parsing the start-line:
startLine, err := reader.ReadString('\n')
if err != nil {
	return HTTPMessage{}, err
}
startLine = strings.TrimSpace(startLine)
requestLine, err := parseRequestLine(startLine)
if err != nil {
	return HTTPMessage{}, err
}
Following this, let's check out how we'd parse the actual start-line in the parseRequestLine() function.
func parseRequestLine(s string) (HTTPRequest, error) {
	r := bufio.NewReader(bytes.NewBuffer([]byte(s)))
	method, err := r.ReadString(' ')
	if err != nil {
		return HTTPRequest{}, err
	}
	m, err := HTTPMethodFromString(strings.TrimSpace(method))
	if err != nil {
		return HTTPRequest{}, err
	}
	requestPath, err := r.ReadString(' ')
	if err != nil {
		return HTTPRequest{}, err
	}
	return HTTPRequest{
		Method:  m,
		Path:    strings.TrimSpace(requestPath),
		Version: "HTTP/1.1",
	}, nil
}
For now, we'll statically assign HTTP/1.1 as the default HTTP version. We can now move on to the headers of the actual message:
m := HTTPMessage{Headers: map[string]string{}, Request: requestLine}
for {
	line, err := reader.ReadString('\n')
	if err != nil {
		return HTTPMessage{}, err
	}
	line = strings.TrimSpace(line)
	if line == "" {
		break
	}
	k, v, err := parseHeaderLine(line)
	if err != nil {
		return HTTPMessage{}, err
	}
	m.Headers[k] = v
}
And parseHeaderLine() is going to work like the following:
func parseHeaderLine(s string) (string, string, error) {
	b := []byte(s)
	r := bufio.NewReader(bytes.NewBuffer(b))
	// ReadString includes the ':' delimiter in key.
	key, err := r.ReadString(':')
	if err != nil {
		return "", "", err
	}
	// Whatever is left after the key is the value.
	valueBytes := make([]byte, len(b)-len([]byte(key)))
	if _, err := r.Read(valueBytes); err != nil {
		return "", "", err
	}
	// Header names are case-insensitive, so normalise the key. The value
	// keeps its case because header values can be case-sensitive.
	key = strings.ToLower(strings.TrimSpace(key[:len(key)-1]))
	value := strings.TrimSpace(string(valueBytes))
	return key, value, nil
}
We're doing a few things here which are worth noting:
- We read till the first : (colon) and trim out the last character (since ReadString() includes the delimiter) by substring-ing the key to len(key) - 1.
- We then read the value; to calculate its length, we take the total length minus the length of the key, make a buffer of that size, and Read the bytes into the buffer.
- We finally call strings.ToLower to ensure we normalise the header name to lowercase.
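The steps above can also be expressed without a bufio.Reader. The following is an alternative sketch, not the function used in this post: splitHeader is a hypothetical name, and it relies on strings.Cut (available since Go 1.18) to split on the first colon. It lowercases only the header name and leaves the value's case untouched.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// splitHeader is a hypothetical alternative to parseHeaderLine. It splits
// on the first colon, lowercases only the field name and trims the
// optional whitespace around the value.
func splitHeader(line string) (string, string, error) {
	name, value, ok := strings.Cut(line, ":")
	if !ok || name == "" {
		return "", "", errors.New("malformed header line")
	}
	return strings.ToLower(name), strings.TrimSpace(value), nil
}

func main() {
	k, v, err := splitHeader("Content-Type: application/json")
	fmt.Printf("%q %q %v\n", k, v, err)

	// Only the first colon separates the name from the value.
	k, v, err = splitHeader("X-Time: 10:30")
	fmt.Printf("%q %q %v\n", k, v, err)
}
```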
Finally, let's parse out the body. We are going to make an assumption (which is true for all good HTTP clients): a request which has a body also carries a Content-Length header. We are going to check if there is a header called content-length and, if it exists, we'll parse out the value as an integer. If that value is more than 0, we can read that many bytes into the Body field of the HTTPMessage structure.
m := HTTPMessage{Headers: map[string]string{}, Request: requestLine}
contentLength := 0
for {
	// ...
	if k == "content-length" {
		contentLength, err = strconv.Atoi(v)
		if err != nil {
			return HTTPMessage{}, err
		}
	}
}
if contentLength > 0 {
	m.Body = make([]byte, contentLength)
	// Read from the buffered reader, not from r: its buffer may already
	// hold part of the body. Note that a single Read may return fewer
	// than contentLength bytes.
	if _, err := reader.Read(m.Body); err != nil {
		return HTTPMessage{}, err
	}
}
And this does it. We have successfully built a simple HTTP parser. We need to wire it up with the handleConnection function to do something useful:
func handleConnection(conn net.Conn) {
	defer func() {
		if err := conn.Close(); err != nil {
			log.Printf("error: handle: close: %s: %s", conn.RemoteAddr(), err.Error())
		}
	}()
	message, err := internal.ParseHTTPMessage(conn)
	if err != nil {
		log.Printf("error: handle: parse: %s: %s", conn.RemoteAddr(), err.Error())
		return
	}
	fmt.Println(message)
	// 204 means "No Content": a status line and headers, but no body.
	if _, err := conn.Write([]byte("HTTP/1.1 204 No Content\r\n\r\n")); err != nil {
		log.Printf("error: write: %s: %s", conn.RemoteAddr(), err.Error())
		return
	}
}
For now, let's just print out the message to stdout and see if this works.
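If you'd rather check this from Go than from a terminal, a rough sketch of a loopback round trip could look like the following. The helper roundTrip is hypothetical, and so is the handler inside it: it is a stand-in for handleConnection that reads the request head and always answers with a 204, so the sketch stays self-contained.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"strings"
)

// roundTrip is a hypothetical test helper. It starts a throwaway listener
// on an ephemeral loopback port, sends rawRequest to it and returns the
// status line of the response.
func roundTrip(rawRequest string) (string, error) {
	ln, err := net.Listen("tcp4", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer ln.Close()

	go func() {
		conn, err := ln.Accept()
		if err != nil {
			return
		}
		defer conn.Close()
		// Stand-in for handleConnection: consume the request head
		// (everything up to the empty line), then reply.
		r := bufio.NewReader(conn)
		for {
			line, err := r.ReadString('\n')
			if err != nil || strings.TrimSpace(line) == "" {
				break
			}
		}
		conn.Write([]byte("HTTP/1.1 204 No Content\r\n\r\n"))
	}()

	conn, err := net.Dial("tcp4", ln.Addr().String())
	if err != nil {
		return "", err
	}
	defer conn.Close()
	if _, err := conn.Write([]byte(rawRequest)); err != nil {
		return "", err
	}
	status, err := bufio.NewReader(conn).ReadString('\n')
	return strings.TrimSpace(status), err
}

func main() {
	status, err := roundTrip("GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
	fmt.Println(status, err)
}
```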
And we're done! That's it for part one.
Conclusion
I hope you liked this first post and learnt something. In the next few posts, we are going to add slightly better error handling and use a worker model (sort of like nginx) to process the requests. Tune in for part two, the moment I get done with my procrastination!
Bonus
Remember the question I asked: why would io.ReadFull(r, m.Body) return an error? Let's dig into it.
A bufio.Reader is a buffered reader, which means that at any time its buffer may hold data that it has already read from r, so r itself no longer contains all the data.
Assume r contains Hello World in its underlying structures, and the bufio.Reader reads Hello. Now, r contains World and the reader contains Hello. While the total size is 11, if we try to read all of it from r, we'll get an EOF error because r no longer holds the part that the buffered reader consumed. Two fixes here:
- change to io.ReadFull(reader, m.Body); or,
- do what we did: use the buffered reader itself so we can get the buffered data as well as any additional data in the underlying r.
Using ReadFull() makes more sense to me, since we need to read exactly contentLength bytes or error out.
HTTP Server - Part 1
In the first part of this series, we build a basic HTTP parser and lay down some TCP plumbing for everything to work. We conclude by testing out our server with a simple cURL request.