Building an HTTP Server in Rust

I started with a simple goal: build a small HTTP server in Rust. No frameworks. Just full control.

It worked.

But it wasn't a server yet.

At some point while learning Rust, I kept coming back to the same idea: What actually happens under the hood of something like Nginx or Apache?

Not how to use them — how they really work.

Eventually, curiosity turned into boredom with everything else, and I started building one myself.

Genesis

I called it Ferrox.

The name felt right for something built in Rust — simple, a bit heavy, and hard to bend once it's set.

Basic connection & Response

At the start, I did not use any external crates, only std.

use std::net::{TcpListener, TcpStream};
use std::io::{Read, Write};

fn handle_connection(mut stream: TcpStream) {
    // Read data from the buffer and then parse it (no use for now)
    let mut buffer: [u8; 1024] = [0; 1024];

    // `read` returns how many bytes actually arrived; print only those
    let bytes_read = stream.read(&mut buffer).unwrap();

    println!("{}", String::from_utf8_lossy(&buffer[..bytes_read]));

    // crafting basic HTTP response
    let response: &str = "HTTP/1.1 200 OK\r\n\r\nHello, Ferrox!";

    // Writing this response to the stream (`write_all` keeps going until every byte is sent)
    stream.write_all(response.as_bytes()).unwrap();
    stream.flush().unwrap();
}

fn main() {
    // Bind the listener
    let listener: TcpListener = TcpListener::bind("127.0.0.1:80").unwrap();

    println!("Server running on http://127.0.0.1:80");

    // Accept incoming stream (no threading or async yet)
    for stream in listener.incoming() {
        let stream: TcpStream = stream.unwrap();
        handle_connection(stream);
    }
}

As you can see, this version is only the embryo of an HTTP server. It can already respond, but there is still no threading or concurrency.

While we're at it, let's take a look at something interesting here:

let response: &str = "HTTP/1.1 200 OK\r\n\r\nHello, Ferrox!";

This may look confusing at first, but that's how HTTP responses are formed:

HTTP/{VERSION} {STATUS} {MESSAGE}

Headers

Body

So in practice, we hardcoded the response here, and since we did not specify the content type, browsers will usually interpret it as text.
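The layout above can also be assembled by a small helper instead of a hardcoded string. This is only a sketch: `build_response` is a hypothetical name, not part of Ferrox, and the `Content-Length` header is an addition that lets the client know where the body ends.

```rust
/// Assembles a raw HTTP/1.1 response: status line, headers, blank line, body.
/// (Hypothetical helper for illustration.)
fn build_response(status: u16, reason: &str, content_type: &str, body: &[u8]) -> Vec<u8> {
    // The status line and headers are plain text; every line ends with CRLF,
    // and an empty line separates the headers from the body.
    let head = format!(
        "HTTP/1.1 {status} {reason}\r\n\
         Content-Type: {content_type}\r\n\
         Content-Length: {}\r\n\
         \r\n",
        body.len()
    );

    let mut response = head.into_bytes();
    response.extend_from_slice(body);
    response
}
```

Calling `build_response(200, "OK", "text/plain; charset=utf-8", b"Hello, Ferrox!")` yields the same kind of byte sequence as the hardcoded string, with the content type stated explicitly.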

File serving

This is where the code started getting messy as more functionality was added.

use std::fs::File;
use std::path::Path;

fn handle_connection(mut stream: TcpStream) {
    let mut buffer: [u8; 1024] = [0; 1024];

    stream.read(&mut buffer).unwrap();

    // Basic request decoding, really fragile
    let request: std::borrow::Cow<'_, str> = String::from_utf8_lossy(&buffer);
    let first_line = request.lines().next().unwrap();
    let parts: Vec<&str> = first_line.split_whitespace().collect();

    // Hardcoded file for now
    let path = Path::new("www/index.html");
    let display = path.display();

    // Opening and reading the file
    let mut file = match File::open(&path) {
        Err(why) => panic!("couldn't open {}: {}", display, why),
        Ok(file) => file,
    };

    let mut s = String::new();
    match file.read_to_string(&mut s) {
        Err(why) => panic!("couldn't read {}: {}", display, why),
        Ok(_) => print!("{} contains:\n{}", display, s),
    }

    let (method, path, version) = (parts[0], parts[1], parts[2]);

    // Basic logging
    println!("Method: {}\nPath: {}\nVersion: {}", method, path, version);

    // Crafting the response, but this time with a specified content type, so the browser understands it's an HTML
    let response = format!("HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n{}", s);

    stream.write_all(response.as_bytes()).unwrap();
    stream.flush().unwrap();
}

Note that this example loads the whole file into memory before serving it. Real HTTP servers do not do that, because if you ask one to serve a 15 GB zip file, all 15 GB would be loaded into the server's RAM. That does not look very performant, does it? We'll deal with that problem later.

Routing

At first, routing was designed only to serve HTML files. I added MIME type detection later, but it already had basic path traversal protection.

use std::{fs::File, path::PathBuf};

const SERVING_DIR: &str = "www"; // Serving directory

fn handle_connection(mut stream: TcpStream) {
    let mut buffer: [u8; 1024] = [0; 1024];

    stream.read(&mut buffer).unwrap();

    let request: std::borrow::Cow<'_, str> = String::from_utf8_lossy(&buffer);
    let first_line = request.lines().next().unwrap();
    let parts: Vec<&str> = first_line.split_whitespace().collect();
    let (method, req_path, version) = (parts[0], parts[1], parts[2]);

    let path = PathBuf::from(SERVING_DIR).join(req_path.trim_start_matches('/'));

    // Canonicalize the path
    let mut canonical = match path.canonicalize() {
        Ok(p) => p,
        Err(_) => {
            println!("File not found");
            return;
        }
    };

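    // `canonicalize` returns an absolute path, so it must be compared against the canonical serving directory.
    // If the path starts with that directory, the file is legal to serve.
    let base = PathBuf::from(SERVING_DIR).canonicalize().unwrap();
    if !canonical.starts_with(&base) {
        println!("Illegal path."); // TODO: Forbidden
        return;
    }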

    if canonical.is_dir() {
        // Try to display default index.html if request points to a directory
        canonical = canonical.join("index.html");
    }

    let display = canonical.display();

    let mut file = match File::open(&canonical) {
        Err(why) => panic!("couldn't open {}: {}", display, why),
        Ok(file) => file,
    };

    let mut s = String::new();
    match file.read_to_string(&mut s) {
        Err(why) => panic!("couldn't read {}: {}", display, why),
        Ok(_) => println!(
            "Method: {}\nPath: {}\nVersion: {}",
            method, req_path, version
        ),
    }

    let response = format!("HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n{}", s);

    stream.write_all(response.as_bytes()).unwrap();
    stream.flush().unwrap();
}
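The traversal check can also be sketched without touching the filesystem. Note that this is a different technique from the `canonicalize` approach used above: it is purely lexical, so it rejects `..` outright but does not resolve symlinks. The function name `resolve_within` is hypothetical.

```rust
use std::path::{Component, Path, PathBuf};

/// Joins `requested` onto `base`, refusing any path that tries to climb out.
/// Purely lexical: nothing is read from disk and symlinks are not resolved.
fn resolve_within(base: &Path, requested: &str) -> Option<PathBuf> {
    let mut resolved = base.to_path_buf();

    for component in Path::new(requested.trim_start_matches('/')).components() {
        match component {
            // An ordinary file or directory name: descend into it.
            Component::Normal(part) => resolved.push(part),
            // `.` changes nothing.
            Component::CurDir => {}
            // `..`, a root, or a Windows drive prefix: reject the request.
            Component::ParentDir | Component::RootDir | Component::Prefix(_) => return None,
        }
    }

    Some(resolved)
}
```

A request for `/css/site.css` resolves to `www/css/site.css`, while `/../etc/passwd` is rejected. The lexical check and the canonical check can be combined: the first is cheap, the second also catches symlinks that point outside the serving directory.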

Refactor

At this stage, the flow had already become messy in a single file, so I split it up to keep the structure cleaner and introduce a bit of OOP.

src
 |- handlers
 |   |- static_files.rs - routing & serving
 |- http
 |   |- request.rs - parsing the request
 |   |- response.rs - response helpers
 |- main.rs - app entry point
 |- server.rs - networking

MIME & Bytes

The main goal of an HTTP server is to be able to serve any type of file requested by the client, or at least do its best to serve as many as possible. Since this example only handles HTML, it is too limited. Let's fix that.

We need to be able to determine the MIME type based on the file extension. Since there are hundreds, if not thousands, of them, it is much faster to use an existing solution. The mime_guess crate is perfect for that.
Earlier, this solution wrote raw strings into the response, which is not suitable for every file type. This part also needs to be reworked so it can serve everything as bytes, not as text.

Not everything is a string, but everything is bytes.

pub fn serve_file(file_path: &str) -> Result<Response, std::io::Error> {
    let path = PathBuf::from(SERVING_DIR).join(file_path.trim_start_matches('/'));
    let base = PathBuf::from(SERVING_DIR).canonicalize().expect("Serving dir must exist");

    let mut canonical = match path.canonicalize() {
        Ok(p) => p,
        Err(_) => {
            // Error templates were implemented in the meantime
            let body = render_error("404", "Not Found");
            return Ok(Response {
                status: "404 Not Found",
                body,
                content_type: mime::TEXT_HTML,
            });
        }
    };

    if !canonical.starts_with(&base) {
        let body = render_error("403", "Forbidden");

        return Ok(Response {
            status: "403 Forbidden",
            body,
            content_type: mime::TEXT_HTML,
        });
    }

    if canonical.is_dir() {
        canonical = canonical.join("index.html");
    }

    // this function reads the file as bytes.
    let body = std::fs::read(&canonical)?;

    // detect MIME type or fallback to text/plain if fails
    let mime = mime_guess::from_path(&canonical).first_or_text_plain();

    Ok(Response {
        status: "200 OK",
        body,
        content_type: mime,
    })
}
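To make the idea of extension-based detection concrete, here is a hand-rolled stand-in. It is not how `mime_guess` works internally and not its API, just a tiny lookup table under the hypothetical name `guess_mime`. Unlike the code above, it falls back to `application/octet-stream`, which tells browsers to treat unknown files as opaque binary data.

```rust
use std::path::Path;

/// Maps a file extension to a MIME type (illustrative table, far from complete).
fn guess_mime(path: &Path) -> &'static str {
    // Extensions are compared case-insensitively: `INDEX.HTML` is still HTML.
    let ext = path
        .extension()
        .and_then(|e| e.to_str())
        .map(|e| e.to_ascii_lowercase());

    match ext.as_deref() {
        Some("html") | Some("htm") => "text/html",
        Some("css") => "text/css",
        Some("js") => "text/javascript",
        Some("json") => "application/json",
        Some("png") => "image/png",
        Some("jpg") | Some("jpeg") => "image/jpeg",
        Some("svg") => "image/svg+xml",
        Some("txt") => "text/plain",
        // Unknown or missing extension: treat the file as opaque bytes.
        _ => "application/octet-stream",
    }
}
```

A table like this quickly becomes a maintenance burden, which is the practical argument for relying on a maintained crate instead.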

File streaming

As I mentioned before, Ferrox was loading each file fully into memory before sending it. This is very inefficient for large files: streaming a file piece by piece keeps memory usage small and constant, no matter how big the file is. Here's how I handled it.

Instead of:

stream.write_all(&bytes).unwrap();

I did this:

// Response struct determines body type
match &mut response.body {
    Body::Bytes(bytes) => {
        // Error templates are generated at compile time; they are always text and small, so it's not a problem to write them all at once
        stream.write_all(bytes)?;
    }
    Body::File(file) => {
        // If this is a served file, we stream it using this function from the std library
        std::io::copy(file, &mut stream)?;
    }
}
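The `Body` type itself is not shown above, so the following is a sketch of what it could look like, together with a manual version of the chunked copy that `std::io::copy` performs for us. The exact shape of the enum and the names `copy_in_chunks` and `write_body` are assumptions for illustration.

```rust
use std::fs::File;
use std::io::{self, Read, Write};

/// Assumed shape of the response body: small in-memory data or an open file.
pub enum Body {
    Bytes(Vec<u8>),
    File(File),
}

/// Copies `reader` to `writer` through a fixed 8 KiB buffer.
/// Memory usage stays constant regardless of how large the source is.
fn copy_in_chunks<R: Read, W: Write>(reader: &mut R, writer: &mut W) -> io::Result<u64> {
    let mut buf = [0u8; 8192];
    let mut total = 0u64;

    loop {
        let n = match reader.read(&mut buf) {
            Ok(0) => break, // End of input.
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        writer.write_all(&buf[..n])?;
        total += n as u64;
    }

    Ok(total)
}

/// Writes the body to `out` and returns the number of bytes sent.
pub fn write_body<W: Write>(body: &mut Body, out: &mut W) -> io::Result<u64> {
    match body {
        Body::Bytes(bytes) => {
            out.write_all(bytes)?;
            Ok(bytes.len() as u64)
        }
        Body::File(file) => copy_in_chunks(file, out),
    }
}
```

Because `write_body` is generic over `Write`, the same code can target a `TcpStream` in the server and a `Vec<u8>` in a test.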

Directory indexing

This part is not strictly necessary to create a fully functioning HTTP server, but it is an excellent UX improvement.

The principle is very simple:

Determine if it's a directory
List all files
Append those files as <a> links inside an HTML template

fn index_files(path: PathBuf, display_path: &String) -> Result<Vec<u8>, std::io::Error> {
    let dir_entries = std::fs::read_dir(&path)?;
    let mut html_list = String::new();

    if display_path != "/" {
        // If it's not the root of the serving dir, we add a link to go up to the parent directory
        html_list.push_str("<li><a href=\"..\">..</a></li>");
    }

    for entry in dir_entries.flatten() {
        let name = entry.file_name().to_string_lossy().to_string();

        // Skip sensitive entries like .env, .git, etc.
        if name.starts_with('.') { continue; }

        let href = if entry.file_type()?.is_dir() {
            format!("{}/", name)
        } else {
            name
        };

        // Escape the name before pushing the string to prevent XSS, as we are basically doing SSR here
        html_list.push_str(&format!("<li><a href=\"{safe_href}\">{safe_href}</a></li>", safe_href = encode_safe(&href)));
    }

    Ok(render_indexing(display_path, &html_list))
}
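The escaping step deserves a closer look, because file names are attacker-controlled input as soon as anyone can upload or rename files. Below is a minimal std-only sketch of what such an escaping function does. `escape_html` and `list_item` are hypothetical names standing in for the helper used above, not its actual implementation.

```rust
/// Escapes the characters that are significant in HTML text and attribute values.
fn escape_html(input: &str) -> String {
    let mut out = String::with_capacity(input.len());

    for ch in input.chars() {
        match ch {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&#x27;"),
            _ => out.push(ch),
        }
    }

    out
}

/// Renders one directory entry as a list item with the name escaped.
fn list_item(name: &str) -> String {
    let safe = escape_html(name);
    format!("<li><a href=\"{safe}\">{safe}</a></li>")
}
```

With this in place, a file named `<script>alert(1)</script>` shows up as harmless text in the listing instead of running in the visitor's browser.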

Threading & Timeouts

As the project grew, I realized how inefficient and vulnerable it still was. This included several critical points:

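A blocking thread. As you already saw, there is only one loop handling all incoming connections, so one slow client delays everyone else.
Vulnerable to Slowloris attacks. Without timeouts, a client that sends its request extremely slowly can hold a connection open indefinitely.

I used the threadpool crate to implement a thread-per-connection model with a fixed number of workers, similar to the threaded model Apache uses.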

const MAX_WORKERS: usize = 4;
const READ_TIMEOUT_SEC: u64 = 5;
const WRITE_TIMEOUT_SEC: u64 = 5;

pub fn serve(addr: &str) {
    let listener = TcpListener::bind(addr).unwrap();
    // Create pool
    let pool = ThreadPool::new(MAX_WORKERS);

    println!("Ferrox running on http://{addr} with {MAX_WORKERS} workers");

    for stream in listener.incoming() {
        match stream {
            Ok(stream) => {
                // Timeout for reading
                let _ = stream.set_read_timeout(Some(Duration::from_secs(READ_TIMEOUT_SEC)));
                // Timeout for writing
                let _ = stream.set_write_timeout(Some(Duration::from_secs(WRITE_TIMEOUT_SEC)));

                // Handle TCP stream here
                pool.execute(move || {
                    if let Err(e) = handle(stream) {
                        eprintln!("Connection error: {}", e);
                    }
                });
            }
            Err(e) => eprintln!("Failed to accept connection: {}", e),
        }
    }
}
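To see what a thread pool does behind `pool.execute`, here is a simplified std-only sketch. It is not the `threadpool` crate's implementation, and `MiniPool` is a hypothetical name. A fixed number of workers pull boxed closures from one shared channel.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

type Job = Box<dyn FnOnce() + Send + 'static>;

/// A fixed-size pool: `size` worker threads pull jobs from one shared queue.
pub struct MiniPool {
    sender: Option<mpsc::Sender<Job>>,
    workers: Vec<thread::JoinHandle<()>>,
}

impl MiniPool {
    pub fn new(size: usize) -> Self {
        let (sender, receiver) = mpsc::channel::<Job>();
        let receiver = Arc::new(Mutex::new(receiver));

        let workers = (0..size)
            .map(|_| {
                let receiver = Arc::clone(&receiver);
                thread::spawn(move || loop {
                    // The lock is held only while waiting for the next job,
                    // not while the job itself runs.
                    let job = receiver.lock().unwrap().recv();
                    match job {
                        Ok(job) => job(),
                        Err(_) => break, // Sender dropped: no more work.
                    }
                })
            })
            .collect();

        Self { sender: Some(sender), workers }
    }

    /// Queues a job; it runs as soon as a worker is free.
    pub fn execute<F: FnOnce() + Send + 'static>(&self, job: F) {
        self.sender.as_ref().unwrap().send(Box::new(job)).unwrap();
    }

    /// Closes the queue and waits until every queued job has finished.
    pub fn join(mut self) {
        drop(self.sender.take());
        for worker in self.workers.drain(..) {
            worker.join().unwrap();
        }
    }
}
```

The important property is visible in `execute`: submitting a job never creates a thread. If all workers are busy, the job simply waits in the queue, which is exactly what happens to the fifth connection when `MAX_WORKERS` is 4.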

Async Era

threadpool is indeed a decent library, but for our use case, it may not be the best possible solution.

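Earlier, we implemented a thread-per-connection model on top of a fixed pool of workers, essentially the same approach as Apache's classic threaded configuration. In this architecture, every connection occupies a whole thread until it is finished, even while that thread is only waiting for the network. With four workers, a fifth simultaneous connection has to wait in the queue, and raising the number of threads to match the number of clients becomes highly inefficient for websites with many concurrent users.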

Here's a visualization that explains the concept better:

[Figure: Apache architecture visualisation. Source: ScalableThread]

Creating thousands of operating system threads introduces significant overhead. Each thread consumes memory, requires scheduling by the kernel, and increases the amount of context switching the CPU must perform. As concurrency grows, this model scales poorly and quickly becomes a bottleneck.

To solve this, we need to move toward an asynchronous, event-driven architecture. Instead of dedicating one thread to every connection, an async runtime can efficiently manage thousands of concurrent requests using a much smaller number of threads. Instead of blocking a thread while waiting for network or file I/O, asynchronous tasks yield execution back to the runtime, allowing other tasks to continue running efficiently.

This is the same fundamental idea used by servers such as NGINX and modern Rust runtimes like Tokio.

Refactoring to Tokio

Here is how I used the essentials of tokio to rewrite parts of the project around an asynchronous model. These are the main things that changed:

use std::io::{Error, ErrorKind, Result};
use std::time::Duration;
// Notice how we replaced some std imports with tokio ones
use tokio::io::{AsyncReadExt, AsyncWriteExt};
use tokio::net::{TcpListener, TcpStream};

use crate::handlers::static_files::serve_file;
use crate::http::request::Request;
use crate::http::response::{Body, Response};
use crate::utils::logger;

const MAX_HEADER_SIZE: u64 = 8192; // 8KB
const CONNECTION_TIMEOUT_SEC: u64 = 10;

pub async fn serve(addr: &str) {
    // We now use Tokio's TcpListener
    let listener = TcpListener::bind(addr).await.expect("Failed to bind");

    println!("Ferrox running on http://{addr}");

    loop {
        let (stream, _) = match listener.accept().await {
            Ok(res) => res,
            Err(e) => {
                logger::error_log("core", format!("Failed to accept: {}", e));
                continue;
            }
        };

        // Spawn the connection handler as an async task
        tokio::spawn(async move {
            let duration = Duration::from_secs(CONNECTION_TIMEOUT_SEC);

            // The timeout had to be reworked: a single deadline now covers the whole connection handler
            match tokio::time::timeout(duration, handle(stream)).await {
                Ok(Err(e)) => {
                    logger::error_log("core", format!("Connection error: {}", e));
                }
                Err(_) => {
                    logger::error_log("core", "Connection timed out".to_string());
                }
                Ok(Ok(())) => {
                }
            }
        });
    }
}


// This part remains mostly unchanged, aside from adding `.await` to asynchronous operations
async fn handle(mut stream: TcpStream) -> Result<()> {
    let mut full_data: Vec<u8> = Vec::new();
    let mut temp_buffer: [u8; 1024] = [0u8; 1024];

    loop {
        let bytes_read = stream.read(&mut temp_buffer).await?;

        if bytes_read == 0 {
            return Ok(());
        }

        full_data.extend_from_slice(&temp_buffer[..bytes_read]);

        if full_data.windows(4).any(|window| window == b"\r\n\r\n") {
            break;
        }

        if full_data.len() > MAX_HEADER_SIZE as usize {
            return Err(Error::new(
                ErrorKind::InvalidData,
                "Max header size reached.",
            ));
        }
    }

    let request = match Request::parse(&full_data) {
        Ok(r) => r,
        Err(e) => {
            logger::error_log("parser", format!("Failed to parse http request: {}", e));

            let error_res = Response::error("400", "Bad Request");
            let _ = error_res.write_headers(&mut stream).await?;
            if let Body::Bytes(b) = error_res.body {
                let _ = stream.write_all(&b).await;
            }

            return Ok(());
        }
    };

    let mut response: Response = match serve_file(&request.path).await {
        Ok(r) => r,
        Err(e) => {
            logger::error_log("file", format!("Failed to serve static file: {}", e));
            Response::error("500", "Internal Server Error")
        }
    };

    response.write_headers(&mut stream).await?;

    match &mut response.body {
        Body::Bytes(bytes) => {
            stream.write_all(bytes).await?;
        }
        Body::File(file) => {
            tokio::io::copy(file, &mut stream).await?;
        }
    }

    logger::access(&request, &response, &stream)?;

    Ok(())
}

The most difficult part was refactoring the serve() function to tokio. For the rest of it, I mostly had to change function signatures to async and add .await where necessary.
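The read loop at the top of `handle()` is worth isolating, because the logic is independent of async. Here is a blocking, std-only sketch of the same idea; `find_head_end` and `read_head` are hypothetical names, and the real code uses Tokio's `AsyncReadExt` instead of `std::io::Read`. It accumulates data until the `\r\n\r\n` terminator appears and also returns whatever was read past it.

```rust
use std::io::{self, Read};

/// Returns the index just past the first `\r\n\r\n`, if the terminator is present.
fn find_head_end(data: &[u8]) -> Option<usize> {
    data.windows(4).position(|w| w == b"\r\n\r\n").map(|i| i + 4)
}

/// Reads until the end of the request head and returns `(head, leftover)`.
/// An empty head means the peer closed the connection without sending anything.
fn read_head<R: Read>(reader: &mut R, max_head: usize) -> io::Result<(Vec<u8>, Vec<u8>)> {
    let mut data: Vec<u8> = Vec::new();
    let mut chunk = [0u8; 1024];

    loop {
        // The terminator may arrive in any read, so check after each one.
        if let Some(end) = find_head_end(&data) {
            let leftover = data.split_off(end);
            return Ok((data, leftover));
        }

        if data.len() > max_head {
            return Err(io::Error::new(io::ErrorKind::InvalidData, "request head too large"));
        }

        let n = reader.read(&mut chunk)?;
        if n == 0 {
            return if data.is_empty() {
                Ok((Vec::new(), Vec::new()))
            } else {
                Err(io::Error::new(io::ErrorKind::UnexpectedEof, "connection closed mid-request"))
            };
        }

        data.extend_from_slice(&chunk[..n]);
    }
}
```

Returning the leftover bytes matters once a connection carries more than one request, since a single read can contain the end of one request and the beginning of the next.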

Basic configuration

As Ferrox evolved, one major limitation became impossible to ignore: it had no configuration system whatsoever. Values such as the serving directory were hardcoded constants, so adapting the server to a different environment meant editing the source and recompiling.

I used the well-known serde crate, together with serde_yaml, to parse a YAML file that contains the configuration for Ferrox.

use serde::Deserialize;
use std::{collections::HashMap, fs};

// Essentials, port, and binding address.
#[derive(Deserialize, Debug, Clone)]
pub struct ServerConfig {
    pub port: u16,
    pub addr: String,
}

// Serving & logging directories.
#[derive(Deserialize, Debug, Clone)]
pub struct PathsConfig {
    pub serve_dir: String,
    pub log_dir: String
}

#[derive(Deserialize, Debug, Clone)]
pub struct Config {
    pub server: ServerConfig,
    pub paths: PathsConfig,
    // Default headers to send with every response (optional).
    #[serde(default)]
    pub headers: HashMap<String, String>,
}

impl Config {
    // Load the config.
    pub fn load(path: &str) -> Result<Self, Box<dyn std::error::Error>> {
        let config_str: String = fs::read_to_string(path)?;
        
        let config: Config = serde_yaml::from_str(&config_str)?; 
        
        Ok(config)
    }
}

Here is what the final configuration file looked like:

server:
  port: 80
  addr: 0.0.0.0 # or 127.0.0.1 if binding exclusively to localhost.

paths:
  serve_dir: www
  log_dir: logs

# Optional: some useful security headers
headers:
  X-Content-Type-Options: nosniff
  X-Frame-Options: DENY

Then, to apply it in a program, I had to pass the loaded config to all functions that depended on it. For example:

#[tokio::main]
async fn main() {
    // Cosplaying Docker...
    let config: Config = config::Config::load("ferrox-compose.yml").expect("Failed to load ferrox-compose.yml");
    // Share the configuration across threads without duplicating the underlying data.
    let shared_config: Arc<Config> = Arc::new(config);
    server::serve(shared_config).await;
}

This configuration system was enough for the initial versions of the server, but it still had obvious weaknesses. Most fields lacked sensible defaults, meaning even small mistakes in the YAML file could break startup entirely.

TLS Encryption

Of course, a crucial step in bringing Ferrox closer to a production-ready HTTP server is making sure it accepts connections over HTTPS.

As you may already know, the S in HTTPS stands for secure: the HTTP traffic travels over a connection encrypted with TLS. This has become the industry standard for websites.

I used rustls, a TLS implementation written in Rust, through the tokio-rustls crate; it provides a modern, memory-safe alternative to traditional TLS libraries. The principle is very simple: wrap a TCP connection with TLS and add redirection from HTTP to HTTPS. Everything else remains unchanged.

Here is an example of a TLS config builder:

fn load_tls_config(config: &Config) -> std::io::Result<ServerConfig> {
    // Reading certificate chain
    let cert_file = File::open(&config.tls.cert_path)?;
    let mut cert_reader = BufReader::new(cert_file);
    let certs: Vec<CertificateDer<'static>> =
        rustls_pemfile::certs(&mut cert_reader).collect::<std::result::Result<Vec<_>, _>>()?;

    // Reading private key
    let key_file = File::open(&config.tls.key_path)?;
    let mut key_reader = BufReader::new(key_file);
    let key: PrivateKeyDer<'static> =
        rustls_pemfile::private_key(&mut key_reader)?.ok_or_else(|| {
            std::io::Error::new(std::io::ErrorKind::InvalidData, "No private key found")
        })?;

    // Assembling TLS config
    let mut server_config = ServerConfig::builder()
        .with_no_client_auth()
        .with_single_cert(certs, key)
        .map_err(|e| std::io::Error::new(std::io::ErrorKind::InvalidData, e.to_string()))?;

    server_config.alpn_protocols = vec![b"http/1.1".to_vec()];

    Ok(server_config)
}

And that's how we handle HTTPS using this config builder:

pub async fn serve_https(config: Arc<Config>) {
    let addr = format!("{}:{}", config.server.addr, config.server.https_port);
    let listener = TcpListener::bind(&addr)
        .await
        .unwrap_or_else(|e| panic!("Ferrox failed to bind on https://{addr}: {e}"));

    // I added TLS fields in config in the meantime...
    let tls_server_config = load_tls_config(&config).expect("Failed to load TLS configuration");
    // Wrapping the TLS configuration into an acceptor.
    // The acceptor performs the TLS handshake on incoming TCP connections and converts them into encrypted TLS streams.
    let tls_acceptor = TlsAcceptor::from(Arc::new(tls_server_config));

    println!("Ferrox running on https://{addr}");

    loop {
        let (stream, _) = match listener.accept().await {
            Ok(res) => res,
            Err(e) => {
                logger::error_log(&config, "core", format!("Failed to accept: {}", e)).await;
                continue;
            }
        };

        // Cloning an Arc only bumps a reference count; each task gets its own handle to the same config
        let task_config: Arc<Config> = Arc::clone(&config);
        let acceptor: TlsAcceptor = tls_acceptor.clone();
        
        let peer_ip = stream
            .peer_addr()
            .map(|a| a.ip())
            .unwrap_or(std::net::IpAddr::V4(std::net::Ipv4Addr::new(0, 0, 0, 0)));
        let local_ip = stream
            .local_addr()
            .map(|a| a.ip())
            .unwrap_or(std::net::IpAddr::V4(std::net::Ipv4Addr::new(0, 0, 0, 0)));

        tokio::spawn(async move {
            let timeout_duration = Duration::from_secs(CONNECTION_TIMEOUT_SEC);

            let task = async {
                // The actual TLS negotiation happens here. 
                // If the handshake succeeds, the returned tls_stream behaves similarly to a normal TCP stream from the application's perspective, 
                // except that all traffic is transparently encrypted and decrypted by rustls.
                match acceptor.accept(stream).await {
                    Ok(tls_stream) => {
                        if let Err(e) =
                            handle(tls_stream, task_config.clone(), peer_ip, local_ip).await
                        {
                            crate::utils::logger::error_log(
                                &task_config,
                                "core",
                                format!("Connection error: {}", e),
                            )
                            .await;
                        }
                    }
                    Err(e) => {
                        crate::utils::logger::error_log(
                            &task_config,
                            "tls",
                            format!("TLS handshake failed: {}", e),
                        )
                        .await;
                    }
                }
            };

            if tokio::time::timeout(timeout_duration, task).await.is_err() {
                logger::error_log(&task_config, "core", "Connection timed out".to_string()).await;
            }
        });
    }
}

At the top level, inside the main function, the flow is handled like this:

#[tokio::main]
async fn main() {
    // Stopped cosplaying Docker...
    let config: Config = config::Config::load("ferrox.yml").expect("Failed to load ferrox.yml");
    let shared_config: Arc<Config> = Arc::new(config);
    // We can configure whether we want TLS in the config.
    if shared_config.tls.enabled {
        println!("Starting Ferrox with TLS enabled.");
        tokio::join!(
            // HTTPS handling
            server::serve_https(Arc::clone(&shared_config)),
            // Redirector from HTTP to HTTPS
            server::serve_http_redirect(Arc::clone(&shared_config))
        );
    } else {
        println!("Starting Ferrox in plain HTTP mode (TLS disabled).");
        // HTTP handling.
        server::serve_http(Arc::clone(&shared_config)).await;
    }
}
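The redirector referenced above is not shown, so here is a rough sketch of the response such a component could send. The names `strip_port` and `https_redirect` are hypothetical, and the sketch assumes that `host` comes from the request's `Host` header and that `path` was already parsed from the request line, so it contains no line breaks.

```rust
/// Removes a trailing `:port` from a Host header value, if there is one.
fn strip_port(host: &str) -> &str {
    match host.rsplit_once(':') {
        // Only treat the suffix as a port if it consists purely of digits,
        // so an IPv6 literal such as `[::1]` is left untouched.
        Some((name, port)) if !port.is_empty() && port.bytes().all(|b| b.is_ascii_digit()) => name,
        _ => host,
    }
}

/// Builds a `301 Moved Permanently` response pointing at the HTTPS origin.
fn https_redirect(host: &str, https_port: u16, path: &str) -> String {
    let hostname = strip_port(host);

    // 443 is the default HTTPS port, so it can stay implicit in the URL.
    let location = if https_port == 443 {
        format!("https://{hostname}{path}")
    } else {
        format!("https://{hostname}:{https_port}{path}")
    };

    format!(
        "HTTP/1.1 301 Moved Permanently\r\n\
         Location: {location}\r\n\
         Content-Length: 0\r\n\
         Connection: close\r\n\
         \r\n"
    )
}
```

The plain HTTP listener then only needs to read the request head, send this response, and close the connection.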

Optimization

At this point, I started to fortify the existing base of the project rather than expand its features. This included refactoring some parts, implementing performance enhancements, and adding essentials like Keep-Alive.

We'll cover some of those in this chapter!

Zero-copy

Request parsing was the only place where I applied a zero-copy technique. The idea was to parse the request without copying its parts into newly allocated strings. This improved Ferrox's performance, since we no longer allocate a new String for every request component; the only remaining allocation is the header map itself.

/// Represents a parsed HTTP request line together with its header map.
pub struct Request<'a> {
    pub method: &'a str,
    pub path: &'a str,
    pub version: &'a str,
    headers: HashMap<&'a str, &'a str>,
}

impl<'a> Request<'a> {
    /// Parses a raw HTTP request buffer into a [`Request`] value.
    ///
    /// # Arguments
    ///
    /// * `buffer` - The raw bytes read from the client connection.
    pub fn parse(buffer: &'a [u8]) -> Result<Self> {
        let header_end = Self::find_headers_end(buffer)
            .ok_or(Error::new(ErrorKind::InvalidData, "No header terminator."))?;

        let header_bytes = &buffer[..header_end];
        let mut lines = header_bytes.split(|&b| b == b'\n');

        let first_line = lines
            .next()
            .ok_or(Error::new(ErrorKind::InvalidData, "Missing request line."))?;

        let first_line = Self::strip_cr(first_line);

        let (method, path, version) = Self::split_request_line(first_line)?;

        let mut headers: HashMap<&'a str, &'a str> = HashMap::new();

        for line in lines {
            let line = Self::strip_cr(line);

            if line.is_empty() {
                break;
            }

            let (key, value) = Self::parse_header(line)?;

            let key = std::str::from_utf8(key)
                .map_err(|_| Error::new(ErrorKind::InvalidData, "Invalid header key."))?;

            let val_str = std::str::from_utf8(value)
                .map_err(|_| Error::new(ErrorKind::InvalidData, "Invalid header value"))?;

            let val_str = val_str.trim();

            headers.insert(key, val_str);
        }

        Ok(Self {
            method: std::str::from_utf8(method)
                .map_err(|_| Error::new(ErrorKind::InvalidData, "Invalid UTF-8"))?,
            path: std::str::from_utf8(path)
                .map_err(|_| Error::new(ErrorKind::InvalidData, "Invalid UTF-8"))?,
            version: std::str::from_utf8(version)
                .map_err(|_| Error::new(ErrorKind::InvalidData, "Invalid UTF-8"))?,
            headers,
        })
    }

    /// Retrieves a header value using a case-insensitive search. Optimized for zero-copy.
    /// 
    /// # Arguments
    /// 
    /// * `search_key` - header key to retrieve from the request.
    pub fn header(&self, search_key: &str) -> Option<&'a str> {
        self.headers
            .iter()
            .find(|(k, _)| k.eq_ignore_ascii_case(search_key))
            .map(|(_, v)| *v)
    }

    // The rest of the impl remains the same...
}

Instead of allocating a new String for the request method, path, version and headers, the parser stores string slices (&str) that directly reference the original request buffer.

The lifetime parameter 'a ensures that these references cannot outlive the buffer they point to. This allows the parser to avoid per-field string allocations while remaining completely memory-safe.
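The helper functions used by `parse` are omitted above, so the following is a sketch of how they might look. These are guesses at the shape of `strip_cr`, `split_request_line`, and `parse_header`, not the actual Ferrox code, and they return `Option` and `&str` for brevity. Every returned slice borrows from the input, so nothing is copied.

```rust
/// Removes one trailing `\r`, if present.
fn strip_cr(line: &[u8]) -> &[u8] {
    line.strip_suffix(b"\r").unwrap_or(line)
}

/// Splits `METHOD PATH VERSION` into three slices of the input line.
fn split_request_line(line: &[u8]) -> Option<(&str, &str, &str)> {
    let text = std::str::from_utf8(line).ok()?;
    let mut parts = text.split(' ');

    let method = parts.next()?;
    let path = parts.next()?;
    let version = parts.next()?;

    // Reject empty components and anything after the version.
    if parts.next().is_some() || method.is_empty() || path.is_empty() || version.is_empty() {
        return None;
    }

    Some((method, path, version))
}

/// Splits `Name: value` into a borrowed name and a trimmed, borrowed value.
fn parse_header(line: &[u8]) -> Option<(&str, &str)> {
    let text = std::str::from_utf8(line).ok()?;
    let (name, value) = text.split_once(':')?;
    Some((name, value.trim()))
}
```

One way to convince yourself that no copy happens is to compare pointers: the `method` slice returned for a buffer starting with `GET` points at the very first byte of that buffer.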

Keep-Alive

In network programming, one of the most resource-intensive operations is establishing a new connection. While opening a connection to transfer a single HTML file may not seem expensive, modern websites often require dozens of images, scripts, stylesheets, and other assets. Without connection reuse, a new TCP connection would have to be established for many of these requests, introducing unnecessary overhead.

That's where Keep-Alive becomes useful. Instead of closing the connection immediately after sending a response, the server keeps it open for a short period of time. During that time, the client can send additional HTTP requests through the same connection, avoiding the cost of repeatedly creating and closing TCP connections.

As a result, a single connection can be used to transfer multiple resources rather than just one. This reduces latency and improves overall performance.

This does not necessarily mean that every file required by the website will be transferred over a single connection. Browsers typically maintain multiple connections simultaneously to improve loading performance, allowing several resources to be downloaded in parallel.
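Whether a connection stays open is decided per request. As a minimal sketch (the function name `should_keep_alive` is hypothetical), the rule can be written as a pure function: an explicit `Connection` header wins, otherwise HTTP/1.1 defaults to keep-alive and older versions default to close.

```rust
/// Decides whether the connection stays open after the current response.
fn should_keep_alive(version: &str, connection_header: Option<&str>) -> bool {
    // The header is a comma-separated list of tokens, compared case-insensitively.
    let has_token = |wanted: &str| {
        connection_header
            .map(|value| value.split(',').any(|t| t.trim().eq_ignore_ascii_case(wanted)))
            .unwrap_or(false)
    };

    if has_token("close") {
        false
    } else if has_token("keep-alive") {
        true
    } else {
        // No explicit wish: fall back to the protocol default.
        version == "HTTP/1.1"
    }
}
```

Keeping this decision in a small function without I/O makes it easy to test separately from the connection loop.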

That being said, let's take a look at how I implemented it in my case:

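/// Reads HTTP requests from the stream in a loop and sends a response to each one,
/// keeping the connection open until the client asks to close it or goes idle.
///
/// # Arguments
///
/// * `stream` - The plain or TLS stream connected to the client.
/// * `config` - Shared server configuration used for parsing, file serving, and logging.
/// * `peer_ip` - IP address of the client, used for access logging.
/// * `local_ip` - Local IP address the connection arrived on, used for access logging.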
async fn handle<S>(
    mut stream: S,
    config: Arc<Config>,
    peer_ip: IpAddr,
    local_ip: IpAddr,
) -> std::io::Result<()>
where
    S: tokio::io::AsyncRead + tokio::io::AsyncWrite + Unpin,
{
    // Bytes that were read past the end of the previous request head; they are carried over to the next iteration.
    let mut leftover_buffer: Vec<u8> = Vec::new();

    loop {
        let previous_leftovers = std::mem::take(&mut leftover_buffer);

        // Timeout we use in Keep-Alive
        let read_result = tokio::time::timeout(
            Duration::from_secs(CONNECTION_TIMEOUT_SEC),
            read_request_head(&mut stream, MAX_HEADER_SIZE, previous_leftovers),
        )
        .await;

        let (request_head, leftover_body) = match read_result {
            Ok(Ok(head)) => head,
            Ok(Err(e)) => return Err(e),
            Err(_) => break,
        };

        leftover_buffer = leftover_body;

        // Close if request is empty.
        if request_head.is_empty() {
            break;
        }

        let request = match Request::parse(&request_head) {
            Ok(r) => r,
            Err(e) => {
                logger::error_log(
                    &config,
                    "parser",
                    format!("Failed to parse http request: {}", e),
                )
                .await;

                let error_res = Response::error("400", "Bad Request");
                let _ = error_res
                    .write_headers(&mut stream, &config, "close")
                    .await?;
                if let Body::Bytes(b) = error_res.body {
                    let _ = stream.write_all(&b).await;
                }

                break;
            }
        };

        // Determining connection type.
        let connection_type: &str = match request.header("Connection") {
            Some(t) => t, // Use the one requested by client.
            None => { // Fallback to default connection types if not specified.
                // For HTTP/1.1 the default type is keep-alive. For HTTP/1.0 it's close.
                if request.version == "HTTP/1.1" {
                    "keep-alive"
                } else {
                    "close"
                }
            }
        };

        let decoded_path = match decode(&request.path) {
            Ok(p) => p.into_owned(),
            Err(_) => {
                let error_res = Response::error("400", "Bad Request");
                let _ = error_res
                    .write_headers(&mut stream, &config, "close")
                    .await?;
                if let Body::Bytes(b) = error_res.body {
                    let _ = stream.write_all(&b).await;
                }

                break;
            }
        };

        let mut response: Response = match serve_file(
            &decoded_path,
            &config.paths.serve_dir,
            &config.server.router,
        )
        .await
        {
            Ok(r) => r,
            Err(e) => {
                logger::error_log(
                    &config,
                    "file",
                    format!("Failed to serve static file: {}", e),
                )
                .await;
                Response::error("500", "Internal Server Error")
            }
        };

        response
            .write_headers(&mut stream, &config, connection_type)
            .await?;

        match &mut response.body {
            Body::Bytes(bytes) => {
                stream.write_all(bytes).await?;
            }
            Body::File(file) => {
                tokio::io::copy(file, &mut stream).await?;
            }
        }

        logger::access(&config, &request, &response, peer_ip, local_ip).await;

        // Break the loop if connection type is close.
        if connection_type.eq_ignore_ascii_case("close") {
            break;
        }
    }

    Ok(())
}

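/// Reads the request head and returns it together with any bytes received after it.
///
/// # Arguments
///
/// * `stream` - The TCP stream connected to the client.
/// * `max_header_size` - Maximum allowed size of the request head, in bytes.
/// * `previous_leftover_body` - Bytes left over from the previous read on this connection.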
async fn read_request_head<S>(
    stream: &mut S,
    max_header_size: u64,
    previous_leftover_body: Vec<u8>,
) -> std::io::Result<(Vec<u8>, Vec<u8>)>
where
    S: tokio::io::AsyncRead + Unpin,
{
    let mut full_data: Vec<u8> = previous_leftover_body;

    // TCP is a stream protocol, so the request head is not guaranteed to arrive in a
    // single read. It may be split across several packets, so the server accumulates
    // data in a buffer and checks after each read whether the terminator has arrived.
    //
    // The leftovers from the previous request may already hold a complete head,
    // so check them before touching the socket.
    if let Some(pos) = full_data
        .windows(4)
        .position(|window| window == b"\r\n\r\n")
    {
        let header_end_index = pos + 4;
        let leftover_body = full_data.split_off(header_end_index);

        return Ok((full_data, leftover_body));
    }

    let mut temp_buffer: [u8; 1024] = [0u8; 1024];
    let mut search_start: usize = full_data.len();

    loop {
        let bytes_read: usize = stream.read(&mut temp_buffer).await?;
        let check_start: usize = search_start.saturating_sub(3);

        if bytes_read == 0 {
            if full_data.is_empty() {
                return Ok((vec![], vec![]));
            } else {
                return Err(Error::new(
                    ErrorKind::UnexpectedEof,
                    "Client disconnected before sending full request headers.",
                ));
            }
        }

        full_data.extend_from_slice(&temp_buffer[..bytes_read]);

        if let Some(pos) = full_data[check_start..]
            .windows(4)
            .position(|window| window == b"\r\n\r\n")
        {
            let header_end_index = check_start + pos + 4;
            let leftover_body = full_data.split_off(header_end_index);

            return Ok((full_data, leftover_body));
        }

        if full_data.len() > max_header_size as usize {
            return Err(Error::new(
                ErrorKind::ArgumentListTooLong,
                "Max header size reached.",
            ));
        }

        search_start = full_data.len();
    }
}
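
The loop above does not rescan the whole buffer after every read. It steps back only three bytes from where the previous search ended, which is enough to catch a terminator split across two reads. Here is a minimal synchronous sketch of that idea. The helper name push_and_find_head_end is hypothetical and not part of Ferrox; it only isolates the search step.

```rust
/// Hypothetical helper (not part of Ferrox): appends `chunk` to `buf` and
/// looks for the end of the request head.
/// Returns the index just past "\r\n\r\n" if the terminator is present.
fn push_and_find_head_end(buf: &mut Vec<u8>, chunk: &[u8]) -> Option<usize> {
    // The terminator is four bytes long, so at most three of its bytes can
    // already be sitting at the end of `buf`. Stepping back three bytes is
    // enough to catch a terminator split across two reads.
    let check_start = buf.len().saturating_sub(3);
    buf.extend_from_slice(chunk);

    // Scan only the new bytes plus the three-byte overlap.
    buf[check_start..]
        .windows(4)
        .position(|w| w == b"\r\n\r\n")
        .map(|pos| check_start + pos + 4)
}
```

Feeding it "GET / HTTP/1.1\r\nHost: a\r\n" and then "\r\nrest" returns None for the first chunk and the end of the head for the second, even though the terminator was cut in half between the two reads.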

An additional detail is that a single TCP read may contain more than just the request head. The client may already have started sending the request body, or even the next request in a Keep-Alive connection.

For that reason, any bytes received after \r\n\r\n are preserved in leftover_body and passed back to the caller. During the next iteration of the Keep-Alive loop, those bytes are processed before reading additional data from the socket.

This prevents data from being accidentally discarded and allows multiple requests to be processed correctly over the same connection.
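
To make this concrete, here is a small sketch of what happens when a client sends several requests back to back and they all land in one buffer. It is a simplified, synchronous illustration rather than Ferrox's actual code: the function name drain_heads is an assumption, and it works on an in-memory buffer instead of a socket.

```rust
/// Hypothetical helper (not part of Ferrox): pulls every complete request
/// head out of `data`, feeding the leftover back in each time, the way the
/// keep-alive loop reuses `leftover_body` on its next iteration.
/// Returns the complete heads and whatever incomplete bytes remain.
fn drain_heads(mut data: Vec<u8>) -> (Vec<Vec<u8>>, Vec<u8>) {
    let mut heads = Vec::new();

    while let Some(pos) = data.windows(4).position(|w| w == b"\r\n\r\n") {
        // Everything after the terminator belongs to the next request.
        let leftover = data.split_off(pos + 4);
        heads.push(data);
        data = leftover;
    }

    // No terminator left: these bytes are the start of a request whose
    // remainder has not arrived yet.
    (heads, data)
}
```

Given two complete requests followed by a partial third one, drain_heads returns two heads and keeps the partial bytes as leftover, so nothing is thrown away while waiting for the rest to arrive.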

Conclusion

As you've seen throughout this article, Ferrox evolved from a simple TCP listener that could barely determine a file's Content-Type into a small but functional HTTP server capable of serving static websites.

While there is still plenty of work ahead, this project has already covered many of the fundamental concepts behind network programming and HTTP server development in Rust.

Although Ferrox benefits from Rust's memory safety guarantees, it is not intended to replace mature, production-ready, and fully HTTP-compliant solutions such as NGINX or LiteSpeed.

This article only covered some of the features and improvements implemented so far. Development is ongoing, and I plan to continue expanding the project with new functionality and optimizations.

If you found this article useful, consider starring the repository on GitHub, and stay tuned for future updates and new features!

GitHub: https://github.com/Res-NeoTech/ferrox

Sgr A*