How to Create a Dockerfile: A Complete Guide

A Dockerfile is a plain-text script that tells Docker exactly how to build a container image. Each instruction describes one step — from choosing a base operating system to copying your application files and defining how the container should start. Once you understand the structure, writing one becomes straightforward. But the right Dockerfile looks very different depending on what you're building and how you're deploying it.

What Is a Dockerfile, Exactly?

When Docker builds an image, it follows the Dockerfile step by step, creating a layered filesystem. Each instruction adds a new layer on top of the previous one. These layers are cached, which means Docker only rebuilds the layers that changed — making repeated builds significantly faster.

A container image built from a Dockerfile is portable and reproducible. The same image runs identically on a developer's laptop, a CI/CD pipeline, or a production server.

The Core Instructions You Need to Know

Here are the most commonly used Dockerfile instructions and what they do:

Instruction    Purpose
FROM           Sets the base image (e.g., Ubuntu, Alpine, Node)
WORKDIR        Sets the working directory inside the container
COPY           Copies files from your local machine into the image
RUN            Executes a command during the build process
ENV            Sets environment variables
EXPOSE         Documents which port the container listens on
CMD            Defines the default command when the container starts
ENTRYPOINT     Sets a fixed executable that always runs
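
The distinction between CMD and ENTRYPOINT trips up many newcomers. A minimal sketch (the image name and strings are illustrative): ENTRYPOINT fixes the executable, while CMD supplies default arguments that can be overridden on the command line.

```dockerfile
FROM alpine:latest

# ENTRYPOINT always runs; CMD provides default arguments to it.
ENTRYPOINT ["echo"]
CMD ["hello from the container"]

# docker run <image>           -> runs: echo "hello from the container"
# docker run <image> goodbye   -> runs: echo goodbye
```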

Writing Your First Dockerfile 🐳

Here's a simple, real example for a Node.js web application:

FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
EXPOSE 3000
CMD ["node", "server.js"]

Breaking it down:

  • FROM node:20-alpine — Uses an official Node.js image built on Alpine Linux, which is lightweight and commonly used for production containers.
  • WORKDIR /app — All subsequent commands run inside /app inside the container.
  • COPY package*.json ./ followed by RUN npm install — This is intentional. Copying dependency files before copying your application code means Docker can cache the npm install layer. If your source code changes but package.json doesn't, Docker skips reinstalling dependencies.
  • COPY . . — Copies the rest of your application code.
  • EXPOSE 3000 — Signals to other developers and orchestration tools that this container uses port 3000. It doesn't actually publish the port; that happens at runtime.
  • CMD — Specifies the command that runs when the container starts.
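
One companion file worth adding alongside this Dockerfile: a .dockerignore keeps local artifacts out of the build context, so COPY . . doesn't drag them into the image. A typical sketch (entries are illustrative — adjust for your project):

```
# .dockerignore — paths excluded from the build context
node_modules
npm-debug.log
.git
.env
```

Excluding node_modules matters twice over: it shrinks the context Docker uploads before the build, and it prevents host-installed dependencies from overwriting the ones npm install produced inside the image.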

The Build and Run Commands

Once your Dockerfile is written, build an image from it with:

docker build -t my-app:latest . 

The -t flag tags the image with a name. The . sets the build context to the current directory, which is also where Docker looks for the Dockerfile by default (a different path can be given with the -f flag).

To run a container from that image:

docker run -p 3000:3000 my-app:latest 

This maps port 3000 on your host machine to port 3000 inside the container.

Key Variables That Shape How You Write a Dockerfile

There's no universal "best" Dockerfile structure — several factors determine what yours should look like.

Base Image Choice

The FROM instruction is one of your most consequential decisions. Options range from full OS images (like ubuntu:22.04) to language-specific images (like python:3.12) to minimal distroless images designed purely for running compiled binaries. Larger base images offer more built-in tools; smaller ones reduce attack surface and image size. Your choice depends on what your application needs at runtime.
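
As a rough illustration of that spectrum, here are three FROM lines side by side (shown together only for comparison — a real Dockerfile picks one):

```dockerfile
FROM ubuntu:22.04              # full OS userland: package manager, shell, debug tools
FROM python:3.12-slim          # trimmed Debian plus the Python runtime
FROM gcr.io/distroless/static  # no shell or package manager; for compiled binaries only
```

The further down that list you go, the smaller the image and attack surface — but also the fewer tools you have when something needs debugging inside the container.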

Single-Stage vs. Multi-Stage Builds

For compiled languages like Go, Rust, or Java, a multi-stage build keeps the final image lean. You use one stage to compile the code and a second, minimal stage to run it:

FROM golang:1.22 AS builder
WORKDIR /app
COPY . .
RUN go build -o myapp

FROM alpine:latest
COPY --from=builder /app/myapp .
CMD ["./myapp"]

The final image contains only the compiled binary — not the Go toolchain, source code, or build dependencies. For interpreted languages like Python or JavaScript, single-stage builds are often sufficient, though the tradeoffs shift depending on dependency size and security requirements.

Layer Ordering and Cache Efficiency

The order of instructions in your Dockerfile directly affects build speed. Instructions that change rarely (installing system dependencies, copying configuration) should appear early. Instructions that change frequently (copying application source code) should appear late. This maximizes the usefulness of Docker's layer cache.
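
That ordering principle can be sketched for a hypothetical Python service (package names and paths are illustrative):

```dockerfile
FROM python:3.12-slim
WORKDIR /app

# Rarely changes: system packages first, so this layer stays cached
RUN apt-get update && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*

# Changes occasionally: dependency manifest, then install
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Changes constantly: application source goes last
COPY . .
CMD ["python", "main.py"]
```

Editing a source file here invalidates only the final COPY layer; the apt and pip layers are replayed from cache on every rebuild.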

Environment-Specific Configuration

Hardcoding secrets or environment-specific values into a Dockerfile is a common mistake. Sensitive values — API keys, database URLs, credentials — should be passed in at runtime using environment variables or secrets management tools, not baked into the image. The ENV instruction is appropriate for non-sensitive defaults; anything confidential belongs outside the image entirely. 🔒
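
A sketch of that split in practice — a non-sensitive default baked in with ENV, secrets injected only at runtime (variable names are illustrative):

```dockerfile
# Safe to bake in: a non-sensitive default, overridable at runtime
ENV LOG_LEVEL=info

# Never put credentials in ENV. Supply them when the container starts:
#   docker run -e DATABASE_URL=... my-app:latest
# or through an orchestrator's secrets mechanism.
```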

Common Patterns Across Different Setups

Development environments often prioritize convenience — larger base images, volume mounts that reflect code changes in real time without rebuilding, and development tools included in the image.

Production images prioritize security and size — minimal base images, no unnecessary tools, non-root users (USER instruction), and deterministic dependency versions.
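
A hardening sketch showing the non-root pattern on Alpine (user and group names are illustrative):

```dockerfile
FROM node:20-alpine

# Create an unprivileged system user and group
RUN addgroup -S app && adduser -S app -G app

WORKDIR /app
COPY --chown=app:app . .

# Everything from here on runs without root privileges
USER app
CMD ["node", "server.js"]
```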

Monorepos or microservices often use a COPY strategy that carefully selects only the files relevant to a specific service, keeping build context small.
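
A minimal sketch of that selective-COPY approach, assuming a hypothetical monorepo layout with a services/api directory:

```dockerfile
FROM node:20-alpine
WORKDIR /app

# Copy only this service's dependency manifest and source,
# not the whole repository
COPY services/api/package*.json ./
RUN npm install
COPY services/api/ ./

CMD ["node", "server.js"]
```

Pairing this with a .dockerignore that excludes unrelated services keeps the build context — and the rebuild-triggering surface — small.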

Stateful applications (databases, file processors) require careful thought about what belongs inside the container versus what should live in an external volume or managed service.

The right structure for a Python data pipeline looks very different from the right structure for a containerized static site server. Your stack, deployment target, team workflow, and security requirements all interact — and those specifics are what ultimately determine which patterns and base images make the most sense for your situation.