Startup scripts in SQL Server containers

I was doing some investigative work on a pod running SQL Server 2025 in Kubernetes the other day and noticed something…the sqlservr process is no longer PID 1 in its container.

Instead there is: –

Hmm, ok we have a script /opt/mssql/bin/launch_sqlservr.sh and then the sqlservr binary is called.

I swear this wasn’t always the case, but had I actually seen that before? I started to doubt myself, so I spun up a pod running an older version of SQL Server (2019 CU5) and took a look: –

Ahh OK, there has been a change. The two processes in the older image are expected: one is essentially a watcher process and the other is SQL Server itself (full details here: –
https://techcommunity.microsoft.com/blog/sqlserver/sql-server-on-linux-why-do-i-have-two-sql-server-processes/3204412)

I went and had a look at a 2022 image and that script is there as well…so at some point the container was changed to execute that script before the sqlservr binary (not sure exactly when, and I’m not going back to check all the different images 🙂 )

Right, but what is that script doing?

Now this is a bit of a rabbit hole, but from what I can work out that script calls three other scripts: –

/opt/mssql/bin/permissions_check.sh
Checks the location and ownership of the master database.

/opt/mssql/bin/init_custom_setup.sh
Determines whether one-time SQL Server initialization should run on first startup.

/opt/mssql/bin/run_custom_setup.sh
If initialisation is enabled, waits for SQL Server to be ready, then uses the environment variables and the setup-scripts directory to perform a custom setup.

Oooooh, OK…custom setup available? Let’s have a look at that.

Essentially it comes down to whether or not SQL Server is spinning up for the first time (so we haven’t persisted data from one container to another) and whether certain environment variables are set…these are: –

MSSQL_DB – used to create a database
MSSQL_USER – login/user for that database
MSSQL_PASSWORD – password for that login
MSSQL_SETUP_SCRIPTS_LOCATION – location for custom scripts
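
Putting that together, the first-start gate presumably looks something like this (a hypothetical sketch of the logic, not the actual Microsoft script…the function name and the master.mdf check are my assumptions for detecting a first start): –

```shell
#!/bin/bash
# Hypothetical sketch of the first-start gate: custom setup should only
# run if no master database has been persisted yet AND at least one of
# the custom-setup environment variables is set.

MASTER_MDF="${MASTER_MDF:-/var/opt/mssql/data/master.mdf}"

should_run_custom_setup() {
  [ -f "$MASTER_MDF" ] && return 1   # data persisted from a previous run
  # illustrative subset of the variables that would trigger setup
  [ -n "$MSSQL_DB" ] && return 0
  [ -n "$MSSQL_USER" ] && return 0
  [ -n "$MSSQL_SETUP_SCRIPTS_LOCATION" ] && return 0
  return 1
}

if should_run_custom_setup; then
  echo "custom setup will run"
else
  echo "custom setup skipped"
fi
```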

Nice…so let’s have a go at using those!

Here’s a SQL Server 2025 Kubernetes manifest using the first three: –

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mssql-statefulset-test
spec:
  serviceName: "mssql"
  replicas: 1
  podManagementPolicy: Parallel
  selector:
    matchLabels:
      name: mssql-pod
  template:
    metadata:
      labels:
        name: mssql-pod
    spec:
      securityContext:
        fsGroup: 10001
      containers:
        - name: mssql-container-test
          image: mcr.microsoft.com/mssql/server:2025-RTM-ubuntu-22.04
          ports:
            - containerPort: 1433
              name: mssql-port
          env:
            - name: ACCEPT_EULA
              value: "Y"
            - name: MSSQL_SA_PASSWORD
              value: "Testing1122"
            - name: MSSQL_DB
              value: "testdatabase"
            - name: MSSQL_USER
              value: "testuser"
            - name: MSSQL_PASSWORD
              value: "Testing112233"

Then if we look at the logs for SQL in that pod (I’ve stripped out the normal startup messages): –

Creating database testdatabase
2026-01-23 10:56:38.48 spid51      [DBMgr::FindFreeDatabaseID] Next available DbId EX locked: 5
2026-01-23 10:56:38.56 spid51      Starting up database 'testdatabase'.
2026-01-23 10:56:38.59 spid51      Parallel redo is started for database 'testdatabase' with worker pool size [2].
2026-01-23 10:56:38.60 spid51      Parallel redo is shutdown for database 'testdatabase' with worker pool size [2].
Creating login testuser with password defined in MSSQL_PASSWORD environment variable
Changed database context to 'testdatabase'.

There it is creating the database! Cool!
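
As an aside, these variables aren’t Kubernetes-specific…the same thing can be tried with plain Docker against the same image (a sketch, values are just examples): –

```shell
# Run SQL Server 2025 with the custom-setup variables set - on first
# start it should create the database and login for us
docker run -d --name mssql-test \
  -e ACCEPT_EULA=Y \
  -e MSSQL_SA_PASSWORD='Testing1122' \
  -e MSSQL_DB=testdatabase \
  -e MSSQL_USER=testuser \
  -e MSSQL_PASSWORD='Testing112233' \
  -p 1433:1433 \
  mcr.microsoft.com/mssql/server:2025-RTM-ubuntu-22.04
```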

But what about the last environment variable, the custom scripts location?

From the startup scripts, this has a default value of /mssql-server-setup-scripts.d so let’s drop a script in there and see what happens.

To do this I created a simple T-SQL script to create a test database: –

CREATE DATABASE testdatabase2;

And then created a configmap in Kubernetes referencing that script: –

kubectl create configmap mssql-setup-scripts --from-file=./create-database.sql

Now we can reference that in our SQL manifest: –

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mssql-statefulset-test
spec:
  serviceName: "mssql"
  replicas: 1
  podManagementPolicy: Parallel
  selector:
    matchLabels:
      name: mssql-pod
  template:
    metadata:
      labels:
        name: mssql-pod
    spec:
      securityContext:
        fsGroup: 10001
      containers:
        - name: mssql-container-test
          image: mcr.microsoft.com/mssql/server:2025-RTM-ubuntu-22.04
          ports:
            - containerPort: 1433
              name: mssql-port
          env:
            - name: ACCEPT_EULA
              value: "Y"
            - name: MSSQL_SA_PASSWORD
              value: "Testing1122"
          volumeMounts:
            - name: setup-scripts
              mountPath: /mssql-server-setup-scripts.d
              readOnly: true
      volumes:
        - name: setup-scripts
          configMap:
            name: mssql-setup-scripts

And now we have these entries in the SQL startup log: –

Executing custom setup script /mssql-server-setup-scripts.d/create-database.sql
2026-01-23 11:08:52.08 spid60      Starting up database 'testdatabase2'.

Ha, and there’s our script being executed and the database created!
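
To double-check outside of the logs, we can query the instance directly (a sketch…assuming the pod name follows the usual <statefulset-name>-0 convention and that sqlcmd lives under /opt/mssql-tools18/bin in the image): –

```shell
# List the databases in the pod's SQL instance
kubectl exec -it mssql-statefulset-test-0 -- \
  /opt/mssql-tools18/bin/sqlcmd -S localhost -U sa -P 'Testing1122' -C \
  -Q "SELECT name FROM sys.databases;"
```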

I had a look around and couldn’t see this documented anywhere (it may be somewhere though) but hey, another way of customising SQL Server in a container.

In reality I’d probably be using a custom image for SQL Server, but this was fun to dive into 🙂

Thanks for reading!

Performance tuning KubeVirt for SQL Server

Following on from my last post about Getting Started With KubeVirt & SQL Server, in this post I want to see if I can improve the performance from the initial test I ran.

In the previous test, I used SQL Server 2025 RC1…so I wanted to change that to RTM (now that it’s been released), but I was getting some strange issues running it in the StatefulSet. SQL Server 2022 had no such issues, and as much as I want to investigate what’s going on with 2025 (pretty sure it’s host based, not an issue with SQL 2025)…I want to dive into KubeVirt more…so let’s go with 2022 in both KubeVirt and the StatefulSet.

I also separated out the system databases, user database data, and user database log files onto separate volumes…here’s what the StatefulSet manifest looks like: –

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mssql-statefulset
spec:
  serviceName: "mssql"
  replicas: 1
  podManagementPolicy: Parallel
  selector:
    matchLabels:
      name: mssql-pod
  template:
    metadata:
      labels:
        name: mssql-pod
      annotations:
        stork.libopenstorage.org/disableHyperconvergence: "true"
    spec:
      securityContext:
        fsGroup: 10001
      containers:
        - name: mssql-container
          image: mcr.microsoft.com/mssql/rhel/server:2022-CU22-rhel-9.1
          ports:
            - containerPort: 1433
              name: mssql-port
          env:
            - name: MSSQL_PID
              value: "Developer"
            - name: ACCEPT_EULA
              value: "Y"
            - name: MSSQL_AGENT_ENABLED
              value: "1"
            - name: MSSQL_SA_PASSWORD
              value: "Testing1122"
            - name: MSSQL_DATA_DIR
              value: "/opt/sqlserver/data"
            - name: MSSQL_LOG_DIR
              value: "/opt/sqlserver/log"
          resources:
            requests:
              memory: "8192Mi"
              cpu: "4000m"
            limits:
              memory: "8192Mi"
              cpu: "4000m"
          volumeMounts:
            - name: sqlsystem
              mountPath: /var/opt/mssql/
            - name: sqldata
              mountPath: /opt/sqlserver/data/
            - name: sqllog
              mountPath: /opt/sqlserver/log/
  volumeClaimTemplates:
    - metadata:
        name: sqlsystem
      spec:
        accessModes:
         - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
        storageClassName: px-fa-direct-access
    - metadata:
        name: sqldata
      spec:
        accessModes:
         - ReadWriteOnce
        resources:
          requests:
            storage: 50Gi
        storageClassName: px-fa-direct-access
    - metadata:
        name: sqllog
      spec:
        accessModes:
         - ReadWriteOnce
        resources:
          requests:
            storage: 25Gi
        storageClassName: px-fa-direct-access
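
A quick way to confirm that the separated file locations took effect is to check the instance default paths, which should report the MSSQL_DATA_DIR and MSSQL_LOG_DIR values (a sketch…again assuming sqlcmd under /opt/mssql-tools18/bin in the image): –

```shell
# Check the default data and log paths inside the pod
kubectl exec -it mssql-statefulset-0 -- \
  /opt/mssql-tools18/bin/sqlcmd -S localhost -U sa -P 'Testing1122' -C \
  -Q "SELECT SERVERPROPERTY('InstanceDefaultDataPath') AS data_path, SERVERPROPERTY('InstanceDefaultLogPath') AS log_path;"
```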

And here’s what the KubeVirt VM manifest looks like: –

apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: win2025
spec:
  runStrategy: Manual # VM will not start automatically
  template:
    metadata:
      labels:
        app: sqlserver
    spec:
      domain:
        firmware:
          bootloader:
            efi:
              secureBoot: false
        resources: # requesting same limits and requests for guaranteed QoS
          requests:
            memory: "8Gi"
            cpu: "4"
          limits:
            memory: "8Gi"
            cpu: "4"
        devices:
          disks:
            # Disk 1: OS
            - name: osdisk
              disk:
                bus: scsi
            # Disk 2: SQL System
            - name: sqlsystem
              disk:
                bus: scsi
            # Disk 3: SQL Data
            - name: sqldata
              disk:
                bus: scsi
            # Disk 4: SQL Log
            - name: sqllog
              disk:
                bus: scsi
            # Windows installer ISO
            - name: cdrom-win2025
              cdrom:
                bus: sata
                readonly: true
            # VirtIO drivers ISO
            - name: virtio-drivers
              cdrom:
                bus: sata
                readonly: true
            # SQL Server installer ISO
            - name: sql2022-iso
              cdrom:
                bus: sata
                readonly: true
          interfaces:
            - name: default
              model: virtio
              bridge: {}
              ports:
                - port: 3389 # port for RDP
                - port: 1433 # port for SQL Server      
      networks:
        - name: default
          pod: {}
      volumes:
        - name: osdisk
          persistentVolumeClaim:
            claimName: winos
        - name: sqlsystem
          persistentVolumeClaim:
            claimName: sqlsystem
        - name: sqldata
          persistentVolumeClaim:
            claimName: sqldata
        - name: sqllog
          persistentVolumeClaim:
            claimName: sqllog
        - name: cdrom-win2025
          persistentVolumeClaim:
            claimName: win2025-pvc
        - name: virtio-drivers
          containerDisk:
            image: kubevirt/virtio-container-disk
        - name: sql2022-iso
          persistentVolumeClaim:
            claimName: sql2022-pvc

I then ran the HammerDB test again…running for 10 minutes with a 2 minute ramp up time. Here are the results: –

# Statefulset result
TEST RESULT : System achieved 46594 NOPM from 108126 SQL Server TPM

# KubeVirt result
TEST RESULT : System achieved 18029 NOPM from 41620 SQL Server TPM

Oooooook…that has made a difference! KubeVirt TPM is now up to 38% of the StatefulSet TPM. But I’m still seeing high privileged CPU time in the KubeVirt VM: –

So I went through the docs and found that there are a whole bunch of options for VM configuration…the first one I tried was the Hyper-V feature. This should allow Windows to use paravirtualized interfaces instead of emulated hardware, reducing VM exit overhead and improving interrupt, timer, and CPU coordination performance.

Here’s what I added to the VM manifest: –

        features:
          hyperv: {} # turns on Hyper-V feature so the guest “thinks” it’s running under Hyper-V - needs the Hyper-V clock timer too, otherwise VM pod will not start
        clock:
          timer:
            hyperv: {} 

N.B. – for more information on what’s happening here, check out this link: –
https://www.qemu.org/docs/master/system/i386/hyperv.html
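
To pick up the new configuration, the VM needs a full stop/start (so the virtual machine instance and its virt-launcher pod get recreated): –

```shell
virtctl stop win2025
virtctl start win2025
```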

Stopped/started the VM and then ran the test again. Here are the results: –

TEST RESULT : System achieved 40591 NOPM from 94406 SQL Server TPM

Wait, what!? That made a huge difference…it’s now 87% of the StatefulSet result! AND the privileged CPU time has come down: –

But let’s not stop there…let’s keep going and see if we can get TPM parity between KubeVirt and SQL in a StatefulSet.

There’s a bunch more flags that can be set for the Hyper-V feature and the overall VM, so let’s set some of those: –

        features:
          acpi: {} # ACPI support (power management, shutdown, reboot, device enumeration)
          apic: {} # Advanced Programmable Interrupt Controller (modern interrupt handling for Windows/SQL)
          hyperv: # turns on the Hyper-V vendor feature block so the guest “thinks” it’s running under Hyper-V - needs the Hyper-V clock timer too, otherwise the VM pod will not start
            reenlightenment: {} # Allows guest to update its TSC frequency after migrations or time adjustments
            ipi: {} # Hyper-V IPI acceleration - faster inter-processor interrupts between vCPUs
            synic: {} # Hyper-V Synthetic Interrupt Controller - improves interrupt delivery
            synictimer: {} # Hyper-V synthetic timer - stable high-resolution guest time source
            spinlocks:
              spinlocks: 8191 # Prevents Windows spinlock stalls on SMP systems - avoids boot/timeouts under load
            reset: {} # Hyper-V reset infrastructure - cleaner VM resets
            relaxed: {} # Relaxed timing - reduces overhead when timing deviations occur under virtualization
            vpindex: {} # Per-vCPU indexing - improves Windows scheduler awareness of vCPU layout
            runtime: {} # Hyper-V runtime page support - gives guest better insight into hypervisor behavior
            tlbflush: {} # Hyper-V accelerated TLB flush - improves scalability on multi-vCPU workloads
            frequencies: {} # Exposes host CPU frequency data - allows proper scaling & guest timing
            vapic: {} # Virtual APIC support - reduces interrupt latency and overhead
        clock:
          timer:
            hyperv: {} # Hyper-V clock/timer - stable time source, recommended when using Hyper-V enlightenments

Memory and CPU wise…I went and added: –

        ioThreadsPolicy: auto # Automatically allocate IO threads for QEMU to reduce disk I/O contention
        cpu:
          cores: 4
          dedicatedCpuPlacement: true # Guarantees pinned physical CPUs for this VM to improve latency & stability
          isolateEmulatorThread: true # Pins QEMU’s emulator thread to a dedicated pCPU instead of sharing with vCPUs
          model: host-passthrough # Exposes all host CPU features directly to the VM
          numa:
            guestMappingPassthrough: {} # Mirrors host NUMA topology to the guest to reduce cross-node latency
        memory:
          hugepages:
            pageSize: 1Gi # Uses 1Gi hugepages for reduced TLB pressure

N.B. – this required configuring the host to reserve hugepages at boot
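
For reference, here’s roughly what that host configuration looks like on a GRUB-based system (a sketch…the page count needs to cover the VM’s memory, so at least 8 x 1Gi pages here): –

```shell
# /etc/default/grub - append to the kernel command line:
#   GRUB_CMDLINE_LINUX="... default_hugepagesz=1G hugepagesz=1G hugepages=10"

# Rebuild the grub config and reboot (Ubuntu shown):
sudo update-grub
sudo reboot

# After the reboot, confirm the reservation:
cat /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
```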

And then for disks…I installed the latest virtio drivers in the VM, switched the disks for the SQL system, data, and log files to use a virtio bus instead of scsi, and then added the following for each disk: –

dedicatedIOThread: true

Other device settings added were: –

autoattachGraphicsDevice: false # Do not attach a virtual graphics/display device (VNC/SPICE) - removes unnecessary emulation
autoattachMemBalloon: false # Disable the VirtIO memory balloon - prevents dynamic memory changes, improves consistency
autoattachSerialConsole: true # Attach a serial console for debugging and virtctl console access
networkInterfaceMultiqueue: true # Enable multi-queue virtio-net so NIC traffic can use multiple RX/TX queues

All of this results in a bit of a monster manifest file for the VM: –

apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: win2025
spec:
  runStrategy: Manual # VM will not start automatically
  template:
    metadata:
      labels:
        app: sqlserver
    spec:
      domain:
        ioThreadsPolicy: auto # Automatically allocate IO threads for QEMU to reduce disk I/O contention
        cpu:
          cores: 4
          dedicatedCpuPlacement: true # Guarantees pinned physical CPUs for this VM - improves latency & stability
          isolateEmulatorThread: true # Pins QEMU’s emulator thread to a dedicated pCPU instead of sharing with vCPUs
          model: host-passthrough # Exposes host CPU features directly to the VM - best performance (but less portable)
          numa:
            guestMappingPassthrough: {} # Mirrors host NUMA topology to the guest - reduces cross-node memory latency
        memory:
          hugepages:
            pageSize: 1Gi # Uses 1Gi hugepages for reduced TLB pressure - better performance for large-memory SQL
        firmware:
          bootloader:
            efi:
              secureBoot: false # Disable Secure Boot (often required when using custom/older virtio drivers)
        features:
          acpi: {} # ACPI support (power management, shutdown, reboot, device enumeration)
          apic: {} # Advanced Programmable Interrupt Controller (modern interrupt handling for Windows/SQL)
          hyperv: # Enable Hyper-V enlightenment features for Windows guests to improve performance & timing
            reenlightenment: {} # Allows guest to update its TSC frequency after migrations or time adjustments
            ipi: {} # Hyper-V IPI acceleration - faster inter-processor interrupts between vCPUs
            synic: {} # Hyper-V Synthetic Interrupt Controller - improves interrupt delivery
            synictimer: {} # Hyper-V synthetic timer - stable high-resolution guest time source
            spinlocks:
              spinlocks: 8191 # Prevents Windows spinlock stalls on SMP systems - avoids boot/timeouts under load
            reset: {} # Hyper-V reset infrastructure - cleaner VM resets
            relaxed: {} # Relaxed timing - reduces overhead when timing deviations occur under virtualization
            vpindex: {} # Per-vCPU indexing - improves Windows scheduler awareness of vCPU layout
            runtime: {} # Hyper-V runtime page support - gives guest better insight into hypervisor behavior
            tlbflush: {} # Hyper-V accelerated TLB flush - improves scalability on multi-vCPU workloads
            frequencies: {} # Exposes host CPU frequency data - allows proper scaling & guest timing
            vapic: {} # Virtual APIC support - reduces interrupt latency and overhead
        clock:
          timer:
            hyperv: {} # Hyper-V clock/timer - stable time source, recommended when using Hyper-V enlightenments
        resources: # requests == limits for guaranteed QoS (exclusive CPU & memory reservation)
          requests:
            memory: "8Gi"
            cpu: "4"
            hugepages-1Gi: "8Gi"
          limits:
            memory: "8Gi"
            cpu: "4"
            hugepages-1Gi: "8Gi"
        devices:
          autoattachGraphicsDevice: false # Do not attach a virtual graphics/display device (VNC/SPICE) - removes unnecessary emulation
          autoattachMemBalloon: false # Disable the VirtIO memory balloon - prevents dynamic memory changes, improves consistency
          autoattachSerialConsole: true # Attach a serial console for debugging and virtctl console access
          networkInterfaceMultiqueue: true # Enable multi-queue virtio-net so NIC traffic can use multiple RX/TX queues
          disks:
            # Disk 1: OS
            - name: osdisk
              disk:
                bus: scsi   # Keep OS disk on SCSI - simpler boot path once VirtIO storage is already in place
              cache: none
            # Disk 2: SQL System
            - name: sqlsystem
              disk:
                bus: virtio
              cache: none
              dedicatedIOThread: true # Give this disk its own IO thread - reduces contention with other disks
            # Disk 3: SQL Data
            - name: sqldata
              disk:
                bus: virtio
              cache: none
              dedicatedIOThread: true # Separate IO thread for data file I/O - improves parallelism under load
            # Disk 4: SQL Log
            - name: sqllog
              disk:
                bus: virtio
              cache: none
              dedicatedIOThread: true # Separate IO thread for log writes - helps with low-latency sequential I/O
            # Windows installer ISO
            - name: cdrom-win2025
              cdrom:
                bus: sata
                readonly: true
            # VirtIO drivers ISO
            - name: virtio-drivers
              cdrom:
                bus: sata
                readonly: true
            # SQL Server installer ISO
            - name: sql2022-iso
              cdrom:
                bus: sata
                readonly: true
          interfaces:
            - name: default
              model: virtio # High-performance paravirtualized NIC (requires NetKVM driver in the guest)
              bridge: {} # Bridge mode - VM gets an IP on the pod network (via the pod’s primary interface)
              ports:
                - port: 3389 # RDP
                - port: 1433 # SQL Server
      networks:
        - name: default
          pod: {} # Attach VM to the default Kubernetes pod network
      volumes:
        - name: osdisk
          persistentVolumeClaim:
            claimName: winos
        - name: sqlsystem
          persistentVolumeClaim:
            claimName: sqlsystem
        - name: sqldata
          persistentVolumeClaim:
            claimName: sqldata
        - name: sqllog
          persistentVolumeClaim:
            claimName: sqllog
        - name: cdrom-win2025
          persistentVolumeClaim:
            claimName: win2025-pvc
        - name: virtio-drivers
          containerDisk:
            image: kubevirt/virtio-container-disk
        - name: sql2022-iso
          persistentVolumeClaim:
            claimName: sql2022-pvc

And then I ran the tests again: –

# StatefulSet
TEST RESULT : System achieved 47200 NOPM from 109554 SQL Server TPM

# KubeVirt
TEST RESULT : System achieved 46563 NOPM from 108184 SQL Server TPM

BOOOOOOOOOM! OK, so that’s 98% of the TPM achieved in the StatefulSet. And given there’s a bit of variance in these results, they’re now pretty much the same!

OK, so it’s not the most robust performance testing ever done…and I am fully aware that testing in a lab like this is one thing, whereas actually running SQL Server in KubeVirt…even in a dev/test environment…is a completely different situation. There are still questions over stability and resiliency, BUT I hope this shows that we shouldn’t be counting KubeVirt out as a platform for SQL Server based on performance.

Thanks for reading!

Running SQL Server on KubeVirt – Getting Started

With all the changes that have happened with VMware since the Broadcom acquisition I have been asked more and more about alternatives for running SQL Server.

One of the options that has repeatedly cropped up is KubeVirt.

KubeVirt provides the ability to run virtual machines in Kubernetes…so essentially could provide an option to “lift and shift” VMs from VMware to a Kubernetes cluster.

A bit of background on KubeVirt…it’s a CNCF project accepted in 2019 and moved to “incubating” maturity level in 2022…so it’s been around a while now. KubeVirt uses custom resources and controllers in order to create, deploy, and manage VMs in Kubernetes by using libvirt and QEMU under the hood to provision those virtual machines.

I have to admit, I’m skeptical about this…we already have a way to deploy SQL Server to Kubernetes, and I don’t really see the benefits of deploying an entire VM.

But let’s run through how to get up and running with SQL Server in KubeVirt. There are a bunch of prerequisites here, so I’ll detail the setup that I’m using.

I went with a physical server for this, as I didn’t want to deal with any nested virtualisation issues (VMs within VMs), and I could only get ONE box…so I’m running a “compressed” Kubernetes cluster, aka the node is both a control plane and a worker node. I also needed a storage provider, and as I work for Pure Storage…I have access to a FlashArray which I’ll provision persistent volumes from via Portworx (the PX-CSI offering to be exact). Portworx provides a CSI driver that exposes FlashArray storage to Kubernetes for PersistentVolume provisioning.

So it’s not an ideal setup…I’ll admit…but should be good enough to get up and running to see what KubeVirt is all about.

Let’s go ahead and get started with KubeVirt.

First thing to do is actually deploy KubeVirt to the cluster…I followed the guide here: –
https://kubevirt.io/user-guide/cluster_admin/installation/

export RELEASE=$(curl https://storage.googleapis.com/kubevirt-prow/release/kubevirt/kubevirt/stable.txt) # set the latest KubeVirt release
kubectl apply -f https://github.com/kubevirt/kubevirt/releases/download/${RELEASE}/kubevirt-operator.yaml # deploy the KubeVirt operator
kubectl apply -f https://github.com/kubevirt/kubevirt/releases/download/${RELEASE}/kubevirt-cr.yaml # create the KubeVirt CR (instance deployment request) which triggers the actual installation

Let’s wait until all the components are up and running: –

kubectl -n kubevirt wait kv kubevirt --for condition=Available
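
Once that returns, we can see the pods that make up the deployment: –

```shell
kubectl get pods -n kubevirt
```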

Here’s what each of these components does: –

virt-api        - The API endpoint used by Kubernetes and virtctl to interact with VM and VMI subresources.
virt-controller - Control-plane component that reconciles VM and VMI objects, creates VMIs, and manages migrations.
virt-handler    - Node-level component responsible for running and supervising VMIs and QEMU processes on each node.
virt-operator   - Manages the installation, upgrades, and lifecycle of all KubeVirt core components.

There are two pods each for the controller and the operator, as they are Deployments with a default replica count of 2…I’m running a one node cluster so could scale those down, but I’ll leave the defaults for now.
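
If you did want to trim those down on a single node cluster, I believe the supported way is via the infra.replicas field in the KubeVirt CR rather than scaling the Deployments directly (virt-operator would just revert that) - a sketch: –

```shell
# Set the KubeVirt control-plane replica count to 1 via the KubeVirt CR
kubectl -n kubevirt patch kubevirt kubevirt --type merge \
  -p '{"spec":{"infra":{"replicas":1}}}'
```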

More information on the architecture of KubeVirt can be found here: –
https://kubevirt.io/user-guide/architecture/

And I found this blog post really useful!
https://arthurchiao.art/blog/kubevirt-create-vm/

The next tool we’ll need is the Containerized Data Importer (CDI). This is the backend component that allows us to upload ISO files to the Kubernetes cluster, which will then be mounted as persistent volumes when we deploy a VM. The guide I followed was here: –
https://github.com/kubevirt/containerized-data-importer

export VERSION=$(curl -s https://api.github.com/repos/kubevirt/containerized-data-importer/releases/latest | grep '"tag_name":' | sed -E 's/.*"([^"]+)".*/\1/')
kubectl create -f https://github.com/kubevirt/containerized-data-importer/releases/download/$VERSION/cdi-operator.yaml
kubectl create -f https://github.com/kubevirt/containerized-data-importer/releases/download/$VERSION/cdi-cr.yaml

And again, let’s check that all the components are up and running: –

kubectl get all -n cdi

Right, the NEXT tool we’ll need is virtctl…this is the CLI that allows us to deploy/configure/manage VMs in KubeVirt: –

export VERSION=$(curl https://storage.googleapis.com/kubevirt-prow/release/kubevirt/kubevirt/stable.txt)
wget https://github.com/kubevirt/kubevirt/releases/download/${VERSION}/virtctl-${VERSION}-linux-amd64
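
The download is just a single binary, so it needs to be made executable and moved into place: –

```shell
chmod +x virtctl-${VERSION}-linux-amd64
sudo install virtctl-${VERSION}-linux-amd64 /usr/local/bin/virtctl
```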

And confirm that it’s installed (add to your $PATH environment variable): –

virtctl version

Okey dokey, now we need to upload our ISO files for Windows and SQL Server to the cluster.

Note I’m referencing the storage class from my storage provider (PX-CSI) here. Also, I could not get this to work from my desktop, so I had to copy the ISO files to the Kubernetes node and run the upload from there. The value for the --uploadproxy-url flag is the IP address of the cdi-uploadproxy service: –

Uploading the Windows ISO (I went with Windows Server 2025): –

virtctl image-upload pvc win2025-pvc --size 10Gi \
--image-path=./en-us_windows_server_2025_updated_oct_2025_x64_dvd_6c0c5aa8.iso \
--uploadproxy-url=https://10.97.56.82:443 \
--storage-class px-fa-direct-access \
--insecure

And uploading the SQL Server 2025 install ISO: –

virtctl image-upload pvc sql2025-pvc --size 10Gi \
--image-path=./SQLServer2025-x64-ENU.iso \
--uploadproxy-url=https://10.97.56.82:443 \
--storage-class px-fa-direct-access \
--insecure

Let’s confirm the resulting persistent volumes: –

kubectl get persistentvolumes

Ok, so the next step is to pull down a container image so that it can be referenced in the VM yaml. This image contains the VirtIO drivers needed for Windows to detect the VM’s virtual disks and network interfaces: –

sudo ctr images pull docker.io/kubevirt/virtio-container-disk:latest
sudo ctr images ls | grep virtio

The final thing to do is create the PVCs/PVs that will be used for the OS, SQL data files, and SQL log files within the VM. The yaml is: –

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: winos
spec:
  accessModes: [ "ReadWriteOnce" ]
  resources:
    requests:
      storage: 100Gi
  storageClassName: px-fa-direct-access
  volumeMode: Block
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: sqldata
spec:
  accessModes: [ "ReadWriteOnce" ]
  resources:
    requests:
      storage: 50Gi
  storageClassName: px-fa-direct-access
  volumeMode: Block
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: sqllog
spec:
  accessModes: [ "ReadWriteOnce" ]
  resources:
    requests:
      storage: 25Gi
  storageClassName: px-fa-direct-access
  volumeMode: Block

And then create!

kubectl apply -f pvc.yaml

Right, now we can create the VM! Below is the yaml I used: –

apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: win2025
spec:
  runStrategy: Manual # VM will not start automatically
  template:
    metadata:
      labels:
        app: sqlserver
    spec:
      domain:
        firmware:
          bootloader:
            efi: # uefi boot
              secureBoot: false # disable secure boot
        resources: # requesting same limits and requests for guaranteed QoS
          requests:
            memory: "8Gi"
            cpu: "4"
          limits:
            memory: "8Gi"
            cpu: "4"
        devices:
          disks:
            # Disk 1: OS
            - name: osdisk
              disk:
                bus: scsi
            # Disk 2: SQL Data
            - name: sqldata
              disk:
                bus: scsi
            # Disk 3: SQL Log
            - name: sqllog
              disk:
                bus: scsi
            # Windows installer ISO
            - name: cdrom-win2025
              cdrom:
                bus: sata
                readonly: true
            # VirtIO drivers ISO
            - name: virtio-drivers
              cdrom:
                bus: sata
                readonly: true
            # SQL Server installer ISO
            - name: sql2025-iso
              cdrom:
                bus: sata
                readonly: true
          interfaces:
            - name: default
              model: virtio
              bridge: {}
              ports:
                - port: 3389 # port for RDP
                - port: 1433 # port for SQL Server      
      networks:
        - name: default
          pod: {}
      volumes:
        - name: osdisk
          persistentVolumeClaim:
            claimName: winos
        - name: sqldata
          persistentVolumeClaim:
            claimName: sqldata
        - name: sqllog
          persistentVolumeClaim:
            claimName: sqllog
        - name: cdrom-win2025
          persistentVolumeClaim:
            claimName: win2025-pvc
        - name: virtio-drivers
          containerDisk:
            image: kubevirt/virtio-container-disk
        - name: sql2025-iso
          persistentVolumeClaim:
            claimName: sql2025-pvc

Let’s deploy the VM: –

kubectl apply -f win2025.yaml

And let’s confirm: –

kubectl get vm

So now we’re ready to start the VM and install Windows: –

virtctl start win2025

This will start an instance of the VM we created…to monitor the startup: –

kubectl get vm
kubectl get vmi
kubectl get pods

So we have a virtual machine, an instance of that virtual machine, and a virt-launcher pod…which actually runs the virtual machine by launching the QEMU process for the instance.

Once the VM instance has been started, we can connect to it via VNC and run through the Windows installation process. I’m using TigerVNC here.

virtctl vnc win2025 --vnc-path "C:\Tools\vncviewer64-1.15.0.exe" --vnc-type=tiger

Hit any key to boot from the ISO (you’ll need to go into the boot options first) and we’re now running through a normal Windows install process!

When the option to select the drive to install Windows appears, we have to load the drivers from the ISO we mounted from the virtio-container-disk:latest container image: –

Once those are loaded, we’ll be able to see all the disks attached to the VM and continue the install process.

When the install completes, we’ll need to check the drivers in Device Manager: –

Go through and install any missing drivers (check disks and anything under “other devices”).

OK, because VNC drives me nuts…once we have Windows installed, we’ll enable remote connections within Windows and then deploy a NodePort service to the cluster to expose port 3389…which will let us RDP to the VM: –

apiVersion: v1
kind: Service
metadata:
  name: win2025-rdp
spec:
  ports:
  - port: 3389
    protocol: TCP
    targetPort: 3389
  selector:
    vm.kubevirt.io/name: win2025
  type: NodePort
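One thing to note…with type: NodePort and no nodePort set, Kubernetes assigns a random port from the NodePort range (30000-32767 by default). If a predictable port is preferred, it can be pinned in the service spec…31389 here is just an example value: –

```yaml
apiVersion: v1
kind: Service
metadata:
  name: win2025-rdp
spec:
  ports:
  - port: 3389
    protocol: TCP
    targetPort: 3389
    nodePort: 31389 # must be within the cluster's NodePort range
  selector:
    vm.kubevirt.io/name: win2025
  type: NodePort
```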

Confirm service and port: –

kubectl get services

Once we can RDP, we can continue to configure Windows (if we want to) but the main thing now is to get SQL Server 2025 installed. Don’t forget to bring the disks for the SQL Server data and log files online and format them!

The ISO file containing the SQL install media is mounted within the VM…so it’s just a normal install. Run through the install and confirm it’s successful: –

Once the installation is complete…let’s deploy another node port service to allow us to connect to SQL in the VM: –

apiVersion: v1
kind: Service
metadata:
  name: win2025-sql
spec:
  ports:
  - port: 1433
    protocol: TCP
    targetPort: 1433
  selector:
    vm.kubevirt.io/name: win2025
  type: NodePort
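To connect from SSMS outside the cluster, we need a node’s IP address and the node port that was assigned to the service. Here’s a quick way to grab both (a sketch…it assumes the first node’s InternalIP is reachable from our workstation): –

```shell
# Grab the first node's internal IP and the SQL service's assigned node port
NODE_IP=$(kubectl get nodes -o jsonpath='{.items[0].status.addresses[?(@.type=="InternalIP")].address}')
SQL_PORT=$(kubectl get service win2025-sql -o jsonpath='{.spec.ports[0].nodePort}')

# SSMS takes the server name in the form IP,port (comma, not colon)
echo "Connect in SSMS using: ${NODE_IP},${SQL_PORT}"
```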

Confirm the service: –

kubectl get services

And let’s attempt to connect to SQL Server in SSMS: –

And there is SQL Server running in KubeVirt!

Ok, let’s run a performance test to see how it compares with SQL deployed to the same Kubernetes cluster as a statefulset. I used Anthony Nocentino’s containerised HammerDB tool for this…here are the results: –

# Statefulset result
TEST RESULT : System achieved 45319 NOPM from 105739 SQL Server TPM

# KubeVirt result
TEST RESULT : System achieved 5962 NOPM from 13929 SQL Server TPM

OK, well that’s disastrous! The KubeVirt VM achieved just 13% of the transactions per minute of the SQL instance running in the statefulset on the same cluster!
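For the record, that figure is just the two TPM numbers divided out: –

```shell
# KubeVirt TPM as a percentage of the statefulset TPM
awk 'BEGIN { printf "%.1f%%\n", 13929 / 105739 * 100 }'
# prints 13.2%
```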

I also noticed a very high CPU privileged time when running the test against the database in the KubeVirt instance, which indicates that the VM is spending a lot of time in kernel or virtualization overhead. This is more than likely caused by incorrectly configured drivers, so it’s definitely not an optimal setup.

So OK, this might not be a perfectly fair test, but the gap is still significant. And it’s a lot of effort to go through just to get an instance of SQL Server up and running. But now that we do have a VM running SQL Server, I’ll explore how (or if) we can clone that VM so we don’t have to repeat this entire process for each new deployment…I’ll cover that in a later blog post. I’ll also see if I can address the performance issues.

But to round things off…deploying SQL as a statefulset to a Kubernetes cluster would still be my recommendation.

Thanks for reading!

Vertically scaling SQL Server online in Kubernetes


UPDATE – JANUARY 2026 – This feature has now been moved to stable in Kubernetes v1.35…full details are here: –
https://kubernetes.io/docs/tasks/configure-pod-container/resize-container-resources/

Whilst this is great, unfortunately I did a bit more digging with regards to SQL Server (which, OK, I should have done initially) and it seems that SQL Server will not see the new limits without a restart. So we can increase the limits on the pod without a restart, but SQL will not see them.

My recommendation is to set the resizePolicy to RestartContainer in the manifest: –

resizePolicy:
- resourceName: cpu
  restartPolicy: RestartContainer
- resourceName: memory
  restartPolicy: RestartContainer
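For context, resizePolicy sits at the container level, alongside the container’s resources…so in the statefulset manifest from the original article below it would look like this: –

```yaml
containers:
  - name: mssql-container
    image: mcr.microsoft.com/mssql/server:2022-CU16-ubuntu-20.04
    resizePolicy:
      - resourceName: cpu
        restartPolicy: RestartContainer
      - resourceName: memory
        restartPolicy: RestartContainer
    resources:
      requests:
        memory: "2048Mi"
        cpu: "2000m"
      limits:
        memory: "2048Mi"
        cpu: "2000m"
```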

Then when increasing the memory via: –

kubectl patch pod mssql-statefulset-0 --subresource resize --patch `
'{\"spec\":{\"containers\":[{\"name\":\"mssql-container\", \"resources\":{\"requests\":{\"memory\":\"4096Mi\"}, \"limits\":{\"memory\":\"4096Mi\"}}}]}}'

This will restart just the container within the pod, not the whole pod, which is quicker but does mean a brief outage.
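To confirm SQL Server has actually picked up the new limit after the container restart, we can ask it via sys.dm_os_sys_info (a sketch, using the sa password from the statefulset manifest below…the sqlcmd path can differ between image versions, and -C is needed to trust the self-signed certificate with the mssql-tools18 builds): –

```shell
# Ask SQL Server how much physical memory it thinks it has
kubectl exec mssql-statefulset-0 -- /opt/mssql-tools18/bin/sqlcmd \
  -S localhost -U sa -P 'Testing1122' -C \
  -Q "SELECT physical_memory_kb / 1024 AS physical_memory_mb FROM sys.dm_os_sys_info;"
```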


ORIGINAL ARTICLE

One of the new features in Kubernetes v1.33 is the ability to resize CPU and memory resources for containers online, aka without having to recreate the pod the container is running in. In the past, when adjusting a pod’s resources, Kubernetes would delete the existing pod and create a new one via a controller.

Not a problem for applications that can have multiple replicas running, but for SQL Server this would cause a disruption as we (generally) only have one pod running SQL Server in a statefulset. Let’s see this in action.

First we’ll deploy this simple statefulset to Kubernetes: –

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mssql-statefulset
spec:
  serviceName: "mssql"
  replicas: 1
  podManagementPolicy: Parallel
  selector:
    matchLabels:
      name: mssql-pod
  template:
    metadata:
      labels:
        name: mssql-pod
    spec:
      securityContext:
        fsGroup: 10001
      containers:
        - name: mssql-container
          image: mcr.microsoft.com/mssql/server:2022-CU16-ubuntu-20.04
          ports:
            - containerPort: 1433
              name: mssql-port
          env:
            - name: ACCEPT_EULA
              value: "Y"
            - name: MSSQL_SA_PASSWORD
              value: "Testing1122"
          resources:
            requests:
              memory: "2048Mi"
              cpu: "2000m"
            limits:
              memory: "2048Mi"
              cpu: "2000m"

The important part here is the CPU and memory settings: –

          resources:
            requests:
              memory: "2048Mi"
              cpu: "2000m"
            limits:
              memory: "2048Mi"
              cpu: "2000m"

N.B. – you may have noticed that the limits and requests here are the same value. This is to set a “Guaranteed” Quality of Service for the pod…it’s a recommended best practice for SQL Server in Kubernetes, more info is here: –
https://learn.microsoft.com/en-us/sql/linux/sql-server-linux-kubernetes-best-practices-statefulsets

Let’s apply that manifest: –

kubectl apply -f ./sqlserver.yaml
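Once the pod is running, we can confirm that the matching requests and limits did give us the Guaranteed QoS class straight from the pod’s status: –

```shell
# Should print "Guaranteed" when requests == limits for every resource
kubectl get pod mssql-statefulset-0 -o jsonpath='{.status.qosClass}'
```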

In the past we had to edit our statefulset to increase these values: –

kubectl edit sts mssql-statefulset

This would recreate the pod with the new limits/requests: –

But now, as of Kubernetes v1.33, we can scale pods without a restart! See here for more info: –
https://kubernetes.io/docs/tasks/configure-pod-container/resize-container-resources/

One thing to note…the code to do this in the official docs will error out if you’re running kubectl on Windows in PowerShell (sigh). In order for it to run successfully, a backslash has to be added before each double quote character.

So in order to increase the memory of the pod running, we would run: –

kubectl patch pod mssql-statefulset-0 --subresource resize --patch `
'{\"spec\":{\"containers\":[{\"name\":\"mssql-container\", \"resources\":{\"requests\":{\"memory\":\"4000Mi\"}, \"limits\":{\"memory\":\"4000Mi\"}}}]}}'
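For anyone on Linux or macOS, the same patch runs without any of the escaping, as the single quotes protect the JSON as-is: –

```shell
kubectl patch pod mssql-statefulset-0 --subresource resize --patch \
'{"spec":{"containers":[{"name":"mssql-container","resources":{"requests":{"memory":"4000Mi"},"limits":{"memory":"4000Mi"}}}]}}'
```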

And then if we check out the pod’s yaml: –

kubectl get pod mssql-statefulset-0 -o yaml
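Rather than scrolling through the full yaml, a jsonpath query will pull out just the resources block: –

```shell
# Show only the container's current requests and limits
kubectl get pod mssql-statefulset-0 -o jsonpath='{.spec.containers[0].resources}'
```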

And there we are, cool!

So to do this for CPU, we would run: –

kubectl patch pod mssql-statefulset-0 --subresource resize --patch `
'{\"spec\":{\"containers\":[{\"name\":\"mssql-container\", \"resources\":{\"requests\":{\"cpu\":\"4000m\"}, \"limits\":{\"cpu\":\"4000m\"}}}]}}'

Ok, I appreciate this code isn’t exactly the easiest to type out! Thankfully we can now add a --subresource resize flag to an edit command: –

kubectl edit pod mssql-statefulset-0 --subresource resize

And this will allow us to update the CPU and memory limits/requests of the pod without a restart!

Thanks for reading!

Visualising SQL Server in Kubernetes

The other day I came across an interesting repo on GitHub, KubeDiagrams.

What this repo does is generate Kubernetes architecture diagrams from Kubernetes manifest files…nice!

Deploying applications to Kubernetes can get complicated fast…especially with stateful applications such as SQL Server.

So having the ability to easily generate diagrams is really helpful…because we all should be documenting everything, right? 🙂

Plus I’m rubbish at creating diagrams!

So let’s have a look at how this works. First, install the dependencies via pip: –

pip install pyyaml
pip install diagrams

And install graphviz: –

sudo apt install graphviz

Great, now pull down the repo: –

git clone https://github.com/philippemerle/KubeDiagrams.git

And we’re good to go! So here’s an example manifest file to deploy a SQL Server statefulset to Kubernetes: –

apiVersion: v1
kind: Namespace
metadata:
  name: mssql
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: mssql-sc
provisioner: docker.io/hostpath
reclaimPolicy: Delete
volumeBindingMode: Immediate
---
apiVersion: v1
kind: Service
metadata:
  name: mssql-headless
  namespace: mssql
spec:
  clusterIP: None
  selector:
    name: mssql-pod
  ports:
    - name: mssql-port
      port: 1433
      targetPort: 1433
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mssql-statefulset
  namespace: mssql
spec:
  serviceName: "mssql-headless"
  replicas: 1
  podManagementPolicy: Parallel
  selector:
    matchLabels:
      name: mssql-pod
  template:
    metadata:
      labels:
        name: mssql-pod
    spec:
      securityContext:
        fsGroup: 10001
      containers:
        - name: mssql-container
          image: mcr.microsoft.com/mssql/server:2022-CU16-ubuntu-20.04
          ports:
            - containerPort: 1433
              name: mssql-port
          env:
            - name: MSSQL_PID
              value: "Developer"
            - name: ACCEPT_EULA
              value: "Y"
            - name: MSSQL_AGENT_ENABLED
              value: "1"
            - name: MSSQL_SA_PASSWORD
              value: "Testing1122"
          resources:
            requests:
              memory: "1024Mi"
              cpu: "500m"
            limits:
              memory: "2048Mi"
              cpu: "2000m"
          volumeMounts:
            - name: sqlsystem
              mountPath: /var/opt/mssql
            - name: sqldata
              mountPath: /opt/sqlserver/data
            - name: sqllog
              mountPath: /opt/sqlserver/log
  volumeClaimTemplates:
    - metadata:
        name: sqlsystem
        namespace: mssql
      spec:
        accessModes:
         - ReadWriteOncePod
        storageClassName: mssql-sc
        resources:
          requests:
            storage: 25Gi
    - metadata:
        name: sqldata
        namespace: mssql
      spec:
        accessModes:
         - ReadWriteOncePod
        storageClassName: mssql-sc
        resources:
          requests:
            storage: 25Gi
    - metadata:
        name: sqllog
        namespace: mssql
      spec:
        accessModes:
         - ReadWriteOncePod
        storageClassName: mssql-sc
        resources:
          requests:
            storage: 25Gi
---
apiVersion: v1
kind: Service
metadata:
  name: mssql-service
  namespace: mssql
spec:
  ports:
  - name: mssql-ports
    port: 1433
    targetPort: 1433
  selector:
    name: mssql-pod
  type: LoadBalancer

I’m using Docker Desktop here (hence the provisioner: docker.io/hostpath in the storage class). What this’ll create is a namespace, storage class, headless service for the statefulset, the statefulset itself, three persistent volume claims, and a load balanced service to connect to SQL.

Quite a lot of objects for a simple SQL Server deployment, right? (ahh I know it’s a statefulset, but you know what I mean)

So let’s point KubeDiagrams at the manifest: –

./kube-diagrams mssql-statefulset.yaml

And here’s the output!

Pretty cool, eh?

I noticed a couple of quirks. The docs say it’ll work with any Python 3 install. I had 3.8 installed but had to upgrade to 3.9.

Also, I had to add namespace: mssql to the PVCs in the statefulset, otherwise KubeDiagrams threw errors: –

Error: ‘sqlsystem/mssql/PersistentVolumeClaim/v1’ resource not found!
Error: ‘sqldata/mssql/PersistentVolumeClaim/v1’ resource not found!
Error: ‘sqllog/mssql/PersistentVolumeClaim/v1’ resource not found

But other than those, it works really well and is a great way to visualise objects in Kubernetes.

Massive thank you to the creator, Philippe Merle!

Thanks for reading!