Performance tuning KubeVirt for SQL Server

Following on from my last post about Getting Started With KubeVirt & SQL Server, in this post I want to see if I can improve the performance from the initial test I ran.

In the previous test, I used SQL Server 2025 RC1…so I wanted to change that to RTM (now that it’s been released) but I was getting some strange issues running it in the StatefulSet. However, SQL Server 2022 seemed to have no issues and, as much as I want to investigate what’s going on with 2025 (pretty sure it’s host based, not an issue with SQL 2025), I want to dive into KubeVirt more…so let’s go with 2022 in both KubeVirt and the StatefulSet.

I also separated out the system databases, user database data, and user database log files onto separate volumes…here’s what the StatefulSet manifest looks like: –

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mssql-statefulset
spec:
  serviceName: "mssql"
  replicas: 1
  podManagementPolicy: Parallel
  selector:
    matchLabels:
      name: mssql-pod
  template:
    metadata:
      labels:
        name: mssql-pod
      annotations:
        stork.libopenstorage.org/disableHyperconvergence: "true"
    spec:
      securityContext:
        fsGroup: 10001
      containers:
        - name: mssql-container
          image: mcr.microsoft.com/mssql/rhel/server:2022-CU22-rhel-9.1
          ports:
            - containerPort: 1433
              name: mssql-port
          env:
            - name: MSSQL_PID
              value: "Developer"
            - name: ACCEPT_EULA
              value: "Y"
            - name: MSSQL_AGENT_ENABLED
              value: "1"
            - name: MSSQL_SA_PASSWORD
              value: "Testing1122"
            - name: MSSQL_DATA_DIR
              value: "/opt/sqlserver/data"
            - name: MSSQL_LOG_DIR
              value: "/opt/sqlserver/log"
          resources:
            requests:
              memory: "8192Mi"
              cpu: "4000m"
            limits:
              memory: "8192Mi"
              cpu: "4000m"
          volumeMounts:
            - name: sqlsystem
              mountPath: /var/opt/mssql/
            - name: sqldata
              mountPath: /opt/sqlserver/data/
            - name: sqllog
              mountPath: /opt/sqlserver/log/
  volumeClaimTemplates:
    - metadata:
        name: sqlsystem
      spec:
        accessModes:
         - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
        storageClassName: px-fa-direct-access
    - metadata:
        name: sqldata
      spec:
        accessModes:
         - ReadWriteOnce
        resources:
          requests:
            storage: 50Gi
        storageClassName: px-fa-direct-access
    - metadata:
        name: sqllog
      spec:
        accessModes:
         - ReadWriteOnce
        resources:
          requests:
            storage: 25Gi
        storageClassName: px-fa-direct-access
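One thing the manifest above references but doesn’t define is the headless service named in serviceName: "mssql". If you don’t already have one, a minimal sketch might look like this (the name and port here are assumptions based on the StatefulSet above): –

```yaml
# Hypothetical headless service to match serviceName: "mssql" in the StatefulSet
apiVersion: v1
kind: Service
metadata:
  name: mssql
spec:
  clusterIP: None # headless - gives the pod a stable DNS name
  selector:
    name: mssql-pod
  ports:
    - port: 1433
      name: mssql-port
```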

And here’s what the KubeVirt VM manifest looks like: –

apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: win2025
spec:
  runStrategy: Manual # VM will not start automatically
  template:
    metadata:
      labels:
        app: sqlserver
    spec:
      domain:
        firmware:
          bootloader:
            efi:
              secureBoot: false
        resources: # requesting same limits and requests for guaranteed QoS
          requests:
            memory: "8Gi"
            cpu: "4"
          limits:
            memory: "8Gi"
            cpu: "4"
        devices:
          disks:
            # Disk 1: OS
            - name: osdisk
              disk:
                bus: scsi
            # Disk 2: SQL System
            - name: sqlsystem
              disk:
                bus: scsi
            # Disk 3: SQL Data
            - name: sqldata
              disk:
                bus: scsi
            # Disk 4: SQL Log
            - name: sqllog
              disk:
                bus: scsi
            # Windows installer ISO
            - name: cdrom-win2025
              cdrom:
                bus: sata
                readonly: true
            # VirtIO drivers ISO
            - name: virtio-drivers
              cdrom:
                bus: sata
                readonly: true
            # SQL Server installer ISO
            - name: sql2022-iso
              cdrom:
                bus: sata
                readonly: true
          interfaces:
            - name: default
              model: virtio
              bridge: {}
              ports:
                - port: 3389 # port for RDP
                - port: 1433 # port for SQL Server      
      networks:
        - name: default
          pod: {}
      volumes:
        - name: osdisk
          persistentVolumeClaim:
            claimName: winos
        - name: sqlsystem
          persistentVolumeClaim:
            claimName: sqlsystem
        - name: sqldata
          persistentVolumeClaim:
            claimName: sqldata
        - name: sqllog
          persistentVolumeClaim:
            claimName: sqllog
        - name: cdrom-win2025
          persistentVolumeClaim:
            claimName: win2025-pvc
        - name: virtio-drivers
          containerDisk:
            image: kubevirt/virtio-container-disk
        - name: sql2022-iso
          persistentVolumeClaim:
            claimName: sql2022-pvc

I then ran the HammerDB test again…running for 10 minutes with a 2 minute ramp up time. Here are the results: –

# Statefulset result
TEST RESULT : System achieved 46594 NOPM from 108126 SQL Server TPM

# KubeVirt result
TEST RESULT : System achieved 18029 NOPM from 41620 SQL Server TPM

Oooooook…that has made a difference! KubeVirt TPM is now up to 38% of the StatefulSet TPM. But I’m still seeing high privileged CPU time in the KubeVirt VM: –
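(That 38% figure is just the ratio of the two TPM numbers above, rounded down: –)

```shell
# KubeVirt TPM as a percentage of StatefulSet TPM (integer division)
echo $(( 41620 * 100 / 108126 ))
```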

So I went through the docs and found that there are a whole bunch of options for VM configuration…the first one I tried was the Hyper-V feature. This should allow Windows to use paravirtualized interfaces instead of emulated hardware, reducing VM exit overhead and improving interrupt, timer, and CPU coordination performance.

Here’s what I added to the VM manifest: –

        features:
          hyperv: {} # turns on Hyper-V feature so the guest “thinks” it’s running under Hyper-V - needs the Hyper-V clock timer too, otherwise VM pod will not start
        clock:
          timer:
            hyperv: {} 

N.B. – for more information on what’s happening here, check out this link: –
https://www.qemu.org/docs/master/system/i386/hyperv.html

I stopped and started the VM, then ran the test again. Here are the results: –

TEST RESULT : System achieved 40591 NOPM from 94406 SQL Server TPM

Wait, what!? That made a huge difference…it’s now 87% of the StatefulSet result! AND the privileged CPU time has come down: –

But let’s not stop there…let’s keep going and see if we can get TPM parity between KubeVirt and SQL in a StatefulSet.

There’s a bunch more flags that can be set for the Hyper-V feature and the overall VM, so let’s set some of those: –

        features:
          acpi: {} # ACPI support (power management, shutdown, reboot, device enumeration)
          apic: {} # Advanced Programmable Interrupt Controller (modern interrupt handling for Windows/SQL)
          hyperv: # turns on the Hyper-V vendor feature block so the guest “thinks” it’s running under Hyper-V - needs the Hyper-V clock timer too, otherwise the VM pod will not start
            reenlightenment: {} # Allows guest to update its TSC frequency after migrations or time adjustments
            ipi: {} # Hyper-V IPI acceleration - faster inter-processor interrupts between vCPUs
            synic: {} # Hyper-V Synthetic Interrupt Controller - improves interrupt delivery
            synictimer: {} # Hyper-V synthetic timer - stable high-resolution guest time source
            spinlocks:
              spinlocks: 8191 # Prevents Windows spinlock stalls on SMP systems - avoids boot/timeouts under load
            reset: {} # Hyper-V reset infrastructure - cleaner VM resets
            relaxed: {} # Relaxed timing - reduces overhead when timing deviations occur under virtualization
            vpindex: {} # Per-vCPU indexing - improves Windows scheduler awareness of vCPU layout
            runtime: {} # Hyper-V runtime page support - gives guest better insight into hypervisor behavior
            tlbflush: {} # Hyper-V accelerated TLB flush - improves scalability on multi-vCPU workloads
            frequencies: {} # Exposes host CPU frequency data - allows proper scaling & guest timing
            vapic: {} # Virtual APIC support - reduces interrupt latency and overhead
        clock:
          timer:
            hyperv: {} # Hyper-V clock/timer - stable time source, recommended when using Hyper-V enlightenments

Memory and CPU wise…I went and added: –

        ioThreadsPolicy: auto # Automatically allocate IO threads for QEMU to reduce disk I/O contention
        cpu:
          cores: 4
          dedicatedCpuPlacement: true # Guarantees pinned physical CPUs for this VM to improve latency & stability
          isolateEmulatorThread: true # Pins QEMU’s emulator thread to a dedicated pCPU instead of sharing with vCPUs
          model: host-passthrough # Exposes all host CPU features directly to the VM
          numa:
            guestMappingPassthrough: {} # Mirrors host NUMA topology to the guest to reduce cross-node latency
        memory:
          hugepages:
            pageSize: 1Gi # Uses 1Gi hugepages for reduced TLB pressure

N.B. – this required configuring the host to reserve hugepages at boot
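I won’t go deep on host configuration here, but as a rough sketch (assuming a GRUB-based host…file locations and the regenerate command vary by distro), reserving 8 x 1Gi hugepages at boot looks something like this: –

```shell
# Kernel command line parameters to reserve 8 x 1Gi hugepages at boot.
# Append these to GRUB_CMDLINE_LINUX in /etc/default/grub on a GRUB-based host:
HUGEPAGE_PARAMS="default_hugepagesz=1G hugepagesz=1G hugepages=8"
echo "${HUGEPAGE_PARAMS}"
# Then regenerate the grub config (e.g. sudo update-grub) and reboot.
# After the reboot, verify the reservation with: grep HugePages_Total /proc/meminfo
```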

And then for disks…I installed the latest virtio drivers in the VM, switched the disks for the SQL system, data, and log files to use a virtio bus instead of scsi, and then added the following for each disk: –

dedicatedIOThread: true

Other device settings added were: –

autoattachGraphicsDevice: false # Do not attach a virtual graphics/display device (VNC/SPICE) - removes unnecessary emulation
autoattachMemBalloon: false # Disable the VirtIO memory balloon - prevents dynamic memory changes, improves consistency
autoattachSerialConsole: true # Attach a serial console for debugging and virtctl console access
networkInterfaceMultiqueue: true # Enable multi-queue virtio-net so NIC traffic can use multiple RX/TX queues

All of this results in a bit of a monster manifest file for the VM: –

apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: win2025
spec:
  runStrategy: Manual # VM will not start automatically
  template:
    metadata:
      labels:
        app: sqlserver
    spec:
      domain:
        ioThreadsPolicy: auto # Automatically allocate IO threads for QEMU to reduce disk I/O contention
        cpu:
          cores: 4
          dedicatedCpuPlacement: true # Guarantees pinned physical CPUs for this VM - improves latency & stability
          isolateEmulatorThread: true # Pins QEMU’s emulator thread to a dedicated pCPU instead of sharing with vCPUs
          model: host-passthrough # Exposes host CPU features directly to the VM - best performance (but less portable)
          numa:
            guestMappingPassthrough: {} # Mirrors host NUMA topology to the guest - reduces cross-node memory latency
        memory:
          hugepages:
            pageSize: 1Gi # Uses 1Gi hugepages for reduced TLB pressure - better performance for large-memory SQL
        firmware:
          bootloader:
            efi:
              secureBoot: false # Disable Secure Boot (often required when using custom/older virtio drivers)
        features:
          acpi: {} # ACPI support (power management, shutdown, reboot, device enumeration)
          apic: {} # Advanced Programmable Interrupt Controller (modern interrupt handling for Windows/SQL)
          hyperv: # Enable Hyper-V enlightenment features for Windows guests to improve performance & timing
            reenlightenment: {} # Allows guest to update its TSC frequency after migrations or time adjustments
            ipi: {} # Hyper-V IPI acceleration - faster inter-processor interrupts between vCPUs
            synic: {} # Hyper-V Synthetic Interrupt Controller - improves interrupt delivery
            synictimer: {} # Hyper-V synthetic timer - stable high-resolution guest time source
            spinlocks:
              spinlocks: 8191 # Prevents Windows spinlock stalls on SMP systems - avoids boot/timeouts under load
            reset: {} # Hyper-V reset infrastructure - cleaner VM resets
            relaxed: {} # Relaxed timing - reduces overhead when timing deviations occur under virtualization
            vpindex: {} # Per-vCPU indexing - improves Windows scheduler awareness of vCPU layout
            runtime: {} # Hyper-V runtime page support - gives guest better insight into hypervisor behavior
            tlbflush: {} # Hyper-V accelerated TLB flush - improves scalability on multi-vCPU workloads
            frequencies: {} # Exposes host CPU frequency data - allows proper scaling & guest timing
            vapic: {} # Virtual APIC support - reduces interrupt latency and overhead
        clock:
          timer:
            hyperv: {} # Hyper-V clock/timer - stable time source, recommended when using Hyper-V enlightenments
        resources: # requests == limits for guaranteed QoS (exclusive CPU & memory reservation)
          requests:
            memory: "8Gi"
            cpu: "4"
            hugepages-1Gi: "8Gi"
          limits:
            memory: "8Gi"
            cpu: "4"
            hugepages-1Gi: "8Gi"
        devices:
          autoattachGraphicsDevice: false # Do not attach a virtual graphics/display device (VNC/SPICE) - removes unnecessary emulation
          autoattachMemBalloon: false # Disable the VirtIO memory balloon - prevents dynamic memory changes, improves consistency
          autoattachSerialConsole: true # Attach a serial console for debugging and virtctl console access
          networkInterfaceMultiqueue: true # Enable multi-queue virtio-net so NIC traffic can use multiple RX/TX queues
          disks:
            # Disk 1: OS
            - name: osdisk
              disk:
                bus: scsi   # Keep OS disk on SCSI - simpler boot path once VirtIO storage is already in place
              cache: none
            # Disk 2: SQL System
            - name: sqlsystem
              disk:
                bus: virtio
              cache: none
              dedicatedIOThread: true # Give this disk its own IO thread - reduces contention with other disks
            # Disk 3: SQL Data
            - name: sqldata
              disk:
                bus: virtio
              cache: none
              dedicatedIOThread: true # Separate IO thread for data file I/O - improves parallelism under load
            # Disk 4: SQL Log
            - name: sqllog
              disk:
                bus: virtio
              cache: none
              dedicatedIOThread: true # Separate IO thread for log writes - helps with low-latency sequential I/O
            # Windows installer ISO
            - name: cdrom-win2025
              cdrom:
                bus: sata
                readonly: true
            # VirtIO drivers ISO
            - name: virtio-drivers
              cdrom:
                bus: sata
                readonly: true
            # SQL Server installer ISO
            - name: sql2022-iso
              cdrom:
                bus: sata
                readonly: true
          interfaces:
            - name: default
              model: virtio # High-performance paravirtualized NIC (requires NetKVM driver in the guest)
              bridge: {} # Bridge mode - VM gets an IP on the pod network (via the pod’s primary interface)
              ports:
                - port: 3389 # RDP
                - port: 1433 # SQL Server
      networks:
        - name: default
          pod: {} # Attach VM to the default Kubernetes pod network
      volumes:
        - name: osdisk
          persistentVolumeClaim:
            claimName: winos
        - name: sqlsystem
          persistentVolumeClaim:
            claimName: sqlsystem
        - name: sqldata
          persistentVolumeClaim:
            claimName: sqldata
        - name: sqllog
          persistentVolumeClaim:
            claimName: sqllog
        - name: cdrom-win2025
          persistentVolumeClaim:
            claimName: win2025-pvc
        - name: virtio-drivers
          containerDisk:
            image: kubevirt/virtio-container-disk
        - name: sql2022-iso
          persistentVolumeClaim:
            claimName: sql2022-pvc

And then I ran the tests again: –

# StatefulSet
TEST RESULT : System achieved 47200 NOPM from 109554 SQL Server TPM

# KubeVirt
TEST RESULT : System achieved 46563 NOPM from 108184 SQL Server TPM

BOOOOOOOOOM! OK, so that’s 98% of the TPM achieved in the StatefulSet. And given there’s a bit of variance in those results, they’re now pretty much the same!

OK, so it’s not the most robust performance testing ever done…and I’m fully aware that testing in a lab like this is one thing, whereas running SQL Server in KubeVirt…even in a dev/test environment…is a completely different situation. There are still questions over stability and resiliency, BUT I hope this shows that we shouldn’t be counting KubeVirt out as a platform for SQL Server based on performance.

Thanks for reading!

Running SQL Server on KubeVirt – Getting Started

With all the changes that have happened with VMware since the Broadcom acquisition I have been asked more and more about alternatives for running SQL Server.

One of the options that has repeatedly cropped up is KubeVirt.

KubeVirt provides the ability to run virtual machines in Kubernetes…so essentially could provide an option to “lift and shift” VMs from VMware to a Kubernetes cluster.

A bit of background on KubeVirt…it’s a CNCF project accepted in 2019 and moved to “incubating” maturity level in 2022…so it’s been around a while now. KubeVirt uses custom resources and controllers in order to create, deploy, and manage VMs in Kubernetes by using libvirt and QEMU under the hood to provision those virtual machines.

I have to admit, I’m skeptical about this…we already have a way to deploy SQL Server to Kubernetes, and I don’t really see the benefits of deploying an entire VM.

But let’s run through how to get up and running with SQL Server in KubeVirt. There are a bunch of pre-requisites required here so I’ll detail the setup that I’m using.

I went with a physical server for this, as I didn’t want to deal with any nested virtualisation issues (VMs within VMs) and I could only get ONE box…so I’m running a “compressed” Kubernetes cluster, aka the node is both a control and worker node. I also needed a storage provider, and as I work for Pure Storage…I have access to a FlashArray which I’ll provision persistent volumes from via Portworx (the PX-CSI offering to be exact). Portworx provides a CSI driver that exposes FlashArray storage to Kubernetes for PersistentVolume provisioning.

So it’s not an ideal setup…I’ll admit…but should be good enough to get up and running to see what KubeVirt is all about.

Let’s go ahead and get started with KubeVirt.

First thing to do is actually deploy KubeVirt to the cluster…I followed the guide here: –
https://kubevirt.io/user-guide/cluster_admin/installation/

export RELEASE=$(curl https://storage.googleapis.com/kubevirt-prow/release/kubevirt/kubevirt/stable.txt) # set the latest KubeVirt release
kubectl apply -f https://github.com/kubevirt/kubevirt/releases/download/${RELEASE}/kubevirt-operator.yaml # deploy the KubeVirt operator
kubectl apply -f https://github.com/kubevirt/kubevirt/releases/download/${RELEASE}/kubevirt-cr.yaml # create the KubeVirt CR (instance deployment request) which triggers the actual installation

Let’s wait until all the components are up and running: –

kubectl -n kubevirt wait kv kubevirt --for condition=Available

Here’s what each of these components does: –

virt-api        - The API endpoint used by Kubernetes and virtctl to interact with VM and VMI subresources.
virt-controller - Control-plane component that reconciles VM and VMI objects, creates VMIs, and manages migrations.
virt-handler    - Node-level component responsible for running and supervising VMIs and QEMU processes on each node.
virt-operator   - Manages the installation, upgrades, and lifecycle of all KubeVirt core components.

There are two pods for the controller and operator, as they are Deployments with a default replicas value of 2…I’m running on a one node cluster so could scale those down, but I’ll leave the defaults for now.
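If you did want to scale those down, the KubeVirt CR is the place to do it rather than editing the Deployments directly (the operator would just reconcile them back). A sketch…assuming the spec.infra customization field in the KubeVirt CR (verify this against your KubeVirt version): –

```yaml
# Hypothetical sketch: run a single replica of the KubeVirt infra components
# (spec.infra.replicas is an assumption from the KubeVirt CR API - check your version)
apiVersion: kubevirt.io/v1
kind: KubeVirt
metadata:
  name: kubevirt
  namespace: kubevirt
spec:
  infra:
    replicas: 1
```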

More information on the architecture of KubeVirt can be found here: –
https://kubevirt.io/user-guide/architecture/

And I found this blog post really useful!
https://arthurchiao.art/blog/kubevirt-create-vm/

The next tool we’ll need is the Containerized Data Importer (CDI). This is the backend component that will allow us to upload ISO files to the Kubernetes cluster, which will then be mounted as persistent volumes when we deploy a VM. The guide I followed was here: –
https://github.com/kubevirt/containerized-data-importer

export VERSION=$(curl -s https://api.github.com/repos/kubevirt/containerized-data-importer/releases/latest | grep '"tag_name":' | sed -E 's/.*"([^"]+)".*/\1/')
kubectl create -f https://github.com/kubevirt/containerized-data-importer/releases/download/$VERSION/cdi-operator.yaml
kubectl create -f https://github.com/kubevirt/containerized-data-importer/releases/download/$VERSION/cdi-cr.yaml

And again, let’s check that all the components are up and running: –

kubectl get all -n cdi

Right, the NEXT tool we’ll need is virtctl, the CLI that allows us to deploy/configure/manage VMs in KubeVirt: –

export VERSION=$(curl https://storage.googleapis.com/kubevirt-prow/release/kubevirt/kubevirt/stable.txt)
wget https://github.com/kubevirt/kubevirt/releases/download/${VERSION}/virtctl-${VERSION}-linux-amd64

Make the binary executable and add it to your $PATH environment variable, then confirm that it’s installed: –

virtctl version

Okey dokey, now we need to upload our ISO files for Windows and SQL Server to the cluster.

Note that I’m referencing the storage class from my storage provider (PX-CSI) here. Also, I could not get this to work from my desktop; I had to upload the ISO files to the Kubernetes node and run the commands there. The value for the --uploadproxy-url flag is the IP address of the cdi-uploadproxy service: –

Uploading the Windows ISO (I went with Windows Server 2025): –

virtctl image-upload pvc win2025-pvc --size 10Gi \
--image-path=./en-us_windows_server_2025_updated_oct_2025_x64_dvd_6c0c5aa8.iso \
--uploadproxy-url=https://10.97.56.82:443 \
--storage-class px-fa-direct-access \
--insecure

And uploading the SQL Server 2025 install ISO: –

virtctl image-upload pvc sql2025-pvc --size 10Gi \
--image-path=./SQLServer2025-x64-ENU.iso \
--uploadproxy-url=https://10.97.56.82:443 \
--storage-class px-fa-direct-access \
--insecure

Let’s confirm the resulting persistent volumes: –

kubectl get persistentvolumes

Ok, so the next step is to pull down a container image so that it can be referenced in the VM yaml. This image contains the VirtIO drivers needed for Windows to detect the VM’s virtual disks and network interfaces: –

sudo ctr images pull docker.io/kubevirt/virtio-container-disk:latest
sudo ctr images ls | grep virtio

The final thing to do is create the PVCs/PVs that will be used for the OS, SQL data files, and SQL log files within the VM. The yaml is: –

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: winos
spec:
  accessModes: [ "ReadWriteOnce" ]
  resources:
    requests:
      storage: 100Gi
  storageClassName: px-fa-direct-access
  volumeMode: Block
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: sqldata
spec:
  accessModes: [ "ReadWriteOnce" ]
  resources:
    requests:
      storage: 50Gi
  storageClassName: px-fa-direct-access
  volumeMode: Block
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: sqllog
spec:
  accessModes: [ "ReadWriteOnce" ]
  resources:
    requests:
      storage: 25Gi
  storageClassName: px-fa-direct-access
  volumeMode: Block

And then create!

kubectl apply -f pvc.yaml

Right, now we can create the VM! Below is the yaml I used: –

apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: win2025
spec:
  runStrategy: Manual # VM will not start automatically
  template:
    metadata:
      labels:
        app: sqlserver
    spec:
      domain:
        firmware:
          bootloader:
            efi: # uefi boot
              secureBoot: false # disable secure boot
        resources: # requesting same limits and requests for guaranteed QoS
          requests:
            memory: "8Gi"
            cpu: "4"
          limits:
            memory: "8Gi"
            cpu: "4"
        devices:
          disks:
            # Disk 1: OS
            - name: osdisk
              disk:
                bus: scsi
            # Disk 2: SQL Data
            - name: sqldata
              disk:
                bus: scsi
            # Disk 3: SQL Log
            - name: sqllog
              disk:
                bus: scsi
            # Windows installer ISO
            - name: cdrom-win2025
              cdrom:
                bus: sata
                readonly: true
            # VirtIO drivers ISO
            - name: virtio-drivers
              cdrom:
                bus: sata
                readonly: true
            # SQL Server installer ISO
            - name: sql2025-iso
              cdrom:
                bus: sata
                readonly: true
          interfaces:
            - name: default
              model: virtio
              bridge: {}
              ports:
                - port: 3389 # port for RDP
                - port: 1433 # port for SQL Server      
      networks:
        - name: default
          pod: {}
      volumes:
        - name: osdisk
          persistentVolumeClaim:
            claimName: winos
        - name: sqldata
          persistentVolumeClaim:
            claimName: sqldata
        - name: sqllog
          persistentVolumeClaim:
            claimName: sqllog
        - name: cdrom-win2025
          persistentVolumeClaim:
            claimName: win2025-pvc
        - name: virtio-drivers
          containerDisk:
            image: kubevirt/virtio-container-disk
        - name: sql2025-iso
          persistentVolumeClaim:
            claimName: sql2025-pvc

Let’s deploy the VM: –

kubectl apply -f win2025.yaml

And let’s confirm: –

kubectl get vm

So now we’re ready to start the VM and install Windows: –

virtctl start win2025

This will start an instance of the VM we created…to monitor the startup: –

kubectl get vm
kubectl get vmi
kubectl get pods

So we have a virtual machine, an instance of that virtual machine, and a virt-launcher pod…which is actually running the virtual machine by launching the QEMU process for the virtual machine instance.

Once the VM instance has been started, we can connect to it via VNC and run through the Windows installation process. I’m using TigerVNC here.

virtctl vnc win2025 --vnc-path "C:\Tools\vncviewer64-1.15.0.exe" --vnc-type=tiger

Hit any key to boot from the ISO (you’ll need to go into the boot options) and we’re now running through a normal Windows install process!

When the option to select the drive to install Windows appears, we have to load the drivers from the ISO we mounted from the virtio-container-disk:latest container image: –

Once those are loaded, we’ll be able to see all the disks attached to the VM and continue the install process.

When the install completes, we’ll need to check the drivers in Device Manager: –

Go through and install any missing drivers (check disks and anything under “other devices”).

OK, because VNC drives me nuts…once we have Windows installed, we’ll enable remote connections within Windows and then deploy a NodePort service to the cluster to open up port 3389…which will let us RDP to the VM: –

apiVersion: v1
kind: Service
metadata:
  name: win2025-rdp
spec:
  ports:
  - port: 3389
    protocol: TCP
    targetPort: 3389
  selector:
    vm.kubevirt.io/name: win2025
  type: NodePort

Confirm service and port: –

kubectl get services

Once we can RDP, we can continue to configure Windows (if we want to) but the main thing now is to get SQL Server 2025 installed. Don’t forget to online and format the disks for the SQL Server data and log files!

The ISO file containing the SQL install media is mounted within the VM…so it’s just a normal install. Run through the install and confirm it’s successful: –

Once the installation is complete…let’s deploy another NodePort service to allow us to connect to SQL Server in the VM: –

apiVersion: v1
kind: Service
metadata:
  name: win2025-sql
spec:
  ports:
  - port: 1433
    protocol: TCP
    targetPort: 1433
  selector:
    vm.kubevirt.io/name: win2025
  type: NodePort

Confirm the service: –

kubectl get services

And let’s attempt to connect to SQL Server in SSMS: –

And there is SQL Server running in KubeVirt!

OK, let’s run a performance test to see how it compares with SQL Server deployed to the same Kubernetes cluster as a StatefulSet. I used Anthony Nocentino’s containerised HammerDB tool for this…here are the results: –

# Statefulset result
TEST RESULT : System achieved 45319 NOPM from 105739 SQL Server TPM

# KubeVirt result
TEST RESULT : System achieved 5962 NOPM from 13929 SQL Server TPM

OK, well that’s disastrous! Just 13% of the transactions per minute achieved by the SQL instance in the StatefulSet on the same cluster!

I also noticed a very high CPU privileged time when running the test against the database in the KubeVirt instance, which indicates that the VM is spending a lot of time in kernel or virtualization overhead. This is more than likely caused by incorrectly configured drivers, so it’s definitely not an optimal setup.

So OK, this might not be a perfectly fair test, but the gap is still significant. And it’s a lot of effort to go through just to get an instance of SQL Server up and running. But now that we do have a VM running SQL Server, I’ll explore how (or if) we can clone that VM so we don’t have to repeat this entire process for each new deployment…I’ll cover that in a later blog post. I’ll also see if I can address the performance issues.

But to round things off…deploying SQL Server as a StatefulSet to a Kubernetes cluster would still be my recommendation.

Thanks for reading!

Recovering data with crash consistent snapshots

When we talk about snapshots of SQL Server there are two types: application consistent snapshots and crash consistent snapshots.

Application consistent snapshots require freezing IO on a database allowing for a checkpoint to be performed in order to write all dirty pages to disk.

In the past, application consistent snapshots relied on third-party software calling the SQL Writer service, but with SQL Server 2022’s T-SQL snapshot backups we now have the ability to use T-SQL commands to freeze IO on a database in order to take an application consistent snapshot of that database.

This is great as we can now easily take a snapshot of a database, bring it up in restoring mode (overwriting the existing database or as a new database), and then perform a point-in-time restore with any transaction log backups that we have.

Crash consistent snapshots are different in that they just take a snapshot of the disks that the database files are on, without freezing IO on the database. This loses us the ability to perform point-in-time restores, but they can still be useful in certain situations…for example, recovering data that was incorrectly updated in a relatively static table.

I am not suggesting for one second that snapshots replace native SQL Server backups…not at all. What I am saying is that snapshots can be used to complement native SQL backups.

In this post I want to run through one such situation.

Note – different storage vendors allow snapshots to be taken in different ways. I work for Pure Storage, so in this example I’m going to be using the Pure Storage powershell module to take a snapshot of volumes on a Pure FlashArray. The commands will differ between vendors but the concepts remain the same.

So the setup here is one SQL Server instance, running on VMware with vVols presented from a Pure Storage FlashArray. The server has three volumes on three separate disks, presented as the C:\, E:\, and F:\ drives respectively. I’m using the AdventureWorks database, which is hosted on the E:\ drive, and we’ll overwrite the volume presented as the F:\ drive with the snapshot taken.

The vVols on the array are in a Pure Storage construct called a Protection Group. A Protection Group is a logical entity that allows us to group volumes together…so, for example, if we had our database’s data and log files on separate volumes, we could group them in a Protection Group and take a snapshot of that group. The resulting snapshot contains individual snapshots of both volumes taken at the same point in time.
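For completeness, creating a Protection Group and adding volumes to it looks roughly like this…the cmdlet and parameter names here are from memory of the v1 SDK, so treat this as an assumption and check the module’s help before using it: –

```powershell
# Hypothetical sketch: group the data and log volumes into one Protection Group
# (the group and volume names below are placeholders, not from this demo)
New-PfaProtectionGroup -Array $FlashArray -Name "sql-pg" -Volumes "sql-data-vol", "sql-log-vol"
```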

But for simplicity’s sake in this example…the database is hosted on one volume…presented as the E:\ drive on the server.

What we’re going to simulate here is an incorrect UPDATE statement executed against the HumanResources.Employee table in the AdventureWorks database. So the steps are: –

  • Take a snapshot of the volume that hosts the database using the functions from the Pure Storage powershell module
  • Run an UPDATE statement without a WHERE clause
  • Overwrite another volume on the server with the snapshot taken
  • Attach the database, and retrieve the data

OK, let’s run through the process. First thing, install the Pure Storage powershell module: –

Install-Module PureStoragePowerShellSDK

Then import the module into our powershell session: –

Import-Module PureStoragePowerShellSDK

Now set variables for the storage array management IP address, the volume we’re going to take the snapshot, and the name of the protection group: –

$FlashArrayIp = "<<STORAGE ARRAY MANAGEMENT IP>>"
$SourceVolume = "<<VOLUME NAME ON THE ARRAY>>"
$ProtectionGroup = "<<PROTECTION GROUP NAME>>"

Set your credentials to connect to the array: –

$Cred = Get-Credential

Then connect to the array, storing the connection in a variable: –

$FlashArray = New-PfaArray -EndPoint $FlashArrayIp -Credentials $Cred -IgnoreCertificateError

Then take the snapshot: –

New-PfaProtectionGroupSnapshot -Array $FlashArray -ProtectionGroupName $ProtectionGroup

Excellent, we have our crash consistent snapshot of the protection group that the database resides in.

OK, now we’re going to simulate an incorrect data update in the AdventureWorks database….in this case an UPDATE statement executed without a WHERE clause: –

UPDATE [AdventureWorks2019].[HumanResources].[Employee]
SET JobTitle = 'Sales Representative'

Has happened to us all, right? 🙂

Accidentally missing off a WHERE clause and updating all the records in a table?

Not great, we need to get this data back ASAP.

So we have the option of a point-in-time restore…either overwriting the existing database or restoring side-by-side and then updating the data from the restored database.

But we have our crash consistent snapshot.

What we can do is overwrite another volume on the same (or different) server, reattach the database, and then retrieve the data.

So let’s run through how to do that.

First thing…a disk has to be presented to the server that is the same size as the disk that we took the snapshot of. In this example, it’s the F:\ drive on the same server that the snapshot (of the E:\ drive) was taken from.

OK…so let’s run through overwriting that disk on the server with the snapshot taken.

In a new powershell session, import the Pure Storage powershell module: –

Import-Module PureStoragePowerShellSDK

Then set a whole bunch of variables: –

$TargetVM = "<<TARGET VM NAME>>"

$TargetDisk = "<<TARGET DISK ID>>" # this is retrieved by Get-Disk on the target server

$SourceVolume = "<<SOURCE VOLUME NAME ON THE ARRAY>>"
$TargetVolume = "<<TARGET VOLUME NAME ON THE ARRAY>>"

$ProtectionGroup = "<<PROTECTION GROUP NAME>>"

$FlashArrayIp = "<<STORAGE ARRAY MANAGEMENT IP>>"

Set your credentials to connect to the array: –

$Cred = Get-Credential

Then connect to the array, storing the connection in a variable: –

$FlashArray = New-PfaArray -EndPoint $FlashArrayIp -Credentials $Cred -IgnoreCertificateError

Create a session on the target VM: –

$TargetVMSession = New-PSSession -ComputerName $TargetVM

Offline the target volume: –

Invoke-Command -Session $TargetVMSession -ScriptBlock { Get-Disk | ? { $_.SerialNumber -eq $using:TargetDisk } | Set-Disk -IsOffline $True }

Confirm that the volume is offline: –

Invoke-Command -Session $TargetVMSession -ScriptBlock { Get-Disk | Select-Object Number, SerialNumber, OperationalStatus | Format-Table }

Get the most recent snapshot, the one we took earlier: –

$MostRecentSnapshot = Get-PfaProtectionGroupSnapshots -Array $FlashArray -Name $ProtectionGroup | Sort-Object created -Descending | Select-Object -Property name -First 1
$MostRecentSnapshot.Name

Perform the volume overwrite: –

New-PfaVolume -Array $FlashArray -VolumeName $TargetVolume -Source ($MostRecentSnapshot.name + ".$SourceVolume") -Overwrite

Online the target volume: –

Invoke-Command -Session $TargetVMSession -ScriptBlock { Get-Disk | ? { $_.SerialNumber -eq $using:TargetDisk } | Set-Disk -IsOffline $False }

Confirm that the volume is online: –

Invoke-Command -Session $TargetVMSession -ScriptBlock { Get-Disk | Select-Object Number, SerialNumber, OperationalStatus | Format-Table }

And to further confirm, list the database files on the new volume: –

Invoke-Command -Session $TargetVMSession -ScriptBlock { Get-ChildItem F:\SQLData1 }

Awesome, OK we have the data back. Now we have some work to do in SQL Server.

Attach the database from the files on the restored volume: –

CREATE DATABASE [AdventureWorks2019_REATTACH]
 ON  PRIMARY 
( NAME = N'AdventureWorks2017', FILENAME = N'F:\SQLData1\AdventureWorks2019.mdf')
 LOG ON 
( NAME = N'AdventureWorks2017_log', FILENAME = N'F:\SQLTLog1\AdventureWorks2019_log.ldf')
 FOR ATTACH
GO

Confirm that the data is correct in the newly attached database: –

SELECT * FROM [AdventureWorks2019_REATTACH].[HumanResources].[Employee]

Looks good! Now it’s just a matter of updating the data in the original database: –

UPDATE e
SET e.JobTitle = e1.JobTitle
FROM [AdventureWorks2019].[HumanResources].[Employee] e
INNER JOIN [AdventureWorks2019_REATTACH].[HumanResources].[Employee] e1
ON e.BusinessEntityID = e1.BusinessEntityID
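As a quick sanity check (my addition, not part of the original walkthrough), a query like this should return a count of zero once the update has run…i.e. no rows still differ between the original database and the reattached copy: –

```sql
-- Count rows where the original database still disagrees with the reattached copy
SELECT COUNT(*) AS RowsStillDiffering
FROM [AdventureWorks2019].[HumanResources].[Employee] e
INNER JOIN [AdventureWorks2019_REATTACH].[HumanResources].[Employee] e1
    ON e.BusinessEntityID = e1.BusinessEntityID
WHERE e.JobTitle <> e1.JobTitle;
```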

And with that, the data in the original database is corrected. That’s how to use snapshots to recover data that has been incorrectly altered.

Thanks for reading!

Granting read access to SQL Server services with Just Enough Administration

We’ve all been there as DBAs…people requesting access to the servers that we look after to be able to view certain things.

I’ve always gotten, well, twitchy about giving people access to servers, tbh…but what if we could completely restrict what users can do via powershell?

Enter Just Enough Administration. With JEA we can grant remote access via powershell sessions to servers and limit what users can do.

So let’s run through an example. Here we’re going to create a configuration to allow users to view the status of the MSSQLSERVER and SQLSERVERAGENT services only.

Firstly, let’s create a session configuration file: –

New-PSSessionConfigurationFile -SessionType RestrictedRemoteServer -Path .\JeaSqlConfig.pssc

I stripped out pretty much all the default settings in the file to leave it as this: –

@{

# Version number of the schema used for this document
SchemaVersion = '2.0.0.0'

# ID used to uniquely identify this document
GUID = '60732de2-33cc-420b-a745-5596b27cf761'

# Author of this document
Author = 'Andrew Pruski'

# Description of the functionality provided by these settings
Description = 'Allow users to view the SQL Server services'

# Session type defaults to apply for this session configuration. Can be 'RestrictedRemoteServer' (recommended), 'Empty', or 'Default'
SessionType = 'RestrictedRemoteServer'

# Directory to place session transcripts for this session configuration
TranscriptDirectory = 'C:\JEA\SQLSERVER'

# Whether to run this session configuration as the machine's (virtual) administrator account
RunAsVirtualAccount = $true

# User roles (security groups), and the role capabilities that should be applied to them when applied to a session
RoleDefinitions = @{ 'DOMAIN\testuser' = @{ RoleCapabilityFiles = 'C:\JEA\SQLSERVER\JeaSqlConfig.psrc' }; } 

}

The important part in the file is this: –

RoleDefinitions = @{ 'DOMAIN\testuser' = @{ RoleCapabilityFiles = 'C:\JEA\SQLSERVER\JeaSqlConfig.psrc' }; }

This defines the user(s) that have access to the server…in this case [DOMAIN\testuser]. However, we haven’t yet set what that user can do…for that we need a role capability file. Also note, we’re not granting any other permissions on the server to this user…permissions and capabilities are defined solely in the JEA config files.

Now create the role capability file: –

New-PSRoleCapabilityFile -Path .\JeaSqlConfig.psrc

Again, I stripped out all the defaults and left the file as: –

@{

# ID used to uniquely identify this document
GUID = '44de606d-8ac0-4b27-bd3f-07cad2378717'

# Author of this document
Author = 'apruski'

# Description of the functionality provided by these settings
Description = 'JEA Role Capability File for SQL Server Services'

# Company associated with this document
CompanyName = 'Pure Storage'

# Copyright statement for this document
Copyright = '(c) 2022 Andrew Pruski. All rights reserved.'

# Cmdlets to make visible when applied to a session
VisibleCmdlets = @{ Name = 'Get-Service'; Parameters = @{ Name = 'Name'; ValidateSet = 'MSSQLSERVER', 'SQLSERVERAGENT' }}

}

The last part of the file is again the important part. Here the file says that the user can run the Get-Service cmdlet on the target server, with the -Name parameter restricted by a ValidateSet to the MSSQLSERVER and SQLSERVERAGENT services only.
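If you did want to allow restarts as well (a hypothetical extension, not part of this read-only demo), VisibleCmdlets accepts an array, so the same ValidateSet restriction can be applied to Restart-Service: –

```powershell
# Hypothetical: also expose Restart-Service, restricted to the same two services
VisibleCmdlets = @(
    @{ Name = 'Get-Service';     Parameters = @{ Name = 'Name'; ValidateSet = 'MSSQLSERVER', 'SQLSERVERAGENT' } },
    @{ Name = 'Restart-Service'; Parameters = @{ Name = 'Name'; ValidateSet = 'MSSQLSERVER', 'SQLSERVERAGENT' } }
)
```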

OK, now copy the files to the target server: –

Invoke-Command -ComputerName <<SERVERNAME>> -ScriptBlock {New-Item C:\JEA\SQLSERVER -Type Directory}

Copy-Item .\JeaSqlConfig* \\<<SERVERNAME>>\C$\JEA\SQLSERVER

OK, now we can create the configuration on the target server. A word of warning, however…this can be done via a remote powershell session but it sometimes errors out. If you get an error, RDP to the server and run the command there: –

Register-PSSessionConfiguration -Name SqlConfig -Path C:\JEA\SQLSERVER\JeaSqlConfig.pssc

To view the configuration: –

Get-PSSessionConfiguration -Name SqlConfig


N.B. – I’m using my lab for my Chaos Engineering demos to set this up, which is why the domain for the user is “Chaos” 🙂

OK, now we can test. Create a credential for the test user and open a remote powershell session to the target server: –

$Cred = Get-Credential
Enter-PSSession -ComputerName <<SERVERNAME>> -ConfigurationName SqlConfig -Credential $Cred

And then test viewing the MSSQLSERVER service: –

Get-Service -Name MSSQLSERVER

The results should display as normal. However, if the user tries to view any other service, an error will be displayed…and similarly if they try to run a different cmdlet.

And that’s how to use JEA to give users access to view ONLY the SQL Server services on a target server. That’s a very basic demo of JEA…there’s a tonne of cool stuff that you can do with it, but I hope that’s useful.

Thanks for reading!