EightKB 2025

EightKB is back!

The biggest online SQL Server internals conference is back in 2025…it’s all happening on August the 21st!

We’ve open our call for speakers, you can submit here: –
https://sessionize.com/eightkb-august-2025/

We’re looking for experts, not necessarily expert speakers. If you haven’t presented before we offer mentoring as part of our speaker program to help you prepare for your session so that you can enjoy presenting on the day.

As a speaker this is your chance to really go all out! If you’ve ever wanted to deep dive into a topic, this is the event to do so. No topic is too advanced…you can do as many (or as little or none at all!) demos as you would like. Field questions during the session or respond after the event…completely up to you.

One thing we are doing slightly different this year is that only four of the sessions have to focus on SQL Server internals. The fifth session can be on ANY TECH TOPIC YOU LIKE, as long as it’s 300 level and above!

As ever, speakers do not have to use a slide template, and we don’t ask for speakers to add our logo to their deck. We just want you to turn up and enjoy presenting!

After the event, we’ll provide feedback of your session from the attendees and an unbranded video of your session that you can use however you would like.

And back by popular demand, our mind-melting t-shirt and hoodie range! EightKB is 100% funded by the sales of these, so if you have enjoyed our previous conferences…please consider buying a shirt!

https://www.bonfire.com/store/eightkb/

Hope to see you there!

Visualising SQL Server in Kubernetes

The other day I came across an interesting repo on github, KubeDiagrams.

What this repo does is generate Kubernetes architecture diagrams from Kubernetes manifest files…nice!

Deploying applications to Kubernetes can get complicated fast…especially with stateful applications such as SQL Server.

So having the ability to easily generate diagrams is really helpful…because we all should be documenting everything, right? 🙂

Plus I’m rubbish at creating diagrams!

So let’s have a look at how this works. First, install the dependencies via pip: –

pip install pyyaml
pip install diagrams

And install graphviz: –

sudo apt install graphviz

Great, now pull down the repo: –

git clone https://github.com/philippemerle/KubeDiagrams.git

And we’re good to go! So here’s an example manifest file to deploy a SQL Server statefulset to Kubernetes: –

apiVersion: v1
kind: Namespace
metadata:
  name: mssql
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: mssql-sc
provisioner: docker.io/hostpath
reclaimPolicy: Delete
volumeBindingMode: Immediate
---
apiVersion: v1
kind: Service
metadata:
  name: mssql-headless
  namespace: mssql
spec:
  clusterIP: None
  selector:
    name: mssql-pod
  ports:
    - name: mssql-port
      port: 1433
      targetPort: 1433
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mssql-statefulset
  namespace: mssql
spec:
  serviceName: "mssql-headless"
  replicas: 1
  podManagementPolicy: Parallel
  selector:
    matchLabels:
      name: mssql-pod
  template:
    metadata:
      labels:
        name: mssql-pod
    spec:
      securityContext:
        fsGroup: 10001
      containers:
        - name: mssql-container
          image: mcr.microsoft.com/mssql/server:2022-CU16-ubuntu-20.04
          ports:
            - containerPort: 1433
              name: mssql-port
          env:
            - name: MSSQL_PID
              value: "Developer"
            - name: ACCEPT_EULA
              value: "Y"
            - name: MSSQL_AGENT_ENABLED
              value: "1"
            - name: MSSQL_SA_PASSWORD
              value: "Testing1122"
          resources:
            requests:
              memory: "1024Mi"
              cpu: "500m"
            limits:
              memory: "2048Mi"
              cpu: "2000m"
          volumeMounts:
            - name: sqlsystem
              mountPath: /var/opt/mssql
            - name: sqldata
              mountPath: /opt/sqlserver/data
            - name: sqllog
              mountPath: /opt/sqlserver/log
  volumeClaimTemplates:
    - metadata:
        name: sqlsystem
        namespace: mssql
      spec:
        accessModes:
         - ReadWriteOncePod
        storageClassName: mssql-sc
        resources:
          requests:
            storage: 25Gi
    - metadata:
        name: sqldata
        namespace: mssql
      spec:
        accessModes:
         - ReadWriteOncePod
        storageClassName: mssql-sc
        resources:
          requests:
            storage: 25Gi
    - metadata:
        name: sqllog
        namespace: mssql
      spec:
        accessModes:
         - ReadWriteOncePod
        storageClassName: mssql-sc
        resources:
          requests:
            storage: 25Gi
---
apiVersion: v1
kind: Service
metadata:
  name: mssql-service
  namespace: mssql
spec:
  ports:
  - name: mssql-ports
    port: 1433
    targetPort: 1433
  selector:
    name: mssql-pod
  type: LoadBalancer

I’m using Docker Desktop here (hence the provisioner: docker.io/hostpath in the storage class). What this’ll create is a namespace, storage class, headless service for the statefulset, the statefulset itself, three persistent volume claims, and a load balanced service to connect to SQL.

Quite a lot of objects for a simple SQL Server deployment, right? (ahh I know it’s a statefulset, but you know what I mean)

So let’s point KubeDiagrams at the manifest: –

./kube-diagrams mssql-statefulset.yaml

And here’s the output!

Pretty cool, eh?

I noticed a couple of quirks. The docs say it’ll work with any version 3 install of python. I had 3.8 installed but had to upgrade to 3.9.

Also I had to add namespace: mssql to the PVCs in the statefulset, otherwise KubeDiagrams threw a warning: –

Error: ‘sqlsystem/mssql/PersistentVolumeClaim/v1’ resource not found!
Error: ‘sqldata/mssql/PersistentVolumeClaim/v1’ resource not found!
Error: ‘sqllog/mssql/PersistentVolumeClaim/v1’ resource not found

But other than those, it works really well and is a great way to visualise objects in Kubernetes.

Massive thank you to the creator, Philippe Merle!

Thanks for reading!

Side Projects

Anyone (everyone?) who has ever tried to learn a programming language knows that to really learn, you need a project.

I’m a DBA by trade and as such…haven’t really spent that much time learning programming languages so over the last year I’ve been playing around with a couple of projects using Python and Go. AI is everywhere at the moment so I thought I’d throw that in there because, well, why not?

So here I thought I’d briefly show two projects that when I have some free time (ha ha ha) I bounce from error to error with.

I’ve linked the gihub repos here as well…so you can have a look at the code (be kind). One thing I will admit to from the start…ChatGPT has 100% been used. And why not?

It’s a tool that’s available, and allows someone like me to get something working that I can then improve on. Although I will say…it doesn’t half spit out some rubbish sometimes…I think I’ve spent most of my time working out why the code it generates doesn’t work!

But, eventually, I got a both projects into a working demo state…so here they are!

Prusk-sql

The first one is a command line tool to connect to a SQL instance in a container, display the SQL version, and have an optional flag to list the databases.

OK, so there are other (better) tools out there to do this BUT this came around as I’ve spent quite a bit of time working with Helm and Kubernetes, both of which are written in Go…so I thought I’d learn a little bit!

When I’m testing SQL Server in Kubernetes/containers the first thing I do is confirm the instance is up and running by retrieving the SQL version and then list the databases present. So this helps by combining those checks into an easy one-liner…

https://github.com/dbafromthecold/prusk-sql

Burrito-Bot

If I didn’t know what I was doing with prusk-sql, I really don’t know what I’m doing here!

The idea is to have a chat bot that I can ask for burrito recommendations (I know, I know…niche or what?). The basic plan is to load data from Yelp from all restaurants in a city (starting with Dublin…but soon…THE WORLD…or maybe Cork) and then ping an LLM to respond to any input from the user.

And it works! Kinda…there’s a lot of work to be done here. I want to strip out pretty much all of the existing code and replace with something like python ChatterBot. I also need to work out how I can get it to give me recommendations for restaurants in a specific area of the city.

But for now, I’m just happy that it doesn’t spam errors!

https://github.com/dbafromthecold/burrito-bot

Anyhoo, those are the two projects I’ve been dabbling with…I hope that you are having fun with your side projects. May your errors be clear and actually point to the problem 🙂

Thanks for reading!

SQL Server Distributed Availability Groups and Kubernetes

A while back I wrote about how to use a Cross Platform (or Clusterless) Availability Group to seed a database from a Windows SQL instance into a pod in Kubernetes.

I was talking with a colleague last week and they asked, “What if the existing Windows instance is already in an Availability Group?”

This is a fair question, as it’s fairly rare (in my experience) to run a standalone SQL instance in production…most instances are in some form of HA setup, be it a Failover Cluster Instance or an Availability Group.

Failover Cluster Instances will work with a clusterless Availability Group but it’s a different story when it comes to existing Availability Groups.

A Linux node cannot be added to an existing Windows Availability Group (trust me, I tried for longer than I’m going to admit) so the only way to do it is to use a Distributed Availability Group.

So let’s run through the process!

Here is the existing Windows Availability Group: –

Just a standard, 2 node AG with one database already synchronized across the nodes. It’s that database we are going to seed over to the pod running on the Kubernetes cluster using a Distributed Availability Group.

So here’s the Kubernetes cluster: –

kubectl get nodes

Four nodes, one control plane node and three worker nodes.

OK, so first thing to do is deploy a statefulset running one SQL Server pod (using a file called sqlserver-statefulset.yaml): –

kubectl apply -f .\sqlserver-statefulset.yaml

Here’s the manifest of the statefulset: –

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mssql-statefulset
spec:
  serviceName: "mssql"
  replicas: 1
  podManagementPolicy: Parallel
  selector:
    matchLabels:
      name: mssql-pod
  template:
    metadata:
      labels:
        name: mssql-pod
      annotations:
        stork.libopenstorage.org/disableHyperconvergence: "true"
    spec:
      securityContext:
        fsGroup: 10001
      hostAliases:
      - ip: "10.225.115.129"
        hostnames:
        - "z-ap-sql-10"
      containers:
        - name: mssql-container
          image: mcr.microsoft.com/mssql/server:2022-CU15-ubuntu-20.04
          ports:
            - containerPort: 1433
              name: mssql-port
          env:
            - name: MSSQL_PID
              value: "Developer"
            - name: ACCEPT_EULA
              value: "Y"
            - name: MSSQL_AGENT_ENABLED
              value: "1"
            - name: MSSQL_ENABLE_HADR
              value: "1"
            - name: MSSQL_SA_PASSWORD
              value: "Testing1122"
          volumeMounts:
            - name: sqlsystem
              mountPath: /var/opt/mssql
            - name: sqldata
              mountPath: /var/opt/sqlserver/data
  volumeClaimTemplates:
    - metadata:
        name: sqlsystem
      spec:
        accessModes:
         - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
        storageClassName: mssql-sc
    - metadata:
        name: sqldata
      spec:
        accessModes:
         - ReadWriteOnce
        resources:
          requests:
            storage: 25Gi
        storageClassName: mssql-sc

Like my last post, this is pretty stripped down. No resources limits, tolerations etc. It has two persistent volumes, one for the system databases and one for the user databases from a storage class already configured in the cluster.

One thing to note: –

hostAliases:
- ip: "10.225.115.129"
  hostnames:
  - "z-ap-sql-10"

Here an entry in the pod’s hosts file is being created for the listener of the Windows Availability Group.

Next thing to do is deploy two services, one so that we can connect to the SQL instance (on port 1433) and one for the AG (port 5022): –

kubectl apply -f .\sqlserver-services.yaml

Here’s the manifest for the services: –

apiVersion: v1
kind: Service
metadata:
  name: mssql-service
spec:
  ports:
  - name: mssql-ports
    port: 1433
    targetPort: 1433
  selector:
    name: mssql-pod
  type: LoadBalancer
---
apiVersion: v1
kind: Service
metadata:
  name: mssql-ha-service
spec:
  ports:
  - name: mssql-ha-ports
    port: 5022
    targetPort: 5022
  selector:
    name: mssql-pod
  type: LoadBalancer

NOTE – We could use just one service with multiple ports configured but I’m keeping them separate here to try and keep things as clear as possible.

Check that everything looks OK: –

kubectl get all

Now we need to create master key, login, and user in all instances: –

CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<C0m9L3xP@55w0rd!>';
CREATE LOGIN [dbm_login] WITH PASSWORD = '<C0m9L3xP@55w0rd!>';
CREATE USER dbm_user FOR LOGIN dbm_login;

Then create a certificate in the SQL instance in the pod: –

CREATE CERTIFICATE dbm_certificate WITH SUBJECT = 'Mirroring_certificate', EXPIRY_DATE = '20301031'

Backup that certificate: –

BACKUP CERTIFICATE dbm_certificate
TO FILE = '/var/opt/mssql/data/dbm_certificate.cer'
WITH PRIVATE KEY (
        FILE = '/var/opt/mssql/data/dbm_certificate.pvk',
        ENCRYPTION BY PASSWORD = '<C0m9L3xP@55w0rd!>'
    );

Copy the certificate locally: –

kubectl cp mssql-statefulset-0:var/opt/mssql/data/dbm_certificate.cer ./dbm_certificate.cer
kubectl cp mssql-statefulset-0:var/opt/mssql/data/dbm_certificate.pvk ./dbm_certificate.pvk

And then copy the files to the Windows boxes: –

Copy-Item dbm_certificate.cer \\z-ap-sql-02\E$\SQLBackup1\ -Force
Copy-Item dbm_certificate.pvk \\z-ap-sql-02\E$\SQLBackup1\ -Force
Copy-Item dbm_certificate.cer \\z-ap-sql-03\E$\SQLBackup1\ -Force
Copy-Item dbm_certificate.pvk \\z-ap-sql-03\E$\SQLBackup1\ -Force

Once the files are on the Windows boxes, we can create the certificate in each Windows SQL instance: –

CREATE CERTIFICATE dbm_certificate
    AUTHORIZATION dbm_user
    FROM FILE = 'E:\SQLBackup1\dbm_certificate.cer'
    WITH PRIVATE KEY (
    FILE = 'E:\SQLBackup1\dbm_certificate.pvk',
    DECRYPTION BY PASSWORD = '<C0m9L3xP@55w0rd!>'
)

OK, great! Now we need to create a mirroring endpoint in the SQL instance in the pod: –

CREATE ENDPOINT [Hadr_endpoint]
    STATE = STARTED
    AS TCP (   
        LISTENER_PORT = 5022,
        LISTENER_IP = ALL)
    FOR DATA_MIRRORING (
        ROLE = ALL,
        AUTHENTICATION = WINDOWS CERTIFICATE [dbm_certificate],
        ENCRYPTION = REQUIRED ALGORITHM AES
        );
ALTER ENDPOINT [Hadr_endpoint] STATE = STARTED;
GRANT CONNECT ON ENDPOINT::[Hadr_endpoint] TO [dbm_login];

There are already endpoints in the Windows instances, but we need to update them to use the certificate for authentication: –

ALTER ENDPOINT [Hadr_endpoint]
    STATE = STARTED
    AS TCP (   
        LISTENER_PORT = 5022,
        LISTENER_IP = ALL)
    FOR DATABASE_MIRRORING (
    AUTHENTICATION = WINDOWS CERTIFICATE [dbm_certificate]
    , ENCRYPTION = REQUIRED ALGORITHM AES
    );
GRANT CONNECT ON ENDPOINT::[Hadr_endpoint] TO [dbm_login];

Now we can create a one node Clusterless Availability Group in the SQL instance in the pod: –

CREATE AVAILABILITY GROUP [AG2]
   WITH (CLUSTER_TYPE=NONE)
   FOR
   REPLICA ON   
      'mssql-statefulset-0' WITH   
         (  
         ENDPOINT_URL = 'TCP://mssql-statefulset-0.com:5022',   
		 FAILOVER_MODE = MANUAL
		,AVAILABILITY_MODE = SYNCHRONOUS_COMMIT
		,BACKUP_PRIORITY = 50
		,SEEDING_MODE = AUTOMATIC
		,SECONDARY_ROLE(ALLOW_CONNECTIONS = NO)
	)    

No listener here, we are going to use the mssql-ha-service as the endpoint for the Distributed Availability Group.

OK, so on the primary node of the Windows Availability Group, we can create the Distributed Availability Group: –

CREATE AVAILABILITY GROUP [DistributedAG]  
   WITH (DISTRIBUTED)   
   AVAILABILITY GROUP ON  
      'AG1' WITH    
      (   
         LISTENER_URL = 'tcp://Z-AP-SQL-10:5022',    
         AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,   
         FAILOVER_MODE = MANUAL,   
         SEEDING_MODE = AUTOMATIC   
      ),   
      'AG2' WITH    
      (   
         LISTENER_URL = 'tcp://10.225.115.131:5022',   
         AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,   
         FAILOVER_MODE = MANUAL,   
         SEEDING_MODE = AUTOMATIC   
      );    

We could use a host file entry for the URL in AG2 (I did that in the previous post) but here we’ll just use the IP address of the mssql-ha-service.

OK, nearly there! We now have to join the Availability Group in the SQL instance in the pod: –

ALTER AVAILABILITY GROUP [DistributedAG]  
   JOIN   
   AVAILABILITY GROUP ON  
      'AG1' WITH    
      (   
         LISTENER_URL = 'tcp://Z-AP-SQL-10:5022',    
         AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,   
         FAILOVER_MODE = MANUAL,   
         SEEDING_MODE = AUTOMATIC   
      ),   
      'AG2' WITH    
      (   
         LISTENER_URL = 'tcp://10.225.115.131:5022',   
         AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,   
         FAILOVER_MODE = MANUAL,   
         SEEDING_MODE = AUTOMATIC   
      );    

And that should be it! If we now connect to the SQL instance in the pod…the database is there!

There it is! OK, one thing I haven’t gone through here is how to get auto-seeding working from Windows into a Linux SQL instance. I went through how that works in my previous post (here) but the gist is as long as the database data and log files are located under the Windows SQL instance’s default data and log path…they will auto-seed to the Linux SQL instance’s default data and log paths.

So that’s how to seed a database from a SQL instance that is in a Windows Availability Group into a SQL instance running in a pod in a Kubernetes cluster using a Distributed Availability Group!

Phew! Thanks for reading!

Key mappings in Visual Studio Code

At a recent conference I was asked how I executed code during my presentation.

Running code during a presentation should be as smooth as possible…so in VS Code I have a couple of custom key mappings set up and I thought I’d write a quick blog detailing what they are and what they do.

This is the first set: –

    {
        "key": "ctrl+/",
        "command": "workbench.action.terminal.focus",
        "when": "!terminalFocus"
    },
    {
        "key": "ctrl+/",
        "command": "workbench.action.focusActiveEditorGroup",
        "when": "terminalFocus"
    }

What this does is allow me to switch from an editor window to the terminal and back again using Ctrl+/

Like this…

The next key mapping I have is…

    {
        "key": "ctrl+;",
        "command": "extension.multiCommand.execute",
        "args": {
            "sequence": [
                "workbench.action.terminal.runSelectedText",
                "workbench.action.focusActiveEditorGroup"
            ]
        }
    }

This requires an extension called multi-command and what it allows me to do is select code in an editor and execute it in the terminal by pressing Ctrl+;

It also allows for multiple lines of code to be highlighed but if nothing is highlighted, it’ll just run whatever is on the existing line. Once the code is executed, it’ll return the cursor to the editor (instead of remaining in the terminal).

Like this…

These key mappings allow me to run through code in presentations nice and smoothly…so the session can focus on the code itself and not me stumbling around trying to run it!

You can set up these key mappings in your install of VS Code by opening the command palette and searching for “Preferences: Open Keyboard Shortcuts (JSON)”. This’ll open a keybindings.json file in the editor.

I’ve dropped my full file into a Gist here.

Thanks for reading!