Side Projects

Anyone (everyone?) who has ever tried to learn a programming language knows that to really learn, you need a project.

I’m a DBA by trade and as such…haven’t really spent that much time learning programming languages, so over the last year I’ve been playing around with a couple of projects using Python and Go. AI is everywhere at the moment, so I thought I’d throw that in there too because, well, why not?

So here I thought I’d briefly show the two projects that, when I have some free time (ha ha ha), I bounce from error to error with.

I’ve linked the GitHub repos here as well…so you can have a look at the code (be kind). One thing I will admit to from the start…ChatGPT has 100% been used. And why not?

It’s a tool that’s available, and allows someone like me to get something working that I can then improve on. Although I will say…it doesn’t half spit out some rubbish sometimes…I think I’ve spent most of my time working out why the code it generates doesn’t work!

But, eventually, I got both projects into a working demo state…so here they are!

Prusk-sql

The first one is a command line tool to connect to a SQL instance in a container, display the SQL version, and have an optional flag to list the databases.

OK, so there are other (better) tools out there to do this BUT this came around as I’ve spent quite a bit of time working with Helm and Kubernetes, both of which are written in Go…so I thought I’d learn a little bit!

When I’m testing SQL Server in Kubernetes/containers the first thing I do is confirm the instance is up and running by retrieving the SQL version and then list the databases present. So this helps by combining those checks into an easy one-liner…

https://github.com/dbafromthecold/prusk-sql

Burrito-Bot

If I didn’t know what I was doing with prusk-sql, I really don’t know what I’m doing here!

The idea is to have a chat bot that I can ask for burrito recommendations (I know, I know…niche or what?). The basic plan is to load data from Yelp from all restaurants in a city (starting with Dublin…but soon…THE WORLD…or maybe Cork) and then ping an LLM to respond to any input from the user.

And it works! Kinda…there’s a lot of work to be done here. I want to strip out pretty much all of the existing code and replace it with something like Python’s ChatterBot library. I also need to work out how to get it to give me recommendations for restaurants in a specific area of the city.

But for now, I’m just happy that it doesn’t spam errors!

https://github.com/dbafromthecold/burrito-bot

Anyhoo, those are the two projects I’ve been dabbling with…I hope that you are having fun with your side projects. May your errors be clear and actually point to the problem 🙂

Thanks for reading!

SQL Server Distributed Availability Groups and Kubernetes

A while back I wrote about how to use a Cross Platform (or Clusterless) Availability Group to seed a database from a Windows SQL instance into a pod in Kubernetes.

I was talking with a colleague last week and they asked, “What if the existing Windows instance is already in an Availability Group?”

It’s a fair question, as it’s rare (in my experience) to run a standalone SQL instance in production…most instances are in some form of HA setup, be it a Failover Cluster Instance or an Availability Group.

Failover Cluster Instances will work with a clusterless Availability Group but it’s a different story when it comes to existing Availability Groups.

A Linux node cannot be added to an existing Windows Availability Group (trust me, I tried for longer than I’m going to admit) so the only way to do it is to use a Distributed Availability Group.

So let’s run through the process!

Here is the existing Windows Availability Group: –

Just a standard, 2 node AG with one database already synchronized across the nodes. It’s that database we are going to seed over to the pod running on the Kubernetes cluster using a Distributed Availability Group.

So here’s the Kubernetes cluster: –

kubectl get nodes

Four nodes, one control plane node and three worker nodes.

OK, so first thing to do is deploy a statefulset running one SQL Server pod (using a file called sqlserver-statefulset.yaml): –

kubectl apply -f .\sqlserver-statefulset.yaml

Here’s the manifest of the statefulset: –

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mssql-statefulset
spec:
  serviceName: "mssql"
  replicas: 1
  podManagementPolicy: Parallel
  selector:
    matchLabels:
      name: mssql-pod
  template:
    metadata:
      labels:
        name: mssql-pod
      annotations:
        stork.libopenstorage.org/disableHyperconvergence: "true"
    spec:
      securityContext:
        fsGroup: 10001
      hostAliases:
      - ip: "10.225.115.129"
        hostnames:
        - "z-ap-sql-10"
      containers:
        - name: mssql-container
          image: mcr.microsoft.com/mssql/server:2022-CU15-ubuntu-20.04
          ports:
            - containerPort: 1433
              name: mssql-port
          env:
            - name: MSSQL_PID
              value: "Developer"
            - name: ACCEPT_EULA
              value: "Y"
            - name: MSSQL_AGENT_ENABLED
              value: "1"
            - name: MSSQL_ENABLE_HADR
              value: "1"
            - name: MSSQL_SA_PASSWORD
              value: "Testing1122"
          volumeMounts:
            - name: sqlsystem
              mountPath: /var/opt/mssql
            - name: sqldata
              mountPath: /var/opt/sqlserver/data
  volumeClaimTemplates:
    - metadata:
        name: sqlsystem
      spec:
        accessModes:
         - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
        storageClassName: mssql-sc
    - metadata:
        name: sqldata
      spec:
        accessModes:
         - ReadWriteOnce
        resources:
          requests:
            storage: 25Gi
        storageClassName: mssql-sc

Like my last post, this is pretty stripped down. No resource limits, tolerations, etc. It has two persistent volumes, one for the system databases and one for the user databases, from a storage class already configured in the cluster.

One thing to note: –

hostAliases:
- ip: "10.225.115.129"
  hostnames:
  - "z-ap-sql-10"

Here an entry in the pod’s hosts file is being created for the listener of the Windows Availability Group.

Next thing to do is deploy two services, one so that we can connect to the SQL instance (on port 1433) and one for the AG (port 5022): –

kubectl apply -f .\sqlserver-services.yaml

Here’s the manifest for the services: –

apiVersion: v1
kind: Service
metadata:
  name: mssql-service
spec:
  ports:
  - name: mssql-ports
    port: 1433
    targetPort: 1433
  selector:
    name: mssql-pod
  type: LoadBalancer
---
apiVersion: v1
kind: Service
metadata:
  name: mssql-ha-service
spec:
  ports:
  - name: mssql-ha-ports
    port: 5022
    targetPort: 5022
  selector:
    name: mssql-pod
  type: LoadBalancer

NOTE – We could use just one service with multiple ports configured but I’m keeping them separate here to try and keep things as clear as possible.
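For reference, here’s a sketch of what that single combined service could look like (I’m sticking with the two separate services for this post, and the mssql-combined-service name is my own): –

```yaml
apiVersion: v1
kind: Service
metadata:
  name: mssql-combined-service
spec:
  ports:
  - name: mssql-ports
    port: 1433
    targetPort: 1433
  - name: mssql-ha-ports
    port: 5022
    targetPort: 5022
  selector:
    name: mssql-pod
  type: LoadBalancer
```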

Check that everything looks OK: –

kubectl get all

Now we need to create a master key, login, and user in all instances: –

CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<C0m9L3xP@55w0rd!>';
CREATE LOGIN [dbm_login] WITH PASSWORD = '<C0m9L3xP@55w0rd!>';
CREATE USER dbm_user FOR LOGIN dbm_login;

Then create a certificate in the SQL instance in the pod: –

CREATE CERTIFICATE dbm_certificate
    WITH SUBJECT = 'Mirroring_certificate',
    EXPIRY_DATE = '20301031';

Back up that certificate: –

BACKUP CERTIFICATE dbm_certificate
TO FILE = '/var/opt/mssql/data/dbm_certificate.cer'
WITH PRIVATE KEY (
        FILE = '/var/opt/mssql/data/dbm_certificate.pvk',
        ENCRYPTION BY PASSWORD = '<C0m9L3xP@55w0rd!>'
    );

Copy the certificate locally: –

kubectl cp mssql-statefulset-0:/var/opt/mssql/data/dbm_certificate.cer ./dbm_certificate.cer
kubectl cp mssql-statefulset-0:/var/opt/mssql/data/dbm_certificate.pvk ./dbm_certificate.pvk

And then copy the files to the Windows boxes: –

Copy-Item dbm_certificate.cer \\z-ap-sql-02\E$\SQLBackup1\ -Force
Copy-Item dbm_certificate.pvk \\z-ap-sql-02\E$\SQLBackup1\ -Force
Copy-Item dbm_certificate.cer \\z-ap-sql-03\E$\SQLBackup1\ -Force
Copy-Item dbm_certificate.pvk \\z-ap-sql-03\E$\SQLBackup1\ -Force

Once the files are on the Windows boxes, we can create the certificate in each Windows SQL instance: –

CREATE CERTIFICATE dbm_certificate
    AUTHORIZATION dbm_user
    FROM FILE = 'E:\SQLBackup1\dbm_certificate.cer'
    WITH PRIVATE KEY (
    FILE = 'E:\SQLBackup1\dbm_certificate.pvk',
    DECRYPTION BY PASSWORD = '<C0m9L3xP@55w0rd!>'
)

OK, great! Now we need to create a mirroring endpoint in the SQL instance in the pod: –

CREATE ENDPOINT [Hadr_endpoint]
    STATE = STARTED
    AS TCP (   
        LISTENER_PORT = 5022,
        LISTENER_IP = ALL)
    FOR DATA_MIRRORING (
        ROLE = ALL,
        AUTHENTICATION = WINDOWS CERTIFICATE [dbm_certificate],
        ENCRYPTION = REQUIRED ALGORITHM AES
        );
ALTER ENDPOINT [Hadr_endpoint] STATE = STARTED;
GRANT CONNECT ON ENDPOINT::[Hadr_endpoint] TO [dbm_login];

There are already endpoints in the Windows instances, but we need to update them to use the certificate for authentication: –

ALTER ENDPOINT [Hadr_endpoint]
    STATE = STARTED
    AS TCP (   
        LISTENER_PORT = 5022,
        LISTENER_IP = ALL)
    FOR DATABASE_MIRRORING (
    AUTHENTICATION = WINDOWS CERTIFICATE [dbm_certificate]
    , ENCRYPTION = REQUIRED ALGORITHM AES
    );
GRANT CONNECT ON ENDPOINT::[Hadr_endpoint] TO [dbm_login];
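At this point it’s worth confirming that the endpoints are started and listening on port 5022. This query is an optional check of my own (run it on each instance), not part of the original setup: –

```sql
SELECT name, state_desc, port
FROM sys.tcp_endpoints
WHERE type_desc = 'DATABASE_MIRRORING';
```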

Now we can create a one-node clusterless Availability Group in the SQL instance in the pod: –

CREATE AVAILABILITY GROUP [AG2]
   WITH (CLUSTER_TYPE = NONE)
   FOR
   REPLICA ON
      'mssql-statefulset-0' WITH
         (
         ENDPOINT_URL = 'TCP://mssql-statefulset-0:5022',
         FAILOVER_MODE = MANUAL,
         AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
         BACKUP_PRIORITY = 50,
         SEEDING_MODE = AUTOMATIC,
         SECONDARY_ROLE(ALLOW_CONNECTIONS = NO)
         );

No listener here, we are going to use the mssql-ha-service as the endpoint for the Distributed Availability Group.

OK, so on the primary node of the Windows Availability Group, we can create the Distributed Availability Group: –

CREATE AVAILABILITY GROUP [DistributedAG]  
   WITH (DISTRIBUTED)   
   AVAILABILITY GROUP ON  
      'AG1' WITH    
      (   
         LISTENER_URL = 'tcp://Z-AP-SQL-10:5022',    
         AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,   
         FAILOVER_MODE = MANUAL,   
         SEEDING_MODE = AUTOMATIC   
      ),   
      'AG2' WITH    
      (   
         LISTENER_URL = 'tcp://10.225.115.131:5022',   
         AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,   
         FAILOVER_MODE = MANUAL,   
         SEEDING_MODE = AUTOMATIC   
      );    

We could use a host file entry for the URL in AG2 (I did that in the previous post) but here we’ll just use the IP address of the mssql-ha-service.

OK, nearly there! We now have to join the Availability Group in the SQL instance in the pod: –

ALTER AVAILABILITY GROUP [DistributedAG]  
   JOIN   
   AVAILABILITY GROUP ON  
      'AG1' WITH    
      (   
         LISTENER_URL = 'tcp://Z-AP-SQL-10:5022',    
         AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,   
         FAILOVER_MODE = MANUAL,   
         SEEDING_MODE = AUTOMATIC   
      ),   
      'AG2' WITH    
      (   
         LISTENER_URL = 'tcp://10.225.115.131:5022',   
         AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,   
         FAILOVER_MODE = MANUAL,   
         SEEDING_MODE = AUTOMATIC   
      );    
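With both sides configured, the state of the Distributed Availability Group can be checked from the Windows primary. Again, an optional check of my own: –

```sql
SELECT ag.name,
       ag.is_distributed,
       ars.role_desc,
       ars.connected_state_desc,
       ars.synchronization_health_desc
FROM sys.availability_groups ag
INNER JOIN sys.dm_hadr_availability_replica_states ars
    ON ag.group_id = ars.group_id
WHERE ag.is_distributed = 1;
```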

And that should be it! If we now connect to the SQL instance in the pod…the database is there!

OK, one thing I haven’t gone through here is how to get auto-seeding working from Windows into a Linux SQL instance. I went through how that works in my previous post (here), but the gist is: as long as the database’s data and log files are located under the Windows SQL instance’s default data and log paths, they will auto-seed to the Linux SQL instance’s default data and log paths.
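If the database doesn’t show up straight away, automatic seeding progress can be monitored on either side with this query (an optional check, not part of the original steps): –

```sql
SELECT ag.name,
       s.start_time,
       s.completion_time,
       s.current_state,
       s.failure_state_desc
FROM sys.dm_hadr_automatic_seeding s
INNER JOIN sys.availability_groups ag
    ON s.ag_id = ag.group_id;
```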

So that’s how to seed a database from a SQL instance that is in a Windows Availability Group into a SQL instance running in a pod in a Kubernetes cluster using a Distributed Availability Group!

Phew! Thanks for reading!

Key mappings in Visual Studio Code

At a recent conference I was asked how I executed code during my presentation.

Running code during a presentation should be as smooth as possible…so in VS Code I have a couple of custom key mappings set up and I thought I’d write a quick blog detailing what they are and what they do.

This is the first set: –

    {
        "key": "ctrl+/",
        "command": "workbench.action.terminal.focus",
        "when": "!terminalFocus"
    },
    {
        "key": "ctrl+/",
        "command": "workbench.action.focusActiveEditorGroup",
        "when": "terminalFocus"
    }

What this does is allow me to switch from an editor window to the terminal and back again using Ctrl+/.

Like this…

The next key mapping I have is…

    {
        "key": "ctrl+;",
        "command": "extension.multiCommand.execute",
        "args": {
            "sequence": [
                "workbench.action.terminal.runSelectedText",
                "workbench.action.focusActiveEditorGroup"
            ]
        }
    }

This requires an extension called multi-command and what it allows me to do is select code in an editor and execute it in the terminal by pressing Ctrl+;

It also allows multiple lines of code to be highlighted, but if nothing is highlighted, it’ll just run whatever is on the current line. Once the code is executed, it’ll return the cursor to the editor (instead of remaining in the terminal).

Like this…

These key mappings allow me to run through code in presentations nice and smoothly…so the session can focus on the code itself and not me stumbling around trying to run it!

You can set up these key mappings in your install of VS Code by opening the command palette and searching for “Preferences: Open Keyboard Shortcuts (JSON)”. This’ll open a keybindings.json file in the editor.

I’ve dropped my full file into a Gist here.

Thanks for reading!

SQL Server Cross Platform Availability Groups and Kubernetes

Say we have a database that we want to migrate a copy of into Kubernetes for test/dev purposes, and we don’t want to use backup/restore.

How can it be done?

Well, with cross platform availability groups! We can deploy a pod to our Kubernetes cluster, create the availability group, and then auto-seed our database!

We’ll run through how to do this but first, I just want to point out that everything here firmly belongs in this category…

Ok, now that’s out the way…let’s see how this can be done.

We’ll need a “normal” instance of SQL Server running on Windows and a Kubernetes cluster.

Here I have one instance of SQL Server running SQL 2022 CU15 (HA enabled): –

And a Kubernetes cluster running v1.29.4: –

OK, now we need to deploy a SQL instance to the Kubernetes cluster. Going to use a statefulset for this: –

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mssql-statefulset
spec:
  serviceName: "mssql"
  replicas: 1
  selector:
    matchLabels:
      name: mssql-pod
  template:
    metadata:
      labels:
        name: mssql-pod
    spec:
      securityContext:
        fsGroup: 10001
      hostAliases:
      - ip: "10.225.115.136"
        hostnames:
        - "z-ap-sql-01"
      containers:
        - name: mssql-container
          image: mcr.microsoft.com/mssql/server:2022-CU15-ubuntu-20.04
          ports:
            - containerPort: 1433
              name: mssql-port
          env:
            - name: MSSQL_PID
              value: "Developer"
            - name: ACCEPT_EULA
              value: "Y"
            - name: MSSQL_AGENT_ENABLED
              value: "1"
            - name: MSSQL_ENABLE_HADR
              value: "1"
            - name: MSSQL_DATA_DIR
              value: /var/opt/sqlserver/data
            - name: MSSQL_LOG_DIR
              value: /var/opt/sqlserver/data
            - name: MSSQL_SA_PASSWORD
              value: "Testing1122"
          volumeMounts:
            - name: sqlsystem
              mountPath: /var/opt/mssql
            - name: sqldata
              mountPath: /var/opt/sqlserver/data
  volumeClaimTemplates:
    - metadata:
        name: sqlsystem
      spec:
        accessModes:
         - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
        storageClassName: mssql-sc
    - metadata:
        name: sqldata
      spec:
        accessModes:
         - ReadWriteOnce
        resources:
          requests:
            storage: 25Gi
        storageClassName: mssql-sc

This is fairly stripped down for this example. No tolerations, resource limits, etc. Two persistent volumes will be created, one for the system databases and one for the user database data and log files, from a storage class already configured in the cluster.

A couple of things to note…

- name: MSSQL_ENABLE_HADR
              value: "1"

Here, HA is being enabled.

      hostAliases:
      - ip: "10.225.115.136"
        hostnames:
        - "z-ap-sql-01"

And here, an entry in the pod’s hosts file is being added for the SQL instance running on Windows.

OK, let’s deploy that (file is sqlserver-statefulset.yaml): –

kubectl apply -f sqlserver-statefulset.yaml

Next thing to do is deploy two services, one so that we can connect to the SQL instance (on port 1433) and one for the AG (port 5022): –

apiVersion: v1
kind: Service
metadata:
  name: mssql-service
spec:
  ports:
  - name: mssql-ports
    port: 1433
    targetPort: 1433
  selector:
    name: mssql-pod
  type: LoadBalancer
---
apiVersion: v1
kind: Service
metadata:
  name: mssql-ha-service
spec:
  ports:
  - name: mssql-ha-ports
    port: 5022
    targetPort: 5022
  selector:
    name: mssql-pod
  type: LoadBalancer

And let’s deploy that (file is sqlserver-service.yaml): –

kubectl apply -f sqlserver-service.yaml

Check that everything looks OK: –

kubectl get all

Great! OK, now an entry needs to be created in the hosts file on the Windows server for the external IP address of the service listening on port 5022.

In this example: –

10.225.115.132 mssql-statefulset-0

Confirm that we can connect to the SQL instance in Kubernetes in SSMS: –

Let’s start building the availability group!

Following the Microsoft guide here: –
https://learn.microsoft.com/en-us/sql/linux/sql-server-linux-availability-group-cross-platform

Create a login and user on the Windows SQL instance: –

CREATE LOGIN dbm_login WITH PASSWORD = '<C0m9L3xP@55w0rd!>';
CREATE USER dbm_user FOR LOGIN dbm_login;
GO

And then a master key and certificate: –

CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<C0m9L3xP@55w0rd!>';
CREATE CERTIFICATE dbm_certificate WITH SUBJECT = 'dbm';
BACKUP CERTIFICATE dbm_certificate
TO FILE = 'C:\Temp\dbm_certificate.cer'
WITH PRIVATE KEY (
        FILE = 'C:\Temp\dbm_certificate.pvk',
        ENCRYPTION BY PASSWORD = '<C0m9L3xP@55w0rd!>'
    );
GO

Grab the dbm_certificate.cer and dbm_certificate.pvk files, copy to your local machine, and then copy them into the Kubernetes pod: –

kubectl cp dbm_certificate.cer mssql-statefulset-0:/var/opt/mssql/data
kubectl cp dbm_certificate.pvk mssql-statefulset-0:/var/opt/mssql/data

Once copied, run the following on the SQL instance in the Kubernetes pod to create a login/user, master key, and the certificate: –

CREATE LOGIN dbm_login WITH PASSWORD = '<C0m9L3xP@55w0rd!>';
CREATE USER dbm_user FOR LOGIN dbm_login;
GO
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<M@st3rKeyP@55w0rD!>'
GO

CREATE CERTIFICATE dbm_certificate
    AUTHORIZATION dbm_user
    FROM FILE = '/var/opt/mssql/data/dbm_certificate.cer'
    WITH PRIVATE KEY (
    FILE = '/var/opt/mssql/data/dbm_certificate.pvk',
    DECRYPTION BY PASSWORD = '<C0m9L3xP@55w0rd!>'
)
GO

And then create the endpoint for the availability group on both the Windows SQL instance and the instance in the Kubernetes pod: –

CREATE ENDPOINT [Hadr_endpoint]
    AS TCP (LISTENER_IP = (0.0.0.0), LISTENER_PORT = 5022)
    FOR DATA_MIRRORING (
        ROLE = ALL,
        AUTHENTICATION = CERTIFICATE dbm_certificate,
        ENCRYPTION = REQUIRED ALGORITHM AES
        );
ALTER ENDPOINT [Hadr_endpoint] STATE = STARTED;
GRANT CONNECT ON ENDPOINT::[Hadr_endpoint] TO [dbm_login];
GO

Fantastic! Now we can create the availability group!

Run on the Windows SQL instance: –

CREATE AVAILABILITY GROUP [ag1]
WITH (CLUSTER_TYPE = NONE)
FOR REPLICA
    ON N'z-ap-sql-01'
WITH (
    ENDPOINT_URL = N'tcp://z-ap-sql-01:5022',
    AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
    SEEDING_MODE = AUTOMATIC,
    FAILOVER_MODE = MANUAL,
    SECONDARY_ROLE(ALLOW_CONNECTIONS = ALL)
    ),
    N'mssql-statefulset-0'
WITH (
    ENDPOINT_URL = N'tcp://mssql-statefulset-0:5022',
    AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
    SEEDING_MODE = AUTOMATIC,
    FAILOVER_MODE = MANUAL,
    SECONDARY_ROLE(ALLOW_CONNECTIONS = ALL)
    )
GO

And then join the SQL instance in the Kubernetes pod: –

ALTER AVAILABILITY GROUP [ag1] JOIN WITH (CLUSTER_TYPE = NONE);
ALTER AVAILABILITY GROUP [ag1] GRANT CREATE ANY DATABASE;
GO

Once that has completed, we have a cross platform availability group!

Right, final thing to do is get our database into the AG! For this, we are going to auto-seed the database to the Linux instance.

However, Windows and Linux have different file paths! But auto-seeding does work across platforms; the details are here: –
https://learn.microsoft.com/en-us/sql/database-engine/availability-groups/windows/automatic-seeding-secondary-replicas

As long as the database’s data and log files are located under the Windows SQL instance’s default data and log file paths, the database will be seeded to the Linux instance’s default data and log file paths.
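To make the mapping concrete, here’s a little Python sketch of the idea: SQL Server effectively strips the Windows default data path prefix and re-roots the file under the Linux default path. The paths below are typical defaults, used here purely for illustration (assumptions, not values read from my instances): –

```python
from pathlib import PureWindowsPath, PurePosixPath

# Typical default paths - assumptions for illustration only
WINDOWS_DATA_DIR = r"C:\Program Files\Microsoft SQL Server\MSSQL16.MSSQLSERVER\MSSQL\DATA"
LINUX_DATA_DIR = "/var/opt/sqlserver/data"

def translate_path(windows_file: str) -> str:
    """Re-root a file under the Windows default data dir onto the Linux default."""
    win = PureWindowsPath(windows_file)
    try:
        relative = win.relative_to(PureWindowsPath(WINDOWS_DATA_DIR))
    except ValueError:
        # Not under the default data path - auto-seeding would not map this file
        raise ValueError(f"{windows_file} is not under the default data path")
    return str(PurePosixPath(LINUX_DATA_DIR).joinpath(*relative.parts))

print(translate_path(WINDOWS_DATA_DIR + r"\testdatabase1.mdf"))
# /var/opt/sqlserver/data/testdatabase1.mdf
```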

Let’s see this in action! The default file paths for my Windows instance are: –

And the database’s files reside on those file paths: –

And the default file paths for the SQL instance in the pod are: –

So now let’s add the database: –

ALTER AVAILABILITY GROUP [AG1] ADD DATABASE [testdatabase1];
GO

And boom! We have the database auto-seeded into the SQL instance running in the Kubernetes pod: –
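To confirm the database really is joined and synchronising, a quick query on either replica (again, just an optional sanity check of my own): –

```sql
SELECT DB_NAME(database_id) AS database_name,
       synchronization_state_desc,
       synchronization_health_desc
FROM sys.dm_hadr_database_replica_states;
```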

Interestingly, if we look at the database’s file paths in SSMS for the instance in the pod: –

SQL in the pod thinks that the files are located at the Windows locations…but that can’t be true…it’s running on Linux!

If we go and check the actual location of the files…they are located at the default file paths for the SQL instance in the pod: –

OK, I fully admit this is pretty out there…but it’s just a bit of fun to see what we can do with SQL Server Availability Groups and Kubernetes.

Thanks for reading!

EightKB 2024 – Schedule and Registration

Hello Hello, We. Are. Back!

The schedule for EightKB 2024 Edition has been announced!

We’re kicking off at 1pm UTC on August 8th (8th of the 8th…get it? 🙂 )…here’s the schedule: –

N.B. – If you click on the session on the website, it shows the session abstract…nice touch eh?

Once again we have five top class speakers delivering five great, in-depth sessions on various SQL Server internals topics.

We only have five slots in each event which meant we ended up not picking some amazing sessions. I want to say a MASSIVE thank you to all who submitted…session selection is the WORST part of running a conference! Really, truly…thank you to everyone who submitted.

Registration is now open, it’s completely free and you can sign up here: –
https://eightkb.online/

The event will be in Zoom and we’ll have chat going in the EightKB channel in the SQL Community Slack…please come and hang out with us there!

EightKB is 100% community driven with no sponsors, so we have our own Bonfire store selling t-shirts!

Don’t they look snazzy?!

Any money generated from the store will be put straight back into the event.

Hope to see you on August the 8th!