Database CI/CD and Schema Migration with Snowflake

Estimated: 20 mins

A series of articles about Database CI/CD and Schema Migration with Snowflake


This tutorial guides you step by step through setting up database change management for Snowflake in Bytebase. With Bytebase, a team can follow a formalized review and rollout process for Snowflake schema changes and data changes.

We would like to credit the blog post Embracing Agile Software Delivery and DevOps with Snowflake, which provided valuable insights and inspired us to implement similar processes in our product.

You’ll have a GUI and the full migration history. You can use the Bytebase Free Plan to finish the tutorial. There is also a bonus section on schema drift detection for advanced users.

Prerequisites

Before you start this tutorial, make sure:

  • You have a Snowflake account with the role ACCOUNTADMIN.
  • You have Docker installed locally.

Step 1 - Start Bytebase in Docker

  1. Make sure your Docker daemon is running, then start the Bytebase Docker container by running the following command in the terminal.

    docker run --rm --init \
      --name bytebase \
      --publish 8080:8080 --pull always \
      --volume ~/.bytebase/data:/var/opt/bytebase \
      bytebase/bytebase:2.22.3
  2. Once the container has started, Bytebase is running in Docker and you can reach it at localhost:8080.

  3. Visit localhost:8080 in your browser. Register the first admin account, which will be granted the Workspace Admin role.
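
If you script this step, it can help to wait until Bytebase actually answers on localhost:8080 before opening the browser. The following is a minimal sketch, assuming curl is installed and that the Bytebase web page answers plain HTTP requests on that port. The function name wait_for_bytebase and its default attempt count and delay are illustrative and not part of Bytebase.

```shell
#!/bin/sh
# Poll a URL until it answers, or give up after a number of attempts.
# wait_for_bytebase is a hypothetical helper name, not a Bytebase command.
wait_for_bytebase() {
  url="${1:-http://localhost:8080}"  # address published by the docker run command
  attempts="${2:-30}"                # how many times to probe (assumed default)
  delay="${3:-2}"                    # seconds to wait between probes (assumed default)
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -s: silent, -f: treat HTTP errors (>= 400) as failure, -o /dev/null: discard the body
    if curl -sf -o /dev/null "$url"; then
      echo "Bytebase is reachable at $url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "Bytebase did not respond at $url after $attempts attempts" >&2
  return 1
}
```

Run wait_for_bytebase in a second terminal after starting the container. It returns a non-zero status if the page never comes up, so it can gate later steps in a script.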

Step 2 - Add Snowflake account in Bytebase

In Bytebase, an Instance could be your on-premises MySQL instance, an AWS RDS instance, and so on. In this tutorial, it is a Snowflake account.

  1. Visit localhost:8080 and log in as Workspace Admin.

  2. Click Add Instance.

  3. Add a Snowflake instance. Pay attention to the following fields:

    Environment: choose Test. If you choose Prod, you will need approval for all future change requests. In this tutorial, let's keep it simple. (It’s all configurable later.)

    Account Locator: Go to your Snowflake account. You can find the locator in the URL, or in the locator field (but enter it in lower case).

    If the account is located in the AWS US West (Oregon) region, the value would be something like xy12345. Otherwise, the format will be <<account_locator>>.<<cloud_region_id>>.<<cloud>>, such as xy12345.us-east-2.aws. See official doc.

    Username and password: The ones you use to log in to your Snowflake account.

    Connection info

    Option 1: ACCOUNTADMIN. Make sure your account has DEFAULT_ROLE=ACCOUNTADMIN and DEFAULT_WAREHOUSE set in Snowflake.

    Option 2: Granular role. Assign the proper permissions according to the instructions.
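
The Account Locator format described above can be sketched as a small shell helper that lower-cases the locator and appends the region and cloud segments when they are given. The name snowflake_account_id is hypothetical, and treating the cloud segment as optional is an assumption of this sketch. The official doc remains the authority on which segments your region needs.

```shell
#!/bin/sh
# snowflake_account_id is a hypothetical helper, not a Snowflake or Bytebase command.
# Usage: snowflake_account_id LOCATOR [REGION_ID [CLOUD]]
snowflake_account_id() {
  if [ "$#" -lt 1 ] || [ "$#" -gt 3 ]; then
    echo "usage: snowflake_account_id LOCATOR [REGION_ID [CLOUD]]" >&2
    return 2
  fi
  # The tutorial asks for the locator in lower case.
  id="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  # AWS US West (Oregon): the bare locator is enough, so both segments are optional.
  if [ -n "${2:-}" ]; then
    id="$id.$2"   # cloud region id, for example us-east-2
  fi
  if [ -n "${3:-}" ]; then
    id="$id.$3"   # cloud, for example aws
  fi
  printf '%s\n' "$id"
}
```

For example, snowflake_account_id XY12345 prints xy12345, and snowflake_account_id xy12345 us-east-2 aws prints xy12345.us-east-2.aws, matching the two cases in the text.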

Step 3 - Create a Project with Snowflake instance

In Bytebase, a Project is the container that groups logically related Databases, Issues and Users together, similar to the project concept in other dev tools such as Jira and GitLab. So before you work with a database, you must create a project.

  1. After the instance is created, click Projects on the top bar.

  2. Click New Project to create a new project named TestSnowflake, with key TS and mode standard. Click Create.

Step 4 - Create a database in Snowflake via Bytebase

In Bytebase, a Database is the one created by CREATE DATABASE xxx. A database always belongs to a single Project. An Issue represents a specific collaboration activity between Developer and DBA, such as creating a database or altering a schema. It's similar to the issue concept in other issue management tools.

  1. After the project is created, go to the project and click New DB.

  2. Fill in the form with Name - DB_DEMO_BB (BB is short for Bytebase), Environment - Test, and Instance - Snowflake instance. Click Create.

  3. Bytebase will create an issue “CREATE DATABASE ….” automatically. Because it’s for the Test environment, the issue will run without waiting for your approval by default. Click Resolve, and the issue is Done. The database is created.

  4. Go back to the home page by clicking Home on the left sidebar. If it’s the first time you use Bytebase, it’ll show a celebration. On the home page, you can see the project, the database, and the issue you just resolved.

Step 5 - Create a table in Snowflake via Bytebase

In Step 4, you actually created an issue through the UI workflow and then executed it. Let’s make it more explicit.

  1. Go to project TestSnowflake, and click Alter Schema.

  2. Choose DB_DEMO_BB and click Next. Bytebase can generate a pipeline if you have different databases for different environments.

  3. Input title, SQL, and Assignee, and click Create.

    CREATE SCHEMA DEMO_UI;
    CREATE TABLE HELLO_WORLD
    (
      FIRST_NAME VARCHAR,
      LAST_NAME VARCHAR
    );

  4. Bytebase will do some basic checks and then execute the SQL. Since it’s for the Test environment, the issue is automatically approved by default. Click Resolve issue.

  5. The issue status will become Done.

  6. On the issue page, click view migration. You will see the diff for each migration.

  7. You can also go to Migration History under the project to view the full history, or go into a specific database to view its history.
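
To double-check the result on the Snowflake side, you could run a couple of SHOW statements against the new database. The sketch below only prints the SQL text, so you can paste it into a Snowflake worksheet or feed it to a SQL client of your choice. The helper name verification_sql is hypothetical. The table lookup searches the whole database, so it does not assume which schema the table landed in.

```shell
#!/bin/sh
# verification_sql is a hypothetical helper: it only prints SQL text and
# does not connect to Snowflake.
verification_sql() {
  db="${1:-DB_DEMO_BB}"       # database created in Step 4
  table="${2:-HELLO_WORLD}"   # table created in Step 5
  # List the schemas in the database; DEMO_UI should appear.
  printf 'SHOW SCHEMAS IN DATABASE %s;\n' "$db"
  # Find the table anywhere in the database.
  printf "SHOW TABLES LIKE '%s' IN DATABASE %s;\n" "$table" "$db"
}
```

For example, verification_sql > check.sql writes both statements to a file.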

Bonus Section - Schema Drift Detection

To follow this section, you need the Team Plan or Enterprise Plan (you can start a 14-day trial directly in the product without a credit card).

Now you can see the full migration history of DB_DEMO_BB. However, what is Establish new baseline? When should it be used?

We expect teams that adopt Bytebase to use it exclusively for all schema changes. If someone makes a Snowflake schema change outside of Bytebase, Bytebase won’t know about it. Because Bytebase records its own copy of the schema, when it compares that copy with the live schema containing the out-of-band change, it notices the discrepancy and surfaces a schema drift anomaly. If the change is intended, you should establish a new baseline to reconcile the schema state.

In this section, you’ll be guided through this process.

  1. Go to Snowflake and add a column to the table you created in Step 5. Make sure the new column is added.

  2. Wait for 10 mins (Bytebase runs the check roughly every 10 mins). Go back to Bytebase, and you will find the schema drift on database DB_DEMO_BB. The Anomaly Center also surfaces the drift.

  3. Click View diff to see the exact drift.

  4. Use baseline to reconcile the schema state with the live database schema. Go to DB_DEMO_BB > Migration History and click Establish new baseline.

  5. It will create an issue. Click Resolve to mark it done.

  6. Go back to DB_DEMO_BB or Anomaly Center, and you will find the drift is gone.
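
The out-of-band change in the first step of this list could look like the statement printed by the sketch below. The helper name drift_sql, the column name AGE, its INTEGER type, and the DEMO_UI schema in the table name are all assumptions made for illustration. Adjust them to match your own database. The script only prints SQL. Run the output directly in Snowflake, not through Bytebase, so that it counts as drift.

```shell
#!/bin/sh
# drift_sql is a hypothetical helper: it only prints the out-of-band statements.
drift_sql() {
  # Fully qualified table name; the DEMO_UI schema is an assumption.
  table="${1:-DB_DEMO_BB.DEMO_UI.HELLO_WORLD}"
  column="${2:-AGE}"        # assumed column name
  col_type="${3:-INTEGER}"  # assumed column type
  # The schema change made outside Bytebase.
  printf 'ALTER TABLE %s ADD COLUMN %s %s;\n' "$table" "$column" "$col_type"
  # Confirm that the new column is present.
  printf 'DESCRIBE TABLE %s;\n' "$table"
}
```

Calling drift_sql with no arguments prints the ALTER TABLE statement for the defaults above, followed by a DESCRIBE TABLE statement to confirm the column exists.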

Summary and Next

Now you have connected Snowflake with Bytebase and tried out the UI workflow for schema changes. Bytebase records the full migration history for you. With the Team or Enterprise Plan, you also get schema drift detection.

In the next article, you’ll try out the GitOps workflow, which stores your Snowflake schema in GitHub and triggers the change when you commit to the repository, bringing your Snowflake change workflow to the next level of Database DevOps: Database as Code.
