As adoption of data-intensive applications grows, many organizations are migrating from their legacy on-premise data platforms to Snowflake’s cloud-based data warehouse, and with good reason. Among other benefits, Snowflake separates storage and compute resources, so businesses pay only for the resources they actually use and can scale each independently. Done right, a migration to Snowflake can save money and accelerate progress toward becoming a data-driven organization. But as you might expect, there are common pitfalls to avoid and best practices that help businesses get the most value from their budgets.

In this article, we’ll explore four categories of best practices that we’ve implemented to help our clients ensure a smooth migration and make the best use of the features and resources Snowflake offers:

  1. Plan for cost management
  2. Coordinate complementary tools
  3. Build training and communication into the process … and start early
  4. Have a plan in place for monitoring, management, and measurement

 

1. Plan for cost management

One of the biggest differences between Snowflake and on-premise data platforms is the cost structure. On-premise resources that have been bought and paid for can be used without concern over running up additional costs. Because Snowflake offers a “data warehouse as a service” (DWaaS), the pricing model is pay-as-you-go.

If an on-prem platform is the equivalent of having your own car in the garage, Snowflake is more like using Uber: you don’t have to worry about gas or maintenance, but every trip incurs a discrete cost.

When managed well, Snowflake’s pay-as-you-go model can make it significantly cheaper than an on-premise platform; conversely, poor management can lead to substantial unnecessary costs. To prevent sticker shock when the first bill arrives, or when you find out your Snowflake credits have been prematurely depleted, make sure that cost management is part of your initial migration planning process. Be sure to consider questions such as:

• Who in the organization should have access to Snowflake, and what privileges will each person have? Snowflake allows admins to restrict account access to specific IP addresses and to set network policies on a per-user basis. Make sure that access is granted only to users who need it and who understand how the usage-based pricing model works: compute is billed for the time a virtual warehouse is running, so long or inefficient queries translate directly into cost.

• What are your organization’s typical data workflows and usage scenarios, and what are the storage and compute requirements for each? Once you understand your data workflows, you can do cost modeling for the appropriately sized compute warehouses.

• Which data needs to be moved, and which data can remain in its current place? Snowflake has powerful data sharing options that allow data to stay in one spot with explicit permissions for different use cases, whereas in the past you may have had to copy data to multiple environments.

• How will your disaster recovery and mitigation planning change? Snowflake’s Time Travel and data protection features may meet your business needs for data continuity so that you can decrease your disaster recovery storage costs.
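
The cost modeling mentioned above can start as simple arithmetic. The sketch below is a rough illustration, not a pricing tool: it uses the credits-per-hour rates for standard warehouse sizes, while the price per credit and the three workloads are hypothetical placeholders (the actual price depends on edition, cloud provider, and region).

```python
# Credits consumed per hour by a standard virtual warehouse of each size.
CREDITS_PER_HOUR = {
    "XSMALL": 1,
    "SMALL": 2,
    "MEDIUM": 4,
    "LARGE": 8,
    "XLARGE": 16,
}

# Assumption: illustrative price only; substitute the rate from your contract.
PRICE_PER_CREDIT_USD = 3.00


def monthly_compute_cost(size, active_hours_per_day, days_per_month=30,
                         price_per_credit=PRICE_PER_CREDIT_USD):
    """Estimate the monthly compute cost of one warehouse.

    Only the hours the warehouse is running are counted; a suspended
    warehouse consumes no compute credits.
    """
    credits = CREDITS_PER_HOUR[size] * active_hours_per_day * days_per_month
    return credits * price_per_credit


# Hypothetical workloads: (name, warehouse size, active hours per day).
workloads = [
    ("nightly_elt", "LARGE", 2),
    ("bi_dashboards", "SMALL", 10),
    ("ad_hoc_analysis", "XSMALL", 4),
]

for name, size, hours in workloads:
    print(f"{name}: ${monthly_compute_cost(size, hours):,.2f} per month")
```

Running the same workload list against different sizes shows quickly whether a larger warehouse that finishes sooner is cheaper than a smaller one that runs longer.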

2. Coordinate complementary tools

Your current on-premise data platform may include a suite of tools to manage data from source to reporting. For example, Microsoft’s SQL Server ships with SSIS, SSAS, and SSRS and is commonly paired with Power BI. Snowflake has focused on flexibility, performance, and portability solely around compute and data sharing, so you have the opportunity to build a new data platform using best-of-breed products to complement Snowflake’s strengths.

Snowflake’s extended ecosystem encompasses a wide range of third-party solutions, and some vendors are “certified” partners whose costs can be rolled up into the bill from the same cloud provider that Snowflake is using. Its partner network encompasses tools for:

• Data pipelines / connectors (e.g. Fivetran)

• Data integration / ELT (e.g. Matillion)

• Reporting and analytics (e.g. Microsoft Power BI, Tableau)

• Data governance (e.g. Informatica)

• Data catalogs (e.g. data.world, Collibra)

As you plan your Snowflake migration, consider which tools you will be using and how to coordinate them for optimal performance. You may also want to consider Snowflake’s certified partners to simplify billing.

3. Build training and communication into the process … and start early

Think ahead to the day when you’ve completed your data migration to Snowflake. Will all your users be ready, and will your data team know how to start using the platform right away?

 

The earlier you consider the training and communication requirements for a successful migration, the better prepared your users will be once the transition is complete—and the sooner you can begin realizing the benefits of the Snowflake platform. Be sure to consider the unique knowledge requirements of each user group:

 

Data developers

Role: Interacting directly with the Snowflake platform

Examples of key considerations:

• The impact of changing data warehouse compute settings when optimizing query performance

• Design patterns to resolve differences in SQL between Snowflake and the original databases; it’s also useful to understand and share anti-patterns that could cause expensive queries.

• How to use the Time Travel feature to restore data objects in case of accidental deletion or modification
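
For developer training, it may help to show what a Time Travel restore looks like in practice. The sketch below only builds the SQL text for three common recovery statements (UNDROP, CLONE ... AT, and SELECT ... BEFORE); the helper names and the table names in the usage lines are hypothetical, and the statements only work within the table's Time Travel retention period.

```python
def undrop_table(table):
    """SQL to restore a table that was dropped by accident."""
    return f"UNDROP TABLE {table};"


def clone_as_of(table, restored_name, seconds_ago):
    """SQL to clone a table as it existed `seconds_ago` seconds in the past."""
    if seconds_ago <= 0:
        raise ValueError("seconds_ago must be positive")
    return (
        f"CREATE TABLE {restored_name} CLONE {table} "
        f"AT (OFFSET => -{seconds_ago});"
    )


def select_before_statement(table, query_id):
    """SQL to read a table as it was just before a given statement ran."""
    return f"SELECT * FROM {table} BEFORE (STATEMENT => '{query_id}');"


# Hypothetical usage. Identifiers are interpolated directly into the SQL,
# so pass only trusted, validated names to these helpers.
print(undrop_table("sales.orders"))
print(clone_as_of("sales.orders", "sales.orders_restored", 3600))
```

Cloning into a new table rather than overwriting the original lets the developer compare the two before swapping anything.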

 

Dashboard developers

Role: Connecting to Snowflake

Examples of key considerations:

• Impact of their dashboard designs on the number and size of the queries sent to Snowflake, and therefore on cost

• How to plan a refresh strategy (live connect versus once-daily refreshes) based on underlying changes in the data

• Key differences between connecting with Snowflake and connecting with other sources; a training guide (or even a video walkthrough) should be created for less-technical users.
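
The refresh-strategy trade-off can be made concrete with a little arithmetic. The sketch below is a simplification: it assumes the warehouse suspends immediately after each refresh (real auto-suspend settings add idle time) and applies the 60-second minimum that is billed each time a warehouse resumes. The run counts and durations are hypothetical.

```python
MINIMUM_BILLED_SECONDS = 60  # minimum charge each time a warehouse resumes


def billed_seconds(run_seconds):
    """Seconds billed for one resume-run-suspend cycle."""
    return max(MINIMUM_BILLED_SECONDS, run_seconds)


def daily_credits(credits_per_hour, runs_per_day, run_seconds):
    """Credits per day for a refresh schedule on one warehouse."""
    total_seconds = runs_per_day * billed_seconds(run_seconds)
    return credits_per_hour * total_seconds / 3600


# Hypothetical dashboard on a SMALL warehouse (2 credits per hour).
hourly = daily_credits(2, runs_per_day=24, run_seconds=20)   # frequent, short
nightly = daily_credits(2, runs_per_day=1, run_seconds=300)  # one larger run

print(f"24 short refreshes: {hourly:.3f} credits/day")
print(f"1 nightly refresh:  {nightly:.3f} credits/day")
```

In this example the 20-second refreshes are each billed as a full minute, which is why many small refreshes can cost more than one larger one.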

 

Data analysts

Role: Connecting directly to Snowflake’s web user interface

Examples of key considerations:

• Key differences in using the online Snowflake UI, since they will be accustomed to using other querying tools

• Cost and impact of large queries, especially in cases where sampling would have met their needs

• Benefits of unique features such as data profiling, saving queries, etc.
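
Analysts who are new to Snowflake may not know that sampling is built into the SQL dialect. The sketch below builds a query using the SAMPLE clause, either by percentage or by a fixed number of rows; the helper name and the table name are hypothetical, and how much work sampling saves depends on the sampling method and the query.

```python
def sample_query(table, percent=None, rows=None):
    """Build a SELECT that reads a sample of a table instead of all of it.

    Exactly one of `percent` (0-100) or `rows` (a fixed row count) is required.
    """
    if (percent is None) == (rows is None):
        raise ValueError("specify exactly one of percent or rows")
    if percent is not None:
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        return f"SELECT * FROM {table} SAMPLE ({percent});"
    if rows < 0:
        raise ValueError("rows must be non-negative")
    return f"SELECT * FROM {table} SAMPLE ({rows} ROWS);"


# Hypothetical usage: explore roughly 10% of a large table first.
print(sample_query("sales.orders", percent=10))
print(sample_query("sales.orders", rows=1000))
```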

 

Data product owners

Role: Granting and managing access to their data

Example of a key consideration:

• Both the benefits and the costs of high-performance requests, and the impact on the overall design of their product

 

DBAs

Role: Managing databases

Examples of key considerations:

• What developers will be doing differently

• Data-sharing concepts

• Price modeling for different warehouse sizes

• Documenting and designing workloads

• Appropriate management tasks

4. Have a plan in place for monitoring, management, and measurement

Migrating to a Snowflake environment is not a “set it and forget it” undertaking—far from it. To make sure your investment of money, time, and resources is delivering a positive return for your organization, create an ongoing plan for the following responsibilities:

Monitoring

How will you monitor usage of your organization’s Snowflake resources? How will you verify that the people who have been granted access are using it, or that queries are not taking up more time—and therefore racking up more costs—than necessary? And how will you monitor user satisfaction for different types of workloads (e.g. reporting, complex transformations, streaming use cases, etc.)?
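
One starting point for usage monitoring is Snowflake's own metering data, such as the WAREHOUSE_METERING_HISTORY view in the SNOWFLAKE.ACCOUNT_USAGE schema. The sketch below assumes rows from that view have already been fetched as dictionaries; the sample rows, warehouse names, and budget figures are hypothetical.

```python
from collections import defaultdict


def credits_by_warehouse(rows):
    """Sum CREDITS_USED per WAREHOUSE_NAME across metering rows."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["WAREHOUSE_NAME"]] += row["CREDITS_USED"]
    return dict(totals)


def over_budget(totals, budgets):
    """Names of warehouses whose usage exceeds their credit budget.

    Warehouses without a budget entry are never flagged.
    """
    return sorted(
        name for name, used in totals.items()
        if used > budgets.get(name, float("inf"))
    )


# Hypothetical rows, shaped like WAREHOUSE_METERING_HISTORY output.
rows = [
    {"WAREHOUSE_NAME": "ETL_WH", "CREDITS_USED": 12.5},
    {"WAREHOUSE_NAME": "BI_WH", "CREDITS_USED": 3.25},
    {"WAREHOUSE_NAME": "ETL_WH", "CREDITS_USED": 7.5},
    {"WAREHOUSE_NAME": "BI_WH", "CREDITS_USED": 1.75},
]
budgets = {"ETL_WH": 15.0, "BI_WH": 10.0}  # hypothetical credit budgets

totals = credits_by_warehouse(rows)
print(totals)
print("Over budget:", over_budget(totals, budgets))
```

The same aggregation can be grouped by user or by workload type to show who is using their access and where long-running queries concentrate.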

Management

How will you ensure that Snowflake resources are being used cost-effectively, securely, and in accordance with your standards? Regarding DataOps, what controls will be needed to ensure thoughtful processes for migrating from experimental to production environments?

Measurement

Which KPIs will you use to verify the overall success of the migration to Snowflake, and how will you identify opportunities for improvement and/or for additional training?

Snowflake offers businesses the opportunity to share data seamlessly, eliminate resource contention, easily scale virtual warehouses, and realize a host of other benefits, all while managing costs. The pay-as-you-go pricing model lets organizations link costs directly to the value extracted from their data, giving them finer-grained control over cost management. And with careful planning, mindful execution, and ongoing diligence, migrating to Snowflake can be one of the best moves an organization makes.

Paul Lee

Mick Wagner is a Senior Solutions Architect in the Advanced Analytics practice at Logic20/20.

Author