Aspire Journeys

Advanced Snowflake

  • 20 Courses | 25h 35m 10s
Rating: 5.0 (1 user)
The Advanced Snowflake journey is meticulously designed to provide data engineers and advanced users with the skills to fully leverage Snowflake's platform for data transformation, optimization, advanced analytics, and data governance. This comprehensive journey is divided into four key tracks, each focusing on a specialized aspect of data engineering. The curriculum emphasizes performance optimization strategies, leveraging Snowpark for complex data transformations, applying machine learning techniques, and ensuring robust data governance and security. By the end of this journey, participants will have deep expertise in managing high-performance workloads, implementing machine learning models, and maintaining data security on Snowflake.

Track 1: Performance Monitoring and Optimization

This track equips learners with the tools and techniques needed to optimize Snowflake performance for large-scale data engineering tasks. You will explore strategies for scaling workloads with virtual and multi-cluster warehouses, optimizing queries through data clustering and caching, and monitoring performance with query profiling and resource utilization tracking. Learners will also cover handling geospatial and semi-structured data, working with transient and dynamic tables, and optimizing queries through secure and materialized views.

  • 5 Courses | 7h 10m 16s

Track 2: Data Transformation Using Snowpark

In this in-depth track, learners dive into Snowpark, Snowflake’s powerful framework for scalable data manipulation and transformation. Through hands-on experience with Snowpark DataFrames and integration with external systems like Kafka and Spark, learners will master tasks such as filtering, aggregating, and joining data. The track also covers the creation and management of user-defined functions (UDFs) and stored procedures, as well as data quality assurance using Soda and real-time data ingestion techniques.

  • 4 Courses | 5h 47s

Track 3: Continuous Data Pipelines

This track introduces learners to continuous data pipelines in Snowflake. Participants will learn how to create and configure dynamic tables and will examine the usage and internal workings of streams for change data capture (CDC), including stream types and standard stream contents during insert, update, and delete operations. The final section of this track explores continuous data processing tasks, covering how to create and execute scheduled serverless and user-managed tasks and how to implement task graphs and child tasks.

  • 4 Courses | 4h 21m 14s

Track 4: Advanced Analytics and Machine Learning

This track introduces learners to the world of machine learning within Snowflake. Participants will learn to design and deploy ML models using Snowpark and popular tools like scikit-learn. The track covers key areas such as data preprocessing, model training, hyperparameter tuning, and deployment through MLOps. Learners will also explore the application of large language models (LLMs) in Snowflake Cortex for tasks like sentiment analysis, translation, and summarization, as well as advanced techniques like time series forecasting and anomaly detection.

  • 7 Courses | 9h 2m 53s

COURSES INCLUDED

Snowflake Performance: Scaling and Autoscaling Warehouses
By understanding Snowflake's warehouse configurations and scaling, you can optimize performance and ensure efficient resource utilization in Snowflake. In this course, you will discover the features and architecture of Snowflake, gaining a foundational understanding of how it operates as a cloud data platform. Then you will explore the different editions of Snowflake and their billing structures, helping you make informed decisions about which version best fits your organizational needs. Next, you will learn how to choose between resizing warehouses and multi-cluster warehouses within Snowflake. You will focus on creating warehouses and selecting appropriate configurations to optimize performance and cost. You will also find out how to scale up warehouses to handle increasing workloads, ensuring efficient data processing. You will investigate advanced topics such as multi-cluster warehouses and their modes of operation and examine how to dynamically adjust resources based on workload demands, leading to cost efficiency and improved performance. Finally, you will use resource monitors to effectively track and manage warehouse usage, ensuring optimal utilization without unnecessary costs.
11 videos | 1h 39m | Has Assessment | Badge available
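For a concrete sense of the commands this course covers, here is a minimal, illustrative Snowpark sketch (not part of the course materials; connection_parameters and the names demo_wh and demo_rm are placeholders, and multi-cluster warehouses require Enterprise edition or above):

    from snowflake.snowpark import Session

    # connection_parameters: a placeholder dict with your account, user, password, and role.
    session = Session.builder.configs(connection_parameters).create()

    # Create a warehouse, then scale it up for a heavier workload.
    session.sql("CREATE WAREHOUSE IF NOT EXISTS demo_wh WAREHOUSE_SIZE = 'XSMALL' AUTO_SUSPEND = 60").collect()
    session.sql("ALTER WAREHOUSE demo_wh SET WAREHOUSE_SIZE = 'LARGE'").collect()

    # Turn it into a multi-cluster warehouse that autoscales between 1 and 3 clusters.
    session.sql("""
        ALTER WAREHOUSE demo_wh SET
            MIN_CLUSTER_COUNT = 1
            MAX_CLUSTER_COUNT = 3
            SCALING_POLICY = 'STANDARD'
    """).collect()

    # Cap credit consumption with a resource monitor attached to the warehouse.
    session.sql("CREATE OR REPLACE RESOURCE MONITOR demo_rm WITH CREDIT_QUOTA = 100 TRIGGERS ON 90 PERCENT DO SUSPEND").collect()
    session.sql("ALTER WAREHOUSE demo_wh SET RESOURCE_MONITOR = demo_rm").collect()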
Snowflake Performance: Query Acceleration and Caching
Optimizing data queries in Snowflake involves partitioning and clustering large datasets to ensure quicker access and improved performance. Mastering query acceleration and caching techniques is essential for faster response times and accurate, up-to-date query results. In this course, you will learn about the Snowflake data model, focusing on how data is structured and managed within the platform. Then you will explore the importance of partitions and clustering, essential techniques for optimizing data queries by dividing large datasets into smaller, manageable pieces for quicker access and improved performance. Next, you will investigate techniques to enhance query performance, such as query acceleration and caching. You will enable query acceleration for warehouses for faster response times in complex queries and assess which queries benefit most from these enhancements. Additionally, you will use SnowSQL for efficient data loading. Finally, you will discover the intricacies of query result caching, see how your query structure affects caching, and manage caching effectively, including turning it off to ensure accurate and up-to-date query results.
10 videos | 1h 23m | Has Assessment | Badge available
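The acceleration and caching settings described above look roughly like the following sketch (not from the course itself; connection_parameters, demo_wh, and the query ID are placeholders you would substitute with your own values):

    from snowflake.snowpark import Session

    session = Session.builder.configs(connection_parameters).create()  # placeholder credentials

    # Enable the query acceleration service on the warehouse.
    session.sql("""
        ALTER WAREHOUSE demo_wh SET
            ENABLE_QUERY_ACCELERATION = TRUE
            QUERY_ACCELERATION_MAX_SCALE_FACTOR = 8
    """).collect()

    # Estimate whether a previously run query would benefit from acceleration.
    session.sql("SELECT SYSTEM$ESTIMATE_QUERY_ACCELERATION('<query_id>')").show()

    # Disable result caching for this session to guarantee fresh results, then re-enable it.
    session.sql("ALTER SESSION SET USE_CACHED_RESULT = FALSE").collect()
    session.sql("ALTER SESSION SET USE_CACHED_RESULT = TRUE").collect()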
Snowflake Performance: Clustering and Search Optimization
Clustering and search optimization in Snowflake are crucial for enhancing query performance, reducing data retrieval times, and effectively managing large datasets. These techniques streamline data access, ensuring scalable and efficient data handling. In this course, you will explore how clustering helps improve the performance of point lookup and range queries. You will investigate the importance of choosing the appropriate clustering key and examine various approaches to implementing clustering, focusing on performance and scalability. Next, you will discover different methods for evaluating your clustering strategies and see how clustering can make your data retrieval queries more performant. You will also be introduced to search optimization in Snowflake to improve point lookup queries by building an auxiliary data structure to help quickly access data. Then you will compare search optimization and clustering to understand their effective use cases and gain insights into refining searches with complex predicates using AND and OR clauses and optimizing searches on specific columns. Finally, you will work with VARIANT, OBJECT, and ARRAY data types for versatile data management and improve queries with semi-structured data.
15 videos | 2h 16m | Has Assessment | Badge available
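As a rough illustration of the clustering and search optimization techniques above (a minimal sketch, not course material; the table and column names sales, region, order_date, and customer_id are placeholders):

    from snowflake.snowpark import Session

    session = Session.builder.configs(connection_parameters).create()  # placeholder credentials

    # Define a clustering key suited to common range and point-lookup predicates.
    session.sql("ALTER TABLE sales CLUSTER BY (region, order_date)").collect()

    # Evaluate how well the table is clustered on that key.
    session.sql("SELECT SYSTEM$CLUSTERING_INFORMATION('sales', '(region, order_date)')").show()

    # Add search optimization for fast point lookups on a specific column
    # (omit the ON clause to enable it table-wide instead).
    session.sql("ALTER TABLE sales ADD SEARCH OPTIMIZATION ON EQUALITY(customer_id)").collect()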
Snowflake Performance: Iceberg Tables, External Tables, and Views
Understanding the concept of views in Snowflake is vital for creating and querying various types of views. Views allow us to use role-based access control to manage permissions and data security effectively, and they are essential for organizing and simplifying data and for ensuring that only authorized users can access sensitive information. In this course, you will discover how to create and query standard, materialized, and secure views in Snowflake. Then you will configure role-based access control to allow users to access specific views, use materialized views to improve the performance of your queries, and use secure views to control access to the details of the underlying table. Next, you will learn how to query data stored in external cloud locations using Snowflake, integrate Snowflake with Google Cloud Storage buckets, and create an external table to access and query data stored on Google Cloud using Snowflake. Finally, you will create and configure Iceberg tables in Snowflake for a modern, high-performance format that aids in managing large-scale datasets, providing features like schema evolution, partitioning, and time travel for enhanced data management.
12 videos | 1h 50m | Has Assessment | Badge available
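The views, external table, and Iceberg table concepts above translate into DDL along these lines (an illustrative sketch only; the stage my_gcs_stage, the external volume my_ext_volume, and all table and view names are placeholders that must already be configured, and materialized views require Enterprise edition or above):

    from snowflake.snowpark import Session

    session = Session.builder.configs(connection_parameters).create()  # placeholder credentials

    # Materialized and secure views over sales and customer tables.
    session.sql("""
        CREATE OR REPLACE MATERIALIZED VIEW mv_daily_sales AS
        SELECT order_date, SUM(amount) AS total FROM sales GROUP BY order_date
    """).collect()
    session.sql("CREATE OR REPLACE SECURE VIEW sv_customers AS SELECT customer_id, region FROM customers").collect()

    # External table over Parquet files in an existing external stage.
    session.sql("""
        CREATE OR REPLACE EXTERNAL TABLE ext_events
            LOCATION = @my_gcs_stage/events/
            FILE_FORMAT = (TYPE = PARQUET)
    """).collect()

    # Snowflake-managed Iceberg table (requires a pre-configured external volume).
    session.sql("""
        CREATE OR REPLACE ICEBERG TABLE iceberg_orders (id INT, amount NUMBER)
            CATALOG = 'SNOWFLAKE'
            EXTERNAL_VOLUME = 'my_ext_volume'
            BASE_LOCATION = 'orders/'
    """).collect()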

COURSES INCLUDED

Data Transformation Using the Snowpark API
The Snowpark API is a framework for writing code in Python, Java, or Scala to work with Snowflake. The Snowpark libraries make it very easy to programmatically implement complex data transformations on Snowflake data using DataFrames. In this course, learn how to use Snowpark with Snowflake, build and execute Snowpark handlers, create and query Snowflake tables, perform data transformations, and use external libraries in Snowpark handlers. Next, discover how to connect to Snowflake from a Jupyter Notebook, create and query tables with Snowpark APIs, handle DataFrames in Snowpark, and implement the commands on DataFrame objects. Finally, explore how to perform DataFrame joins and set operations, leverage views in Snowpark with Snowpark APIs, work with semi-structured Snowpark data, and gain insights by creating and querying tables with semi-structured JSON data. Upon course completion, you will be able to use the Snowpark API with Snowflake.
12 videos | 1h 49m | Has Assessment | Badge available
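A minimal sketch of the kind of DataFrame filtering, aggregation, and joining this course works through (not taken from the course; ORDERS, CUSTOMERS, and the column names are placeholder objects):

    from snowflake.snowpark import Session
    from snowflake.snowpark.functions import col, sum as sum_

    session = Session.builder.configs(connection_parameters).create()  # placeholder credentials

    orders = session.table("ORDERS")        # placeholder tables
    customers = session.table("CUSTOMERS")

    # Filter, aggregate, and join; the work is pushed down and executed in Snowflake.
    big_orders = orders.filter(col("AMOUNT") > 100)
    totals = big_orders.group_by("CUSTOMER_ID").agg(sum_("AMOUNT").alias("TOTAL_SPENT"))
    report = (
        totals.join(customers, totals["CUSTOMER_ID"] == customers["CUSTOMER_ID"])
              .select(customers["NAME"], totals["TOTAL_SPENT"])
    )

    report.show()                                                    # triggers execution
    report.write.save_as_table("CUSTOMER_TOTALS", mode="overwrite")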
Snowpark pandas and User-defined Functions
DataFrames are the core Snowpark table abstraction. Snowpark supports the Snowpark pandas API and its DataFrames, as well as the rich functionality related to user-defined functions (UDFs). In this course, learn how to work with the Snowpark pandas API, create Snowflake Notebooks, use Snowpark pandas via the Modin plugin, and convert between Snowpark pandas and Snowpark DataFrame objects. Next, explore Snowflake UDFs, UDAFs, UDTFs, and stored procedures. Finally, discover how to register and invoke permanent and anonymous UDFs in Snowflake and register UDFs from SQL and Python files. After completing this course, you will be able to use Snowpark pandas DataFrames and register and invoke user-defined functions (UDFs).
9 videos | 1h 20m | Has Assessment | Badge available
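To illustrate the Snowpark pandas API and UDF registration described above, here is a small sketch (assumptions: the Modin plugin for Snowpark pandas is installed, and SALES and WEATHER are placeholder tables with the columns shown):

    import modin.pandas as pd
    import snowflake.snowpark.modin.plugin          # enables the Snowpark pandas backend
    from snowflake.snowpark import Session
    from snowflake.snowpark.functions import udf, col
    from snowflake.snowpark.types import FloatType

    session = Session.builder.configs(connection_parameters).create()  # placeholder credentials

    # Snowpark pandas DataFrame backed by a Snowflake table.
    pdf = pd.read_snowflake("SALES")
    print(pdf.describe())

    # Register an anonymous UDF and apply it column-wise in a Snowpark DataFrame query.
    to_fahrenheit = udf(lambda c: c * 9 / 5 + 32, return_type=FloatType(), input_types=[FloatType()])
    session.table("WEATHER").select(to_fahrenheit(col("TEMP_C")).alias("TEMP_F")).show()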
Snowpark UDTFs, UDAFs, and Stored Procedures
Snowpark offers powerful tools for developers to write custom code in the form of UDFs, UDTFs, UDAFs, and stored procedures, each of which is implemented using handlers. In this course, learn about Snowflake UDTFs and partitioning, register and invoke UDTFs, construct a UDTF to normalize denormalized JSON data, and implement stateful processing using the end_partition and init functions. Next, discover how to partition rows to sort within a partition using UDTFs, explore Snowflake UDAFs and UDAF handler class methods, perform aggregation operations, and implement UDAFs that use Python objects and user-defined classes. Finally, examine Snowflake stored procedures and differentiate them from UDFs, UDTFs, and UDAFs, as well as register and invoke stored procedures and write Python functions using the Snowpark APIs. Upon completion of this course, you will be able to outline and use Snowpark UDTFs, UDAFs, and stored procedures.
13 videos | 1h 51m | Has Assessment | Badge available
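As a taste of the handler-based pattern covered here, the following is a minimal UDTF sketch (illustrative only; SquareRange and square_range are made-up names, not course code):

    from snowflake.snowpark import Session
    from snowflake.snowpark.functions import lit
    from snowflake.snowpark.types import IntegerType, StructType, StructField

    session = Session.builder.configs(connection_parameters).create()  # placeholder credentials

    # A UDTF handler class: process() yields zero or more output rows per input row.
    class SquareRange:
        def process(self, n: int):
            for i in range(n):
                yield (i, i * i)

    session.udtf.register(
        SquareRange,
        output_schema=StructType([StructField("NUM", IntegerType()), StructField("SQUARED", IntegerType())]),
        input_types=[IntegerType()],
        name="square_range",
        replace=True,
    )

    # Invoke the registered UDTF as a table function.
    session.table_function("square_range", lit(5)).show()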

COURSES INCLUDED

Continuous Data Pipelines and Dynamic Tables in Snowflake
Snowflake offers powerful support for the construction of complex data pipelines. This support includes constructs such as dynamic tables, streams, and tasks. In this course, learn about continuous data pipelines in Snowflake, including Snowflake's support for continuous data loading and transformation, change data capture, and recurring operations, and configure dynamic tables to manage and automate these processes. Next, discover how to create and configure dynamic tables, set properties to control their behavior, and verify the change tracking property of base tables. Finally, explore how to connect and manage dependencies between dynamic tables, create dynamic tables that depend on other dynamic tables, ensure their refresh modes are compatible, and configure dynamic tables to refresh on demand. After completing this course, you will be able to describe continuous data pipelines and use and configure dynamic tables in Snowflake.
8 videos | 1h 3m | Has Assessment | Badge available
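A dynamic table of the kind configured in this course looks roughly like this (a sketch, not course material; demo_wh, raw_orders, and daily_totals are placeholders):

    from snowflake.snowpark import Session

    session = Session.builder.configs(connection_parameters).create()  # placeholder credentials

    # Dynamic table that stays within one minute of its base table.
    session.sql("""
        CREATE OR REPLACE DYNAMIC TABLE daily_totals
            TARGET_LAG = '1 minute'
            WAREHOUSE = demo_wh
            AS SELECT order_date, SUM(amount) AS total FROM raw_orders GROUP BY order_date
    """).collect()

    # Trigger a refresh on demand and inspect the refresh history.
    session.sql("ALTER DYNAMIC TABLE daily_totals REFRESH").collect()
    session.sql("SELECT * FROM TABLE(INFORMATION_SCHEMA.DYNAMIC_TABLE_REFRESH_HISTORY(NAME => 'DAILY_TOTALS'))").show()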
Streams and Change Data Capture in Snowflake
Streams are Snowflake's construct for change data capture (CDC) and process only changes in an underlying table or view. Used with dynamic tables and tasks, streams are an important and powerful building block of pipelines in Snowflake. In this course, learn about the usage and internal workings of streams for CDC, stream types, and standard stream contents during insert, update, and delete operations. Next, discover how to create and read standard streams, combine stream contents with the target table for inserts and updates, and examine the effects of insert, update, and delete operations on standard stream contents. Finally, explore append-only streams, the relationship between streams and transactions, repeatable read isolation in streams, stream behavior within transactions, and how to implement streams on views. Upon course completion, you will be able to outline streams and change data capture in Snowflake.
11 videos | 1h 27m | Has Assessment | Badge available
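The standard and append-only streams discussed above follow a pattern like this sketch (illustrative only; raw_orders and orders_history are placeholder tables assumed to have id, item, and price columns):

    from snowflake.snowpark import Session

    session = Session.builder.configs(connection_parameters).create()  # placeholder credentials

    # Create a standard stream on a table and generate a change to capture.
    session.sql("CREATE OR REPLACE STREAM orders_stream ON TABLE raw_orders").collect()
    session.sql("INSERT INTO raw_orders VALUES (1, 'widget', 9.99)").collect()

    # Reading the stream shows the changed rows plus METADATA$ACTION,
    # METADATA$ISUPDATE, and METADATA$ROW_ID.
    session.sql("SELECT * FROM orders_stream").show()

    # Consuming the stream in a DML statement advances its offset.
    session.sql("INSERT INTO orders_history SELECT id, item, price FROM orders_stream").collect()

    # Append-only streams capture inserts but ignore updates and deletes.
    session.sql("CREATE OR REPLACE STREAM orders_inserts ON TABLE raw_orders APPEND_ONLY = TRUE").collect()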
Using Tasks and Architecting Snowflake Data Pipelines
Tasks are Snowflake constructs that can execute code on a fixed schedule or when a stream has data to consume. Tasks are similar to cron jobs but are more powerful because they can be chained into complex dependency networks called task graphs. In this course, learn about continuous data processing tasks, create and execute scheduled serverless and user-managed tasks, and implement task graphs and child tasks. Next, use dummy root nodes to bypass Snowflake's root node restrictions, create and use triggered tasks, and construct an architecture that utilizes streams, tasks, stages, and dynamic tables to feed into a dynamic dashboard. Finally, discover how to implement data pipelines with a stage, scheduled task, and table, add dynamic pipelines and triggered tasks to data pipelines, and create dashboards in Snowflake to consume data from different tables. Upon completion of this course, you will be able to use tasks and architect Snowflake data pipelines.
14 videos | 1h 49m | Has Assessment | Badge available
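A minimal sketch of a scheduled serverless task with a stream condition and a chained child task (not course code; orders_stream, orders_history, and daily_totals are placeholders carried over from the sketches above):

    from snowflake.snowpark import Session

    session = Session.builder.configs(connection_parameters).create()  # placeholder credentials

    # Serverless scheduled task that only runs when the stream has data to consume.
    session.sql("""
        CREATE OR REPLACE TASK load_task
            SCHEDULE = '5 MINUTE'
            USER_TASK_MANAGED_INITIAL_WAREHOUSE_SIZE = 'XSMALL'
            WHEN SYSTEM$STREAM_HAS_DATA('ORDERS_STREAM')
            AS INSERT INTO orders_history SELECT id, item, price FROM orders_stream
    """).collect()

    # Child task in the same task graph, triggered after the root finishes.
    session.sql("""
        CREATE OR REPLACE TASK refresh_task
            AFTER load_task
            AS ALTER DYNAMIC TABLE daily_totals REFRESH
    """).collect()

    # Tasks are created suspended; resume children before the root.
    session.sql("ALTER TASK refresh_task RESUME").collect()
    session.sql("ALTER TASK load_task RESUME").collect()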

COURSES INCLUDED

Snowpark ML APIs and the Model Registry
Snowflake has several powerful AI/ML features. These are available under two broad categories: Snowflake Cortex for LLM-related activities and Snowflake ML for more traditional ML model-building. In this course, explore how Snowflake integrates AI/ML capabilities across its platform, how Snowpark ML APIs support model training with popular libraries, and how to perform hyperparameter tuning to optimize model performance. Next, learn how to configure Python and Jupyter for Snowflake ML and set up a virtual environment to run a Jupyter Notebook that leverages Snowflake ML APIs. Finally, discover how to connect to Snowflake using the Snowpark API, work with the Snowflake Model Registry, and manage models. Upon course completion, you will be able to outline Snowpark ML APIs and the Snowflake Model Registry.
10 videos | 1h 20m | Has Assessment | Badge available
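To ground the model training and registry workflow described above, here is a hedged sketch using the snowflake-ml-python package (not from the course; CHURN_TRAIN, ML_DB, MODELS, the column names, and churn_model are all placeholders):

    from snowflake.snowpark import Session
    from snowflake.ml.modeling.xgboost import XGBClassifier
    from snowflake.ml.registry import Registry

    session = Session.builder.configs(connection_parameters).create()  # placeholder credentials

    train_df = session.table("CHURN_TRAIN")   # placeholder training table

    # Train with a scikit-learn-style Snowpark ML estimator; computation runs in Snowflake.
    clf = XGBClassifier(
        input_cols=["TENURE", "MONTHLY_CHARGES"],
        label_cols=["CHURNED"],
        output_cols=["PREDICTED_CHURN"],
    )
    clf.fit(train_df)

    # Log the fitted model to the Snowflake Model Registry and run predictions from it.
    reg = Registry(session=session, database_name="ML_DB", schema_name="MODELS")
    mv = reg.log_model(clf, model_name="churn_model", version_name="v1")
    mv.run(train_df, function_name="predict").show()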
Snowflake Feature Store and Datasets
Snowflake ML has now introduced Snowflake Feature Store, which can improve collaboration, break down data silos, and facilitate feature reuse. Datasets are another great new feature, offering data versioning to drive model result reproducibility. In this course, learn how to use Snowflake Datasets for various data operations, including materializing DataFrames into datasets and managing versions, building a Snowflake ML pipeline for logistic regression, and creating and applying model tags. Next, discover how to utilize Snowpark-optimized warehouses for hyperparameter tuning, register tuned models with the Snowflake Model Registry, and create feature stores and entities using Snowpark APIs. Finally, explore how to build managed feature views and the workflow of feature stores. After course completion, you will be able to use Snowflake Feature Store and datasets.
14 videos | 1h 58m | Has Assessment | Badge available
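The feature store workflow above, in rough outline (a sketch under assumptions, not course code; ML_DB, CUSTOMER_FEATURES, DEMO_WH, and the orders table are placeholders):

    from snowflake.snowpark import Session
    from snowflake.ml.feature_store import FeatureStore, Entity, FeatureView, CreationMode

    session = Session.builder.configs(connection_parameters).create()  # placeholder credentials

    fs = FeatureStore(
        session=session,
        database="ML_DB",
        name="CUSTOMER_FEATURES",
        default_warehouse="DEMO_WH",
        creation_mode=CreationMode.CREATE_IF_NOT_EXIST,
    )

    # An entity defines the join keys that feature views are organized around.
    customer = Entity(name="CUSTOMER", join_keys=["CUSTOMER_ID"])
    fs.register_entity(customer)

    # A managed feature view is refreshed by Snowflake from its defining DataFrame.
    spend_df = session.sql("SELECT customer_id, SUM(amount) AS total_spend FROM orders GROUP BY customer_id")
    fv = FeatureView(name="customer_spend", entities=[customer], feature_df=spend_df, refresh_freq="1 day")
    fs.register_feature_view(fv, version="1")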
Using Streamlit with Snowflake
Streamlit is an open-source library used to build interactive, visual-heavy web applications that work with a variety of DataFrames, including Snowpark DataFrames. For that reason, it is a natural fit with Snowflake. In this course, learn how to use the Streamlit library to create interactive web applications within Snowflake, build basic Streamlit apps directly in Snowsight, and enhance your Streamlit apps by adding visualizations using seaborn and Matplotlib. Next, discover how to implement various UI controls, build a UI where the user selects their ideal model type, and access the model registry within a Streamlit app. Finally, explore how to share your completed Streamlit app with other users in view-only mode. Upon course completion, you will be able to use Streamlit with Snowflake.
8 videos | 1h 3m | Has Assessment | Badge available
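A tiny Streamlit in Snowflake sketch in the spirit of the apps built in this course (illustrative only; it assumes it runs inside Snowsight where a session is already active, and SALES_BY_REGION with REGION, ORDER_DATE, and TOTAL columns is a placeholder table):

    import streamlit as st
    from snowflake.snowpark.context import get_active_session

    session = get_active_session()

    st.title("Daily Sales")
    region = st.selectbox("Region", ["EMEA", "AMER", "APAC"])

    # Pull the filtered data into pandas for charting.
    df = session.table("SALES_BY_REGION").filter(f"REGION = '{region}'").to_pandas()
    st.bar_chart(df.set_index("ORDER_DATE")["TOTAL"])
    st.dataframe(df)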
Anomaly Detection with Snowflake ML Functions
Snowflake ML functions offer powerful SQL functionality for several common use cases, including anomaly detection, time series forecasting, and classification. In this course, learn about the types of models available in Snowflake ML functions, when to use these functions for different Snowflake machine learning tasks, and the required data formats for input into anomaly detection and forecasting models. Next, examine how to use Snowflake ML functions to implement anomaly detection, interpret the output of the anomaly detection model, tune model sensitivity, and save model results. Finally, discover how to add exogenous variables for anomaly detection model enhancement and extend a model to work with multi-series data. After completing this course, you will be able to implement anomaly detection with Snowflake ML functions.
12 videos | 1h 36m | Has Assessment | Badge available
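The anomaly detection ML functions follow the train-then-call pattern sketched below (not course code; traffic_history and traffic_new are placeholder tables or views with ts and requests columns):

    from snowflake.snowpark import Session

    session = Session.builder.configs(connection_parameters).create()  # placeholder credentials

    # Train an anomaly detection model on historical, unlabeled data.
    session.sql("""
        CREATE OR REPLACE SNOWFLAKE.ML.ANOMALY_DETECTION traffic_model(
            INPUT_DATA => TABLE(traffic_history),
            TIMESTAMP_COLNAME => 'ts',
            TARGET_COLNAME => 'requests',
            LABEL_COLNAME => ''
        )
    """).collect()

    # Score new rows and inspect which ones the model flags as anomalous.
    session.sql("""
        CALL traffic_model!DETECT_ANOMALIES(
            INPUT_DATA => TABLE(traffic_new),
            TIMESTAMP_COLNAME => 'ts',
            TARGET_COLNAME => 'requests'
        )
    """).show()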
Snowflake Forecasting Models and the AI & ML Studio
Snowflake ML functions support forecasting models that share many characteristics with anomaly detection models. In addition, the new AI & ML Studio provides a point-and-click interface for building models for forecasting, classification, and anomaly detection. In this course, learn how to create and use Snowflake forecasting models, build a time series forecasting model with ML functions, and enhance your time series forecasting models. Next, examine how to handle multiple time series concurrently, use Snowflake AI & ML Studio for forecasting, and generate SQL code to build and execute a time series forecasting model. Finally, discover how to use Snowflake AI & ML Studio for classification tasks, generate and execute SQL code to invoke ML classification functions, and evaluate classification model output.
12 videos | 1h 28m | Has Assessment | Badge available
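For comparison with the anomaly detection sketch, a forecasting model built with ML functions looks roughly like this (illustrative only; daily_sales with order_date and total columns is a placeholder, and the 14-period horizon is an arbitrary example):

    from snowflake.snowpark import Session

    session = Session.builder.configs(connection_parameters).create()  # placeholder credentials

    # Train a forecasting model on a single time series.
    session.sql("""
        CREATE OR REPLACE SNOWFLAKE.ML.FORECAST sales_forecast(
            INPUT_DATA => TABLE(daily_sales),
            TIMESTAMP_COLNAME => 'order_date',
            TARGET_COLNAME => 'total'
        )
    """).collect()

    # Produce a 14-period forecast with prediction intervals.
    session.sql("CALL sales_forecast!FORECAST(FORECASTING_PERIODS => 14)").show()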
Snowflake Cortex for LLMs, RAG, and Search
Snowflake Cortex is a suite of AI-related features that facilitate working with large language models (LLMs). These include Snowflake Copilot, AI-powered Universal Search, and Cortex Search for Retrieval Augmented Generation (RAG). In this course, learn how to use Snowflake Cortex to work with LLMs, adjust hyperparameters to control LLM output, and use the COMPLETE Cortex LLM function. Next, discover how to use Cortex LLM functions directly from SQL and Python and utilize advanced features like Snowflake Copilot, Universal Search, and Document AI for user experience enhancement through AI-driven search and document processing. Finally, explore how to customize LLMs through Cortex Fine-Tuning and implement Retrieval Augmented Generation using Cortex Search. Upon course completion, you will be able to work with LLMs in Snowflake Cortex.
12 videos | 1h 36m | Has Assessment | Badge available
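Calling Cortex LLM functions from SQL, as covered in this course, can be as simple as the sketch below (not course code; 'llama3.1-8b' is just one example model name, and model availability varies by region and account):

    from snowflake.snowpark import Session

    session = Session.builder.configs(connection_parameters).create()  # placeholder credentials

    # Free-form completion with an LLM hosted in Snowflake Cortex.
    session.sql("""
        SELECT SNOWFLAKE.CORTEX.COMPLETE('llama3.1-8b',
            'Summarize in one sentence why clustering keys matter in Snowflake.')
    """).show()

    # Task-specific Cortex functions for sentiment and translation.
    session.sql("SELECT SNOWFLAKE.CORTEX.SENTIMENT('The migration went smoothly and queries are faster.')").show()
    session.sql("SELECT SNOWFLAKE.CORTEX.TRANSLATE('Data pipelines are running.', 'en', 'fr')").show()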

EARN A DIGITAL BADGE WHEN YOU COMPLETE THESE TRACKS

Skillsoft provides the opportunity to earn a digital badge upon successful completion of some of our courses. These badges can be shared on any social network or business platform.

Digital badges are yours to keep, forever.
