SKILL BENCHMARK
Data Transformation with Snowpark Proficiency (Advanced Level)
- 25m
- 25 questions
The Data Transformation with Snowpark Proficiency (Advanced Level) benchmark measures your ability to use Snowpark to transform and query data programmatically without the data leaving Snowflake. You will be evaluated on your skills in using Snowpark pandas DataFrames and distinguishing them from Snowpark DataFrames, registering and invoking user-defined functions (UDFs), and implementing and analyzing user-defined table functions (UDTFs), user-defined aggregate functions (UDAFs), and stored procedures in Snowpark. Learners who score well on this benchmark demonstrate solid hands-on experience with Snowpark and can transform data and contribute to advanced Snowflake projects with minimal supervision.
Topics covered
- connect to Snowflake from a Jupyter Notebook and create and query tables using the Snowpark APIs (see the connection sketch after this list)
- construct a UDTF that normalizes denormalized JSON data
- convert between Snowpark pandas and Snowpark DataFrame objects and contrast their behaviors
- create and query Snowflake tables using Snowpark
- create a Snowflake Notebook and use Snowpark pandas via the Modin plugin (see the Snowpark pandas sketch after this list)
- create a table with semi-structured JSON data and query it using Snowpark
- identify the uses of stored procedures and contrast them with UDFs, UDTFs, and UDAFs in Snowflake
- implement the __init__ and end_partition methods in a UDTF handler to achieve stateful processing
- implement the Snowpark equivalents of SELECT, FROM, WHERE, GROUP BY, and ORDER BY on DataFrame objects (see the transformation sketch after this list)
- implement UDAFs to use Python objects and objects of user-defined classes
- implement union, intersect, and difference operations and joins on Snowpark DataFrames (see the set-operation sketch after this list)
- make standard Anaconda libraries and custom Python code available within a Snowpark handler
- outline UDAFs and the methods of a UDAF handler class (see the UDAF sketch after this list)
- outline UDFs, UDTFs, UDAFs, and stored procedures and compare and contrast them
- outline UDTFs and partitioning in Snowflake and the methods a UDTF handler class implements (see the UDTF sketch after this list)
- partition rows and sort within a partition using UDTFs
- perform data transformations equivalent to SQL queries with group_by and order_by clauses in Snowpark
- register and invoke stored procedures (see the stored procedure sketch after this list)
- register and invoke UDAFs to perform aggregation operations
- register and invoke UDTFs
- register anonymous UDFs using the udf function and session.udf.register, and invoke them using call_udf (see the UDF sketch after this list)
- register permanent UDFs using the @udf decorator and then invoke them from different sessions
- register UDFs from SQL and Python files
- use the Snowpark APIs to create views from any DataFrame object
- write a Python function using the Snowpark APIs and then directly deploy it to a stored procedure
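For orientation, the sketches below illustrate several of these topic areas. This first one is a minimal sketch of connecting from a notebook, creating and querying a table, and querying a VARIANT column holding semi-structured JSON; the connection parameters, table names, and columns are all hypothetical placeholders.

```python
from snowflake.snowpark import Session
from snowflake.snowpark.functions import col

# Placeholder connection parameters; substitute your own account values.
connection_parameters = {
    "account": "<account_identifier>",
    "user": "<user>",
    "password": "<password>",
    "warehouse": "<warehouse>",
    "database": "<database>",
    "schema": "<schema>",
}
session = Session.builder.configs(connection_parameters).create()

# Create a table from local data and read it back.
orders = session.create_dataframe(
    [(1, "widget", 3), (2, "gadget", 5), (3, "widget", 1)],
    schema=["order_id", "item", "quantity"],
)
orders.write.save_as_table("orders", mode="overwrite")
session.table("orders").show()

# Store semi-structured JSON in a VARIANT column and query nested fields.
session.sql("CREATE OR REPLACE TABLE raw_events (payload VARIANT)").collect()
session.sql(
    """INSERT INTO raw_events
       SELECT PARSE_JSON('{"user": "ada", "clicks": 7}')"""
).collect()
session.table("raw_events").select(
    col("payload")["user"].alias("user"),      # bracket access on VARIANT
    col("payload")["clicks"].alias("clicks"),
).show()
```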
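Next, a sketch of SQL-equivalent transformations on a Snowpark DataFrame, reusing the session and the hypothetical orders table from the sketch above; any resulting DataFrame can also be persisted as a view.

```python
from snowflake.snowpark.functions import col, sum as sum_

orders = session.table("orders")

# SELECT item, SUM(quantity) AS total FROM orders
# WHERE quantity > 1 GROUP BY item ORDER BY total DESC
result = (
    orders
    .filter(col("quantity") > 1)             # WHERE
    .group_by("item")                        # GROUP BY
    .agg(sum_("quantity").alias("total"))    # SELECT with an aggregate
    .sort(col("total").desc())               # ORDER BY
)
result.show()

# Any DataFrame can be saved as a view and then queried from SQL.
result.create_or_replace_view("item_totals")
```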
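A sketch of set operations and joins on Snowpark DataFrames, again reusing the session; the data is made up for illustration.

```python
df1 = session.create_dataframe([(1, "a"), (2, "b")], schema=["id", "tag"])
df2 = session.create_dataframe([(2, "b"), (3, "c")], schema=["id", "tag"])

df1.union(df2).show()      # UNION (duplicates removed; union_all keeps them)
df1.intersect(df2).show()  # INTERSECT
df1.except_(df2).show()    # EXCEPT / MINUS (set difference)

# Inner join on the shared id column.
labels = session.create_dataframe([(1, "x"), (2, "y")], schema=["id", "label"])
df1.join(labels, on="id", how="inner").show()
```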
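A sketch of Snowpark pandas via the Modin plugin and of converting between the two DataFrame types. The conversion methods shown (pd.read_snowflake, to_snowpark, and to_snowpark_pandas) reflect my understanding of the plugin's API and should be verified against your installed snowflake-snowpark-python version; the key behavioral contrast is that Snowpark pandas exposes the pandas API while keeping execution inside Snowflake.

```python
import modin.pandas as pd
import snowflake.snowpark.modin.plugin  # noqa: F401 (activates the Snowflake backend)

# Snowpark pandas DataFrame: pandas syntax, executed inside Snowflake.
pdf = pd.read_snowflake("orders")   # hypothetical table from the first sketch
print(pdf.groupby("item")["quantity"].sum())

# Snowpark pandas -> Snowpark DataFrame.
sdf = pdf.to_snowpark()

# Snowpark DataFrame -> Snowpark pandas.
pdf_again = session.table("orders").to_snowpark_pandas()
```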
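A sketch of the main UDF registration styles: an anonymous UDF built with the udf function, a named UDF registered through session.udf.register and invoked with call_udf, a permanent UDF created with the @udf decorator so that other sessions can call it, and registration from a Python source file. The stage, file, and function names are hypothetical.

```python
from snowflake.snowpark.functions import call_udf, col, udf
from snowflake.snowpark.types import IntegerType

# Anonymous UDF: usable only through the returned object in this session.
add_one = udf(lambda x: x + 1, return_type=IntegerType(), input_types=[IntegerType()])

# Named (temporary) UDF, invocable by name with call_udf.
session.udf.register(
    lambda x: x * 2,
    return_type=IntegerType(),
    input_types=[IntegerType()],
    name="double_it",
    replace=True,
)

# Permanent UDF: is_permanent plus a stage location keeps it available
# to other sessions. "@my_stage" is a hypothetical stage.
@udf(name="triple_it", is_permanent=True, stage_location="@my_stage", replace=True)
def triple_it(x: int) -> int:
    return x * 3

df = session.create_dataframe([[1], [2]], schema=["n"])
df.select(add_one(col("n")), call_udf("double_it", col("n")), triple_it(col("n"))).show()

# Registration from a Python file on disk (path and function are hypothetical).
session.udf.register_from_file(
    file_path="udfs/my_udfs.py",
    func_name="quadruple_it",
    return_type=IntegerType(),
    input_types=[IntegerType()],
    name="quadruple_it",
    replace=True,
)
```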
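A sketch of a stateful UDTF: __init__ initializes per-partition state, process emits one output row per input row, and end_partition flushes a summary row once the partition is exhausted. The over(partition_by=..., order_by=...) clause controls how rows are partitioned and sorted before the handler sees them. All names are hypothetical.

```python
from snowflake.snowpark.functions import col, table_function
from snowflake.snowpark.types import IntegerType, StructField, StructType

class RunningTotal:
    def __init__(self):
        self._total = 0               # state carried across rows of one partition

    def process(self, amount):
        self._total += amount
        yield (amount, self._total)   # one output row per input row

    def end_partition(self):
        yield (None, self._total)     # final summary row for the partition

session.udtf.register(
    RunningTotal,
    output_schema=StructType([
        StructField("seen_amount", IntegerType()),
        StructField("running_total", IntegerType()),
    ]),
    input_types=[IntegerType()],
    name="running_total",
    replace=True,
)

sales = session.create_dataframe(
    [("east", 10), ("east", 20), ("west", 5)],
    schema=["region", "amount"],
)

# Partition by region and sort within each partition before processing.
running_total = table_function("running_total")
sales.join_table_function(
    running_total(col("amount")).over(partition_by="region", order_by="amount")
).show()
```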
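A sketch of a UDAF handler class. The four members below (the aggregate_state property, accumulate, merge, and finish) make up the interface Snowflake expects; the state here is a plain Python int, but it could equally be an object of a user-defined class.

```python
from snowflake.snowpark.functions import col
from snowflake.snowpark.types import IntegerType

class PythonSum:
    def __init__(self):
        self._sum = 0

    @property
    def aggregate_state(self):
        return self._sum          # intermediate state Snowflake can ship between instances

    def accumulate(self, value):
        if value is not None:     # fold one input row into the state
            self._sum += value

    def merge(self, other_state):
        self._sum += other_state  # combine a partial aggregate from another instance

    def finish(self):
        return self._sum          # final aggregate value

python_sum = session.udaf.register(
    PythonSum,
    return_type=IntegerType(),
    input_types=[IntegerType()],
    name="python_sum",
    replace=True,
)

df = session.create_dataframe([[1], [2], [3]], schema=["n"])
df.agg(python_sum(col("n"))).show()   # expect 6
```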
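Finally, a sketch of deploying a Python function directly as a stored procedure. Unlike UDFs, UDTFs, and UDAFs, which run per row or per group inside a query, a stored procedure receives a Session and can run its own queries and DDL. add_packages and add_import make Anaconda libraries and local code available in the server-side sandbox; the table and module names are hypothetical.

```python
from snowflake.snowpark import Session
from snowflake.snowpark.functions import col

# Make libraries (and, if needed, local code) available inside the handler.
session.add_packages("snowflake-snowpark-python", "numpy")
# session.add_import("helpers/cleaning.py")   # hypothetical local module

def archive_small_orders(session: Session, threshold: int) -> str:
    (
        session.table("orders")
        .filter(col("quantity") < threshold)
        .write.save_as_table("orders_archive", mode="append")
    )
    return f"archived orders with quantity < {threshold}"

# Type hints supply the return and input types at registration.
session.sproc.register(archive_small_orders, name="archive_small_orders", replace=True)

# Stored procedures are invoked with session.call, not inside a SELECT.
print(session.call("archive_small_orders", 3))
```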