SKILL BENCHMARK
Data for Leaders Competency (Intermediate Level)
- 20 minutes
- 20 questions
The Data for Leaders Competency benchmark measures whether a learner has had exposure to core data concepts and terminology. Learners are evaluated on their ability to recognize key concepts such as big data, data governance and management, and emerging data architectures. A learner who scores high on this benchmark demonstrates the foundational skills needed to understand data-related technologies, tools, and frameworks.
Topics covered
- compare and contrast various data warehousing schemas, such as Star and Snowflake
- compare key differences in ETL (extract, transform, load) and ELT (extract, load, transform) systems and describe how ETL is used with traditional data architectures and ELT with modern ones
- define key characteristics and requirements for a reliable data collection pipeline
- describe situations when normalization or denormalization is needed and name the key steps of each process
- describe the architecture of a modern data lake
- describe the concept of a data mart and how it can support business decision-making through data mining
- describe the differences between a data lake and a data warehouse
- describe the process of uncovering correlations, market trends, patterns, and customer behavior using big data
- identify how document databases are designed to store and query data as JSON-like documents and outline their benefits and use cases
- list and describe five main challenges when dealing with big data
- list the processes essential to preparing data and specify the goal of each process
- list widely used data management architectures
- name and describe the features of Hadoop HDFS and identify common in-memory storage systems including Kudu, Elasticsearch, and CockroachDB
- name and describe the most commonly used ETL tools and software
- name the components utilized in a cloud computing architecture
- name the major issues involved and strategies used when trying to achieve data compliance
- name the steps and processes essential to any data science project
- outline how stream processing enables quick decision-making by creating actionable real-time insights
- recognize how Spark offers an open-source, scalable, massively parallel, in-memory solution for analytics applications
- specify how multi-model databases combine different types of database models into one integrated database engine
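Several of the topics above (warehousing schemas, denormalization, data marts) come together in a star schema, where a central fact table references denormalized dimension tables. The following is a minimal sketch using Python's built-in sqlite3; all table and column names are illustrative assumptions, not part of the benchmark:

```python
import sqlite3

# Illustrative star schema: one fact table surrounded by denormalized
# dimension tables (all names here are hypothetical examples).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, day TEXT, month TEXT, year INTEGER);
CREATE TABLE fact_sales (
    sale_id INTEGER PRIMARY KEY,
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id INTEGER REFERENCES dim_date(date_id),
    units INTEGER,
    revenue REAL
);
""")
cur.execute("INSERT INTO dim_product VALUES (1, 'Widget', 'Hardware')")
cur.execute("INSERT INTO dim_date VALUES (10, '2024-01-15', 'January', 2024)")
cur.execute("INSERT INTO fact_sales VALUES (100, 1, 10, 3, 29.97)")

# A typical star-schema query: aggregate the fact table while
# slicing by attributes joined in from the dimension tables.
cur.execute("""
SELECT p.category, d.year, SUM(f.revenue)
FROM fact_sales f
JOIN dim_product p ON f.product_id = p.product_id
JOIN dim_date d ON f.date_id = d.date_id
GROUP BY p.category, d.year
""")
rows = cur.fetchall()
print(rows)
```

A Snowflake schema would further normalize the dimensions (for example, splitting `category` out of `dim_product` into its own table) at the cost of extra joins.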
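The document-database topic above can likewise be made concrete. The toy sketch below uses plain Python dicts and the stdlib json module to mimic how a document store keeps records as JSON-like documents with flexible, nested structure; the collection, fields, and helper function are hypothetical illustrations, not a real database API:

```python
import json

# A "collection" of JSON-like documents: each record carries its own
# structure, and fields may be nested or absent (schema flexibility).
docs = [
    {"_id": 1, "name": "Ana", "address": {"city": "Lisbon"}, "tags": ["lead"]},
    {"_id": 2, "name": "Ben", "address": {"city": "Porto"}},
]

def find_by_city(collection, city):
    """Query documents on a nested field, the way a document DB
    matches against paths inside each stored document."""
    return [d for d in collection if d.get("address", {}).get("city") == city]

matches = find_by_city(docs, "Lisbon")
print(matches)

# Documents round-trip naturally as JSON, which is why this model
# suits data that already arrives in JSON form.
serialized = json.dumps(docs[0])
restored = json.loads(serialized)
```

Real document databases add indexing, a query language, and persistence on top of this model, which is what makes them practical at scale.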