SKILL BENCHMARK
Big Data Awareness (Entry Level)
- 15m
- 15 questions
The Big Data Awareness benchmark measures whether a learner has exposure to big data concepts, including what big data is, its sources and formats, and the applications and use cases for big data analytics. A learner who scores high on this benchmark demonstrates foundational knowledge of big data.
Topics covered
- briefly describe traditional data and data warehousing architecture
- compare and contrast parallel and distributed computing systems
- compare the key differences between ETL (extract, transform, load) and ELT (extract, load, transform) systems, and describe how ETL is used with traditional data architectures and ELT with modern ones
- compare structured and unstructured data and describe how the ability to extract value from unstructured data is important when dealing with big data
- define the seven characteristics (the "seven Vs") of big data: volume, velocity, variety, variability, veracity, visualization, and value
- describe how business intelligence analytics has developed from traditional to modern approaches
- describe the concept of big data and the history behind it
- describe the difference between data warehousing and big data and specify the impact that big data has had on data warehousing
- describe the difference between horizontal and vertical scaling and specify why horizontal scaling is generally preferred for big data
- distinguish between raw data, information, applicable knowledge, and general wisdom
- identify the sources that are capable of generating big data
- list and describe the limitations of traditional data architecture, including limitations on speed, scalability, compatibility, and consumption
- list and describe the limitations of using ETL systems when working with data, including limitations on performance, scalability, and structure
- list the most commonly used data sources and formats
- specify why real-time processing is advantageous when dealing with large amounts of data