Architecting Data: Data Architecture
- 8 Courses | 7h 2m 46s
- 3 Books | 13h 41m
Data architecture is a combination of rules, models, policies, and standards that govern what type of data is collected and how it is managed within an organization.
COURSES INCLUDED
Scalable Data Architectures: Getting Started
Explore theoretical foundations of the need for and characteristics of scalable data architectures in this 8-video course. Learn to use data warehouses to store, process, and analyze big data. Key concepts covered here include how to recognize the need to scale architectures to keep up with needs for storage and processing of big data; how to identify characteristics of data warehouses ideally suiting them to tasks of big data analysis and processing; and how to distinguish between relational databases and data warehouses. Next, learn to recognize specific characteristics of systems meant for online transaction processing and online analytical processing, and how data warehouses are an example of online analytical processing (OLAP) systems. Then, learn to identify various components of data warehouses enabling them to work with varied sources, extract and transform big data, and generate reports of analysis operations efficiently. Finally, study features of Amazon Redshift enabling big data to be processed at scale; features of data warehouses, contrasted with those of relational databases; and two options available to scale compute capacity.
8 videos | 52m
Assessment
Badge
Scalable Data Architectures: Using Amazon Redshift
Using a hands-on lab approach, explore how to use Amazon Redshift to set up and configure a data warehouse on the cloud in this 9-video course. Discover how to interact with the Redshift service with both the console and the Amazon Web Services (AWS) Command Line Interface (CLI). Key concepts covered here include how to use the Amazon Redshift Quick Launch feature to provision a data warehouse; provisioning a Redshift cluster with the default cluster; and tool configuration options for a Redshift cluster, and metrics available to optimize a cluster configuration. Next, learn how to create Identity and Access Management (IAM) roles on AWS that include the necessary permissions to interact with Redshift and S3 services; to provision an IAM user that can connect to and interact with AWS using the CLI; and to install the AWS CLI to create and delete Redshift clusters. Then learn to use the Redshift Query Editor to create tables, load data, and run queries; and learn the features of Amazon Redshift and the commands and configurations needed to work with Redshift by using the CLI.
9 videos | 54m
Assessment
Badge
Scalable Data Architectures: Using Amazon Redshift & QuickSight
In this 12-video course, explore the loading of data from an external source such as Amazon S3 into a Redshift cluster, as well as configuration of snapshots and resizing of clusters. Discover how to use Amazon QuickSight to visualize data. Key concepts covered in this course include using the AWS console to load data sets to Amazon S3 and then into a table provisioned on a Redshift cluster; running queries on data in a Redshift cluster with the query evaluation feature; and working with SQL Workbench to connect to and query data in a Redshift cluster. Learn how to disable automated snapshots for a Redshift cluster and configure a table to be excluded from snapshots; recover an individual table from the snapshot of an entire cluster; and create a security group rule enabling access from Amazon's QuickSight servers to a Redshift cluster. Next, configure Amazon QuickSight to load data from a table in a Redshift cluster for analysis; and use the QuickSight dashboard to generate a time series plot to visualize sales at a retailer over time.
12 videos | 1h 17m
Assessment
Badge
Cloud Data Architecture: Data Management & Adoption Frameworks
Explore how to implement containers and data management on popular cloud platforms like Amazon Web Services (AWS) and Google Cloud Platform (GCP) for data science. Planning big data solutions, disaster recovery, and backup and restore in the cloud are also covered in this course. Key concepts covered here include cloud migration models from the perspective of architectural preferences; prominent big data solutions that can be implemented in the cloud; and the impact of implementing Kubernetes and Docker in the cloud, and how to implement Kubernetes on AWS. Next, learn how to implement data management on AWS, GCP, and DBaaS; how to implement big data solutions using AWS; how to build backup and restore mechanisms in the cloud; and how to implement disaster recovery planning for cloud applications. Learners will see prominent cloud adoption frameworks and their associated capabilities, and hear about the benefits of blockchain technologies and how to implement them in the cloud. Finally, learn how to implement Kubernetes on AWS, build backup and restore mechanisms on GCP, and implement big data solutions in the cloud.
13 videos | 1h 4m
Assessment
Badge
Data Architecture Getting Started
In this 12-video course, learners explore how to define data, its lifecycle, the importance of privacy, SQL and NoSQL database solutions, and key data management concepts as they relate to big data. First, look at the relationship between data, information, and analysis. Learn to recognize personally identifiable information (PII), protected health information (PHI), and common data privacy regulations. Then, study the data lifecycle's six phases. Compare and contrast SQL and NoSQL database solutions and look at using Visual Paradigm to create a relational database ERD (entity-relationship diagram). To implement a SQL solution, Microsoft SQL Server is deployed in the Amazon Web Services (AWS) cloud; to implement a NoSQL solution, DynamoDB is deployed in the AWS cloud. Explore definitions of big data and governance. Learners will examine various types of data architecture, including TOGAF (The Open Group Architecture Framework) enterprise architecture. Finally, learners study data analytics and reporting, and how organizations can derive value from the data they have. The concluding exercise looks at implementing effective data management solutions.
13 videos | 1h 2m
Assessment
Badge
Data Architecture Deep Dive - Design & Implementation
This 11-video Skillsoft Aspire course explores the numerous types of data architecture that can be used when working with big data; how to implement strategies by using NoSQL (not only structured query language); CAP theorem (consistency, availability, and partition tolerance); and partitioning to improve performance. Learners examine the core activities essential for data architectures: data security, privacy, integrity, quality, regulatory compliance, and governance. You will learn different methods of partitioning, and the criteria for implementing data partitioning. Next, you will install and explore MongoDB, a cross-platform document-oriented database system, and learn about read and write optimizations in MongoDB. You will learn to identify various important components of hybrid data architecture, and to adapt it to your data needs. You will learn how to implement DAG (Directed Acyclic Graph) by using the Elasticsearch search engine. You will evaluate your needs to determine whether to implement batch processing or stream processing. This course also covers process implementation by using serverless and Lambda architecture. Finally, you will examine types of data risk when implementing data modeling and design.
12 videos | 35m
Assessment
Badge
Data Architecture Deep Dive - Microservices & Serverless Computing
Explore numerous types of data architecture that are effective data wrangling tools when working with big data in this 9-video Skillsoft Aspire course. Learn the strategies, design, and constraints involved in implementing data architecture. You will learn the concepts of data partitioning, CAP theorem (consistency, availability, and partition tolerance), and process implementation using serverless and Lambda data architecture. This course examines Saga, newly introduced in the data management pattern catalog of microservices; API (application programming interface) composition; CQRS (Command Query Responsibility Segregation); event sourcing; and application event. This course explores the differences between traditional data architecture and serverless architecture, which allows you to use client-side logic and third-party services. You will learn how to use AWS (Amazon Web Services) Lambda to implement a serverless architecture. This course then explores batch processing architecture, which processes data files by using long-running batch jobs to filter actual content; real-time architecture; and machine learning at scale architecture, built to serve machine learning algorithms. Finally, you will explore how to build a successful data POC (proof of concept).
10 videos | 25m
Assessment
Badge
Streaming Data Architectures: An Introduction to Streaming Data in Spark
Learn the fundamentals of streaming data with Apache Spark. During this course, you will discover the differences between batch and streaming data. Observe the types of streaming data sources. Learn how to process streaming data, transform the stream, and materialize the results. Decouple a streaming application from the data sources with a message transport. Next, learn about techniques used in Spark 1.x to work with streaming data and how they contrast with processing batch data; how structured streaming in Spark 2.x is able to ease the task of stream processing for the app developer; and how stream processing works in both Spark 1.x and 2.x. Finally, learn how triggers can be set up to periodically process streaming data; and the key aspects of working with structured streaming in Spark.
9 videos | 50m
Assessment
Badge
EARN A DIGITAL BADGE WHEN YOU COMPLETE THESE COURSES
Skillsoft is providing you the opportunity to earn a digital badge upon successful completion of some of our courses, which can be shared on any social network or business platform.
Digital badges are yours to keep, forever.
BOOKS INCLUDED
Book
Scalable Big Data Architecture: A Practitioner's Guide to Choosing Relevant Big Data Architecture
Covering real-world, concrete industry use cases, this book is for developers, data architects, and data scientists looking for a better understanding of how to choose the most relevant pattern for a big data project and which tools to integrate into that pattern.
1h 51m
By Bahaaldine Azarmi
Book
Data Architecture: A Primer for the Data Scientist: Big Data, Data Warehouse and Data Vault
Drawing upon years of practical experience and using numerous examples and an easy-to-understand framework, this timely guide defines the importance of data architecture and how it can be used effectively to harness big data within existing systems.
4h 27m
By Daniel Linstedt, W.H. Inmon
Book
Data Architecture: From Zen to Reality
Discussing proven methods and technologies to solve the complex issues dealing with data, this book explains the principles underlying data architecture, how data evolves with organizations, and the challenges organizations face in structuring and managing their data.
7h 23m
By Charles D. Tupper