Final Exam: Data Scientist
Intermediate
- 1 video | 32s
- Includes Assessment
- Earns a Badge
Final Exam: Data Scientist will test your knowledge and application of the topics presented throughout the Data Scientist track of the Skillsoft Aspire Data Analyst to Data Scientist Journey.
WHAT YOU WILL LEARN
- load data from databases using R
- implement Dask arrays in order to manage NumPy APIs
- list Dask task scheduling and big data collection features
- demonstrate the steps involved in ingesting data from databases to Hadoop clusters using Sqoop
- list and compare the various essential data ingestion tools that we can use to ingest data
- demonstrate how we can ingest data using Wavefront
- describe the various essential distributed data management frameworks used to handle big data
- define the concept of storyboarding along with the prominent storyboarding templates that we can use to implement storyboarding
- compare the different types of recommendation engines and how they can be used to solve different recommendation problems
- identify different cloud data sources available
- create an R function that finds similar users and finds products they liked which would be good to recommend to the user
- build heat maps and scatter plots using R
- describe the Gestalt principles of visual perception
- use Pandas ML to explore a dataset where the samples are not evenly distributed across the target classes
- describe how regression works by finding the best fit straight line to model the relationships in your data
- describe the process involved in learning a relationship between input and output during the training phase of machine learning
- use pandas and Seaborn to visualize the correlated fields in a dataset
- combine the use of oversampling and PCA in building a classification model
- recognize how to enable data-driven decision making
- can be leveraged to extract value from big data
- implement Python Luigi in order to set up data pipelines
- organize your dashboard by adding objects and adjusting the layout
- identify libraries that can be used in Python to implement data visualization
- share your dashboard with others
- implement point and interval estimation using R
- create an HTTP server using hapi.js
- compare the differences between descriptive and inferential statistical analysis
- list libraries that can be used in Python to implement data visualization
- describe the concept of serverless computing and its benefits
- describe what truncated data is and how to remove it using Azure Automation
- recognize the impact of implementing containerization on cloud hosting environments
- demonstrate how to craft visual data using Tableau
- recognize the problems associated with a model that is overfitted to training data and how to mitigate the issue
- use the scikit-learn library to build and train a LinearSVC classification model and then evaluate its performance using the available model evaluation functions
- work with vectors and metrics using Python and R
- recall cloud migration models from the perspective of architectural preferences
- demonstrate how to create a stacked bar plot
- describe the aspects of data quality
- demonstrate how to implement different types of bar charts using Power BI
- create histograms, scatter plots, and box plots using Python libraries
- define a port
- recognize the data pipeline building capabilities provided by Kafka, Spark, and PySpark
- build and customize graphs using ggplot2 in R
- add extensions to your dashboard, such as the Tableau Extensions API
- implement data exploration using plots in R
- recall the various essential decluttering steps and approaches that we can implement to eliminate clutter
- build backup and restore mechanisms in the cloud
- describe blockchain
- recognize the impact of implementing Kubernetes and Docker in the cloud
- demonstrate how to implement data exploration using R
- implement correlograms and build area charts using R
- use R to import, filter, and massage data into data sets
- linear regression
- use modules in your API using Node.js
- how the four Vs should be balanced in order to implement a successful big data strategy
- install and prepare R for data exploration
- specify volume in big data analytics and its role in the principle of the four Vs
- integrate Spark and Tableau to manage data pipelines
- implement missing values and outliers using Python
- identify the process and approaches involved in storytelling with data
EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE
Skillsoft is providing you the opportunity to earn a digital badge upon successful completion of some of our courses, which can be shared on any social network or business platform.
Digital badges are yours to keep, forever.