Operations with petl: Introduction
Petl 1.6
| Beginner
- 15 videos | 2h 2m 5s
- Includes Assessment
- Earns a Badge
Extract, Transform, and Load (ETL) tasks help in collecting and manipulating data from diverse sources to fit the user's requirements. In this course, you'll explore different interfaces available in the petl library and perform basic ETL tasks using petl. You will begin by examining how to import data from various data sources, including delimited text files, Microsoft Excel, and structured JSON data. You'll also recognize how to load and save data in these formats. Next, you'll outline how to integrate petl with a relational database using SQLAlchemy and SQLite3. Finally, you'll perform transform operations on data using different petl features to filter specific data needed by you. Once you have completed this course, you'll have a clear understanding of the role played by petl in simplifying ETL tasks.
WHAT YOU WILL LEARN
-
Discover the key concepts covered in this courseInstall petl and create a basic petl table from a toy datasetImport data from a csv file and extract it to a petl tablePerform various import and export operations on csv, tsv, and txt filesExport data from petl using a template, epilogue, and prologuePerform lookups on data imported from pickle filesImport data from xml files and perform lookups on itImplement read operations on data and export it in html format
-
Read json data and perform lookup operations on itExport data stored in petl data tables to a persistent file formatImport data from microsoft excel and perform basic operations on itView summary statistics of data in petl and export it to microsoft excelCreate a table in sqlite and import it to petl using sqlalchemy and sqlite3Slice and dice data stored as records within a petl data tableSummarize the key concepts covered in this course
IN THIS COURSE
-
2m 30sIn this video, you’ll learn more about the course and your instructor. In this course, you’ll explore data sources that PETL can extract data from, including delimited text files, Microsoft Excel, structured JSON data, and pickled binary objects. You’ll also learn how PETL can load or save data in these formats. You’ll then look at how PETL can integrate with relational databases using SQLAlchemy. FREE ACCESS
-
7m 55sIn this video, you’ll learn more about petl. This demo is from a Macintosh platform and you’ll use a Jupyter Notebook to run your petl programs. You’ll see the command to invoke Jupyter onscreen. When you hit Enter, the Jupyter Notebook server is launched. You’ll copy one of the URLs onscreen and paste it into a web browser. This will bring up the user interface of the Jupyter Notebook. FREE ACCESS
-
11m 24sIn this video, you’ll learn more about petl. In this demo, you’ll explore the petl library. You’ll start by learning how petl can be used to work with CSV files. You’ll start with the import statements as usual. Onscreen there are three import statements. You’ll import petl with the alias etl. You’ll also import the csv model which is a part of the Python core library and then from within petl, you’ll import appendcsv. FREE ACCESS
-
10m 13sIn this video, you’ll watch a demo. In this demo, you’ll pick up where you left off at the end of the last demo. Onscreen, you can see new data has been read in. This information is contained in a file within the datasets folder called GSPC_2020. Now, you’ll invoke the etl.io.csv.fromcsv. You’ll display this data onscreen using the .display method. Now, the header row seems like one of the data rows. FREE ACCESS
-
9m 21sIn this video, you’ll watch a demo. You’ll learn petl is an ETL tool. An ETL is an acronym for extract, transform, and load. In this demo, you’ll look at transform and load operations using petl. You’ll pick up where you left off at the end of the previous demo. Those three columns are all visible on screen now. You’ve now extracted data from this text file, and you’ll transform it into another format. FREE ACCESS
-
7m 23sIn this video, you’ll watch a demo. You’ll learn more about some non-text formats using pickle. Onscreen now, you’ll see various import statements. You’ve imported petl with the alias etl. Then, you imported the pickle library, and then from within petl, you imported the appendpickle module. pickle is a popular serialization format used in many Python applications. You’ll learn serialization is the process of writing objects out to binary files. FREE ACCESS
-
11m 21sIn this video, you’ll watch a demo. You’ll learn more about XML. XML is an acronym for Extensible Markup Language. This is text format for coding structured data. It's simple to define objects and tables consisting of rows and columns in XML format. This is why HTML is so closely related to XML. Onscreen now, you’ll see Python open function has been invoked. FREE ACCESS
-
11m 31sIn this video, you’ll watch a demo. In this demo, you’ll focus on how data is displayed. You’ll zero in on the display and the two HTML methods of the petl table abstraction. Onscreen, you’ll see the display method on advertisement data has been invoked. This is the petl table which you created and passed in the last demo. In this demo you’ll use a new keyword argument, index_header. You’ll specify index_header equal to True. FREE ACCESS
-
11m 2sIn this video, you’ll watch a demo. In this demo, you’ll turn your attention to another important file format, JSON. You’ll start by importing the petl module with the alias etl, and import prettyprint. You’ll see this is defined in the pprint module. Then, you’ll invoke the etl.fromjson method. You’ll pass in two input arguments here. The first is the file location of the JSON file and the second is a header. FREE ACCESS
-
6m 54sIn this video, you’ll watch a demo. In this demo, you’ll look back to one particular snippet of code from the previous demo. There, you performed an etl.dictlookupone operation on a petl table called movies_data specifically indexing on the directed_by field. This gives you a dictionary called lookup_director_one. FREE ACCESS
-
9m 29sIn this video, you’ll watch a demo. In this demo, you’ll explore petl's support and interoperability with files in Microsoft Excel formats. You’ll learn petl has some dependencies on external libraries. To work with .xls files, you’ll need the modules xlrd and xlwt-future. Onscreen, you can see both of these have been installed using the pip package manager. On the top right of our notebook, you’ll see Python 3, and that's why you’ll use pip3. FREE ACCESS
-
5m 40sIn this video, you’ll watch a demo. In this demo, you’ll pick up where you left off with the last one. You’ll learn that just as you can read data in from an xlsx file, you can write data out as well. On the screen you’ll see the etl.toxlsx function has been invoked. The table you just created has been passed in. The file output path and a keyword argument write_header are equal to True. FREE ACCESS
-
8m 22sIn this video, you’ll watch a demo. In this demo, you’ll see how petl interoperates with SQL databases and how petl has an internal dependency on SQLAlchemy. You’ll start by importing petl as etl. Then, you’ll import sqlite3. And then from within petl, you’ll import the appenddb sub module. sqlite3 is a Python package, which allows you to interact with SQLite. SQLite is an extremely lightweight database, that can be used within Python. FREE ACCESS
-
7m 13sIn this video, you’ll watch a demo. Onscreen, you’ll see that you've invoked the etl.nrows operation on your petl table. This tells you that your petl table had 709 rows. Now, you’ll get started and explore other petl functions for working with data. These are going to be from the petl transform module. You’ll begin by invoking the etl.head function in order to get a subset of the initial rows in your petl table. FREE ACCESS
-
1m 49sIn this video, you’ll summarize what you’ve learned in the course. You learned the use cases of petl in building extract transform load pipelines. You explored a variety of data sources that petl can extract data from, including delimited text files, Microsoft Excel, and structured JSON data. You also covered how petl can load or save data in these formats. You looked into how petl can integrate with a relational database using SQLAlchemy. FREE ACCESS
EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE
Skillsoft is providing you the opportunity to earn a digital badge upon successful completion on some of our courses, which can be shared on any social network or business platform.
Digital badges are yours to keep, forever.