Operations with petl: Introduction

Petl 1.6    |    Beginner
  • 15 videos | 2h 2m 5s
  • Includes Assessment
  • Earns a Badge
Rating 4.6 of 36 users Rating 4.6 of 36 users (36)
Extract, Transform, and Load (ETL) tasks help in collecting and manipulating data from diverse sources to fit the user's requirements. In this course, you'll explore different interfaces available in the petl library and perform basic ETL tasks using petl. You will begin by examining how to import data from various data sources, including delimited text files, Microsoft Excel, and structured JSON data. You'll also recognize how to load and save data in these formats. Next, you'll outline how to integrate petl with a relational database using SQLAlchemy and SQLite3. Finally, you'll perform transform operations on data using different petl features to filter specific data needed by you. Once you have completed this course, you'll have a clear understanding of the role played by petl in simplifying ETL tasks.

WHAT YOU WILL LEARN

  • Discover the key concepts covered in this course
    Install petl and create a basic petl table from a toy dataset
    Import data from a csv file and extract it to a petl table
    Perform various import and export operations on csv, tsv, and txt files
    Export data from petl using a template, epilogue, and prologue
    Perform lookups on data imported from pickle files
    Import data from xml files and perform lookups on it
    Implement read operations on data and export it in html format
  • Read json data and perform lookup operations on it
    Export data stored in petl data tables to a persistent file format
    Import data from microsoft excel and perform basic operations on it
    View summary statistics of data in petl and export it to microsoft excel
    Create a table in sqlite and import it to petl using sqlalchemy and sqlite3
    Slice and dice data stored as records within a petl data table
    Summarize the key concepts covered in this course

IN THIS COURSE

  • 2m 30s
    In this video, you’ll learn more about the course and your instructor. In this course, you’ll explore data sources that PETL can extract data from, including delimited text files, Microsoft Excel, structured JSON data, and pickled binary objects. You’ll also learn how PETL can load or save data in these formats. You’ll then look at how PETL can integrate with relational databases using SQLAlchemy. FREE ACCESS
  • 7m 55s
    In this video, you’ll learn more about petl. This demo is from a Macintosh platform and you’ll use a Jupyter Notebook to run your petl programs. You’ll see the command to invoke Jupyter onscreen. When you hit Enter, the Jupyter Notebook server is launched. You’ll copy one of the URLs onscreen and paste it into a web browser. This will bring up the user interface of the Jupyter Notebook. FREE ACCESS
  • Locked
    3.  Reading Data from CSV Files in petl
    11m 24s
    In this video, you’ll learn more about petl. In this demo, you’ll explore the petl library. You’ll start by learning how petl can be used to work with CSV files. You’ll start with the import statements as usual. Onscreen there are three import statements. You’ll import petl with the alias etl. You’ll also import the csv model which is a part of the Python core library and then from within petl, you’ll import appendcsv. FREE ACCESS
  • Locked
    4.  Working with Structured and Semi-structured Data
    10m 13s
    In this video, you’ll watch a demo. In this demo, you’ll pick up where you left off at the end of the last demo. Onscreen, you can see new data has been read in. This information is contained in a file within the datasets folder called GSPC_2020. Now, you’ll invoke the etl.io.csv.fromcsv. You’ll display this data onscreen using the .display method. Now, the header row seems like one of the data rows. FREE ACCESS
  • Locked
    5.  Exporting Data from petl Using Templates
    9m 21s
    In this video, you’ll watch a demo. You’ll learn petl is an ETL tool. An ETL is an acronym for extract, transform, and load. In this demo, you’ll look at transform and load operations using petl. You’ll pick up where you left off at the end of the previous demo. Those three columns are all visible on screen now. You’ve now extracted data from this text file, and you’ll transform it into another format. FREE ACCESS
  • Locked
    6.  Performing Lookups in petl
    7m 23s
    In this video, you’ll watch a demo. You’ll learn more about some non-text formats using pickle. Onscreen now, you’ll see various import statements. You’ve imported petl with the alias etl. Then, you imported the pickle library, and then from within petl, you imported the appendpickle module. pickle is a popular serialization format used in many Python applications. You’ll learn serialization is the process of writing objects out to binary files. FREE ACCESS
  • Locked
    7.  Importing XML Data to petl
    11m 21s
    In this video, you’ll watch a demo. You’ll learn more about XML. XML is an acronym for Extensible Markup Language. This is text format for coding structured data. It's simple to define objects and tables consisting of rows and columns in XML format. This is why HTML is so closely related to XML. Onscreen now, you’ll see Python open function has been invoked. FREE ACCESS
  • Locked
    8.  Formatting Data in petl and Exporting It to HTML
    11m 31s
    In this video, you’ll watch a demo. In this demo, you’ll focus on how data is displayed. You’ll zero in on the display and the two HTML methods of the petl table abstraction. Onscreen, you’ll see the display method on advertisement data has been invoked. This is the petl table which you created and passed in the last demo. In this demo you’ll use a new keyword argument, index_header. You’ll specify index_header equal to True. FREE ACCESS
  • Locked
    9.  Working with JSON Data in petl
    11m 2s
    In this video, you’ll watch a demo. In this demo, you’ll turn your attention to another important file format, JSON. You’ll start by importing the petl module with the alias etl, and import prettyprint. You’ll see this is defined in the pprint module. Then, you’ll invoke the etl.fromjson method. You’ll pass in two input arguments here. The first is the file location of the JSON file and the second is a header. FREE ACCESS
  • Locked
    10.  Exporting petl Data to Persistent File Formats
    6m 54s
    In this video, you’ll watch a demo. In this demo, you’ll look back to one particular snippet of code from the previous demo. There, you performed an etl.dictlookupone operation on a petl table called movies_data specifically indexing on the directed_by field. This gives you a dictionary called lookup_director_one. FREE ACCESS
  • Locked
    11.  Importing Data from Microsoft Excel to petl
    9m 29s
    In this video, you’ll watch a demo. In this demo, you’ll explore petl's support and interoperability with files in Microsoft Excel formats. You’ll learn petl has some dependencies on external libraries. To work with .xls files, you’ll need the modules xlrd and xlwt-future. Onscreen, you can see both of these have been installed using the pip package manager. On the top right of our notebook, you’ll see Python 3, and that's why you’ll use pip3. FREE ACCESS
  • Locked
    12.  Exporting Data from petl to Microsoft Excel
    5m 40s
    In this video, you’ll watch a demo. In this demo, you’ll pick up where you left off with the last one. You’ll learn that just as you can read data in from an xlsx file, you can write data out as well. On the screen you’ll see the etl.toxlsx function has been invoked. The table you just created has been passed in. The file output path and a keyword argument write_header are equal to True.   FREE ACCESS
  • Locked
    13.  Importing SQL Tables to petl
    8m 22s
    In this video, you’ll watch a demo. In this demo, you’ll see how petl interoperates with SQL databases and how petl has an internal dependency on SQLAlchemy. You’ll start by importing petl as etl. Then, you’ll import sqlite3. And then from within petl, you’ll import the appenddb sub module. sqlite3 is a Python package, which allows you to interact with SQLite. SQLite is an extremely lightweight database, that can be used within Python. FREE ACCESS
  • Locked
    14.  Slicing and Dicing Data in a petl Table
    7m 13s
    In this video, you’ll watch a demo. Onscreen, you’ll see that you've invoked the etl.nrows operation on your petl table. This tells you that your petl table had 709 rows. Now, you’ll get started and explore other petl functions for working with data. These are going to be from the petl transform module. You’ll begin by invoking the etl.head function in order to get a subset of the initial rows in your petl table. FREE ACCESS
  • Locked
    15.  Course Summary
    1m 49s
    In this video, you’ll summarize what you’ve learned in the course. You learned the use cases of petl in building extract transform load pipelines. You explored a variety of data sources that petl can extract data from, including delimited text files, Microsoft Excel, and structured JSON data. You also covered how petl can load or save data in these formats. You looked into how petl can integrate with a relational database using SQLAlchemy.   FREE ACCESS

EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE

Skillsoft is providing you the opportunity to earn a digital badge upon successful completion on some of our courses, which can be shared on any social network or business platform.

Digital badges are yours to keep, forever.

YOU MIGHT ALSO LIKE

Rating 4.7 of 830 users Rating 4.7 of 830 users (830)
Rating 4.7 of 12 users Rating 4.7 of 12 users (12)

PEOPLE WHO VIEWED THIS ALSO VIEWED THESE

Rating 4.7 of 29 users Rating 4.7 of 29 users (29)
Rating 4.1 of 12 users Rating 4.1 of 12 users (12)
Course Auditing
Rating 4.4 of 100 users Rating 4.4 of 100 users (100)