Data Filtering

Data Science    |    Beginner
  • 11 videos | 56m 52s
  • Includes Assessment
  • Earns a Badge
Rating 4.3 of 84 users
Once data is gathered for data science, it is often in an unstructured or raw format and must be filtered for content and validity. Explore examples of practical tools and techniques for data filtering.

WHAT YOU WILL LEARN

  • Identify common filtering techniques and tools
  • Extract date elements from common date formats
  • Parse content types in HTTP headers
  • Use csvcut to filter CSV data
  • Use sed to replace values in a text data stream
  • Drop duplicate records from data
  • Extract headers from a JPEG image
  • Use pdfgrep to extract data from searchable PDF files
  • Detect invalid or impossible data combinations
  • Parse robots.txt from a web site to decide what should and shouldn't be crawled or indexed
  • Drop records from a CSV file based on a date range
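
As a rough illustration of two of the objectives above (dropping duplicate records and dropping records outside a date range), here is a minimal Python sketch. It is not taken from the course: the sample data, the column names, and the `filter_rows` helper are all hypothetical, and the dates are assumed to be in ISO `YYYY-MM-DD` form.

```python
import csv
import io
from datetime import date, datetime

# Hypothetical sample data; the column names are assumptions for illustration.
SAMPLE_CSV = """id,recorded,value
1,2019-01-15,10
2,2019-03-02,12
2,2019-03-02,12
3,2020-07-30,9
"""


def filter_rows(text, start, end, date_column="recorded"):
    """Return unique rows whose date falls within [start, end], inclusive."""
    seen = set()
    kept = []
    for row in csv.DictReader(io.StringIO(text)):
        key = tuple(row.values())
        if key in seen:
            continue  # drop exact duplicate records
        seen.add(key)
        day = datetime.strptime(row[date_column], "%Y-%m-%d").date()
        if start <= day <= end:
            kept.append(row)  # keep only rows inside the date range
    return kept


rows = filter_rows(SAMPLE_CSV, date(2019, 1, 1), date(2019, 12, 31))
print([row["id"] for row in rows])
```

With the sample data, the duplicate row with id 2 is kept once and the 2020 record is dropped.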

IN THIS COURSE

  • 3m 24s
    In this video, you will identify common filtering techniques and tools. FREE ACCESS
  • 6m 7s
    Find out how to extract date elements from common date formats. FREE ACCESS
  • Locked
    3.  Filtering HTTP Headers
    5m 11s
    In this video, learn how to interpret content types in HTTP headers. FREE ACCESS
  • Locked
    4.  Filtering CSV Data
    4m 52s
    In this video, you will learn how to use csvcut to filter CSV data. FREE ACCESS
  • Locked
    5.  Replacing Values with sed
    6m 16s
    In this video, you will use sed to replace values in a text data stream. FREE ACCESS
  • Locked
    6.  Dropping Duplicate Data
    4m 44s
    In this video, you will learn how to drop duplicate records from data. FREE ACCESS
  • Locked
    7.  Working with JPEG Headers
    6m 49s
    In this video, you will learn how to extract headers from a JPEG image. FREE ACCESS
  • Locked
    8.  Filtering PDF Files
    4m 55s
    In this video, find out how to use pdfgrep to extract data from searchable PDF files. FREE ACCESS
  • Locked
    9.  Filtering for Invalid Data
    5m 51s
    During this video, you will learn how to detect invalid or impossible data combinations. FREE ACCESS
  • Locked
    10.  Exercise: Cull Old Data
    3m 21s
    In this video, you will learn how to drop records from a CSV file based on a date range. FREE ACCESS
  • Locked
    11.  Parsing robots.txt
    5m 22s
    Find out how to parse robots.txt from a web site to decide what should and shouldn't be crawled or indexed. FREE ACCESS
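
To give a flavor of the robots.txt topic listed in this course, the following is a minimal sketch using Python's standard `urllib.robotparser` module. It is not the course's own material: the robots.txt content, the `example-bot` user agent, the `may_crawl` helper, and the example.com URLs are all assumptions made for illustration. The rules are parsed from a string so that no network access is needed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used here instead of fetching a real site.
ROBOTS_TXT = """User-agent: *
Disallow: /private/
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())  # parse rules from lines of text


def may_crawl(url, agent="example-bot"):
    """Return True if the parsed rules allow this agent to fetch the URL."""
    return parser.can_fetch(agent, url)


print(may_crawl("https://example.com/index.html"))
print(may_crawl("https://example.com/private/data.csv"))
```

Python's parser applies the first rule that matches the path, so the order of the `Disallow` and `Allow` lines matters in this sketch.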

EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE

Skillsoft gives you the opportunity to earn a digital badge upon successful completion of some of our courses. Badges can be shared on any social network or business platform.

Digital badges are yours to keep, forever.
