Cleaning Bad Data in R

Mike Chapple

1:55:31

  • 01 - Data is messy.mp4
    01:10
  • 02 - What you need to know.mp4
    01:09
  • 01 - Types of missing data.mp4
    03:38
  • 02 - Missing values.mp4
    11:25
  • 03 - Missing rows.mp4
    05:58
  • 04 - Aggregations and missing values.mp4
    05:08
  • 01 - Duplicated rows and values.mp4
    04:50
  • 02 - Aggregations in the data set.mp4
    03:42
  • 01 - Converting dates.mp4
    05:54
  • 02 - Unit conversions.mp4
    03:50
  • 03 - Numbers stored as text.mp4
    03:32
  • 04 - Text improperly converted to numbers.mp4
    03:17
  • 05 - Inconsistent spellings.mp4
    06:51
  • 01 - Screening for outliers.mp4
    04:53
  • 02 - Handling outliers.mp4
    02:46
  • 03 - Outliers use case.mp4
    03:34
  • 04 - Outliers in subgroups.mp4
    03:33
  • 05 - Detecting illogical values.mp4
    03:14
  • 01 - What is tidy data.mp4
    03:59
  • 02 - Variables, observations, and values.mp4
    05:11
  • 03 - Common data problems.mp4
    07:57
  • 04 - Wide vs. long data sets.mp4
    03:23
  • 05 - Making wide data sets long.mp4
    04:37
  • 06 - Making long data sets wide.mp4
    03:41
  • 01 - Suspicious values.mp4
    04:49
  • 02 - Suspicious multiples.mp4
    02:25
  • 01 - Whats next.mp4
    01:05
  • Description


    Data integrity is the new focal point of the data science revolution. Now that everybody is on board with the role of data in people's lives and business, it's fair to ask, "Can you prove that your data is accurate?" In this course, you can learn how to identify and address many of the data integrity issues facing modern data scientists, using R and the tidyverse. Discover how to handle missing values and duplicated data. Find out how to convert data between different units and tackle poorly formatted text. Plus, learn how to detect outliers, address structural issues, and identify red flags that indicate potential data quality issues.

    Where possible, instructor Mike Chapple shows how to correct the issues using R, but the same principles can be applied to any statistical programming language.



    Mike Chapple
    Cybersecurity and analytics educator and leader with over 20 years of experience in government, the private sector and higher education. Author of over 30 books, including best-selling study guides from Wiley covering the CISSP, Security+, CISM, CySA+, CIPP/US, and PenTest+ exams. Creator of over 100 cybersecurity and business analytics video courses on LinkedIn Learning.
    LinkedIn Learning is an American online learning provider and a subsidiary of LinkedIn. It offers video courses taught by industry experts in software, creative, and business skills. All of its courses fall into four categories: Business, Creative, Technology, and Certifications. It was founded in 1995 by Lynda Weinman as Lynda.com and was acquired by LinkedIn in 2015. Microsoft acquired LinkedIn in December 2016.
    • Language: English
    • Training sessions: 27
    • Duration: 1:55:31
    • Release date: 2022/12/15
