Building Features from Text Data

Janani Ravi

2:35:59

  • 1. Course Overview.mp4
    01:49
  • 01. Version Check.mp4
    00:16
  • 02. Module Overview.mp4
    01:16
  • 03. Prerequisites and Course Outline.mp4
    01:17
  • 04. One-hot Encoding.mp4
    04:24
  • 05. Count Vectors.mp4
    03:00
  • 06. Tf-Idf Vectors.mp4
    03:00
  • 07. Co-occurrence Vectors.mp4
    05:05
  • 08. Word Embeddings.mp4
    05:11
  • 09. Installing Packages and Setting Up the Environment.mp4
    03:08
  • 10. Sentence and Word Tokenization.mp4
    05:28
  • 11. Plotting Word Frequency Distributions.mp4
    04:01
  • 12. Module Summary.mp4
    01:12
  • 1. Module Overview.mp4
    01:17
  • 2. Bag-of-words and Bag-of-n-grams.mp4
    03:02
  • 3. Bag-of-words Using the Count Vectorizer.mp4
    06:53
  • 4. Inverse Transform Using the Count Vectorizer.mp4
    01:49
  • 5. Bag-of-n-grams Using the Count Vectorizer.mp4
    05:30
  • 6. Generating N-grams Using NLTK.mp4
    03:25
  • 7. Bag-of-words Using the Tf-Idf Vectorizer.mp4
    04:23
  • 8. Module Summary.mp4
    01:18
  • 1. Module Overview.mp4
    01:15
  • 2. Natural Language Processing Operations.mp4
    05:44
  • 3. Stopword Removal Using NLTK and scikit-learn.mp4
    06:43
  • 4. Frequency Filtering Using scikit-learn.mp4
    02:58
  • 5. Stemming.mp4
    05:56
  • 6. Lemmatization.mp4
    03:31
  • 7. Parts-of-speech Tagging.mp4
    06:21
  • 8. Module Summary.mp4
    01:22
  • 1. Module Overview.mp4
    01:11
  • 2. Feature Hashing.mp4
    02:26
  • 3. Reducing Dimensions Using the Feature Hasher.mp4
    03:43
  • 4. Reducing Dimensions at Scale Using the Hashing Vectorizer.mp4
    06:24
  • 5. Locality-sensitive Hashing.mp4
    05:29
  • 6. Similar Documents Using Jaccard Index and Locality-sensitive Hashing.mp4
    07:01
  • 7. Module Summary.mp4
    01:23
  • 01. Module Overview.mp4
    01:05
  • 02. Naive Bayes for Classification.mp4
    02:44
  • 03. Classification Using the Hashing Vectorizer.mp4
    07:55
  • 04. Pre-process Text Using a Stemmer, Build Features Using the Hashing Vectorizer.mp4
    02:58
  • 05. Building Features Using the Count Vectorizer.mp4
    02:13
  • 06. Pre-processing with Stopword Removal, Building Features Using Count Vectorizer.mp4
    01:49
  • 07. Pre-processing with Stopword Removal, Frequency Filtering, Building Features Using Count Vectorizer.mp4
    03:28
  • 08. Building Features Using the Tf-Idf Vectorizer.mp4
    01:49
  • 09. Building Features Using Bag-of-n-grams Model.mp4
    02:13
  • 10. Summary and Further Study.mp4
    01:34
  • Description


    This course covers extracting information from text documents and building classification models on them, using natural language processing techniques such as feature vectorization, locality-sensitive hashing, stopword removal, and lemmatization.

    What You'll Learn


      From chatbots to machine-generated literature, some of the hottest applications of ML and AI these days are for data in textual form.

      In this course, Building Features from Text Data, you will gain the ability to structure textual data in a manner ideal for use in ML models.

      First, you will learn how to represent documents as feature vectors using one-hot encoding, frequency-based, and prediction-based techniques. You will see how to improve these representations based on the meaning, or semantics, of the document.
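
A minimal sketch of the frequency-based representations mentioned above, written in plain Python so the arithmetic is visible. The three sample documents, the whitespace tokenizer, and the unsmoothed formula idf = ln(N / df) are assumptions made for illustration. The lesson titles indicate the course itself works with scikit-learn's count and Tf-Idf vectorizers, which apply a smoothed, normalized variant, so their numbers will differ from these.

```python
import math
from collections import Counter

# Hypothetical sample corpus, not taken from the course.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs",
]

# Whitespace tokenization and a sorted vocabulary shared by all documents.
tokenized = [doc.split() for doc in docs]
vocab = sorted({tok for doc in tokenized for tok in doc})
index = {tok: i for i, tok in enumerate(vocab)}


def presence_vector(tokens):
    """Binary vector: 1 if the vocabulary word occurs in the document."""
    vec = [0] * len(vocab)
    for tok in set(tokens):
        vec[index[tok]] = 1
    return vec


def count_vector(tokens):
    """Count vector: number of occurrences of each vocabulary word."""
    vec = [0] * len(vocab)
    for tok, n in Counter(tokens).items():
        vec[index[tok]] = n
    return vec


# Document frequency: in how many documents each word appears.
doc_freq = Counter(tok for doc in tokenized for tok in set(doc))
# Unsmoothed inverse document frequency (an assumption, see lead-in).
idf = {tok: math.log(len(docs) / doc_freq[tok]) for tok in vocab}


def tfidf_vector(tokens):
    """Tf-idf vector: raw term count scaled by the word's idf."""
    return [n * idf[vocab[i]] for i, n in enumerate(count_vector(tokens))]


for doc, tokens in zip(docs, tokenized):
    print(doc)
    print("  presence:", presence_vector(tokens))
    print("  counts:  ", count_vector(tokens))
    print("  tf-idf:  ", [round(v, 3) for v in tfidf_vector(tokens)])
```

Words that appear in many documents, such as "the" here, get a small idf and therefore contribute less to the tf-idf vector than rarer words with the same count.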

      Next, you will discover how to apply natural language processing operations such as stopword removal, frequency filtering, stemming and lemmatization, and parts-of-speech tagging.
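
The preprocessing steps listed above can be sketched in plain Python as follows. Everything here is illustrative: the stopword list is a tiny hypothetical one, `crude_stem` is a toy suffix stripper standing in for a real stemmer (it is not the Porter algorithm and does no lemmatization), and `filter_by_document_frequency` mimics the idea of dropping words that occur in too few or too many documents. The lesson titles indicate the course uses NLTK and scikit-learn for these operations.

```python
from collections import Counter

# Hypothetical, deliberately tiny stopword list.
STOPWORDS = {"the", "a", "is", "on", "and", "of"}


def remove_stopwords(tokens):
    """Drop very common words that carry little meaning on their own."""
    return [t for t in tokens if t not in STOPWORDS]


def crude_stem(token):
    """Toy stemmer: strip one common suffix if at least 3 letters remain."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def filter_by_document_frequency(tokenized_docs, min_df=1, max_df=None):
    """Keep only words whose document frequency lies in [min_df, max_df]."""
    max_df = len(tokenized_docs) if max_df is None else max_df
    df = Counter(t for doc in tokenized_docs for t in set(doc))
    keep = {t for t, n in df.items() if min_df <= n <= max_df}
    return [[t for t in doc if t in keep] for doc in tokenized_docs]


# Hypothetical sample documents.
docs = [
    "the dog jumped on the log",
    "a dog is jumping",
    "the cat jumps and the dog barks",
]

processed = [
    [crude_stem(t) for t in remove_stopwords(doc.split())] for doc in docs
]
filtered = filter_by_document_frequency(processed, min_df=2)

print(processed)
print(filtered)
```

Stemming maps "jumped", "jumping", and "jumps" to the same token, so the later frequency filter sees one word in three documents instead of three words in one document each.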

      Finally, you will see how locality-sensitive hashing can be used to reduce the dimensionality of documents while still keeping similar documents close together.
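
One common way to realize this idea for the Jaccard index is MinHash with banding, sketched below in plain Python. This is an assumption about technique, not a reproduction of the course's code: the sample documents, the word-set representation, the choice of 64 hash functions split into 16 bands, and all function names are hypothetical. Each document is reduced to a short signature, and documents whose signatures agree on any band land in the same bucket and become candidate pairs.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def jaccard(a, b):
    """Exact Jaccard index of two sets: |A & B| / |A | B|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def _hash(seed, item):
    """Deterministic 64-bit hash of an item under a given seed."""
    digest = hashlib.blake2b(f"{seed}:{item}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(items, num_hashes=64):
    """Signature: the minimum hash value of the set under each seed."""
    return [min(_hash(seed, it) for it in items) for seed in range(num_hashes)]


def estimated_jaccard(sig_a, sig_b):
    """Fraction of signature positions on which two documents agree."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)


def lsh_candidate_pairs(signatures, bands=16):
    """Bucket documents by each band of their signature; pair up co-bucketed docs."""
    buckets = defaultdict(set)
    for doc_id, sig in signatures.items():
        rows = len(sig) // bands
        for b in range(bands):
            buckets[(b, tuple(sig[b * rows:(b + 1) * rows]))].add(doc_id)
    pairs = set()
    for ids in buckets.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


# Hypothetical sample documents, represented as sets of words.
texts = {
    "a": "the quick brown fox jumps over the lazy dog",
    "b": "the quick brown fox leaps over the lazy dog",
    "c": "the quick brown fox jumps over the lazy dog",
    "d": "feature hashing reduces vector dimensions",
}
word_sets = {k: set(v.split()) for k, v in texts.items()}
signatures = {k: minhash_signature(s) for k, s in word_sets.items()}

print("exact a/b:    ", round(jaccard(word_sets["a"], word_sets["b"]), 3))
print("estimated a/b:", estimated_jaccard(signatures["a"], signatures["b"]))
print("candidates:   ", sorted(lsh_candidate_pairs(signatures)))
```

The signature has a fixed length regardless of document size, which is where the dimensionality reduction comes from. More bands make it more likely that moderately similar documents share a bucket, at the cost of more false candidates.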

      You will round out the course by implementing a classification model on text documents using many of these modeling abstractions.
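
As a rough illustration of such a pipeline, the sketch below combines the hashing trick with a multinomial Naive Bayes classifier, both written in plain Python. The training sentences, labels, the bucket count of 32, and the class and function names are all hypothetical. The lesson titles indicate the course uses scikit-learn's hashing vectorizer and a Naive Bayes model, so this is a simplified stand-in for those, not their implementation.

```python
import hashlib
import math

N_FEATURES = 32  # hypothetical number of hash buckets


def hashed_counts(text, n_features=N_FEATURES):
    """Hashing trick: map each token to a bucket and count per bucket."""
    vec = [0] * n_features
    for tok in text.lower().split():
        digest = hashlib.blake2b(tok.encode(), digest_size=8).digest()
        vec[int.from_bytes(digest, "big") % n_features] += 1
    return vec


class MultinomialNB:
    """Multinomial Naive Bayes over count vectors with additive smoothing."""

    def fit(self, X, y, alpha=1.0):
        self.classes = sorted(set(y))
        self.log_prior = {}
        self.log_likelihood = {}
        for c in self.classes:
            rows = [x for x, label in zip(X, y) if label == c]
            self.log_prior[c] = math.log(len(rows) / len(X))
            # Smoothed per-feature totals for this class.
            totals = [sum(col) + alpha for col in zip(*rows)]
            denom = sum(totals)
            self.log_likelihood[c] = [math.log(t / denom) for t in totals]
        return self

    def _score(self, x, c):
        """Log prior plus count-weighted log likelihoods for class c."""
        return self.log_prior[c] + sum(
            n * w for n, w in zip(x, self.log_likelihood[c])
        )

    def predict(self, X):
        return [max(self.classes, key=lambda c: self._score(x, c)) for x in X]


# Hypothetical training data.
train_texts = [
    "great movie with a wonderful cast",
    "wonderful acting and a great story",
    "terrible plot and awful pacing",
    "awful movie with a terrible ending",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = MultinomialNB().fit([hashed_counts(t) for t in train_texts], train_labels)
print(model.predict([hashed_counts("a wonderful story")]))
```

Because different tokens can hash to the same bucket, a small bucket count trades accuracy for a fixed, compact feature size. Swapping `hashed_counts` for a count or tf-idf representation leaves the classifier unchanged.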

      When you’re finished with this course, you will have the skills and knowledge to use documents and textual data in conceptually and practically sound ways and represent such data for use in machine learning models.

    Janani has a Master's degree from Stanford and worked for 7+ years at Google. She was one of the original engineers on Google Docs and holds 4 patents for its real-time collaborative editing framework. After spending years working in tech in the Bay Area, New York, and Singapore at companies such as Microsoft, Google, and Flipkart, Janani finally decided to combine her love for technology with her passion for teaching. She is now the co-founder of Loonycorn, a content studio focused on providing high-quality content for technical skill development. Loonycorn is working on developing an engine (patent filed) to automate animations for presentations and educational content.
    Pluralsight, LLC is an American privately held online education company that offers a variety of video training courses for software developers, IT administrators, and creative professionals through its website. Founded in 2004 by Aaron Skonnard, Keith Brown, Fritz Onion, and Bill Williams, the company has its headquarters in Farmington, Utah. As of July 2018, it uses more than 1,400 subject-matter experts as authors, and offers more than 7,000 courses in its catalog. Since first moving its courses online in 2007, the company has expanded, developing a full enterprise platform, and adding skills assessment modules.
    • Language: English
    • Training sessions: 46
    • Duration: 2:35:59
    • Level: Advanced
    • English subtitles: Yes
    • Release date: 2023/01/24