
Information Retrieval and Mining Massive Data Sets

  • Development
  • Mar 05, 2025

Information Retrieval and Mining Massive Data Sets is available at $69.99. It has 123 lectures, an average rating of 4.25 based on 123 reviews, and 2133 subscribers.

The course is primarily divided into 6 parts. The listed parts are Part 1: Building an Information Retrieval System; Part 2: Mining Frequent Patterns and Associations; Part 3: Classification and Clustering; Part 4: Web Mining; and Part 5: Recommendation Systems. It is aimed at Big Data enthusiasts and data scientists.


Summary

Title: Information Retrieval and Mining Massive Data Sets

Price: $69.99

Average Rating: 4.25

Number of Lectures: 123

Number of Published Lectures: 123

Number of Curriculum Items: 123

Number of Published Curriculum Objects: 123

Original Price: $19.99

Quality Status: approved

Status: Live

What You Will Learn

  • The course is primarily divided into 6 parts.
  • Part 1: Building an Information Retrieval System
  • Part 2: Mining Frequent Patterns and Associations
  • Part 3: Classification and Clustering
  • Part 4: Web Mining
  • Part 5: Recommendation Systems

Who Should Attend

  • Big Data Enthusiasts
  • Data Scientists

The goal is to introduce the techniques required to build an IR system. In this course we will explore various methods for solving big data problems and evaluate alternative solutions and their trade-offs. In the later part of the course we will discuss data mining algorithms for making sense of massive data sets.

    Course Curriculum

    Chapter 1: Introduction To a Boolean Search Engine

    Lecture 1: What is Data Mining

    Lecture 2: Structured Data, Unstructured data and Information Retrieval

    Lecture 3: Term-Document Incidence Matrix (1)

    Lecture 4: Term-Document Incidence Matrix (2)

    Lecture 5: Inverted Index

    Lecture 6: Tradeoffs in implementing an Inverted Index

    Lecture 7: Processing AND, OR, NOT queries

    Lecture 8: Overview of Index Construction Pipeline

    Lecture 9: Query optimization using Document Frequency (1)

    Lecture 10: Query Optimization Using Document Frequency (2)

    Lecture 11: Boolean Retrieval Model

    Lecture 12: Example of a Boolean Retrieval Model

    Lecture 13: Limitations of Boolean Retrieval Model

    Lecture 14: How to evaluate performance of an IR System

    Lecture 15: Google zeitgeist

    Chapter 2: Dictionary Data Structure. Tolerant retrieval

    Lecture 1: Parsing Documents and Issues Associated with it

    Lecture 2: Tokenization Process in an IR System

    Lecture 3: Normalization to Terms

    Lecture 4: Faster Postings Merges With Skip Pointers

    Lecture 5: How to Handle Phrase Query

    Lecture 6: Phrase Query Using Positional Index

    Lecture 7: How to handle proximity query

    Lecture 8: Discussion on Positional Index Size

    Chapter 3: Index construction. Postings size estimation, sort-based indexing, dynamic index

    Lecture 1: Dictionary Data Structure Implementation

    Lecture 2: Wild card queries

    Lecture 3: Questions on Wild Card Queries

    Lecture 4: Wild Card Query Handling Using Permuterm Index

    Lecture 5: Wild Card Query Handling Using K-Gram Index

    Lecture 6: Soundex Algorithm

    Lecture 7: Spelling Correction Techniques in an IR System

    Lecture 8: Question On Soundex Algorithm

    Lecture 9: Spelling Correction (Part 2)

    Lecture 10: Introduction To Dynamic Programming

    Lecture 11: How To Calculate Edit Distance Between Two Strings

    Lecture 12: Spelling Correction Using Weighted Edit Distance

    Lecture 13: Spelling Correction Using Ngram Overlap Technique

    Lecture 14: Calculating Jaccard Coefficient (An Example)

    Lecture 15: Context Sensitive Spell Correction

    Chapter 4: Dictionary Compression, Posting Compression

    Lecture 1: Introduction to Index Construction

    Lecture 2: Index Construction Using In-Memory Sorting

    Lecture 3: Index Construction Using BSBI Algorithm

    Lecture 4: Index Construction Using SPIMI Algorithm

    Lecture 5: Introduction To Distributed Indexing

    Lecture 6: How To build distributed indexes

    Lecture 7: Q & A on Distributed Index

    Lecture 8: Map Reduce

    Lecture 9: Dynamic indexing using naive approach

    Lecture 10: Dynamic indexing using logarithmic merge

    Lecture 11: Issues With Multiple Indexes

    Chapter 5: Scoring, term weighting, and the vector space model

    Lecture 1: Why do we compress indexes

    Lecture 2: Important Statistics about RCV Collection

    Lecture 3: Various Dictionary Compression Techniques

    Lecture 4: Various Dictionary Compression Techniques Part 2

    Lecture 5: Various Posting Compression Techniques

    Chapter 6: Efficient vector space scoring. Nearest neighbor techniques

    Lecture 1: Ranked Retrieval Model

    Lecture 2: Jaccard Score

    Lecture 3: Term Frequency Weighting And Bag Of Words Model

    Lecture 4: Inverse Document Frequency

    Lecture 5: TF-IDF Score

    Lecture 6: Documents As TF-IDF Vectors

    Lecture 7: Length Normalization

    Lecture 8: Cosine Similarity Example

    Lecture 9: Computing Cosine Scores On Index

    Lecture 10: Variants of TF IDF Weights

    Chapter 7: Evaluating search engines. User happiness, precision, recall, F-measure

    Lecture 1: Term at a Time Scoring

    Lecture 2: Efficient Cosine Ranking

    Lecture 3: Generic Approach For Speeding up Cosine Similarity

    Lecture 4: Index Elimination

    Lecture 5: Champion Lists

    Lecture 6: Static Quality Score

    Lecture 7: High And Low Lists

    Lecture 8: Impact Ordered Posting

    Lecture 9: Cluster Pruning

    Lecture 10: Parametric Zone Tiered Index

    Lecture 11: Query Term Proximity And Query Parsing

    Lecture 12: How A Search Engine Works

    Chapter 8: Advertisement System. Google AdSense. Search Engine Optimization

    Lecture 1: Performance of a Search Engine Part 1

    Lecture 2: Performance of a Search Engine Part 2

    Lecture 3: Performance of a Search Engine Part 3

    Lecture 4: Performance of a Search Engine Part 4

    Lecture 5: Performance of a Search Engine Part 5

    Chapter 9: Supervised Learning. Text Classification. Naive-Bayes Text Classification

    Lecture 1: ECommerce Vs. Traditional Businesses

    Lecture 2: Pricing Models For Online Advertisement

    Lecture 3: AdWords and AdSense

    Lecture 4: SEM And SEO

    Chapter 10: Link analysis. Web as a graph. PageRank

    Lecture 1: Classification System

    Lecture 2: Document Classification

    Lecture 3: Manual Classification Methods

    Lecture 4: Naive Bayes Classifiers

    Lecture 5: Bayes Rules Of Text Classification

    Instructors

  • Omkar Deshpande
    Principal Engineer at WalmartLabs
  • Mentors Net
    Touch More Lives, Impart More Wisdom

    Rating Distribution

  • 1 stars: 4 votes
  • 2 stars: 3 votes
  • 3 stars: 17 votes
  • 4 stars: 37 votes
  • 5 stars: 62 votes

    Frequently Asked Questions

    How long do I have access to the course materials?

    You can view and review the lecture materials indefinitely, like an on-demand channel.

    Can I take my courses with me wherever I go?

    Definitely! If you have an internet connection, courses on Udemy are available on any device at any time. If you don’t have an internet connection, some instructors also let their students download course lectures. That’s up to the instructor though, so make sure you get on their good side!