

Mining Massive Data Sets at Stanford Overview
Mining Massive Data Sets
at Stanford
Duration | 7 hours |
Total fee | Free |
Mode of learning | Online |
Schedule type | Self paced |
Difficulty level | Intermediate |
Table of contents
- Overview
- Highlights
- Course Details
- Curriculum
- Entry Requirements
Mining Massive Data Sets at Stanford Highlights
- Earn a certificate of completion from the Stanford School of Engineering on finishing the course
- Instructors: Jure Leskovec, Anand Rajaraman, and Jeffrey Ullman
- An introduction to modern distributed file systems, MapReduce, and algorithms for mining large datasets
- Free. Add a Verified Certificate for ₹11,151
Mining Massive Data Sets at Stanford Course Details
Who should do this course?
- This course is designed for those who want to learn the concepts of modern distributed file systems and MapReduce.
What are the course deliverables?
- Each week has about 2 hours of video, broken into short segments, plus automatically graded homework. The course ends with a final exam.
More about this course
- The course introduces modern distributed file systems and MapReduce, including what distinguishes good MapReduce algorithms from good algorithms in general. The rest of the course is devoted to algorithms for extracting models and information from large datasets. Participants learn how Google's PageRank algorithm models the importance of Web pages, along with some of the many extensions that have been used for a variety of purposes. The course then covers locality-sensitive hashing, a bit of magic that lets you find similar items in a set so large that you cannot possibly compare every pair. When data is stored as a very large, sparse matrix, dimensionality reduction is often a good way to model it, but standard approaches do not scale well, so the course discusses efficient alternatives. Many other large-scale algorithms are covered as well, as outlined in the course syllabus.
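To make the PageRank idea above concrete, here is a minimal sketch of how link structure can be turned into an importance score by power iteration. It is an illustration only, not the course's own code: the function name `pagerank`, the damping value 0.85, the iteration count, and the four-page graph are all assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank scores by power iteration.

    `links` maps each page to the list of pages it links to.
    The damping factor and iteration count are illustrative defaults.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page first receives the random-jump ("teleport") share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's damped rank evenly among its out-links.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dead end: spread its damped rank evenly over all pages.
                share = damping * rank[page] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical four-page web graph, invented for this example.
toy_links = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
ranks = pagerank(toy_links)
```

In this toy graph, page `c` ends up with the highest score because three pages link to it, while `d`, which nothing links to, keeps only the random-jump share. At Web scale the same update would be distributed across machines rather than run in a single loop.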
Mining Massive Data Sets at Stanford Curriculum
Week 1:
- MapReduce
- Link Analysis: PageRank
Week 2:
- Locality-Sensitive Hashing: Basics and Applications
- Distance Measures
- Nearest Neighbors
- Frequent Itemsets
Week 3:
- Data Stream Mining
- Analysis of Large Graphs
Week 4:
- Recommender Systems
- Dimensionality Reduction
Week 5:
- Clustering
- Computational Advertising
Week 6:
- Support-Vector Machines
- Decision Trees
- MapReduce Algorithms
Week 7:
- More About Link Analysis: Topic-Specific PageRank, Link Spam
- More About Locality-Sensitive Hashing
Mining Massive Data Sets at Stanford Entry Requirements