In “Data Mining in Python,” you will learn how to extract useful knowledge from large-scale datasets. This course introduces the basic concepts and general tasks of data mining. You will explore a wide range of real-world datasets drawn from grocery stores, restaurant reviews, business operations, social media posts, and more.
You will learn how to formally describe real-world information with general data representations (e.g., itemsets, vectors, matrices, sequences, and more). You will then learn how to formulate data in the wild with one or more of these representations.
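As a rough illustration of the representations named above, the following minimal sketch encodes a few toy records as itemsets, vectors, a matrix, and a sequence. The data, variable names, and the binary encoding are assumptions made for this example and are not taken from the course materials.

```python
from collections import Counter

# Hypothetical toy data: three shopping baskets (not from the course).
baskets = [
    {"milk", "bread", "eggs"},
    {"bread", "butter"},
    {"milk", "bread", "butter"},
]

# Itemset representation: each transaction is an unordered set of items.
itemsets = [frozenset(b) for b in baskets]

# Vector representation: one binary feature per item in a fixed vocabulary.
vocabulary = sorted(set().union(*baskets))
vectors = [[1 if item in b else 0 for item in vocabulary] for b in baskets]

# Matrix representation: stacking the vectors row by row gives a
# transaction-by-item matrix (here a plain list of lists).
matrix = vectors

# Sequence representation: order matters, so a list is kept as-is.
# Hypothetical page visits from a single browsing session.
clicks = ["home", "search", "product", "search", "product", "cart"]

# Counting adjacent pairs is one simple way to summarize a sequence.
bigrams = Counter(zip(clicks, clicks[1:]))

print(vocabulary)
print(matrix)
print(bigrams.most_common(1))
```

The same baskets appear in three of the four forms, which is the point of choosing a representation: it determines which patterns are easy to look for.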
This course will teach you how to characterize and explain your data by looking for patterns and similarities, which are basic building blocks for advanced analysis and machine learning models.
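To make "looking for similarities" concrete, here is a small sketch of two widely used measures: Jaccard similarity for itemsets and cosine similarity for vectors. The course description does not say which measures it covers, so the choice of these two, the function names, and the convention of returning 0.0 for empty inputs are all assumptions.

```python
import math


def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    a, b = set(a), set(b)
    union = a | b
    # Assumed convention: two empty sets have similarity 0.0.
    return len(a & b) / len(union) if union else 0.0


def cosine(u, v):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    # Assumed convention: a zero vector has similarity 0.0 with anything.
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical inputs for illustration only.
print(jaccard({"milk", "bread"}, {"bread", "butter"}))
print(cosine([1, 0, 1], [1, 1, 0]))
```

Jaccard ignores how often an item occurs and only asks whether it is present, while cosine works on numeric vectors and is insensitive to their overall length.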
This is the first course in “More Applied Data Science with Python,” a four-course series focused on helping you apply advanced data science techniques using Python. It is recommended that all learners complete the Applied Data Science with Python specialization prior to beginning this course.
Data Mining in Python, part of the More Applied Data Science with Python series, introduces core concepts and techniques for discovering patterns in complex datasets. Learners work with itemsets, vectors, matrices, and sequences while applying similarity measures and algorithms through quizzes and programming assignments focused on real-world data representations.
This abbreviated syllabus description was created with the help of AI tools and reviewed by staff. The full syllabus is available to those who enroll in the course.
Module 1: Basic Concepts of Data Mining
Module 2: Mining Itemset Data
Module 3: Mining Vector and Matrix Data
Module 4: Mining Sequences
The course grade is based on four quizzes, worth 20% in total (5% each), and four programming assignments, worth 80% in total: the first is worth 5%, and the remaining three are worth 25% each.
Course content developed by U-M faculty and managed by the university. Faculty titles and affiliations are updated periodically.
Advanced Level
A basic understanding of linear algebra is recommended, as is completing the courses of the “More Applied Data Science with Python” series in order.
Each course in this series will help you build data analytics skills using Python and increase your understanding of the role of data in shaping decisions.
Qiaozhu Mei, Professor of Information, Associate Dean for Research and Innovation, School of Information