Lesson 8: Machine Learning - Classification, Dimensionality Reduction¶

You already started doing Machine Learning when you performed the linear regression analysis. Let’s first talk about Machine Learning in general; then, as promised earlier, we’ll move to our larger dataset.

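Machine Learning (Learn from data and make decisions)	Supervised Learning (Predictive Model)	Classification
Machine Learning (Learn from data and make decisions)	Supervised Learning (Predictive Model)	Regression
Machine Learning (Learn from data and make decisions)	Unsupervised Learning (Non-predictive Model)	Clustering
Machine Learning (Learn from data and make decisions)	Unsupervised Learning (Non-predictive Model)	Dimensionality Reduction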

Supervised Learning:¶

Use a training set of inputs paired with their correct outputs to build a model, then use that model to predict the outputs for test data inputs.

Classification:¶

Inputs(X): Features
Outputs(y): Class labels (binary or multiple classes)
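
As a minimal sketch of what classification looks like in code, the example below trains a nearest-centroid classifier on a training set and checks it on held-out test data. Everything here is an assumption made for illustration: the data is synthetic, the function names `fit_centroids` and `predict` are hypothetical, and only NumPy is used. In practice a library such as scikit-learn would typically handle the splitting and the model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumed for illustration): two classes, two features each.
X0 = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))  # class 0
X1 = rng.normal(loc=[3.0, 3.0], scale=0.5, size=(50, 2))  # class 1
X = np.vstack([X0, X1])
y = np.array([0] * 50 + [1] * 50)

# Shuffle, then hold out 20% of the samples as a test set.
order = rng.permutation(len(X))
X, y = X[order], y[order]
n_test = len(X) // 5
X_test, y_test = X[:n_test], y[:n_test]
X_train, y_train = X[n_test:], y[n_test:]


def fit_centroids(X, y):
    """Learn one centroid (mean feature vector) per class label."""
    classes = np.unique(y)
    centroids = np.array([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids


def predict(X, classes, centroids):
    """Assign each sample the label of its nearest centroid."""
    # Distances have shape (n_samples, n_classes).
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]


classes, centroids = fit_centroids(X_train, y_train)
y_pred = predict(X_test, classes, centroids)
accuracy = (y_pred == y_test).mean()
print(f"test accuracy: {accuracy:.2f}")
```

The key point is that the model only sees `X_train` and `y_train` while learning; `y_test` is used solely to score the predictions.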

Regression:¶

Inputs(X): Independent Variable
Outputs(y): Dependent Variable (Continuous)
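
For comparison with classification, here is a small sketch of regression, where the output is a continuous number rather than a class. The data is synthetic and the true relationship (slope 2, intercept 1) is an assumption chosen for illustration; the fit uses NumPy's `np.polyfit` with degree 1, which performs an ordinary least-squares line fit.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumed): y = 2x + 1 plus a little Gaussian noise.
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.shape)

# Fit a straight line; coefficients come back highest degree first.
slope, intercept = np.polyfit(x, y, deg=1)

# Predict the continuous output for a new input value.
x_new = 4.0
y_new = slope * x_new + intercept
print(f"slope={slope:.2f}, intercept={intercept:.2f}, prediction at x=4: {y_new:.2f}")
```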

Unsupervised Learning:¶

Find patterns among the inputs (features) when the data has no labels

Clustering:¶

Find groups within the data (Example: a phylogenetic tree)
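
To make the idea of finding groups concrete, the sketch below implements a bare-bones k-means, one common clustering algorithm, with NumPy. It is an illustration under stated assumptions rather than the method used later in this course: the data is synthetic, the function name `kmeans` and its parameters are hypothetical, and the number of clusters `k` is given in advance. Note that no labels are passed in; the algorithm groups points by distance alone.

```python
import numpy as np


def kmeans(X, k, n_iter=100, seed=0):
    """Very small k-means: returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct samples chosen at random.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every sample to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its assigned samples
        # (keep the old center if a cluster ends up empty).
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


# Synthetic data (assumed): two well-separated blobs, with no labels supplied.
rng = np.random.default_rng(1)
blob_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
blob_b = rng.normal(loc=[5.0, 5.0], scale=0.5, size=(50, 2))
X = np.vstack([blob_a, blob_b])

labels, centers = kmeans(X, k=2)
print(np.bincount(labels))  # number of samples in each discovered group
```

The cluster numbers themselves (0 or 1) are arbitrary; only the grouping is meaningful.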

Dimensionality Reduction:¶

Find a lower-dimensional representation of high-dimensional data
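
One widely used technique for this is principal component analysis (PCA). The sketch below computes it with NumPy's singular value decomposition; it is a minimal illustration under assumptions, not necessarily the approach taken elsewhere in this lesson. The data is synthetic (5 features that mostly vary along a single hidden direction), and the function name `pca` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumed): 100 samples, 5 features, driven by one latent factor.
latent = rng.normal(size=(100, 1))
direction = np.array([[1.0, 2.0, 0.0, -1.0, 0.5]])
X = latent @ direction + 0.1 * rng.normal(size=(100, 5))


def pca(X, n_components):
    """Project X onto its top principal components using the SVD."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by variance explained.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    explained_ratio = (S**2 / np.sum(S**2))[:n_components]
    return X_centered @ components.T, components, explained_ratio


X_2d, components, ratio = pca(X, n_components=2)
print(X_2d.shape)  # (100, 2): same samples, fewer dimensions
print(f"variance explained by first component: {ratio[0]:.2f}")
```

The entries of each row of `components` show how strongly each original feature contributes to that component, which is one way to see which features matter most in the reduced representation.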

We built a linear regression model in the last lesson. The Classification and Dimensionality Reduction component of the ML lesson is currently available as a Kaggle Notebook on tumor classification between AML and ALL, which also covers finding the top genes contributing to the classification.

Next, we will explore Clustering.