# Dimensionality Reduction
There are two aspects to data:
- Quality
- Quantity
In an ideal scenario we would have high-quality data in sufficient quantity. However, sometimes we run into problems:
- Lack of computational resources to train on our dataset
- Noise in the data that does not contribute to the predictive power we are looking for
- Sparsity in the data (many zero or missing values)
One way of alleviating these problems is to modify the data set so that it becomes more digestible by reducing its dimensions. There are two types of dimensionality reduction:
1. Feature Selection: Selecting a subset of the existing features (and discarding the rest)
2. Feature Extraction: Creating entirely new features from existing ones
- Linear Techniques
a. Random Projection (see the sketch after this list)
1. **Gaussian Random Projection:** Multiplying the input with a dense random matrix whose elements are drawn from a Gaussian distribution.
2. **Sparse Random Projection:** Multiplying the input with a sparse random matrix, which is faster to compute and more memory-efficient.
b. [[Principal Component Analysis]]: Projecting the data onto orthogonal directions (spanning a lower-dimensional linear subspace) that capture the maximum variance in the data (see the PCA sketch below).
c. Matrix Factorization Techniques:
1. **Singular Value Decomposition (SVD):** Not explicitly a dimensionality reduction method but a matrix factorization technique. However, it can be used for dimensionality reduction via its low-rank approximation (see the SVD sketch below).
2. **CUR Decomposition:** Approximating a matrix with a subset of its actual columns (C) and rows (R) linked by a small matrix (U), which keeps the factors directly interpretable.
- Non-Linear Techniques
a. t-SNE: Embedding data in two or three dimensions while preserving local neighborhood structure by matching pairwise similarity distributions; mainly used for visualization (see the sketch below).
b. Kernel PCA: Applying ordinary PCA in a high-dimensional feature space implicitly defined by a kernel, which lets it capture non-linear structure.
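Below is a minimal sketch of both random-projection variants using scikit-learn. The array shape, `n_components` value, and random seeds are illustrative assumptions, not values from this note.

```python
import numpy as np
from sklearn.random_projection import GaussianRandomProjection, SparseRandomProjection

rng = np.random.RandomState(42)
X = rng.rand(100, 10_000)  # assumed toy data: 100 samples with 10,000 features

# Dense Gaussian random matrix: entries drawn from N(0, 1/n_components)
grp = GaussianRandomProjection(n_components=500, random_state=42)
X_dense = grp.fit_transform(X)

# Sparse random matrix: mostly zeros, so projection is faster and more memory-efficient
srp = SparseRandomProjection(n_components=500, dense_output=True, random_state=42)
X_sparse = srp.fit_transform(X)

print(X_dense.shape, X_sparse.shape)  # (100, 500) (100, 500)
```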
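A corresponding PCA sketch, also with scikit-learn; the toy data and the choice of two components are assumptions for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
X = rng.rand(200, 50)  # assumed toy data: 200 samples with 50 features

pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)  # project onto the top-2 principal components

# Fraction of the total variance captured by each retained component
print(X_2d.shape, pca.explained_variance_ratio_)
```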
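The SVD-based reduction can be sketched directly with NumPy; the matrix size and target rank `k` are arbitrary assumptions. Keeping only the top `k` singular triplets gives the best rank-`k` approximation of the matrix (Eckart-Young theorem).

```python
import numpy as np

rng = np.random.RandomState(0)
A = rng.rand(100, 40)  # assumed toy matrix

# Thin SVD: A = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 10  # target rank, assumed for illustration
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]  # best rank-k approximation of A

# k-dimensional representation of the rows of A
A_reduced = U[:, :k] * s[:k]
print(A_k.shape, A_reduced.shape)  # (100, 40) (100, 10)
```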
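Finally, a sketch of the two non-linear techniques with scikit-learn; the data, perplexity, kernel choice, and gamma value are illustrative assumptions.

```python
import numpy as np
from sklearn.manifold import TSNE
from sklearn.decomposition import KernelPCA

rng = np.random.RandomState(0)
X = rng.rand(300, 20)  # assumed toy data: 300 samples with 20 features

# t-SNE: non-linear embedding that preserves local neighborhoods;
# mainly for 2-D/3-D visualization, not a general-purpose transform
X_tsne = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)

# Kernel PCA: PCA in the feature space implicitly defined by an RBF kernel
X_kpca = KernelPCA(n_components=2, kernel="rbf", gamma=0.1).fit_transform(X)

print(X_tsne.shape, X_kpca.shape)  # (300, 2) (300, 2)
```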