Principal Component Analysis (PCA)
PCA reduces high-dimensional data to a smaller set of uncorrelated components that retain as much of the variance as possible, which makes the data easier to analyze or plot.
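from sklearn.decomposition import PCA

# Project the scaled features onto the first two principal components
pca_model = PCA(n_components=2)
principal_components = pca_model.fit_transform(scaled_X)

pca_model.components_  # principal axes, shape (n_components, n_features)
pca_model.explained_variance_ratio_  # fraction of total variance explained by each component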
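The snippet above uses scaled_X without showing where it comes from. A minimal sketch of the full pipeline follows, assuming the features are standardized with StandardScaler before PCA. The random array X is a hypothetical stand-in for real data, not part of the original notes.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in data: 200 samples, 10 features on very different scales
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10)) * np.arange(1, 11)

# Standardize each feature to zero mean and unit variance, so that
# large-scale features do not dominate the principal components
scaled_X = StandardScaler().fit_transform(X)

pca_model = PCA(n_components=2)
principal_components = pca_model.fit_transform(scaled_X)

print(principal_components.shape)   # (200, 2): one 2-D point per sample
print(pca_model.components_.shape)  # (2, 10): one axis per component
print(pca_model.explained_variance_ratio_.sum())  # variance kept by 2 components
```

The two columns of principal_components can be passed directly to a scatter plot.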
Elbow method to determine optimal number of components
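import numpy as np
import matplotlib.pyplot as plt

# Total variance explained for 1..29 components
# (assumes scaled_X has at least 29 features)
explained_variance = []
for n in range(1, 30):
    pca = PCA(n_components=n)
    pca.fit(scaled_X)
    explained_variance.append(np.sum(pca.explained_variance_ratio_))

plt.plot(list(range(1, 30)), explained_variance)
plt.xlabel('Number of components')
plt.ylabel('Variance explained')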
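The loop above refits PCA once per candidate value of n. With an exact solver, the same curve should be obtainable from a single fit that keeps all components, followed by a cumulative sum. The sketch below assumes that approach. The random data and the 95% threshold are illustrative choices, not part of the original notes.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in data: 200 samples, 30 features
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))
scaled_X = StandardScaler().fit_transform(X)

# Fit once, keeping every component (n_components defaults to None)
pca_full = PCA().fit(scaled_X)

# Entry k-1 is the fraction of variance explained by the first k components
cumulative_variance = np.cumsum(pca_full.explained_variance_ratio_)

# Smallest number of components whose cumulative variance reaches the threshold
threshold = 0.95  # illustrative cut-off, an assumption
n_needed = int(np.searchsorted(cumulative_variance, threshold)) + 1
print(n_needed, cumulative_variance[n_needed - 1])
```

Plotting cumulative_variance against range(1, len(cumulative_variance) + 1) gives the elbow curve without the repeated fits.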