This article was automatically translated from the original Turkish version.
Principal Component Analysis (PCA) is a transformation technique that reduces the dimensionality of datasets containing many interrelated variables while preserving as much of the variation in the data as possible. The method was introduced by Karl Pearson in 1901 and further developed by Hotelling in 1933. The goal is to find the transformation that best allows the data to be represented using fewer variables. The variables obtained after the transformation are called principal components and are ordered by decreasing variance, so the component with the highest variance comes first.
Principal Component Analysis (PCA) is commonly used for the following purposes:
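PCA yields as many principal components as there are variables, but the two examined most often are the first principal component (PC1), which captures the largest share of the variance, and the second principal component (PC2), which captures the largest share of the remaining variance while being orthogonal to PC1.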
When PCA is applied, a scatter plot is typically used to illustrate the relationship between PC1 and PC2. The axes for PC1 and PC2 are shown perpendicular to each other. The first and second principal components are graphically represented below.

Figure: First and second principal components (generated by artificial intelligence).
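The perpendicularity of the PC1 and PC2 axes can be checked numerically. The following is a minimal sketch, assuming NumPy and a synthetic two-variable dataset; the data and all variable names are illustrative and do not come from the source.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 200 observations of 2 correlated variables.
x = rng.normal(size=200)
data = np.column_stack([x, 0.5 * x + 0.2 * rng.normal(size=200)])

# Center the data and compute its 2 x 2 covariance matrix.
centered = data - data.mean(axis=0)
cov = np.cov(centered, rowvar=False)

# eigh returns eigenvalues in ascending order, so reverse the order.
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
pc1_axis = eigvecs[:, order[0]]  # direction of highest variance
pc2_axis = eigvecs[:, order[1]]  # direction of the remaining variance

# Coordinates of each observation along PC1 and PC2 (the scatter-plot axes).
scores = centered @ eigvecs[:, order]

print("dot product of the two axes:", pc1_axis @ pc2_axis)
print("variance along PC1 and PC2:", scores.var(axis=0, ddof=1))
```

The dot product should be zero up to floating-point error, and the variance along PC1 should not be smaller than the variance along PC2.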
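Let the data matrix $X$ consist of $n$ observation vectors, each of size $p$, represented as $X = [x_1, x_2, \dots, x_n]^\top \in \mathbb{R}^{n \times p}$. In the matrix $X$, each column represents a different variable (data type), as described below.
$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{bmatrix}$$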
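Since the variables may have different units of measurement, the data are standardized. Here, standardization is performed by centering each variable so that its mean becomes zero, which is done by subtracting the mean of each variable (column) from every observation of that variable. When the scales of the variables differ greatly, each centered variable is commonly also divided by its standard deviation.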
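After subtracting the means, the centered matrix $\tilde{X}$ is obtained as follows.
$$\tilde{X} = X - \mathbf{1}_n \bar{x}^\top, \qquad \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
Here $X$ is the $n \times p$ data matrix with rows $x_i^\top$, $\bar{x}$ is the vector of column means, and $\mathbf{1}_n$ is the column vector of $n$ ones.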
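The centering step described above can be sketched in NumPy as follows. The data values and the names `X`, `column_means`, and `X_centered` are illustrative assumptions, not taken from the source.

```python
import numpy as np

# Hypothetical data: 4 observations (rows) of 2 variables (columns).
X = np.array([[2.0, 10.0],
              [4.0, 14.0],
              [6.0, 18.0],
              [8.0, 22.0]])

# One mean per variable, computed down each column.
column_means = X.mean(axis=0)

# Broadcasting subtracts each column's mean from every row.
X_centered = X - column_means

print(column_means)             # means of the original variables
print(X_centered.mean(axis=0))  # means after centering
```

After centering, every column mean is zero, while the spread of each variable is unchanged.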
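In the next step, the covariance matrix $C$ of the centered data $\tilde{X}$ ($n$ observations in rows, $p$ variables in columns) is calculated as follows.
$$C = \frac{1}{n-1} \tilde{X}^\top \tilde{X}$$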
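As a rough illustration, the covariance matrix of centered data can be computed directly and compared with NumPy's built-in `np.cov`. This sketch assumes the sample covariance with an $n - 1$ denominator, which is the default of `np.cov`; the source does not state which normalization it uses, and the data are made up.

```python
import numpy as np

# Hypothetical data: 4 observations (rows) of 2 variables (columns).
X = np.array([[2.0, 1.0],
              [4.0, 3.0],
              [6.0, 2.0],
              [8.0, 6.0]])

X_centered = X - X.mean(axis=0)
n = X.shape[0]

# Sample covariance: products of centered columns, divided by n - 1 (assumed).
C = X_centered.T @ X_centered / (n - 1)

# np.cov treats rows as variables by default, so rowvar=False is needed here.
C_reference = np.cov(X, rowvar=False)

print(C)
print(np.allclose(C, C_reference))
```

The result is a symmetric $p \times p$ matrix whose diagonal holds the variance of each column.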
Variance and covariance are used to understand how variables behave within a dataset. In the covariance matrix, the diagonal entries are the variances, which indicate the spread of a single variable around its mean, and the off-diagonal entries are the covariances. Covariance indicates how two variables change together: positive covariance means that when one variable increases, the other tends to increase as well (and when one decreases, so does the other); negative covariance means that when one variable increases, the other tends to decrease. The obtained covariance matrix $C$ undergoes eigenvalue-eigenvector decomposition.
$$C v_j = \lambda_j v_j, \qquad j = 1, \dots, p$$
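Here, $\lambda_j$ represents the eigenvalues and $v_j$ represents the corresponding eigenvectors of the covariance matrix. The eigenvalues are ordered from largest to smallest, $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p$, and the first $k$ eigenvectors, corresponding to the $k$ largest eigenvalues, are selected to form the columns of the projection matrix $W = [v_1, v_2, \dots, v_k]$.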
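The decomposition and selection step can be sketched as below. This is a minimal example, assuming a small hand-picked covariance matrix and `np.linalg.eigh`, which is intended for symmetric matrices; the matrix, the choice `k = 1`, and the variable names are illustrative assumptions.

```python
import numpy as np

# Hypothetical 2 x 2 covariance matrix (symmetric by construction).
C = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh returns eigenvalues in ascending order with eigenvectors as columns.
eigvals, eigvecs = np.linalg.eigh(C)

# Reorder from largest to smallest eigenvalue.
order = np.argsort(eigvals)[::-1]
eigvals_sorted = eigvals[order]
eigvecs_sorted = eigvecs[:, order]

# Keep the first k eigenvectors as the columns of the projection matrix.
k = 1
W = eigvecs_sorted[:, :k]

print(eigvals_sorted)  # eigenvalues of this matrix are 3 and 1
print(W)               # unit vector along the diagonal direction (sign may vary)
```

The sign of an eigenvector is arbitrary, so `W` may come out as either orientation of the same axis.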
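Using the projection matrix $W$, whose columns are the $k$ selected eigenvectors, the dimensionality of the data is reduced from $p$ dimensions to $k$ dimensions for $k < p$.
$$Z = \tilde{X} W$$
Here $\tilde{X}$ is the centered $n \times p$ data matrix and $Z$ is the $n \times k$ matrix of projected data.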
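Putting the steps together, the projection could be sketched as a single function. The function name `pca_project`, the random test data, and the use of the sample covariance are assumptions made for illustration; this is not presented as the source's implementation.

```python
import numpy as np

def pca_project(X, k):
    """Project the rows of X onto the k leading principal components."""
    # Center each variable (column) at zero.
    X_centered = X - X.mean(axis=0)
    # Sample covariance matrix of the variables (p x p).
    C = np.cov(X_centered, rowvar=False)
    # Eigen-decomposition of the symmetric covariance matrix.
    eigvals, eigvecs = np.linalg.eigh(C)
    # Indices of the k largest eigenvalues, largest first.
    order = np.argsort(eigvals)[::-1][:k]
    W = eigvecs[:, order]   # p x k projection matrix
    return X_centered @ W   # n x k reduced data

# Hypothetical data: 50 observations of 5 variables, reduced to 2 dimensions.
rng = np.random.default_rng(42)
X = rng.normal(size=(50, 5))
Z = pca_project(X, k=2)

print(Z.shape)  # (50, 2)
```

The columns of `Z` are uncorrelated with each other, and the first column has the larger variance, which matches the ordering of the principal components.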
H. Bulut, R Uygulamaları ile Çok Değişkenli İstatistiksel Yöntemler. Ankara: Nobel Akademik Yayıncılık, 2018.
H. Hotelling, "Analysis of a complex of statistical variables into principal components," Journal of Educational Psychology, vol. 24, pp. 417–441, 1933.
H. S. Yavuz and M. A. Çay, "Applications of the Principal Component Analysis Method and Some Classical and Robust Adaptations in Face Recognition," ESOGÜ Journal of Engineering and Architecture Faculty, vol. 22, no. 1, pp. 49–63, 2009.
IBM, "Principal Component Analysis." Accessed May 17, 2025.
K. Pearson, "On lines and planes of closest fit to systems of points in space," Philosophical Magazine, vol. 2, no. 11, pp. 559–572, 1901.