Principal Component Analysis (PCA) is a widely used dimensionality reduction algorithm. PCA essentially finds orthogonal projections of the data in a dataset. Through these projections it finds the directions along which the dataset has maximum variance, and those directions reveal the linear correlation among the dataset's features.
That is, if a given dataset has some linearly correlated features, PCA can find a suitable orthogonal direction that represents nearly all of that data along a single direction.
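To make the idea of a single direction concrete, here is a minimal sketch using NumPy. The data, the seed, and the variable names (x, y, direction, explained) are assumptions made up for illustration; the direction is taken from the eigenvectors of the covariance matrix, which is one common way to compute it.

```python
import numpy as np

rng = np.random.default_rng(0)                   # fixed seed, illustrative data only
x = rng.normal(size=500)
y = 2.0 * x + rng.normal(scale=0.1, size=500)    # y is almost a linear function of x
X = np.column_stack([x, y])                      # 500 samples, 2 correlated features

Xc = X - X.mean(axis=0)                          # center each feature
cov = np.cov(Xc, rowvar=False)                   # 2x2 covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)           # eigh returns eigenvalues in ascending order

direction = eigvecs[:, -1]                       # eigenvector with the largest eigenvalue
explained = eigvals[-1] / eigvals.sum()          # share of total variance along that direction

print("first principal direction:", direction)
print("share of variance it holds:", explained)
```

Because the two features are almost perfectly linearly related, nearly all of the variance should fall along this one direction, whose slope is close to 2.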
PCA is built from a set of principal components, so let's look at what a principal component is.
Principal component: A principal component is a new variable built from the initial variables (the raw dataset) as a linear combination, or mixture, of them.
The new dataset is made up of these principal components, and there can be one or more of them. For example, a dataset with 100 dimensions has 100 principal components (assuming it has more than 100 samples; otherwise there are fewer). The principal components are arranged in descending order of the information (variance) they carry …
The new variables are totally uncorrelated with each other, and as much of the information in the initial variables as possible is compressed into the 1st principal component.
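The two properties above (each new variable is a linear combination of the initial variables, and the new variables are uncorrelated) can be checked with a small sketch. The data and names here (mix, W, Z) are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
mix = rng.normal(size=(3, 3))                    # hypothetical mixing matrix
X = rng.normal(size=(200, 3)) @ mix              # 3 correlated initial variables
Xc = X - X.mean(axis=0)                          # center each variable

eigvals, W = np.linalg.eigh(np.cov(Xc, rowvar=False))
order = np.argsort(eigvals)[::-1]                # sort from most to least variance
eigvals, W = eigvals[order], W[:, order]

# Each column of Z is a linear combination of the initial variables,
# with the weights given by the matching column of W.
Z = Xc @ W

cov_Z = np.cov(Z, rowvar=False)
off_diag = cov_Z - np.diag(np.diag(cov_Z))       # covariances between different components

print("largest covariance between components:", np.abs(off_diag).max())
print("variance of each component:", np.diag(cov_Z))
```

The off-diagonal covariances come out as zero up to floating-point error, and the component variances appear in descending order.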
PCA tries to put as much information as possible into the 1st principal component, then as much of the remaining information as possible into the 2nd principal component, and so on; this is how the ordering of principal components by information arises. This is shown in the figure below.
[picture here]
With the principal components arranged in this order (higher information to lower information), we can easily create a lower-dimensional dataset (the new dataset) while losing very little information.
So, by dropping the low-information principal components and keeping the rest, a new dataset of new variables is created from the initial variables (the raw dataset).
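As a rough sketch of dropping the low-information components, the example below keeps only the top k components of a made-up 5-feature dataset and measures how much information is lost. The data, the feature scales, and the choice k = 2 are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical data: 300 samples, 5 features with very different spreads.
X = rng.normal(size=(300, 5)) * np.array([5.0, 3.0, 1.0, 0.2, 0.1])
mean = X.mean(axis=0)
Xc = X - mean

eigvals, W = np.linalg.eigh(np.cov(Xc, rowvar=False))
order = np.argsort(eigvals)[::-1]                # most informative component first
eigvals, W = eigvals[order], W[:, order]

k = 2                                            # number of components to keep (assumed)
Z = Xc @ W[:, :k]                                # lower-dimensional dataset, shape (300, 2)
X_approx = Z @ W[:, :k].T + mean                 # map back to the original 5 features

kept = eigvals[:k].sum() / eigvals.sum()         # share of variance kept
lost = ((X - X_approx) ** 2).sum() / (Xc ** 2).sum()   # share lost in reconstruction

print("kept:", kept, "lost:", lost)
```

The kept and lost shares add up to 1, so the variance of the discarded components is exactly the information given up by the reduction.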
Example: As noted, a 100D dataset has 100 principal components. When PCA reduces dimensions, it drops the low-variance components and builds a lower-dimensional dataset from the higher-dimensional one. In other words, to pick the best principal components, PCA treats low-information (low-variance) components as noise. It discards them and creates a new dataset whose dimension can be much smaller than that of the main dataset (depending on how much information each principal component captures).
Suppose the main dataset is called A. A has 100 dimensions and therefore 100 principal components. Say that, of those 100, the first 20 principal components hold 95% of the information (variance) in the data.
PCA then treats components 21 to 100 as noise and removes them. The remaining 20 principal components form a new dataset, B.
So by applying PCA we go from a higher-dimensional dataset (A: 100 dimensions) to a lower-dimensional dataset (B: 20 dimensions) that still holds 95% of A's information.
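The A-to-B example can be sketched roughly as follows. The dataset here is synthetic (100 features generated from 20 hidden factors plus a little noise), so the number of components needed to reach 95% is whatever this made-up data gives, not necessarily 20. All names and sizes are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n_samples, n_features, n_latent = 500, 100, 20   # assumed sizes

# Hypothetical dataset A: 100 observed features driven by 20 hidden factors plus small noise.
latent = rng.normal(size=(n_samples, n_latent))
mixing = rng.normal(size=(n_latent, n_features))
A = latent @ mixing + 0.1 * rng.normal(size=(n_samples, n_features))

Ac = A - A.mean(axis=0)
eigvals, W = np.linalg.eigh(np.cov(Ac, rowvar=False))
order = np.argsort(eigvals)[::-1]                # most informative component first
eigvals, W = eigvals[order], W[:, order]

cumulative = np.cumsum(eigvals) / eigvals.sum()  # running share of total variance
k = int(np.searchsorted(cumulative, 0.95) + 1)   # smallest k whose share reaches 95%

B = Ac @ W[:, :k]                                # reduced dataset B with k dimensions

print("components kept:", k)
print("variance held by B:", cumulative[k - 1])
print("shape of B:", B.shape)
```

In practice the threshold (95% here) is a choice made by whoever applies PCA, and the resulting number of components depends on the data.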
Note: It has been observed that the noise features or low-variance data supervised