Wednesday, January 30, 2008

Singular Value Decomposition (SVD) and Search Engines (SEO)

Singular Value Decomposition (SVD) is a powerful, fully automatic statistical method used by Latent Semantic Analysis (LSA). The SVD algorithm is O(N^2 k^3), where N is the number of terms plus documents and k is the number of dimensions in the concept space. SVD is hard to use for a large, dynamic collection: the cost grows quickly with the size of the collection, and it is hard to choose the number of dimensions.

Latent Semantic Indexing (LSI) is slow because it uses SVD to create its concept space. LSI assumes that there is some underlying, or latent, structure in word usage that is partially obscured by variability in word choice. A truncated SVD is therefore used to estimate the structure in word usage across documents. Retrieval is then performed using the database of singular values and vectors obtained from the truncated SVD. Data shows that these statistically derived vectors are more robust indicators of meaning than individual terms.
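
The retrieval step described above can be sketched in a few lines of Python with NumPy. This is only a minimal illustration, not the method of any particular search engine: the toy term-document matrix, the function names (lsi_fit, fold_in, cosine) and the choice k = 2 are all assumptions made for the example. The query is folded into concept space with the common formula q_hat = S_k^-1 U_k^T q and compared to the document vectors by cosine similarity.

```python
import numpy as np

# Toy term-document matrix (an assumption for illustration).
# Rows are terms, columns are documents.
terms = ["car", "automobile", "engine", "flower", "petal"]
A = np.array([
    [1, 0, 1, 0],   # car
    [0, 1, 1, 0],   # automobile
    [1, 1, 1, 0],   # engine
    [0, 0, 0, 1],   # flower
    [0, 0, 0, 1],   # petal
], dtype=float)


def lsi_fit(A, k):
    """Return the truncated SVD factors U_k, s_k, Vt_k of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]


def fold_in(q, U_k, s_k):
    """Map a query from term space into the k-dimensional concept space."""
    return (U_k.T @ q) / s_k


def cosine(a, b):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


U_k, s_k, Vt_k = lsi_fit(A, k=2)
doc_vectors = Vt_k.T                             # one row per document

query = np.array([1, 0, 0, 0, 0], dtype=float)   # the single term "car"
q_hat = fold_in(query, U_k, s_k)
scores = [cosine(q_hat, d) for d in doc_vectors]

for i, score in enumerate(scores):
    print(f"document {i}: {score:.3f}")
```

In this toy collection the second document contains "automobile" and "engine" but not "car", yet it still scores highly for the query "car", while the document about flowers does not. That is the kind of latent structure LSI is meant to recover.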

SVD and LSI are least-squares methods. The projection into the latent semantic space is chosen so that the representations in the original space change as little as possible, measured by the sum of the squared differences between the original and the projected representations. The projection transforms a document's vector in n-dimensional word space into a vector in the k-dimensional reduced space.
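
The least-squares property can be checked numerically. The sketch below uses a small random matrix (an assumption for illustration, as are the variable names) and compares the squared error of the rank-k SVD approximation with the sum of the squared discarded singular values, and with the error of an arbitrary k-dimensional projection.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((6, 4))        # 6 terms (n = 6), 4 documents
k = 2

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Rank-k approximation built from the k largest singular values.
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Sum of squared differences between the original and the approximation.
squared_error = np.sum((A - A_k) ** 2)

# It should equal the sum of the squared singular values that were dropped.
discarded = np.sum(s[k:] ** 2)

# Any other k-dimensional orthogonal projection should do no better.
Q, _ = np.linalg.qr(rng.standard_normal((6, k)))   # random orthonormal basis
other_error = np.sum((A - Q @ Q.T @ A) ** 2)

# Project one document from n-dimensional word space into k dimensions.
doc = A[:, 0]
doc_k = U[:, :k].T @ doc

print(squared_error, discarded, other_error)
print(doc.shape, "->", doc_k.shape)
```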

One can prove that the SVD is essentially unique: the singular values of a given matrix are uniquely determined, and when they are all distinct the singular vectors are determined up to sign. The key property for capturing word co-occurrence patterns is that SVD finds an optimal projection to a low-dimensional space.
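
A small numerical check helps make the uniqueness claim precise. In the sketch below (the matrix and variable names are assumptions for illustration), flipping the sign of a matching pair of left and right singular vectors gives different factors but the same product, while the singular values themselves do not change. So what is uniquely determined is the set of singular values, not every entry of the factor matrices.

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [0.0, 2.0]])

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Flip the sign of the first left and right singular vectors together.
U2, Vt2 = U.copy(), Vt.copy()
U2[:, 0] *= -1
Vt2[0, :] *= -1

# Both factorizations reproduce the same matrix A.
same_product = np.allclose(U @ np.diag(s) @ Vt, U2 @ np.diag(s) @ Vt2)

# The factors themselves are different.
different_factors = not np.allclose(U, U2)

# The singular values are the same however they are computed.
s_again = np.linalg.svd(A, compute_uv=False)

print(same_product, different_factors, np.allclose(s, s_again))
```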

Jose Nuez is a Scientific SEO/SEM Specialist with a PhD in Computer Engineering Technology. He also owns and operates Search Engines By Hand, an online resource focusing on Search Engines (SE) and Artificial Intelligence (AI).