Access Restriction Open

Author Pennington, Jeffrey ♦ Socher, Richard ♦ Manning, Christopher D.
Source CiteSeerX
Content type Text
File Format PDF
Language English
Subject Domain (in DDC) Computer science, information & general works ♦ Data processing & computer science
Subject Keyword Vector Space Representation ♦ Related Model ♦ Nonzero Element ♦ Word-word Co-occurrence Matrix ♦ Recent Word Analogy Task ♦ Statistical Information ♦ Entity Recognition ♦ Local Context Window Method ♦ Syntactic Regularity ♦ New Global Log-bilinear Regression Model ♦ Major Model Family ♦ Entire Sparse Matrix ♦ Word Vector ♦ Recent Method ♦ Word Representation ♦ Global Vector ♦ Large Corpus ♦ Vector Space ♦ Individual Context Window ♦ Similarity Task ♦ Global Matrix Factorization ♦ Model Property ♦ Meaningful Substructure
Description Recent methods for learning vector space representations of words have succeeded in capturing fine-grained semantic and syntactic regularities using vector arithmetic, but the origin of these regularities has remained opaque. We analyze and make explicit the model properties needed for such regularities to emerge in word vectors. The result is a new global log-bilinear regression model that combines the advantages of the two major model families in the literature: global matrix factorization and local context window methods. Our model efficiently leverages statistical information by training only on the nonzero elements in a word-word co-occurrence matrix, rather than on the entire sparse matrix or on individual context windows in a large corpus. The model produces a vector space with meaningful substructure, as evidenced by its performance of 75% on a recent word analogy task. It also outperforms related models on similarity tasks and named entity recognition.
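The abstract describes a weighted least-squares objective evaluated only on the nonzero entries of the word-word co-occurrence matrix. Below is a minimal NumPy sketch of that objective as published in the GloVe paper; the weighting-function parameters (x_max = 100, alpha = 0.75) follow the paper, while the toy data and variable names are illustrative assumptions, not the reference implementation.

```python
import numpy as np

# Sketch of the global log-bilinear objective:
#   J = sum over nonzero (i, j) of f(X_ij) * (w_i . w~_j + b_i + b~_j - log X_ij)^2
# Only nonzero co-occurrence counts are iterated, which is what makes
# training cheap compared with factorizing the entire sparse matrix.

def weight(x, x_max=100.0, alpha=0.75):
    """Weighting function f(x): caps the influence of very frequent pairs."""
    return np.where(x < x_max, (x / x_max) ** alpha, 1.0)

def glove_loss(W, W_tilde, b, b_tilde, pairs):
    """Weighted squared error summed over nonzero co-occurrence entries."""
    total = 0.0
    for i, j, x_ij in pairs:  # (word index, context index, co-occurrence count)
        err = W[i] @ W_tilde[j] + b[i] + b_tilde[j] - np.log(x_ij)
        total += weight(x_ij) * err ** 2
    return total

# Tiny made-up example: 3 words, 2-dimensional vectors, three nonzero counts.
rng = np.random.default_rng(0)
W, W_tilde = rng.normal(size=(3, 2)), rng.normal(size=(3, 2))
b, b_tilde = np.zeros(3), np.zeros(3)
pairs = [(0, 1, 10.0), (1, 2, 3.0), (0, 2, 1.0)]
loss = glove_loss(W, W_tilde, b, b_tilde, pairs)
```

In practice the word vectors, context vectors, and both bias terms are trained jointly (e.g. with AdaGrad) to minimize this loss; the sketch only evaluates it.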
In EMNLP
Educational Role Student ♦ Teacher
Age Range above 22 years
Educational Use Research
Education Level UG and PG ♦ Career/Technical Study
Learning Resource Type Article
Publisher Date 2014-01-01