Talk:Distributional semantics

From Wikipedia, the free encyclopedia

Sahlgren

The last line of the first paragraph wasn't really informative, but its reference seemed useful, so I deleted the line and added the reference to external links. Kyoakoa (talk) 17:54, 8 August 2011 (UTC)[reply]

Julia Morozova

I propose to edit References so that they look more consistent. Also I'm adding a section about Applications.


The new look of the article will be as follows:

Distributional semantics is a research area that develops and studies theories and methods for quantifying and categorizing semantic similarities between linguistic items based on their distributional properties in large samples of language data. The basic idea of distributional semantics can be summed up in the so-called Distributional hypothesis: linguistic items with similar distributions have similar meanings.

Distributional Hypothesis

The Distributional Hypothesis in linguistics is derived from the semantic theory of language usage: words that are used and occur in the same contexts tend to have similar meanings. [1] The underlying idea that "a word is characterized by the company it keeps" was popularized by Firth. [2] The Distributional Hypothesis is the basis for Statistical Semantics. Although the Distributional Hypothesis originated in Linguistics, [3] it is now receiving attention in Cognitive Science, especially regarding the context of word use. [4] In recent years, the distributional hypothesis has provided the basis for the theory of similarity-based generalization in language learning: the idea that children can figure out how to use words they've rarely encountered before by generalizing about their use from distributions of similar words. [5] [6] The distributional hypothesis suggests that the more semantically similar two words are, the more distributionally similar they will be, and thus the more they will tend to occur in similar linguistic contexts. Whether or not this holds has significant implications both for the data-sparsity problem in computational modeling and for the question of how children are able to learn language so rapidly given relatively impoverished input (also known as the problem of the poverty of the stimulus).

Distributional semantic modeling

Distributional semantics favors the use of linear algebra as a computational tool and representational framework. The basic approach is to collect distributional information in high-dimensional vectors, and to define distributional/semantic similarity in terms of vector similarity. Different kinds of similarities can be extracted depending on which type of distributional information is used to collect the vectors: topical similarities can be extracted by populating the vectors with information on which text regions the linguistic items occur in; paradigmatic similarities can be extracted by populating the vectors with information on which other linguistic items the items co-occur with. Note that the latter type of vectors can also be used to extract syntagmatic similarities by looking at the individual vector components.
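
As a rough illustration of the approach described above, the following sketch builds word-by-word co-occurrence vectors and compares them with cosine similarity. The toy corpus, the window size of 2, the function names, and the choice of cosine as the similarity measure are all assumptions made for this example; the draft does not prescribe any of them.

```python
from collections import Counter, defaultdict
from math import sqrt


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, target in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:  # a word is not its own context
                    vectors[target][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dict-like counters."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy corpus, invented for this example.
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the mouse",
    "the dog chases the ball",
]

vectors = cooccurrence_vectors(corpus)
sim_cat_dog = cosine(vectors["cat"], vectors["dog"])    # words sharing many contexts
sim_cat_milk = cosine(vectors["cat"], vectors["milk"])  # words that merely co-occur
```

In this toy setting "cat" and "dog" come out more similar to each other than "cat" and "milk", because they share context words rather than merely appearing next to each other. That is the paradigmatic kind of similarity; the individual components of a vector (for example the count of "milk" in the vector for "cat") carry the syntagmatic information.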

The basic idea of a correlation between distributional and semantic similarity can be operationalized in many different ways. There is a wide range of computational models implementing distributional semantics, including Latent semantic analysis (LSA), [7] Hyperspace Analogue to Language (HAL), syntax- or dependency-based models, [8] Random indexing, and several variants of the Topic model.

Distributional semantic models differ primarily with respect to the following parameters:

Distributional semantic models that use linguistic items as context have also been referred to as word space models. [9] [10]

Compositional Distributional Semantics

Compositional distributional semantic models are an extension of distributional semantic models that characterize the semantics of entire phrases or sentences. This is achieved by composing the distributional representations of the words that sentences contain. Different approaches to composition have been explored, and are under discussion at established workshops such as SemEval. [11]

Simpler non-compositional models fail to capture the semantics of larger linguistic units because they ignore grammatical structure and logical words, both of which are crucial for understanding those units.
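
A minimal sketch of what composing word vectors can look like, assuming vector addition and element-wise multiplication as the composition functions. These two functions, the helper names, and the toy vectors are assumptions chosen for illustration, not the specific approaches the draft refers to. Addition is insensitive to word order, so the example also shows why ignoring grammatical structure loses meaning.

```python
def compose_additive(*vectors):
    """Sum sparse vectors (dicts) component by component."""
    result = {}
    for vec in vectors:
        for key, value in vec.items():
            result[key] = result.get(key, 0.0) + value
    return result


def compose_multiplicative(u, v):
    """Element-wise product; only components present in both vectors survive."""
    return {k: u[k] * v[k] for k in u if k in v}


# Hypothetical toy word vectors, invented for this example.
dog = {"bark": 2.0, "pet": 1.0}
bites = {"teeth": 1.0, "pet": 1.0}
man = {"person": 3.0, "teeth": 1.0}

dog_bites_man = compose_additive(dog, bites, man)
man_bites_dog = compose_additive(man, bites, dog)

# Addition cannot distinguish the two sentences.
same_representation = dog_bites_man == man_bites_dog

# Multiplication keeps only the shared component ("pet").
dog_bites = compose_multiplicative(dog, bites)
```

Both sentences receive the same representation under addition, which is the kind of limitation that structure-aware composition methods try to address.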

Applications

Distributional semantic models have been successfully applied to the following tasks:

  • finding semantic similarity between words and multi-word expressions;
  • word clustering based on semantic similarity;
  • automatic creation of thesauri and bilingual dictionaries;
  • lexical ambiguity resolution;
  • expanding search requests using synonyms and associations;
  • defining the topic of a document;
  • document clustering for information retrieval;
  • data mining and named entity recognition;
  • creating semantic maps of different subject domains;
  • paraphrasing;
  • sentiment analysis;
  • modeling of selectional preferences of words.

Software

See also

References

  1. Harris 1954
  2. Firth 1957
  3. Sahlgren 2008
  4. McDonald & Ramscar 2001
  5. Gleitman 2002
  6. Yarlett 2008
  7. Deerwester et al. 1990
  8. Padó & Lapata 2007
  9. Schütze 1993
  10. Sahlgren 2006
  11. "SemEval-2014, Task 1".

Sources

  • Harris, Z. (1954). "Distributional structure". Word. 10 (2–3): 146–162.
  • Firth, J.R. (1957). "A synopsis of linguistic theory 1930–1955". Studies in Linguistic Analysis. Oxford: Philological Society: 1–32. Reprinted in F.R. Palmer, ed. (1968). Selected Papers of J.R. Firth 1952–1959. London: Longman.
  • Sahlgren, Magnus (2008). "The Distributional Hypothesis" (PDF). Rivista di Linguistica. 20 (1): 33–53.
  • McDonald, S.; Ramscar, M. (2001). "Testing the distributional hypothesis: The influence of context on judgements of semantic similarity". Proceedings of the 23rd Annual Conference of the Cognitive Science Society. pp. 611–616. CiteSeerX 10.1.1.104.7535.
  • Gleitman, Lila R. (2002). "Verbs of a feather flock together II: The child's discovery of words and their meanings". The Legacy of Zellig Harris: Language and information into the 21st century: Philosophy of science, syntax and semantics. Current issues in Linguistic Theory. 1. John Benjamins Publishing Company: 209–229.
  • Yarlett, D. (2008). Language Learning Through Similarity-Based Generalization (PDF) (PhD thesis). Stanford University.
  • Deerwester, Scott; Dumais, Susan T.; Furnas, George W.; Landauer, Thomas K.; Harshman, Richard (1990). "Indexing by Latent Semantic Analysis" (PDF). Journal of the American Society for Information Science. 41 (6): 391–407. doi:10.1002/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO;2-9.
  • Padó, Sebastian; Lapata, Mirella (2007). "Dependency-based construction of semantic space models". Computational Linguistics. 33 (2): 161–199.
  • Schütze, Hinrich (1993). "Word Space". Advances in Neural Information Processing Systems 5. pp. 895–902. CiteSeerX 10.1.1.41.8856.
  • Sahlgren, Magnus (2006). The Word-Space Model (PDF) (PhD thesis). Stockholm University.
  • Landauer, Thomas; Dumais, Susan T. "A Solution to Plato's Problem: The Latent Semantic Analysis Theory of Acquisition, Induction, and Representation of Knowledge". Retrieved 2007-07-02.
  • Lund, Kevin; Burgess, Curt; Atchley, Ruth Ann (1995). Semantic and associative priming in a high-dimensional semantic space. Cognitive Science Proceedings. pp. 660–665.
  • Lund, Kevin; Burgess, Curt (1996). "Producing high-dimensional semantic spaces from lexical co-occurrence". Behavior Research Methods, Instruments & Computers. 28 (2): 203–208.

Hello fellow Wikipedians,

I have just modified one external link on Distributional semantics. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:

When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.

This message was posted before February 2018. After February 2018, "External links modified" talk page sections are no longer generated or monitored by InternetArchiveBot. No special action is required regarding these talk page notices, other than regular verification using the archive tool instructions below. Editors have permission to delete these "External links modified" talk page sections if they want to de-clutter talk pages, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{source check}} (last update: 5 June 2024).

  • If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
  • If you found an error with any archives or the URLs themselves, you can fix them with this tool.

Cheers.—InternetArchiveBot (Report bug) 12:23, 11 September 2017 (UTC)[reply]