model to predict the results of one such study, to provide an illustration of how it can be applied to
a task of this kind.
Sereno, Pacht, and Rayner (1992) conducted a study in which the eye movements of
participants were monitored while they read sentences containing ambiguous words. These
ambiguous words were selected to have one highly dominant meaning, but the sentences
established a context that supported the subordinate meaning. For example, one sentence read
The dinner party was proceeding smoothly when, just as Mary was serving the port,
one of the guests had a heart attack.
where the context supported the subordinate meaning of PORT. The aim of the study was to
establish whether reading time for ambiguous words was better explained by the overall frequency
with which a word occurs in all its meanings or senses, or the frequency of a particular meaning.
To test this, participants read sentences containing either the ambiguous word, a word with
frequency matched to the subordinate sense (the low-frequency control), or a word with frequency
matched to the dominant sense (the high-frequency control). For example, the control words for
PORT were VEAL and SOUP respectively. The results are summarized in Table 3: ambiguous words
using their subordinate meaning were read more slowly than words with a frequency
corresponding to the dominant meaning, although not quite as slowly as words that match the
frequency of the subordinate meaning. A subsequent study by Sereno, O'Donnell, and Rayner
(2006, Experiment 3) produced the same pattern of results.
Reading time studies present a number of challenges for computational models. The study
of Sereno et al. (1992) is particularly conducive to modeling, as all three target words are
substituted into the same sentence frame, meaning that the results are not affected by differences between sentences in the number of words that appear in the models' vocabularies, or by other factors that introduce additional variance. However, in order to model these data we still need to make an assumption
about the factors influencing reading time. The abstract computational-level analyses provided by
generative models do not make assertions about the algorithmic processes underlying human
cognition, and can consequently be difficult to translate into predictions about the amount of time it
should take to perform a task. In the topic model, there are a variety of factors that could produce
an increase in the time taken to read a particular word. Some possible candidates include uncertainty about the topic of the sentence, as reflected in the entropy of the distribution over topics; a sudden change in perceived meaning, producing a difference in the distribution over topics before and after seeing the word; or simply encountering an unexpected word, resulting in greater effort in retrieving the relevant information from memory. We chose to use only the last of these measures, as it is the simplest and the most directly related to our construal of the computational problem underlying linguistic processing, but we suspect that a good model of reading time would need to incorporate some combination of all of these factors.
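To make these candidate measures concrete, the following Python sketch shows one way each could be computed from a topic model. It is purely illustrative: the names phi (a word-by-topic matrix of P(w|z) values), p_z_before, and p_z_after (the topic distributions before and after the target word) are our own placeholders rather than anything from the model described here, and the "change in perceived meaning" is quantified with a KL divergence, which is only one possible choice.

import numpy as np

def topic_entropy(p_z):
    """Uncertainty about the topic of the sentence: entropy of P(z | context)."""
    p = p_z[p_z > 0]
    return -np.sum(p * np.log(p))

def meaning_change(p_z_before, p_z_after):
    """Change in perceived meaning: KL divergence between the topic
    distributions before and after the target word (one possible measure)."""
    mask = (p_z_before > 0) & (p_z_after > 0)
    return np.sum(p_z_after[mask] * np.log(p_z_after[mask] / p_z_before[mask]))

def surprisal(phi, p_z_before, target):
    """Unexpectedness of the word itself: -log P(w_target | w_sentence),
    with the probability computed as in Equation 10 below."""
    return -np.log(np.dot(phi[target], p_z_before))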
Letting $w_{\text{target}}$ be the target word and $\mathbf{w}_{\text{sentence}}$ be the sequence of words in the sentence before the occurrence of the target, we want to compute $P(w_{\text{target}} \mid \mathbf{w}_{\text{sentence}})$. Applying Equation 8, we have
$$P(w_{\text{target}} \mid \mathbf{w}_{\text{sentence}}) = \sum_{z} P(w_{\text{target}} \mid z)\, P(z \mid \mathbf{w}_{\text{sentence}}) \tag{10}$$
where $P(z \mid \mathbf{w}_{\text{sentence}})$ is the distribution over topics encoding the gist of $\mathbf{w}_{\text{sentence}}$. We used the
1700 topic solution to compute this quantity for the 21 of the 24 sentences used by Sereno et
al. (1992) for which all three target words appeared in our vocabulary, and averaged the resulting
log probabilities over all sentences. The results are shown in Table 3. The topic model predicts the
results found by Sereno et al. (1992): the ambiguous words are assigned lower probabilities than
the high-frequency controls, although not quite as low as the low-frequency controls. The model
predicts this effect because the distribution over topics $P(z \mid \mathbf{w}_{\text{sentence}})$ favors those topics that incorporate the subordinate sense. As a consequence, the probability of the target word is reduced, since $P(w_{\text{target}} \mid z)$ is lower for those topics. However, if there is any uncertainty, providing some
residual probability to topics in which the target word occurs in its dominant sense, the probability
of the ambiguous word will be slightly higher than the raw frequency of the subordinate sense
suggests.
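As a sketch of how this quantity might be computed, the fragment below implements Equation 10 and the averaging of log probabilities over sentences. Here phi (a word-by-topic matrix of P(w|z) values) and topic_posterior (whatever inference procedure yields P(z | w_sentence) from the context words) are hypothetical placeholders, not the authors' code.

import numpy as np

def log_prob_target(phi, p_z_given_sentence, target):
    """log P(w_target | w_sentence) = log sum_z P(w_target | z) P(z | w_sentence)."""
    return np.log(np.dot(phi[target], p_z_given_sentence))

def mean_log_prob(phi, sentences, topic_posterior):
    """Average log probability of the target word across a set of sentences.

    sentences       : list of (context_word_indices, target_index) pairs
    topic_posterior : function mapping context word indices to P(z | w_sentence)
    """
    scores = [log_prob_target(phi, topic_posterior(context), target)
              for context, target in sentences]
    return np.mean(scores)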
Insert Table 3 about here
For comparison, we computed the cosine and inner product for the three values of $w_{\text{target}}$ and the average vectors for $\mathbf{w}_{\text{sentence}}$ in the 700-dimensional LSA solution. The results are shown
in Table 3. The cosine does not predict this effect: the highest mean cosines are obtained by the control words, with little effect of frequency. This is because the cosine is relatively insensitive to word frequency, as discussed above. The inner product, which is sensitive to word
frequency, produces predictions that are consistent with the results of Sereno et al. (1992).
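For completeness, the two LSA measures used in this comparison can be sketched as follows, assuming lsa is a word-by-dimension matrix of 700-dimensional word vectors (a placeholder name) and the sentence is represented by the average of the vectors for the words preceding the target.

import numpy as np

def sentence_vector(lsa, context):
    """Average of the LSA vectors for the words preceding the target."""
    return lsa[context].mean(axis=0)

def inner_product(lsa, target, context):
    """Inner product between the target word vector and the sentence vector."""
    return np.dot(lsa[target], sentence_vector(lsa, context))

def cosine(lsa, target, context):
    """Cosine between the target word vector and the sentence vector."""
    v_t, v_s = lsa[target], sentence_vector(lsa, context)
    return np.dot(v_t, v_s) / (np.linalg.norm(v_t) * np.linalg.norm(v_s))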
Semantic intrusions in free recall
Word association involves making inferences about the semantic relationships among a pair
of words. The topic model can also be used to make predictions about the relationships between
multiple words, as might be needed in episodic memory tasks. Since Bartlett (1932), many
memory researchers have proposed that episodic memory might be based not only on specific memories of the experienced episodes but also on reconstructive processes that extract the overall theme or gist of a collection of experiences.
One procedure for studying gist-based memory is the Deese-Roediger-McDermott (DRM)
paradigm (Deese, 1959; Roediger & McDermott, 1995). In this paradigm, participants are
instructed to remember short lists of words that are all associatively related to a single word (the