
Wordmaster, Episode 184: Word Recognition (Word Bursts)

Source: 可可英语 (Kekenet)   Editor: Jasmine

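Today's Wordmaster looks at how computers recognize words.

Picture a sea full of many kinds of fish, where each kind of fish stands for a word. The computer automatically tallies how many times, and how frequently, each word appears...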

Broadcast on "Coast to Coast": April 10, 2003

AA: I'm Avi Arditti with Rosanne Skirble and this week on WORDMASTER — a way to find out what people are talking about. First, though, a story.

(SOUNDS OF RUSHING WATER)

RS: Imagine a waterway. Fish rush by. Lots and lots of different fish. A computer program counts how many of each kind of fish there are. Now imagine that each kind of fish is really a word. What the computer program counts, then, is how often each different word appears.
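
The fish-counting picture maps onto a plain word count. The sketch below is only an illustration under simple assumptions: the function name `count_words` and the tokenizing rule (lowercase letters, with an optional apostrophe part) are choices made here, not details from the broadcast.

```python
import re
from collections import Counter

def count_words(text: str) -> Counter:
    """Count how often each word appears in the text (case-insensitive)."""
    # A word is a run of letters, optionally followed by 's, 're, and so on.
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return Counter(words)

counts = count_words("Fish rush by. Lots and lots of different fish.")
print(counts["fish"], counts["lots"], counts["rush"])
```

Each distinct word plays the role of one kind of fish, and the counter holds how many of each went by.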

AA: A computer scientist named Jon Kleinberg has developed one such program. He's an associate professor at Cornell University in Ithaca, New York. The software does more than just count words. With enough computer power, it could analyze huge amounts of electronic content. For instance, what's on all the front pages of all the English-language newspapers on the Internet.

RS: Day after day, it could track the frequency of use for each word. When certain words start to get used a lot more often, say, this week than last week ("word bursts," they're called), that's a signal. It suggests that these words are suddenly topical. Jon Kleinberg says these word bursts reveal what is on people's minds.
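
One rough way to picture a week-over-week comparison is to flag words whose share of all words rose sharply. This is a simplified stand-in, not the method used in Jon Kleinberg's software, which the transcript does not describe. The function name `find_bursts` and the thresholds `min_ratio` and `min_count` are assumptions made for illustration.

```python
from collections import Counter
from typing import List

def find_bursts(last_week: Counter, this_week: Counter,
                min_ratio: float = 3.0, min_count: int = 5) -> List[str]:
    """Return words whose share of all words grew by at least min_ratio."""
    last_total = sum(last_week.values())
    this_total = sum(this_week.values())
    bursts = []
    for word, count in this_week.items():
        if count < min_count:
            continue  # ignore words too rare to judge
        this_rate = count / this_total
        # Add-one smoothing so a word unseen last week does not divide by zero.
        last_rate = (last_week[word] + 1) / (last_total + 1)
        if this_rate / last_rate >= min_ratio:
            bursts.append(word)
    return sorted(bursts)

last_week = Counter({"the": 100, "budget": 2, "storm": 1})
this_week = Counter({"the": 100, "budget": 2, "storm": 20})
print(find_bursts(last_week, this_week))
```

In this toy data "the" is the most frequent word in both weeks but is not flagged, because its share barely moves. Only "storm" jumps.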

KLEINBERG: "One of the things that's going on the Web is that there's not just mainstream media — so things like the New York Times' homepage, CNN's homepage — but there are also tens of thousands of people who maintain these online journals."

RS: These journals are called "Web logs" — or simply "blogs" — and they have become, Jon Kleinberg says, a new type of medium.

KLEINBERG: "They're the same kind of populist sort of commentary and discussion that we got with personal homepages early on in the Web, and they're now doing that for current events and for news. And by watching what these people talk about, that's a very good leading indicator of trends that people, for example, on the Web are aware of."

AA: Jon Kleinberg has also looked backward for trends — for example, in the online archives of State of the Union speeches by U-S presidents. Words that appeared with particular increases in frequency tended to correspond to historical trends — not a surprise, he says.

KLEINBERG: "So in the 1930s we have words like 'banks,' 'depression,' 'recovery,' in the 1940s we have words like 'war' and 'atomic,' and then in the '50s words like 'Korea,' 'communist.' The point is, that's an example where we believe we know what we're going to be seeing. It's a way of sanity-checking what's happening, so that we can then try it on things where we don't necessarily know what to expect."

RS: "For example?"

KLEINBERG: "One thing which surprised me — and this was still in the context even of State of the Union addresses — is that once we get to the 1980s, words that have to do with historical events in the '80s get mixed in a lot more with particular rhetorical devices. So, sudden increases in words like 'communities' and 'American' and 'patriotism.' So we find that with the increasing dominance of professional speechwriting, we have certain words that simply were appearing every single year. And that's something which one sort of may have thought about at an intuitive level, but it shows up extremely strongly when one does this frequency analysis. So it's a way of quantitatively verifying a shift in the language used in speechwriting, for example."

RS: "Do you see anything in this work that tells us a little bit about who we are as Americans? Because you see the frequency of words, does it tell us — "

AA: "Where we're heading?"

RS: "Where we're heading, or where we've been?"

KLEINBERG: "I'm certainly heartened by all of the activity and things like the Web log community, which is really, I think, supplementing the mainstream news media with this very large additional set of outlets for opinions and commentary and expression. It's creating an extremely vibrant community, and I think that's an exciting development, certainly — and something that one can, again, hopefully track by being aware of the current topics of interest as manifested through choices of words."

AA: "Doesn't that just sort of feed on itself or create kind of a loop, where you know what words are on the rise so you start using them more?"

KLEINBERG: "There is this interesting feedback going on, that as you become explicitly aware of this notion of popularity, you — right, it feeds back on itself. One thing that helps alleviate that is this notion of 'burstiness' as being about change, not just about frequency. So we aren't just finding the most frequent words, but the words that are changing most sharply. So once something becomes popular, the fact that people continue using it no longer contributes to its change. It already is popular."
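
The distinction between change and frequency can be made concrete with a small calculation. The sketch below assumes a word's usage is summarized as one rate per week. The name `weekly_change` and the sample numbers are invented for illustration.

```python
def weekly_change(rates):
    """Return the week-over-week differences in a word's usage rate."""
    return [round(later - earlier, 4) for earlier, later in zip(rates, rates[1:])]

# A word that is rare for two weeks, jumps in week three, then stays popular.
rates = [0.001, 0.001, 0.010, 0.010, 0.010]
changes = weekly_change(rates)
print(changes)
```

The only nonzero change is at the jump. In the later weeks the word is still frequent, but its continued use adds nothing to the change score, which matches the point that a word that already is popular no longer registers as a burst.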

RS: "What about new words in the language?"

KLEINBERG: "At the moment methods like this are very good for catching the sudden appearance of coinages of new words in online media, simply because we have access to all that data. But I think this is something that could be used retrospectively to go back through books or newspapers over hundreds of years, trying to find the rise of words that are now quite common."
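
Catching the first appearance of a word, whether in live online text or retrospectively in an archive, can be sketched as a running set of words already seen. The function name `first_appearances`, the input shape (a list of period and word-list pairs in chronological order), and the sample data are all assumptions made for this example.

```python
def first_appearances(periods):
    """Map each period to the words that had not appeared in any earlier period."""
    seen = set()
    result = {}
    for period, words in periods:
        new_words = set(words) - seen
        result[period] = sorted(new_words)
        seen |= new_words
    return result

archive = [
    ("1990s", ["web", "page"]),
    ("2000s", ["web", "blog"]),
]
print(first_appearances(archive))
```

A real archive would also need the frequency tracking discussed earlier to separate a lasting coinage from a one-off typo or rare word.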

RS: In the long run, Jon Kleinberg of Cornell University says, the goal is to develop computer search engines that can catch ideas that are on the rise, and not just words.

AA: You'll find all of our words on the Web at voanews.com/wordmaster. And our e-mail address is word@voanews.com. With Rosanne Skirble, I'm Avi Arditti.

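Key vocabulary

address [ə'dres]
n. address (place of residence); speech, talk; manner of speaking; skill (in handling a problem)

reveal [ri'vi:l]
vt. to show, to disclose
n. (between an exterior wall and a door or window) ...

recovery [ri'kʌvəri]
n. recovery, restoration, return to health

depression [di'preʃən]
n. dejection; economic slump

associate [ə'səuʃieit]
n. companion, partner, business partner
n. associate degree holder ...

medium ['mi:diəm]
n. media, means, medium
adj. moderate, medium ...

track [træk]
n. path, running track, trail, orbit or rail, musical track
v. to track, to follow

popularity [.pɔpju'læriti]
n. prevalence, being in fashion, renown, being well liked

extremely [iks'tri:mli]
adv. extremely, very

popular ['pɔpjulə]
adj. fashionable, of the general public, accessible to ordinary people, well liked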
