User:TheFearow/PopularWords

From Wikipedia, the free encyclopedia

I am currently working on some statistics on the most popular words used on Wikipedia. At the moment my main focus is on titles, partly because a dump of the titles is a 20 MB download, while a dump of the articles is just over 2 GB.

Data Source

I am using the slightly outdated database dumps, because screen scraping all 1.8 million entries, even at 100 entries per page, would take roughly 18,000 page views (which I'm not sure I would be loved for).

Processing

I am doing the processing with a custom-written Java application. I will consider publishing the source at a later date, once I have worked out the bugs and tidied it up.
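
Until the real source is available, here is a minimal sketch of how word counting over titles could be done in Java. It is not the actual application: the class name `TitleWordCounter`, the methods `countWords` and `topWords`, and the sample titles are all made up for illustration. It also assumes that titles separate words with underscores and that splitting on underscores and whitespace is a good enough definition of a "word", so punctuation such as brackets stays attached to the word next to it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the application described above.
public class TitleWordCounter {

    // Counts how often each lowercased word occurs across all titles.
    // Assumption: words are separated by underscores or whitespace.
    static Map<String, Integer> countWords(Iterable<String> titles) {
        Map<String, Integer> counts = new HashMap<>();
        for (String title : titles) {
            for (String word : title.toLowerCase().split("[_\\s]+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Returns up to n entries, most frequent first; ties are broken alphabetically.
    static List<Map.Entry<String, Integer>> topWords(Map<String, Integer> counts, int n) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return new ArrayList<>(entries.subList(0, Math.min(n, entries.size())));
    }

    public static void main(String[] args) {
        // Made-up sample titles; a real run would read one title per line from the dump.
        List<String> sample = Arrays.asList(
                "History_of_France", "History_of_Java", "Java_(programming_language)");
        for (Map.Entry<String, Integer> entry : topWords(countWords(sample), 3)) {
            System.out.println(entry.getKey() + "\t" + entry.getValue());
        }
    }
}
```

For the full title dump, the same `countWords` method could be fed line by line from a file reader instead of an in-memory list, so the whole dump never has to be held in memory at once.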

Results

The results will be on the following pages: