The following appeared at the top of the word list:

     UK English Wordlist With Frequency Classification

The character set for words in this list is: lowercase
letters, hyphen, apostrophe. There are no proper names and
no accented letters.

Abbreviations are excluded. Colloquialisms and archaisms
are generally excluded. Both -ise and -ize spellings are
included.

                               Brian Kelk bck1@cl.cam.ac.uk
                               April 1998

Here are excerpts from a brief email exchange I had with the author:

From: Brian Kelk <Brian.Kelk@cl.cam.ac.uk>
Date: Sat, 08 Jul 2000 20:27:21 +0100

> I was wondering what the copyright status of your "UK English Wordlist
> With Frequency Classification" word list as it seems to be lacking any
> copyright notice.  Also, how did you arrive at the "Frequency
> Classification".

There were many many sources in total, but any text marked
"copyright" was avoided. Locally-written documentation was one
source. An earlier version of the list resided in a filespace
called PUBLIC on the University mainframe, because it was
considered public domain.

Briefly about frequency: rather than counting occurrences of
a word this classification is more along the lines of counting
the number of texts in which the word occurs. That way you
get some noise immunity, which you very much need. It's based
on maybe 5-10 million words of text on the Cambridge mainframe
in the 1980s. I had in mind that it might be useful for ranking
possible corrections ...

Date: Tue, 11 Jul 2000 19:31:34 +0100

> So are you saying your word list is also in the public domain?

That is the intention.
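
As an illustration only, the frequency classification described in
the correspondence above (counting the number of texts a word occurs
in, rather than its total occurrences) can be sketched as follows.
This is not the author's actual procedure: the tokenizer, the
function names, and the sample texts are all assumptions made for
the example.

```python
import re
from collections import Counter

# Assumed tokenizer: lowercase letters with internal hyphens or
# apostrophes, mirroring the character set stated for the word list.
WORD_RE = re.compile(r"[a-z]+(?:['-][a-z]+)*")


def occurrence_frequency(texts):
    """Count every occurrence of each word across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(WORD_RE.findall(text.lower()))
    return counts


def document_frequency(texts):
    """Count, for each word, the number of texts it occurs in."""
    counts = Counter()
    for text in texts:
        # set() makes repeated uses within one text count only once
        counts.update(set(WORD_RE.findall(text.lower())))
    return counts


def rank_corrections(candidates, doc_freq):
    """Order candidate corrections, most widely used word first."""
    return sorted(candidates, key=lambda word: -doc_freq[word])


# Hypothetical sample: one text repeats an unusual word many times.
texts = [
    "the zzyzx zzyzx zzyzx road",
    "the cat sat",
    "the dog's bone",
]
occ = occurrence_frequency(texts)
df = document_frequency(texts)
```

In this sample the repeated word has a high occurrence count but
appears in only one text, so counting texts keeps a single unusual
document from inflating its rank. That is one reading of the "noise
immunity" mentioned above.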




