Jamshydf 2016-07-24
I'm using open-source software (DokuWiki) for my new Farsi thesaurus website. This is partly due to budget constraints, but also because of the structure that I'm imposing on the Farsi thesaurus.

For my work, I used the model originally set by Roget for his English Thesaurus. The latest British edition uses the same schema, which includes opposite categories, so the antonyms are already woven into the hierarchy.

The structure involves a hierarchy of meanings, but at its most basic it means that words are divided into two types: headers and the rest. The headers are limited to about 7000 words or expressions, and there are rules. A header may appear within the word list of another header, but such an appearance may be due to a different meaning of the word, in which case it is not treated as a link to the header of the same name. To limit the number of such occurrences, I've tried to pick words for headers that do not have many meanings and are rather precise. Arabic words in Farsi usually fit this description, and that's why most of my header words are of Arabic origin.
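To make the header rule concrete, here is a rough sketch in Python. It is only an illustration, not how DokuWiki or my site stores anything: the names `Entry`, `same_sense_as_header` and `linked_headers` are made up for this example, and the English words are placeholders standing in for Farsi entries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One word in a header's word list (hypothetical structure)."""
    word: str
    # False when the word appears here in a different meaning than the
    # header of the same name, so it must not become a link.
    same_sense_as_header: bool = True


def linked_headers(thesaurus, header):
    """Return the words under `header` that should link to their own header."""
    return [
        entry.word
        for entry in thesaurus[header]
        if entry.word in thesaurus          # the word is itself a header
        and entry.word != header            # no self-links
        and entry.same_sense_as_header      # and it is used in the header's sense
    ]


# Placeholder data: each key is a header, each value is its word list.
thesaurus = {
    "light": [Entry("lamp"), Entry("bright")],
    "bright": [Entry("light"), Entry("clever", same_sense_as_header=False)],
    "clever": [Entry("smart")],
}

print(linked_headers(thesaurus, "bright"))
```

In this toy data "clever" is a header, but under "bright" it is flagged as a different meaning, so only "light" comes back as a link. Choosing precise header words keeps the number of such flagged entries small.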

On the limit of about 7000 headers: after using the new tool (DokuWiki) for the past year, I'm coming to the conclusion that the number of headers should be even lower. Restricting the number of headers should make the thesaurus easier to navigate, and it would also help reduce its complexity during its (open-ended) development.

