Automatic key extraction full example

From OpenKM Documentation
Revision as of 16:27, 20 September 2010 by Jllort (talk | contribs)

SVN checkout modules

To create a KEA model, you must check out the openkm and thesaurus modules:

Select the SVN repository type and enter the URL https://openkm.svn.sourceforge.net/svnroot/openkm/trunk/openkm for the openkm module.

Select the SVN repository type and enter the URL https://openkm.svn.sourceforge.net/svnroot/openkm/trunk/thesaurus for the thesaurus module.


Installing openkm classes into maven repository

Make sure openkm is installed in your local Maven repository. To install it, run the command:

mvn clean package install -Dmaven.test.skip=true


Downloading AGROVOC thesaurus

We'll use AGROVOC for testing purposes. You can download it from http://oaei.ontologymatching.org/2007/environment/ (please read the terms of use).


Copy the file ag_skos_20070219.rdf into the thesaurus/src/test/resources/vocabulary folder. The vocabulary folder also contains a testdocs folder with some AGROVOC training documents for creating the KEA model.


Create runtime configuration

Now we can create the runtime configuration, which must execute the ModelBuilder class with some parameters.


Okm installation guide 004.jpeg


To train the KEA model, execute the ModelBuilder class with the following parameters:

sourceFolder 
trainingFolder 
vocabularyFile 
vocabularyType
stopwordFile 
modelFileName 
porterStemmerClass 
stopwordClass 
language 
documentEncoding
testDocs


In my case (all paths are relative to sourceFolder):

sourceFolder=/home/jllort/softwareFactoryGalileo/thesaurus/vocabulary
trainingFolder=testdocs/en/train
vocabularyFile=ag_skos_20070219.rdf
vocabularyType=skos
stopwordFile=stopwords_en.txt
modelFileName=ag_skos_20070219.model
porterStemmerClass=com.openkm.kea.stemmers.PorterStemmer
stopwordClass=com.openkm.kea.stopwords.StopwordsEnglish
language=en
documentEncoding=UTF-8
testDocs=testdocs/en/test
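Because the folder and file values are given relative to sourceFolder, it can help to print the absolute locations they point to before running the training. The following is only a minimal sketch: the KeaPaths class and its resolve method are hypothetical helpers written for this page, not part of OpenKM, and the choice of which parameters are treated as paths is an assumption based on the values above.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of OpenKM): resolves the path-like
// parameters against sourceFolder so they can be checked by eye.
public class KeaPaths {

    // Returns each relative value resolved against sourceFolder, in insertion order.
    public static Map<String, Path> resolve(String sourceFolder, Map<String, String> relative) {
        Path base = Paths.get(sourceFolder);
        Map<String, Path> resolved = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : relative.entrySet()) {
            resolved.put(entry.getKey(), base.resolve(entry.getValue()).normalize());
        }
        return resolved;
    }

    public static void main(String[] args) {
        // Assumption: these five parameters are the ones interpreted as paths.
        Map<String, String> relative = new LinkedHashMap<>();
        relative.put("trainingFolder", "testdocs/en/train");
        relative.put("vocabularyFile", "ag_skos_20070219.rdf");
        relative.put("stopwordFile", "stopwords_en.txt");
        relative.put("modelFileName", "ag_skos_20070219.model");
        relative.put("testDocs", "testdocs/en/test");

        String sourceFolder = "/home/jllort/softwareFactoryGalileo/thesaurus/vocabulary";
        resolve(sourceFolder, relative)
                .forEach((name, path) -> System.out.println(name + " = " + path));
    }
}
```

Replace the sourceFolder value with the location of your own vocabulary folder.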


The arguments for executing the ModelBuilder class are:

/home/jllort/softwareFactoryGalileo/thesaurus/vocabulary testdocs/en/train ag_skos_20070219.rdf skos stopwords_en.txt ag_skos_20070219.model com.openkm.kea.stemmers.PorterStemmer com.openkm.kea.stopwords.StopwordsEnglish en UTF-8 testdocs/en/test

The VM argument is "-Xmx526M", as shown in the next screenshot:

Okm installation guide 005.jpeg
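The arguments are positional and follow the order of the parameter list given earlier, so a misplaced value is easy to miss. As a rough sketch, the argument line can be assembled from named values in that order. The KeaArgs class and its toArgs method are hypothetical helpers written for this page, not part of OpenKM, and the sketch only builds the argument line without calling ModelBuilder.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of OpenKM): turns named parameters into
// the positional argument list, in the order listed in this guide.
public class KeaArgs {

    static final String[] ORDER = {
        "sourceFolder", "trainingFolder", "vocabularyFile", "vocabularyType",
        "stopwordFile", "modelFileName", "porterStemmerClass", "stopwordClass",
        "language", "documentEncoding", "testDocs"
    };

    // Fails early if any of the eleven parameters is missing or empty.
    public static String[] toArgs(Map<String, String> params) {
        List<String> args = new ArrayList<>();
        for (String name : ORDER) {
            String value = params.get(name);
            if (value == null || value.isEmpty()) {
                throw new IllegalArgumentException("Missing parameter: " + name);
            }
            args.add(value);
        }
        return args.toArray(new String[0]);
    }

    public static void main(String[] unused) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("sourceFolder", "/home/jllort/softwareFactoryGalileo/thesaurus/vocabulary");
        p.put("trainingFolder", "testdocs/en/train");
        p.put("vocabularyFile", "ag_skos_20070219.rdf");
        p.put("vocabularyType", "skos");
        p.put("stopwordFile", "stopwords_en.txt");
        p.put("modelFileName", "ag_skos_20070219.model");
        p.put("porterStemmerClass", "com.openkm.kea.stemmers.PorterStemmer");
        p.put("stopwordClass", "com.openkm.kea.stopwords.StopwordsEnglish");
        p.put("language", "en");
        p.put("documentEncoding", "UTF-8");
        p.put("testDocs", "testdocs/en/test");

        // Prints the line to paste into the program arguments field.
        System.out.println(String.join(" ", toArgs(p)));
    }
}
```

The printed line can be pasted into the program arguments of the runtime configuration. The VM argument is set separately.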


The classpath must look like this:


Okm installation guide 006.jpeg