Disaster recovery

From OpenKM Documentation

A hardware failure may corrupt your OpenKM repository. In that case you should restore from the last backup (see Backup restoring). But what happens if the backup is missing or is itself corrupted? Is everything lost? Not exactly. Depending on the kind of disaster, you can recover at least the document content.


Note: This recovery process only works if OpenKM is configured to store the document content in a File DataStore.

By default, OpenKM is configured to use a File DataStore. This means that the content of every document is stored in the filesystem. This gives very good performance when retrieving document content, and it also has a hidden benefit: if your OpenKM installation has been damaged but the DataStore files are still readable, you can recover the document contents.
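Before attempting a rescue, it can help to confirm that the DataStore files can still be read from disk. The following is a minimal sketch, not part of OpenKM: the function name count_unreadable is hypothetical, and the default path repository/repository/datastore (relative to $TOMCAT_HOME) is an assumption taken from the prune.sh script on this page.

```bash
#!/bin/bash
# Hypothetical helper: report how many files under a directory cannot be read.
# Usage: count_unreadable [datastore_dir]
count_unreadable() {
  # ASSUMPTION: default DataStore location, relative to $TOMCAT_HOME
  local dir="${1:-repository/repository/datastore}"
  local total=0 bad=0 f

  while IFS= read -r -d '' f; do
    total=$((total + 1))
    # Reading every byte forces the disk to deliver the whole file
    if ! cat -- "$f" > /dev/null 2>&1; then
      bad=$((bad + 1))
      echo "Unreadable: $f" >&2
    fi
  done < <(find "$dir" -type f -print0)

  echo "$bad of $total files unreadable"
}
```

Calling count_unreadable with no arguments from $TOMCAT_HOME checks the assumed default location; pass another directory if your DataStore lives elsewhere.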

To help with the process, we have created a simple Bash script called rescue.sh:

#!/bin/bash
declare -A MIMEXT   # associative array: needs Bash 4 or later
RESCUE="rescue"
mkdir -p "$RESCUE"
 
MIMEXT["application/octet-stream"]="bin"
MIMEXT["application/vnd.oasis.opendocument.text"]="odt"
MIMEXT["application/vnd.oasis.opendocument.presentation"]="odp"
MIMEXT["application/vnd.oasis.opendocument.spreadsheet"]="ods"
MIMEXT["application/vnd.oasis.opendocument.graphics"]="odg"
MIMEXT["application/vnd.oasis.opendocument.database"]="odb"
MIMEXT["application/msword"]="doc"
MIMEXT["application/vnd.ms-excel"]="xls"
MIMEXT["application/vnd.ms-powerpoint"]="ppt"
MIMEXT["application/vnd.ms-project"]="mpp"
MIMEXT["application/vnd.ms-access"]="mdb"
MIMEXT["application/vnd.visio"]="vsd"
MIMEXT["application/x-mspublisher"]="pub"
MIMEXT["application/vnd.openxmlformats-officedocument.wordprocessingml.document"]="docx"
MIMEXT["application/vnd.openxmlformats-officedocument.presentationml.presentation"]="pptx"
MIMEXT["application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"]="xlsx"
MIMEXT["image/gif"]="gif"
MIMEXT["image/jpeg"]="jpg"
MIMEXT["image/png"]="png"
MIMEXT["image/tiff"]="tif"
MIMEXT["image/bmp"]="bmp"
MIMEXT["image/x-ico"]="ico"
MIMEXT["image/x-psd"]="psd"
MIMEXT["image/x-freehand"]="fh9"
MIMEXT["application/illustrator"]="ai"
MIMEXT["application/zip"]="zip"
MIMEXT["application/x-rar-compressed"]="rar"
MIMEXT["application/x-tar"]="tar"
MIMEXT["application/x-gzip"]="gz"
MIMEXT["application/x-compressed-tar"]="tgz"
MIMEXT["application/xml"]="xml"
MIMEXT["text/plain"]="txt"
MIMEXT["text/html"]="html"
MIMEXT["text/css"]="css"
MIMEXT["text/csv"]="csv"
MIMEXT["text/xml"]="xml"
MIMEXT["text/x-sql"]="sql"
MIMEXT["text/x-java"]="java"
MIMEXT["application/pdf"]="pdf"
MIMEXT["application/rtf"]="rtf"
MIMEXT["application/x-shellscript"]="sh"
MIMEXT["application/x-perl"]="pl"
MIMEXT["application/x-revelation"]="rvl"
MIMEXT["application/x-argouml"]="zargo"
MIMEXT["application/x-planner"]="planner"
MIMEXT["application/x-php"]="php"
MIMEXT["audio/mpeg"]="mp3"
MIMEXT["audio/x-ogg"]="ogg"
MIMEXT["video/x-flv"]="flv"
MIMEXT["video/mp4"]="mp4"
MIMEXT["video/mpeg"]="mpg"
MIMEXT["video/x-msvideo"]="avi"
MIMEXT["video/x-ms-wmv"]="wmv"
MIMEXT["application/x-shockwave-flash"]="swf"
MIMEXT["image/vnd.dxf"]="dxf"
MIMEXT["image/vnd.dwg"]="dwg"
MIMEXT["application/x-jasper"]="jasper"
MIMEXT["application/x-jrxml"]="jrxml"
MIMEXT["application/x-bsh"]="bsh"
MIMEXT["application/x-java-archive"]="jar"
MIMEXT["application/postscript"]="ps"
MIMEXT["application/x-report"]="rep"
MIMEXT["audio/x-wav"]="wav"
MIMEXT["image/svg+xml"]="svg"
MIMEXT["image/x-ms-bmp"]="bmp"
 
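# Scan only the DataStore (same path prune.sh uses; adjust if your layout differs)
DATASTORE="repository/repository/datastore"

# NUL-delimited names and quoting keep unusual paths intact
while IFS= read -r -d '' DOC; do
  FILE=$(basename "$DOC")
  # -b drops the file name, --mime-type prints only e.g. "application/pdf"
  MIME=$(file -b --mime-type "$DOC")
  EXT=${MIMEXT[$MIME]}

  if [ -z "$EXT" ]; then
    echo "Missing MIME type: $MIME"
    EXT="bin"
  fi

  cp -v "$DOC" "$RESCUE/$FILE.$EXT"
done < <(find "$DATASTORE" -type f -print0)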

The following BeanShell code generates the associative array entries from the MIME types registered in OpenKM:

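import com.openkm.dao.*;
import com.openkm.dao.bean.*;

// Emit one MIMEXT entry per MIME type registered in OpenKM
for (MimeType mt : MimeTypeDAO.findAll("mt.id")) {
    mime = mt.getName();
    // Use the first extension registered for this MIME type
    ext = mt.getExtensions().iterator().next();

    // Generic binary content always maps to "bin"
    if ("application/octet-stream".equals(mime)) {
        ext = "bin";
    }

    // <br/> puts each entry on its own line when the output is rendered as HTML
    print("MIMEXT[\"" + mime + "\"]=\"" + ext + "\"<br/>");
}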

You need to run this script from the $TOMCAT_HOME directory. All recovered documents are copied to the $TOMCAT_HOME/rescue directory. This script is only a guide and should be extended to support more MIME types.
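To decide which MIME types are worth adding to the MIMEXT array, it may help to see which types actually occur in your DataStore. The sketch below is one possible approach: mime_histogram is a hypothetical helper name, and the default path is an assumption borrowed from prune.sh. It relies only on the standard find, file, sort and uniq tools.

```bash
#!/bin/bash
# Hypothetical helper: count the MIME types found under a directory,
# most frequent first. Usage: mime_histogram [datastore_dir]
mime_histogram() {
  # ASSUMPTION: default DataStore location, relative to $TOMCAT_HOME
  local dir="${1:-repository/repository/datastore}"

  # file -b prints one bare MIME type per file; uniq -c tallies them
  find "$dir" -type f -exec file -b --mime-type {} + \
    | sort | uniq -c | sort -rn
}
```

Any type listed in the output that has no entry in MIMEXT will be rescued with the fallback extension bin.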


Note: Recovered documents do not keep their original names. They will have names like e5de262a84a39f553392751ffd9f4c56796c0029.pdf.
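The 40 hexadecimal digits in such a name have the length of a SHA-1 digest. If, and this is an assumption not stated on this page, your DataStore names each file after the SHA-1 hash of its content, the name can double as an integrity check for the rescued copy. A minimal sketch under that assumption, with the hypothetical helper name verify_rescued:

```bash
#!/bin/bash
# Hypothetical helper: compare each rescued file with its hash-like name.
# ASSUMPTION: the DataStore names every file after the SHA-1 of its content.
# Usage: verify_rescued [rescue_dir]
verify_rescued() {
  local dir="${1:-rescue}" f name sum bad=0

  for f in "$dir"/*; do
    [ -f "$f" ] || continue
    name=$(basename "$f")
    name=${name%%.*}                        # strip the extension added by rescue.sh
    sum=$(sha1sum < "$f" | cut -d' ' -f1)   # digest of the rescued content
    if [ "$sum" != "$name" ]; then
      echo "Mismatch: $f"
      bad=$((bad + 1))
    fi
  done

  echo "$bad mismatched file(s)"
  [ "$bad" -eq 0 ]                          # non-zero exit status on any mismatch
}
```

If every file is reported as a mismatch, the naming assumption most likely does not hold for your installation and the result should be ignored.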

If you have a working OpenKM installation but want to recover deleted documents from a damaged backup, you can also use this script. After running rescue.sh you will have many files in the rescue directory. To remove the documents that already exist in the current OpenKM installation, copy the rescue directory to the working OpenKM installation and run this script, called prune.sh:

#!/bin/bash
RESCUE="rescue"
 
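# Every file still present in the live DataStore is already in OpenKM
while IFS= read -r -d '' DOC; do
  FILE=$(basename "$DOC")
  # Drop the rescued copy whatever extension rescue.sh gave it;
  # -f keeps rm quiet when nothing matches
  rm -f -- "$RESCUE/$FILE".*
done < <(find repository/repository/datastore -type f -print0)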
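Because prune.sh deletes files, a dry run may be worth doing first. The sketch below only prints the rescued files that match a name in the live DataStore and deletes nothing. The helper name prune_preview is hypothetical, and its default paths are the ones used by prune.sh.

```bash
#!/bin/bash
# Hypothetical helper: list the rescued files prune.sh would remove.
# Usage: prune_preview [datastore_dir] [rescue_dir]
prune_preview() {
  local store="${1:-repository/repository/datastore}"
  local rescue="${2:-rescue}"
  local doc f

  while IFS= read -r -d '' doc; do
    # A rescued copy has the same base name plus some extension
    for f in "$rescue/$(basename "$doc")".*; do
      if [ -f "$f" ]; then
        echo "$f"
      fi
    done
  done < <(find "$store" -type f -print0)
}
```

Whatever remains in the rescue directory and is not printed here corresponds to content that the working installation no longer holds.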