DetEval - The command line version

You need to cite the IJDAR paper in all publications which describe work for which you used the DetEval tool.

Differences between the Linux and Windows version

Linux: the commands do not have a suffix (rocplot, evalplots, etc.). Windows: the Ruby commands have the suffix .rb (rocplot.rb, evalplots.rb).

In the following, all examples are given in the Linux syntax.

Creating ROC plots

The script rocplot produces a ROC curve. It takes as arguments the groundtruth XML file and a list of detection XML files, where each detection XML file corresponds to a detection run for a single detection parameter value:

rocplot groundtruth.xml { results-run1.xml }
for instance:
rocplot groundtruth.xml det-param-034.xml det-param-076.xml det-param-110.xml det-param-234.xml det-param-343.xml 
or, simply
rocplot groundtruth.xml det-param-*.xml

This will create a directory roc-curve with the following ASCII text files:

roc-curve/Recall Object count recall as a function of the detection parameter
roc-curve/Precision Object count precision as a function of the detection parameter
roc-curve/Harmonic Mean The harmonic mean of object count recall and precision as a function of the detection parameter

The data can be plotted using Matlab, Excel or Gnuplot. It can also be done automatically by the script by adding the doplot option, which creates outputs in PDF and PNG format:

rocplot --doplot=true groundtruth.xml { results-run1.xml }

This will create two additional PDF files and two additional PNG files:

roc-curve/rocplot.pdf In format PDF: The ROC curve itself including the equal error rate diagonal line (recall=precision)
roc-curve/rocplot.png In format PNG: The ROC curve itself including the equal error rate diagonal line (recall=precision)
roc-curve/fileplot.pdf In format PDF: A curve showing recall and precision on the y-axis as a function of the detection parameter on the x-axis
roc-curve/fileplot.png In format PNG: A curve showing recall and precision on the y-axis as a function of the detection parameter on the x-axis
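
Choosing a detection parameter from these files can be sketched in Python as follows. This is only a sketch under an assumption: each curve file is taken to hold one whitespace-separated pair per line (detection parameter, then value), the layout Gnuplot reads by default. That layout is not stated in this manual, so inspect a real file first. The function names and the sample numbers are made up for illustration.

```python
def parse_curve(lines):
    """Parse (parameter, value) pairs.

    ASSUMPTION: two whitespace-separated columns per line.
    """
    points = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and gnuplot-style comments
            continue
        param, value = line.split()[:2]
        points.append((float(param), float(value)))
    return points


def best_parameter(points):
    """Return the (parameter, value) pair with the highest value."""
    return max(points, key=lambda pair: pair[1])


# Made-up lines standing in for the content of "roc-curve/Harmonic Mean".
sample = ["34 0.41", "76 0.52", "110 0.58", "234 0.49", "343 0.37"]
best = best_parameter(parse_curve(sample))
```

With real data, the list of lines could be replaced by an open file handle, for instance open("roc-curve/Harmonic Mean").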

Examples:

Various properties of rocplot can be configured, for example the detection quality constraints (see above) or a list of ignored object types:

usage: rocplot [ options ] groundtruth-xml { det-xmls } 

Options are of the format --<option>=<value>

Creating quality/quantity plots

The tool evalplots creates plots showing recall and precision for varying constraints on detection quality (related to the amount of overlap, see above). It takes as input files a single detection result XML file and a single ground truth result XML file and creates two directories evalcurve-tr and evalcurve-tp with ASCII files containing the data for the plots of Recall, Precision and their harmonic mean:

evalplots results-run1.xml groundtruth.xml

This will create the following files:

evalcurve-tr/Recall Object count recall as a function of tr (object area recall)
evalcurve-tr/Precision Object count precision as a function of tr (object area recall)
evalcurve-tr/Harmonic Mean The harmonic mean of object count recall and precision as a function of tr (object area recall)
evalcurve-tp/Recall Object count recall as a function of tp (object area precision)
evalcurve-tp/Precision Object count precision as a function of tp (object area precision)
evalcurve-tp/Harmonic Mean The harmonic mean of object count recall and precision as a function of tp (object area precision)

The data can be plotted using Matlab, Excel or Gnuplot. As for the tool rocplot, plotting can be done automatically by the script using the doplot option:

evalplots --doplot=true results-run1.xml groundtruth.xml

Examples:

Various properties of evalplots can be configured, for example the ranges of the two varying parameters (tp and tr) and the fixed value of the parameter which is not varied (tp is fixed when tr is varied, tr is fixed when tp is varied):

usage: evalplots [ options ] <det-results-xml> <groundtruth-xml>

Options are of the format --<option>=<value>

Available options:
tr-beg ............. start of the range for the tr threshold (default: 0.1)
tr-end ............. end   of the range for the tr threshold (default: 1.0)
tr-stp ............. step  of the range for the tr threshold (default: 0.1)
tr-stp ............. value of the tp threshold when it is fixed (default: 0.4)
tp-beg ............. start of the range for the tp threshold (default: 0.1)
tp-end ............. end   of the range for the tp threshold (default: 1.0)
tp-stp ............. step  of the range for the tp threshold (default: 0.1)
tp-stp ............. value of the tp threshold when it is fixed (default: 0.8)
thr-center-diff .... threshold on the distance between the centers
                     of two rectangles (default: 0.0)
thr-border.......... threshold on the relative horizontal border
                     difference of two rectangles (default: 0.0)
dir-curve-tr ....... directory where the curve for varying tr is written
                     (default: evalcurve-tr)
dir-curve-tp ....... directory where the curve for varying tp is written
                     (default: evalcurve-tp)
ignore-det ......... How many levels in the image name path do we ignore
                     (detection-xml)
ignore-gt .......... How many levels in the image name path do we ignore
                     (groundtruth-xml)                     
object-types ....... Specify a comma-separated list of object types which shall
                     be loaded from the groundtruth, all other objects
                     will be ignored in the groundtruth (but not in the 
                     detection lists!). Default: all objects will be loaded.
doplot ............. true if pdf and png plots shall be created automatically.
                     (default: false)
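
When evalplots or rocplot is called from a script, the --<option>=<value> convention above can be assembled programmatically. The following Python sketch shows one way to do so; build_command and run_tool are hypothetical helper names, not part of DetEval, and only options listed above are used.

```python
import subprocess


def build_command(tool, files, **options):
    """Build an argument list of the form: tool --<option>=<value> ... files."""
    args = [tool]
    for name, value in options.items():
        # Python identifiers cannot contain '-', so tr_beg stands for tr-beg.
        args.append("--{}={}".format(name.replace("_", "-"), value))
    args.extend(files)
    return args


def run_tool(args):
    """Run the command; requires the DetEval tools to be on the PATH."""
    return subprocess.run(args, check=True)


cmd = build_command(
    "evalplots",
    ["results-run1.xml", "groundtruth.xml"],
    tr_beg=0.2,
    tr_end=0.9,
    doplot="true",
)
```

Here cmd corresponds to the command line evalplots --tr-beg=0.2 --tr-end=0.9 --doplot=true results-run1.xml groundtruth.xml, which run_tool(cmd) would execute.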

Access to the evaluation details

The tools rocplot and evalplots use highly configurable low level programs which can be used directly (but manually) in order to adapt the output to one's wishes, or to access details of the evaluation, for instance evaluation results for each image. In this manual mode, the evaluation on a dataset is done in two steps: in a first step, the tool evaldetection produces detailed results for each entry and collects them in a single file. In a second step, the tool readdeteval reads the evaluation report and produces global statistics.

Perform the evaluation on a data set

The executable evaldetection performs the actual evaluation for each entry in the data set, i.e. for each detection result of a single image, and produces a detailed report which is written to standard output. The input data is provided in two XML files, one containing the detection results and one containing the ground truth description. Various options change the default behaviour:

usage:  
   evaldetection [options]  detection-xml groundtruth-xml

   -p <options>   Set the evaluation parameters: 
                  -p <a>,<b>,<c>,<d>,<e>,<f>,<g>,<h>,<i>,<j>
                  Default values: 0.8,0.4,0.8,0.4,0.4,0.8,0,1
   -d <count>     How many levels in the image name path do we ignore
                  (detection-xml)
   -g <count>     How many levels in the image name path do we ignore
                  (groundtruth-xml)
   -z             Ignore detected rectangles having zero coordinates
   -v             Print version string

An example of a command could be:

evaldetection results-run1.xml groundtruth.xml > eval-details.xml

If the image filenames in the XML files and their paths are not correct, then the -d and -g options may be used to remove levels from the path. This is useful if the XML file has been created on a different machine than the one where it is used, and if the author of the XML file entered the full path to directories which do not exist on the destination machine.
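
The effect of removing path levels can be pictured with the following Python sketch. It rests on an assumption: the levels are taken to be counted from the start of the path, which this manual does not spell out. The function strip_levels and the example path are hypothetical and only illustrate the idea; they do not reproduce the tool's code.

```python
from pathlib import PurePosixPath


def strip_levels(image_name, count):
    """Drop the first `count` directory components of a relative image path.

    ASSUMPTION: levels are counted from the start of the path.
    """
    parts = PurePosixPath(image_name).parts
    if count >= len(parts):
        raise ValueError("cannot strip every component of the path")
    return str(PurePosixPath(*parts[count:]))


# Hypothetical path as it might appear in an XML file from another machine.
stripped = strip_levels("data/run1/images/image1.jpg", 2)
```

Under this assumption, ignoring two levels turns data/run1/images/image1.jpg into images/image1.jpg, so that the names in the detection XML and the groundtruth XML can match again.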

The evaluation results are given in XML format, as the following example indicates:

<?xml version="1.0" encoding="UTF-8"?>
<evaluationSet>
  <evaluation noImages="1" imageName="images/image1.jpg">    
    <icdar2003 r="0.494285" p="0.494285" hmean="0.494285" noGT="3" noD="3"/>
    <score r="0" p="0" hmean="0" noGT="3" noD="3"/>
  </evaluation>
  <evaluation noImages="1" imageName="images/image1.jpg">    
    <icdar2003 r="0.716359" p="0.487149" hmean="0.579927" noGT="8" noD="11"/>
    <score r="0.575" p="0.345455" hmean="0.431605" noGT="8" noD="11"/>
  </evaluation>
</evaluationSet>

Each result detail is put in an <evaluation> tag, which contains tags with different (complementary) information. The results calculated using our proposed evaluation measure (described here) are in the <score> tag. The results calculated using the measure used in the framework of the ICDAR 2003 competition are in the <icdar2003> tag.

The results themselves are structured as follows: 'p' stands for precision, 'r' for recall and 'hmean' for the harmonic mean between precision and recall. 'noGT' is the number of ground truth rectangles and 'noD' is the number of detected rectangles.
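
The structure described above can be read with any XML parser. The following Python sketch uses the standard library to load the example report and to check that each 'hmean' value equals 2pr/(p+r) up to rounding. The function names are made up for illustration; the XML is the example shown above.

```python
import xml.etree.ElementTree as ET

SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<evaluationSet>
  <evaluation noImages="1" imageName="images/image1.jpg">
    <icdar2003 r="0.494285" p="0.494285" hmean="0.494285" noGT="3" noD="3"/>
    <score r="0" p="0" hmean="0" noGT="3" noD="3"/>
  </evaluation>
  <evaluation noImages="1" imageName="images/image1.jpg">
    <icdar2003 r="0.716359" p="0.487149" hmean="0.579927" noGT="8" noD="11"/>
    <score r="0.575" p="0.345455" hmean="0.431605" noGT="8" noD="11"/>
  </evaluation>
</evaluationSet>
"""


def harmonic_mean(recall, precision):
    """Harmonic mean of recall and precision (0 when both are 0)."""
    total = recall + precision
    return 0.0 if total == 0 else 2.0 * recall * precision / total


def read_evaluations(xml_bytes):
    """Return one dict per measure tag (<icdar2003>, <score>) and image."""
    rows = []
    for evaluation in ET.fromstring(xml_bytes).iter("evaluation"):
        for measure in evaluation:
            rows.append({
                "image": evaluation.get("imageName"),
                "measure": measure.tag,
                "r": float(measure.get("r")),
                "p": float(measure.get("p")),
                "hmean": float(measure.get("hmean")),
                "noGT": int(measure.get("noGT")),
                "noD": int(measure.get("noD")),
            })
    return rows


rows = read_evaluations(SAMPLE)
```

Each entry of rows then gives per-image access to the values of one measure, for example to list the images with the lowest recall.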

Calculate statistics on a set of evaluation results

The detailed results produced by the evaldetection tool are not very useful for experiments with large data sets. The tool readdeteval is able to read a lengthy XML file with detailed information and produce statistics on it:
usage: readdeteval [ options ] input-xmlfile [ input-xmlfile2]

  The XML file(s) must contain one of the following root-tags:
    <evaluationSet>    Evaluations for different images: calculate the
                       performance measures for the total set.
    <evaluationSeries> Evaluations for different parameters. Create a
                       plain text file for input to gnuplot.

  OPTIONS:
    [ -g ]             create a series with falling generality
                       (only thresholded Prec or Rec / generality)
    [ -M ]             treat mode estimation evaluation results
    [ -n <count> ]     restrict number of evaluations to <count>
    [ -p <par-value> ] print the given value into the "p" field
                       when writing an output <evaluation> tag.
    [ -s <prefix> ]    The prefix for the 3 files containing the
                       detailed output (recall,precision,hmean)
    [ -c <cmptype> ]   compares two XML files (the second file must
                       be specified!) cmptype:
                       a ... all differences
                       g ... results where rec or prec is greater in 
                             the first file
                       l ... results where rec or prec is less in 
                             the first file
    [ -L ]             print the output in LaTeX tabular format.

The output of the readdeteval command is an XML file (written to standard output) in the same format as the input files. This way, several outputs may be concatenated and resubmitted to the readdeteval command. The noImages field (number of images) is updated (i.e., summed) at each run. Additionally, general information is written to standard error.
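
The two manual steps can be chained from a script. The Python sketch below keeps the XML on standard output apart from the general information on standard error. It assumes that evaldetection and readdeteval are on the PATH; run_two_step is a hypothetical helper name, and its run parameter exists only so that the command runner can be replaced for a dry run.

```python
import subprocess


def run_two_step(det_xml, gt_xml, details_path, run=subprocess.run):
    """Run evaldetection, store its report, then summarise it with readdeteval."""
    # Step 1: per-image evaluation; the report arrives on standard output.
    step1 = run(["evaldetection", det_xml, gt_xml],
                capture_output=True, text=True, check=True)
    with open(details_path, "w", encoding="utf-8") as handle:
        handle.write(step1.stdout)
    # Step 2: global statistics; XML on stdout, general information on stderr.
    step2 = run(["readdeteval", details_path],
                capture_output=True, text=True, check=True)
    return step2.stdout, step2.stderr
```

A call such as run_two_step("results-run1.xml", "groundtruth.xml", "eval-details.xml") returns the summary XML and the general information as two separate strings.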

An example of a command could be:

readdeteval eval-details.xml

with the following example of results (standard error and standard output):

Total-Number-Of-Processed-Images: 2
100% of the images contain text.
Generality: 5.5
Inverse-Generality: 0.181818
  <evaluation noImages="2">
    <icdar2003 r="0.655793" p="0.488678" hmean="0.560035" noGT="11" noD="14"/>
    <score r="0.418182" p="0.271429" hmean="0.32919" noGT="11" noD="14"/>
  </evaluation>
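
The totals in this example can be reproduced from the per-image values of the evaldetection example. The Python sketch below weights each image's recall by its number of ground truth rectangles and each image's precision by its number of detected rectangles. This weighting is an observation that matches the sample numbers; whether readdeteval computes the totals exactly this way internally is not stated here. The function names are made up for illustration.

```python
def harmonic_mean(recall, precision):
    """Harmonic mean of recall and precision (0 when both are 0)."""
    total = recall + precision
    return 0.0 if total == 0 else 2.0 * recall * precision / total


def aggregate(per_image):
    """Combine a list of (r, p, noGT, noD) tuples for one measure.

    ASSUMPTION: recall is weighted by noGT and precision by noD.
    """
    total_gt = sum(gt for _, _, gt, _ in per_image)
    total_d = sum(d for _, _, _, d in per_image)
    recall = sum(r * gt for r, _, gt, _ in per_image) / total_gt if total_gt else 0.0
    precision = sum(p * d for _, p, _, d in per_image) / total_d if total_d else 0.0
    return recall, precision, harmonic_mean(recall, precision), total_gt, total_d


# Per-image values copied from the evaldetection example (two images).
score = aggregate([(0.0, 0.0, 3, 3), (0.575, 0.345455, 8, 11)])
icdar = aggregate([(0.494285, 0.494285, 3, 3), (0.716359, 0.487149, 8, 11)])

# 11 ground truth rectangles over 2 images, matching the Generality line.
generality = score[3] / 2
inverse_generality = 2 / score[3]
```

The resulting values agree with the <score> and <icdar2003> tags above to the printed precision, and generality and inverse_generality match the two lines written to standard error.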

Result tables for a LaTeX document can be created easily by providing the -L option, which changes the output to the following format:

ICDAR 65.6 & 48.9 & 56 \\
CRISP 41.8 & 27.1 & 32.9 \\
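
If the same row layout is needed for values obtained elsewhere, it can be imitated with the Python sketch below. The formatting rule (percentages with three significant digits) is inferred from the two sample rows only and is an assumption, not a documented property of the -L option. latex_row is a hypothetical helper name.

```python
def latex_row(label, recall, precision, hmean):
    """Format one table row as: label R & P & H \\ (values in percent).

    ASSUMPTION: three significant digits, as suggested by the sample rows.
    """
    cells = [format(100.0 * value, ".3g") for value in (recall, precision, hmean)]
    return "{} {} \\\\".format(label, " & ".join(cells))


# Values taken from the readdeteval example output.
icdar_row = latex_row("ICDAR", 0.655793, 0.488678, 0.560035)
crisp_row = latex_row("CRISP", 0.418182, 0.271429, 0.32919)
```

The two strings reproduce the sample rows shown above and can be placed inside a tabular environment.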

Manually creating arbitrary plots

Often one wishes to compare results for different parameters of an object detection algorithm, thus also creating multiple evaluation results. The tool readdeteval provides an easy way to process these multiple experiments:

Comparing two detection evaluation results

After two test runs of a detection algorithm with two different parameters on a set of multiple test images, one sometimes wants to know which images produced different results, better or worse. This is possible with the two tools already presented:
