SYNOPSIS
STDEval.pl -e ecffile -r rttmfile -s stdfile -t termfile [-F fthresh]
[-S sthresh] [-I name] [-E] [-C [filename]] [-a [filename]]
[-O [filename]] [-o [filename]] [-d filename] [-D filename] [-H folder]
[-Q folder] [-q attribute] [-w] [-T [set_name:]termid[,termid[, ...]]]
[-Y [set_name:]sourcetype[,sourcetype[, ...]]]
[-N file/channel[,file/channel[, ...]]] [-A] [-c cachefile] [-k value]
[-K value] [-P] [-n value] [-p value]
DESCRIPTION
STDEval is the Spoken Term Detection (STD) Evaluation Toolkit. It
provides standard evaluation tools for administering the open
evaluation of Spoken Term Detection technologies.
USAGE
The release contains test files and example files in the test_suite
directory. The 'test2.*' files form a complete set of input files
supplied as an example of how to use the evaluation tool. The following
command generates two reports, 'example.occ.txt' and 'example.ali.txt'.
It also writes a DET curve to the example.det.* files, which can be
rendered as a graph with the GNUPLOT program. The -c option writes
intermediate information to a cache file so that subsequent executions
can avoid searching the RTTM file for term occurrences.
perl -I src ./src/STDEval.pl -e test_suite/test2.ecf.xml \
-r test_suite/test2.rttm -s test_suite/test2.stdlist.xml \
-t test_suite/test2.tlist.xml -A -o example.occ.txt \
-a example.ali.txt -d example.det -c example.cache
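When the example run is driven from a script, the same command can be
assembled as an argument list. The sketch below is only an illustration:
the helper names build_stdeval_command and run_stdeval are hypothetical
and are not part of the release. It assumes perl is on the PATH and that
the script is started from the top of the release directory.

```python
import subprocess


def build_stdeval_command(prefix="test_suite/test2", out="example"):
    """Assemble the argument list for the example run shown above.

    `prefix` and `out` are hypothetical convenience parameters.
    """
    return [
        "perl", "-I", "src", "./src/STDEval.pl",
        "-e", f"{prefix}.ecf.xml",      # ECF file
        "-r", f"{prefix}.rttm",         # RTTM reference file
        "-s", f"{prefix}.stdlist.xml",  # system STDList file
        "-t", f"{prefix}.tlist.xml",    # TermList file
        "-A",                           # extra column with overall statistics
        "-o", f"{out}.occ.txt",         # Occurrence Report
        "-a", f"{out}.ali.txt",         # Alignment Report
        "-d", f"{out}.det",             # DET curve files
        "-c", f"{out}.cache",           # cache of term occurrences
    ]


def run_stdeval():
    """Run the example; raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_stdeval_command(), check=True)
```

Calling run_stdeval() a second time with the same cache file should let
the tool reuse the cached term occurrences, as described for -c.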
OPTIONS
Required file arguments:
-e, --ecffile
The ECF file name.
-r, --rttmfile
The RTTM filename.
-s, --stdfile
The STDList filename.
-t, --termfile
The TermList filename.
Find options:
-F, --Find-threshold <thresh>
The <thresh> value is the maximum time gap, in seconds, allowed
between two words for them to be considered part of the same term
when searching the RTTM file for reference term occurrences
(default: 0.5).
-S, --Similarity-threshold <thresh>
The <thresh> value is the maximum time distance between the
temporal extent of the reference term and the midpoint of the
system's detected term for the two to be considered a pair of
potentially aligned terms (default: 0.5).
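The two thresholds above can be illustrated with a minimal sketch. It is
not the STDEval.pl implementation: the function names, the word tuple
layout, exact token matching and the use of "less than or equal" at the
threshold are all assumptions made for the example.

```python
from typing import List, Tuple

# (token, begin time, duration), times in seconds; layout is an assumption.
Word = Tuple[str, float, float]


def find_term(words: List[Word], term: List[str], find_thresh: float = 0.5):
    """Return (begin, end) spans where `term` occurs with small enough gaps."""
    spans = []
    n = len(term)
    for i in range(len(words) - n + 1):
        window = words[i:i + n]
        if [w[0] for w in window] != term:
            continue
        # Gap = begin of the next word minus end of the previous word.
        gaps = [
            nxt[1] - (prev[1] + prev[2])
            for prev, nxt in zip(window, window[1:])
        ]
        if all(g <= find_thresh for g in gaps):
            first, last = window[0], window[-1]
            spans.append((first[1], last[1] + last[2]))
    return spans


def may_align(ref_span, sys_begin, sys_dur, sim_thresh=0.5):
    """True if the system midpoint is within sim_thresh of the reference extent."""
    mid = sys_begin + sys_dur / 2.0
    ref_begin, ref_end = ref_span
    if mid < ref_begin:
        distance = ref_begin - mid
    elif mid > ref_end:
        distance = mid - ref_end
    else:
        distance = 0.0  # midpoint falls inside the reference extent
    return distance <= sim_thresh
```

With the default of 0.5, two words separated by a 0.25 second pause
still form one term occurrence, while a 0.75 second pause splits them.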
Filter options:
-E, --ECF-filtering
System and reference terms must be in the ECF segments
(default: off).
-T, --Term [<set_name>:]<termid>[,<termid>[, ...]]
Only the <termid> or the list of <termid> values (separated by
',') will be displayed in the Conditional Occurrence Report and
Conditional DET Curve. A name can be given to the set by
specifying <set_name>. <termid> can be a regular expression.
-Y, --YSourcetype [<set_name>:]<type>[,<type>[, ...]]
Only the <type> or the list of <type> values (separated by ',')
will be displayed in the Conditional Occurrence Report and
Conditional DET Curve. A name can be given to the set by
specifying <set_name>. <type> can be a regular expression.
-N, --Namefile <file/channel>[,<file/channel>[, ...]]
Only the <file> and <channel>, or the list of <file> and
<channel> pairs (separated by ','), will be displayed in the
Occurrence Report and DET Curve. <file> and <channel> can be
regular expressions.
-q, --query <name_attribute>
Populate the Conditional Reports with the set of terms identified
by <name_attribute> in the term list's 'terminfo' tags.
-w, --words-oov
Generate a Conditional Report sorted by terms that are
Out-Of-Vocabulary (OOV) for the system.
Report options:
-a, --align-report <filename>
Output the Alignment Report. The filename is optional; if it is
not specified, the report is written to STDOUT.
-o, --occurrence-report <filename>
Output the Occurrence Report. The filename is optional; if it is
not specified, the report is written to STDOUT.
-O, --Occurrence-conditionalreport <filename>
Output the Conditional Occurrence Report. The filename is
optional; if it is not specified, the report is written to STDOUT.
-d, --det-curve <filename>
Output the DET Curve.
-D, --DET-conditional-curve <filename>
Output the Conditional DET Curve.
-P, --Pooled-DETs
Produce term occurrence DET Curves instead of 'Term Weighted'
DETs. '-d' and '-D' must still be used to specify the file
names for the DET plots.
-C, --CSV <filename>
Output the CSV Report.
-H, --HTML <folder>
Output the Occurrence HTML Report.
-Q, --QHTML <folder>
Output the Conditional Occurrence HTML Report.
-A, --All-display
Add an additional column to the Occurrence Report containing the
overall statistics for every term (default: off).
-k, --koefcorrect <value>
Value for correct (C).
-K, --Koefincorrect <value>
Value for incorrect (V).
-n, --number-trials-per-sec <value>
The number of trials per second (default: 1).
-p, --prob-of-term <value>
The probability of a term (default: 0.0001).
-I, --ID-System <name>
Overwrites the name of the STD system.
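The -k, -K, -n and -p options above supply constants for the scoring,
but this page does not give the formula. The sketch below assumes the
term-weighted value definition from the NIST 2006 Spoken Term Detection
evaluation plan, where beta = (C/V) * (1/Pr_term - 1) and the number of
non-target trials is n_tps * T_speech - N_true. The function and
parameter names are hypothetical, and STDEval.pl itself may compute
these quantities differently.

```python
def beta(c: float, v: float, prob_of_term: float = 0.0001) -> float:
    """Weight applied to the false alarm probability (assumed formula)."""
    return (c / v) * (1.0 / prob_of_term - 1.0)


def non_target_trials(speech_seconds: float, n_true: int,
                      trials_per_sec: float = 1.0) -> float:
    """Number of trials in which the term could be falsely detected."""
    return trials_per_sec * speech_seconds - n_true


def term_value(n_correct: int, n_spurious: int, n_true: int,
               speech_seconds: float, c: float, v: float,
               prob_of_term: float = 0.0001,
               trials_per_sec: float = 1.0) -> float:
    """Value for one term: 1 - (P_miss + beta * P_fa), under the assumptions above."""
    p_miss = 1.0 - n_correct / n_true
    p_fa = n_spurious / non_target_trials(speech_seconds, n_true, trials_per_sec)
    return 1.0 - (p_miss + beta(c, v, prob_of_term) * p_fa)
```

Under these assumptions, a system that finds every reference occurrence
with no false alarms scores 1, and a system that outputs nothing scores 0.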
Other options:
-c, --cache-find <filename>
Use the caching file for finding occurrences. If the file does
not exist, it creates the cache during the search.
-h, --help
Display the help.
-v, --version
Display the version number.
BUGS
NOTES
filter options
The filter options -T, --Term and -Y, --YSourcetype can be given
several times on the command line. Each occurrence of -T creates a
set of term IDs. The set can be named by putting the name and ':'
before the list of term IDs. Sets of source types are specified in
the same way with -Y. A sub-report is created for every possible
combination of a term set and a source type set. Each sub-report is
drawn as a DET Curve if the -D option is given. Likewise, term sets
form the rows and source type sets form the columns of the
Conditional Occurrence Report (-O option).
For example, the combination of options: -T alpha:TERM-01,TERM-02
-T beta:TERM-03 -Y BNEWS+CTS:BNEWS,CTS -Y CONFMTG generates a
Conditional Occurrence Report with 2 rows, alpha and beta, and 2
columns, BNEWS+CTS and CONFMTG.
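The way the sets combine can be sketched as follows. This is an
illustration only, not code from STDEval.pl: parse_set and sub_reports
are hypothetical names, an unnamed set is labelled with its own
specification (which matches the CONFMTG column above but is otherwise
an assumption), and regular expressions containing ':' are not handled.

```python
from itertools import product


def parse_set(spec):
    """Split '[set_name:]id[,id...]' into (name, [ids])."""
    name, sep, ids = spec.partition(":")
    if not sep:
        # No explicit name: label the set with the specification itself.
        name, ids = spec, spec
    return name, ids.split(",")


def sub_reports(term_specs, type_specs):
    """List every (row, column) pair of the Conditional Occurrence Report."""
    rows = [parse_set(s)[0] for s in term_specs]
    cols = [parse_set(s)[0] for s in type_specs]
    return list(product(rows, cols))


# The -T and -Y values from the example above.
pairs = sub_reports(
    ["alpha:TERM-01,TERM-02", "beta:TERM-03"],
    ["BNEWS+CTS:BNEWS,CTS", "CONFMTG"],
)
```

Here pairs holds four entries, one per sub-report: each of the rows
alpha and beta combined with each of the columns BNEWS+CTS and CONFMTG.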
AUTHORS
Jerome Ajot <jerome.ajot@nist.gov>
Jon Fiscus <jonathan.fiscus@nist.gov>
George Doddington <george.doddington@comcast.net>
VERSION
STDEval.pl version 0.7 20061206
COPYRIGHT
This software was developed at the National Institute of Standards and
Technology by employees of the Federal Government in the course of
their official duties. Pursuant to Title 17 Section 105 of the United
States Code this software is not subject to copyright protection within
the United States and is in the public domain. asclite is an
experimental system. NIST assumes no responsibility whatsoever for its
use by any party.
THIS SOFTWARE IS PROVIDED "AS IS." With regard to this software, NIST
MAKES NO EXPRESS OR IMPLIED WARRANTY AS TO ANY MATTER WHATSOEVER,
INCLUDING MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE.
perl v5.8.6 2006-12-06 STDEVAL(1)