`o2-ctf-writer-workflow` can be piped at the end of the processing workflow to create CTF data. By default, the data of every detector flagged in the GRP as being read out are expected. The list of detectors storing CTF data can be managed using the `--onlyDet arg (=none)` and `--skipDet arg (=none)` comma-separated lists. Every detector writing CTF data is expected to send an entropy-compressed `EncodedBlocks` flat object as output.
Example of usage:
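A minimal sketch, assuming an upstream ITS reconstruction workflow with entropy encoding enabled (the upstream workflow, its option and the detector list are illustrative):

```
o2-its-reco-workflow --entropy-encoding | o2-ctf-writer-workflow --onlyDet ITS
```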
For storage optimization one can request multiple CTFs to be stored in the same output file (as entries of the `ctf` tree):
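For example (a sketch; `<min>` and `<max>` are placeholder sizes in bytes):

```
o2-ctf-writer-workflow --min-file-size <min> --max-file-size <max>
```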
will accumulate CTFs in entries of the same tree/file until its size exceeds `min` and does not exceed `max` (the `max` check is disabled if `max<=min`) or an EOS is received. The `--max-file-size` limit will be ignored if the very first CTF already exceeds it. The additional option `--max-ctf-per-file <N>` will forbid writing more than `N` CTFs to a single file (provided `N>0`), even if `min-file-size` is not reached. The user may request autosaving of the CTFs accumulated in the file after every `N` TFs processed by passing the option `--save-ctf-after <N>`.
The output directory (by default: `cwd`) for the CTFs can be set via the `--output-dir` option and must exist. Since on the EPNs we may store the CTFs on a RAM disk of limited capacity, one can indicate a fall-back storage via the `--output-dir-alt` option. The writer will switch to it if (i) `szCheck = max(min-file-size*1.1, max-file-size)` is positive and (ii) the estimated available space on the primary storage (accounting for the other CTF files eventually being written concurrently) is below `szCheck`.
If the option `--meta-output-dir <dir>` is not `/dev/null`, the CTF meta-info files will be written to this directory (which must exist!).
By default only CTFs will be written. If the upstream entropy compression is performed w/o external dictionaries, then for every CTF its own dictionary will be generated and stored in the CTF. In this mode one can request the creation of a dictionary file (or a dictionary file per detector if the option `--dict-per-det` is provided) by passing the option `--output-type dict` (in which case only the dictionaries will be stored, but not the CTFs) or `--output-type both` (which will store both the dictionaries and the CTFs). This is the only valid mode for dictionary creation (if one requests dictionary creation but the compression was done with external dictionaries, the newly created dictionaries will be empty). In the dictionary-creation mode the dictionary data are accumulated over all CTFs processed. The user may request periodic (and incremental) saving of the dictionaries after every `N` TFs processed by passing the `--save-dict-after <N>` option.
The option `--ctf-dict-dir <dir>` can be provided to indicate the directory where the dictionary will be stored.
The external dictionaries created by the `o2-ctf-writer-workflow` contain a TTree (one for all participating detectors) and separate dictionaries per detector which can be uploaded to the CCDB. The per-detector dictionaries compatible with the CCDB can also be extracted from the common TTree-based dictionary file using the macro `O2/Detectors/CTF/utils/CTFdict2CCDBfiles.C` (installed to `$O2_ROOT/share/macro/CTFdict2CCDBfiles.C`), which extracts the dictionary for every detector into a separate file containing a plain `vector<char>`. These per-detector files can be directly uploaded to the CCDB and accessed via the `CcdbApi` (the reference to the vector should be provided to the corresponding detector's `CTFCoder::createCoders` method to build the run-time dictionary). These files can also be used as per-detector command-line parameters, on the same footing as tree-based dictionaries, e.g.
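as in the following sketch (the dictionary file name is illustrative; the dictionary is passed to the detector's entropy decoder as a per-detector option):

```
o2-ctf-reader-workflow --ctf-input <ctf_files> --its-entropy-decoder " --ctf-dict ctf_dictionary_ITS.root " | <downstream workflow>
```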
See below for the details of the `--ctf-dict` option.
One can pause the writing when the available disk space is low using a combination of dedicated options.
`o2-ctf-reader-workflow` should be the 1st workflow in the piped chain of CTF processing. At the moment it accepts as input a comma-separated list of CTF files produced by the `o2-ctf-writer-workflow`, reads the data for all detectors present in it (the list can be narrowed by the `--onlyDet arg (=none)` and `--skipDet arg (=none)` comma-separated lists), decodes them using the decoder provided by each detector and injects them into the DPL. In case of multiple entries in the CTF tree, they will all be read in a row.
Example of usage:
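A minimal sketch (the input file name and the downstream workflow are placeholders; `--ctf-input` is assumed here as the reader's input option):

```
o2-ctf-reader-workflow --onlyDet ITS --ctf-input o2_ctf_0000000000.root | <downstream ITS processing workflow>
```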
The options are:
- input data (obligatory): comma-separated list of CTF files and/or files with lists of data files and/or directories containing such files
- `--onlyDet arg (=none)`: comma-separated list of detectors to read; overrides `skipDet`
- `--skipDet arg (=none)`: comma-separated list of detectors to skip
By default an exception will be thrown if a detector is requested but missing in the CTF. To enable the injection of an empty output in such a case, one should use the option `--allow-missing-detectors`.
A dedicated option allows altering the `subSpecification` used to send the CTFDATA from the reader to the decoders. A non-0 value must be used in case the data extracted by the CTF reader should be processed and stored in new CTFs (in order to avoid a clash between the CTFDATA messages of the reader and the writer).
Further options:
- max CTFs to process (`<= 0`: infinite)
- max TFs to process from every CTF file (`<= 0`: infinite)
- loop `N` times after the 1st pass over the data (infinite for `N<0`)
- delay in seconds between sending consecutive CTFs (depends also on the file-fetching speed)
- copy command for remote files, or `no-copy` to avoid copying
- regex string to identify CTF files: optional filter for data files (if the input contains directories, it is used to avoid picking non-CTF files)
- regex string to identify remote files
- max CTF files queued (copied, for a remote source)
There is a possibility to read remote ROOT files directly, w/o caching them locally. For that one should: 1) provide the full URL of the remote files, e.g. if the files are supposed to be accessed via `xrootd` (the `XrdSecPROTOCOL` and `XrdSecSSSKT` env. variables should be set up in advance), use `root://eosaliceo2.cern.ch//eos/aliceo2/ls2data/...root` (use `xrdfs root://eosaliceo2.cern.ch ls -u <path>` to list the full URLs); 2) provide a proper regex to define the remote files, e.g. for the example above: `--remote-regex "^root://.+/eos/aliceo2/.+"`; 3) pass the option `--copy-cmd no-copy`.
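Putting the three steps together (a sketch; the file path is illustrative and `--ctf-input` is assumed as the reader's input option):

```
o2-ctf-reader-workflow --copy-cmd no-copy --remote-regex "^root://.+/eos/aliceo2/.+" \
  --ctf-input "root://eosaliceo2.cern.ch//eos/aliceo2/ls2data/<file>.root" | <downstream workflow>
```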
This is a `ctf-reader` device-local option allowing selective reading of particular CTFs. It is useful when dealing with CTF files containing multiple TFs. A comma-separated list of increasing CTF indices must be provided in the format parsed by the `RangeTokenizer<int>`, e.g. `1,4-6,...`. Note that the index corresponds not to the entry of the TF in the CTF tree but to the reader's own counter incremented through all input files (e.g. if 10 CTF files with 20 TFs each are provided as input and the selection `0,2,22,66` is given, the reader will inject into the DPL the TFs at entries 0 and 2 of the 1st CTF file, entry 2 of the 2nd file and entry 6 of the 4th file, then finish the job).
This option (used for skimming) allows pushing to the DPL only those TFs which overlap with the selected BC ranges provided via an input ROOT file (for the various supported formats see the `o2::utils::IRFrameSelector::loadIRFrames` method).
This option allows pushing to the DPL only those TFs which overlap with the `<runnumber> <range-min> <range-max>` records (separators can be any whitespace, comma or semicolon) provided via a text file (assuming that there are entries for the given run, otherwise the option is ignored). Multiple ranges per run and multiple runs can be mentioned in a single input file. The range limits can be indicated either as UNIX timestamps in `ms` or as orbit numbers (in the fill the run belongs to).
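An illustrative selection file (the values are made up; the first record uses `ms` timestamps, the second orbit numbers, and the records demonstrate different separators):

```
505658 1634169600000 1634170200000
505658; 2000000; 4500000
```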
If the corresponding selection-inversion option is provided, the selections above are inverted: TFs matching some of the provided ranges will be discarded, while the rest will be pushed to the DPL.
At the end of the processing the `ctf-reader` will create a local file `ctf_read_ntf.txt` containing only the number of TFs pushed to the DPL. In case no TF passed the selections above, this file will contain 0.
In the absence of an external dictionary, the encoding will generate for every TF, and store in the CTF, the dictionary information necessary to decode that CTF. Since the time needed for the creation of the dictionary and of the encoders/decoders may exceed the encoding/decoding time itself, there is a possibility to create, in a separate pass, a dictionary stored in a CTF-like object and use it for further encoding/decoding.
The option `--ctf-dict <OPT>` steers the fetching of the entropy dictionary in the entropy encoders of all detectors. The choices for `OPT` are:
1) `"ccdb"` (or an empty string): use the CCDB object fetched by the DPL CCDB service (default);
2) `<filename>`: use the dictionary from the provided file (either the tree-based format or the flat one in CCDB format);
3) `"none"`: do not use an external dictionary; instead, per-TF dictionaries will be stored in the CTF.
To create a dictionary, run the usual CTF creation chain but with an extra option, e.g.:
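For instance, a sketch reusing the writer chain from above (the upstream workflow and detector list are illustrative):

```
o2-its-reco-workflow --entropy-encoding | o2-ctf-writer-workflow --onlyDet ITS --output-type dict
```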
This will create a file `ctf_dictionary_<date>_<NTF_used>.root` (linked to `ctf_dictionary.root`) containing the dictionary data in a TTree format for all detectors processed by the `o2-ctf-writer-workflow`. Additionally, for every participating detector a `ctf_dictionary_<DET>_v<version>_<date>_<NTF_used>.root` file will be produced, with the dictionary in the flat format. These files can be directly uploaded to the CCDB. By default the dictionary file is written on exit from the workflow, in `CTFWriterSpec::endOfStream()`, which is currently not called if the workflow is stopped by `ctrl-C`. Periodic, incremental saving of the so-far accumulated dictionary data during the processing can be triggered by providing the option `--save-dict-after <N>`.
When decoding a CTF containing dictionary data (i.e. encoded w/o external dictionaries), externally provided dictionaries will be ignored.
To apply TF rate limiting (to make sure that no more than `N` TFs are in processing at a time), provide `--timeframes-rate-limit <N> --timeframes-rate-limit-ipcid <IPCID>` to all workflows (e.g. via `ARGS_ALL`). The `IPCID` is the NUMA domain ID (usually 0 in non-EPN workflows). Additionally, one may throttle on the free SHM by providing the reader option `--timeframes-shm-limit <shm-size>`.
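A sketch (the option values, input and downstream workflow are illustrative):

```
ARGS_ALL="--timeframes-rate-limit 2 --timeframes-rate-limit-ipcid 0"
o2-ctf-reader-workflow ${ARGS_ALL} --timeframes-shm-limit 16000000000 --ctf-input <ctf_files> | <downstream workflow> ${ARGS_ALL}
```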
Note that by default the reader reads the CTF data into memory and prepares all output messages, but injects them only once the rate limiter allows it. With the option `--limit-tf-before-reading` set, also the preparation of the data to inject is conditioned by the green light from the rate limiter.
For the ITS and MFT entropy decoding one can request either decomposing the clusters to digits and sending them instead of the clusters (via the `o2-ctf-reader-workflow` global options `--its-digits` and `--mft-digits`, respectively) or applying the noise mask to the decoded clusters (or decoded digits). If the masking is requested (e.g. via the option `--its-entropy-decoder " --mask-noise "`), the user should provide to the entropy decoder the noise-mask file (eventually it will be loaded from the CCDB) and the cluster-pattern decoding dictionary (if the clusters were encoded with pattern IDs). For example,
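the sketch below (the input is a placeholder; the noise-mask file and cluster-pattern dictionary are supplied to the decoder as described above)

```
o2-ctf-reader-workflow --ctf-input <ctf_files> --onlyDet ITS,MFT --its-entropy-decoder " --mask-noise " | <downstream workflow>
```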
will decode the ITS and MFT data, decompose the ITS clusters to digits on the fly, mask the noisy pixels with the provided masks, recluster the remaining ITS digits and send the new clusters out, together with the unchanged MFT clusters.
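Similarly, assuming an `--mft-entropy-decoder` option analogous to the ITS one, the sketch

```
o2-ctf-reader-workflow --ctf-input <ctf_files> --onlyDet ITS,MFT --mft-digits --mft-entropy-decoder " --mask-noise " | <downstream workflow>
```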
will decompose the MFT clusters to digits and send them out after masking the noise, while the ITS clusters will be sent as decoded.