The dataFinder.pl script (v2.80)
The dataFinder.pl script runs as a daemon on the stol2 machine, on port 9030 of the storage subnet (stol2rear:9030).
- the executable is located at /root/bin/dataFinder.pl
- the working directory with data and configuration files is /var/datafind
Short description of its functionality:
The dataFinder.pl script:
- periodically scans the STOL2 disk (data mount point) to find new, incoming files;
- checks those files for integrity and for missing frames;
- if a file is complete in all its parts, an ffl entry is computed and sent to the fflGen.pl module;
- it also answers requests from all the dataStorage.pl modules to find a better replica for an incomplete incoming file (on the offline buffer, from the stol1 data flow); if such a replica is found, the file is sent to the requester;
- if there is a crash on the current dest volume (triggered by the dataManager.pl module), it tries to inspect the backup stream buffer for the requested files (last produced and unavailable from the offline buffer) and replaces them in the ffl with local replicas (follow procedure P1 for the manual restore to normal operating conditions);
- sends emails to the operators when a file has missing frames, when files are recovered, and for all other exceptions.
Configuration parameters and files
The following files must be present in the working directory.
These files can be edited manually:
- /var/datafind/local_host_name : usually stol2rear, is the local host name (mapped on the storage subnet);
- /var/datafind/datasend_host_name : usually datagwrear, is the name of the machine that hosts the module dataFinder.pl (used for file recovery);
- /var/datafind/fflgen_host_name : usually stol2rear, is the name of the machine that hosts the fflGen.pl module;
- /var/datafind/dataman_host_name : usually stol1rear, is the name of the machine that hosts the dataManager.pl module;
- /var/datafind/file_std_frames : usually 240, is the standard number of frames per file;
- other files, such as the .db, .list and .lock files, must be left untouched: changing or deleting them can cause system inconsistency.
All files are automatically replicated every hour by a cron script into the directory /var/dataSoft.bkp/datastor.stXX, located on the datasw machine.
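Before editing any of these files by hand, it can help to print their current values. The following is a minimal sketch, not part of dataFinder.pl: the function name show_datafind_config is hypothetical, and it assumes that each file holds its value as plain text on its first line.

```shell
# Print the editable dataFinder.pl settings found in a working directory.
# Assumption: each file stores its value as plain text on the first line.
# show_datafind_config is a hypothetical helper, not part of the script.
show_datafind_config() {
    dir="${1:-/var/datafind}"
    for name in local_host_name datasend_host_name fflgen_host_name \
                dataman_host_name file_std_frames; do
        if [ -r "$dir/$name" ]; then
            # Only the first line is shown; any further lines are ignored.
            printf '%s = %s\n' "$name" "$(head -n 1 "$dir/$name")"
        else
            printf '%s = <missing>\n' "$name"
        fi
    done
}
```

Called without arguments it reads /var/datafind; a different directory (for example a copy of the hourly backup) can be passed as the first argument.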
The following variables are defined at the top of the script; to change their values, the script must be stopped and restarted:
- set this variable to 1 or 2 to increase the log detail level:
# verbose level (1/2)
our $verbose = 1;
- set these service variables to 0 (disable) or 1 (enable):
# active services (0/1)
our $srv_filescan = 1;
our $srv_localffl = 1;
- set up the volume list (and mount point) that must be periodically checked:
our $data_mount_point = "/storage";
our $data_subvol_path = "data/DAQ/rawdata/lastday";
… other variables must not be changed!
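To review the current values of these variables without opening the script in an editor, a read-only check along these lines may be useful. This is only a sketch: show_datafind_vars is a hypothetical helper, and it assumes the assignments start at the beginning of a line, as in the excerpts above.

```shell
# List the tunable "our $..." assignments found in the script.
# Assumption: each assignment starts at the beginning of a line.
# show_datafind_vars is a hypothetical helper; it never modifies the file.
show_datafind_vars() {
    script="${1:-/root/bin/dataFinder.pl}"
    grep -E '^our \$(verbose|srv_filescan|srv_localffl|data_mount_point|data_subvol_path) *=' "$script"
}
```

Any other "our" variable in the script is deliberately left out of the listing, since it must not be changed.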
To start the script as a daemon, follow these rules:
- check that no instance is running:
# ps -edaf | grep dataFind
If the list is not empty and there are processes like dataFinder.pl:main, wait until they finish before restarting the script as a daemon; otherwise, a port conflict will occur.
To start the script on the stol2rear machine (from the root account), run:
# cd /var/datafind
# nohup /root/bin/dataFinder.pl &
then check the log and/or the nohup.out file:
# tail -f /var/log/dataFinder.pl.log
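The check-then-start steps above can be combined into one guarded helper. The sketch below is an illustration only: the function names are hypothetical, and the instance check simply filters the ps listing the same way the manual grep does.

```shell
# Succeeds when the process listing on stdin contains a dataFinder.pl process.
# (Hypothetical helper; it reads "ps -edaf" style output from stdin.)
has_datafinder_instance() {
    # Drop the grep command itself, then look for the script name.
    grep -v 'grep' | grep -q 'dataFinder\.pl'
}

# Start the daemon only when no instance is listed (hypothetical wrapper).
# Must be run from the root account on stol2rear, as described above.
start_datafinder() {
    if ps -edaf | has_datafinder_instance; then
        echo "dataFinder.pl is still running; wait for it to finish" >&2
        return 1
    fi
    # nohup.out is written to the working directory.
    cd /var/datafind || return 1
    nohup /root/bin/dataFinder.pl &
}
```

Refusing to start while an old process is still listed avoids the port conflict on 9030 described above.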
If you are starting the script after a total shutdown, remember that the correct script activation sequence is the following:
- the fflGen.pl on stol2rear;
- the dataBackup.pl on datagwrear;
- all the dataStorage.pl on stXXrear;
- the dataFinder.pl on stol2rear;
- the dataManager.pl on stol1rear.
To stop the script, from any machine in the storage farm network simply type:
# dataCommand.pl stol2rear:9030 "stop:<reasons>"
Please note that the ":" is mandatory even if no <reasons> are specified. Then take a look at the log file or at nohup.out:
# tail -f /var/log/dataFinder.pl.log
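Because the ":" must always be present, the stop string can be built by a small helper so that the separator is never forgotten. This is a sketch only: stop_payload is a hypothetical name, and the reason text is passed through unchanged.

```shell
# Build the "stop" string; the ":" is always emitted, even with no reason.
# stop_payload is a hypothetical helper, not part of dataCommand.pl.
stop_payload() {
    printf 'stop:%s' "${1:-}"
}

# Usage (not executed here):
#   dataCommand.pl stol2rear:9030 "$(stop_payload "maintenance")"
#   dataCommand.pl stol2rear:9030 "$(stop_payload)"
```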
Commands recognized by the script
… the commands present in the dataFinder.pl code are not intended to be executed manually.