Proof-on-demand
Quick-start
This section briefly describes how to set up a proof-on-demand (PoD) system. Follow these steps:
- Set up a ROOT version, or skip this step and let proof-on-demand set up a ROOT version for you
- Set up proof-on-demand via CVMFS
export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/
source /cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/user/atlasLocalSetup.sh
localSetupPoD
# in a script use
# source ${ATLAS_LOCAL_ROOT_BASE}/packageSetups/atlasLocalPoDSetup.sh
- Start a PoD server
pod-server start
# or: pod-server restart
- Submit 16 worker jobs on condor
pod-submit -r condor -n 16
- You are now ready. Don't forget to remove your worker jobs and stop the server when you no longer need them. Worker jobs terminate automatically after they have been idle for 30 min.
condor_rm <jobnumber>
pod-server stop
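The steps above can be collected in a small helper, for example when a session is started from a batch script. The sketch below only assembles the command lines shown above and does not run anything. The function names `pod_session_commands` and `pod_cleanup_commands` are illustrative and not part of PoD.

```python
def pod_session_commands(n_workers=16, rms="condor"):
    """Return the commands that start a PoD server and submit worker jobs."""
    return [
        ["pod-server", "start"],
        ["pod-submit", "-r", rms, "-n", str(n_workers)],
    ]


def pod_cleanup_commands(job_number):
    """Return the commands that remove the worker jobs and stop the server."""
    return [
        ["condor_rm", str(job_number)],
        ["pod-server", "stop"],
    ]
```

Each inner list can be passed to `subprocess.check_call` once the PoD environment has been set up in the calling shell.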
More information can be found at
http://pod.gsi.de/documentation.html. Some advanced topics and settings are discussed below.
Advanced topics
Here is a loose collection of advanced topics that you may encounter in your day-to-day use of
PoD.
Log-files and clean-up
In your home directory, ~/.PoD/ contains the working files of PoD. The log files are located in
~/.PoD/log and can fill up quite quickly (3 files for each worker job each time you submit workers), so this directory should be cleaned up regularly.
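A regular clean-up could be scripted along the following lines. This is a minimal sketch that assumes no running PoD server still writes to the files being removed. The function name `remove_old_logs` and the 7-day default are illustrative choices, not PoD settings.

```python
import os
import time


def remove_old_logs(log_dir, max_age_days=7, dry_run=True):
    """Remove regular files in log_dir that are older than max_age_days.

    Returns the paths that were removed (or would be removed when dry_run is True).
    """
    cutoff = time.time() - max_age_days * 86400
    removed = []
    if not os.path.isdir(log_dir):
        return removed
    for name in sorted(os.listdir(log_dir)):
        path = os.path.join(log_dir, name)
        # Only touch plain files whose last modification is before the cutoff.
        if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
            if not dry_run:
                os.remove(path)
            removed.append(path)
    return removed
```

Calling `remove_old_logs(os.path.expanduser("~/.PoD/log"))` first lists the candidates. Passing `dry_run=False` actually deletes them.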
Proof
Here are some notes and working examples for a quick start with PROOF. More information about PROOF is available at:
http://root.cern.ch/drupal/content/using-proof
Quick-start with TChain.Draw
You need to set up a PROOF session in your ROOT script or interactive ROOT session. TChain objects know about this (global) PROOF session and can use it automatically without changes to the code. The "PROOF-aware" functions are TChain::Draw and TChain::Process.
To set up a PROOF session with PoD, use:
TProof* proof = TProof::Open("pod://");
For local testing you can also use "lite://" as the argument. This creates worker processes on the local machine, one per CPU.
Open a TChain, make it aware of PROOF, and draw a histogram:
TChain* t = new TChain("tree");
t->Add("myfiles*");
t->SetProof(kTRUE);
t->Draw("myvariable * myothervariable", "somecuts > somevalue");
The ROOT file(s) are processed in parallel on all worker nodes, and a progress GUI pops up (without X, you will see a text status instead). Unfortunately, it is not easy to find out whether an error has occurred. The main indications are in the GUI (e.g. not all events were processed) and in the log files.
Using TSelector
A TSelector is a class that defines the processing steps before and after the event loop, and what happens inside the loop (call sequence copied from the PROOF manual):
+++ CLIENT Session +++       +++ (n) WORKERS +++
Begin()
                             SlaveBegin()
                             Init()
                                 Notify()
                                 Process()
                                 ...
                                 Process()
                             ...
                             Init()
                                 Notify()
                                 Process()
                                 ...
                                 Process()
                             ...
                             SlaveTerminate()
Terminate()
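The call order above can be mimicked with a toy driver in plain Python. This is only an illustration of the sequence, not PROOF itself: the workers run one after another instead of in parallel, the method signatures are simplified (the real SlaveBegin and Init receive a tree, and the real SlaveTerminate fills fOutput instead of returning a value), and the names `CountingSelector` and `run_toy_session` are made up for this sketch.

```python
class CountingSelector(object):
    """Toy stand-in for a TSelector that counts the entries it processes."""

    def __init__(self, log):
        self.log = log      # shared list that records the call order
        self.count = 0      # per-instance state, playing the role of a histogram
        self.total = None   # set on the client in Terminate()

    def Begin(self):
        self.log.append("Begin")

    def SlaveBegin(self):
        self.log.append("SlaveBegin")
        self.count = 0

    def Init(self):
        self.log.append("Init")

    def Notify(self):
        self.log.append("Notify")

    def Process(self, entry):
        self.log.append("Process(%d)" % entry)
        self.count += 1

    def SlaveTerminate(self):
        self.log.append("SlaveTerminate")
        return {"count": self.count}

    def Terminate(self, merged):
        self.log.append("Terminate")
        self.total = merged["count"]


def run_toy_session(worker_files):
    """worker_files[i] lists the entry counts of the files given to worker i."""
    log = []
    client = CountingSelector(log)
    client.Begin()
    merged = {"count": 0}
    for files in worker_files:
        worker = CountingSelector(log)      # every worker gets its own instance
        worker.SlaveBegin()
        for n_entries in files:
            worker.Init()                   # called again for each new file
            worker.Notify()
            for entry in range(n_entries):  # entry numbers restart in each file
                worker.Process(entry)
        output = worker.SlaveTerminate()
        merged["count"] += output["count"]  # worker outputs are merged for the client
    client.Terminate(merged)
    return client.total, log
```

For example, `run_toy_session([[2, 1], [3]])` simulates two workers, the first with two files and the second with one, and returns the merged count together with the recorded call order.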
The simplest way to obtain a TSelector is to generate it from a TChain:
t->MakeSelector("mySelector");
Similar to MakeClass, this produces a skeleton in which the branches of the tree are already linked to member variables. You then only need to fill in the TSelector::Process function, as in the example below. Note that the tree entry has to be loaded explicitly with GetEntry. Also, the entry number is local, i.e. for a TChain it is the entry in the current file, not in the whole chain.
Bool_t mySelector::Process(Long64_t entry){
    fChain->GetTree()->GetEntry(entry);
    hist->Fill(mc_n);
    return kTRUE;
}
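The difference between a local entry and an entry in the whole chain can be illustrated with plain Python. The sketch below maps an entry number counted over the whole chain to the file it falls in and the entry number inside that file. The function name `locate_entry` and the list of per-file entry counts are assumptions made for this illustration and are not part of ROOT.

```python
import bisect


def locate_entry(global_entry, entries_per_file):
    """Map an entry number in the whole chain to (file index, local entry)."""
    offsets = []   # offsets[i] is the chain entry number of the first entry in file i
    total = 0
    for n in entries_per_file:
        offsets.append(total)
        total += n
    if not 0 <= global_entry < total:
        raise IndexError("entry %d is outside the chain" % global_entry)
    # Last file whose first entry is not beyond the requested entry.
    file_index = bisect.bisect_right(offsets, global_entry) - 1
    return file_index, global_entry - offsets[file_index]
```

With three files of 100, 50 and 200 entries, `locate_entry(120, [100, 50, 200])` gives `(1, 20)`: chain entry 120 is local entry 20 of the second file, and 20 is the kind of number that Process receives.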
Since each worker job creates a new instance of the TSelector, histograms and other output objects need to be created for each job in TSelector::SlaveBegin (hist is declared in the .h file as TH1F* hist;):
void mySelector::SlaveBegin(TTree * /*tree*/){
    TString option = GetOption();
    hist = new TH1F("hist","hist",100,0,100);
}
To collect all the outputs, the member TSelector::fOutput is used:
void mySelector::SlaveTerminate(){
    fOutput->Add(hist);
}
When all the jobs are done, the merged output is available on the client: you can retrieve it, draw it, save it, etc. This is done in TSelector::Terminate:
void mySelector::Terminate(){
    hist=(TH1F*)fOutput->FindObject("hist");
    hist->Draw();
}
Advanced topics
Here is a loose collection of advanced topics that you may encounter in your day-to-day use of PROOF.
Using datasets
A dataset is an object that consists of a number of ROOT files and a set of meta-data. Validation and checks can be performed for all the ROOT files in the dataset, and the files can be processed with PROOF using TSelectors:
TFileCollection* filecollection = new TFileCollection("myDataSet");
filecollection->SetDefaultTreeName("myTree");
proof->RegisterDataSet("myDataSet#myTree", filecollection);
proof->VerifyDataSet("myDataSet");
proof->Process("myDataSet");
Additional complications arise if you want to run over several datasets at the same time (i.e. let PROOF handle the processing of many datasets, rather than calling TProof::Process sequentially for each dataset).
Using Python
The Python binding of TSelector cannot properly handle callbacks to Python functions, so you cannot simply use TSelector in Python. The class
TPySelector partially overcomes this problem.
Example python class using datasets
import time

from ROOT import TPySelector,TH1F,kTRUE,TDSetElement,TDSet,TH1,TLorentzVector,TObjString,TFile,TString

class validationSelectorTest(TPySelector):
    # "Begin" runs locally. Parameters are set here so that the constructor does not have to be modified.
    # For the moment, basefilename is used for the output file names and writeall controls
    # whether all histograms are also written into one file.
    def Begin(self):
        self.basefilename="test"
        self.writeall=True
        self.info("base filename",self.basefilename)
        self.info("write all",self.writeall)
        self.info("Begin")

    # "SlaveBegin" prepares the histogram and input name bookkeeping and creates the global histograms only.
    # Local histograms (i.e. local to one dataset) are initialised in the Process loop.
    # There is a check for double initialisation, which should not happen.
    def SlaveBegin(self,tree):
        if getattr(self,"n",None) is not None:
            self.error("init twice")
            raise RuntimeError("init twice")
        self.n=0
        self.histograms={}
        self.inputNames=[]
        self.InputName=""
        self.createHistograms(globalHist=True)
        self.info("SlaveBegin")

    # "Init" catches the tree
    def Init(self,tree):
        self.fChain=tree
        self.info("Init")

    # Add all the histograms to the output
    def SlaveTerminate(self):
        for h in self.histograms:
            self.info("add to output",h,self.histograms[h])
            self.fOutput.Add(self.histograms[h])
        self.info("SlaveTerminate")

    # Get the output histograms and dataset names,
    # prepare an output file for each dataset and sort the histograms by dataset,
    # and also write out all histograms into one file
    def Terminate(self):
        hists=[]
        strings=[]
        for o in self.fOutput:
            self.info("Get output object",o)
            if isinstance(o,TH1):
                hists.append(o)
            if isinstance(o,TObjString):
                strings.append(str(o.GetString().Data()))
        # group the histograms by the dataset name their name starts with; "" collects the rest
        histbygroup={}
        histbygroup[""]=[]
        for s in strings:
            histbygroup[s]=[]
        for h in hists:
            name=h.GetName()
            added=False
            for s in strings:
                if name[:len(s)]==s:
                    histbygroup[s].append(h)
                    added=True
                    break
            if not added:
                histbygroup[""].append(h)
        self.info("histograms by group",histbygroup)
        self.info("all histograms",hists)
        if self.writeall:
            filename=self.basefilename+"_all.root"
            self.info("write all histograms",hists,"in",filename)
            f=TFile(filename,"RECREATE")
            for h in hists:
                h.Write()
                self.info("write",filename,h)
            f.Close()
        for hg in histbygroup:
            filename=self.basefilename+hg+".root"
            self.info("write histograms of group",hg,histbygroup[hg],"in",filename)
            f=TFile(filename,"RECREATE")
            for h in histbygroup[hg]:
                h.Write()
                self.info("write",filename,hg,h)
            f.Close()
        #self.fOutput.FindObject("hist"+self.InputName).Draw()

    # shared implementation of info/warning/error:
    # prints object name, class name, timestamp, an optional level and the messages on one line
    def printMessage(self,level,msg):
        parts=[self.GetName(),"of",self.ClassName(),str(time.time())]
        if level:
            parts.append(level)
        parts.extend(str(m) for m in msg)
        print(" ".join(parts))

    def info(self,*msg):
        self.printMessage("",msg)

    def warning(self,*msg):
        self.printMessage("WARNING",msg)

    def error(self,*msg):
        self.printMessage("ERROR",msg)

    # create a single histogram, add the dataset name to the start of the name
    def createHistogram(self,histtype,*args):
        if self.InputName+args[0] in self.histograms:
            self.warning("histogram",self.InputName+args[0],"already exists")
        h=histtype(self.InputName+args[0],self.InputName+args[1],*args[2:])
        h.SetDirectory(0)
        self.histograms[self.InputName+args[0]]=h

    # create global histograms without adding the dataset name
    def createHistogramGlobal(self,histtype,*args):
        if args[0] in self.histograms:
            self.warning("global histogram",args[0],"already exists")
        h=histtype(args[0],args[1],*args[2:])
        h.SetDirectory(0)
        self.histograms[args[0]]=h

    # get a histogram, check local first, then global (returns None if neither exists)
    def getHistogram(self,name):
        if self.InputName+name in self.histograms:
            return self.histograms[self.InputName+name]
        elif name in self.histograms:
            return self.histograms[name]

    # get local histograms directly, for speed
    def getHistogramLocal(self,name):
        return self.histograms[self.InputName+name]

    # get global histograms directly, for speed
    def getHistogramGlobal(self,name):
        return self.histograms[name]

    # list all the histograms here that should be created, global and local
    def createHistograms(self,globalHist=False):
        if globalHist:
            pass
        else:
            self.createHistogram(TH1F,"N","N",5000,0,5000)
        self.info("added",self.histograms)
        #elem=self.fInput.FindObject("PROOF_CurrentElement")
        #if elem:
        #    fCurrent=elem.Value()
        #    print fCurrent.TestBit(TDSetElement.kNewRun),fCurrent.TestBit(TDSetElement.kNewPacket)

    # decide which dataset is currently processed; not sure if it has to be checked for every event
    def setInputName(self):
        returnname=""
        if self.fInput:
            elem=self.fInput.FindObject("PROOF_CurrentElement")
            if elem:
                fCurrent=elem.Value()
                # "or True" forces the name to be updated for every event
                if fCurrent.TestBit(TDSetElement.kNewRun) or True:
                    returnname=elem.Value().GetDataSet()
                    returnname=returnname.replace(".","_")
                    #print self.n, returnname,fCurrent.TestBit(TDSetElement.kNewRun)
                else:
                    return
            else:
                return
        else:
            return
        self.InputName=returnname
        # first time this dataset is seen: book its histograms and record its name in the output
        if returnname not in self.inputNames:
            self.inputNames.append(returnname)
            self.createHistograms()
            self.fOutput.Add(TObjString(returnname))

    # process: need to get the entry and to set the dataset name (=InputName)
    def Process(self, entry):
        self.n+=1
        self.setInputName()
        self.fChain.GetTree().GetEntry(entry)
        #print self.n,self.histograms,self.histograms[self.InputName+"N"],self.InputName+"N"
        self.getHistogram("N").Fill(self.n)
        return kTRUE
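The sorting step in Terminate relies on each histogram name starting with the name of its dataset. Stripped of the ROOT objects, the idea can be sketched in plain Python as follows. The function name `group_by_prefix` and the example names are hypothetical and only serve this illustration.

```python
def group_by_prefix(names, prefixes):
    """Sort names into groups keyed by the first prefix they start with.

    Names that match no prefix are collected in the "" group.
    """
    groups = {"": []}
    for prefix in prefixes:
        groups[prefix] = []
    for name in names:
        for prefix in prefixes:
            if name.startswith(prefix):
                groups[prefix].append(name)
                break
        else:
            # No dataset prefix matched: treat it as a global histogram.
            groups[""].append(name)
    return groups
```

Because the first matching prefix wins, a dataset whose name is itself the beginning of another dataset name can capture the histograms of both. Choosing dataset names that do not start with one another avoids this.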
--
DucBaoTa - 20 Nov 2013