MINOS Code Profile Results:

(Text too small?  Good! Cut-and-paste to see it properly!)

May 23, 2001: Initial tests
Minossoft CVS date:  ~ May 20. Small mods to make it work on my platform
ROOT version: CVS checkout ~ May 20
Script:  bub1.jcm  Note that this run was only 10 events.

#======================================================================= # # A simple job to demonstrate UserAnalysis module # # usage: demojob -b jobc.jcm (batch w/o graphics) # demojob -g jobc.jcm (batch w/ graphics) # # messier@huhepl.harvard.edu #======================================================================= /Path/Create Demo \ Input::Get \ RerootToRawDataModule::Get \ DigitListModule::Get \ DigitListModule::Reco \ DigiPairListModule::Reco \ CutDigiPairListModule::Reco \ SuperSliceListModule::Reco \ MSTClusterListModule::Reco \ ThruMuonListModule::Reco \ FitThruMuonListModule::Reco \ GoodThruMuonListModule::Reco \ ThruMuonAnalysisModule::Ana \ CheezyDisplayModule::Ana \ EventDump::Ana /DigiPairListModule/Set ListIn candmcdigitlist /CutDigiPairListModule/Set ADC 30; /CutDigiPairListModule/Set TDC 100 /SuperSliceListModule/Set ListIn candcutdigipairlist /MSTClusterListModule/Set DistMax 50 /MSTClusterListModule/Set HoleMax 0 /MSTClusterListModule/Set PlaneMax 3 /ThruMuonListModule/Set DiffMax 20 /GoodThruMuonListModule/Set ListIn candfitthrumuonlist /GoodThruMuonListModule/SetStraight MaxBad 20 /GoodThruMuonListModule/SetGeom DistMin 0 /GoodThruMuonListModule/SetGeom NPlnMin 6 /ThruMuonAnalysisModule/Set ListIn candgoodthrumuonlist /Demo/Input/Add /home/tagg/root_files/gm_far_cosmic_1_hapr01.root /Demo/Run Nin 10 /ThruMuonAnalysisModule/Plot NDigitSpec
Results: bub1.gprof.out Highlights:
% cumulative self self total time seconds seconds calls ms/call ms/call name 18.60 1.09 1.09 62765 0.02 0.04 TNodeX::GetNodeToDepth(char const *, int) 9.56 1.65 0.56 8617061 0.00 0.00 TListIter::Next(void) 7.00 2.06 0.41 8637688 0.00 0.00 TIter::operator()(void) 4.95 2.35 0.29 3 96.67 97.79 TNodeX type_info function 4.10 2.59 0.24 195 1.23 1.33 R__Inflate_codes 3.92 2.82 0.23 8870514 0.00 0.00 TNamed::GetName(void) const 3.75 3.04 0.22 13 16.92 16.92 TObject type_info function 3.24 3.23 0.19 8592760 0.00 0.00 TObject::TestBit(unsigned int) const 2.05 3.35 0.12 UgliScintPlnNode type_info function 2.05 3.47 0.12 UgliSteelPlnNode type_info function 1.54 3.56 0.09 2957 0.03 0.03 TBuffer::ReadArray(int *&) 1.02 3.62 0.06 314919 0.00 0.00 G__type2string 1.02 3.68 0.06 293895 0.00 0.00 TExMap::GetValue(unsigned long, long) 1.02 3.74 0.06 164765 0.00 0.00 Hash(char const *) 1.02 3.80 0.06 161056 0.00 0.00 TList::FindObject(char const *) const 1.02 3.86 0.06 32564 0.00 0.00 TNode::UpdateMatrix(void) 1.02 3.92 0.06 2823 0.02 0.02 G__defined_typename
Comments: Note that TNodeX::GetNodeToDepth is taking up a LOT of time.

May 25, 2001: Robert Creates TNodeX cacheing
Minossoft CVS date: May 25. (Small mods to make it work on my platform)
ROOT version: CVS checkout ~ May 20
Script: As above( bub1.jcm ) but with 8000 events
Results: bub0525.gprof.out Highlights:

% cumulative self self total time seconds seconds calls s/call s/call name 2.15 100.04 100.04 156437864 0.00 0.00 _Rb_tree, _Select1st >, less, allocator >::lower_bound(PlexPlaneId const &) 1.90 188.50 88.46 3671394449 0.00 0.00 basic_string, __default_alloc_template >::rep(void) const 1.87 275.61 87.11 236097870 0.00 0.00 TNode::UpdateTempMatrix(double const *, double const *, double, double, double, double *, double *, double *) 1.71 354.93 79.32 575422413 0.00 0.00 TObjArrayIter::Next(void) 1.61 429.86 74.93 483649392 0.00 0.00 basic_string, __default_alloc_template >::compare(basic_string, __default_alloc_template > const &, unsigned int, unsigned int) const 1.57 502.73 72.87 234610143 0.00 0.00 TNode::GetNode(char const *) const 1.50 572.43 69.70 156436894 0.00 0.00 UgliGeometry::GetScintPlnNode(PlexPlaneId) 1.46 640.35 67.92 9409337 0.00 0.00 RerootExodus::GetCellPos(PlexStripEndId) 1.45 707.61 67.26 1755561583 0.00 0.00 operator<(PlexPlaneId const &, PlexPlaneId const &) 1.45 774.86 67.25 1366870240 0.00 0.00 TMath::Abs(float) 1.39 839.57 64.71 2043621193 0.00 0.00 basic_string, __default_alloc_template >::length(void) const 1.22 896.18 56.61 1755561583 0.00 0.00 less::operator()(PlexPlaneId const &, PlexPlaneId const &) const 1.22 952.74 56.56 1599124689 0.00 0.00 _Rb_tree, _Select1st >, less, allocator >::_S_key(_Rb_tree_node > *) 1.18 1007.69 54.95 672734534 0.00 0.00 operator<(PlexStripEndId const &, PlexStripEndId const &) 1.15 1061.16 53.47 78730724 0.00 0.00 TNode::UpdateMatrix(void) 1.13 1113.62 52.46 78264622 0.00 0.00 _Rb_tree, _Select1st >, less, allocator >::lower_bound(PlexStripEndId const &) 1.07 1163.49 49.87 1451041169 0.00 0.00 TObjArray::At(int) const 0.94 1207.25 43.76 209132367 0.00 0.00 __default_alloc_template::allocate(unsigned int)
Comments:
The new _Rb_tree  call is part of Robert's modifcations.  The savings in performance is enormous, but it's still the front runner for absorbing CPU cycles.
The calls to string::rep and string::length are a bit worrying..  perhaps string compare functions are over-used during processing.  I suspect it's the same basic problem: TNamed are being used instead integer codes, which would be much faster to manipulate.
Another question good question: Why the PlexStripEnd::operator functions?  Why should these take time?

June 4, 2001: Roy's job, New Minossoft
Minossoft CVS date: June 4. (Small mods to make it work on my platform)
ROOT version: CVS checkout ~ June 4
Script: roy1.jcm with 20 events

// // Create the job path. The modules ahead of the UserAnalysis are // needed to convert from Reroot data format and build the candidate // digit list. // /Path/Create Demo \ Input::Get \ RerootToRawDataModule::Get \ DigitListModule::Get \ DigitListModule::Reco \ StripSRListModule::Reco \ SliceSRListModule::Reco \ ClusterSRListModule::Reco \ ShowerSRListModule::Reco \ TrackSRListModule::Reco \ FitTrackSRListModule::Reco \ EventSRListModule::Reco \ EventSRListModule::Ana // /Msg/SetLevel MuonCalibratorSR Verbose // /Msg/SetLevel EventSR Verbose // /Msg/SetLevel TrackSR Verbose // /Msg/SetLevel FitTrackSR Verbose // /Msg/SetLevel ShowerSR Verbose // /Msg/SetLevel SliceSR Verbose // /Msg/SetLevel ClusterSR Verbose // /Msg/SetLevel FitTrackSR Verbose /Msg/SetLevel Bfld Error /Msg/SetLevel Ugli Fatal // Set the input file and run the job /Demo/Input/Add /home/tagg/root_files/gm_far_le_numu_1_hapr01_20evt.root /Demo/Run Nin 100000000000

Job output: roy0604.out.txt
Results: roy0604.gprof.out
Highlights:
Each sample counts as 0.01 seconds. % cumulative self self total time seconds seconds calls ms/call ms/call name 5.03 9.03 9.03 81730 0.11 0.56 SwimObjSR::SwimTo(double) 3.06 14.53 5.50 SwimPlaneInterfaceSR type_info function 2.68 19.34 4.81 34052122 0.00 0.00 TObjArrayIter::Next(void) 1.93 22.80 3.46 19 182.11 182.11 TObject type_info function 1.88 26.17 3.37 109009425 0.00 0.00 TObjArray::At(int) const 1.80 29.41 3.24 138127 0.02 0.17 SwimObjSR::SwimToIron(double) 1.71 32.48 3.07 24720056 0.00 0.00 __default_alloc_template::allocate(unsigned int) 1.54 35.25 2.77 101 27.43 62.57 R__Deflate 1.39 37.75 2.50 6273275 0.00 0.01 NavSet::Fill(TCollection const *, char const *, int) 1.37 40.21 2.46 11420363 0.00 0.00 vector >::_M_insert_aux(void **, void *const &) 1.35 42.63 2.42 19930403 0.00 0.00 TIter::TIter(TCollection const *, unsigned char) 1.20 44.78 2.15 67063259 0.00 0.00 TObjArray::GetLast(void) const 1.16 46.86 2.08 67361924 0.00 0.00 TObjArray::GetAbsLast(void) const 1.16 48.94 2.08 24714203 0.00 0.00 __default_alloc_template::deallocate(void *, unsigned int) 1.06 50.85 1.91 49409763 0.00 0.00 __default_alloc_template::_S_freelist_index(unsigned int) 1.04 52.72 1.87 49409359 0.00 0.00 __default_alloc_template::_Lock::~_Lock(void) 1.01 54.54 1.82 6273249 0.00 0.00 NavSet::Init(void) 0.97 56.28 1.74 19930443 0.00 0.00 TIter::~TIter(void)

Notes: Swimmer is obviously a problem. The second call, I THINK, is overhead to the C++ virtual symbol lookup table, which is attempting to find the correct function to call. I'm not sure. TObjectArrayIter is expected. TNodeX was apparently hammered only by Bubblespeak, not by the newer reconstruction classes. Nick needs not to worry about the Navigation just yet.. it's pretty clear that although individual calls are expensive, they don't add up to a huge fraction. I don't know why the 'default alloc template' calls show up.



Nathaniel Tagg   n.tagg1@physics.ox.ac.uk