Implement first version of EMCAL hit creation by mfasDa · Pull Request #523 · AliceO2Group/AliceO2

mfasDa · 2017-08-30T15:37:18Z

- Bug fix (temporary) in Shishkebab geometry
- Small fix in EMCAL geometry
- Remove unused variable (shut) in Hit
- Fix z-coordinate of Hit
- Fix add function for hit: all hits from the same track are added to one hit. A lookup table is used to map the hits
- Handle calculation of the module ID: the module ID needs to be obtained from the copy number of the volume and the volume ID over various levels in the geometry tree
- Process Hits: hits are currently added once a volume is passed


sawenzel · 2017-08-30T17:46:24Z

 {
 class ShishKebabTrd1Module;

+using doublevect = std::vector<double>;


Type names should follow the naming convention for types, e.g. DoubleVect. This will be enforced soon.

You should not use "using" in headers. Moreover, it is considered bad practice to have such generic aliases, as they confuse the reader. You read code far more often than you write it, so you should not worry about saving a few characters (of course, for a specific type it's a different matter).

I completely agree with you that using directives should be banned from header files when they are used to remove namespaces: this is a very dangerous procedure just to save a few characters, and it is a widespread approach in AliRoot/AliPhysics code. Here, however, the using directive is used as a typedef, which is even declared within the EMCAL namespace. Furthermore, I don't really see where it has any impact on readability: the name "DoubleVect" should be self-explanatory.

I agree somewhat with @ktf that "std::vector" is a bit clearer than DoubleVect and the gain in typing is in any case minor.

Not sure if doublevect is so clear: it can be a vector of type double, or a somehow duplicated vector, i.e. some vector which is there twice. My first association goes more in the second direction, and that makes a simple thing confusing.

DoubleVect serves no purpose but to save a couple of keystrokes and hide the actual type, possibly generating confusion. If you really need a typedef, you should give it some specific type meaning like "WeightVector", "EnergyVector" or similar. In any case do not keep it at namespace level, but hide it inside the class itself. Having it at namespace level means that everyone, in any namespace, will have to agree on what your type is, which is bad.

OK, I will change this back to std::vector in the commit together with the std::set issue.
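The suggestion to give an alias a specific meaning and keep it inside the class could look like the following minimal sketch. The namespace `emcal_example`, the class `ClusterShape`, and the alias `WeightVector` are hypothetical names chosen for illustration; they are not taken from the O2 code base.

```cpp
#include <vector>

namespace emcal_example // hypothetical namespace, not the O2 one
{

class ClusterShape // hypothetical class, for illustration only
{
 public:
  // The alias lives inside the class and is named after what it means,
  // not after how it is represented.
  using WeightVector = std::vector<double>;

  void addWeight(double w) { mWeights.push_back(w); }

  const WeightVector& weights() const { return mWeights; }

  double sumOfWeights() const
  {
    double sum = 0.;
    for (double w : mWeights) {
      sum += w;
    }
    return sum;
  }

 private:
  WeightVector mWeights;
};

} // namespace emcal_example
```

Code outside the class has to spell out `ClusterShape::WeightVector`, so the alias does not leak into the namespace and other classes are free to define their own.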

sawenzel · 2017-08-31T06:20:30Z

@dberzano : Is the macos build down?

dberzano · 2017-08-31T07:19:50Z

@sawenzel that's a bug in the continuous builder not appearing on Linux that I believe I have fixed now, hold on.

dberzano · 2017-08-31T08:18:46Z

...aaand it's fixed.

ktf · 2017-08-31T09:16:10Z

  Double_t mBirkC1;
  Double_t mBirkC2;

+  std::set<TrackHit>


std::sets are very expensive, both in per-item overhead and per lookup. Depending on your vector size, you might want to use a sorted std::vector<TrackHit> and do a binary search or a linear search. For a nice article about binary vs linear search, see https://dirtyhandscoding.wordpress.com/2017/08/25/performance-comparison-linear-search-vs-binary-search/. If your vector is very big, you might want to sort a separate index vector and keep std::vector<TrackHit> unsorted.

Indeed the idea is to use a sorted container, and yes, binary search would be the preferred option as search algorithm. However even a linear search should be more efficient than the current implementation which is performing a linear search on an unsorted list.

My worry for the sorted vector is that I have to sort it completely each time I add a new node so I was wondering whether a std::set should be better suited for this task. However the map structure is very lightweight (integer key and pointer to mapped object) so also the sorting shouldn't be too expensive.

The size of the vector is proportional to the amount of particles passing the EMCAL (one hit per particle - also secondary - per EMCAL module).

Yes, you should somehow refactor your algorithm to perform all the insertions first, then sort (once) and then do the lookups. Alternatively, you could keep a fixed-size array in which you bin your hits using a lower-precision key. In case there is a conflict, you create a new fixed-size array and use that for the conflicting keys.
In case there are no conflicts (i.e. you only have unique lower-precision keys), you end up with a vector sorted by construction. In case there are conflicts, you end up with K sorted vectors, where K is the maximum number of conflicts you have for a given shorter key.
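The insert-first, sort-once, then-look-up pattern described above can be sketched as follows. This is only an illustration under simplifying assumptions: `TrackHitEntry`, `sortByTrack` and `findByTrack` are hypothetical names, and the entry holds a plain energy value rather than a pointer to a real hit object.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical lightweight entry: integer key plus payload.
struct TrackHitEntry {
  int trackID;
  double energy;
};

// Sort once, after all insertions have been done.
inline void sortByTrack(std::vector<TrackHitEntry>& hits)
{
  std::sort(hits.begin(), hits.end(),
            [](const TrackHitEntry& a, const TrackHitEntry& b) { return a.trackID < b.trackID; });
}

// Binary search on a vector that is already sorted by trackID.
// Returns nullptr if no entry with this track ID exists.
inline const TrackHitEntry* findByTrack(const std::vector<TrackHitEntry>& hits, int trackID)
{
  auto it = std::lower_bound(hits.begin(), hits.end(), trackID,
                             [](const TrackHitEntry& h, int id) { return h.trackID < id; });
  return (it != hits.end() && it->trackID == trackID) ? &*it : nullptr;
}
```

The lookup is only valid while the vector stays sorted, so this layout fits best when insertions and lookups happen in separate phases.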

The purpose is to have a lookup during the stepping, so I don't see how to perform all insert operations before sorting. As the set is used for search operations, it needs to grow over time. And as the number of entries is unknown a priori, a fixed-size array is also not possible.

The second proposal should be nothing else than a hashtable, right? That could work. I think std::unordered_map is the c++ stl implementation for it. I will give it a try.

Technically, std::unordered_map is implemented as a vector of lists, so it's actually slightly different from what I am proposing. Keep in mind that all these structures are optimised for insertion / removal, while you most likely want to insert once and look up many times, in a hopefully compact store.

What you propose is a hash table (table where access is handled by an index which is obtained by a hash function of a key - even if you explained it differently). And from what I read std::unordered_map implements a hash table even though the internal implementation is different. And the lookup complexity follows what is to be expected for hashtables.

Indeed, lookup operations are expected to be much more abundant than insert / remove operations.

I would prefer sticking to containers which are on the market and show good performance instead of implementing custom solutions. If std::unordered_map fulfills my needs, I see no reason to implement something myself. If you are aware of something better on the market, which is within our software stack, I am open for discussion.

Lookup complexity can be the same, but the constant factors involved are not the same and neither are the per-item overheads. This is real life, not an Algorithms 1 course. ;-) Of course the choice should be made based on a cost-benefit analysis; if you are happy with the unordered_map, it's for sure better than an ordered one.

We can keep the point in mind, and I will replace the unordered_map once I have something better. For the moment it is "good enough", and it should be more efficient than the version implemented in AliRoot.
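A minimal sketch of a hash-table lookup of the kind discussed in this thread, with one accumulated hit per track, is given below. `SimpleHit` and `HitAccumulator` are hypothetical names, not the classes in this pull request. The sketch also deviates from the description above in one respect: the map stores an index into the hit vector rather than a pointer, so that entries stay valid when the vector reallocates.

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified hit: not the O2 EMCAL Hit class.
struct SimpleHit {
  int trackID;
  double energy;
};

class HitAccumulator
{
 public:
  // Adds the energy loss to the hit belonging to this track, creating the
  // hit the first time the track is seen. Returns the index of the hit.
  std::size_t addEnergy(int trackID, double eloss)
  {
    auto entry = mIndexByTrack.find(trackID);
    if (entry == mIndexByTrack.end()) {
      mHits.push_back({trackID, eloss});
      const std::size_t index = mHits.size() - 1;
      mIndexByTrack.emplace(trackID, index);
      return index;
    }
    mHits[entry->second].energy += eloss;
    return entry->second;
  }

  const std::vector<SimpleHit>& hits() const { return mHits; }

 private:
  std::vector<SimpleHit> mHits;                        // compact hit storage
  std::unordered_map<int, std::size_t> mIndexByTrack;  // track ID -> index in mHits
};
```

Lookups and insertions are constant time on average, which matches the "lookup during stepping" use case where entries are added while the table is already being searched.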

- Remove typedefs for std::vector<int/double>
- Replace std::set by std::unordered_map for the lookup table in order to improve on speed

sawenzel · 2017-08-31T11:59:09Z

+  auto hitentry = mEventHits.find(trackID);
  if (hitentry != mEventHits.end()) {
-    myhit = hitentry->mHit;
+    myhit = hitentry->second;


@mfasDa : I believe this mechanism might actually not be necessary. The Geant engine will (I will confirm this) finish processing a track before it moves to another track. So the trackID will essentially stay the same until the engine switches to the next track. You can hence easily detect when a new track is seen. When this is the case, you stop accumulating the hit and start a new one. We use this mechanism in TPC: see Detector.cxx:312. In this case, it is not necessary to have this lookup structure.

Instead of a lookup table, store the index of the current track / cell and add a pointer to the current hit object.
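The idea of tracking only the current track / cell can be sketched as below. It relies on the assumption stated in the review comment (and flagged there as still to be confirmed) that all steps of one track are processed contiguously. `StepHit` and `CurrentHitTracker` are hypothetical names, and the sketch keeps an index to the current hit instead of a pointer so that growth of the vector cannot invalidate it.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified hit: not the O2 EMCAL Hit class.
struct StepHit {
  int trackID;
  int cellID;
  double energy;
};

class CurrentHitTracker
{
 public:
  // Called once per simulation step. Assumes that the steps of one track
  // arrive contiguously; a change of track or cell starts a new hit.
  void processStep(int trackID, int cellID, double eloss)
  {
    const bool newHit = !mHasCurrent || trackID != mCurrentTrack || cellID != mCurrentCell;
    if (newHit) {
      mHits.push_back({trackID, cellID, 0.});
      mCurrentIndex = mHits.size() - 1;
      mCurrentTrack = trackID;
      mCurrentCell = cellID;
      mHasCurrent = true;
    }
    mHits[mCurrentIndex].energy += eloss;
  }

  // Call at the start of each event.
  void reset()
  {
    mHits.clear();
    mHasCurrent = false;
  }

  const std::vector<StepHit>& hits() const { return mHits; }

 private:
  std::vector<StepHit> mHits;
  std::size_t mCurrentIndex = 0; // index of the hit currently being filled
  int mCurrentTrack = -1;
  int mCurrentCell = -1;
  bool mHasCurrent = false;
};
```

One consequence of this design is that a track which leaves a cell and later re-enters it produces two separate hits, whereas a lookup table keyed on track and cell would merge them.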

sawenzel · 2017-08-31T13:22:09Z

+    // - Inside different cell
+    // - First track of the event
+    std::cout << "New track / cell started\n";
+    TLorentzVector pos, mom;


A tiny optimization hint: this will create (construct) TLorentzVector again and again (I have seen this in the past). Consider making mPosCache and mMomCache private member variables of the detector.

Even less important (especially if you then use it as a data member of the class): consider using ROOT::Math::LorentzVector, most of the time it can be used as a drop-in replacement of TLorentzVector, it does not have the TObject overhead and you can use the float version for half the size in memory.

@mpuccio : How can this be used if the VMC interface expects TLorentzVector?

OK, could be done, although I normally prefer not to turn local variables into member variables. In the long term, in this case, maybe one could discuss with Ivana adding the same member functions with primitive output types (arrays of doubles) in order to stay minimalistic. All we need is the x and p vector; the functionality of TLorentzVector stays largely unused.
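The effect of moving the vectors into the class can be illustrated without ROOT by using a stand-in type that counts its constructions. Everything in this sketch is hypothetical: `FourVector` only plays the role of an expensive-to-construct object such as TLorentzVector, and `StepProcessor` stands in for the detector class; only the member names `mPosCache` and `mMomCache` come from the review comment.

```cpp
// Counts how many FourVector objects have been constructed so far.
inline int& constructionCount()
{
  static int count = 0;
  return count;
}

// Stand-in for an expensive-to-construct four-vector (not TLorentzVector).
struct FourVector {
  FourVector() { ++constructionCount(); }
  void set(double px, double py, double pz, double pt)
  {
    x = px;
    y = py;
    z = pz;
    t = pt;
  }
  double x = 0., y = 0., z = 0., t = 0.;
};

class StepProcessor // hypothetical stand-in for the detector class
{
 public:
  // Reuses the cached members on every step instead of constructing
  // fresh local objects. Returns the z-coordinate of the step.
  double processStep(double x, double y, double z, double t)
  {
    mPosCache.set(x, y, z, t);
    mMomCache.set(0., 0., 0., 0.); // a real implementation would fill the momentum here
    return mPosCache.z;
  }

 private:
  FourVector mPosCache; // constructed once, with the processor
  FourVector mMomCache;
};
```

With this layout the two vectors are constructed once per processor object, regardless of how many steps are processed.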

Make momentum vectors in ProcessHit class members in order to avoid construction of an expensive object multiple times

sawenzel · 2017-08-31T14:08:57Z

@mfasDa : Thanks for the changes.


Fix typedefs according to O2 naming convention

dc4e922

Refactorings requested in AliceO2Group#523

4c63a81

- Remove typedefs for std::vector<int/double>
- Replace std::set by std::unordered_map for the lookup table in order to improve on speed

Change handling of the current track

b4de487

Instead of a lookup table, store the index of the current track / cell and add a pointer to the current hit object.

Make momentum vectors in ProcessHit class members in order to avoid construction of an expensive object multiple times

6396842

sawenzel merged commit fc43839 into AliceO2Group:dev Aug 31, 2017

mfasDa deleted the feature-emcalhit branch August 31, 2017 14:38

mbroz84 pushed a commit to mbroz84/AliceO2 that referenced this pull request Mar 16, 2022

PWGMM: update Run 3 dndeta task (AliceO2Group#523)

e96a6ea

* Add tracking efficiency calculation using custom index * Improve aliases
