Updated format for transport of TPCNativeCluster#3663
Conversation
|
I found a bug in the copying of the MCLabels on the sender side, but seems there is one more problem reading the labels. |
957ff9b to
90ead9c
Compare
|
Just updated the PR with a first working version. Have tested it including the ITTPS matching Also without the The ca clusters are copied to the output message right now, this can be changed in a follow up taking the relevant code from #3552 . No copy in the subscribing processes if the ca-clusterer is used. I'm starting now to clean-up. Have to check how to get the writing and reading of the new transport format can be handled with the RootFileWriter. Changing the configuration of the writer is probably trivial, the reader needs to be adjusted to support reading from both sector branches and from the single branch. |
|
@matthiasrichter: a small difference with or w/o But, when running TPC reco with MC ON, I see an error in the RootTreeWriter: Apart from that, cluster loading in the when running with the raw input (with TPC reco using --ca-clusterer) and when running with Doing the same with external clusterer: works fine |
|
thanks, yes, only the TPCITS matching was expected to work, I was not clear in my post. |
90ead9c to
f6a1539
Compare
|
Hi @shahor02 , @davidrohr , next iteration is pushed, now the writing/reading from/to file is tested with the TPC reco workflow, e.g. native cluster file produced with and without CA clusterer as input to the CA tracker. Also the TPC interpolation workflow is adjusted. Please check with the reader in the TPCITS matching and TPC interpolation workflows. |
|
I had only a quick look. Most cases seem to work. I am not so sure about the below one, seem the track interpolation doesn't run in this case: Will have a closer look later today. |
|
For me all modes work, including the one mentioned by @davidrohr (both with and w/o disable-writer in the TPC w/flow). |
…kingWorkflows In the future, we can drop the logic for checking valid and active sectors, this is done by the completion policy. We keep it as an extra check during the heavy development phase.
f6a1539 to
a74e400
Compare
|
Thanks @shahor02 for checking. The PR is now cleaned up, so we can merge when the CI is ready. |
The format is produced by the CA clusterer and consists of an index
header ClusterCountIndex with cluster counts per {sector,row} for the
full TPC followed by a linear buffer of ClusterNative objects. The
offsets within the linear buffer are derived from the cluster counts.
The MCLabelContainer corresponds to the sequence of clusters.
Main changes:
- Reworking HardwareClusterDecoder to use cluster index
- Adjusting ClusterNativeHelper to read updated ClusterNative transport format
Further notes:
Still the helper can read the cluster index from a list of inputs, but if
only one input is provided (data and optional MC labels), the index is set directly
from the single buffer without copy. If multiple inputs are available a copy is needed
to create linear buffers for clusters and labels
In the updated format, a full sector or even the full TPC is sent so we do not need the
vector of MCLabelContainers to transport all the MCLabelContainers for the padrows
Sending out the updated transport format which will allow to transport the data between devices without copy. For the moment, the CA tracker copies the data into the output message, but the GPU worker will be changed to write directly to the provided buffer. Further improvements to the input handling in the CA tracker spec: - Get rid of the fixed arrays for collecting the input objects - Adjusting cluster completeness check in the TPC CA tracker to use the sector bitmask from the sector header The check can be dropped in the future as it is the task of the completion policy to provide complete data sets. We keep it as a check for a while during the hot development phase.
With the updated transport format for ClusterNative data, the full TPC data is shipped in one block and also written to a single branch. Adjusting writer for native clusters to use one single MCLabelContainer Only one block of clusters and labels is produced by the CA clusterer and the writer is configured accordingly. Added a runtime check to the PublisherSpec to detect whether a fill is using Full or Sector mode. If only one branch of the configured branch name is found without the sector number suffix, the publisher switches to SectorMode::Full. We still need to define all output routes because the dynamic adjustment is at a later stage than the definition of routes. The full block will be published on one route, while zero-length data is created on all the other routes. By this we ensure that a completion policy waiting for input on all data can fire.
a74e400 to
13bc183
Compare
Allocating the buffer for cluster output before calling the GPU worker. It can be filled directly from the worker instead of copying after the worker.
| // a byte size resizable vector object, the DataAllocator returns reference to internal object | ||
| // initialize optional pointer to the vector object | ||
| using ClusterOutputChunkType = std::decay_t<decltype(pc.outputs().make<std::vector<char>>(Output{"", "", 0}))>; | ||
| ClusterOutputChunkType* clusterOutput = nullptr; |
There was a problem hiding this comment.
@davidrohr , providing the buffer for cluster output now before calling the GPU worker, so you can use it later to directly write to the vector. It has zero length now, although created there.
davidrohr
left a comment
There was a problem hiding this comment.
Tried all combinations I could think of and in the newest version they seem to work.
|
trying to find out what's the problem with the checkcode CI but I can't find a hint that the problem is related. |
|
For some reason it looks for non-existent directory, but not clear why (perhaps @aphecetche can suggest): |
o2checkcode is misbehaving on #3648 as well. I am pretty sure that o2checkcode completed OK earlier today after a new push. It was a dummy commit created with the sole intent of triggering checks. However, it is now showing as failed with signs of a very broken build, as this PR does. |
This PR is sketching the work for moving to an updated transport format as it is produced by the CAclusterer. The format consists of a header with the index of cluster counts for the full TPC and one monolithic buffer for all cluster objects. The cluster offsets per {sector,row} coordinates corresponds to the accumulated cluster counts. The preliminary version seems to work when MC labels are disabled.
The main changes are:
the ClusterDecoderRaw directly writes the updated format
ClusterNativeHelper can read the updated format
handling of the MC labels, this is a bit convoluted with multiple copies, needs to be sorted out
set ClusterNativeAccess from monolithic buffer without copy
publish the CA clusterer output in the updated format
test with subsequent workflows (ITS-TPC matching, TPC interpolation)
make the writing and reading of ClusterNative files working for all cases