Framework/Core/ANALYSIS.md
Lines changed: 73 additions & 10 deletions
@@ -35,7 +35,7 @@ defineDataProcessing() {
35
35
36
36
> **Implementation details**: `AnalysisTask` is simply a `struct`. Since `struct` default inheritance policy is `public`, we can omit specifying it when declaring MyTask.
37
37
>
38
-
> `AnalysisTask` will not actually provide any virtual method, as the `adaptAnalysis` helper relyes on template argument matching to discover the properties of the task. It will come clear in the next paragraph how this allow is used to avoid the proliferation of data subscription methods.
38
+
> `AnalysisTask` will not actually provide any virtual method, as the `adaptAnalysis` helper relies on template argument matching to discover the properties of the task. It will become clear in the next paragraph how this is used to avoid the proliferation of data subscription methods.
39
39
40
40
## Processing data
41
41
@@ -55,7 +55,7 @@ struct MyTask : AnalysisTask {
55
55
};
56
56
```
57
57
58
-
will allow you to get a per timeframe collection of tracks. You can then iterate on the tracks using the syntax:
58
+
will allow you to get a per time frame collection of tracks. You can then iterate on the tracks using the syntax:
59
59
60
60
```cpp
61
61
for (auto &track : tracks) {
@@ -77,7 +77,7 @@ This has the advantage that you might be able to benefit from vectorization / pa
77
77
78
78
> **Implementation notes**: as mentioned before, the arguments of the process method are inspected using template argument matching. This way the system knows at compile time what data types are requested by a given `process` method and can create the relevant DPL data descriptions.
79
79
>
80
-
> The distiction between `Tracks` and `Track` above is simply that one refers to the whole collection, while the second is an alias to `Tracks::iterator`. Notice that we assume that each collection is of type `o2::soa::Table` which carries metadata about the dataOrigin and dataDescription to be used by DPL to subscribe to the associated data stream.
80
+
> The distinction between `Tracks` and `Track` above is simply that one refers to the whole collection, while the second is an alias to `Tracks::iterator`. Notice that we assume that each collection is of type `o2::soa::Table` which carries meta data about the dataOrigin and dataDescription to be used by DPL to subscribe to the associated data stream.
the above will be called once per collision found in the timeframe, and `tracks` will allow you to iterate on all the tracks associated to the given collision.
92
+
the above will be called once per collision found in the time frame, and `tracks` will allow you to iterate on all the tracks associated to the given collision.
93
93
94
94
Alternatively, you might not require to have all the tracks at once and you could do with:
the `etaphi` object is a functor that will effectively act as a cursor which allows to populate the `EtaPhi` table. Each invokation of the functor will create a new row in the table, using the arguments as contents of the given column. By default the arguments must be given in order, but one can give them in any order by using the correct column type. E.g. in the example above:
179
+
the `etaphi` object is a functor that effectively acts as a cursor which allows you to populate the `EtaPhi` table. Each invocation of the functor will create a new row in the table, using the arguments as contents of the given column. By default the arguments must be given in order, but one can give them in any order by using the correct column type. E.g. in the example above:
DECLARE_SOA_TABLE(Point, "MISC", "POINT", X, Y, (R2<X,Y>));
201
201
```
202
202
203
-
Notice how the dynamic column is defined as a standalone column and binds to X and Y
203
+
Notice how the dynamic column is defined as a stand alone column and binds to X and Y
204
204
only when you attach it as part of a table.
205
205
206
-
### Executing a finalisation method, post run
206
+
### Executing a finalization method, post run
207
207
208
208
Sometimes it's handy to perform an action when all the data has been processed, for example executing a fit on a histogram we filled during the processing. This can be done by implementing the postRun method.
To get combinations of distinct tracks, helper functions from `ASoAHelpers.h` can be used. Presently, there are 3 combinations policies available: strictly upper, upper and full.
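As a rough illustration of the three policies, the sketch below enumerates index pairs over a collection of `n` elements. It does not use `ASoAHelpers.h`; the `PairPolicy` enum and the `indexPairs` function are hypothetical names, and the mapping assumed here is: strictly upper yields pairs with `i < j`, upper yields pairs with `i <= j`, and full yields all pairs.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical policy tag for illustration only; not part of ASoAHelpers.h.
enum class PairPolicy { StrictlyUpper, Upper, Full };

// Enumerate index pairs (i, j) over a collection of n elements.
// Assumed meaning: StrictlyUpper -> i < j, Upper -> i <= j, Full -> all pairs.
std::vector<std::pair<std::size_t, std::size_t>> indexPairs(std::size_t n, PairPolicy policy)
{
  std::vector<std::pair<std::size_t, std::size_t>> pairs;
  for (std::size_t i = 0; i < n; ++i) {
    // The policy only changes where the inner loop starts.
    std::size_t first = 0; // Full
    if (policy == PairPolicy::Upper) {
      first = i;
    } else if (policy == PairPolicy::StrictlyUpper) {
      first = i + 1;
    }
    for (std::size_t j = first; j < n; ++j) {
      pairs.emplace_back(i, j);
    }
  }
  return pairs;
}
```

Under these assumptions, three tracks give 3 strictly upper pairs, 6 upper pairs and 9 full pairs, so the strictly upper policy is the one that never pairs a track with itself and never repeats a pair in swapped order.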
337
337
338
-
The number of elements in a combination is deduced from the number of arguments pased to `combinations()` call. For example, to get pairs of tracks from the same source, one must specify `tracks` table twice:
338
+
The number of elements in a combination is deduced from the number of arguments passed to the `combinations()` call. For example, to get pairs of tracks from the same source, one must specify the `tracks` table twice:
It will be possible to specify a filter for a combination as a whole, and only matching combinations will be then outputed. Currently, the filter is applied to each element separately. Note that for filter version the input tables are mentioned twice, both in policy constructor and in `combinations()` call itself.
365
+
It will be possible to specify a filter for a combination as a whole, and only matching combinations will then be output. Currently, the filter is applied to each element separately. Note that for the filter version the input tables are mentioned twice, both in the policy constructor and in the `combinations()` call itself.
366
366
367
367
```cpp
368
368
struct MyTask : AnalysisTask {
@@ -384,6 +384,69 @@ combinations(tracks, tracks); // equivalent to combinations(CombinationsStrictly
Produced tables can be saved to file as TTrees. This process is customized by the command line option `--keep` (of the internal-dpl-AOD-writer). **Please be aware that the format of the `keep` option as described here is preliminary and might be changed in the future.**
390
+
391
+
`keep` is a comma-separated list of `DataOutputDescriptions`.
392
+
393
+
`keep`
394
+
```csh
395
+
DataOutputDescription1,DataOutputDescription2, ...
396
+
```
397
+
398
+
Each `DataOutputDescription` is a colon-separated list of 4 items
399
+
400
+
`DataOutputDescription`
401
+
```csh
402
+
table:tree:columns:file
403
+
```
404
+
and instructs the internal-dpl-AOD-writer to save the columns `columns` of table `table` as TTree `tree` into files `file_x.root`, where `x` is an incremental number. The selected columns are saved as separate TBranches of TTree `tree`.
405
+
406
+
By default `x` is incremented with every time frame. This behavior can be modified with the command line option `--ntfmerge`. The value of `ntfmerge` specifies the number of time frames to merge into one file.
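The effect of `ntfmerge` on the file number can be sketched as an integer division. This is only an illustration: the `outputFileName` helper is hypothetical, and it assumes that both the time frame counter and `x` start at 0, which the text above does not state.

```cpp
#include <string>

// Hypothetical helper, not part of the AOD writer.
// Assumption: time frames and the file number x are both counted from 0.
std::string outputFileName(const std::string& file, unsigned timeFrame, unsigned ntfmerge = 1)
{
  if (ntfmerge == 0) {
    ntfmerge = 1; // guard against division by zero; treated as "no merging" in this sketch
  }
  // ntfmerge consecutive time frames share the same file number.
  return file + "_" + std::to_string(timeFrame / ntfmerge) + ".root";
}
```

With `ntfmerge` set to 50, time frames 0 to 49 would go to `myskim_0.root` and time frame 50 would start `myskim_1.root`.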
407
+
408
+
The first item of a `DataOutputDescription` is mandatory and needs to be specified, otherwise the `DataOutputDescription` is ignored. The other three items are optional and are filled with default values if missing.
409
+
410
+
The format of `table` is
411
+
412
+
`table`
413
+
```csh
414
+
AOD/tablename/0
415
+
```
416
+
`tablename` is the name of the table as defined in the workflow definition.
417
+
418
+
`tree` is a simple string which names the TTree the table will be saved to. If `tree` is not specified then `tablename` will be used as the TTree name.
419
+
420
+
`columns` is a slash(/)-separated list of column names, e.g.
421
+
422
+
`columns`
423
+
```csh
424
+
col1/col2/col3
425
+
```
426
+
The column names are expected to match column names of table `tablename` as defined in the respective workflow. Non-matching columns are ignored. The selected table columns are saved as separate TBranches with the same names as the corresponding table columns. If `columns` is not specified then all table columns will be saved.
427
+
428
+
Finally, `file` specifies the base name of the files the tables are saved to. The actual file names are composed as `file_x.root`, where `x` is an incremental number. If `file` is not specified the default file name is used. The default file name can be set with the command line option `--res-file`. If `res-file` is missing, the default file name is set to `AnalysisResults`.
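The defaulting rules described above can be summarized in a small parser. This is a minimal sketch and not the code used by the internal-dpl-AOD-writer: `KeepItem`, `split` and `parseKeep` are hypothetical names, and the sketch assumes that `tablename` is the second `/`-separated field of `table`.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical holder for one parsed description; not the O2 class.
struct KeepItem {
  std::string table;
  std::string tree;
  std::vector<std::string> columns; // empty means "all columns"
  std::string file;
};

// Split text on sep; empty fields between separators are kept.
std::vector<std::string> split(const std::string& text, char sep)
{
  std::vector<std::string> parts;
  std::string part;
  std::istringstream stream(text);
  while (std::getline(stream, part, sep)) {
    parts.push_back(part);
  }
  return parts;
}

// Parse a keep string of the form table:tree:columns:file,table:tree:columns:file,...
std::vector<KeepItem> parseKeep(const std::string& keep, const std::string& resFile = "AnalysisResults")
{
  std::vector<KeepItem> items;
  for (const auto& description : split(keep, ',')) {
    const auto fields = split(description, ':');
    if (fields.empty() || fields[0].empty()) {
      continue; // table is mandatory, the description is ignored without it
    }
    KeepItem item;
    item.table = fields[0];
    // Assumption: table looks like AOD/tablename/0, so tablename is field 1.
    const auto tableParts = split(item.table, '/');
    const std::string tablename = tableParts.size() > 1 ? tableParts[1] : item.table;
    item.tree = (fields.size() > 1 && !fields[1].empty()) ? fields[1] : tablename;
    if (fields.size() > 2) {
      for (const auto& column : split(fields[2], '/')) {
        if (!column.empty()) {
          item.columns.push_back(column);
        }
      }
    }
    item.file = (fields.size() > 3 && !fields[3].empty()) ? fields[3] : resFile;
    items.push_back(item);
  }
  return items;
}
```

For instance, `parseKeep("AOD/UNO/0::c2/c4:unoresults")` yields one item with tree `UNO`, columns `c2` and `c4`, and file `unoresults`, while `parseKeep("AOD/UNO/0")` falls back to tree `UNO`, all columns, and file `AnalysisResults`.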
429
+
430
+
#### Valid example command line options
431
+
432
+
```csh
433
+
--keep AOD/UNO/0
434
+
# save all columns of table 'UNO' to TTree 'UNO' in files 'AnalysisResults'_x.root
435
+
436
+
--keep AOD/UNO/0::c2/c4:unoresults
437
+
# save columns 'c2' and 'c4' of table 'UNO' to TTree 'UNO' in files 'unoresults'_x.root
# save columns 'c1' and 'c2' of table 'UNO' to TTree 'trsel1' in files 'myskim'_x.root and
441
+
# save columns 'c6', 'c7' and 'c8' of table 'DUE' to TTree 'trsel2' in files 'myskim'_x.root.
442
+
# Merge 50 time frames in each file.
443
+
444
+
```
445
+
446
+
#### Limitations
447
+
448
+
If the provided `--keep` option contains two `DataOutputDescriptions` with an equal combination of `tree` and `file` then the processing will be stopped! It is not possible to save two trees with equal name to a given file.
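A check for this condition could look like the sketch below. It is only an illustration of the rule, not the writer's implementation: `hasDuplicateTarget` is a hypothetical name, and each entry is assumed to be a `(tree, file)` pair taken from an already parsed description.

```cpp
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical check: returns true if any (tree, file) combination occurs more than once.
bool hasDuplicateTarget(const std::vector<std::pair<std::string, std::string>>& targets)
{
  std::set<std::pair<std::string, std::string>> seen;
  for (const auto& target : targets) {
    // insert() reports via .second whether the pair was new.
    if (!seen.insert(target).second) {
      return true;
    }
  }
  return false;
}
```

Two different trees in the same file, such as `trsel1` and `trsel2` in `myskim`, pass this check; only a repeated tree name within the same file is rejected.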
449
+
387
450
### Possible ideas
388
451
389
452
We could add a template `<typename C...> reshuffle()` method to the Table class which allows you to reduce the number of columns or attach new dynamic columns. A template wrapper could