[io] Improve checks when reading arrays with TBufferFile#21814
[io] Improve checks when reading arrays with TBufferFile#21814dpiparo merged 3 commits intoroot-project:masterfrom
Conversation
| @@ -268,6 +269,11 @@ class TBufferFile : public TBufferIO { | |||
|
|
|||
| //---------------------- TBufferFile inlines --------------------------------------- | |||
|
|
|||
| //______________________________________________________________________________ | |||
| inline bool TBufferFile::IsBadCollLen(Int_t len, Int_t n) const { | |||
| return n <= 0 || len <=0 || (size_t)len > (size_t)(fBufMax - fBufCur); | |||
There was a problem hiding this comment.
Side note: the value zero is a totally valid (not bad) value that just happens to be treated the same way as the invalid values (i.e. read nothing for the array)
There was a problem hiding this comment.
Do you have a better name in mind? Maybe ShouldReadColl?
| @@ -818,7 +822,7 @@ Int_t TBufferFile::ReadArray(Int_t *&ii) | |||
| *this >> n; | |||
| Int_t l = sizeof(Int_t)*n; | |||
|
|
|||
| if (l <= 0 || l > fBufSize) return 0; | |||
| if (IsBadCollLen(l, n)) return 0; | |||
There was a problem hiding this comment.
| if (IsBadCollLen(l, n)) return 0; | |
| if (!n || ExceedsBufferCapacity<Int_t>(n)) return 0; |
and remove the variable l entirely.
If preferring not template:
| if (IsBadCollLen(l, n)) return 0; | |
| if (!n || ExceedsBufferCapacity(n, sizeof(Int_t)) return 0; |
There was a problem hiding this comment.
Are you suggesting to change the checks previously in place?
Suppose l is dropped. What would happen to the case where n is positive, but l, equal to n*sizeof(type) is negative?
There was a problem hiding this comment.
but
l, equal ton*sizeof(type)is negative?
This is technically a spurious/redundant test. l is not used and if n*sizeof(type) is larger than 31bits then it will also fail the (n * sizeof(type)) > (size_t)(fBufMax - fBufCur) part of the test.
That said if the test was desired it could still be done equally well by calculating l inside the function ...
There was a problem hiding this comment.
(retracted while reviewing essential details)(side note sizeof(Int_t)==size_on_file(Int_t) but sizeof(Long_t) != size_on_file(Long_t)
Test Results 22 files 22 suites 3d 10h 57m 27s ⏱️ For more details on these failures, see this check. Results for commit e4ffd90. ♻️ This comment has been updated with latest results. |
|
The comments above are relevant, and easy to address. Thanks. I think that the tests failures are due to an issue of the file, which was never detected before. This simple program, that can be run as a macro too, reveals it: #include "TFile.h"
#include "TTree.h"
/*
Invalid read happens at
- file "root://eospublic.cern.ch//eos/root-eos/testfiles/atlasFlushed.root"
- tree "CollectionTree"
- branch m_TrigMuonEFInfoContainers.vector<TPObjRef>
- basket 23
#0 0x00007ffff448d02c in __pthread_kill_implementation () from /lib64/libc.so.6
#1 0x00007ffff443fb86 in raise () from /lib64/libc.so.6
#2 0x00007ffff4429873 in abort () from /lib64/libc.so.6
#3 0x00007ffff48a1b21 in __gnu_cxx::__verbose_terminate_handler() [clone .cold] () from /lib64/libstdc++.so.6
#4 0x00007ffff48ad53c in __cxxabiv1::__terminate(void (*)()) () from /lib64/libstdc++.so.6
#5 0x00007ffff48ad5a7 in std::terminate() () from /lib64/libstdc++.so.6
#6 0x00007ffff48ad809 in __cxa_throw () from /lib64/libstdc++.so.6
#7 0x00007ffff72c639b in TBufferFile::IsBadCollLen (this=0x9112330, len=256, n=64) at /foo/root/io/io/src/TBufferFile.cxx:68
Here: len is 256, fBufMax 152116592, fBufCur 152116338, fBufMax-fBufCur = 254: two extra bytes
#8 0x00007ffff72bbcea in TBufferFile::ReadArray (this=0x9112330, ii=@0x90e3358: 0x0) at /foo/root/io/io/src/TBufferFile.cxx:839
#9 0x00007ffff634314d in TBasket::ReadBasketBuffers (this=0x90e3280, pos=298023188, len=3586, file=0x1723640) at /foo/root/tree/tree/src/TBasket.cxx:671
#10 0x00007ffff6352a21 in TBranch::GetBasketImpl (this=0x31aec30, basketnumber=23, user_buffer=0x0) at /foo/root/tree/tree/src/TBranch.cxx:1267
#11 0x00007ffff6353198 in TBranch::GetBasketAndFirst (this=0x31aec30, basket=@0x7fffffffcc80: 0x0, first=@0x7fffffffcc78: 1100, user_buffer=0x0)
at /foo/root/tree/tree/src/TBranch.cxx:1389
#12 0x00007ffff6353fb0 in TBranch::GetEntry (this=0x31aec30, entry=1100, getall=0) at /foo/root/tree/tree/src/TBranch.cxx:1715
#13 0x00007ffff6366ec4 in TBranchElement::GetEntry (this=0x31aec30, entry=1100, getall=0) at /foo/root/tree/tree/src/TBranchElement.cxx:2826
#14 0x00007ffff6366a34 in TBranchElement::GetEntry (this=0x31ae610, entry=1100, getall=0) at /foo/root/tree/tree/src/TBranchElement.cxx:2761
#15 0x00007ffff6366a34 in TBranchElement::GetEntry (this=0x31adfd0, entry=1100, getall=0) at /foo/root/tree/tree/src/TBranchElement.cxx:2761
#16 0x00007ffff63e4347 in operator() (__closure=0x7fffffffd2b0) at /foo/root/tree/tree/src/TTree.cxx:5741
#17 0x00007ffff63e44d5 in TTree::GetEntry (this=0x13dd370, entry=1100, getall=0) at /foo/root/tree/tree/src/TTree.cxx:5819
#18 0x00000000004021e0 in main () at repro.cpp:9
(gdb) f 10
#10 0x00007ffff6352a21 in TBranch::GetBasketImpl (this=0x31aec30, basketnumber=23, user_buffer=0x0) at /foo/root/tree/tree/src/TBranch.cxx:1267
*/
int repro()
{
auto f = std::unique_ptr<TFile>{TFile::Open("root://eospublic.cern.ch//eos/root-eos/testfiles/atlasFlushed.root")};
auto t = f->Get<TTree>("CollectionTree");
// not strictly necessary, but just to illustrate the matter
t->SetBranchStatus("*",0);
t->SetBranchStatus("m_TrigMuonEFInfoContainers.vector<TPObjRef>", 1);
t->GetEntry(1100); // Invalid Read!
return 0;
}
int main(){return repro();} |
6cbd362 to
e50cef7
Compare
|
The problem is with the management of the buffer length in |
| // that does not present corruption | ||
| const int efirst = 1101; //original value 0; | ||
| const int elast = 2999; //original value efirst+nentries; | ||
| if (cachesize == -2) { |
There was a problem hiding this comment.
These changes (thankfully not needed thanks to fixing the underlying issue) are changing the nature of this test. In this case a better solution would have been to disabled the branch triggering an issue.
There was a problem hiding this comment.
You are right. They were there assuming the file had issues and to check if other problems were present thanks to the CI.
Because, under certain circumstances, BufMax and fBufSize got out-of-sync Co-Authored-By: Philippe Canal <pcanal@fnal.gov>
i.e. in the TBufferFile::Read*Array* methods.
Previously, the collection length was checked to be positive
and that it was less than the total buffer size.
Those checks have been improved with a battery of 3 checks:
1. The collection's number of elements is checked to be positive
2. The collection length is checked to be positive
3. The collection length is checked to be less than the portion
of the buffer that remains to be read
The test roottest-root-io-prefetching-run had to be modified as well
because of what seems to be a corrupted input file.
thanks also to @Sebasteuo
|
/backport to 6.38, 6.36, 6.32, 6.30, 6.28, 6.26 |
|
/backport to 6.36, 6.32, 6.30, 6.28, 6.26 |
|
/backport to 6.32 |
|
/backport to 6.30 |
|
/backport to 6.28 |
|
/backport to 6.26 |
| inline bool TBufferFile::ShouldNotReadCollection(Int_t lengthInBytes, Int_t nElements) const | ||
| { | ||
| // Three cases here in which we should not read the collection: | ||
| // 1. The collection has zero or a negative number of elements or | ||
| // 2. The length in bytes of the collection is zero or negative | ||
| // 3. The remaining part of the buffer is too short to read the collection | ||
| return nElements <= 0 || lengthInBytes <= 0 || (size_t)lengthInBytes > (size_t)(fBufMax - fBufCur); | ||
| } |
There was a problem hiding this comment.
| inline bool TBufferFile::ShouldNotReadCollection(Int_t lengthInBytes, Int_t nElements) const | |
| { | |
| // Three cases here in which we should not read the collection: | |
| // 1. The collection has zero or a negative number of elements or | |
| // 2. The length in bytes of the collection is zero or negative | |
| // 3. The remaining part of the buffer is too short to read the collection | |
| return nElements <= 0 || lengthInBytes <= 0 || (size_t)lengthInBytes > (size_t)(fBufMax - fBufCur); | |
| } | |
| inline bool TBufferFile::ShouldNotReadCollection(size_t lengthInBytes, Int_t nElements) const | |
| { | |
| // Three cases here in which we should not read the collection: | |
| // 1. The collection has zero or a negative number of elements or | |
| // 2. The length in bytes of the collection is zero or negative | |
| // 3. The remaining part of the buffer is too short to read the collection | |
| return nElements <= 0 || lengthInBytes > (size_t)(fBufMax - fBufCur); | |
| } |
There was a problem hiding this comment.
This requires the call site to become:
if (ShouldNotReadCollection(n*size_on_file(type), n)) return 0;
or
if (ShouldNotReadCollection(size_on_file(type), n)) return 0;
and do the multiplication inline.
No description provided.