Fp16 print fix#2784
Conversation
| !isHalfSupported(getActiveDeviceId())) { | ||
| AF_ERROR("Half precision not supported", AF_ERR_NO_HALF); | ||
| } | ||
| } |
There was a problem hiding this comment.
I think we should move this to src/backend/common or src/api/c since this can be used from other locations too.
Also, Don't we need similar checks in create*Array helpers in others backends too ?
There was a problem hiding this comment.
This is already done in the OpenCL backend. It was not done in this backend because I intended to support half on older hardware and forgot to put it back in once I abandoned that idea.
There was a problem hiding this comment.
In that case, lets try to move this check into src/api/c/ a more common location is perhaps af_create_array ?
There was a problem hiding this comment.
lots of functions create arrays. for example randu. I think implementing it in the common namespace is sufficient.
There was a problem hiding this comment.
Whichever location makes sense most.
WilliamTambellini
left a comment
There was a problem hiding this comment.
Warning: this could easily kill perf for users creating thousand of small arrays per secs : have you checked the impact of checkTypeSupported on perf ?
Example:
- in a for loop, just create arrays of random shapes (to bypass the GC buffer, or turn off GC)
- track the time to do all these creations
- compare with and without this checkTypeSupported
?
I am not sure if these checks are avoided for fp16 for the reasons @WilliamTambellini mentioned. Having said that, some sort of check is required at some location I would think. @umar456 Can you please confirm if this is intentional or if check is done else where ? |
|
I don't think this check is going to be very expensive but it may be a good idea to memoize the result of the call for each device. |
| } | ||
|
|
||
| bool isHalfSupported(int device) { | ||
| auto prop = getDeviceProp(device); |
There was a problem hiding this comment.
Isn't getDeviceProp inexpensive ?
There was a problem hiding this comment.
Yes but this will be slightly faster.
Adds missing print in array_to_string.
Also checks if half type is supported during array creation in cuda backend.