Proposed new feature or change:
We can improve the performance of reductions like np.sum(), which are currently much more expensive than normal ufunc operations:
import numpy as np
x = np.array([1., 2., 3.])
%timeit np.abs(x)
%timeit np.sum(x)
Results in:
288 ns ± 0.701 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
1.48 μs ± 2.17 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
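To reproduce the comparison outside IPython, here is a minimal sketch using the standard-library `timeit` module. The helper `time_per_call` and the `number`/`repeat` values are illustrative choices, not part of the original measurement. The extra `np.add.reduce` timing is also an addition: `np.sum` dispatches to it, so it gives a rough idea of how much of the cost is the Python-level wrapper rather than the reduction itself.

```python
import timeit

import numpy as np


def time_per_call(func, arg, number=20_000, repeat=5):
    """Return the best observed wall-clock time per call, in seconds.

    Hypothetical helper for illustration; `number` and `repeat` are
    arbitrary and can be raised for more stable results.
    """
    timer = timeit.Timer(lambda: func(arg))
    return min(timer.repeat(repeat=repeat, number=number)) / number


x = np.array([1., 2., 3.])

t_abs = time_per_call(np.abs, x)                 # plain element-wise ufunc call
t_sum = time_per_call(np.sum, x)                 # reduction via the np.sum wrapper
t_add_reduce = time_per_call(np.add.reduce, x)   # the underlying ufunc reduction

print(f"np.abs:        {t_abs * 1e9:8.0f} ns per call")
print(f"np.sum:        {t_sum * 1e9:8.0f} ns per call")
print(f"np.add.reduce: {t_add_reduce * 1e9:8.0f} ns per call")
```

Absolute numbers depend on the machine and the NumPy version, and the lambda adds a small constant overhead to every measurement, so only the relative difference between the three lines is meaningful.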
Some prototyping shows we can improve performance by a factor of 2 (at least for the contiguous case); see main...eendebakpt:numpy:ufunc_reduction_performance.