ln_python 10th polynomial benchmark variant #52

@pigay

Hi,

I would like to contribute a variant of the ln_python benchmark.

Instead of using numpy slices, we access the array elements directly inside a loop. This is of course terribly slow in interpreted Python, but it seems to be faster once compiled, for example with HOPE or numba. In my tests, it is noticeably faster than the numpy slice version.

In the following code, I had to add an N parameter because HOPE doesn't support the np.ndarray.size attribute. I also use Horner's scheme for the polynomial evaluation.

def loop_ln_python(X, Y, N):
    for i in xrange(N):
        xm1 = X[i] - 1.
        tmp = xm1 * (1./9.) - 1./8.
        tmp = tmp * xm1 + 1./7.
        tmp = tmp * xm1 - 1./6.
        tmp = tmp * xm1 + 1./5.
        tmp = tmp * xm1 - 1./4.
        tmp = tmp * xm1 + 1./3.
        tmp = tmp * xm1 - 1./2.
        tmp = tmp * xm1 + 1.
        Y[i] = tmp * xm1



loop_ln_hope = hope.jit(loop_ln_python)
loop_ln_numba = numba.autojit(loop_ln_python)
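As a sanity check on the Horner form, here is a minimal pure-Python sketch (standard library only, no HOPE or numba) that evaluates the same 9-term series for a single value and compares it with the expanded sum and with math.log. The names COEFFS, ln_horner and ln_naive are made up for this illustration and are not part of the benchmark.

```python
import math

# Taylor coefficients of ln(x) around x = 1: +1/1, -1/2, +1/3, ..., +1/9
COEFFS = [(-1.0) ** (k + 1) / k for k in range(1, 10)]

def ln_horner(x):
    """Evaluate the truncated series with Horner's scheme (hypothetical helper)."""
    xm1 = x - 1.0
    acc = 0.0
    for c in reversed(COEFFS):      # start from the highest-order coefficient
        acc = acc * xm1 + c         # one multiply-add per coefficient
    return acc * xm1                # the series has no constant term

def ln_naive(x):
    """Evaluate the same series term by term, with explicit powers."""
    xm1 = x - 1.0
    return sum(c * xm1 ** k for k, c in enumerate(COEFFS, start=1))

if __name__ == "__main__":
    for x in (0.9, 1.0, 1.1):
        print(x, ln_horner(x), ln_naive(x), math.log(x))
```

Both forms agree up to rounding, and the truncated series is only close to math.log near x = 1.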

I test with the following:

X = np.random.random(10000).astype(np.float64)
Y = np.ones_like(X)

loop_ln_hope(X, Y, len(X))
loop_ln_numba(X, Y, len(X))

%timeit loop_ln_python(X, Y, len(X))
%timeit loop_ln_hope(X, Y, len(X))
%timeit loop_ln_numba(X, Y, len(X))

On my laptop, I get the following timings (pure Python, HOPE, numba, in the order of the calls above):

100 loops, best of 3: 15.1 ms per loop
10000 loops, best of 3: 46.1 µs per loop
100000 loops, best of 3: 12.7 µs per loop

For comparison, here are the ln_python_exp() timings:

1000 loops, best of 3: 257 µs per loop
10000 loops, best of 3: 76.8 µs per loop
10000 loops, best of 3: 110 µs per loop
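The two sets of timings can be turned into ratios with a few lines of Python. This is only a sketch of the arithmetic: the numbers are copied from above, and it assumes that the ln_python_exp() timings are listed in the same order (pure Python, HOPE, numba) as the loop timings, which the post does not state explicitly.

```python
# Timings copied from the post, converted to microseconds (1 ms = 1000 µs).
loop_us = {"python": 15100.0, "hope": 46.1, "numba": 12.7}

# ASSUMPTION: same ordering (python, hope, numba) for the slice-based benchmark.
slice_us = {"python": 257.0, "hope": 76.8, "numba": 110.0}

# A ratio above 1 means the loop variant is faster than the slice variant.
ratios = {name: slice_us[name] / loop_us[name] for name in loop_us}

if __name__ == "__main__":
    for name, ratio in ratios.items():
        print(f"{name}: slice time / loop time = {ratio:.2f}")
```

Under that assumption, the loop variant wins with both compilers and loses badly in interpreted Python, which matches the description at the top of the post.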

I hope this will be of some interest to you.

For the sake of completeness, we can also write a Horner-scheme version of the numpy code. It doesn't add much to the compiled performance, but it is a fairer baseline for interpreted numpy code:

def ln_python_horner(X, Y):
    Xm1 = X - 1
    Y[:] = Xm1 / 9. - 1./8.
    Y[:] = Y * Xm1 + 1./7.
    Y[:] = Y * Xm1 - 1./6.
    Y[:] = Y * Xm1 + 1./5.
    Y[:] = Y * Xm1 - 1./4.
    Y[:] = Y * Xm1 + 1./3.
    Y[:] = Y * Xm1 - 1./2.
    Y[:] = Y * Xm1 + 1.
    Y[:] = Y * Xm1

Its timings are:

10000 loops, best of 3: 169 µs per loop
10000 loops, best of 3: 104 µs per loop
10000 loops, best of 3: 119 µs per loop
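Each `Y[:] = Y * Xm1 + c` step in ln_python_horner builds temporary arrays before copying into Y. A possible variation, not timed here and not tested with HOPE or numba, is to let numpy write into Y directly through the `out=` argument and in-place operators. The function name ln_numpy_horner_inplace is made up for this sketch.

```python
import numpy as np

def ln_numpy_horner_inplace(X, Y):
    """Hypothetical variant of ln_python_horner that reuses Y at every step."""
    Xm1 = X - 1.0                           # single temporary, as in the original
    np.multiply(Xm1, 1.0 / 9.0, out=Y)      # highest-order coefficient
    Y -= 1.0 / 8.0
    # Remaining coefficients, from order 7 down to order 1
    for c in (1 / 7., -1 / 6., 1 / 5., -1 / 4., 1 / 3., -1 / 2., 1.0):
        np.multiply(Y, Xm1, out=Y)          # Y = Y * Xm1 without a new array
        Y += c
    np.multiply(Y, Xm1, out=Y)              # final factor: no constant term

if __name__ == "__main__":
    X = np.linspace(0.9, 1.1, 5)
    Y = np.empty_like(X)
    ln_numpy_horner_inplace(X, Y)
    print(Y)
    print(np.log(X))
```

Whether this is actually faster depends on the array size and the numpy build, so it would need to be measured the same way as the other variants.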

Anyway, thanks for this nice package.

Cheers,

Pierre
