28 Commits

Author SHA1 Message Date
Siddhesh Poyarekar a357259bf8 Add more directives to benchmark input files
This patch adds some more directives to the benchmark inputs file,
moving functionality from the Makefile and making the code generation
script a bit cleaner.  The function argument and return types that
were earlier added as variables in the makefile and passed to the
script via command line arguments are now the 'args' and 'ret'
directives respectively.  'args' should be a colon-separated list of
argument types (skipped if the function doesn't accept any arguments)
and 'ret' should be the return type.

Additionally, an 'includes' directive may have a comma separated list
of headers to include in the source.  For example, the pow input file
now looks like this:

42.0, 42.0
1.0000000000000020, 1.5

I did this to unclutter the benchtests Makefile a bit and eventually
eliminate dependency of the tests on the Makefile and have tests
depend on their respective include files only.
2013-10-07 11:51:25 +05:30
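A minimal sketch of how a generation script might read the 'args', 'ret' and 'includes' directives described above. It is written in Python purely for illustration. The `## key: value` spelling, the sample directive lines and the function name `parse_directives` are assumptions, not taken from the commit.

```python
def parse_directives(lines):
    """Split benchmark input lines into directives and input rows.

    Assumed format (hypothetical): directives look like '## key: value',
    lines starting with a single '#' are comments, and everything else
    is a comma-separated row of inputs.
    """
    directives = {"args": [], "ret": None, "includes": []}
    inputs = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("##"):
            key, _, value = line[2:].partition(":")
            key, value = key.strip(), value.strip()
            if key == "args":
                # Colon-separated list of argument types.
                directives["args"] = [t.strip() for t in value.split(":") if t.strip()]
            elif key == "includes":
                # Comma-separated list of headers.
                directives["includes"] = [h.strip() for h in value.split(",") if h.strip()]
            elif key == "ret":
                directives["ret"] = value
        elif line.startswith("#"):
            continue  # plain comment
        else:
            inputs.append([field.strip() for field in line.split(",")])
    return directives, inputs


# Hypothetical pow-style input file; only the two data rows come from the commit.
sample = [
    "## args: double:double",
    "## ret: double",
    "## includes: math.h",
    "42.0, 42.0",
    "1.0000000000000020, 1.5",
]
directives, inputs = parse_directives(sample)
```

Keeping the types and headers next to the inputs means a test depends only on its own input file, which is the goal the commit message states.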
Siddhesh Poyarekar 7849ff938c Add benchmark inputs for sincos 2013-09-19 16:55:27 +05:30
Adhemerval Zanella e029e2e5c5 benchtests: Add memrchr benchmark 2013-09-06 09:24:52 -03:00
Will Newton bbf6e8e4f4 benchtests/Makefile: Run benchmark for memcpy.
The benchmark for memcpy got disabled accidentally. Re-enable it.

ChangeLog:

2013-09-06   Will Newton  <will.newton@linaro.org>

	* benchtests/Makefile (string-bench): Add memcpy.
2013-09-06 11:59:00 +01:00
Will Newton cae16d6675 benchtests/Makefile: Use LDLIBS instead of LDFLAGS.
LDFLAGS puts the library too early in the command line if --as-needed
is being used. Use LDLIBS instead.

ChangeLog:

2013-09-04  Will Newton  <will.newton@linaro.org>

	* benchtests/Makefile: Use LDLIBS instead of LDFLAGS.
2013-09-04 15:38:41 +01:00
Siddhesh Poyarekar 94aca5e740 Port remaining string benchmarks
There were a few more string benchmarks (strcpy_chk and stpcpy_chk)
in the debug directory that needed to be ported over.
2013-06-11 20:51:55 +05:30
Siddhesh Poyarekar 9702047480 Copy over string performance tests into benchtests
Copy over already existing string performance tests into benchtests.
Bits not related to performance measurements have been omitted.
2013-06-11 15:08:13 +05:30
Siddhesh Poyarekar c1f75dc386 Begin porting string performance tests to benchtests
This is the initial support for string function performance tests,
along with copying tests for memcpy and memcpy-ifunc as proof of
concept.  The string function benchmarks perform operations at
different alignments and for different sizes and compare performance
between plain operations and the optimized string operations.  Because
of this, their output is incompatible with the function benchmarks, where
we're interested in fastest time, throughput, etc.

In the future, the correctness checks in the benchmark tests can be
removed.  The same goes for the performance measurements in the
string/test-* files.
2013-06-11 15:08:13 +05:30
Siddhesh Poyarekar 50b818bf96 Avoid overwriting earlier flags in CPPFLAGS-nonlib in benchtests
When setting BENCH_DURATION in CPPFLAGS-nonlib, append to the variable
instead of assigning to it, to avoid overwriting earlier set flags,
notably the -DNOT_IN_libc=1 flag.
2013-06-10 10:08:46 +05:30
Siddhesh Poyarekar 3ce9e01097 Sort benchmark functions 2013-05-22 11:07:39 +05:30
Siddhesh Poyarekar 051063c88b Add benchmark inputs for math functions
Add benchmark inputs for inverse and hyperbolic trigonometric
functions and log.
2013-05-22 11:07:33 +05:30
Siddhesh Poyarekar fef94eab0b Add a README for benchtests
Move instructions from the Makefile here and expand on them.
2013-05-21 14:59:50 +05:30
Siddhesh Poyarekar 43fe811b73 Use HP_TIMING for benchmarks if available
HP_TIMING uses native timestamping instructions if available, thus
greatly reducing the overhead of recording start and end times for
function calls.  For architectures that don't have HP_TIMING
available, we fall back to the clock_gettime bits.  One may also
override this by invoking the benchmark as follows:

  make USE_CLOCK_GETTIME=1 bench

and get the benchmark results using clock_gettime.  One has to do
`make bench-clean` to ensure that the benchmark programs are rebuilt.
2013-05-13 13:44:32 +05:30
Siddhesh Poyarekar f0ee064b7d Allow multiple input domains to be run in the same benchmark program
Some math functions have distinct performance characteristics in
specific domains of inputs, where some inputs return via a fast path
while other inputs require multiple precision calculations, and those
at different precision levels.  Until now, the way to implement different
domains was to have a separate source file and benchmark definition for
each, resulting in separate programs.

This clutters up the benchmark, so this change allows these domains to
be consolidated into the same input file.  To do this, the input file
format is now enhanced to allow comments with a preceding # and
directives with two # at the beginning of a line.  A directive that
looks like:

tells the benchmark generation script that what follows is a different
domain of inputs.  The value of the 'name' directive (in this case,
foo) is used in the output.  The two input domains are then executed
sequentially and their results collated separately.  With the above
directive, there would be two lines in the result that look like:

func(): ....
func(foo): ...
2013-04-30 14:17:57 +05:30
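The domain handling described above can be sketched as follows, in Python for illustration only. The directive spelling `## name: foo` and the helper names `split_domains` and `result_labels` are assumptions. The commit only states that directives start with two # and that the 'name' value appears in the output.

```python
def split_domains(lines):
    """Group input rows by the most recent 'name' directive.

    Rows that appear before any 'name' directive go into the unnamed
    default domain, keyed by the empty string.
    """
    domains = {"": []}
    current = ""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("##"):
            key, _, value = line[2:].partition(":")
            if key.strip() == "name":
                current = value.strip()
                domains.setdefault(current, [])
            continue  # other directives are ignored in this sketch
        if line.startswith("#"):
            continue  # plain comment
        domains[current].append(line)
    return domains


def result_labels(func, domains):
    """Build one result label per non-empty domain, in file order."""
    return [f"{func}({name})" for name, rows in domains.items() if rows]
```

With one row before the directive and some rows after it, `result_labels("func", ...)` yields the two labels `func()` and `func(foo)`, matching the result lines shown in the commit message.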
Siddhesh Poyarekar d569c6eeb4 Maintain runtime of each benchmark at ~10 seconds
Running benchmarks for a constant number of iterations is
problematic.  While the benchmarks may run for 10 seconds on x86_64,
they could run for about 30 seconds on powerpc and worse, over 3
minutes on arm.  Besides that, adding a new benchmark is cumbersome
since one needs to find out the number of iterations needed for a
sufficient runtime.

A better idea would be to run each benchmark for a specific amount of
time.  This patch does just that.  The run time defaults to 10 seconds
and is configurable on the command line:

  make BENCH_DURATION=5 bench
2013-04-30 14:10:20 +05:30
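The time-bounded approach can be sketched as below. This is a Python illustration of the idea, not the actual benchmark skeleton: the function name `run_for_duration` and the `batch` parameter are hypothetical.

```python
import time


def run_for_duration(func, duration_s, batch=1000):
    """Call func repeatedly until at least duration_s seconds have elapsed.

    The clock is read once per batch rather than once per call, so the
    cost of timing stays small relative to the work being measured.
    Returns the number of calls made and the elapsed time in seconds.
    """
    iterations = 0
    elapsed = 0.0
    start = time.perf_counter()
    while elapsed < duration_s:
        for _ in range(batch):
            func()
        iterations += batch
        elapsed = time.perf_counter() - start
    return iterations, elapsed


# Example: measure a trivial callable for roughly 50 ms.
iterations, elapsed = run_for_duration(lambda: None, 0.05)
mean_time = elapsed / iterations
```

Because the iteration count is an output rather than an input, the same benchmark takes about the same wall-clock time on fast and slow machines.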
Siddhesh Poyarekar 45d69176e8 Mention files in which fast/slow paths of math functions are implemented 2013-04-24 14:07:40 +05:30
Adhemerval Zanella 3c0265394d PowerPC: modf optimization
This patch implements modf/modff optimization for POWER by focusing
on FP operations instead of relying on integer ones.
2013-04-23 13:38:52 -05:00
Siddhesh Poyarekar 037714dd49 Add benchmark inputs for cos and tan 2013-04-17 17:45:55 +05:30
Siddhesh Poyarekar 4856bcd2df Define NOT_IN_libc when compiling benchmark programs 2013-04-16 18:34:03 +05:30
Siddhesh Poyarekar a296407432 Add target bench-clean 2013-04-16 14:07:21 +05:30
Siddhesh Poyarekar 206a669911 Write to bench.out-tmp only once
Appending benchmark program output on every run could leave a
partially written file behind if the benchmark run was cancelled.
That file would be used again on the next run, so new results would be
appended to old results.

It would have been possible to remove the file before every benchmark
run, but it is easier to just write the output to bench.out-tmp only
once.
2013-04-15 13:53:35 +05:30
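The difference between appending per run and writing once can be illustrated with a small Python sketch. The real change lives in the Makefile; the helper name `write_results_once` and the sample result lines are hypothetical.

```python
def write_results_once(path, results):
    """Write all benchmark results to path in a single pass.

    Opening with mode "w" truncates whatever a cancelled earlier run
    left behind, so stale results are never mixed with new ones.
    """
    with open(path, "w") as out:
        for line in results:
            out.write(line + "\n")


# Example: the file ends up holding only the results of this run.
write_results_once("bench.out-tmp", ["exp(): 1.0", "pow(): 2.0"])
```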
Siddhesh Poyarekar acb4325fc7 Rebuild benchmark sources when Makefile is updated
Benchmark programs are generated using parameters from the Makefile,
so it is necessary to rebuild them whenever the parameters in the
Makefile are updated.  Hence, added a dependency for the generated C
source on the Makefile so that it gets regenerated when the Makefile
is updated.
2013-04-15 11:17:01 +05:30
Siddhesh Poyarekar 8fc1bee546 Move bench target to benchtests
The bench target will only be used within the benchtests directory.
2013-04-12 15:01:44 +05:30
Siddhesh Poyarekar 64aabd4b80 Add benchmark inputs for atan
Add separate inputs for slow and fast paths of atan
2013-04-03 15:50:15 +05:30
Siddhesh Poyarekar 92e3664bb5 Add benchmark inputs for sin 2013-04-02 17:48:47 +05:30
Siddhesh Poyarekar 81f311c2ee Add benchmark tests for slowpow and slowexp
Separate benchmarks for the fast and slow implementations of pow and
exp since measuring both together doesn't make sense.  Adjust the
iterations for pow and exp accordingly so that they run long enough
for the measurements to be meaningful.
2013-04-02 17:45:45 +05:30
Adhemerval Zanella 60c414c346 PowerPC: remove branch prediction from rint implementation
The branch prediction hints actually hurt performance in this case.
The assembly implementation makes two assumptions: 1. 'fabs (x) < 2^52'
is unlikely and 2. 'x > 0.0' is unlikely (if 1. is true). Since it is a
general floating point function, the expected input is not bounded, so
it is better to let the hardware handle the branches.
2013-04-01 06:36:51 -05:00
Siddhesh Poyarekar 8cfdb7e056 Framework for performance benchmarking of functions
See benchtests/Makefile for instructions on how to use it.
2013-03-15 12:30:03 +05:30