Update the bench_dwt utility to have a -decode/-encode switch
Measured performance gains for DWT encoder on a
Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz (4 cores, hyper threaded)
Encoding time:
$ ./bin/bench_dwt -encode -num_threads 1
time for dwt_encode: total = 8.348 s, wallclock = 8.352 s
$ ./bin/bench_dwt -encode -num_threads 2
time for dwt_encode: total = 9.776 s, wallclock = 4.904 s
$ ./bin/bench_dwt -encode -num_threads 4
time for dwt_encode: total = 13.188 s, wallclock = 3.310 s
$ ./bin/bench_dwt -encode -num_threads 8
time for dwt_encode: total = 30.024 s, wallclock = 4.064 s
Scaling is probably limited by memory access patterns causing
memory access to be the bottleneck.
The slightly worse results with threads==8 than with thread==4
is due to hyperthreading being not appropriate here.
* -PLT switch added to opj_compress
* Add a opj_encoder_set_extra_options() function that
accepts a PLT=YES option, and could be expanded later
for other uses.
-------
Testing with a Sentinel2 10m band, T36JTT_20160914T074612_B02.jp2,
coming from S2A_MSIL1C_20160914T074612_N0204_R135_T36JTT_20160914T081456.SAFE
Decompress it to TIFF:
```
opj_uncompress -i T36JTT_20160914T074612_B02.jp2 -o T36JTT_20160914T074612_B02.tif
```
Recompress it with similar parameters as original:
```
opj_compress -n 5 -c [256,256],[256,256],[256,256],[256,256],[256,256] -t 1024,1024 -PLT -i T36JTT_20160914T074612_B02.tif -o T36JTT_20160914T074612_B02_PLT.jp2
```
Dump codestream detail with GDAL dump_jp2.py utility (https://github.com/OSGeo/gdal/blob/master/gdal/swig/python/samples/dump_jp2.py)
```
python dump_jp2.py T36JTT_20160914T074612_B02.jp2 > /tmp/dump_sentinel2_ori.txt
python dump_jp2.py T36JTT_20160914T074612_B02_PLT.jp2 > /tmp/dump_sentinel2_openjpeg_plt.txt
```
The diff between both show very similar structure, and identical number of packets in PLT markers
Now testing with Kakadu (KDU803_Demo_Apps_for_Linux-x86-64_200210)
Full file decompression:
```
kdu_expand -i T36JTT_20160914T074612_B02_PLT.jp2 -o tmp.tif
Consumed 121 tile-part(s) from a total of 121 tile(s).
Consumed 80,318,806 codestream bytes (excluding any file format) = 5.329697
bits/pel.
Processed using the multi-threaded environment, with
8 parallel threads of execution
```
Partial decompresson (presumably using PLT markers):
```
kdu_expand -i T36JTT_20160914T074612_B02.jp2 -o tmp.pgm -region "{0.5,0.5},{0.01,0.01}"
kdu_expand -i T36JTT_20160914T074612_B02_PLT.jp2 -o tmp2.pgm -region "{0.5,0.5},{0.01,0.01}"
diff tmp.pgm tmp2.pgm && echo "same !"
```
-------
Funded by ESA for S2-MPC project
This was changed some time ago (https://google.github.io/oss-fuzz/getting-started/new-project-guide/) but the build didn't fail as there is a fallback mechanism. The main advantage of the new approach is that for libFuzzer this produces more performant binaries (as `$LIB_FUZZING_ENGINE` expands into `-fsanitize=fuzzer`, which links libFuzzer from the compiler-rt, allowing better optimization tricks).
I'm also experimenting with dataflow (https://github.com/google/oss-fuzz/issues/1632) on your project, and the dataflow config doesn't have a fallback (as it's a new configuration), therefore I'm proposing a change to migrate from `-lFuzzingEngine` to `$LIB_FUZZING_ENGINE`.
There is currently a false positive ABI check failure between v2.3.1
and current. It disappears when removing the generated reports of v2.3.1
and recreating them. It is likely that some tooling has evolved since
the initial v2.3.1 report generation.
width/length dimensions read from bmp headers are not necessarily
valid. For instance they may have been maliciously set to very large
values with the intention to cause DoS (large memory allocation, stack
overflow). In these cases we want to detect the invalid size as early
as possible.
This commit introduces a counter which verifies that the number of
written bytes corresponds to the advertized width/length.
See commit 8ee335227b for details.
Signed-off-by: Young Xiao <YangX92@hotmail.com>