Commit Graph

752 Commits

Author SHA1 Message Date
Even Rouault 1c5627ee74
T1 encoder: speed-up by aggressive inlining and more cache friendly data organization
~ 9% speed improvement seen on 10980x10980 uint16 image, T36JTT_20160914T074612_B02.tif
opj_compress time from 17.2s to 15.8s
2020-05-24 15:38:48 +02:00
Even Rouault 1e931fdb36
Forward DWT 9-7: major speed up by vectorizing vertical pass
`bench_dwt -I -encode` times goes from 8.6s to 2.1s
2020-05-23 01:01:05 +02:00
Even Rouault a38e970fa5
Forward DWT 5-3: major speed up by vectorizing vertical pass
`bench_dwt -encode` times goes from 7.9s to 1.7s
2020-05-23 01:01:05 +02:00
Even Rouault e69fa09f60
Forward DWT: small code refactoring to allow future improvements for the vertical pass 2020-05-22 16:01:45 +02:00
Even Rouault 33d3d0de07
dwt.c: remove unused typedef 2020-05-22 15:06:29 +02:00
Even Rouault 97b384aecd
Forward DWT 5x3: performance improvements in horizontal pass, and modest in vertical pass 2020-05-22 15:03:40 +02:00
Even Rouault bd5f5ee7de
Forward DWT: small code refactoring to allow future improvements for the horizontal pass 2020-05-22 15:02:33 +02:00
Even Rouault 45a35223b7
Speed-up 9x7 IDWD by ~30% with OPJ_NUM_THREADS=2
"bench_dwt -I" time goes from 2.2s to 1.5s
2020-05-21 17:21:55 +02:00
Even Rouault 272b3e0fb2
Remove useless + 5U margin in opj_dwt_decode_tile_97()
Nothing in code analysis nor test suite shows that this margin is
needed.
It dates back to commit dbeebe72b9
where vector 9x7 decoding was introduced.
2020-05-21 15:42:51 +02:00
Even Rouault 47943daa15
Speed-up 9x7 IDWD by ~20%
"bench_dwt -I" time goes from 2.8s to 2.2s
2020-05-21 15:42:51 +02:00
Even Rouault 0c09062464
bench_dwt.c: add a -I switch to test irreversible FWDT/IDWT 2020-05-20 23:20:48 +02:00
Even Rouault adccbc8336
Irreversible decoding: partially revert previous commit, to fix failures in test suite 2020-05-20 20:31:28 +02:00
Even Rouault 3cd1305596
Irreversible compression/decompression DWT: use 1/K constant as per standard
The previous constant opj_c13318 was mysteriously equal to 2/K , and in
the DWT, we had to divide K and opj_c13318 by 2... The issue was that the
band->stepsize computation in tcd.c didn't take into account the log2gain of
the band.

The effect of this change is expected to be mostly equivalent to the previous
situation, except some difference in rounding. But it leads to a dramatic
reduction of the mean square error and peak error in the irreversible encoding
of issue141.tif !
2020-05-20 20:31:28 +02:00
Even Rouault f38c069547
Irreversible decoding: align code more closely to the standard by avoid messing up with stepsize (no functional change) 2020-05-20 20:31:28 +02:00
Even Rouault e46e300de5
opj_dwt_encode_1_real(): avoid many bound comparisons, similarly to decoding side 2020-05-20 20:31:28 +02:00
Even Rouault 4ab2ed0907
opj_j2k_setup_encoder(): add validation of tile width and height to avoid potential division by zero 2020-05-20 20:31:28 +02:00
Even Rouault c6a413a423
opj_mct_encode_real(): add SSE optimization 2020-05-20 20:31:28 +02:00
Even Rouault 3d35d0f3af
tcd.c: add comment 2020-05-20 20:31:28 +02:00
Even Rouault 00cff6f5c0
Encoder: use floating-point operations for irreversible transformation 2020-05-20 20:31:28 +02:00
Even Rouault 99107d5e46
dwt.c: change sign of constants to match standard and compensate (no functional change) 2020-05-20 20:31:28 +02:00
Even Rouault 07d1f775a1
Add multithreaded support in the DWT encoder.
Update the bench_dwt utility to have a -decode/-encode switch

Measured performance gains for DWT encoder on a
Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz (4 cores, hyper threaded)

Encoding time:
$ ./bin/bench_dwt -encode -num_threads 1
time for dwt_encode: total = 8.348 s, wallclock = 8.352 s

$ ./bin/bench_dwt -encode -num_threads 2
time for dwt_encode: total = 9.776 s, wallclock = 4.904 s

$ ./bin/bench_dwt -encode -num_threads 4
time for dwt_encode: total = 13.188 s, wallclock = 3.310 s

$ ./bin/bench_dwt -encode -num_threads 8
time for dwt_encode: total = 30.024 s, wallclock = 4.064 s

Scaling is probably limited by memory access patterns causing
memory access to be the bottleneck.
The slightly worse results with threads==8 than with thread==4
is due to hyperthreading being not appropriate here.
2020-05-20 20:30:21 +02:00
Even Rouault 97eb7e0bf1
Add multithreading support in the T1 (entropy phase) encoder
- API wise, opj_codec_set_threads() can be used on the encoding side
- opj_compress has a -threads switch similar to opj_uncompress
2020-05-20 20:30:21 +02:00
Even Rouault 4edb8c8337
Add support for generation of PLT markers in encoder
* -PLT switch added to opj_compress
* Add a opj_encoder_set_extra_options() function that
  accepts a PLT=YES option, and could be expanded later
  for other uses.

-------

Testing with a Sentinel2 10m band, T36JTT_20160914T074612_B02.jp2,
coming from S2A_MSIL1C_20160914T074612_N0204_R135_T36JTT_20160914T081456.SAFE

Decompress it to TIFF:
```
opj_uncompress -i T36JTT_20160914T074612_B02.jp2 -o T36JTT_20160914T074612_B02.tif
```

Recompress it with similar parameters as original:
```
opj_compress -n 5 -c [256,256],[256,256],[256,256],[256,256],[256,256] -t 1024,1024 -PLT -i T36JTT_20160914T074612_B02.tif -o T36JTT_20160914T074612_B02_PLT.jp2
```

Dump codestream detail with GDAL dump_jp2.py utility (https://github.com/OSGeo/gdal/blob/master/gdal/swig/python/samples/dump_jp2.py)
```
python dump_jp2.py T36JTT_20160914T074612_B02.jp2 > /tmp/dump_sentinel2_ori.txt
python dump_jp2.py T36JTT_20160914T074612_B02_PLT.jp2 > /tmp/dump_sentinel2_openjpeg_plt.txt
```

The diff between both show very similar structure, and identical number of packets in PLT markers

Now testing with Kakadu (KDU803_Demo_Apps_for_Linux-x86-64_200210)

Full file decompression:
```
kdu_expand -i T36JTT_20160914T074612_B02_PLT.jp2 -o tmp.tif

Consumed 121 tile-part(s) from a total of 121 tile(s).
Consumed 80,318,806 codestream bytes (excluding any file format) = 5.329697
bits/pel.
Processed using the multi-threaded environment, with
    8 parallel threads of execution
```

Partial decompresson (presumably using PLT markers):
```
kdu_expand -i T36JTT_20160914T074612_B02.jp2 -o tmp.pgm -region "{0.5,0.5},{0.01,0.01}"
kdu_expand -i T36JTT_20160914T074612_B02_PLT.jp2 -o tmp2.pgm  -region "{0.5,0.5},{0.01,0.01}"
diff tmp.pgm tmp2.pgm && echo "same !"
```

-------

Funded by ESA for S2-MPC project
2020-04-21 15:55:44 +02:00
Even Rouault 64689d05df
struct opj_j2k: remove unused fields, and add some documentation 2020-04-18 18:25:44 +02:00
Even Rouault 271a71ef0f
Fix warnings about signed/unsigned casts in pi.c 2020-04-16 23:34:10 +02:00
Even Rouault 221a801a97
Rename mis-named function opj_tcd_get_encoded_tile_size() to opj_tcd_get_encoder_input_buffer_size() 2020-04-16 20:33:22 +02:00
Even Rouault 84f3bebbff
Implement writing of IMF profiles
Add -IMF switch to opj_compress as well
2020-02-12 15:55:25 +01:00
Even Rouault fffe32adcb
openjpeg.h: fix values of OPJ_PROFILE_IMF_ constants 2020-02-12 15:55:02 +01:00
Even Rouault 05f9b91e60
opj_tcd_init_tile(): avoid integer overflow
That could lead to later assertion failures.

Fixes #1231 / CVE-2020-8112
2020-01-30 00:59:57 +01:00
Even Rouault 024b840739
opj_j2k_update_image_dimensions(): reject images whose coordinates are beyond INT_MAX (fixes #1228) 2020-01-11 01:51:58 +01:00
Even Rouault 4cb1f66304
pi.c: avoid integer overflow, resulting in later invalid access to memory in opj_t2_decode_packets(). Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=18979 2019-11-17 01:18:26 +01:00
Even Rouault 5875a6b446
opj_tcd_mct_decode()/opj_mct_decode()/opj_mct_encode_real()/opj_mct_decode_real(): proper deal with a number of samples larger than 4 billion (refs #1151) 2019-10-03 11:04:30 +02:00
Even Rouault e66125fe26
Merge pull request #1164 from sebras/master
openjp2/j2k: Report error if all wanted components are not decoded.
2019-09-03 17:03:54 +02:00
Even Rouault a94cfbd533
Change opj_j2k_check_poc_val() to take into account tile number 2019-04-25 15:06:45 +02:00
Even Rouault 6423163141
Fix POC in multi-tile scenarios: avoid almost endless loop when a tile has no POC settings 2019-04-25 14:40:56 +02:00
Even Rouault 23883458b9
opj_j2k_check_poc_val(): prevent potential write outside of allocated array 2019-04-25 14:40:56 +02:00
Even Rouault 6589c609f6
opj_j2k_check_poc_val(): fix starting index for checking layer dimension
The standard mandates that the layer index always starts at zero for every
progression.
2019-04-25 14:40:55 +02:00
Even Rouault 1e3a57563d
compression: emit POC marker when only one single POC is requested (fixes #1191) 2019-04-25 14:40:55 +02:00
Even Rouault 5dd75f62e2
j2k.c: use correct naming convention for total_data_size variable 2019-04-23 16:52:21 +02:00
Even Rouault a1d32a596a
opj_t1_encode_cblks: fix UBSAN signed integer overflow
Fixes #1053 / CVE-2018-5727

Note: I don't consider this issue to be a security vulnerability, in
practice.
At least with gcc or clang compilers on x86_64 which generate the same
assembly code with or without that fix.
2019-03-29 11:17:39 +01:00
Sebastian Rasmussen b2751967ec openjp2/j2k: Report error if all wanted components are not decoded.
Previously the caller had to check whether each component data had
been decoded. This means duplicating the checking in every user of
openjpeg which is unnecessary. If the caller wantes to decode all
or a set of, or a specific component then openjpeg ought to error
out if it was unable to do so.

Fixes #1158.
2019-02-21 16:48:02 +08:00
Young_X c58df14990 [OPENJP2] change the way to compute *p_tx0, *p_tx1, *p_ty0, *p_ty1 in function
opj_get_encoding_parameters

Signed-off-by: Young_X <YangX92@hotmail.com>
2018-11-28 14:39:14 +08:00
Stefan Weil 948332e6ed Fix some potential overflow issues (#1161)
* Fix some potential overflow issues

Put sizeof to the beginning of the multiplication to enforce that
size_t instead of smaller integer types is used for the calculation.

This fixes warnings from LGTM:

    Multiplication result may overflow 'unsigned int'
    before it is converted to 'unsigned long'.

It also allows removing some type casts.

Signed-off-by: Stefan Weil <sw@weilnetz.de>

* Fix code indentation

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-10-31 20:44:30 +01:00
Nikola Forró 943db0f1c2 Fix several memory and resource leaks
Signed-off-by: Nikola Forró <nforro@redhat.com>
2018-10-31 16:16:22 +01:00
Even Rouault cd900d9661
opj_thread_pool_setup(): fix infinite waiting if a thread creation failed 2018-10-18 11:45:45 +02:00
Even Rouault 8fc09e50e5
opj_jp2_apply_pclr(): remove useless assert that can trigger on some files (fixes #1125) 2018-09-22 23:47:56 +02:00
Even Rouault 5d94bcd89c
Merge pull request #1136 from reverson/master
Cast on uint ceildiv
2018-09-22 22:59:36 +02:00
Even Rouault 17bbb0e23f
Merge pull request #1128 from stweil/typos
Fix some typos in code comments and documentation
2018-09-22 22:55:33 +02:00
Stefan Weil 31a03b390a openjp2/jp2: Fix two format strings
Compiler warnings:

src/lib/openjp2/jp2.c:1008:35: warning:
 too many arguments for format [-Wformat-extra-args]
src/lib/openjp2/j2k.c:1928:73: warning:
 format ‘%d’ expects argument of type ‘int’, but argument 4 has type ‘OPJ_OFF_T {aka long int}’ [-Wformat=]

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-09-05 21:52:43 +02:00
Stefan Weil 3d6ffaf3f3 Fix some typos in code comments and documentation
All typos were found by Codespell.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2018-09-05 20:01:10 +02:00