There are situations where, given a tile size, at a resolution level,
there are sub-bands with x0==x1 or y0==y1, that consequently don't have any
valid codeblocks, but the other sub-bands may be non-empty.
Given that we recycle the memory from one tile to another one, those
ghost codeblocks might be non-0 and thus candidate for packet inclusion.
This saves comparing the current pointer with the end of buffer pointer.
This results at least in tiny speed improvement for raw decoding, and
smaller code size for MQC as well.
This kills the remains of the raw.h/.c files that were only used for
decoding. Encoding using the mqc structure already.
Ported from Carl Hetherington work (actually through Matthieu Darbois's port
on top of OpenJPEG 2.1.0)
Can reduce total encoding time by 10-15%
WARNING: VSC mode is not implemented, and so is a temporary regression
that must be fixed.
Visual Studio 2015 does not pass regression tests with `__restrict` so kept disabled for MSVC.
Need to check proper usage of OPJ_RESTRICT (if correct then there’s
probably a bug in vc14)
Closes#661
Addition flag array such that colflags[1+0] is for state of col=0,row=0..3,
colflags[1+1] for col=1, row=0..3, colflags[1+flags_stride] for col=0,row=4..7, ...
This array avoids too much cache trashing when processing by 4 vertical samples
as done in the various decoding steps.
Add a opj_t1_dec_clnpass_step_only_if_flag_not_sig_visit() method that
does the job of opj_t1_dec_clnpass_step_only() assuming the conditions
are met. And use it in opj_t1_dec_clnpass(). The compiler generates
more efficient code.
Fix issue 135
The fix is legal regarding the standard but I did not manage to find out if it covers a bug in opj_t2_read_packet_data or if the file is corrupted
dwt_interleave_h.gsr105.jp2 now has the same output as kakadu
issue399 is corrupted. Only the corrupted part changes.
Update known failures for x86 MD5
NR-DEC-kodak_2layers_lrcp.j2c-31-decode-md5
NR-DEC-kodak_2layers_lrcp.j2c-32-decode-md5
NR-DEC-issue135.j2k-68-decode-md5
Follow-up of #757
This shall have no performance impact on 2’s complement machine where
the compiler replaces the multiplication by power of two (constant) by
a left shift.
Verified at least on MacOS Xcode 7.3, same assembly generated after fix.
This shall have no performance impact on 2’s complement machine where
the compiler replaces the multiplication by power of two (constant) by
a left shift.
Verified at least on MacOS Xcode 7.3, same assembly generated after fix.