FFmpeg

Author	SHA1	Message	Date
Andreas Rheinhardt	5a72266d49	tests/checkasm/sw_rgb: Fix leaks Also use loop-scope for variables where appropriate. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-01-12 15:41:40 +01:00
James Almer	658a645e18	tests/checkasm/sw_rgb: remove bogus value truncation in check_yuv2packed1() Fixes out of array accesses. Signed-off-by: James Almer <jamrial@gmail.com>	2024-12-31 11:53:18 -03:00
Niklas Haas	a9ae2cc14d	checkasm/sw_rgb: add alpToYV12 check Mirroring lumToYV12 and chrToYV12. Signed-off-by: Niklas Haas <git@haasn.dev> Sponsored-by: Sovereign Tech Fund	2024-12-23 11:20:59 +01:00
Niklas Haas	c601bb8df5	checkasm/sw_rgb: add tests for yuv2packed{1,2,X} Signed-off-by: Niklas Haas <git@haasn.dev> Sponsored-by: Sovereign Tech Fund	2024-12-23 11:20:58 +01:00
Niklas Haas	57bbdb4fb1	checkasm/sw_scale: add test for yuv2nv12cX Mirroring yuv2yuvX. Signed-off-by: Niklas Haas <git@haasn.dev> Sponsored-by: Sovereign Tech Fund	2024-12-23 11:20:58 +01:00
Niklas Haas	fe9bf7cd52	checkasm/sw_scale: add assertion for hscale assumption This code only checks hcScale. In practice this is not an issue because the function pointers should always be identical to hyScale for the same filter size. Add an assertion just to make sure this assumption never regresses. Signed-off-by: Niklas Haas <git@haasn.dev> Sponsored-by: Sovereign Tech Fund	2024-12-23 11:20:58 +01:00
Martin Storsjö	4b524649ff	checkasm: Print benchmarks of C-only functions This corresponds to commit 9278a14cf406f8edb5052c42b83750112bf5b515 in dav1d. Omitting the C-only functions doesn't speed up benchmarking anyway (as those have to be benchmarked before we know if we have any corresponding assembly functions), and being able to benchmark those functions without corresponding assembly can be valuable in a number of cases. Signed-off-by: Martin Storsjö <martin@martin.st>	2024-12-11 10:51:15 +02:00
sunyuechi	82da769492	checkasm/rv40dsp: cover more cases Co-Authored-By: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	2024-12-10 11:24:45 -05:00
Martin Storsjö	47b1e1bd84	checkasm: vvc: Use checkasm_check for printing failing output Share the checkasm_check_pixel macro from hevc_pel in checkasm.h, to allow other tests to use the same. (To use it in other tests, those tests need to have a similar setup for high bitdepth pixels, with a local variable named "bit_depth".) This simplifies the code for checking the output, and can print the failing output (including a map of matching/mismatching elements) if checkasm is run with the -v/--verbose option. Signed-off-by: Martin Storsjö <martin@martin.st>	2024-12-10 11:26:09 +02:00
Zhao Zhili	018ec4fe5f	tests/checkasm: Simplify logic for WASI signal handling Signed-off-by: Zhao Zhili <zhilizhao@tencent.com> Reviewed-by: Martin Storsjö <martin@martin.st>	2024-12-06 10:48:11 +08:00
Ramiro Polla	384fe39623	swscale/range_convert: fix mpeg ranges in yuv range conversion for non-8-bit pixel formats There is an issue with the constants used in YUV to YUV range conversion, where the upper bound is not respected when converting to mpeg range. With this commit, the constants are calculated at runtime, depending on the bit depth. This approach also allows us to more easily understand how the constants are derived. For bit depths <= 14, the number of fixed point bits has been set to 14 for all conversions, to simplify the code. For bit depths > 14, the number of fixed points bits has been raised and set to 18, to allow for the conversion to be accurate enough for the mpeg range to be respected. The convert functions now take the conversion constants (coeff and offset) as function arguments. For bit depths <= 14, coeff is unsigned 16-bit and offset is 32-bit. For bit depths > 14, coeff is unsigned 32-bit and offset is 64-bit. x86_64: chrRangeFromJpeg8_1920_c: 2127.4 2125.0 (1.00x) chrRangeFromJpeg16_1920_c: 2325.2 2127.2 (1.09x) chrRangeToJpeg8_1920_c: 3166.9 3168.7 (1.00x) chrRangeToJpeg16_1920_c: 2152.4 3164.8 (0.68x) lumRangeFromJpeg8_1920_c: 1263.0 1302.5 (0.97x) lumRangeFromJpeg16_1920_c: 1080.5 1299.2 (0.83x) lumRangeToJpeg8_1920_c: 1886.8 2112.2 (0.89x) lumRangeToJpeg16_1920_c: 1077.0 1906.5 (0.56x) aarch64 A55: chrRangeFromJpeg8_1920_c: 28835.2 28835.6 (1.00x) chrRangeFromJpeg16_1920_c: 28839.8 32680.8 (0.88x) chrRangeToJpeg8_1920_c: 23074.7 23075.4 (1.00x) chrRangeToJpeg16_1920_c: 17318.9 24996.0 (0.69x) lumRangeFromJpeg8_1920_c: 15389.7 15384.5 (1.00x) lumRangeFromJpeg16_1920_c: 15388.2 17306.7 (0.89x) lumRangeToJpeg8_1920_c: 19227.8 19226.6 (1.00x) lumRangeToJpeg16_1920_c: 15387.0 21146.3 (0.73x) aarch64 A76: chrRangeFromJpeg8_1920_c: 6324.4 6268.1 (1.01x) chrRangeFromJpeg16_1920_c: 6339.9 11521.5 (0.55x) chrRangeToJpeg8_1920_c: 9656.0 9612.8 (1.00x) chrRangeToJpeg16_1920_c: 6340.4 11651.8 (0.54x) lumRangeFromJpeg8_1920_c: 4422.0 4420.8 (1.00x) lumRangeFromJpeg16_1920_c: 4420.9 5762.0 (0.77x) lumRangeToJpeg8_1920_c: 5949.1 5977.5 (1.00x) lumRangeToJpeg16_1920_c: 4446.8 5946.2 (0.75x) NOTE: all simd optimizations for range_convert have been disabled. they will be re-enabled when they are fixed for each architecture. NOTE2: the same issue still exists in rgb2yuv conversions, which is not addressed in this commit.	2024-12-05 21:10:29 +01:00
Ramiro Polla	536a44e8dc	checkasm/sw_range_convert: test negative input values	2024-12-05 21:10:29 +01:00
Zhao Zhili	ea3d21c349	tests/checkasm: Add partial support for wasm WASI is missing signal and siglongjmp support. This patch works around the build error and adds the simd128 flag. Please note that many tests use large arrays on the stack, so you need to increase the stack size when building checkasm, e.g., --extra-ldflags='-Wl,-z,stack-size=10485760' Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>	2024-12-04 16:43:07 +08:00
Niklas Haas	6a91a165fd	swscale: eliminate redundant SwsInternal accesses This is a purely cosmetic commit aimed at replacing accesses to SwsInternal.opts by direct access to SwsContext wherever convenient. Sponsored-by: Sovereign Tech Fund Signed-off-by: Niklas Haas <git@haasn.dev>	2024-11-25 10:59:52 +01:00
Niklas Haas	2d077f9acd	swscale/internal: group user-facing options together This is a preliminary step to separating these into a new struct. This commit contains no functional changes, it is a pure search-and-replace. Sponsored-by: Sovereign Tech Fund Signed-off-by: Niklas Haas <git@haasn.dev>	2024-11-21 12:49:56 +01:00
James Almer	9d8f7bf4b8	tests/checkasm/diracdsp: fix alignment for src and ombc_weight buffers They are supposed to be 16 byte aligned, not 8. Should fix crashes in some systems. Signed-off-by: James Almer <jamrial@gmail.com>	2024-11-19 12:32:49 -03:00
Rémi Denis-Courmont	55aa81d5cc	checkasm: add RISC-V vector width to arch info	2024-11-17 11:28:21 +02:00
Kyosuke Kawakami	711290f9a3	checkasm/diracdsp: test add_dirac_obmc Signed-off-by: Kyosuke Kawakami <kawakami150708@gmail.com> Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	2024-11-15 13:44:53 -05:00
Ramiro Polla	562524587e	checkasm/sw_range_convert: indent after previous couple of commits	2024-10-27 13:20:56 +01:00
Ramiro Polla	031d98790e	checkasm/sw_range_convert: test all supported bit depths This commit also reduces the number of times ff_sws_init_scale() gets called (only once per bit depth), and the number of times randomize_buffers() gets called (only if the function must be checked). Benchmarks are only performed on bit depths 8 and 16 (since they are different functions, and not only different constants).	2024-10-27 13:20:56 +01:00
Ramiro Polla	2c44393c01	checkasm/sw_range_convert: only run benchmarks on largest input width	2024-10-27 13:20:56 +01:00
Ramiro Polla	e308d09fba	checkasm/sw_range_convert: reduce number of input sizes tested Reduce input sizes to 8 (to test that the function works with widths smaller than the vector length) and 1920 (raising the largest input size to improve benchmark results).	2024-10-27 13:20:56 +01:00
Ramiro Polla	d1acd68d73	checkasm/sw_range_convert: use YUV pixel formats instead of YUVJ We are already setting the range, so we can use regular YUV pixel formats instead of YUVJ.	2024-10-27 13:20:56 +01:00
Ramiro Polla	a8ef1fac0d	checkasm: use FF_ARRAY_ELEMS instead of hardcoding size of arrays	2024-10-27 13:20:56 +01:00
Niklas Haas	67adb30322	swscale: rename SwsContext to SwsInternal And preserve the public SwsContext as separate name. The motivation here is that I want to turn SwsContext into a public struct, while keeping the internal implementation hidden. Additionally, I also want to be able to use multiple internal implementations, e.g. for GPU devices. This commit does not include any functional changes. For the most part, it is a simple rename. The only complications arise from the public facing API functions, which preserve their current type (and hence require an additional unwrapping step internally), and the checkasm test framework, which directly accesses SwsInternal. For consistency, the affected functions that need to maintain a distinction have generally been changed to refer to the SwsContext as sws, and the SwsInternal as c. In an upcoming commit, I will provide a backing definition for the public SwsContext, and update `sws_internal()` to dereference the internal struct instead of merely casting it. Sponsored-by: Sovereign Tech Fund Signed-off-by: Niklas Haas <git@haasn.dev>	2024-10-24 22:50:00 +02:00
James Almer	e1d1ba4cbc	tests/checkasm/sw_rgb: don't write random data past the end of the buffer Should fix fate-checkasm-sw_rgb under gcc-ubsan. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>	2024-10-17 13:08:39 +02:00
Martin Storsjö	6668268e16	checkasm: lls: Use relative tolerances rather than absolute ones Depending on the magnitude of the output values, the potential errors can be larger. This fixes errors in the lls tests on x86_32 for some seeds, observed with GCC 11 (on Ubuntu 22.04, with the distro compiler, with -m32). Signed-off-by: Martin Storsjö <martin@martin.st>	2024-10-09 15:52:56 +03:00
Martin Storsjö	c65a294f79	checkasm: Print the SVE vector length at startup Signed-off-by: Martin Storsjö <martin@martin.st>	2024-09-27 00:06:55 +03:00
Martin Storsjö	e6eabb7ce7	aarch64: Add CPU feature flags for SVE and SVE2 Add code for detecting the feature on Linux and Windows. Signed-off-by: Martin Storsjö <martin@martin.st>	2024-09-27 00:04:30 +03:00
Martin Storsjö	157ce21939	checkasm/sw_rgb: Revert test additions from `e18b46d95f` The unaligned width test cases fail on i386; we have an assembly function of rgb24toyv12 which is enabled only within "#if ARCH_X86_32 && HAVE_7REGS", which seems to fail these new test cases for unaligned widths. As that assembly function has existed for a long time in that form, the issue probably isn't very recent, thus skip testing these cases for now. Once the assembly function has been fixed, these test cases can be readded. Signed-off-by: Martin Storsjö <martin@martin.st>	2024-09-26 13:16:56 +03:00
Zhao Zhili	e18b46d95f	swscale/aarch64: Fix rgb24toyv12 only works with aligned width Since `c0666d8b`, rgb24toyv12 is broken for width non-aligned to 16. Add a simple wrapper to handle the non-aligned part. Co-authored-by: johzzy <hellojinqiang@gmail.com> Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>	2024-09-24 10:24:14 +08:00
Ramiro Polla	e0cc06184c	checkasm/sw_rgb: add rgb24toyv12 tests	2024-09-06 23:06:35 +02:00
Ramiro Polla	c08bb33e41	checkasm/sw_rgb: add deinterleaveBytes	2024-09-06 23:05:06 +02:00
James Almer	2a6f84718b	fate/checkasm/sw_gbrp: don't randomly set internal values They are set by sws_init_context(). May help with signed integer overflows reported by gcc-ubsan. Signed-off-by: James Almer <jamrial@gmail.com>	2024-09-05 22:19:47 -03:00
Rémi Denis-Courmont	d9f594209f	checkasm/riscv: print official extension names	2024-09-04 22:04:11 +03:00
Anton Khirnov	3f9ca51015	lavc/opus*: move to opus/ subdir	2024-09-02 11:56:53 +02:00
Ramiro Polla	6aafe61285	avcodec/mpegvideoencdsp: convert stride parameters from int to ptrdiff_t	2024-09-01 13:42:30 +02:00
Nuo Mi	7175544c0b	checkasm: add vvc_bdof test apply_bdof_8_8x16_c: 5776.5 apply_bdof_8_8x16_avx2: 396.2 apply_bdof_8_16x8_c: 5722.0 apply_bdof_8_16x8_avx2: 216.0 apply_bdof_8_16x16_c: 11213.2 apply_bdof_8_16x16_avx2: 434.5 apply_bdof_10_8x16_c: 5657.7 apply_bdof_10_8x16_avx2: 1096.0 apply_bdof_10_16x8_c: 5531.7 apply_bdof_10_16x8_avx2: 212.5 apply_bdof_10_16x16_c: 11043.7 apply_bdof_10_16x16_avx2: 1252.7 apply_bdof_12_8x16_c: 5680.0 apply_bdof_12_8x16_avx2: 1096.5 apply_bdof_12_16x8_c: 5646.2 apply_bdof_12_16x8_avx2: 624.5 apply_bdof_12_16x16_c: 11076.0 apply_bdof_12_16x16_avx2: 1241.5	2024-08-31 14:08:54 +08:00
J. Dekker	e758b24396	checkasm: add wildcompares for test & functions Added: --test=<pattern> Filter tests by glob style pattern. --bench[=<pattern>] Run benchmark and optionally filter functions by glob style pattern. Example: $ ./tests/checkasm/checkasm --bench=yuva* [...] yuva420p_bgr24_8_c: 34.5 ( 1.00x) yuva420p_bgr24_8_ssse3: 31.1 ( 1.11x) yuva420p_bgr24_128_c: 310.6 ( 1.00x) yuva420p_bgr24_128_ssse3: 178.1 ( 1.74x) yuva420p_bgr24_1080_c: 2509.6 ( 1.00x) yuva420p_bgr24_1080_ssse3: 1471.5 ( 1.71x) yuva420p_bgr24_1920_c: 4462.6 ( 1.00x) yuva420p_bgr24_1920_ssse3: 2331.1 ( 1.91x) [...] Ported from dav1d. Signed-off-by: J. Dekker <jdek@itanimul.li>	2024-08-28 11:45:46 +02:00
J. Dekker	d0986709a8	checkasm: improve print format Port dav1d's checkasm output format to FFmpeg's checkasm, includes relative speedups and aligns results. Signed-off-by: J. Dekker <jdek@itanimul.li>	2024-08-28 11:45:46 +02:00
J. Dekker	03f26549cd	checkasm: print only results to stdout Signed-off-by: J. Dekker <jdek@itanimul.li>	2024-08-28 11:45:46 +02:00
J. Dekker	42528ff835	checkasm: add csv/tsv bench output When collecting performance information from checkasm it is common to parse the output for use in graphs to compare vs different architectures. Signed-off-by: J. Dekker <jdek@itanimul.li>	2024-08-28 11:45:46 +02:00
Ramiro Polla	834964ce1a	checkasm/mpegvideoencdsp: add pix_sum, pix_norm1, and draw_edges	2024-08-26 12:48:09 +02:00
Ramiro Polla	a2e01cade8	checkasm/yuv2yuv: add tests for semiplanar unscaled converters	2024-08-26 11:04:46 +02:00
Ramiro Polla	4545205a26	swscale/yuv2rgb: add yuv42{0,2}p -> gbrp unscaled colorspace converters	2024-08-18 22:26:11 +02:00
Nuo Mi	7eb1df44ae	checkasm: add tests for vvc dmvr dmvr_8_12x20_c: 186.2 dmvr_8_12x20_avx2: 25.7 dmvr_8_20x12_c: 181.7 dmvr_8_20x12_avx2: 25.2 dmvr_8_20x20_c: 283.2 dmvr_8_20x20_avx2: 32.0 dmvr_10_12x20_c: 90.0 dmvr_10_12x20_avx2: 15.7 dmvr_10_20x12_c: 41.0 dmvr_10_20x12_avx2: 14.7 dmvr_10_20x20_c: 81.5 dmvr_10_20x20_avx2: 26.7 dmvr_12_12x20_c: 190.7 dmvr_12_12x20_avx2: 20.2 dmvr_12_20x12_c: 187.2 dmvr_12_20x12_avx2: 20.2 dmvr_12_20x20_c: 292.7 dmvr_12_20x20_avx2: 27.2 dmvr_h_8_12x20_c: 317.0 dmvr_h_8_12x20_avx2: 37.0 dmvr_h_8_20x12_c: 340.0 dmvr_h_8_20x12_avx2: 41.0 dmvr_h_8_20x20_c: 540.7 dmvr_h_8_20x20_avx2: 64.0 dmvr_h_10_12x20_c: 322.7 dmvr_h_10_12x20_avx2: 30.7 dmvr_h_10_20x12_c: 344.2 dmvr_h_10_20x12_avx2: 34.0 dmvr_h_10_20x20_c: 529.0 dmvr_h_10_20x20_avx2: 51.5 dmvr_h_12_12x20_c: 326.7 dmvr_h_12_12x20_avx2: 33.5 dmvr_h_12_20x12_c: 331.7 dmvr_h_12_20x12_avx2: 51.2 dmvr_h_12_20x20_c: 534.0 dmvr_h_12_20x20_avx2: 62.7 dmvr_hv_8_12x20_c: 650.0 dmvr_hv_8_12x20_avx2: 57.2 dmvr_hv_8_20x12_c: 676.2 dmvr_hv_8_20x12_avx2: 70.0 dmvr_hv_8_20x20_c: 1068.5 dmvr_hv_8_20x20_avx2: 103.2 dmvr_hv_10_12x20_c: 649.0 dmvr_hv_10_12x20_avx2: 48.2 dmvr_hv_10_20x12_c: 677.7 dmvr_hv_10_20x12_avx2: 59.7 dmvr_hv_10_20x20_c: 1093.5 dmvr_hv_10_20x20_avx2: 91.7 dmvr_hv_12_12x20_c: 660.0 dmvr_hv_12_12x20_avx2: 58.7 dmvr_hv_12_20x12_c: 682.7 dmvr_hv_12_20x12_avx2: 72.0 dmvr_hv_12_20x20_c: 1094.0 dmvr_hv_12_20x20_avx2: 113.2 dmvr_v_8_12x20_c: 325.7 dmvr_v_8_12x20_avx2: 31.2 dmvr_v_8_20x12_c: 326.2 dmvr_v_8_20x12_avx2: 38.5 dmvr_v_8_20x20_c: 538.5 dmvr_v_8_20x20_avx2: 54.2 dmvr_v_10_12x20_c: 318.5 dmvr_v_10_12x20_avx2: 23.7 dmvr_v_10_20x12_c: 330.7 dmvr_v_10_20x12_avx2: 40.5 dmvr_v_10_20x20_c: 567.5 dmvr_v_10_20x20_avx2: 48.0 dmvr_v_12_12x20_c: 335.2 dmvr_v_12_12x20_avx2: 30.0 dmvr_v_12_20x12_c: 330.2 dmvr_v_12_20x12_avx2: 39.5 dmvr_v_12_20x20_c: 535.2 dmvr_v_12_20x20_avx2: 60.0	2024-08-15 20:19:45 +08:00
Rémi Denis-Courmont	d1326b6347	lavu/riscv: drop probing for zba CPU capability	2024-08-05 21:16:26 +03:00
Rémi Denis-Courmont	1b2a925e94	lavc/riscv: drop probing for F & D extensions F and D extensions are included in all RISC-V application profiles ever made (so starting from RV64GC a.k.a. RVA20). Realistically they need to be selected at compilation time. Currently, there are no consumers for these two flags. If there is ever a need to reintroduce F- or D-specific optimisations, we can always use __riscv_f or __riscv_d compiler predefined macros respectively.	2024-08-01 22:56:50 +03:00
Rémi Denis-Courmont	656a9664bf	checkasm/riscv: preserve T1 whilst calling... This preserves T1 whilst calling the instrumented function. In a Sci-Fi setting where type-based Control Flow Integrity (CFI) is supported, the calling code (i.e., the `checkasm` test case) will set T1 to the expected value of the landing pad label (LPL) of the instrumented function. The call wrapper will always use LPL zero which is a wild card. We should preserve the value of T1 at least until the indirect call to the instrumented function. Of course this is Sci-Fi, because: 1) there is no hardware (or even QEMU) support yet, 2) all our assembler functions currently use LPL zero anyway. This uses T3 rather than T2 because indirect branches with T2 are reserved for notionally direct calls made with an indirect call instruction (e.g. due to GOT indirection), and are exempted from forward-edge CFI checks.	2024-08-01 18:44:01 +03:00
Rémi Denis-Courmont	8030876d1c	checkasm/riscv: align the landing pads	2024-07-25 23:10:14 +03:00
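
Commit 384fe39623 above describes computing the range-conversion constants (coeff and offset) at runtime from the bit depth, with 14 fixed-point bits for depths <= 14. The following is only an illustrative sketch of that idea for the luma full-range to mpeg-range direction. It is not swscale's actual code: the names `RangeCoeffs`, `luma_to_mpeg_coeffs` and `luma_to_mpeg` are hypothetical, and the sketch works on native-depth sample values, which may differ from the intermediate format swscale uses internally.

```c
#include <stdint.h>

/* Fixed-point precision; the commit message cites 14 bits for depths <= 14. */
#define FIX_BITS 14

/* Hypothetical container for the two constants named in the commit message. */
typedef struct RangeCoeffs {
    uint16_t coeff;  /* unsigned 16-bit, as stated for bit depths <= 14 */
    int32_t  offset; /* 32-bit, as stated for bit depths <= 14 */
} RangeCoeffs;

/* Derive constants mapping [0, 2^depth - 1] onto [16, 235] << (depth - 8).
 * Assumes 8 <= bit_depth <= 14. */
static RangeCoeffs luma_to_mpeg_coeffs(int bit_depth)
{
    const int64_t src_max  = (1 << bit_depth) - 1;
    const int64_t dst_min  = 16 << (bit_depth - 8);
    const int64_t dst_span = (235 - 16) << (bit_depth - 8);
    RangeCoeffs rc;

    /* Scale factor, rounded to nearest, in FIX_BITS fixed point. */
    rc.coeff  = (uint16_t)(((dst_span << FIX_BITS) + src_max / 2) / src_max);
    /* Lower bound of the target range plus half an LSB for rounding. */
    rc.offset = (int32_t)((dst_min << FIX_BITS) + (1 << (FIX_BITS - 1)));
    return rc;
}

/* Apply the conversion to a single sample. */
static int luma_to_mpeg(int v, int bit_depth)
{
    RangeCoeffs rc = luma_to_mpeg_coeffs(bit_depth);
    return (int)((v * (int32_t)rc.coeff + rc.offset) >> FIX_BITS);
}
```

With these constants the largest input lands on the upper bound of the mpeg range (for example 255 maps to 235 at 8 bits and 1023 maps to 940 at 10 bits), which is the property the commit message says was previously not respected.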

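Commit 6668268e16 above switches the lls test from absolute to relative tolerances, because the possible error grows with the magnitude of the output values. A minimal sketch of the difference follows; the helper names `approx_equal_abs` and `approx_equal_rel` are hypothetical and are not checkasm's own comparison functions.

```c
/* Absolute value without pulling in libm. */
static double abs_d(double x)
{
    return x < 0 ? -x : x;
}

/* Absolute tolerance: the allowed error is the same at every magnitude. */
static int approx_equal_abs(double ref, double val, double abs_tol)
{
    return abs_d(ref - val) <= abs_tol;
}

/* Relative tolerance: the allowed error scales with the larger operand. */
static int approx_equal_rel(double ref, double val, double rel_tol)
{
    double a = abs_d(ref), b = abs_d(val);
    double scale = a > b ? a : b;

    return abs_d(ref - val) <= rel_tol * scale;
}
```

For values around 1e9, a difference of 1.0 fails an absolute tolerance of 1e-3 but passes a relative tolerance of 1e-6, while the same relative tolerance still rejects 1.0 versus 1.1.
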

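Commit e758b24396 above adds glob-style filtering via `--test=<pattern>` and `--bench[=<pattern>]`. As a rough sketch of how such a filter can be matched, the function below handles only the `*` wildcard. The name `glob_match` is hypothetical, and checkasm's actual matcher may support more syntax or behave differently.

```c
#include <stddef.h>

/* Return 1 if str matches pat, where '*' in pat matches any run of
 * characters (including an empty one). All other characters must match
 * literally. */
static int glob_match(const char *pat, const char *str)
{
    const char *star  = NULL; /* position of the most recent '*' in pat */
    const char *retry = NULL; /* where in str that '*' started matching */

    while (*str) {
        if (*pat == '*') {
            /* Remember the star and first try matching it against nothing. */
            star  = pat++;
            retry = str;
        } else if (*pat == *str) {
            pat++;
            str++;
        } else if (star) {
            /* Mismatch: let the last star swallow one more character. */
            pat = star + 1;
            str = ++retry;
        } else {
            return 0;
        }
    }
    /* Trailing stars can match the empty remainder. */
    while (*pat == '*')
        pat++;
    return *pat == '\0';
}
```

With this sketch, the pattern `yuva*` from the commit's example accepts `yuva420p_bgr24_8_c` and rejects `yuv420p_bgr24_8_c`.
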
585 commits