Builder ffmpeg-solaris10-sparc Build #13172
Results:
Failed shell_2 shell_3 shell_4 shell_5
SourceStamp:
| Project | ffmpeg |
| Repository | https://git.ffmpeg.org/ffmpeg.git |
| Branch | master |
| Revision | 6c1c1720cf940057ace674efbcb4c0621bf535d2 |
| Got Revision | 6c1c1720cf940057ace674efbcb4c0621bf535d2 |
| Changes | 12 changes |
BuildSlave:
unstable10s
Reason:
The SingleBranchScheduler scheduler named 'schedule-ffmpeg-solaris10-sparc' triggered this build
Steps and Logfiles:
- git update ( 11 secs )
- shell 'gsed -i ...' ( 0 secs )
- shell_1 'gsed -i ...' ( 0 secs )
- shell_2 'gsed -i ...' failed ( 0 secs )
- shell_3 './configure --samples="../../../ffmpeg/fate-suite" ...' failed ( 7 secs )
- shell_4 'gmake fate-rsync' failed ( 0 secs )
- shell_5 '../../../ffmpeg/fate.sh ../../../ffmpeg/fate_config.sh' failed ( 0 secs )
Build Properties:
| Name | Value | Source |
|---|---|---|
| branch | master | Build |
| builddir | /export/home/buildbot-unstable10s/slave/ffmpeg-solaris10-sparc | slave |
| buildername | ffmpeg-solaris10-sparc | Builder |
| buildnumber | 13172 | Build |
| codebase | | Build |
| got_revision | 6c1c1720cf940057ace674efbcb4c0621bf535d2 | Git |
| project | ffmpeg | Build |
| repository | https://git.ffmpeg.org/ffmpeg.git | Build |
| revision | 6c1c1720cf940057ace674efbcb4c0621bf535d2 | Build |
| scheduler | schedule-ffmpeg-solaris10-sparc | Scheduler |
| slavename | unstable10s | BuildSlave |
| workdir | /export/home/buildbot-unstable10s/slave/ffmpeg-solaris10-sparc | slave (deprecated) |
Forced Build Properties:
| Name | Label | Value |
|---|---|---|
Responsible Users:
- Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Timing:
| Start | Sun Feb 22 03:05:03 2026 |
| End | Sun Feb 22 03:05:24 2026 |
| Elapsed | 20 secs |
All Changes:
Change #258482
| Category | ffmpeg |
| Changed by | Andreas Rheinhardt <andreas.rheinhardt@outlook.com> |
| Changed at | Sun 22 Feb 2026 00:57:56 |
| Repository | https://git.ffmpeg.org/ffmpeg.git |
| Project | ffmpeg |
| Branch | master |
| Revision | 7bf9c1e3f6effbe7d2dd53096bf2a7dbbb07d7ff |
Comments
avcodec/x86/vvc/mc: Avoid redundant clipping for 8bit
It is already done by packuswb.
Old benchmarks:
avg_8_2x2_c: 11.1 ( 1.00x)
avg_8_2x2_avx2: 8.6 ( 1.28x)
avg_8_4x4_c: 30.0 ( 1.00x)
avg_8_4x4_avx2: 10.8 ( 2.78x)
avg_8_8x8_c: 132.0 ( 1.00x)
avg_8_8x8_avx2: 25.7 ( 5.14x)
avg_8_16x16_c: 254.6 ( 1.00x)
avg_8_16x16_avx2: 33.2 ( 7.67x)
avg_8_32x32_c: 897.5 ( 1.00x)
avg_8_32x32_avx2: 115.6 ( 7.76x)
avg_8_64x64_c: 3316.9 ( 1.00x)
avg_8_64x64_avx2: 626.5 ( 5.29x)
avg_8_128x128_c: 12973.6 ( 1.00x)
avg_8_128x128_avx2: 1914.0 ( 6.78x)
w_avg_8_2x2_c: 16.7 ( 1.00x)
w_avg_8_2x2_avx2: 14.4 ( 1.16x)
w_avg_8_4x4_c: 48.2 ( 1.00x)
w_avg_8_4x4_avx2: 16.5 ( 2.92x)
w_avg_8_8x8_c: 168.1 ( 1.00x)
w_avg_8_8x8_avx2: 49.7 ( 3.38x)
w_avg_8_16x16_c: 392.4 ( 1.00x)
w_avg_8_16x16_avx2: 61.1 ( 6.43x)
w_avg_8_32x32_c: 1455.3 ( 1.00x)
w_avg_8_32x32_avx2: 224.6 ( 6.48x)
w_avg_8_64x64_c: 5632.1 ( 1.00x)
w_avg_8_64x64_avx2: 896.9 ( 6.28x)
w_avg_8_128x128_c: 22136.3 ( 1.00x)
w_avg_8_128x128_avx2: 3626.7 ( 6.10x)
New benchmarks:
avg_8_2x2_c: 12.3 ( 1.00x)
avg_8_2x2_avx2: 8.1 ( 1.52x)
avg_8_4x4_c: 30.3 ( 1.00x)
avg_8_4x4_avx2: 11.3 ( 2.67x)
avg_8_8x8_c: 131.8 ( 1.00x)
avg_8_8x8_avx2: 21.3 ( 6.20x)
avg_8_16x16_c: 255.0 ( 1.00x)
avg_8_16x16_avx2: 30.6 ( 8.33x)
avg_8_32x32_c: 898.5 ( 1.00x)
avg_8_32x32_avx2: 104.9 ( 8.57x)
avg_8_64x64_c: 3317.7 ( 1.00x)
avg_8_64x64_avx2: 540.9 ( 6.13x)
avg_8_128x128_c: 12986.5 ( 1.00x)
avg_8_128x128_avx2: 1663.4 ( 7.81x)
w_avg_8_2x2_c: 16.8 ( 1.00x)
w_avg_8_2x2_avx2: 13.9 ( 1.21x)
w_avg_8_4x4_c: 48.2 ( 1.00x)
w_avg_8_4x4_avx2: 16.2 ( 2.98x)
w_avg_8_8x8_c: 168.6 ( 1.00x)
w_avg_8_8x8_avx2: 46.3 ( 3.64x)
w_avg_8_16x16_c: 392.4 ( 1.00x)
w_avg_8_16x16_avx2: 57.7 ( 6.80x)
w_avg_8_32x32_c: 1454.6 ( 1.00x)
w_avg_8_32x32_avx2: 214.6 ( 6.78x)
w_avg_8_64x64_c: 5638.4 ( 1.00x)
w_avg_8_64x64_avx2: 875.6 ( 6.44x)
w_avg_8_128x128_c: 22133.5 ( 1.00x)
w_avg_8_128x128_avx2: 3334.3 ( 6.64x)
Also saves 550B of .text here. The improvements will likely be even better on Win64, because it avoids using two nonvolatile registers in the weighted average case.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Changed files
- libavcodec/x86/vvc/mc.asm
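The claim in the commit above, that an explicit clip is redundant because `packuswb` already saturates, can be checked with a small model. This is only a sketch in Python, not the assembly from mc.asm: `packuswb_lane` and `clip_then_pack` are hypothetical helpers that model a single lane of `packuswb` (signed 16-bit input, unsigned 8-bit output with saturation).

```python
def packuswb_lane(word: int) -> int:
    """Model one lane of packuswb: signed 16-bit in, unsigned 8-bit out, saturating."""
    assert -32768 <= word <= 32767
    return max(0, min(255, word))


def clip_then_pack(word: int) -> int:
    """Clip to [0, 255] first and pack afterwards (the redundant variant)."""
    clipped = max(0, min(255, word))
    return packuswb_lane(clipped)


# Compare both variants exhaustively over every signed 16-bit input.
mismatches = [w for w in range(-32768, 32768) if clip_then_pack(w) != packuswb_lane(w)]
print(len(mismatches))
```

Because the pack step saturates on its own, an earlier clip to the same 8-bit range cannot change any result in this model.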
Change #258483
| Category | ffmpeg |
| Changed by | Andreas Rheinhardt <andreas.rheinhardt@outlook.com> |
| Changed at | Sun 22 Feb 2026 00:57:56 |
| Repository | https://git.ffmpeg.org/ffmpeg.git |
| Project | ffmpeg |
| Branch | master |
| Revision | caa0ae0cfb35de0ae3fd5f346caef89d62eeaf7c |
Comments
avcodec/x86/vvc/mc: Avoid pextr[dq], v{insert,extract}i128
Use mov[dq], movdqu instead if the least significant parts are set (i.e. if the immediate value is 0x0).
Old benchmarks:
avg_8_2x2_c: 11.3 ( 1.00x)
avg_8_2x2_avx2: 7.5 ( 1.50x)
avg_8_4x4_c: 31.2 ( 1.00x)
avg_8_4x4_avx2: 10.7 ( 2.91x)
avg_8_8x8_c: 133.5 ( 1.00x)
avg_8_8x8_avx2: 21.2 ( 6.30x)
avg_8_16x16_c: 254.7 ( 1.00x)
avg_8_16x16_avx2: 30.1 ( 8.46x)
avg_8_32x32_c: 896.9 ( 1.00x)
avg_8_32x32_avx2: 103.9 ( 8.63x)
avg_8_64x64_c: 3320.7 ( 1.00x)
avg_8_64x64_avx2: 539.4 ( 6.16x)
avg_8_128x128_c: 12991.5 ( 1.00x)
avg_8_128x128_avx2: 1661.3 ( 7.82x)
avg_10_2x2_c: 21.3 ( 1.00x)
avg_10_2x2_avx2: 8.3 ( 2.55x)
avg_10_4x4_c: 34.9 ( 1.00x)
avg_10_4x4_avx2: 10.6 ( 3.28x)
avg_10_8x8_c: 76.3 ( 1.00x)
avg_10_8x8_avx2: 20.2 ( 3.77x)
avg_10_16x16_c: 255.9 ( 1.00x)
avg_10_16x16_avx2: 24.1 (10.60x)
avg_10_32x32_c: 932.4 ( 1.00x)
avg_10_32x32_avx2: 73.3 (12.72x)
avg_10_64x64_c: 3516.4 ( 1.00x)
avg_10_64x64_avx2: 601.7 ( 5.84x)
avg_10_128x128_c: 13690.6 ( 1.00x)
avg_10_128x128_avx2: 1613.2 ( 8.49x)
avg_12_2x2_c: 14.0 ( 1.00x)
avg_12_2x2_avx2: 8.3 ( 1.67x)
avg_12_4x4_c: 35.3 ( 1.00x)
avg_12_4x4_avx2: 10.9 ( 3.26x)
avg_12_8x8_c: 76.5 ( 1.00x)
avg_12_8x8_avx2: 20.3 ( 3.77x)
avg_12_16x16_c: 256.7 ( 1.00x)
avg_12_16x16_avx2: 24.1 (10.63x)
avg_12_32x32_c: 932.5 ( 1.00x)
avg_12_32x32_avx2: 73.3 (12.72x)
avg_12_64x64_c: 3520.5 ( 1.00x)
avg_12_64x64_avx2: 602.6 ( 5.84x)
avg_12_128x128_c: 13689.6 ( 1.00x)
avg_12_128x128_avx2: 1613.1 ( 8.49x)
w_avg_8_2x2_c: 16.7 ( 1.00x)
w_avg_8_2x2_avx2: 13.4 ( 1.25x)
w_avg_8_4x4_c: 44.5 ( 1.00x)
w_avg_8_4x4_avx2: 15.9 ( 2.81x)
w_avg_8_8x8_c: 166.1 ( 1.00x)
w_avg_8_8x8_avx2: 45.7 ( 3.63x)
w_avg_8_16x16_c: 392.9 ( 1.00x)
w_avg_8_16x16_avx2: 57.8 ( 6.80x)
w_avg_8_32x32_c: 1455.5 ( 1.00x)
w_avg_8_32x32_avx2: 215.0 ( 6.77x)
w_avg_8_64x64_c: 5621.8 ( 1.00x)
w_avg_8_64x64_avx2: 875.2 ( 6.42x)
w_avg_8_128x128_c: 22131.3 ( 1.00x)
w_avg_8_128x128_avx2: 3390.1 ( 6.53x)
w_avg_10_2x2_c: 18.0 ( 1.00x)
w_avg_10_2x2_avx2: 14.0 ( 1.28x)
w_avg_10_4x4_c: 53.9 ( 1.00x)
w_avg_10_4x4_avx2: 15.9 ( 3.40x)
w_avg_10_8x8_c: 109.5 ( 1.00x)
w_avg_10_8x8_avx2: 40.4 ( 2.71x)
w_avg_10_16x16_c: 395.7 ( 1.00x)
w_avg_10_16x16_avx2: 44.7 ( 8.86x)
w_avg_10_32x32_c: 1532.7 ( 1.00x)
w_avg_10_32x32_avx2: 142.4 (10.77x)
w_avg_10_64x64_c: 6007.7 ( 1.00x)
w_avg_10_64x64_avx2: 745.5 ( 8.06x)
w_avg_10_128x128_c: 23719.7 ( 1.00x)
w_avg_10_128x128_avx2: 2217.7 (10.70x)
w_avg_12_2x2_c: 18.9 ( 1.00x)
w_avg_12_2x2_avx2: 13.6 ( 1.38x)
w_avg_12_4x4_c: 47.5 ( 1.00x)
w_avg_12_4x4_avx2: 15.9 ( 2.99x)
w_avg_12_8x8_c: 109.3 ( 1.00x)
w_avg_12_8x8_avx2: 40.9 ( 2.67x)
w_avg_12_16x16_c: 395.6 ( 1.00x)
w_avg_12_16x16_avx2: 44.8 ( 8.84x)
w_avg_12_32x32_c: 1531.0 ( 1.00x)
w_avg_12_32x32_avx2: 141.8 (10.80x)
w_avg_12_64x64_c: 6016.7 ( 1.00x)
w_avg_12_64x64_avx2: 732.8 ( 8.21x)
w_avg_12_128x128_c: 23762.2 ( 1.00x)
w_avg_12_128x128_avx2: 2223.4 (10.69x)
New benchmarks:
avg_8_2x2_c: 11.3 ( 1.00x)
avg_8_2x2_avx2: 7.6 ( 1.49x)
avg_8_4x4_c: 31.2 ( 1.00x)
avg_8_4x4_avx2: 10.8 ( 2.89x)
avg_8_8x8_c: 131.6 ( 1.00x)
avg_8_8x8_avx2: 15.6 ( 8.42x)
avg_8_16x16_c: 255.3 ( 1.00x)
avg_8_16x16_avx2: 27.9 ( 9.16x)
avg_8_32x32_c: 897.9 ( 1.00x)
avg_8_32x32_avx2: 81.2 (11.06x)
avg_8_64x64_c: 3320.0 ( 1.00x)
avg_8_64x64_avx2: 335.1 ( 9.91x)
avg_8_128x128_c: 12999.1 ( 1.00x)
avg_8_128x128_avx2: 1456.3 ( 8.93x)
avg_10_2x2_c: 12.0 ( 1.00x)
avg_10_2x2_avx2: 8.6 ( 1.40x)
avg_10_4x4_c: 34.9 ( 1.00x)
avg_10_4x4_avx2: 9.7 ( 3.61x)
avg_10_8x8_c: 76.7 ( 1.00x)
avg_10_8x8_avx2: 16.3 ( 4.69x)
avg_10_16x16_c: 256.3 ( 1.00x)
avg_10_16x16_avx2: 25.2 (10.18x)
avg_10_32x32_c: 932.8 ( 1.00x)
avg_10_32x32_avx2: 73.3 (12.72x)
avg_10_64x64_c: 3518.8 ( 1.00x)
avg_10_64x64_avx2: 416.8 ( 8.44x)
avg_10_128x128_c: 13691.6 ( 1.00x)
avg_10_128x128_avx2: 1612.9 ( 8.49x)
avg_12_2x2_c: 14.1 ( 1.00x)
avg_12_2x2_avx2: 8.7 ( 1.62x)
avg_12_4x4_c: 35.7 ( 1.00x)
avg_12_4x4_avx2: 9.7 ( 3.68x)
avg_12_8x8_c: 77.0 ( 1.00x)
avg_12_8x8_avx2: 16.9 ( 4.57x)
avg_12_16x16_c: 256.2 ( 1.00x)
avg_12_16x16_avx2: 25.7 ( 9.96x)
avg_12_32x32_c: 933.5 ( 1.00x)
avg_12_32x32_avx2: 74.0 (12.62x)
avg_12_64x64_c: 3516.4 ( 1.00x)
avg_12_64x64_avx2: 408.7 ( 8.60x)
avg_12_128x128_c: 13691.6 ( 1.00x)
avg_12_128x128_avx2: 1613.8 ( 8.48x)
w_avg_8_2x2_c: 16.7 ( 1.00x)
w_avg_8_2x2_avx2: 14.0 ( 1.19x)
w_avg_8_4x4_c: 48.2 ( 1.00x)
w_avg_8_4x4_avx2: 16.1 ( 3.00x)
w_avg_8_8x8_c: 168.0 ( 1.00x)
w_avg_8_8x8_avx2: 22.5 ( 7.47x)
w_avg_8_16x16_c: 392.5 ( 1.00x)
w_avg_8_16x16_avx2: 47.9 ( 8.19x)
w_avg_8_32x32_c: 1453.7 ( 1.00x)
w_avg_8_32x32_avx2: 176.1 ( 8.26x)
w_avg_8_64x64_c: 5631.4 ( 1.00x)
w_avg_8_64x64_avx2: 690.8 ( 8.15x)
w_avg_8_128x128_c: 22139.5 ( 1.00x)
w_avg_8_128x128_avx2: 2742.4 ( 8.07x)
w_avg_10_2x2_c: 18.1 ( 1.00x)
w_avg_10_2x2_avx2: 13.8 ( 1.31x)
w_avg_10_4x4_c: 47.0 ( 1.00x)
w_avg_10_4x4_avx2: 16.4 ( 2.87x)
w_avg_10_8x8_c: 110.0 ( 1.00x)
w_avg_10_8x8_avx2: 21.6 ( 5.09x)
w_avg_10_16x16_c: 395.2 ( 1.00x)
w_avg_10_16x16_avx2: 45.4 ( 8.71x)
w_avg_10_32x32_c: 1533.8 ( 1.00x)
w_avg_10_32x32_avx2: 142.6 (10.76x)
w_avg_10_64x64_c: 6004.4 ( 1.00x)
w_avg_10_64x64_avx2: 672.8 ( 8.92x)
w_avg_10_128x128_c: 23748.5 ( 1.00x)
w_avg_10_128x128_avx2: 2198.0 (10.80x)
w_avg_12_2x2_c: 17.2 ( 1.00x)
w_avg_12_2x2_avx2: 13.9 ( 1.24x)
w_avg_12_4x4_c: 51.4 ( 1.00x)
w_avg_12_4x4_avx2: 16.5 ( 3.11x)
w_avg_12_8x8_c: 109.1 ( 1.00x)
w_avg_12_8x8_avx2: 22.0 ( 4.96x)
w_avg_12_16x16_c: 395.9 ( 1.00x)
w_avg_12_16x16_avx2: 44.9 ( 8.81x)
w_avg_12_32x32_c: 1533.5 ( 1.00x)
w_avg_12_32x32_avx2: 142.3 (10.78x)
w_avg_12_64x64_c: 6002.0 ( 1.00x)
w_avg_12_64x64_avx2: 557.5 (10.77x)
w_avg_12_128x128_c: 23749.5 ( 1.00x)
w_avg_12_128x128_avx2: 2202.0 (10.79x)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Changed files
- libavcodec/x86/vvc/mc.asm
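The substitution described in the commit above relies on an extract with immediate 0 returning the least significant part of the register, which a plain move also returns. The following is a minimal Python model of that equivalence, not the actual assembly: registers are represented as integers, and the function names are hypothetical stand-ins for the instructions they are named after.

```python
MASK32 = (1 << 32) - 1
MASK128 = (1 << 128) - 1


def pextrd(xmm: int, imm: int) -> int:
    """Model pextrd: the 32-bit lane of a 128-bit value selected by imm[1:0]."""
    return (xmm >> (32 * (imm & 3))) & MASK32


def movd(xmm: int) -> int:
    """Model movd from an xmm register: the low 32 bits."""
    return xmm & MASK32


def vextracti128(ymm: int, imm: int) -> int:
    """Model vextracti128: the 128-bit half of a 256-bit value selected by imm[0]."""
    return (ymm >> (128 * (imm & 1))) & MASK128


def low_xmm(ymm: int) -> int:
    """Model the xmm alias of a ymm register, i.e. what a movdqu store would write."""
    return ymm & MASK128


# A 256-bit test pattern whose bytes are 0x00..0x1f in little-endian order.
ymm = int.from_bytes(bytes(range(32)), "little")
xmm = low_xmm(ymm)
print(pextrd(xmm, 0) == movd(xmm), vextracti128(ymm, 0) == low_xmm(ymm))
```

The equivalence only holds for immediate 0; any other immediate selects a different lane, so those cases still need the extract instruction.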
Change #258484
| Category | ffmpeg |
| Changed by | Andreas Rheinhardt <andreas.rheinhardt@outlook.com> |
| Changed at | Sun 22 Feb 2026 00:57:56 |
| Repository | https://git.ffmpeg.org/ffmpeg.git |
| Project | ffmpeg |
| Branch | master |
| Revision | 9317fb2b2ed2e7b7718f99f3521997425134a78d |
Comments
avcodec/x86/vvc/mc: Avoid ymm registers where possible
Widths 2 and 4 fit into xmm registers.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Changed files
- libavcodec/x86/vvc/mc.asm
Change #258485
| Category | ffmpeg |
| Changed by | Andreas Rheinhardt <andreas.rheinhardt@outlook.com> |
| Changed at | Sun 22 Feb 2026 00:57:56 |
| Repository | https://git.ffmpeg.org/ffmpeg.git |
| Project | ffmpeg |
| Branch | master |
| Revision | eabf52e787f60ad57d9bce28bcce0cca730da6da |
Comments
avcodec/x86/vvc/mc: Avoid unused work
The high quadword of these registers is zero for width 2.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Changed files
- libavcodec/x86/vvc/mc.asm
Change #258486
| Category | ffmpeg |
| Changed by | Andreas Rheinhardt <andreas.rheinhardt@outlook.com> |
| Changed at | Sun 22 Feb 2026 00:57:56 |
| Repository | https://git.ffmpeg.org/ffmpeg.git |
| Project | ffmpeg |
| Branch | master |
| Revision | 59f8ff4c183d3d91156d9b1054b8e070dbbfd5fb |
Comments
avcodec/x86/vvc/mc: Remove unused constants
Also avoid overaligning .rodata.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Changed files
- libavcodec/x86/vvc/mc.asm
Change #258487
| Category | ffmpeg |
| Changed by | Andreas Rheinhardt <andreas.rheinhardt@outlook.com> |
| Changed at | Sun 22 Feb 2026 00:57:56 |
| Repository | https://git.ffmpeg.org/ffmpeg.git |
| Project | ffmpeg |
| Branch | master |
| Revision | 5a60b3f1a618ec9981f2950558df59087f799995 |
Comments
avcodec/x86/vvc/mc: Remove always-false branches
The C versions of the average and weighted average functions contains "FFMAX(3, 15 - BIT_DEPTH)" and the code here followed this; yet it is only instantiated for bit depths 8, 10 and 12, for which the above is just 15-BIT_DEPTH. So the comparisons are unnecessary.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Changed files
- libavcodec/x86/vvc/mc.asm
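The arithmetic behind the commit above can be verified directly. In this short Python sketch, `ffmax` is a hypothetical stand-in for the `FFMAX` macro quoted in the commit message, and the range checked for `first_clamped` is chosen only for illustration.

```python
def ffmax(a: int, b: int) -> int:
    """Python stand-in for the FFMAX macro: the larger of the two arguments."""
    return a if a > b else b


# The three bit depths named in the commit message.
for bit_depth in (8, 10, 12):
    value = ffmax(3, 15 - bit_depth)
    print(bit_depth, value, value == 15 - bit_depth)

# Smallest bit depth (searched in 8..16) at which the clamp to 3 would change the result.
first_clamped = next(bd for bd in range(8, 17) if ffmax(3, 15 - bd) != 15 - bd)
print(first_clamped)
```

For depths 8, 10 and 12 the expression `15 - BIT_DEPTH` is 7, 5 and 3, none of which is below 3, so the maximum never selects the constant.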
Change #258488
| Category | ffmpeg |
| Changed by | Andreas Rheinhardt <andreas.rheinhardt@outlook.com> |
| Changed at | Sun 22 Feb 2026 00:58:33 |
| Repository | https://git.ffmpeg.org/ffmpeg.git |
| Project | ffmpeg |
| Branch | master |
| Revision | ea78402e9c78173118fb1e71d4192cb6840388e7 |
Comments
avcodec/x86/vvc/mc,dsp_init: Avoid pointless wrappers for avg
Up until now, there were two averaging assembly functions, one for eight bit content and one for <=16 bit content; there are also three C-wrappers around these functions, for 8, 10 and 12 bpp. These wrappers simply forward the maximum permissible value (i.e. (1<<bpp)-1) and promote some integer values to ptrdiff_t.
Yet these wrappers are absolutely useless: The assembly functions rederive the bpp from the maximum and only the integer part of the promoted ptrdiff_t values is ever used. Of course, these wrappers also entail an additional call (not a tail call, because the additional maximum parameter is passed on the stack).
Remove the wrappers and add per-bpp assembly functions instead. Given that the only difference between 10 and 12 bits are some constants in registers, the main part of these functions can be shared (given that this code uses a jumptable, it can even be done without adding any additional jump).
Old benchmarks:
avg_8_2x2_c: 11.4 ( 1.00x)
avg_8_2x2_avx2: 7.9 ( 1.44x)
avg_8_4x4_c: 30.7 ( 1.00x)
avg_8_4x4_avx2: 10.4 ( 2.95x)
avg_8_8x8_c: 134.5 ( 1.00x)
avg_8_8x8_avx2: 16.6 ( 8.12x)
avg_8_16x16_c: 255.6 ( 1.00x)
avg_8_16x16_avx2: 28.2 ( 9.07x)
avg_8_32x32_c: 897.7 ( 1.00x)
avg_8_32x32_avx2: 83.9 (10.70x)
avg_8_64x64_c: 3320.0 ( 1.00x)
avg_8_64x64_avx2: 321.1 (10.34x)
avg_8_128x128_c: 12981.8 ( 1.00x)
avg_8_128x128_avx2: 1480.1 ( 8.77x)
avg_10_2x2_c: 12.0 ( 1.00x)
avg_10_2x2_avx2: 8.4 ( 1.43x)
avg_10_4x4_c: 34.9 ( 1.00x)
avg_10_4x4_avx2: 9.8 ( 3.56x)
avg_10_8x8_c: 76.8 ( 1.00x)
avg_10_8x8_avx2: 15.1 ( 5.08x)
avg_10_16x16_c: 256.6 ( 1.00x)
avg_10_16x16_avx2: 25.1 (10.20x)
avg_10_32x32_c: 932.9 ( 1.00x)
avg_10_32x32_avx2: 73.4 (12.72x)
avg_10_64x64_c: 3517.9 ( 1.00x)
avg_10_64x64_avx2: 414.8 ( 8.48x)
avg_10_128x128_c: 13695.3 ( 1.00x)
avg_10_128x128_avx2: 1648.1 ( 8.31x)
avg_12_2x2_c: 13.1 ( 1.00x)
avg_12_2x2_avx2: 8.6 ( 1.53x)
avg_12_4x4_c: 35.4 ( 1.00x)
avg_12_4x4_avx2: 10.1 ( 3.49x)
avg_12_8x8_c: 76.6 ( 1.00x)
avg_12_8x8_avx2: 16.7 ( 4.60x)
avg_12_16x16_c: 256.6 ( 1.00x)
avg_12_16x16_avx2: 25.5 (10.07x)
avg_12_32x32_c: 933.2 ( 1.00x)
avg_12_32x32_avx2: 75.7 (12.34x)
avg_12_64x64_c: 3519.1 ( 1.00x)
avg_12_64x64_avx2: 416.8 ( 8.44x)
avg_12_128x128_c: 13695.1 ( 1.00x)
avg_12_128x128_avx2: 1651.6 ( 8.29x)
New benchmarks:
avg_8_2x2_c: 11.5 ( 1.00x)
avg_8_2x2_avx2: 6.0 ( 1.91x)
avg_8_4x4_c: 29.7 ( 1.00x)
avg_8_4x4_avx2: 8.0 ( 3.72x)
avg_8_8x8_c: 131.4 ( 1.00x)
avg_8_8x8_avx2: 12.2 (10.74x)
avg_8_16x16_c: 254.3 ( 1.00x)
avg_8_16x16_avx2: 24.8 (10.25x)
avg_8_32x32_c: 897.7 ( 1.00x)
avg_8_32x32_avx2: 77.8 (11.54x)
avg_8_64x64_c: 3321.3 ( 1.00x)
avg_8_64x64_avx2: 318.7 (10.42x)
avg_8_128x128_c: 12988.4 ( 1.00x)
avg_8_128x128_avx2: 1430.1 ( 9.08x)
avg_10_2x2_c: 12.1 ( 1.00x)
avg_10_2x2_avx2: 5.7 ( 2.13x)
avg_10_4x4_c: 35.0 ( 1.00x)
avg_10_4x4_avx2: 9.0 ( 3.88x)
avg_10_8x8_c: 77.2 ( 1.00x)
avg_10_8x8_avx2: 12.4 ( 6.24x)
avg_10_16x16_c: 256.2 ( 1.00x)
avg_10_16x16_avx2: 24.3 (10.56x)
avg_10_32x32_c: 932.9 ( 1.00x)
avg_10_32x32_avx2: 71.9 (12.97x)
avg_10_64x64_c: 3516.8 ( 1.00x)
avg_10_64x64_avx2: 414.7 ( 8.48x)
avg_10_128x128_c: 13693.7 ( 1.00x)
avg_10_128x128_avx2: 1609.3 ( 8.51x)
avg_12_2x2_c: 14.1 ( 1.00x)
avg_12_2x2_avx2: 5.7 ( 2.48x)
avg_12_4x4_c: 35.8 ( 1.00x)
avg_12_4x4_avx2: 9.0 ( 3.96x)
avg_12_8x8_c: 76.9 ( 1.00x)
avg_12_8x8_avx2: 12.4 ( 6.22x)
avg_12_16x16_c: 256.5 ( 1.00x)
avg_12_16x16_avx2: 24.4 (10.50x)
avg_12_32x32_c: 934.1 ( 1.00x)
avg_12_32x32_avx2: 72.0 (12.97x)
avg_12_64x64_c: 3518.2 ( 1.00x)
avg_12_64x64_avx2: 414.8 ( 8.48x)
avg_12_128x128_c: 13689.5 ( 1.00x)
avg_12_128x128_avx2: 1611.1 ( 8.50x)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Changed files
- libavcodec/x86/vvc/dsp_init.c
- libavcodec/x86/vvc/mc.asm
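The commit above notes that the wrappers forwarded `(1<<bpp)-1` only for the assembly to rederive the bit depth from it. The Python sketch below illustrates why that round trip carries no extra information. How the assembly actually recovered the depth is not shown in the commit message; the bit-length approach and both function names here are assumptions made for illustration.

```python
def max_pixel_value(bpp: int) -> int:
    """The maximum permissible sample value the wrappers forwarded: (1 << bpp) - 1."""
    return (1 << bpp) - 1


def bpp_from_max(max_value: int) -> int:
    """Recover the bit depth from an all-ones maximum (assumed method: count its bits)."""
    return max_value.bit_length()


for bpp in (8, 10, 12):
    forwarded = max_pixel_value(bpp)
    print(bpp, forwarded, bpp_from_max(forwarded))
```

Since the maximum and the bit depth determine each other, a function specialised per bit depth needs neither the extra parameter nor the wrapper that supplied it.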
Change #258489
| Category | ffmpeg |
| Changed by | Andreas Rheinhardt <andreas.rheinhardt@outlook.com> |
| Changed at | Sun 22 Feb 2026 01:01:27 |
| Repository | https://git.ffmpeg.org/ffmpeg.git |
| Project | ffmpeg |
| Branch | master |
| Revision | 81fb70c833ac675ac8e09b38ad845a90de4c3e1c |
Comments
avcodec/x86/vvc/mc,dsp_init: Avoid pointless wrappers for w_avg
They only add overhead (in form of another function call, sign-extending some parameters to 64bit (although the upper bits are not used at all) and rederiving the actual number of bits (from the maximum value (1<<bpp)-1)).
Old benchmarks:
w_avg_8_2x2_c: 16.4 ( 1.00x)
w_avg_8_2x2_avx2: 12.9 ( 1.27x)
w_avg_8_4x4_c: 48.0 ( 1.00x)
w_avg_8_4x4_avx2: 14.9 ( 3.23x)
w_avg_8_8x8_c: 168.2 ( 1.00x)
w_avg_8_8x8_avx2: 22.4 ( 7.49x)
w_avg_8_16x16_c: 396.5 ( 1.00x)
w_avg_8_16x16_avx2: 47.9 ( 8.28x)
w_avg_8_32x32_c: 1466.3 ( 1.00x)
w_avg_8_32x32_avx2: 172.8 ( 8.48x)
w_avg_8_64x64_c: 5629.3 ( 1.00x)
w_avg_8_64x64_avx2: 678.7 ( 8.29x)
w_avg_8_128x128_c: 22122.4 ( 1.00x)
w_avg_8_128x128_avx2: 2743.5 ( 8.06x)
w_avg_10_2x2_c: 18.7 ( 1.00x)
w_avg_10_2x2_avx2: 13.1 ( 1.43x)
w_avg_10_4x4_c: 50.3 ( 1.00x)
w_avg_10_4x4_avx2: 15.9 ( 3.17x)
w_avg_10_8x8_c: 109.3 ( 1.00x)
w_avg_10_8x8_avx2: 20.6 ( 5.30x)
w_avg_10_16x16_c: 395.5 ( 1.00x)
w_avg_10_16x16_avx2: 44.8 ( 8.83x)
w_avg_10_32x32_c: 1534.2 ( 1.00x)
w_avg_10_32x32_avx2: 141.4 (10.85x)
w_avg_10_64x64_c: 6003.6 ( 1.00x)
w_avg_10_64x64_avx2: 557.4 (10.77x)
w_avg_10_128x128_c: 23722.7 ( 1.00x)
w_avg_10_128x128_avx2: 2205.0 (10.76x)
w_avg_12_2x2_c: 18.6 ( 1.00x)
w_avg_12_2x2_avx2: 13.1 ( 1.42x)
w_avg_12_4x4_c: 52.2 ( 1.00x)
w_avg_12_4x4_avx2: 16.1 ( 3.24x)
w_avg_12_8x8_c: 109.2 ( 1.00x)
w_avg_12_8x8_avx2: 20.6 ( 5.29x)
w_avg_12_16x16_c: 396.1 ( 1.00x)
w_avg_12_16x16_avx2: 45.0 ( 8.81x)
w_avg_12_32x32_c: 1532.6 ( 1.00x)
w_avg_12_32x32_avx2: 142.1 (10.79x)
w_avg_12_64x64_c: 6002.2 ( 1.00x)
w_avg_12_64x64_avx2: 557.3 (10.77x)
w_avg_12_128x128_c: 23748.7 ( 1.00x)
w_avg_12_128x128_avx2: 2206.4 (10.76x)
New benchmarks:
w_avg_8_2x2_c: 16.0 ( 1.00x)
w_avg_8_2x2_avx2: 9.3 ( 1.71x)
w_avg_8_4x4_c: 48.4 ( 1.00x)
w_avg_8_4x4_avx2: 12.4 ( 3.91x)
w_avg_8_8x8_c: 168.7 ( 1.00x)
w_avg_8_8x8_avx2: 21.1 ( 8.00x)
w_avg_8_16x16_c: 394.5 ( 1.00x)
w_avg_8_16x16_avx2: 46.2 ( 8.54x)
w_avg_8_32x32_c: 1456.3 ( 1.00x)
w_avg_8_32x32_avx2: 171.8 ( 8.48x)
w_avg_8_64x64_c: 5636.2 ( 1.00x)
w_avg_8_64x64_avx2: 676.9 ( 8.33x)
w_avg_8_128x128_c: 22129.1 ( 1.00x)
w_avg_8_128x128_avx2: 2734.3 ( 8.09x)
w_avg_10_2x2_c: 18.7 ( 1.00x)
w_avg_10_2x2_avx2: 10.3 ( 1.82x)
w_avg_10_4x4_c: 50.8 ( 1.00x)
w_avg_10_4x4_avx2: 13.4 ( 3.79x)
w_avg_10_8x8_c: 109.7 ( 1.00x)
w_avg_10_8x8_avx2: 20.4 ( 5.38x)
w_avg_10_16x16_c: 395.2 ( 1.00x)
w_avg_10_16x16_avx2: 41.7 ( 9.48x)
w_avg_10_32x32_c: 1535.6 ( 1.00x)
w_avg_10_32x32_avx2: 137.9 (11.13x)
w_avg_10_64x64_c: 6002.1 ( 1.00x)
w_avg_10_64x64_avx2: 548.5 (10.94x)
w_avg_10_128x128_c: 23742.7 ( 1.00x)
w_avg_10_128x128_avx2: 2179.8 (10.89x)
w_avg_12_2x2_c: 18.9 ( 1.00x)
w_avg_12_2x2_avx2: 10.3 ( 1.84x)
w_avg_12_4x4_c: 52.4 ( 1.00x)
w_avg_12_4x4_avx2: 13.4 ( 3.91x)
w_avg_12_8x8_c: 109.2 ( 1.00x)
w_avg_12_8x8_avx2: 20.3 ( 5.39x)
w_avg_12_16x16_c: 396.3 ( 1.00x)
w_avg_12_16x16_avx2: 41.7 ( 9.51x)
w_avg_12_32x32_c: 1532.6 ( 1.00x)
w_avg_12_32x32_avx2: 138.6 (11.06x)
w_avg_12_64x64_c: 5996.7 ( 1.00x)
w_avg_12_64x64_avx2: 549.6 (10.91x)
w_avg_12_128x128_c: 23738.0 ( 1.00x)
w_avg_12_128x128_avx2: 2177.2 (10.90x)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Changed files
- libavcodec/x86/vvc/dsp_init.c
- libavcodec/x86/vvc/mc.asm
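The remark in the commit above that the sign extension to 64 bits is wasted when the upper bits are never read can be illustrated with a small bit-pattern model. This is a Python sketch rather than the wrapper code itself; the helper names and the sample values are hypothetical.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def sign_extend_32_to_64(value: int) -> int:
    """Return the 64-bit two's-complement bit pattern of a signed 32-bit integer."""
    assert -(1 << 31) <= value < (1 << 31)
    return value & MASK64


def low_32_bits(pattern64: int) -> int:
    """Return only the low 32 bits, as a consumer that ignores the upper half would."""
    return pattern64 & MASK32


# Hypothetical parameter values, including negative ones where extension sets the upper bits.
samples = [0, 1, 64, 128, -1, -128, (1 << 31) - 1, -(1 << 31)]
unchanged = all(low_32_bits(sign_extend_32_to_64(v)) == (v & MASK32) for v in samples)
print(unchanged)
```

Whatever the extension writes into the upper 32 bits, a consumer that reads only the low 32 bits sees the original value, so the promotion step does no useful work in that case.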
Change #258490
| Category | ffmpeg |
| Changed by | Andreas Rheinhardt <andreas.rheinhardt@outlook.com> |
| Changed at | Sun 22 Feb 2026 01:02:20 |
| Repository | https://git.ffmpeg.org/ffmpeg.git |
| Project | ffmpeg |
| Branch | master |
| Revision | 8e82416434962b2c68a7bd794d76e998f1990f68 |
Comments
avcodec/x86/vvc/of: Avoid unused register
Avoids a push+pop.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Changed files
- libavcodec/x86/vvc/of.asm
Change #258491
| Category | ffmpeg |
| Changed by | Andreas Rheinhardt <andreas.rheinhardt@outlook.com> |
| Changed at | Sun 22 Feb 2026 01:03:22 |
| Repository | https://git.ffmpeg.org/ffmpeg.git |
| Project | ffmpeg |
| Branch | master |
| Revision | 19dc7b79a4e7d2f9804d48369dbb132a39e1ac5d |
Comments
avcodec/x86/vvc/of: Unify shuffling
One can use the same shuffles for the width 8 and width 16 case if one also changes the permutation in vpermd (that always follows pshufb for width 16). This also allows to load it before checking width.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Changed files
- libavcodec/x86/vvc/of.asm
Change #258492
| Category | ffmpeg |
| Changed by | Andreas Rheinhardt <andreas.rheinhardt@outlook.com> |
| Changed at | Sun 22 Feb 2026 01:05:12 |
| Repository | https://git.ffmpeg.org/ffmpeg.git |
| Project | ffmpeg |
| Branch | master |
| Revision | af3f8f5bd2ed0d55cf8614064d722b533eef77e9 |
Comments
avcodec/x86/vvc/of: Break dependency chain
Don't extract and update one word of one and the same register at a time; use separate src and dst registers, so that pextrw and bsr can be done in parallel. Also use movd instead of pinsrw for the first word.
Old benchmarks:
apply_bdof_8_8x16_c: 3275.2 ( 1.00x)
apply_bdof_8_8x16_avx2: 487.6 ( 6.72x)
apply_bdof_8_16x8_c: 3243.1 ( 1.00x)
apply_bdof_8_16x8_avx2: 284.4 (11.40x)
apply_bdof_8_16x16_c: 6501.8 ( 1.00x)
apply_bdof_8_16x16_avx2: 570.0 (11.41x)
apply_bdof_10_8x16_c: 3286.5 ( 1.00x)
apply_bdof_10_8x16_avx2: 461.7 ( 7.12x)
apply_bdof_10_16x8_c: 3274.5 ( 1.00x)
apply_bdof_10_16x8_avx2: 271.4 (12.06x)
apply_bdof_10_16x16_c: 6590.0 ( 1.00x)
apply_bdof_10_16x16_avx2: 543.9 (12.12x)
apply_bdof_12_8x16_c: 3307.6 ( 1.00x)
apply_bdof_12_8x16_avx2: 462.2 ( 7.16x)
apply_bdof_12_16x8_c: 3287.4 ( 1.00x)
apply_bdof_12_16x8_avx2: 271.8 (12.10x)
apply_bdof_12_16x16_c: 6465.7 ( 1.00x)
apply_bdof_12_16x16_avx2: 543.8 (11.89x)
New benchmarks:
apply_bdof_8_8x16_c: 3255.7 ( 1.00x)
apply_bdof_8_8x16_avx2: 349.3 ( 9.32x)
apply_bdof_8_16x8_c: 3262.5 ( 1.00x)
apply_bdof_8_16x8_avx2: 214.8 (15.19x)
apply_bdof_8_16x16_c: 6471.6 ( 1.00x)
apply_bdof_8_16x16_avx2: 429.8 (15.06x)
apply_bdof_10_8x16_c: 3227.7 ( 1.00x)
apply_bdof_10_8x16_avx2: 321.6 (10.04x)
apply_bdof_10_16x8_c: 3250.2 ( 1.00x)
apply_bdof_10_16x8_avx2: 201.2 (16.16x)
apply_bdof_10_16x16_c: 6476.5 ( 1.00x)
apply_bdof_10_16x16_avx2: 400.9 (16.16x)
apply_bdof_12_8x16_c: 3230.7 ( 1.00x)
apply_bdof_12_8x16_avx2: 321.8 (10.04x)
apply_bdof_12_16x8_c: 3210.5 ( 1.00x)
apply_bdof_12_16x8_avx2: 200.9 (15.98x)
apply_bdof_12_16x16_c: 6474.5 ( 1.00x)
apply_bdof_12_16x16_avx2: 400.2 (16.18x)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Changed files
- libavcodec/x86/vvc/of.asm
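The figure in parentheses after each benchmark entry is the C time divided by the time of the listed implementation. The Python sketch below recomputes it for one block size from the commit above and also compares the old and new AVX2 timings directly. The regular expression and the helper names are assumptions made for this example, and only two entries per run are pasted in as sample input.

```python
import re

# Matches benchmark entries such as "apply_bdof_8_8x16_avx2: 487.6 ( 6.72x)".
ENTRY = re.compile(r"(\w+)_(c|avx2):\s+([\d.]+)")


def parse(text: str) -> dict:
    """Return {benchmark: {"c": time, "avx2": time}} for every entry in text."""
    results = {}
    for name, impl, value in ENTRY.findall(text):
        results.setdefault(name, {})[impl] = float(value)
    return results


def speedup(entry: dict) -> float:
    """Ratio of the C time to the AVX2 time, i.e. the figure printed in parentheses."""
    return entry["c"] / entry["avx2"]


old = parse("apply_bdof_8_8x16_c: 3275.2 ( 1.00x) apply_bdof_8_8x16_avx2: 487.6 ( 6.72x)")
new = parse("apply_bdof_8_8x16_c: 3255.7 ( 1.00x) apply_bdof_8_8x16_avx2: 349.3 ( 9.32x)")

name = "apply_bdof_8_8x16"
print(f"{name}: old {speedup(old[name]):.2f}x, new {speedup(new[name]):.2f}x")
# Gain of the new AVX2 code over the old AVX2 code for the same block size.
print(f"avx2 old/new: {old[name]['avx2'] / new[name]['avx2']:.2f}x")
```

Comparing the two AVX2 timings directly is more robust than comparing the printed ratios, because the C baseline also varies slightly between runs.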
Change #258493
| Category | ffmpeg |
| Changed by | Andreas Rheinhardt <andreas.rheinhardt@outlook.com> |
| Changed at | Sun 22 Feb 2026 01:05:12 |
| Repository | https://git.ffmpeg.org/ffmpeg.git |
| Project | ffmpeg |
| Branch | master |
| Revision | 6c1c1720cf940057ace674efbcb4c0621bf535d2 |
Comments
avcodec/x86/vvc/dsp_init: Mark dsp init function as av_cold
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Changed files
- libavcodec/x86/vvc/dsp_init.c