Builder ffmpeg-solaris10-sparc Build #13172

Results:

Failed shell_2 shell_3 shell_4 shell_5

SourceStamp:

Project ffmpeg
Repository https://git.ffmpeg.org/ffmpeg.git
Branch master
Revision 6c1c1720cf940057ace674efbcb4c0621bf535d2
Got Revision 6c1c1720cf940057ace674efbcb4c0621bf535d2
Changes 12 changes

BuildSlave:

unstable10s

Reason:

The SingleBranchScheduler scheduler named 'schedule-ffmpeg-solaris10-sparc' triggered this build

Steps and Logfiles:

  1. git update ( 11 secs )
    1. stdio
  2. shell 'gsed -i ...' ( 0 secs )
    1. stdio
  3. shell_1 'gsed -i ...' ( 0 secs )
    1. stdio
  4. shell_2 'gsed -i ...' failed ( 0 secs )
    1. stdio
  5. shell_3 './configure --samples="../../../ffmpeg/fate-suite" ...' failed ( 7 secs )
    1. stdio
    2. config.log
  6. shell_4 'gmake fate-rsync' failed ( 0 secs )
    1. stdio
  7. shell_5 '../../../ffmpeg/fate.sh ../../../ffmpeg/fate_config.sh' failed ( 0 secs )
    1. stdio
    2. configure.log
    3. compile.log
    4. test.log

Build Properties:

Name Value Source
branch master Build
builddir /export/home/buildbot-unstable10s/slave/ffmpeg-solaris10-sparc slave
buildername ffmpeg-solaris10-sparc Builder
buildnumber 13172 Build
codebase Build
got_revision 6c1c1720cf940057ace674efbcb4c0621bf535d2 Git
project ffmpeg Build
repository https://git.ffmpeg.org/ffmpeg.git Build
revision 6c1c1720cf940057ace674efbcb4c0621bf535d2 Build
scheduler schedule-ffmpeg-solaris10-sparc Scheduler
slavename unstable10s BuildSlave
workdir /export/home/buildbot-unstable10s/slave/ffmpeg-solaris10-sparc slave (deprecated)

Forced Build Properties:

Name Label Value

Responsible Users:

  1. Andreas Rheinhardt

Timing:

Start Sun Feb 22 03:05:03 2026
End Sun Feb 22 03:05:24 2026
Elapsed 20 secs

All Changes:

  1. Change #258482

    Category ffmpeg
    Changed by Andreas Rheinhardt <andreas.rheinhardtohnoyoudont@outlook.com>
    Changed at Sun 22 Feb 2026 00:57:56
    Repository https://git.ffmpeg.org/ffmpeg.git
    Project ffmpeg
    Branch master
    Revision 7bf9c1e3f6effbe7d2dd53096bf2a7dbbb07d7ff

    Comments

    avcodec/x86/vvc/mc: Avoid redundant clipping for 8bit
    It is already done by packuswb.
    
    Old benchmarks:
    avg_8_2x2_c:                                            11.1 ( 1.00x)
    avg_8_2x2_avx2:                                          8.6 ( 1.28x)
    avg_8_4x4_c:                                            30.0 ( 1.00x)
    avg_8_4x4_avx2:                                         10.8 ( 2.78x)
    avg_8_8x8_c:                                           132.0 ( 1.00x)
    avg_8_8x8_avx2:                                         25.7 ( 5.14x)
    avg_8_16x16_c:                                         254.6 ( 1.00x)
    avg_8_16x16_avx2:                                       33.2 ( 7.67x)
    avg_8_32x32_c:                                         897.5 ( 1.00x)
    avg_8_32x32_avx2:                                      115.6 ( 7.76x)
    avg_8_64x64_c:                                        3316.9 ( 1.00x)
    avg_8_64x64_avx2:                                      626.5 ( 5.29x)
    avg_8_128x128_c:                                     12973.6 ( 1.00x)
    avg_8_128x128_avx2:                                   1914.0 ( 6.78x)
    w_avg_8_2x2_c:                                          16.7 ( 1.00x)
    w_avg_8_2x2_avx2:                                       14.4 ( 1.16x)
    w_avg_8_4x4_c:                                          48.2 ( 1.00x)
    w_avg_8_4x4_avx2:                                       16.5 ( 2.92x)
    w_avg_8_8x8_c:                                         168.1 ( 1.00x)
    w_avg_8_8x8_avx2:                                       49.7 ( 3.38x)
    w_avg_8_16x16_c:                                       392.4 ( 1.00x)
    w_avg_8_16x16_avx2:                                     61.1 ( 6.43x)
    w_avg_8_32x32_c:                                      1455.3 ( 1.00x)
    w_avg_8_32x32_avx2:                                    224.6 ( 6.48x)
    w_avg_8_64x64_c:                                      5632.1 ( 1.00x)
    w_avg_8_64x64_avx2:                                    896.9 ( 6.28x)
    w_avg_8_128x128_c:                                   22136.3 ( 1.00x)
    w_avg_8_128x128_avx2:                                 3626.7 ( 6.10x)
    
    New benchmarks:
    avg_8_2x2_c:                                            12.3 ( 1.00x)
    avg_8_2x2_avx2:                                          8.1 ( 1.52x)
    avg_8_4x4_c:                                            30.3 ( 1.00x)
    avg_8_4x4_avx2:                                         11.3 ( 2.67x)
    avg_8_8x8_c:                                           131.8 ( 1.00x)
    avg_8_8x8_avx2:                                         21.3 ( 6.20x)
    avg_8_16x16_c:                                         255.0 ( 1.00x)
    avg_8_16x16_avx2:                                       30.6 ( 8.33x)
    avg_8_32x32_c:                                         898.5 ( 1.00x)
    avg_8_32x32_avx2:                                      104.9 ( 8.57x)
    avg_8_64x64_c:                                        3317.7 ( 1.00x)
    avg_8_64x64_avx2:                                      540.9 ( 6.13x)
    avg_8_128x128_c:                                     12986.5 ( 1.00x)
    avg_8_128x128_avx2:                                   1663.4 ( 7.81x)
    w_avg_8_2x2_c:                                          16.8 ( 1.00x)
    w_avg_8_2x2_avx2:                                       13.9 ( 1.21x)
    w_avg_8_4x4_c:                                          48.2 ( 1.00x)
    w_avg_8_4x4_avx2:                                       16.2 ( 2.98x)
    w_avg_8_8x8_c:                                         168.6 ( 1.00x)
    w_avg_8_8x8_avx2:                                       46.3 ( 3.64x)
    w_avg_8_16x16_c:                                       392.4 ( 1.00x)
    w_avg_8_16x16_avx2:                                     57.7 ( 6.80x)
    w_avg_8_32x32_c:                                      1454.6 ( 1.00x)
    w_avg_8_32x32_avx2:                                    214.6 ( 6.78x)
    w_avg_8_64x64_c:                                      5638.4 ( 1.00x)
    w_avg_8_64x64_avx2:                                    875.6 ( 6.44x)
    w_avg_8_128x128_c:                                   22133.5 ( 1.00x)
    w_avg_8_128x128_avx2:                                 3334.3 ( 6.64x)
    
    Also saves 550B of .text here. The improvements will likely
    be even better on Win64, because it avoids using two nonvolatile
    registers in the weighted average case.
    
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

    Changed files

    • libavcodec/x86/vvc/mc.asm
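
The message for Change #258482 rests on packuswb saturating when it narrows signed 16-bit words to unsigned bytes. The following is a minimal scalar sketch of that argument in C, not FFmpeg code: `packus_lane`, `clip_u8` and `count_clip_mismatches` are hypothetical names, and it assumes the removed clip was to the 8-bit pixel range [0, 255].

```c
#include <stdint.h>

/* Hypothetical scalar model of one packuswb lane: a signed 16-bit word
 * is narrowed to an unsigned byte with saturation. */
uint8_t packus_lane(int16_t w)
{
    if (w < 0)
        return 0;
    if (w > 255)
        return 255;
    return (uint8_t)w;
}

/* Hypothetical explicit clip to the assumed 8-bit pixel range [0, 255]. */
int16_t clip_u8(int16_t w)
{
    if (w < 0)
        return 0;
    if (w > 255)
        return 255;
    return w;
}

/* Counts the 16-bit inputs for which clipping before the pack changes
 * the packed byte. A result of 0 means the clip is redundant. */
int count_clip_mismatches(void)
{
    int mismatches = 0;
    for (int v = INT16_MIN; v <= INT16_MAX; v++) {
        int16_t w = (int16_t)v;
        if (packus_lane(clip_u8(w)) != packus_lane(w))
            mismatches++;
    }
    return mismatches;
}
```

Calling `count_clip_mismatches()` walks every possible 16-bit input, so a zero result covers the whole input range of the model rather than a sample.
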
  2. Change #258483

    Category ffmpeg
    Changed by Andreas Rheinhardt <andreas.rheinhardtohnoyoudont@outlook.com>
    Changed at Sun 22 Feb 2026 00:57:56
    Repository https://git.ffmpeg.org/ffmpeg.git
    Project ffmpeg
    Branch master
    Revision caa0ae0cfb35de0ae3fd5f346caef89d62eeaf7c

    Comments

    avcodec/x86/vvc/mc: Avoid pextr[dq], v{insert,extract}i128
    Use mov[dq], movdqu instead if the least significant parts
    are set (i.e. if the immediate value is 0x0).
    
    Old benchmarks:
    avg_8_2x2_c:                                            11.3 ( 1.00x)
    avg_8_2x2_avx2:                                          7.5 ( 1.50x)
    avg_8_4x4_c:                                            31.2 ( 1.00x)
    avg_8_4x4_avx2:                                         10.7 ( 2.91x)
    avg_8_8x8_c:                                           133.5 ( 1.00x)
    avg_8_8x8_avx2:                                         21.2 ( 6.30x)
    avg_8_16x16_c:                                         254.7 ( 1.00x)
    avg_8_16x16_avx2:                                       30.1 ( 8.46x)
    avg_8_32x32_c:                                         896.9 ( 1.00x)
    avg_8_32x32_avx2:                                      103.9 ( 8.63x)
    avg_8_64x64_c:                                        3320.7 ( 1.00x)
    avg_8_64x64_avx2:                                      539.4 ( 6.16x)
    avg_8_128x128_c:                                     12991.5 ( 1.00x)
    avg_8_128x128_avx2:                                   1661.3 ( 7.82x)
    avg_10_2x2_c:                                           21.3 ( 1.00x)
    avg_10_2x2_avx2:                                         8.3 ( 2.55x)
    avg_10_4x4_c:                                           34.9 ( 1.00x)
    avg_10_4x4_avx2:                                        10.6 ( 3.28x)
    avg_10_8x8_c:                                           76.3 ( 1.00x)
    avg_10_8x8_avx2:                                        20.2 ( 3.77x)
    avg_10_16x16_c:                                        255.9 ( 1.00x)
    avg_10_16x16_avx2:                                      24.1 (10.60x)
    avg_10_32x32_c:                                        932.4 ( 1.00x)
    avg_10_32x32_avx2:                                      73.3 (12.72x)
    avg_10_64x64_c:                                       3516.4 ( 1.00x)
    avg_10_64x64_avx2:                                     601.7 ( 5.84x)
    avg_10_128x128_c:                                    13690.6 ( 1.00x)
    avg_10_128x128_avx2:                                  1613.2 ( 8.49x)
    avg_12_2x2_c:                                           14.0 ( 1.00x)
    avg_12_2x2_avx2:                                         8.3 ( 1.67x)
    avg_12_4x4_c:                                           35.3 ( 1.00x)
    avg_12_4x4_avx2:                                        10.9 ( 3.26x)
    avg_12_8x8_c:                                           76.5 ( 1.00x)
    avg_12_8x8_avx2:                                        20.3 ( 3.77x)
    avg_12_16x16_c:                                        256.7 ( 1.00x)
    avg_12_16x16_avx2:                                      24.1 (10.63x)
    avg_12_32x32_c:                                        932.5 ( 1.00x)
    avg_12_32x32_avx2:                                      73.3 (12.72x)
    avg_12_64x64_c:                                       3520.5 ( 1.00x)
    avg_12_64x64_avx2:                                     602.6 ( 5.84x)
    avg_12_128x128_c:                                    13689.6 ( 1.00x)
    avg_12_128x128_avx2:                                  1613.1 ( 8.49x)
    w_avg_8_2x2_c:                                          16.7 ( 1.00x)
    w_avg_8_2x2_avx2:                                       13.4 ( 1.25x)
    w_avg_8_4x4_c:                                          44.5 ( 1.00x)
    w_avg_8_4x4_avx2:                                       15.9 ( 2.81x)
    w_avg_8_8x8_c:                                         166.1 ( 1.00x)
    w_avg_8_8x8_avx2:                                       45.7 ( 3.63x)
    w_avg_8_16x16_c:                                       392.9 ( 1.00x)
    w_avg_8_16x16_avx2:                                     57.8 ( 6.80x)
    w_avg_8_32x32_c:                                      1455.5 ( 1.00x)
    w_avg_8_32x32_avx2:                                    215.0 ( 6.77x)
    w_avg_8_64x64_c:                                      5621.8 ( 1.00x)
    w_avg_8_64x64_avx2:                                    875.2 ( 6.42x)
    w_avg_8_128x128_c:                                   22131.3 ( 1.00x)
    w_avg_8_128x128_avx2:                                 3390.1 ( 6.53x)
    w_avg_10_2x2_c:                                         18.0 ( 1.00x)
    w_avg_10_2x2_avx2:                                      14.0 ( 1.28x)
    w_avg_10_4x4_c:                                         53.9 ( 1.00x)
    w_avg_10_4x4_avx2:                                      15.9 ( 3.40x)
    w_avg_10_8x8_c:                                        109.5 ( 1.00x)
    w_avg_10_8x8_avx2:                                      40.4 ( 2.71x)
    w_avg_10_16x16_c:                                      395.7 ( 1.00x)
    w_avg_10_16x16_avx2:                                    44.7 ( 8.86x)
    w_avg_10_32x32_c:                                     1532.7 ( 1.00x)
    w_avg_10_32x32_avx2:                                   142.4 (10.77x)
    w_avg_10_64x64_c:                                     6007.7 ( 1.00x)
    w_avg_10_64x64_avx2:                                   745.5 ( 8.06x)
    w_avg_10_128x128_c:                                  23719.7 ( 1.00x)
    w_avg_10_128x128_avx2:                                2217.7 (10.70x)
    w_avg_12_2x2_c:                                         18.9 ( 1.00x)
    w_avg_12_2x2_avx2:                                      13.6 ( 1.38x)
    w_avg_12_4x4_c:                                         47.5 ( 1.00x)
    w_avg_12_4x4_avx2:                                      15.9 ( 2.99x)
    w_avg_12_8x8_c:                                        109.3 ( 1.00x)
    w_avg_12_8x8_avx2:                                      40.9 ( 2.67x)
    w_avg_12_16x16_c:                                      395.6 ( 1.00x)
    w_avg_12_16x16_avx2:                                    44.8 ( 8.84x)
    w_avg_12_32x32_c:                                     1531.0 ( 1.00x)
    w_avg_12_32x32_avx2:                                   141.8 (10.80x)
    w_avg_12_64x64_c:                                     6016.7 ( 1.00x)
    w_avg_12_64x64_avx2:                                   732.8 ( 8.21x)
    w_avg_12_128x128_c:                                  23762.2 ( 1.00x)
    w_avg_12_128x128_avx2:                                2223.4 (10.69x)
    
    New benchmarks:
    avg_8_2x2_c:                                            11.3 ( 1.00x)
    avg_8_2x2_avx2:                                          7.6 ( 1.49x)
    avg_8_4x4_c:                                            31.2 ( 1.00x)
    avg_8_4x4_avx2:                                         10.8 ( 2.89x)
    avg_8_8x8_c:                                           131.6 ( 1.00x)
    avg_8_8x8_avx2:                                         15.6 ( 8.42x)
    avg_8_16x16_c:                                         255.3 ( 1.00x)
    avg_8_16x16_avx2:                                       27.9 ( 9.16x)
    avg_8_32x32_c:                                         897.9 ( 1.00x)
    avg_8_32x32_avx2:                                       81.2 (11.06x)
    avg_8_64x64_c:                                        3320.0 ( 1.00x)
    avg_8_64x64_avx2:                                      335.1 ( 9.91x)
    avg_8_128x128_c:                                     12999.1 ( 1.00x)
    avg_8_128x128_avx2:                                   1456.3 ( 8.93x)
    avg_10_2x2_c:                                           12.0 ( 1.00x)
    avg_10_2x2_avx2:                                         8.6 ( 1.40x)
    avg_10_4x4_c:                                           34.9 ( 1.00x)
    avg_10_4x4_avx2:                                         9.7 ( 3.61x)
    avg_10_8x8_c:                                           76.7 ( 1.00x)
    avg_10_8x8_avx2:                                        16.3 ( 4.69x)
    avg_10_16x16_c:                                        256.3 ( 1.00x)
    avg_10_16x16_avx2:                                      25.2 (10.18x)
    avg_10_32x32_c:                                        932.8 ( 1.00x)
    avg_10_32x32_avx2:                                      73.3 (12.72x)
    avg_10_64x64_c:                                       3518.8 ( 1.00x)
    avg_10_64x64_avx2:                                     416.8 ( 8.44x)
    avg_10_128x128_c:                                    13691.6 ( 1.00x)
    avg_10_128x128_avx2:                                  1612.9 ( 8.49x)
    avg_12_2x2_c:                                           14.1 ( 1.00x)
    avg_12_2x2_avx2:                                         8.7 ( 1.62x)
    avg_12_4x4_c:                                           35.7 ( 1.00x)
    avg_12_4x4_avx2:                                         9.7 ( 3.68x)
    avg_12_8x8_c:                                           77.0 ( 1.00x)
    avg_12_8x8_avx2:                                        16.9 ( 4.57x)
    avg_12_16x16_c:                                        256.2 ( 1.00x)
    avg_12_16x16_avx2:                                      25.7 ( 9.96x)
    avg_12_32x32_c:                                        933.5 ( 1.00x)
    avg_12_32x32_avx2:                                      74.0 (12.62x)
    avg_12_64x64_c:                                       3516.4 ( 1.00x)
    avg_12_64x64_avx2:                                     408.7 ( 8.60x)
    avg_12_128x128_c:                                    13691.6 ( 1.00x)
    avg_12_128x128_avx2:                                  1613.8 ( 8.48x)
    w_avg_8_2x2_c:                                          16.7 ( 1.00x)
    w_avg_8_2x2_avx2:                                       14.0 ( 1.19x)
    w_avg_8_4x4_c:                                          48.2 ( 1.00x)
    w_avg_8_4x4_avx2:                                       16.1 ( 3.00x)
    w_avg_8_8x8_c:                                         168.0 ( 1.00x)
    w_avg_8_8x8_avx2:                                       22.5 ( 7.47x)
    w_avg_8_16x16_c:                                       392.5 ( 1.00x)
    w_avg_8_16x16_avx2:                                     47.9 ( 8.19x)
    w_avg_8_32x32_c:                                      1453.7 ( 1.00x)
    w_avg_8_32x32_avx2:                                    176.1 ( 8.26x)
    w_avg_8_64x64_c:                                      5631.4 ( 1.00x)
    w_avg_8_64x64_avx2:                                    690.8 ( 8.15x)
    w_avg_8_128x128_c:                                   22139.5 ( 1.00x)
    w_avg_8_128x128_avx2:                                 2742.4 ( 8.07x)
    w_avg_10_2x2_c:                                         18.1 ( 1.00x)
    w_avg_10_2x2_avx2:                                      13.8 ( 1.31x)
    w_avg_10_4x4_c:                                         47.0 ( 1.00x)
    w_avg_10_4x4_avx2:                                      16.4 ( 2.87x)
    w_avg_10_8x8_c:                                        110.0 ( 1.00x)
    w_avg_10_8x8_avx2:                                      21.6 ( 5.09x)
    w_avg_10_16x16_c:                                      395.2 ( 1.00x)
    w_avg_10_16x16_avx2:                                    45.4 ( 8.71x)
    w_avg_10_32x32_c:                                     1533.8 ( 1.00x)
    w_avg_10_32x32_avx2:                                   142.6 (10.76x)
    w_avg_10_64x64_c:                                     6004.4 ( 1.00x)
    w_avg_10_64x64_avx2:                                   672.8 ( 8.92x)
    w_avg_10_128x128_c:                                  23748.5 ( 1.00x)
    w_avg_10_128x128_avx2:                                2198.0 (10.80x)
    w_avg_12_2x2_c:                                         17.2 ( 1.00x)
    w_avg_12_2x2_avx2:                                      13.9 ( 1.24x)
    w_avg_12_4x4_c:                                         51.4 ( 1.00x)
    w_avg_12_4x4_avx2:                                      16.5 ( 3.11x)
    w_avg_12_8x8_c:                                        109.1 ( 1.00x)
    w_avg_12_8x8_avx2:                                      22.0 ( 4.96x)
    w_avg_12_16x16_c:                                      395.9 ( 1.00x)
    w_avg_12_16x16_avx2:                                    44.9 ( 8.81x)
    w_avg_12_32x32_c:                                     1533.5 ( 1.00x)
    w_avg_12_32x32_avx2:                                   142.3 (10.78x)
    w_avg_12_64x64_c:                                     6002.0 ( 1.00x)
    w_avg_12_64x64_avx2:                                   557.5 (10.77x)
    w_avg_12_128x128_c:                                  23749.5 ( 1.00x)
    w_avg_12_128x128_avx2:                                2202.0 (10.79x)
    
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

    Changed files

    • libavcodec/x86/vvc/mc.asm
  3. Change #258484

    Category ffmpeg
    Changed by Andreas Rheinhardt <andreas.rheinhardtohnoyoudont@outlook.com>
    Changed at Sun 22 Feb 2026 00:57:56
    Repository https://git.ffmpeg.org/ffmpeg.git
    Project ffmpeg
    Branch master
    Revision 9317fb2b2ed2e7b7718f99f3521997425134a78d

    Comments

    avcodec/x86/vvc/mc: Avoid ymm registers where possible
    Widths 2 and 4 fit into xmm registers.
    
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

    Changed files

    • libavcodec/x86/vvc/mc.asm
  4. Change #258485

    Category ffmpeg
    Changed by Andreas Rheinhardt <andreas.rheinhardtohnoyoudont@outlook.com>
    Changed at Sun 22 Feb 2026 00:57:56
    Repository https://git.ffmpeg.org/ffmpeg.git
    Project ffmpeg
    Branch master
    Revision eabf52e787f60ad57d9bce28bcce0cca730da6da

    Comments

    avcodec/x86/vvc/mc: Avoid unused work
    The high quadword of these registers is zero for width 2.
    
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

    Changed files

    • libavcodec/x86/vvc/mc.asm
  5. Change #258486

    Category ffmpeg
    Changed by Andreas Rheinhardt <andreas.rheinhardtohnoyoudont@outlook.com>
    Changed at Sun 22 Feb 2026 00:57:56
    Repository https://git.ffmpeg.org/ffmpeg.git
    Project ffmpeg
    Branch master
    Revision 59f8ff4c183d3d91156d9b1054b8e070dbbfd5fb

    Comments

    avcodec/x86/vvc/mc: Remove unused constants
    Also avoid overaligning .rodata.
    
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

    Changed files

    • libavcodec/x86/vvc/mc.asm
  6. Change #258487

    Category ffmpeg
    Changed by Andreas Rheinhardt <andreas.rheinhardtohnoyoudont@outlook.com>
    Changed at Sun 22 Feb 2026 00:57:56
    Repository https://git.ffmpeg.org/ffmpeg.git
    Project ffmpeg
    Branch master
    Revision 5a60b3f1a618ec9981f2950558df59087f799995

    Comments

    avcodec/x86/vvc/mc: Remove always-false branches
    The C versions of the average and weighted average functions
    contain "FFMAX(3, 15 - BIT_DEPTH)" and the code here followed
    this; yet it is only instantiated for bit depths 8, 10 and 12,
    for which the above is just 15-BIT_DEPTH. So the comparisons
    are unnecessary.
    
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

    Changed files

    • libavcodec/x86/vvc/mc.asm
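
The claim in Change #258487 can be checked with a few lines of C. This is a sketch under stated assumptions: `FFMAX` below is a local stand-in with the usual max-macro shape, and the function names are hypothetical, not taken from the FFmpeg sources.

```c
/* Local stand-in for a max macro; defined here so the sketch is
 * self-contained. */
#define FFMAX(a, b) ((a) > (b) ? (a) : (b))

/* Shift in the form quoted in the commit message. */
int shift_with_ffmax(int bit_depth)
{
    return FFMAX(3, 15 - bit_depth);
}

/* Simplified shift that the commit message says is sufficient. */
int shift_simplified(int bit_depth)
{
    return 15 - bit_depth;
}

/* Returns 1 if both forms agree for every depth in the list, else 0. */
int shifts_agree(const int *depths, int count)
{
    for (int i = 0; i < count; i++) {
        if (shift_with_ffmax(depths[i]) != shift_simplified(depths[i]))
            return 0;
    }
    return 1;
}
```

For the depths 8, 10 and 12 named in the message, `15 - bit_depth` is 7, 5 and 3, so the lower bound of 3 never takes effect. It would only matter for a depth above 12, for example 14, where the two forms give 3 and 1.
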
  7. Change #258488

    Category ffmpeg
    Changed by Andreas Rheinhardt <andreas.rheinhardtohnoyoudont@outlook.com>
    Changed at Sun 22 Feb 2026 00:58:33
    Repository https://git.ffmpeg.org/ffmpeg.git
    Project ffmpeg
    Branch master
    Revision ea78402e9c78173118fb1e71d4192cb6840388e7

    Comments

    avcodec/x86/vvc/mc,dsp_init: Avoid pointless wrappers for avg
    Up until now, there were two averaging assembly functions,
    one for eight bit content and one for <=16 bit content;
    there are also three C-wrappers around these functions,
    for 8, 10 and 12 bpp. These wrappers simply forward the
    maximum permissible value (i.e. (1<<bpp)-1) and promote
    some integer values to ptrdiff_t.
    
    Yet these wrappers are absolutely useless: The assembly functions
    rederive the bpp from the maximum and only the integer part
    of the promoted ptrdiff_t values is ever used. Of course,
    these wrappers also entail an additional call (not a tail call,
    because the additional maximum parameter is passed on the stack).
    
    Remove the wrappers and add per-bpp assembly functions instead.
    Given that the only difference between 10 and 12 bits is some
    constants in registers, the main part of these functions can be
    shared (given that this code uses a jumptable, it can even
    be done without adding any additional jump).
    
    Old benchmarks:
    avg_8_2x2_c:                                            11.4 ( 1.00x)
    avg_8_2x2_avx2:                                          7.9 ( 1.44x)
    avg_8_4x4_c:                                            30.7 ( 1.00x)
    avg_8_4x4_avx2:                                         10.4 ( 2.95x)
    avg_8_8x8_c:                                           134.5 ( 1.00x)
    avg_8_8x8_avx2:                                         16.6 ( 8.12x)
    avg_8_16x16_c:                                         255.6 ( 1.00x)
    avg_8_16x16_avx2:                                       28.2 ( 9.07x)
    avg_8_32x32_c:                                         897.7 ( 1.00x)
    avg_8_32x32_avx2:                                       83.9 (10.70x)
    avg_8_64x64_c:                                        3320.0 ( 1.00x)
    avg_8_64x64_avx2:                                      321.1 (10.34x)
    avg_8_128x128_c:                                     12981.8 ( 1.00x)
    avg_8_128x128_avx2:                                   1480.1 ( 8.77x)
    avg_10_2x2_c:                                           12.0 ( 1.00x)
    avg_10_2x2_avx2:                                         8.4 ( 1.43x)
    avg_10_4x4_c:                                           34.9 ( 1.00x)
    avg_10_4x4_avx2:                                         9.8 ( 3.56x)
    avg_10_8x8_c:                                           76.8 ( 1.00x)
    avg_10_8x8_avx2:                                        15.1 ( 5.08x)
    avg_10_16x16_c:                                        256.6 ( 1.00x)
    avg_10_16x16_avx2:                                      25.1 (10.20x)
    avg_10_32x32_c:                                        932.9 ( 1.00x)
    avg_10_32x32_avx2:                                      73.4 (12.72x)
    avg_10_64x64_c:                                       3517.9 ( 1.00x)
    avg_10_64x64_avx2:                                     414.8 ( 8.48x)
    avg_10_128x128_c:                                    13695.3 ( 1.00x)
    avg_10_128x128_avx2:                                  1648.1 ( 8.31x)
    avg_12_2x2_c:                                           13.1 ( 1.00x)
    avg_12_2x2_avx2:                                         8.6 ( 1.53x)
    avg_12_4x4_c:                                           35.4 ( 1.00x)
    avg_12_4x4_avx2:                                        10.1 ( 3.49x)
    avg_12_8x8_c:                                           76.6 ( 1.00x)
    avg_12_8x8_avx2:                                        16.7 ( 4.60x)
    avg_12_16x16_c:                                        256.6 ( 1.00x)
    avg_12_16x16_avx2:                                      25.5 (10.07x)
    avg_12_32x32_c:                                        933.2 ( 1.00x)
    avg_12_32x32_avx2:                                      75.7 (12.34x)
    avg_12_64x64_c:                                       3519.1 ( 1.00x)
    avg_12_64x64_avx2:                                     416.8 ( 8.44x)
    avg_12_128x128_c:                                    13695.1 ( 1.00x)
    avg_12_128x128_avx2:                                  1651.6 ( 8.29x)
    
    New benchmarks:
    avg_8_2x2_c:                                            11.5 ( 1.00x)
    avg_8_2x2_avx2:                                          6.0 ( 1.91x)
    avg_8_4x4_c:                                            29.7 ( 1.00x)
    avg_8_4x4_avx2:                                          8.0 ( 3.72x)
    avg_8_8x8_c:                                           131.4 ( 1.00x)
    avg_8_8x8_avx2:                                         12.2 (10.74x)
    avg_8_16x16_c:                                         254.3 ( 1.00x)
    avg_8_16x16_avx2:                                       24.8 (10.25x)
    avg_8_32x32_c:                                         897.7 ( 1.00x)
    avg_8_32x32_avx2:                                       77.8 (11.54x)
    avg_8_64x64_c:                                        3321.3 ( 1.00x)
    avg_8_64x64_avx2:                                      318.7 (10.42x)
    avg_8_128x128_c:                                     12988.4 ( 1.00x)
    avg_8_128x128_avx2:                                   1430.1 ( 9.08x)
    avg_10_2x2_c:                                           12.1 ( 1.00x)
    avg_10_2x2_avx2:                                         5.7 ( 2.13x)
    avg_10_4x4_c:                                           35.0 ( 1.00x)
    avg_10_4x4_avx2:                                         9.0 ( 3.88x)
    avg_10_8x8_c:                                           77.2 ( 1.00x)
    avg_10_8x8_avx2:                                        12.4 ( 6.24x)
    avg_10_16x16_c:                                        256.2 ( 1.00x)
    avg_10_16x16_avx2:                                      24.3 (10.56x)
    avg_10_32x32_c:                                        932.9 ( 1.00x)
    avg_10_32x32_avx2:                                      71.9 (12.97x)
    avg_10_64x64_c:                                       3516.8 ( 1.00x)
    avg_10_64x64_avx2:                                     414.7 ( 8.48x)
    avg_10_128x128_c:                                    13693.7 ( 1.00x)
    avg_10_128x128_avx2:                                  1609.3 ( 8.51x)
    avg_12_2x2_c:                                           14.1 ( 1.00x)
    avg_12_2x2_avx2:                                         5.7 ( 2.48x)
    avg_12_4x4_c:                                           35.8 ( 1.00x)
    avg_12_4x4_avx2:                                         9.0 ( 3.96x)
    avg_12_8x8_c:                                           76.9 ( 1.00x)
    avg_12_8x8_avx2:                                        12.4 ( 6.22x)
    avg_12_16x16_c:                                        256.5 ( 1.00x)
    avg_12_16x16_avx2:                                      24.4 (10.50x)
    avg_12_32x32_c:                                        934.1 ( 1.00x)
    avg_12_32x32_avx2:                                      72.0 (12.97x)
    avg_12_64x64_c:                                       3518.2 ( 1.00x)
    avg_12_64x64_avx2:                                     414.8 ( 8.48x)
    avg_12_128x128_c:                                    13689.5 ( 1.00x)
    avg_12_128x128_avx2:                                  1611.1 ( 8.50x)
    
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

    Changed files

    • libavcodec/x86/vvc/dsp_init.c
    • libavcodec/x86/vvc/mc.asm
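
The wrapper overhead described in Change #258488 can be pictured with a small C model. Everything here is hypothetical and only mirrors the shape described in the commit message: `old_avg_shared` stands in for the shared assembly routine, `old_avg_10` for a C wrapper, and `new_avg_10` for a per-bpp entry point. The functions return the bit depth they end up working with, so the two paths can be compared.

```c
#include <stddef.h>

/* Hypothetical model of the old shared routine: it receives the maximum
 * pixel value and rederives the bit depth from it. */
int old_avg_shared(ptrdiff_t width, ptrdiff_t height, int pixel_max)
{
    int bpp = 0;
    (void)width;   /* unused in this model */
    (void)height;  /* unused in this model */
    while (pixel_max > 0) {
        bpp++;
        pixel_max >>= 1;
    }
    return bpp;
}

/* Hypothetical old wrapper for 10 bpp: promotes the integer arguments to
 * ptrdiff_t and forwards the maximum value (1 << bpp) - 1. */
int old_avg_10(int width, int height)
{
    return old_avg_shared((ptrdiff_t)width, (ptrdiff_t)height, (1 << 10) - 1);
}

/* Hypothetical per-bpp entry point: the bit depth is a constant, so no
 * extra call and no rederivation are needed. */
int new_avg_10(int width, int height)
{
    (void)width;   /* unused in this model */
    (void)height;  /* unused in this model */
    return 10;
}
```

Both `old_avg_10(8, 8)` and `new_avg_10(8, 8)` yield the same bit depth. The old path gets there through an extra call and a loop that recovers information the caller already had.
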
  8. Change #258489

    Category ffmpeg
    Changed by Andreas Rheinhardt <andreas.rheinhardtohnoyoudont@outlook.com>
    Changed at Sun 22 Feb 2026 01:01:27
    Repository https://git.ffmpeg.org/ffmpeg.git
    Project ffmpeg
    Branch master
    Revision 81fb70c833ac675ac8e09b38ad845a90de4c3e1c

    Comments

    avcodec/x86/vvc/mc,dsp_init: Avoid pointless wrappers for w_avg
    They only add overhead (in form of another function call,
    sign-extending some parameters to 64bit (although the upper
    bits are not used at all) and rederiving the actual number
    of bits (from the maximum value (1<<bpp)-1)).
    
    Old benchmarks:
    w_avg_8_2x2_c:                                          16.4 ( 1.00x)
    w_avg_8_2x2_avx2:                                       12.9 ( 1.27x)
    w_avg_8_4x4_c:                                          48.0 ( 1.00x)
    w_avg_8_4x4_avx2:                                       14.9 ( 3.23x)
    w_avg_8_8x8_c:                                         168.2 ( 1.00x)
    w_avg_8_8x8_avx2:                                       22.4 ( 7.49x)
    w_avg_8_16x16_c:                                       396.5 ( 1.00x)
    w_avg_8_16x16_avx2:                                     47.9 ( 8.28x)
    w_avg_8_32x32_c:                                      1466.3 ( 1.00x)
    w_avg_8_32x32_avx2:                                    172.8 ( 8.48x)
    w_avg_8_64x64_c:                                      5629.3 ( 1.00x)
    w_avg_8_64x64_avx2:                                    678.7 ( 8.29x)
    w_avg_8_128x128_c:                                   22122.4 ( 1.00x)
    w_avg_8_128x128_avx2:                                 2743.5 ( 8.06x)
    w_avg_10_2x2_c:                                         18.7 ( 1.00x)
    w_avg_10_2x2_avx2:                                      13.1 ( 1.43x)
    w_avg_10_4x4_c:                                         50.3 ( 1.00x)
    w_avg_10_4x4_avx2:                                      15.9 ( 3.17x)
    w_avg_10_8x8_c:                                        109.3 ( 1.00x)
    w_avg_10_8x8_avx2:                                      20.6 ( 5.30x)
    w_avg_10_16x16_c:                                      395.5 ( 1.00x)
    w_avg_10_16x16_avx2:                                    44.8 ( 8.83x)
    w_avg_10_32x32_c:                                     1534.2 ( 1.00x)
    w_avg_10_32x32_avx2:                                   141.4 (10.85x)
    w_avg_10_64x64_c:                                     6003.6 ( 1.00x)
    w_avg_10_64x64_avx2:                                   557.4 (10.77x)
    w_avg_10_128x128_c:                                  23722.7 ( 1.00x)
    w_avg_10_128x128_avx2:                                2205.0 (10.76x)
    w_avg_12_2x2_c:                                         18.6 ( 1.00x)
    w_avg_12_2x2_avx2:                                      13.1 ( 1.42x)
    w_avg_12_4x4_c:                                         52.2 ( 1.00x)
    w_avg_12_4x4_avx2:                                      16.1 ( 3.24x)
    w_avg_12_8x8_c:                                        109.2 ( 1.00x)
    w_avg_12_8x8_avx2:                                      20.6 ( 5.29x)
    w_avg_12_16x16_c:                                      396.1 ( 1.00x)
    w_avg_12_16x16_avx2:                                    45.0 ( 8.81x)
    w_avg_12_32x32_c:                                     1532.6 ( 1.00x)
    w_avg_12_32x32_avx2:                                   142.1 (10.79x)
    w_avg_12_64x64_c:                                     6002.2 ( 1.00x)
    w_avg_12_64x64_avx2:                                   557.3 (10.77x)
    w_avg_12_128x128_c:                                  23748.7 ( 1.00x)
    w_avg_12_128x128_avx2:                                2206.4 (10.76x)
    
    New benchmarks:
    w_avg_8_2x2_c:                                          16.0 ( 1.00x)
    w_avg_8_2x2_avx2:                                        9.3 ( 1.71x)
    w_avg_8_4x4_c:                                          48.4 ( 1.00x)
    w_avg_8_4x4_avx2:                                       12.4 ( 3.91x)
    w_avg_8_8x8_c:                                         168.7 ( 1.00x)
    w_avg_8_8x8_avx2:                                       21.1 ( 8.00x)
    w_avg_8_16x16_c:                                       394.5 ( 1.00x)
    w_avg_8_16x16_avx2:                                     46.2 ( 8.54x)
    w_avg_8_32x32_c:                                      1456.3 ( 1.00x)
    w_avg_8_32x32_avx2:                                    171.8 ( 8.48x)
    w_avg_8_64x64_c:                                      5636.2 ( 1.00x)
    w_avg_8_64x64_avx2:                                    676.9 ( 8.33x)
    w_avg_8_128x128_c:                                   22129.1 ( 1.00x)
    w_avg_8_128x128_avx2:                                 2734.3 ( 8.09x)
    w_avg_10_2x2_c:                                         18.7 ( 1.00x)
    w_avg_10_2x2_avx2:                                      10.3 ( 1.82x)
    w_avg_10_4x4_c:                                         50.8 ( 1.00x)
    w_avg_10_4x4_avx2:                                      13.4 ( 3.79x)
    w_avg_10_8x8_c:                                        109.7 ( 1.00x)
    w_avg_10_8x8_avx2:                                      20.4 ( 5.38x)
    w_avg_10_16x16_c:                                      395.2 ( 1.00x)
    w_avg_10_16x16_avx2:                                    41.7 ( 9.48x)
    w_avg_10_32x32_c:                                     1535.6 ( 1.00x)
    w_avg_10_32x32_avx2:                                   137.9 (11.13x)
    w_avg_10_64x64_c:                                     6002.1 ( 1.00x)
    w_avg_10_64x64_avx2:                                   548.5 (10.94x)
    w_avg_10_128x128_c:                                  23742.7 ( 1.00x)
    w_avg_10_128x128_avx2:                                2179.8 (10.89x)
    w_avg_12_2x2_c:                                         18.9 ( 1.00x)
    w_avg_12_2x2_avx2:                                      10.3 ( 1.84x)
    w_avg_12_4x4_c:                                         52.4 ( 1.00x)
    w_avg_12_4x4_avx2:                                      13.4 ( 3.91x)
    w_avg_12_8x8_c:                                        109.2 ( 1.00x)
    w_avg_12_8x8_avx2:                                      20.3 ( 5.39x)
    w_avg_12_16x16_c:                                      396.3 ( 1.00x)
    w_avg_12_16x16_avx2:                                    41.7 ( 9.51x)
    w_avg_12_32x32_c:                                     1532.6 ( 1.00x)
    w_avg_12_32x32_avx2:                                   138.6 (11.06x)
    w_avg_12_64x64_c:                                     5996.7 ( 1.00x)
    w_avg_12_64x64_avx2:                                   549.6 (10.91x)
    w_avg_12_128x128_c:                                  23738.0 ( 1.00x)
    w_avg_12_128x128_avx2:                                2177.2 (10.90x)
    
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

    Changed files

    • libavcodec/x86/vvc/dsp_init.c
    • libavcodec/x86/vvc/mc.asm
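
The "rederiving the actual number of bits" overhead named in the comment for Change #258489 can be illustrated with a minimal C sketch. It is not the FFmpeg code: `bits_from_pixel_max`, `clip_to_depth` and `clip_via_wrapper` are hypothetical names, and the clipping kernel is only a stand-in for the real assembly routine. The sketch assumes a wrapper that is handed the maximum pixel value and has to recover the bit depth before calling a kernel that wants the bit depth directly.

```c
/* Recover bpp from pixel_max == (1 << bpp) - 1 by counting significant bits.
 * This is the extra work a wrapper has to do when it only gets the maximum. */
static int bits_from_pixel_max(unsigned pixel_max)
{
    int bits = 0;
    while (pixel_max) {
        bits++;
        pixel_max >>= 1;
    }
    return bits;
}

/* Hypothetical kernel stand-in: clips a value to the range of a bpp-bit pixel. */
static int clip_to_depth(int value, int bpp)
{
    int max = (1 << bpp) - 1;
    if (value < 0)
        return 0;
    return value > max ? max : value;
}

/* Wrapper style: an extra call plus the rederivation of bpp on every use. */
static int clip_via_wrapper(int value, int pixel_max)
{
    return clip_to_depth(value, bits_from_pixel_max((unsigned)pixel_max));
}
```

Calling `clip_to_depth(value, 10)` directly gives the same result as `clip_via_wrapper(value, 1023)` while skipping the extra call and the bit-counting loop, which is the kind of saving the commit message describes for the small block sizes.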
  9. Change #258490

    Category ffmpeg
    Changed by Andreas Rheinhardt <andreas.rheinhardtohnoyoudont@outlook.com>
    Changed at Sun 22 Feb 2026 01:02:20
    Repository https://git.ffmpeg.org/ffmpeg.git
    Project ffmpeg
    Branch master
    Revision 8e82416434962b2c68a7bd794d76e998f1990f68

    Comments

    avcodec/x86/vvc/of: Avoid unused register
    Avoids a push+pop.
    
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

    Changed files

    • libavcodec/x86/vvc/of.asm
  10. Change #258491

    Category ffmpeg
    Changed by Andreas Rheinhardt <andreas.rheinhardtohnoyoudont@outlook.com>
    Changed at Sun 22 Feb 2026 01:03:22
    Repository https://git.ffmpeg.org/ffmpeg.git
    Project ffmpeg
    Branch master
    Revision 19dc7b79a4e7d2f9804d48369dbb132a39e1ac5d

    Comments

    avcodec/x86/vvc/of: Unify shuffling
    One can use the same shuffles for the width 8 and width 16
    case if one also changes the permutation in vpermd (that always
    follows pshufb for width 16).
    
    This also allows loading the shuffle mask before checking the width.
    
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

    Changed files

    • libavcodec/x86/vvc/of.asm
  11. Change #258492

    Category ffmpeg
    Changed by Andreas Rheinhardt <andreas.rheinhardtohnoyoudont@outlook.com>
    Changed at Sun 22 Feb 2026 01:05:12
    Repository https://git.ffmpeg.org/ffmpeg.git
    Project ffmpeg
    Branch master
    Revision af3f8f5bd2ed0d55cf8614064d722b533eef77e9

    Comments

    avcodec/x86/vvc/of: Break dependency chain
    Don't extract and update one word of one and the same register
    at a time; use separate src and dst registers, so that pextrw
    and bsr can be done in parallel. Also use movd instead of pinsrw
    for the first word.
    
    Old benchmarks:
    apply_bdof_8_8x16_c:                                  3275.2 ( 1.00x)
    apply_bdof_8_8x16_avx2:                                487.6 ( 6.72x)
    apply_bdof_8_16x8_c:                                  3243.1 ( 1.00x)
    apply_bdof_8_16x8_avx2:                                284.4 (11.40x)
    apply_bdof_8_16x16_c:                                 6501.8 ( 1.00x)
    apply_bdof_8_16x16_avx2:                               570.0 (11.41x)
    apply_bdof_10_8x16_c:                                 3286.5 ( 1.00x)
    apply_bdof_10_8x16_avx2:                               461.7 ( 7.12x)
    apply_bdof_10_16x8_c:                                 3274.5 ( 1.00x)
    apply_bdof_10_16x8_avx2:                               271.4 (12.06x)
    apply_bdof_10_16x16_c:                                6590.0 ( 1.00x)
    apply_bdof_10_16x16_avx2:                              543.9 (12.12x)
    apply_bdof_12_8x16_c:                                 3307.6 ( 1.00x)
    apply_bdof_12_8x16_avx2:                               462.2 ( 7.16x)
    apply_bdof_12_16x8_c:                                 3287.4 ( 1.00x)
    apply_bdof_12_16x8_avx2:                               271.8 (12.10x)
    apply_bdof_12_16x16_c:                                6465.7 ( 1.00x)
    apply_bdof_12_16x16_avx2:                              543.8 (11.89x)
    
    New benchmarks:
    apply_bdof_8_8x16_c:                                  3255.7 ( 1.00x)
    apply_bdof_8_8x16_avx2:                                349.3 ( 9.32x)
    apply_bdof_8_16x8_c:                                  3262.5 ( 1.00x)
    apply_bdof_8_16x8_avx2:                                214.8 (15.19x)
    apply_bdof_8_16x16_c:                                 6471.6 ( 1.00x)
    apply_bdof_8_16x16_avx2:                               429.8 (15.06x)
    apply_bdof_10_8x16_c:                                 3227.7 ( 1.00x)
    apply_bdof_10_8x16_avx2:                               321.6 (10.04x)
    apply_bdof_10_16x8_c:                                 3250.2 ( 1.00x)
    apply_bdof_10_16x8_avx2:                               201.2 (16.16x)
    apply_bdof_10_16x16_c:                                6476.5 ( 1.00x)
    apply_bdof_10_16x16_avx2:                              400.9 (16.16x)
    apply_bdof_12_8x16_c:                                 3230.7 ( 1.00x)
    apply_bdof_12_8x16_avx2:                               321.8 (10.04x)
    apply_bdof_12_16x8_c:                                 3210.5 ( 1.00x)
    apply_bdof_12_16x8_avx2:                               200.9 (15.98x)
    apply_bdof_12_16x16_c:                                6474.5 ( 1.00x)
    apply_bdof_12_16x16_avx2:                              400.2 (16.18x)
    
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

    Changed files

    • libavcodec/x86/vvc/of.asm
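
The data-flow change described for Change #258492 can be sketched with SSE2 intrinsics in C. This is an illustration under assumptions, not the code from of.asm: the function names are hypothetical, `highest_bit` is a portable stand-in for `bsr` (it maps an input of 0 to 0, whereas `bsr` leaves the result undefined), and the sketch assumes an x86 target with SSE2. The first variant extracts and reinserts each word in the same vector, the second reads from `src` and builds a separate `dst`, starting it with a `movd`-style move instead of an insert.

```c
#include <emmintrin.h>  /* SSE2: _mm_extract_epi16, _mm_insert_epi16, _mm_cvtsi32_si128 */

/* Index of the highest set bit, standing in for bsr (0 maps to 0 here). */
static int highest_bit(unsigned v)
{
    int n = 0;
    while (v >>= 1)
        n++;
    return n;
}

/* Chained variant: each step reads and rewrites the same vector, so every
 * extract depends on the previous insert. */
static __m128i log2_words_chained(__m128i v)
{
#define STEP(i) v = _mm_insert_epi16(v, highest_bit((unsigned)_mm_extract_epi16(v, i)), i)
    STEP(0); STEP(1); STEP(2); STEP(3);
    STEP(4); STEP(5); STEP(6); STEP(7);
#undef STEP
    return v;
}

/* Split variant: extracts only read src, inserts only write dst, so the
 * extracts do not have to wait for earlier inserts. */
static __m128i log2_words_split(__m128i src)
{
    /* First word initialises dst via a movd-style move (upper bits zeroed). */
    __m128i dst = _mm_cvtsi32_si128(highest_bit((unsigned)_mm_extract_epi16(src, 0)));
#define STEP(i) dst = _mm_insert_epi16(dst, highest_bit((unsigned)_mm_extract_epi16(src, i)), i)
    STEP(1); STEP(2); STEP(3);
    STEP(4); STEP(5); STEP(6); STEP(7);
#undef STEP
    return dst;
}
```

Both functions return the same vector. A C compiler is free to schedule the two variants identically, so the difference mainly matters in hand-written assembly, where the register choice fixes the dependency chain.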
  12. Change #258493

    Category ffmpeg
    Changed by Andreas Rheinhardt <andreas.rheinhardtohnoyoudont@outlook.com>
    Changed at Sun 22 Feb 2026 01:05:12
    Repository https://git.ffmpeg.org/ffmpeg.git
    Project ffmpeg
    Branch master
    Revision 6c1c1720cf940057ace674efbcb4c0621bf535d2

    Comments

    avcodec/x86/vvc/dsp_init: Mark dsp init function as av_cold
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

    Changed files

    • libavcodec/x86/vvc/dsp_init.c
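
For Change #258493, a minimal C sketch of what marking an init function as cold looks like. `SKETCH_COLD`, `SketchDSP`, `add_c` and `sketch_dsp_init` are hypothetical names for illustration. The sketch assumes that `av_cold` maps to the compiler's `cold` function attribute on GCC-compatible compilers and to nothing elsewhere; the real macro definition is not shown in this build log.

```c
/* Assumed equivalent of an av_cold-style macro: use the cold attribute
 * where the compiler supports it, otherwise expand to nothing. */
#if defined(__GNUC__) || defined(__clang__)
#  define SKETCH_COLD __attribute__((cold))
#else
#  define SKETCH_COLD
#endif

/* Hypothetical DSP context holding one function pointer. */
typedef struct SketchDSP {
    int (*add)(int a, int b);
} SketchDSP;

static int add_c(int a, int b)
{
    return a + b;
}

/* Init runs once per context, so it is a candidate for the cold attribute:
 * the compiler may optimise it for size and place it away from hot code. */
static SKETCH_COLD void sketch_dsp_init(SketchDSP *dsp)
{
    dsp->add = add_c;
}
```

The attribute does not change behaviour. It is only a hint about how rarely the function is expected to run.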