Builder ffmpeg-solaris10-i386 Build #13696
Results:
Failed shell_2 shell_3 shell_4 shell_5
SourceStamp:
| Project | ffmpeg |
| Repository | https://git.ffmpeg.org/ffmpeg.git |
| Branch | master |
| Revision | cc3ca1712760ae53957a3a5987cb7c61c290a451 |
| Got Revision | cc3ca1712760ae53957a3a5987cb7c61c290a451 |
| Changes | 18 changes |
BuildSlave:
unstable10x
Reason:
The SingleBranchScheduler scheduler named 'schedule-ffmpeg-solaris10-i386' triggered this build
Steps and Logfiles:
- git update ( 7 secs )
- shell 'gsed -i ...' ( 0 secs )
- shell_1 'gsed -i ...' ( 0 secs )
- shell_2 'gsed -i ...' failed ( 0 secs )
- shell_3 './configure --samples="../../../ffmpeg/fate-suite" ...' failed ( 9 secs )
- shell_4 'gmake fate-rsync' failed ( 0 secs )
- shell_5 '../../../ffmpeg/fate.sh ../../../ffmpeg/fate_config.sh' failed ( 3 secs )
Build Properties:
| Name | Value | Source |
|---|---|---|
| branch | master | Build |
| builddir | /export/home/buildbot/slave/ffmpeg-solaris10-i386 | slave |
| buildername | ffmpeg-solaris10-i386 | Builder |
| buildnumber | 13696 | Build |
| codebase | | Build |
| got_revision | cc3ca1712760ae53957a3a5987cb7c61c290a451 | Git |
| project | ffmpeg | Build |
| repository | https://git.ffmpeg.org/ffmpeg.git | Build |
| revision | cc3ca1712760ae53957a3a5987cb7c61c290a451 | Build |
| scheduler | schedule-ffmpeg-solaris10-i386 | Scheduler |
| slavename | unstable10x | BuildSlave |
| workdir | /export/home/buildbot/slave/ffmpeg-solaris10-i386 | slave (deprecated) |
Forced Build Properties:
| Name | Label | Value |
|---|---|---|
Responsible Users:
- Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Timing:
| Start | Thu Apr 30 11:22:53 2026 |
| End | Thu Apr 30 11:23:15 2026 |
| Elapsed | 22 secs |
All Changes:
Change #265959
Category ffmpeg Changed by Andreas Rheinhardt <andreas.rheinhardt@outlook.com> Changed at Thu 30 Apr 2026 10:39:32 Repository https://git.ffmpeg.org/ffmpeg.git Project ffmpeg Branch master Revision d46414b46becc927e89b7824424df9e34d05c8e7 Comments
avcodec/x86/qpeldsp: Simplify resetting output pointer Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Changed files
- libavcodec/x86/qpeldsp.asm
Change #265960
Category ffmpeg Changed by Andreas Rheinhardt <andreas.rheinhardt@outlook.com> Changed at Thu 30 Apr 2026 10:39:32 Repository https://git.ffmpeg.org/ffmpeg.git Project ffmpeg Branch master Revision d3bd1318b3ef38c34af52ee65fedc27e183f06d9 Comments
avcodec/x86/qpeldsp: Don't zero unnecessarily This value is write-only. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Changed files
- libavcodec/x86/qpeldsp.asm
Change #265961
Category ffmpeg Changed by Andreas Rheinhardt <andreas.rheinhardt@outlook.com> Changed at Thu 30 Apr 2026 10:39:32 Repository https://git.ffmpeg.org/ffmpeg.git Project ffmpeg Branch master Revision 69906d31c51306f2b87868fce234c8385bf06cd7 Comments
avcodec/x86/qpeldsp_init: Don't use unnecessarily big stack buffer Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Changed files
- libavcodec/x86/qpeldsp_init.c
Change #265962
Category ffmpeg Changed by Andreas Rheinhardt <andreas.rheinhardt@outlook.com> Changed at Thu 30 Apr 2026 10:39:32 Repository https://git.ffmpeg.org/ffmpeg.git Project ffmpeg Branch master Revision cf79d8052d84c08dde399ee9c6bd1ce8e1ff47b7 Comments
avcodec/x86/qpeldsp_init: Specify alignment properly Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Changed files
- libavcodec/x86/qpeldsp_init.c
Change #265963
Category ffmpeg Changed by Andreas Rheinhardt <andreas.rheinhardt@outlook.com> Changed at Thu 30 Apr 2026 10:39:32 Repository https://git.ffmpeg.org/ffmpeg.git Project ffmpeg Branch master Revision c2685234a650d7a533cb7a72229caf2d48cab2e2 Comments
avcodec/x86/qpeldsp_init: Deduplicate 8x8 and 16x16 code Also split the big macro into smaller ones for the pure horizontal vs the pure vertical and the mixed directions. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Changed files
- libavcodec/x86/qpeldsp_init.c
Change #265964
Category ffmpeg Changed by Andreas Rheinhardt <andreas.rheinhardt@outlook.com> Changed at Thu 30 Apr 2026 10:39:32 Repository https://git.ffmpeg.org/ffmpeg.git Project ffmpeg Branch master Revision 7b56259dd5bc3eb08c1dc2dfe6b31f71db160378 Comments
avcodec/x86/constants: Move ff_pw_{15,20} to qpeldsp.asm Only used there. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Changed files
- libavcodec/x86/constants.c
- libavcodec/x86/constants.h
- libavcodec/x86/qpeldsp.asm
Change #265965
Category ffmpeg Changed by Andreas Rheinhardt <andreas.rheinhardt@outlook.com> Changed at Thu 30 Apr 2026 10:39:33 Repository https://git.ffmpeg.org/ffmpeg.git Project ffmpeg Branch master Revision bcf7293a211b7c06e54d1c16a85f6c8e3826a7e8 Comments
avcodec/x86/qpeldsp: Remove unused declaration Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Changed files
- libavcodec/x86/qpeldsp.asm
Change #265966
Category ffmpeg Changed by Andreas Rheinhardt <andreas.rheinhardt@outlook.com> Changed at Thu 30 Apr 2026 10:39:33 Repository https://git.ffmpeg.org/ffmpeg.git Project ffmpeg Branch master Revision 188df9549c5c120728b43f953ec281e67e4bb3c3 Comments
avcodec/x86/qpeldsp: Don't use too much stack We only need (SIZE+1)*SIZE words. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Changed files
- libavcodec/x86/qpeldsp.asm
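The stack bound quoted in the commit message above, (SIZE+1)*SIZE words, can be worked through for concrete sizes. The sketch below is only illustrative: it assumes SIZE is the block width (8 or 16, matching the 8x8 and 16x16 cases mentioned in the other changes) and that a "word" is a 16-bit x86 word; neither assumption is stated in the commit message, and `scratch_words` is a hypothetical helper name.

```python
# Assumption: "word" means a 16-bit (2-byte) x86 word.
WORD_BYTES = 2


def scratch_words(size):
    """Words of scratch space according to the (SIZE+1)*SIZE bound."""
    return (size + 1) * size


# Assumption: SIZE is the block width, 8 or 16.
for size in (8, 16):
    words = scratch_words(size)
    print(f"SIZE={size}: {words} words = {words * WORD_BYTES} bytes")
```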
Change #265967
Category ffmpeg Changed by Andreas Rheinhardt <andreas.rheinhardt@outlook.com> Changed at Thu 30 Apr 2026 10:39:33 Repository https://git.ffmpeg.org/ffmpeg.git Project ffmpeg Branch master Revision 405465700cf8fd529f2dbbc0317eab6a9ede23f2 Comments
avcodec/x86/qpeldsp: Don't allocate stack unnecessarily Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Changed files
- libavcodec/x86/qpeldsp.asm
Change #265968
Category ffmpeg Changed by Andreas Rheinhardt <andreas.rheinhardt@outlook.com> Changed at Thu 30 Apr 2026 10:39:33 Repository https://git.ffmpeg.org/ffmpeg.git Project ffmpeg Branch master Revision 9beecb26704e8d9a4a27c07fd8da05eb94cf45ed Comments
avcodec/x86/qpeldsp: Add SSE2 vertical lowpass functions
Benchmarks ([4], [8] and [12] are pure vertical functions and therefore show the biggest improvements):
avg_qpel_pixels_tab[0][4]_c: 844.5 ( 1.00x)
avg_qpel_pixels_tab[0][4]_mmxext: 225.5 ( 3.74x)
avg_qpel_pixels_tab[0][4]_sse2: 146.6 ( 5.76x)
avg_qpel_pixels_tab[0][5]_c: 1915.9 ( 1.00x)
avg_qpel_pixels_tab[0][5]_mmxext: 499.6 ( 3.83x)
avg_qpel_pixels_tab[0][5]_sse2: 405.5 ( 4.72x)
avg_qpel_pixels_tab[0][6]_c: 1775.9 ( 1.00x)
avg_qpel_pixels_tab[0][6]_mmxext: 484.9 ( 3.66x)
avg_qpel_pixels_tab[0][6]_sse2: 385.4 ( 4.61x)
avg_qpel_pixels_tab[0][7]_c: 1937.0 ( 1.00x)
avg_qpel_pixels_tab[0][7]_mmxext: 501.3 ( 3.86x)
avg_qpel_pixels_tab[0][7]_sse2: 403.6 ( 4.80x)
avg_qpel_pixels_tab[0][8]_c: 976.7 ( 1.00x)
avg_qpel_pixels_tab[0][8]_mmxext: 216.9 ( 4.50x)
avg_qpel_pixels_tab[0][8]_sse2: 113.1 ( 8.64x)
avg_qpel_pixels_tab[0][9]_c: 1971.8 ( 1.00x)
avg_qpel_pixels_tab[0][9]_mmxext: 494.9 ( 3.98x)
avg_qpel_pixels_tab[0][9]_sse2: 388.3 ( 5.08x)
avg_qpel_pixels_tab[0][10]_c: 1900.8 ( 1.00x)
avg_qpel_pixels_tab[0][10]_mmxext: 476.4 ( 3.99x)
avg_qpel_pixels_tab[0][10]_sse2: 362.4 ( 5.24x)
avg_qpel_pixels_tab[0][11]_c: 2003.3 ( 1.00x)
avg_qpel_pixels_tab[0][11]_mmxext: 496.5 ( 4.04x)
avg_qpel_pixels_tab[0][11]_sse2: 385.9 ( 5.19x)
avg_qpel_pixels_tab[0][12]_c: 841.8 ( 1.00x)
avg_qpel_pixels_tab[0][12]_mmxext: 226.7 ( 3.71x)
avg_qpel_pixels_tab[0][12]_sse2: 143.3 ( 5.87x)
avg_qpel_pixels_tab[0][13]_c: 1929.0 ( 1.00x)
avg_qpel_pixels_tab[0][13]_mmxext: 499.6 ( 3.86x)
avg_qpel_pixels_tab[0][13]_sse2: 412.1 ( 4.68x)
avg_qpel_pixels_tab[0][14]_c: 1777.9 ( 1.00x)
avg_qpel_pixels_tab[0][14]_mmxext: 484.8 ( 3.67x)
avg_qpel_pixels_tab[0][14]_sse2: 385.9 ( 4.61x)
avg_qpel_pixels_tab[0][15]_c: 1914.8 ( 1.00x)
avg_qpel_pixels_tab[0][15]_mmxext: 501.8 ( 3.82x)
avg_qpel_pixels_tab[0][15]_sse2: 405.0 ( 4.73x)
avg_qpel_pixels_tab[1][4]_c: 203.4 ( 1.00x)
avg_qpel_pixels_tab[1][4]_mmxext: 64.7 ( 3.14x)
avg_qpel_pixels_tab[1][4]_sse2: 40.3 ( 5.05x)
avg_qpel_pixels_tab[1][5]_c: 488.8 ( 1.00x)
avg_qpel_pixels_tab[1][5]_mmxext: 134.6 ( 3.63x)
avg_qpel_pixels_tab[1][5]_sse2: 108.5 ( 4.50x)
avg_qpel_pixels_tab[1][6]_c: 448.2 ( 1.00x)
avg_qpel_pixels_tab[1][6]_mmxext: 128.8 ( 3.48x)
avg_qpel_pixels_tab[1][6]_sse2: 102.5 ( 4.37x)
avg_qpel_pixels_tab[1][7]_c: 489.6 ( 1.00x)
avg_qpel_pixels_tab[1][7]_mmxext: 134.5 ( 3.64x)
avg_qpel_pixels_tab[1][7]_sse2: 108.8 ( 4.50x)
avg_qpel_pixels_tab[1][8]_c: 223.8 ( 1.00x)
avg_qpel_pixels_tab[1][8]_mmxext: 57.5 ( 3.89x)
avg_qpel_pixels_tab[1][8]_sse2: 36.3 ( 6.16x)
avg_qpel_pixels_tab[1][9]_c: 496.6 ( 1.00x)
avg_qpel_pixels_tab[1][9]_mmxext: 129.8 ( 3.82x)
avg_qpel_pixels_tab[1][9]_sse2: 105.1 ( 4.72x)
avg_qpel_pixels_tab[1][10]_c: 466.1 ( 1.00x)
avg_qpel_pixels_tab[1][10]_mmxext: 123.2 ( 3.78x)
avg_qpel_pixels_tab[1][10]_sse2: 99.1 ( 4.70x)
avg_qpel_pixels_tab[1][11]_c: 497.9 ( 1.00x)
avg_qpel_pixels_tab[1][11]_mmxext: 129.9 ( 3.83x)
avg_qpel_pixels_tab[1][11]_sse2: 105.4 ( 4.72x)
avg_qpel_pixels_tab[1][12]_c: 203.5 ( 1.00x)
avg_qpel_pixels_tab[1][12]_mmxext: 63.8 ( 3.19x)
avg_qpel_pixels_tab[1][12]_sse2: 38.8 ( 5.25x)
avg_qpel_pixels_tab[1][13]_c: 487.9 ( 1.00x)
avg_qpel_pixels_tab[1][13]_mmxext: 134.7 ( 3.62x)
avg_qpel_pixels_tab[1][13]_sse2: 108.4 ( 4.50x)
avg_qpel_pixels_tab[1][14]_c: 447.4 ( 1.00x)
avg_qpel_pixels_tab[1][14]_mmxext: 128.2 ( 3.49x)
avg_qpel_pixels_tab[1][14]_sse2: 102.4 ( 4.37x)
avg_qpel_pixels_tab[1][15]_c: 487.5 ( 1.00x)
avg_qpel_pixels_tab[1][15]_mmxext: 134.0 ( 3.64x)
avg_qpel_pixels_tab[1][15]_sse2: 109.9 ( 4.44x)
put_no_rnd_qpel_pixels_tab[0][4]_c: 825.5 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][4]_mmxext: 242.5 ( 3.40x)
put_no_rnd_qpel_pixels_tab[0][4]_sse2: 136.0 ( 6.07x)
put_no_rnd_qpel_pixels_tab[0][5]_c: 1837.4 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][5]_mmxext: 542.5 ( 3.39x)
put_no_rnd_qpel_pixels_tab[0][5]_sse2: 446.5 ( 4.11x)
put_no_rnd_qpel_pixels_tab[0][6]_c: 1766.3 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][6]_mmxext: 493.6 ( 3.58x)
put_no_rnd_qpel_pixels_tab[0][6]_sse2: 394.6 ( 4.48x)
put_no_rnd_qpel_pixels_tab[0][7]_c: 1877.4 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][7]_mmxext: 541.9 ( 3.46x)
put_no_rnd_qpel_pixels_tab[0][7]_sse2: 447.6 ( 4.19x)
put_no_rnd_qpel_pixels_tab[0][8]_c: 785.1 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][8]_mmxext: 206.2 ( 3.81x)
put_no_rnd_qpel_pixels_tab[0][8]_sse2: 101.6 ( 7.73x)
put_no_rnd_qpel_pixels_tab[0][9]_c: 1772.2 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][9]_mmxext: 489.5 ( 3.62x)
put_no_rnd_qpel_pixels_tab[0][9]_sse2: 394.8 ( 4.49x)
put_no_rnd_qpel_pixels_tab[0][10]_c: 1711.5 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][10]_mmxext: 461.2 ( 3.71x)
put_no_rnd_qpel_pixels_tab[0][10]_sse2: 357.9 ( 4.78x)
put_no_rnd_qpel_pixels_tab[0][11]_c: 1815.9 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][11]_mmxext: 490.8 ( 3.70x)
put_no_rnd_qpel_pixels_tab[0][11]_sse2: 394.0 ( 4.61x)
put_no_rnd_qpel_pixels_tab[0][12]_c: 824.8 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][12]_mmxext: 242.9 ( 3.40x)
put_no_rnd_qpel_pixels_tab[0][12]_sse2: 135.3 ( 6.10x)
put_no_rnd_qpel_pixels_tab[0][13]_c: 1843.5 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][13]_mmxext: 545.4 ( 3.38x)
put_no_rnd_qpel_pixels_tab[0][13]_sse2: 444.9 ( 4.14x)
put_no_rnd_qpel_pixels_tab[0][14]_c: 1758.1 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][14]_mmxext: 497.7 ( 3.53x)
put_no_rnd_qpel_pixels_tab[0][14]_sse2: 393.5 ( 4.47x)
put_no_rnd_qpel_pixels_tab[0][15]_c: 1861.3 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][15]_mmxext: 545.0 ( 3.42x)
put_no_rnd_qpel_pixels_tab[0][15]_sse2: 445.7 ( 4.18x)
put_no_rnd_qpel_pixels_tab[1][4]_c: 198.3 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][4]_mmxext: 64.3 ( 3.08x)
put_no_rnd_qpel_pixels_tab[1][4]_sse2: 39.8 ( 4.98x)
put_no_rnd_qpel_pixels_tab[1][5]_c: 460.7 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][5]_mmxext: 137.2 ( 3.36x)
put_no_rnd_qpel_pixels_tab[1][5]_sse2: 113.5 ( 4.06x)
put_no_rnd_qpel_pixels_tab[1][6]_c: 441.4 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][6]_mmxext: 126.7 ( 3.49x)
put_no_rnd_qpel_pixels_tab[1][6]_sse2: 103.7 ( 4.26x)
put_no_rnd_qpel_pixels_tab[1][7]_c: 465.9 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][7]_mmxext: 137.7 ( 3.38x)
put_no_rnd_qpel_pixels_tab[1][7]_sse2: 114.0 ( 4.09x)
put_no_rnd_qpel_pixels_tab[1][8]_c: 193.8 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][8]_mmxext: 52.1 ( 3.72x)
put_no_rnd_qpel_pixels_tab[1][8]_sse2: 27.8 ( 6.97x)
put_no_rnd_qpel_pixels_tab[1][9]_c: 450.9 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][9]_mmxext: 126.2 ( 3.57x)
put_no_rnd_qpel_pixels_tab[1][9]_sse2: 104.3 ( 4.32x)
put_no_rnd_qpel_pixels_tab[1][10]_c: 436.5 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][10]_mmxext: 118.1 ( 3.69x)
put_no_rnd_qpel_pixels_tab[1][10]_sse2: 92.4 ( 4.73x)
put_no_rnd_qpel_pixels_tab[1][11]_c: 453.6 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][11]_mmxext: 128.7 ( 3.52x)
put_no_rnd_qpel_pixels_tab[1][11]_sse2: 103.6 ( 4.38x)
put_no_rnd_qpel_pixels_tab[1][12]_c: 201.2 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][12]_mmxext: 64.2 ( 3.13x)
put_no_rnd_qpel_pixels_tab[1][12]_sse2: 39.6 ( 5.08x)
put_no_rnd_qpel_pixels_tab[1][13]_c: 461.9 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][13]_mmxext: 137.6 ( 3.36x)
put_no_rnd_qpel_pixels_tab[1][13]_sse2: 113.4 ( 4.07x)
put_no_rnd_qpel_pixels_tab[1][14]_c: 442.6 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][14]_mmxext: 127.0 ( 3.49x)
put_no_rnd_qpel_pixels_tab[1][14]_sse2: 102.2 ( 4.33x)
put_no_rnd_qpel_pixels_tab[1][15]_c: 462.9 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][15]_mmxext: 139.5 ( 3.32x)
put_no_rnd_qpel_pixels_tab[1][15]_sse2: 113.3 ( 4.09x)
put_qpel_pixels_tab[0][4]_c: 824.6 ( 1.00x)
put_qpel_pixels_tab[0][4]_mmxext: 220.1 ( 3.75x)
put_qpel_pixels_tab[0][4]_sse2: 137.8 ( 5.98x)
put_qpel_pixels_tab[0][5]_c: 1892.0 ( 1.00x)
put_qpel_pixels_tab[0][5]_mmxext: 508.0 ( 3.72x)
put_qpel_pixels_tab[0][5]_sse2: 408.6 ( 4.63x)
put_qpel_pixels_tab[0][6]_c: 1758.0 ( 1.00x)
put_qpel_pixels_tab[0][6]_mmxext: 476.7 ( 3.69x)
put_qpel_pixels_tab[0][6]_sse2: 381.4 ( 4.61x)
put_qpel_pixels_tab[0][7]_c: 1924.3 ( 1.00x)
put_qpel_pixels_tab[0][7]_mmxext: 495.1 ( 3.89x)
put_qpel_pixels_tab[0][7]_sse2: 417.2 ( 4.61x)
put_qpel_pixels_tab[0][8]_c: 772.1 ( 1.00x)
put_qpel_pixels_tab[0][8]_mmxext: 197.5 ( 3.91x)
put_qpel_pixels_tab[0][8]_sse2: 118.4 ( 6.52x)
put_qpel_pixels_tab[0][9]_c: 1778.2 ( 1.00x)
put_qpel_pixels_tab[0][9]_mmxext: 476.7 ( 3.73x)
put_qpel_pixels_tab[0][9]_sse2: 379.6 ( 4.68x)
put_qpel_pixels_tab[0][10]_c: 1714.6 ( 1.00x)
put_qpel_pixels_tab[0][10]_mmxext: 460.7 ( 3.72x)
put_qpel_pixels_tab[0][10]_sse2: 386.8 ( 4.43x)
put_qpel_pixels_tab[0][11]_c: 1819.1 ( 1.00x)
put_qpel_pixels_tab[0][11]_mmxext: 474.9 ( 3.83x)
put_qpel_pixels_tab[0][11]_sse2: 404.5 ( 4.50x)
put_qpel_pixels_tab[0][12]_c: 829.7 ( 1.00x)
put_qpel_pixels_tab[0][12]_mmxext: 221.5 ( 3.75x)
put_qpel_pixels_tab[0][12]_sse2: 138.7 ( 5.98x)
put_qpel_pixels_tab[0][13]_c: 1892.8 ( 1.00x)
put_qpel_pixels_tab[0][13]_mmxext: 494.4 ( 3.83x)
put_qpel_pixels_tab[0][13]_sse2: 413.9 ( 4.57x)
put_qpel_pixels_tab[0][14]_c: 1763.1 ( 1.00x)
put_qpel_pixels_tab[0][14]_mmxext: 473.4 ( 3.72x)
put_qpel_pixels_tab[0][14]_sse2: 377.8 ( 4.67x)
put_qpel_pixels_tab[0][15]_c: 1896.4 ( 1.00x)
put_qpel_pixels_tab[0][15]_mmxext: 492.5 ( 3.85x)
put_qpel_pixels_tab[0][15]_sse2: 399.0 ( 4.75x)
put_qpel_pixels_tab[1][4]_c: 198.6 ( 1.00x)
put_qpel_pixels_tab[1][4]_mmxext: 60.9 ( 3.26x)
put_qpel_pixels_tab[1][4]_sse2: 40.1 ( 4.95x)
put_qpel_pixels_tab[1][5]_c: 471.4 ( 1.00x)
put_qpel_pixels_tab[1][5]_mmxext: 131.8 ( 3.58x)
put_qpel_pixels_tab[1][5]_sse2: 107.2 ( 4.40x)
put_qpel_pixels_tab[1][6]_c: 440.3 ( 1.00x)
put_qpel_pixels_tab[1][6]_mmxext: 126.3 ( 3.49x)
put_qpel_pixels_tab[1][6]_sse2: 100.6 ( 4.38x)
put_qpel_pixels_tab[1][7]_c: 469.2 ( 1.00x)
put_qpel_pixels_tab[1][7]_mmxext: 131.7 ( 3.56x)
put_qpel_pixels_tab[1][7]_sse2: 106.9 ( 4.39x)
put_qpel_pixels_tab[1][8]_c: 194.2 ( 1.00x)
put_qpel_pixels_tab[1][8]_mmxext: 52.9 ( 3.67x)
put_qpel_pixels_tab[1][8]_sse2: 28.0 ( 6.95x)
put_qpel_pixels_tab[1][9]_c: 464.6 ( 1.00x)
put_qpel_pixels_tab[1][9]_mmxext: 125.1 ( 3.71x)
put_qpel_pixels_tab[1][9]_sse2: 100.9 ( 4.60x)
put_qpel_pixels_tab[1][10]_c: 433.8 ( 1.00x)
put_qpel_pixels_tab[1][10]_mmxext: 118.2 ( 3.67x)
put_qpel_pixels_tab[1][10]_sse2: 94.5 ( 4.59x)
put_qpel_pixels_tab[1][11]_c: 463.9 ( 1.00x)
put_qpel_pixels_tab[1][11]_mmxext: 125.5 ( 3.70x)
put_qpel_pixels_tab[1][11]_sse2: 102.6 ( 4.52x)
put_qpel_pixels_tab[1][12]_c: 199.2 ( 1.00x)
put_qpel_pixels_tab[1][12]_mmxext: 63.7 ( 3.12x)
put_qpel_pixels_tab[1][12]_sse2: 36.2 ( 5.50x)
put_qpel_pixels_tab[1][13]_c: 475.6 ( 1.00x)
put_qpel_pixels_tab[1][13]_mmxext: 139.5 ( 3.41x)
put_qpel_pixels_tab[1][13]_sse2: 107.3 ( 4.43x)
put_qpel_pixels_tab[1][14]_c: 441.9 ( 1.00x)
put_qpel_pixels_tab[1][14]_mmxext: 126.9 ( 3.48x)
put_qpel_pixels_tab[1][14]_sse2: 101.3 ( 4.36x)
put_qpel_pixels_tab[1][15]_c: 475.9 ( 1.00x)
put_qpel_pixels_tab[1][15]_mmxext: 131.9 ( 3.61x)
put_qpel_pixels_tab[1][15]_sse2: 107.0 ( 4.45x)
The new functions (in qpeldsp.asm) occupy 8244B (the MMXEXT functions which they will replace occupy only 6720B).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Changed files
- libavcodec/x86/qpeldsp.asm
- libavcodec/x86/qpeldsp_init.c
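The benchmark entries in the commit message above follow the pattern `name_variant: value ( ratio x)`. As a minimal sketch, the snippet below parses such entries and recomputes each variant's speedup relative to the `_c` baseline. The helper names `parse_benchmarks` and `speedups` are hypothetical, the unit of the measured value is not stated in the source (so it is treated here as an opaque cost), and entries carrying an extra label such as `(old)` are deliberately not matched by this regular expression.

```python
import re

# Matches entries such as "avg_qpel_pixels_tab[0][4]_sse2: 146.6 ( 5.76x)".
ENTRY = re.compile(
    r"(\w+\[\d+\]\[\d+\])_(\w+):\s*([\d.]+)\s*\(\s*([\d.]+)x\)"
)


def parse_benchmarks(text):
    """Return {function: {variant: cost}} for every entry found in text."""
    results = {}
    for func, variant, cost, _ratio in ENTRY.findall(text):
        results.setdefault(func, {})[variant] = float(cost)
    return results


def speedups(results, baseline="c"):
    """Return {function: {variant: baseline_cost / variant_cost}}."""
    out = {}
    for func, variants in results.items():
        if baseline not in variants:
            continue  # no baseline entry, so no ratio can be formed
        base = variants[baseline]
        out[func] = {v: base / t for v, t in variants.items() if v != baseline}
    return out


# Three entries copied from the commit message.
sample = (
    "avg_qpel_pixels_tab[0][4]_c: 844.5 ( 1.00x) "
    "avg_qpel_pixels_tab[0][4]_mmxext: 225.5 ( 3.74x) "
    "avg_qpel_pixels_tab[0][4]_sse2: 146.6 ( 5.76x)"
)
table = speedups(parse_benchmarks(sample))
for func, ratios in table.items():
    for variant, ratio in ratios.items():
        print(f"{func} {variant}: {ratio:.2f}x")
```

Ratios recomputed from the rounded values may differ from the printed ones in the last digit, since the originals were presumably derived from unrounded measurements.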
Change #265969
Category ffmpeg Changed by Andreas Rheinhardt <andreas.rheinhardt@outlook.com> Changed at Thu 30 Apr 2026 10:39:33 Repository https://git.ffmpeg.org/ffmpeg.git Project ffmpeg Branch master Revision dad0c010761cfaf98ddebdb188300f117c370295 Comments
avcodec/x86/qpeldsp: Remove vertical MMXEXT mc functions Superseded by SSE2. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Changed files
- libavcodec/x86/qpeldsp.asm
- libavcodec/x86/qpeldsp_init.c
Change #265970
Category ffmpeg Changed by Andreas Rheinhardt <andreas.rheinhardt@outlook.com> Changed at Thu 30 Apr 2026 10:39:33 Repository https://git.ffmpeg.org/ffmpeg.git Project ffmpeg Branch master Revision a3d747f3446e828c5e03880d0459d51994f6ec15 Comments
avcodec/x86/qpeldsp{,_init}: Use SSE2 pixels16x16_l2 functions
put and avg versions have been added and used in H264 in b91081274f5a5b5f0f1ce820331f702378a425e8. This commit adds the size 16 version of put_no_rnd and uses all three of them in the SSE2 size 16 qpel functions (i.e. it uses them in the ones that have a vertical component); it also removes the 16x17 MMXEXT versions (which are no longer used). This is particularly beneficial for put_no_rnd:
avg_qpel_pixels_tab[0][5]_c: 1910.9 ( 1.00x)
avg_qpel_pixels_tab[0][5]_sse2 (old): 405.1 ( 4.72x)
avg_qpel_pixels_tab[0][5]_sse2: 392.9 ( 4.86x)
avg_qpel_pixels_tab[0][6]_c: 1778.9 ( 1.00x)
avg_qpel_pixels_tab[0][6]_sse2 (old): 385.5 ( 4.61x)
avg_qpel_pixels_tab[0][6]_sse2: 374.9 ( 4.75x)
avg_qpel_pixels_tab[0][7]_c: 1935.3 ( 1.00x)
avg_qpel_pixels_tab[0][7]_sse2 (old): 403.1 ( 4.80x)
avg_qpel_pixels_tab[0][7]_sse2: 391.6 ( 4.94x)
avg_qpel_pixels_tab[0][9]_c: 1969.0 ( 1.00x)
avg_qpel_pixels_tab[0][9]_sse2 (old): 384.1 ( 5.13x)
avg_qpel_pixels_tab[0][9]_sse2: 380.3 ( 5.18x)
avg_qpel_pixels_tab[0][11]_c: 2014.9 ( 1.00x)
avg_qpel_pixels_tab[0][11]_sse2 (old): 385.6 ( 5.23x)
avg_qpel_pixels_tab[0][11]_sse2: 380.2 ( 5.30x)
avg_qpel_pixels_tab[0][13]_c: 1925.7 ( 1.00x)
avg_qpel_pixels_tab[0][13]_sse2 (old): 406.1 ( 4.74x)
avg_qpel_pixels_tab[0][13]_sse2: 390.4 ( 4.93x)
avg_qpel_pixels_tab[0][14]_c: 1793.0 ( 1.00x)
avg_qpel_pixels_tab[0][14]_sse2 (old): 389.6 ( 4.60x)
avg_qpel_pixels_tab[0][14]_sse2: 377.1 ( 4.75x)
avg_qpel_pixels_tab[0][15]_c: 1913.0 ( 1.00x)
avg_qpel_pixels_tab[0][15]_sse2 (old): 404.2 ( 4.73x)
avg_qpel_pixels_tab[0][15]_sse2: 390.8 ( 4.89x)
put_no_rnd_qpel_pixels_tab[0][5]_c: 1864.1 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][5]_sse2 (old): 425.6 ( 4.38x)
put_no_rnd_qpel_pixels_tab[0][5]_sse2: 396.2 ( 4.71x)
put_no_rnd_qpel_pixels_tab[0][6]_c: 1767.1 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][6]_sse2 (old): 388.4 ( 4.55x)
put_no_rnd_qpel_pixels_tab[0][6]_sse2: 377.7 ( 4.68x)
put_no_rnd_qpel_pixels_tab[0][7]_c: 1874.9 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][7]_sse2 (old): 427.6 ( 4.38x)
put_no_rnd_qpel_pixels_tab[0][7]_sse2: 400.0 ( 4.69x)
put_no_rnd_qpel_pixels_tab[0][9]_c: 1759.7 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][9]_sse2 (old): 393.0 ( 4.48x)
put_no_rnd_qpel_pixels_tab[0][9]_sse2: 379.7 ( 4.63x)
put_no_rnd_qpel_pixels_tab[0][11]_c: 1820.9 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][11]_sse2 (old): 392.7 ( 4.64x)
put_no_rnd_qpel_pixels_tab[0][11]_sse2: 377.4 ( 4.82x)
put_no_rnd_qpel_pixels_tab[0][13]_c: 1841.2 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][13]_sse2 (old): 427.1 ( 4.31x)
put_no_rnd_qpel_pixels_tab[0][13]_sse2: 395.9 ( 4.65x)
put_no_rnd_qpel_pixels_tab[0][14]_c: 1761.3 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][14]_sse2 (old): 392.3 ( 4.49x)
put_no_rnd_qpel_pixels_tab[0][14]_sse2: 375.9 ( 4.69x)
put_no_rnd_qpel_pixels_tab[0][15]_c: 1869.1 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][15]_sse2 (old): 425.6 ( 4.39x)
put_no_rnd_qpel_pixels_tab[0][15]_sse2: 397.3 ( 4.70x)
put_qpel_pixels_tab[0][5]_c: 1888.2 ( 1.00x)
put_qpel_pixels_tab[0][5]_sse2 (old): 396.5 ( 4.76x)
put_qpel_pixels_tab[0][5]_sse2: 382.5 ( 4.94x)
put_qpel_pixels_tab[0][6]_c: 1760.4 ( 1.00x)
put_qpel_pixels_tab[0][6]_sse2 (old): 377.0 ( 4.67x)
put_qpel_pixels_tab[0][6]_sse2: 372.1 ( 4.73x)
put_qpel_pixels_tab[0][7]_c: 1927.6 ( 1.00x)
put_qpel_pixels_tab[0][7]_sse2 (old): 396.5 ( 4.86x)
put_qpel_pixels_tab[0][7]_sse2: 383.4 ( 5.03x)
put_qpel_pixels_tab[0][9]_c: 1775.9 ( 1.00x)
put_qpel_pixels_tab[0][9]_sse2 (old): 377.9 ( 4.70x)
put_qpel_pixels_tab[0][9]_sse2: 372.3 ( 4.77x)
put_qpel_pixels_tab[0][11]_c: 1809.0 ( 1.00x)
put_qpel_pixels_tab[0][11]_sse2 (old): 374.6 ( 4.83x)
put_qpel_pixels_tab[0][11]_sse2: 380.3 ( 4.76x)
put_qpel_pixels_tab[0][13]_c: 1893.2 ( 1.00x)
put_qpel_pixels_tab[0][13]_sse2 (old): 399.2 ( 4.74x)
put_qpel_pixels_tab[0][13]_sse2: 384.7 ( 4.92x)
put_qpel_pixels_tab[0][14]_c: 1756.2 ( 1.00x)
put_qpel_pixels_tab[0][14]_sse2 (old): 377.9 ( 4.65x)
put_qpel_pixels_tab[0][14]_sse2: 374.4 ( 4.69x)
put_qpel_pixels_tab[0][15]_c: 1922.8 ( 1.00x)
put_qpel_pixels_tab[0][15]_sse2 (old): 399.0 ( 4.82x)
put_qpel_pixels_tab[0][15]_sse2: 387.8 ( 4.96x)
The purely vertical size 16 mc functions now no longer use any MMX.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Changed files
- libavcodec/x86/qpel.asm
- libavcodec/x86/qpeldsp.asm
- libavcodec/x86/qpeldsp_init.c
Change #265971
Category ffmpeg Changed by Andreas Rheinhardt <andreas.rheinhardt@outlook.com> Changed at Thu 30 Apr 2026 10:39:33 Repository https://git.ffmpeg.org/ffmpeg.git Project ffmpeg Branch master Revision c0e1c1d6b3245a5bf46b5cb5c22cd16a9138a21b Comments
avcodec/x86/qpeldsp: Add SSSE3 size 16 horizontal filter
Beats the mmxext version by a lot (in the following, [0][1-3] refers to horizontal-only size 16 mc; the _sse2 comparators for the other cases use mmxext horizontal mc coupled with vertical SSE2 mc):
avg_qpel_pixels_tab[0][1]_c: 945.5 ( 1.00x)
avg_qpel_pixels_tab[0][1]_mmxext: 262.6 ( 3.60x)
avg_qpel_pixels_tab[0][1]_ssse3: 110.4 ( 8.57x)
avg_qpel_pixels_tab[0][2]_c: 1042.1 ( 1.00x)
avg_qpel_pixels_tab[0][2]_mmxext: 245.1 ( 4.25x)
avg_qpel_pixels_tab[0][2]_ssse3: 91.7 (11.37x)
avg_qpel_pixels_tab[0][3]_c: 941.8 ( 1.00x)
avg_qpel_pixels_tab[0][3]_mmxext: 260.1 ( 3.62x)
avg_qpel_pixels_tab[0][3]_ssse3: 110.1 ( 8.56x)
avg_qpel_pixels_tab[0][5]_c: 1939.5 ( 1.00x)
avg_qpel_pixels_tab[0][5]_sse2: 394.3 ( 4.92x)
avg_qpel_pixels_tab[0][5]_ssse3: 247.4 ( 7.84x)
avg_qpel_pixels_tab[0][6]_c: 1785.8 ( 1.00x)
avg_qpel_pixels_tab[0][6]_sse2: 380.6 ( 4.69x)
avg_qpel_pixels_tab[0][6]_ssse3: 221.1 ( 8.08x)
avg_qpel_pixels_tab[0][7]_c: 1932.5 ( 1.00x)
avg_qpel_pixels_tab[0][7]_sse2: 393.4 ( 4.91x)
avg_qpel_pixels_tab[0][7]_ssse3: 238.8 ( 8.09x)
avg_qpel_pixels_tab[0][9]_c: 1976.9 ( 1.00x)
avg_qpel_pixels_tab[0][9]_sse2: 380.8 ( 5.19x)
avg_qpel_pixels_tab[0][9]_ssse3: 223.3 ( 8.85x)
avg_qpel_pixels_tab[0][10]_c: 1911.9 ( 1.00x)
avg_qpel_pixels_tab[0][10]_sse2: 366.9 ( 5.21x)
avg_qpel_pixels_tab[0][10]_ssse3: 207.0 ( 9.24x)
avg_qpel_pixels_tab[0][11]_c: 2046.9 ( 1.00x)
avg_qpel_pixels_tab[0][11]_sse2: 385.5 ( 5.31x)
avg_qpel_pixels_tab[0][11]_ssse3: 227.9 ( 8.98x)
avg_qpel_pixels_tab[0][13]_c: 1940.8 ( 1.00x)
avg_qpel_pixels_tab[0][13]_sse2: 389.7 ( 4.98x)
avg_qpel_pixels_tab[0][13]_ssse3: 244.2 ( 7.95x)
avg_qpel_pixels_tab[0][14]_c: 1778.4 ( 1.00x)
avg_qpel_pixels_tab[0][14]_sse2: 379.2 ( 4.69x)
avg_qpel_pixels_tab[0][14]_ssse3: 223.5 ( 7.96x)
avg_qpel_pixels_tab[0][15]_c: 1905.9 ( 1.00x)
avg_qpel_pixels_tab[0][15]_sse2: 398.9 ( 4.78x)
avg_qpel_pixels_tab[0][15]_ssse3: 238.3 ( 8.00x)
put_no_rnd_qpel_pixels_tab[0][1]_c: 922.5 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][1]_mmxext: 275.0 ( 3.35x)
put_no_rnd_qpel_pixels_tab[0][1]_ssse3: 108.4 ( 8.51x)
put_no_rnd_qpel_pixels_tab[0][2]_c: 889.7 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][2]_mmxext: 236.7 ( 3.76x)
put_no_rnd_qpel_pixels_tab[0][2]_ssse3: 86.8 (10.25x)
put_no_rnd_qpel_pixels_tab[0][3]_c: 915.5 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][3]_mmxext: 274.3 ( 3.34x)
put_no_rnd_qpel_pixels_tab[0][3]_ssse3: 108.2 ( 8.46x)
put_no_rnd_qpel_pixels_tab[0][5]_sse2: 400.0 ( 4.63x)
put_no_rnd_qpel_pixels_tab[0][5]_ssse3: 246.0 ( 7.53x)
put_no_rnd_qpel_pixels_tab[0][6]_c: 1753.9 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][6]_sse2: 382.5 ( 4.59x)
put_no_rnd_qpel_pixels_tab[0][6]_ssse3: 226.4 ( 7.75x)
put_no_rnd_qpel_pixels_tab[0][7]_c: 1854.6 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][7]_sse2: 393.5 ( 4.71x)
put_no_rnd_qpel_pixels_tab[0][7]_ssse3: 248.6 ( 7.46x)
put_no_rnd_qpel_pixels_tab[0][9]_c: 1794.3 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][9]_sse2: 382.2 ( 4.70x)
put_no_rnd_qpel_pixels_tab[0][9]_ssse3: 228.0 ( 7.87x)
put_no_rnd_qpel_pixels_tab[0][10]_c: 1724.7 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][10]_sse2: 353.8 ( 4.88x)
put_no_rnd_qpel_pixels_tab[0][10]_ssse3: 206.5 ( 8.35x)
put_no_rnd_qpel_pixels_tab[0][11]_c: 1796.3 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][11]_sse2: 378.1 ( 4.75x)
put_no_rnd_qpel_pixels_tab[0][11]_ssse3: 227.1 ( 7.91x)
put_no_rnd_qpel_pixels_tab[0][13]_c: 1834.4 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][13]_sse2: 400.7 ( 4.58x)
put_no_rnd_qpel_pixels_tab[0][13]_ssse3: 244.2 ( 7.51x)
put_no_rnd_qpel_pixels_tab[0][14]_c: 1755.7 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][14]_sse2: 387.2 ( 4.53x)
put_no_rnd_qpel_pixels_tab[0][14]_ssse3: 226.8 ( 7.74x)
put_no_rnd_qpel_pixels_tab[0][15]_c: 1847.3 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][15]_sse2: 400.6 ( 4.61x)
put_no_rnd_qpel_pixels_tab[0][15]_ssse3: 246.1 ( 7.51x)
put_qpel_pixels_tab[0][1]_c: 919.6 ( 1.00x)
put_qpel_pixels_tab[0][1]_mmxext: 255.5 ( 3.60x)
put_qpel_pixels_tab[0][1]_ssse3: 108.3 ( 8.49x)
put_qpel_pixels_tab[0][2]_c: 883.9 ( 1.00x)
put_qpel_pixels_tab[0][2]_mmxext: 238.1 ( 3.71x)
put_qpel_pixels_tab[0][2]_ssse3: 86.7 (10.19x)
put_qpel_pixels_tab[0][3]_c: 921.9 ( 1.00x)
put_qpel_pixels_tab[0][3]_mmxext: 258.9 ( 3.56x)
put_qpel_pixels_tab[0][3]_ssse3: 108.1 ( 8.53x)
put_qpel_pixels_tab[0][5]_c: 1907.5 ( 1.00x)
put_qpel_pixels_tab[0][5]_sse2: 384.2 ( 4.96x)
put_qpel_pixels_tab[0][5]_ssse3: 234.8 ( 8.13x)
put_qpel_pixels_tab[0][6]_c: 1757.4 ( 1.00x)
put_qpel_pixels_tab[0][6]_sse2: 382.8 ( 4.59x)
put_qpel_pixels_tab[0][6]_ssse3: 217.6 ( 8.08x)
put_qpel_pixels_tab[0][7]_c: 1927.5 ( 1.00x)
put_qpel_pixels_tab[0][7]_sse2: 384.6 ( 5.01x)
put_qpel_pixels_tab[0][7]_ssse3: 231.2 ( 8.34x)
put_qpel_pixels_tab[0][9]_c: 1832.1 ( 1.00x)
put_qpel_pixels_tab[0][9]_sse2: 374.8 ( 4.89x)
put_qpel_pixels_tab[0][9]_ssse3: 219.4 ( 8.35x)
put_qpel_pixels_tab[0][10]_c: 1710.3 ( 1.00x)
put_qpel_pixels_tab[0][10]_sse2: 384.5 ( 4.45x)
put_qpel_pixels_tab[0][10]_ssse3: 202.9 ( 8.43x)
put_qpel_pixels_tab[0][11]_c: 1825.0 ( 1.00x)
put_qpel_pixels_tab[0][11]_sse2: 369.6 ( 4.94x)
put_qpel_pixels_tab[0][11]_ssse3: 216.8 ( 8.42x)
put_qpel_pixels_tab[0][13]_c: 1898.4 ( 1.00x)
put_qpel_pixels_tab[0][13]_sse2: 384.9 ( 4.93x)
put_qpel_pixels_tab[0][13]_ssse3: 238.6 ( 7.96x)
put_qpel_pixels_tab[0][14]_c: 1779.1 ( 1.00x)
put_qpel_pixels_tab[0][14]_sse2: 373.3 ( 4.77x)
put_qpel_pixels_tab[0][14]_ssse3: 218.1 ( 8.16x)
put_qpel_pixels_tab[0][15]_c: 1918.2 ( 1.00x)
put_qpel_pixels_tab[0][15]_sse2: 385.3 ( 4.98x)
put_qpel_pixels_tab[0][15]_ssse3: 236.8 ( 8.10x)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Changed files
- libavcodec/x86/qpeldsp.asm
- libavcodec/x86/qpeldsp_init.c
Change #265972
Category ffmpeg Changed by Andreas Rheinhardt <andreas.rheinhardt@outlook.com> Changed at Thu 30 Apr 2026 10:39:33 Repository https://git.ffmpeg.org/ffmpeg.git Project ffmpeg Branch master Revision 1d040c527d5877ede8e54e0e4bb3f737b74f1f21 Comments
avcodec/x86/qpeldsp: Add SSSE3 size 8 horizontal filter
Beats the mmxext version by a lot (in the following, [1][1-3] refers to horizontal-only size 8 mc; the _sse2 comparators for the other cases use mmxext horizontal mc coupled with vertical SSE2 mc):
avg_qpel_pixels_tab[1][1]_c: 223.9 ( 1.00x)
avg_qpel_pixels_tab[1][1]_mmxext: 66.2 ( 3.38x)
avg_qpel_pixels_tab[1][1]_ssse3: 36.8 ( 6.08x)
avg_qpel_pixels_tab[1][2]_c: 251.0 ( 1.00x)
avg_qpel_pixels_tab[1][2]_mmxext: 58.5 ( 4.29x)
avg_qpel_pixels_tab[1][2]_ssse3: 25.5 ( 9.84x)
avg_qpel_pixels_tab[1][3]_c: 226.9 ( 1.00x)
avg_qpel_pixels_tab[1][3]_mmxext: 66.3 ( 3.42x)
avg_qpel_pixels_tab[1][3]_ssse3: 35.8 ( 6.34x)
avg_qpel_pixels_tab[1][5]_c: 473.9 ( 1.00x)
avg_qpel_pixels_tab[1][5]_sse2: 110.7 ( 4.28x)
avg_qpel_pixels_tab[1][5]_ssse3: 76.0 ( 6.24x)
avg_qpel_pixels_tab[1][6]_c: 440.9 ( 1.00x)
avg_qpel_pixels_tab[1][6]_sse2: 102.1 ( 4.32x)
avg_qpel_pixels_tab[1][6]_ssse3: 67.1 ( 6.58x)
avg_qpel_pixels_tab[1][7]_c: 473.8 ( 1.00x)
avg_qpel_pixels_tab[1][7]_sse2: 108.0 ( 4.39x)
avg_qpel_pixels_tab[1][7]_ssse3: 74.6 ( 6.35x)
avg_qpel_pixels_tab[1][9]_c: 492.9 ( 1.00x)
avg_qpel_pixels_tab[1][9]_sse2: 102.1 ( 4.83x)
avg_qpel_pixels_tab[1][9]_ssse3: 67.1 ( 7.35x)
avg_qpel_pixels_tab[1][10]_c: 465.6 ( 1.00x)
avg_qpel_pixels_tab[1][10]_sse2: 94.9 ( 4.91x)
avg_qpel_pixels_tab[1][10]_ssse3: 57.5 ( 8.10x)
avg_qpel_pixels_tab[1][11]_c: 492.8 ( 1.00x)
avg_qpel_pixels_tab[1][11]_sse2: 102.4 ( 4.81x)
avg_qpel_pixels_tab[1][11]_ssse3: 68.7 ( 7.17x)
avg_qpel_pixels_tab[1][13]_c: 476.6 ( 1.00x)
avg_qpel_pixels_tab[1][13]_sse2: 108.6 ( 4.39x)
avg_qpel_pixels_tab[1][13]_ssse3: 74.7 ( 6.38x)
avg_qpel_pixels_tab[1][14]_c: 434.9 ( 1.00x)
avg_qpel_pixels_tab[1][14]_sse2: 102.2 ( 4.25x)
avg_qpel_pixels_tab[1][14]_ssse3: 66.6 ( 6.53x)
avg_qpel_pixels_tab[1][15]_c: 474.1 ( 1.00x)
avg_qpel_pixels_tab[1][15]_sse2: 107.9 ( 4.39x)
avg_qpel_pixels_tab[1][15]_ssse3: 74.3 ( 6.38x)
put_no_rnd_qpel_pixels_tab[1][1]_c: 222.1 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][1]_mmxext: 66.0 ( 3.37x)
put_no_rnd_qpel_pixels_tab[1][1]_ssse3: 35.2 ( 6.31x)
put_no_rnd_qpel_pixels_tab[1][2]_c: 212.2 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][2]_mmxext: 56.8 ( 3.74x)
put_no_rnd_qpel_pixels_tab[1][2]_ssse3: 25.0 ( 8.48x)
put_no_rnd_qpel_pixels_tab[1][3]_c: 224.5 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][3]_mmxext: 65.8 ( 3.41x)
put_no_rnd_qpel_pixels_tab[1][3]_ssse3: 35.8 ( 6.26x)
put_no_rnd_qpel_pixels_tab[1][5]_c: 460.1 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][5]_sse2: 114.6 ( 4.01x)
put_no_rnd_qpel_pixels_tab[1][5]_ssse3: 83.1 ( 5.53x)
put_no_rnd_qpel_pixels_tab[1][6]_c: 438.6 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][6]_sse2: 104.2 ( 4.21x)
put_no_rnd_qpel_pixels_tab[1][6]_ssse3: 67.5 ( 6.50x)
put_no_rnd_qpel_pixels_tab[1][7]_c: 458.0 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][7]_sse2: 113.8 ( 4.02x)
put_no_rnd_qpel_pixels_tab[1][7]_ssse3: 79.9 ( 5.73x)
put_no_rnd_qpel_pixels_tab[1][9]_c: 439.0 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][9]_sse2: 103.7 ( 4.23x)
put_no_rnd_qpel_pixels_tab[1][9]_ssse3: 68.9 ( 6.37x)
put_no_rnd_qpel_pixels_tab[1][10]_c: 427.0 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][10]_sse2: 93.2 ( 4.58x)
put_no_rnd_qpel_pixels_tab[1][10]_ssse3: 57.9 ( 7.37x)
put_no_rnd_qpel_pixels_tab[1][11]_c: 439.9 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][11]_sse2: 104.0 ( 4.23x)
put_no_rnd_qpel_pixels_tab[1][11]_ssse3: 69.2 ( 6.36x)
put_no_rnd_qpel_pixels_tab[1][13]_c: 459.3 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][13]_sse2: 113.2 ( 4.06x)
put_no_rnd_qpel_pixels_tab[1][13]_ssse3: 83.8 ( 5.48x)
put_no_rnd_qpel_pixels_tab[1][14]_c: 439.5 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][14]_sse2: 103.3 ( 4.25x)
put_no_rnd_qpel_pixels_tab[1][14]_ssse3: 67.9 ( 6.47x)
put_no_rnd_qpel_pixels_tab[1][15]_c: 453.6 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][15]_sse2: 113.7 ( 3.99x)
put_no_rnd_qpel_pixels_tab[1][15]_ssse3: 80.0 ( 5.67x)
put_qpel_pixels_tab[1][1]_c: 229.0 ( 1.00x)
put_qpel_pixels_tab[1][1]_mmxext: 65.5 ( 3.50x)
put_qpel_pixels_tab[1][1]_ssse3: 33.8 ( 6.77x)
put_qpel_pixels_tab[1][2]_c: 212.5 ( 1.00x)
put_qpel_pixels_tab[1][2]_mmxext: 56.6 ( 3.75x)
put_qpel_pixels_tab[1][2]_ssse3: 23.4 ( 9.08x)
put_qpel_pixels_tab[1][3]_c: 227.5 ( 1.00x)
put_qpel_pixels_tab[1][3]_mmxext: 64.4 ( 3.53x)
put_qpel_pixels_tab[1][3]_ssse3: 33.5 ( 6.79x)
put_qpel_pixels_tab[1][5]_c: 466.5 ( 1.00x)
put_qpel_pixels_tab[1][5]_sse2: 106.8 ( 4.37x)
put_qpel_pixels_tab[1][5]_ssse3: 71.8 ( 6.50x)
put_qpel_pixels_tab[1][6]_c: 438.7 ( 1.00x)
put_qpel_pixels_tab[1][6]_sse2: 102.0 ( 4.30x)
put_qpel_pixels_tab[1][6]_ssse3: 65.3 ( 6.72x)
put_qpel_pixels_tab[1][7]_c: 466.0 ( 1.00x)
put_qpel_pixels_tab[1][7]_sse2: 106.3 ( 4.38x)
put_qpel_pixels_tab[1][7]_ssse3: 70.9 ( 6.57x)
put_qpel_pixels_tab[1][9]_c: 456.0 ( 1.00x)
put_qpel_pixels_tab[1][9]_sse2: 100.1 ( 4.55x)
put_qpel_pixels_tab[1][9]_ssse3: 64.0 ( 7.13x)
put_qpel_pixels_tab[1][10]_c: 425.1 ( 1.00x)
put_qpel_pixels_tab[1][10]_sse2: 92.6 ( 4.59x)
put_qpel_pixels_tab[1][10]_ssse3: 55.1 ( 7.71x)
put_qpel_pixels_tab[1][11]_c: 452.7 ( 1.00x)
put_qpel_pixels_tab[1][11]_sse2: 99.6 ( 4.55x)
put_qpel_pixels_tab[1][11]_ssse3: 63.8 ( 7.09x)
put_qpel_pixels_tab[1][13]_c: 471.2 ( 1.00x)
put_qpel_pixels_tab[1][13]_sse2: 106.4 ( 4.43x)
put_qpel_pixels_tab[1][13]_ssse3: 71.4 ( 6.60x)
put_qpel_pixels_tab[1][14]_c: 439.7 ( 1.00x)
put_qpel_pixels_tab[1][14]_sse2: 101.8 ( 4.32x)
put_qpel_pixels_tab[1][14]_ssse3: 64.8 ( 6.79x)
put_qpel_pixels_tab[1][15]_c: 467.8 ( 1.00x)
put_qpel_pixels_tab[1][15]_sse2: 106.1 ( 4.41x)
put_qpel_pixels_tab[1][15]_ssse3: 72.6 ( 6.44x)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Changed files
- libavcodec/x86/qpeldsp.asm
- libavcodec/x86/qpeldsp_init.c
Change #265973
Category ffmpeg Changed by Andreas Rheinhardt <andreas.rheinhardt@outlook.com> Changed at Thu 30 Apr 2026 10:39:33 Repository https://git.ffmpeg.org/ffmpeg.git Project ffmpeg Branch master Revision f946cac2d9fd8a816a72fe4fad81587b14af53fc Comments
avcodec/x86/qpeldsp: Remove horizontal mmxext mc functions

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Changed files
- libavcodec/x86/qpel.asm
- libavcodec/x86/qpel.h
- libavcodec/x86/qpeldsp.asm
- libavcodec/x86/qpeldsp_init.c
Change #265974
Category ffmpeg Changed by Andreas Rheinhardt <andreas.rheinhardt@outlook.com> Changed at Thu 30 Apr 2026 10:39:33 Repository https://git.ffmpeg.org/ffmpeg.git Project ffmpeg Branch master Revision 23d3116af93027866689e52e50533cd3121679ab Comments
avcodec/x86/qpeldsp: Add combination of h_lowpass + l2

If the subpel part of the horizontal component of the motion vector is 1/4 or 3/4, the MPEG-4 qpel motion compensation first computes the mc for the corresponding motion vector with 1/2 horizontal subpel part and then averages this with the left (for 1/4) or the right (for 3/4) source pixel. These two stages are currently performed in two different functions, involving a stack buffer as intermediate. This means that horizontal prediction for every function with a 1/4 or 3/4 horizontal subpel mv is more expensive code-size wise (and also performance-wise) as it involves two calls.

Given that the horizontal lowpass functions are not that long, adding combinations of h_lowpass+l2 actually reduces binary size: An increase of 1136B in the asm files is more than offset by size reductions in the wrappers: 1968B here when not using stack protection, 2256B when using stack protection.

Of course it also improves performance.

Old benchmarks:
avg_qpel_pixels_tab[0][1]_ssse3: 106.9 ( 8.69x)
avg_qpel_pixels_tab[0][3]_ssse3: 105.5 ( 8.84x)
avg_qpel_pixels_tab[0][5]_ssse3: 226.9 ( 8.57x)
avg_qpel_pixels_tab[0][7]_ssse3: 231.1 ( 8.38x)
avg_qpel_pixels_tab[0][9]_ssse3: 217.8 ( 9.04x)
avg_qpel_pixels_tab[0][11]_ssse3: 214.9 ( 9.32x)
avg_qpel_pixels_tab[0][13]_ssse3: 227.1 ( 8.48x)
avg_qpel_pixels_tab[0][15]_ssse3: 236.1 ( 8.02x)

New benchmarks:
avg_qpel_pixels_tab[0][1]_ssse3: 96.7 ( 9.65x)
avg_qpel_pixels_tab[0][3]_ssse3: 96.6 ( 9.73x)
avg_qpel_pixels_tab[0][5]_ssse3: 225.8 ( 8.61x)
avg_qpel_pixels_tab[0][7]_ssse3: 228.4 ( 8.51x)
avg_qpel_pixels_tab[0][9]_ssse3: 217.1 ( 9.05x)
avg_qpel_pixels_tab[0][11]_ssse3: 217.8 ( 9.32x)
avg_qpel_pixels_tab[0][13]_ssse3: 227.2 ( 8.54x)
avg_qpel_pixels_tab[0][15]_ssse3: 220.5 ( 8.72x)

Note: The l2 functions are also used for vertical lowpass functions, yet given that they are much bigger, duplicating them would lead to massive code size increase.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Changed files
- libavcodec/x86/qpel.asm
- libavcodec/x86/qpeldsp.asm
- libavcodec/x86/qpeldsp_init.c
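The two-stage scheme described in change #265974 (half-pel lowpass into a temporary buffer, then averaging with the left or right source pixel) and its fused counterpart can be sketched in plain C. This is only an illustrative sketch, not the code from the commit: the function names, the fixed width `W`, and the padded row are assumptions, and the block-edge handling of the real codec is left out. It uses the MPEG-4 quarter-pel taps (-1, 3, -6, 20, 20, -6, 3, -1)/32 with rounding.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define W   8 /* block width used in this sketch (assumption) */
#define PAD 4 /* border so the 8-tap filter never reads outside the row */

static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Half-pel sample between src[x] and src[x + 1]; the caller must make
 * src[x - 3] .. src[x + 4] readable. */
static uint8_t half_pel(const uint8_t *src, int x)
{
    int sum = 20 * (src[x]     + src[x + 1])
            -  6 * (src[x - 1] + src[x + 2])
            +  3 * (src[x - 2] + src[x + 3])
            -      (src[x - 3] + src[x + 4]);
    int v = sum + 16; /* rounding term before the division by 32 */
    return v < 0 ? 0 : clip_u8(v >> 5);
}

/* Two stages: lowpass into a stack buffer, then average ("l2") with the
 * left (right == 0, 1/4 pel) or right (right == 1, 3/4 pel) source pixel. */
static void qpel_h_two_pass(uint8_t *dst, const uint8_t *src, int right)
{
    uint8_t tmp[W];
    for (int x = 0; x < W; x++)
        tmp[x] = half_pel(src, x);
    for (int x = 0; x < W; x++)
        dst[x] = (uint8_t)((tmp[x] + src[x + right] + 1) >> 1);
}

/* Fused variant: same arithmetic, no intermediate buffer, one pass. */
static void qpel_h_fused(uint8_t *dst, const uint8_t *src, int right)
{
    for (int x = 0; x < W; x++)
        dst[x] = (uint8_t)((half_pel(src, x) + src[x + right] + 1) >> 1);
}

/* Checks that both variants agree on an arbitrary test row. */
static void self_check(void)
{
    uint8_t row[W + 2 * PAD], a[W], b[W];
    for (int i = 0; i < W + 2 * PAD; i++)
        row[i] = (uint8_t)(i * 37 + 11);
    for (int right = 0; right <= 1; right++) {
        qpel_h_two_pass(a, row + PAD, right);
        qpel_h_fused(b, row + PAD, right);
        assert(memcmp(a, b, W) == 0);
    }
}
```

Both variants compute identical results; the fused form simply avoids the temporary buffer and the second call, which is the effect the commit message attributes to the combined h_lowpass+l2 functions.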
Change #265975
Category ffmpeg Changed by Andreas Rheinhardt <andreas.rheinhardt@outlook.com> Changed at Thu 30 Apr 2026 10:39:33 Repository https://git.ffmpeg.org/ffmpeg.git Project ffmpeg Branch master Revision ca43bc6202382518137b938f681c57614087f5c8 Comments
avcodec/x86/qpeldsp_init: Mark functions as hidden

It allows pic 32bit code to call the underlying assembly functions directly, without loading the GOT first; this saves 1245B of .text here (for 32bit pic code).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Changed files
- libavcodec/x86/qpel.h
- libavcodec/x86/qpeldsp_init.c
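As a rough illustration of what "marking functions as hidden" in change #265975 means, the sketch below declares a function with the GCC/Clang `visibility("hidden")` attribute. The macro name `EXAMPLE_HIDDEN` and both functions are hypothetical and stand in for the real declarations, which are not shown on this page. With hidden visibility the symbol cannot be preempted by another shared object, so on 32-bit x86 a PIC caller can typically use a direct PC-relative call instead of going through the PLT, which would require the GOT pointer to be loaded first.

```c
#include <stdint.h>

/* Hypothetical helper macro for this sketch; compilers without the
 * attribute just get a normal declaration. */
#if defined(__GNUC__) || defined(__clang__)
#define EXAMPLE_HIDDEN __attribute__((visibility("hidden")))
#else
#define EXAMPLE_HIDDEN
#endif

/* Declaration as it might appear in an internal header. The symbol is
 * usable across object files of the same library but is not exported. */
EXAMPLE_HIDDEN int example_avg_u8(int a, int b);

/* Stand-in for an assembly routine: rounded average of two samples. */
int example_avg_u8(int a, int b)
{
    return (a + b + 1) >> 1;
}

/* Wrapper that calls the hidden function. In PIC builds the compiler
 * knows the callee binds locally and can emit a direct call. */
int example_caller(int a, int b)
{
    return example_avg_u8(a, b);
}
```

The behaviour of the program is unchanged by the attribute; only symbol export and the generated call sequence differ.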
Change #265976
Category ffmpeg Changed by Andreas Rheinhardt <andreas.rheinhardt@outlook.com> Changed at Thu 30 Apr 2026 10:39:33 Repository https://git.ffmpeg.org/ffmpeg.git Project ffmpeg Branch master Revision cc3ca1712760ae53957a3a5987cb7c61c290a451 Comments
avcodec/x86/qpeldsp{,_init}: Use proper prefix

E.g. rename ff_put_mpeg4_qpel8_h_lowpass_ssse3 to ff_mpeg4_put_qpel8_h_lowpass_ssse3.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Changed files
- libavcodec/x86/qpeldsp.asm
- libavcodec/x86/qpeldsp_init.c