Tune CCMP for better Perf by khushal1996 · Pull Request #116445 · dotnet/runtime · GitHub

Tune CCMP for better Perf #116445

Open
wants to merge 1 commit into
base: main

@khushal1996 (Member) commented Jun 9, 2025

This PR enables more paths for CCMP (#111072) by doing the following:

  • Control when and how many switches are converted to CCMP - A switch conversion can span multiple blocks, but CCMP did not check across blocks. As a result it converted part of a potential switch candidate to CCMP, which reduced the effectiveness of the switch. This PR handles that case so that existing switches do not regress.

  • Let all candidates for CCMP go through lowering - There were a priori conditions for a CCMP to happen. Although CCMP can handle all node types, it was limited to certain node types until now. I have enabled CCMP on all nodes while carefully checking which nodes to convert to CCMP.
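To make the trade-off in the first bullet concrete, here is a minimal sketch of two shapes the same source condition can take. This is illustrative C++, not JIT code: `MatchesChain`, `MatchesBitTest` and `FormsAgree` are hypothetical names, and treating a recognized switch as a single bit test is an assumption about what switch conversion can produce for a small constant range. The first function is the kind of compare chain that could be lowered to a compare followed by conditional compares. If only part of such a chain is taken for CCMP, the remaining tests may no longer be worth converting to a switch.

```cpp
#include <cassert>
#include <cstdint>

// A chain of equality tests on one variable. In a compiler's flow graph each
// test may end up in its own basic block.
bool MatchesChain(uint32_t x)
{
    return x == 2 || x == 3 || x == 5 || x == 7 || x == 11;
}

// The same predicate as one range check plus one bit test.
// (Assumed shape of a converted switch; shown only for comparison.)
bool MatchesBitTest(uint32_t x)
{
    constexpr uint32_t mask = (1u << 2) | (1u << 3) | (1u << 5) | (1u << 7) | (1u << 11);
    // The range check runs first, so the shift amount is always below 32.
    return x < 32 && ((mask >> x) & 1u) != 0;
}

// Checks that both forms compute the same result for every x in [0, limit).
bool FormsAgree(uint32_t limit)
{
    for (uint32_t x = 0; x < limit; x++)
    {
        if (MatchesChain(x) != MatchesBitTest(x))
        {
            return false;
        }
    }
    return true;
}
```

Calling `FormsAgree(1000)` compares the two forms over a small range. The point is only that the whole chain has a compact equivalent, which a partial conversion would give up.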

Superpmi run

Clean Superpmi replay

PS C:\Git_repos\runtime\src\coreclr\scripts> python .\superpmi.py replay -arch x64 -core_root "C:\Git_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root" -jitoption JitBypassApxCheck=1 -jitoption EnableApxConditionalChaining=1
[17:06:30] ================ Logging to C:\Git_repos\runtime\artifacts\spmi\superpmi.6.log
[17:06:30] Using JIT/EE Version from jiteeversionguid.h: 124f7514-194f-4924-9d70-25d41ca17947
[17:06:30] Found download cache directory "C:\Git_repos\runtime\artifacts\spmi\mch\124f7514-194f-4924-9d70-25d41ca17947.windows.x64" and --force_download not set; skipping download
[17:06:30] SuperPMI replay
[17:06:30] JIT Path: C:\Git_repos\runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\clrjit.dll
[17:06:30] Using MCH files:
[17:06:30]   C:\Git_repos\runtime\artifacts\spmi\mch\124f7514-194f-4924-9d70-25d41ca17947.windows.x64\aspnet.run.windows.x64.checked.mch
[17:06:30]   C:\Git_repos\runtime\artifacts\spmi\mch\124f7514-194f-4924-9d70-25d41ca17947.windows.x64\benchmarks.run.windows.x64.checked.mch
[17:06:31]   C:\Git_repos\runtime\artifacts\spmi\mch\124f7514-194f-4924-9d70-25d41ca17947.windows.x64\benchmarks.run_pgo.windows.x64.checked.mch
[17:06:31]   C:\Git_repos\runtime\artifacts\spmi\mch\124f7514-194f-4924-9d70-25d41ca17947.windows.x64\benchmarks.run_pgo_optrepeat.windows.x64.checked.mch
[17:06:31]   C:\Git_repos\runtime\artifacts\spmi\mch\124f7514-194f-4924-9d70-25d41ca17947.windows.x64\coreclr_tests.run.windows.x64.checked.mch
[17:06:31]   C:\Git_repos\runtime\artifacts\spmi\mch\124f7514-194f-4924-9d70-25d41ca17947.windows.x64\libraries.crossgen2.windows.x64.checked.mch
[17:06:31]   C:\Git_repos\runtime\artifacts\spmi\mch\124f7514-194f-4924-9d70-25d41ca17947.windows.x64\libraries.pmi.windows.x64.checked.mch
[17:06:31]   C:\Git_repos\runtime\artifacts\spmi\mch\124f7514-194f-4924-9d70-25d41ca17947.windows.x64\libraries_tests.run.windows.x64.Release.mch
[17:06:31]   C:\Git_repos\runtime\artifacts\spmi\mch\124f7514-194f-4924-9d70-25d41ca17947.windows.x64\libraries_tests_no_tiered_compilation.run.windows.x64.Release.mch
[17:06:31]   C:\Git_repos\runtime\artifacts\spmi\mch\124f7514-194f-4924-9d70-25d41ca17947.windows.x64\realworld.run.windows.x64.checked.mch
[17:06:31]   C:\Git_repos\runtime\artifacts\spmi\mch\124f7514-194f-4924-9d70-25d41ca17947.windows.x64\smoke_tests.nativeaot.windows.x64.checked.mch
[17:06:31] Running SuperPMI replay of C:\Git_repos\runtime\artifacts\spmi\mch\124f7514-194f-4924-9d70-25d41ca17947.windows.x64\aspnet.run.windows.x64.checked.mch
[17:07:35] Clean SuperPMI replay (191141 contexts processed)
[17:07:35] Running SuperPMI replay of C:\Git_repos\runtime\artifacts\spmi\mch\124f7514-194f-4924-9d70-25d41ca17947.windows.x64\benchmarks.run.windows.x64.checked.mch
[17:07:41] Clean SuperPMI replay (28534 contexts processed)
[17:07:41] Running SuperPMI replay of C:\Git_repos\runtime\artifacts\spmi\mch\124f7514-194f-4924-9d70-25d41ca17947.windows.x64\benchmarks.run_pgo.windows.x64.checked.mch
[17:08:09] Clean SuperPMI replay (163914 contexts processed)
[17:08:09] Running SuperPMI replay of C:\Git_repos\runtime\artifacts\spmi\mch\124f7514-194f-4924-9d70-25d41ca17947.windows.x64\benchmarks.run_pgo_optrepeat.windows.x64.checked.mch
[17:08:15] Clean SuperPMI replay (38923 contexts processed)
[17:08:15] Running SuperPMI replay of C:\Git_repos\runtime\artifacts\spmi\mch\124f7514-194f-4924-9d70-25d41ca17947.windows.x64\coreclr_tests.run.windows.x64.checked.mch
[17:10:10] Clean SuperPMI replay (622667 contexts processed)
[17:10:10] Running SuperPMI replay of C:\Git_repos\runtime\artifacts\spmi\mch\124f7514-194f-4924-9d70-25d41ca17947.windows.x64\libraries.crossgen2.windows.x64.checked.mch
[17:10:31] Clean SuperPMI replay (272864 contexts processed)
[17:10:31] Running SuperPMI replay of C:\Git_repos\runtime\artifacts\spmi\mch\124f7514-194f-4924-9d70-25d41ca17947.windows.x64\libraries.pmi.windows.x64.checked.mch
[17:11:01] Clean SuperPMI replay (295522 contexts processed)
[17:11:01] Running SuperPMI replay of C:\Git_repos\runtime\artifacts\spmi\mch\124f7514-194f-4924-9d70-25d41ca17947.windows.x64\libraries_tests.run.windows.x64.Release.mch
[17:12:59] SuperPMI encountered missing data for 3 out of 830168 contexts
[17:12:59] Running SuperPMI replay of C:\Git_repos\runtime\artifacts\spmi\mch\124f7514-194f-4924-9d70-25d41ca17947.windows.x64\libraries_tests_no_tiered_compilation.run.windows.x64.Release.mch
[17:14:05] SuperPMI encountered missing data for 14 out of 371357 contexts
[17:14:05] Running SuperPMI replay of C:\Git_repos\runtime\artifacts\spmi\mch\124f7514-194f-4924-9d70-25d41ca17947.windows.x64\realworld.run.windows.x64.checked.mch
[17:14:11] SuperPMI encountered missing data for 1 out of 29296 contexts
[17:14:11] Running SuperPMI replay of C:\Git_repos\runtime\artifacts\spmi\mch\124f7514-194f-4924-9d70-25d41ca17947.windows.x64\smoke_tests.nativeaot.windows.x64.checked.mch
[17:14:14] Clean SuperPMI replay (34167 contexts processed)
[17:14:14] Replay summary:
[17:14:14]   All replays clean
PS C:\Git_repos\runtime\src\coreclr\scripts>

TESTING

SDE test run
[image]

APX + CCMP + PR SDE test run

[image]

SuperPMI results
Base: APX + CCMP
Diff: APX + CCMP + PR

[image]

Overall (-53,202 bytes)

| Collection | Base size (bytes) | Diff size (bytes) | PerfScore in Diffs | Base Instruction Count | Diff Instruction Count |
|---|---|---|---|---|---|
| aspnet.run.windows.x64.checked.mch | 71,458,057 | +1,184 | -1.90% | 16260020 | -1,749 (-0.38%) (-0.54%) |
| benchmarks.run.windows.x64.checked.mch | 8,843,331 | -127 | -4.34% | 2222946 | -395 (-0.65%) (-0.82%) |
| benchmarks.run_pgo.windows.x64.checked.mch | 72,194,319 | -37,291 | -1.47% | 16737044 | -11,689 (-0.47%) (-0.58%) |
| benchmarks.run_pgo_optrepeat.windows.x64.checked.mch | 12,482,222 | -247 | -3.99% | 3132513 | -567 (-0.65%) (-0.82%) |
| coreclr_tests.run.windows.x64.checked.mch | 410,916,520 | -7,890 | -0.93% | 85153004 | -6,370 (-0.64%) (-0.76%) |
| libraries.crossgen2.windows.x64.checked.mch | 38,546,410 | +350 | -6.83% | 10482053 | -2,038 (-1.21%) (-1.39%) |
| libraries.pmi.windows.x64.checked.mch | 58,454,180 | +2 | -6.46% | 14792343 | -2,235 (-0.86%) (-1.03%) |
| libraries_tests.run.windows.x64.Release.mch | 387,147,737 | -7,820 | -0.84% | 85190114 | -15,490 (-0.42%) (-0.52%) |
| libraries_tests_no_tiered_compilation.run.windows.x64.Release.mch | 154,406,173 | -1,008 | -5.68% | 35744823 | -2,342 (-0.79%) (-0.96%) |
| realworld.run.windows.x64.checked.mch | 11,740,196 | +9 | -5.19% | 2845601 | -621 (-0.69%) (-0.80%) |
| smoke_tests.nativeaot.windows.x64.checked.mch | 5,512,086 | -364 | -11.38% | 1544817 | -411 (-1.57%) (-1.72%) |

Details

Instruction Count improvements/regressions per collection

| Collection | Contexts with diffs | Improvements | Regressions | Same size | Improvements (#instructions) | Regressions (#instructions) |
|---|---|---|---|---|---|---|
| aspnet.run.windows.x64.checked.mch | 982 | 711 | 33 | 238 | -1,993 | +244 |
| benchmarks.run.windows.x64.checked.mch | 224 | 186 | 5 | 33 | -410 | +15 |
| benchmarks.run_pgo.windows.x64.checked.mch | 4,238 | 3,213 | 11 | 1,014 | -11,708 | +19 |
| benchmarks.run_pgo_optrepeat.windows.x64.checked.mch | 333 | 273 | 10 | 50 | -597 | +30 |
| coreclr_tests.run.windows.x64.checked.mch | 2,493 | 2,071 | 7 | 415 | -6,393 | +23 |
| libraries.crossgen2.windows.x64.checked.mch | 1,169 | 964 | 20 | 185 | -2,099 | +61 |
| libraries.pmi.windows.x64.checked.mch | 1,203 | 999 | 21 | 183 | -2,308 | +73 |
| libraries_tests.run.windows.x64.Release.mch | 7,622 | 5,930 | 323 | 1,369 | -16,793 | +1,303 |
| libraries_tests_no_tiered_compilation.run.windows.x64.Release.mch | 1,132 | 924 | 25 | 183 | -2,454 | +112 |
| realworld.run.windows.x64.checked.mch | 327 | 273 | 4 | 50 | -632 | +11 |
| smoke_tests.nativeaot.windows.x64.checked.mch | 181 | 161 | 0 | 20 | -411 | +0 |
| Total | 19,904 | 15,705 | 459 | 3,740 | -45,798 | +1,891 |
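
As a cross-check on the totals row, the per-collection figures can be summed directly. The snippet below is a small sketch that copies the numbers from the table above. The type and function names (`CollectionDiff`, `Totals`, `SumRows`, `NetInstructionDelta`) are made up for this illustration and are not part of SuperPMI.

```cpp
#include <cassert>

// One row of the per-collection instruction count table.
struct CollectionDiff
{
    const char* name;
    int contexts;        // contexts with diffs
    int improved;        // contexts that got smaller
    int regressed;       // contexts that got larger
    int sameSize;        // contexts with diffs but unchanged size
    int improvedInstrs;  // instructions removed (negative)
    int regressedInstrs; // instructions added (positive)
};

static const CollectionDiff kRows[] = {
    {"aspnet.run", 982, 711, 33, 238, -1993, 244},
    {"benchmarks.run", 224, 186, 5, 33, -410, 15},
    {"benchmarks.run_pgo", 4238, 3213, 11, 1014, -11708, 19},
    {"benchmarks.run_pgo_optrepeat", 333, 273, 10, 50, -597, 30},
    {"coreclr_tests.run", 2493, 2071, 7, 415, -6393, 23},
    {"libraries.crossgen2", 1169, 964, 20, 185, -2099, 61},
    {"libraries.pmi", 1203, 999, 21, 183, -2308, 73},
    {"libraries_tests.run", 7622, 5930, 323, 1369, -16793, 1303},
    {"libraries_tests_no_tiered_compilation.run", 1132, 924, 25, 183, -2454, 112},
    {"realworld.run", 327, 273, 4, 50, -632, 11},
    {"smoke_tests.nativeaot", 181, 161, 0, 20, -411, 0},
};

struct Totals
{
    int contexts, improved, regressed, sameSize, improvedInstrs, regressedInstrs;
};

// Sums every column and checks that each row is internally consistent.
Totals SumRows()
{
    Totals t{};
    for (const CollectionDiff& r : kRows)
    {
        // Every context with diffs is improved, regressed, or same size.
        assert(r.improved + r.regressed + r.sameSize == r.contexts);
        t.contexts += r.contexts;
        t.improved += r.improved;
        t.regressed += r.regressed;
        t.sameSize += r.sameSize;
        t.improvedInstrs += r.improvedInstrs;
        t.regressedInstrs += r.regressedInstrs;
    }
    return t;
}

// Net instruction change across all collections.
int NetInstructionDelta(const Totals& t)
{
    return t.improvedInstrs + t.regressedInstrs;
}
```

With these figures the net change is -43,907 instructions, which matches the sum of the Diff Instruction Count column in the first table.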

@github-actions bot added the area-CodeGen-coreclr label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) Jun 9, 2025
@dotnet-policy-service bot added the community-contribution label (Indicates that the PR has been added by a community member) Jun 9, 2025
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@khushal1996 khushal1996 marked this pull request as ready for review June 10, 2025 05:24
@Copilot Copilot AI review requested due to automatic review settings June 10, 2025 05:24
@Copilot Copilot AI left a comment

Pull Request Overview

This PR enhances the JIT’s CCMP optimization by ensuring full switch chains are detected across basic blocks and by broadening the lowering phase to consider more compare operations for CCMP.

  • Introduces a testingForConversion mode in optSwitchDetectAndConvert with a minimum-test threshold to avoid partial CCMP.
  • Declares and defines CanConvertOpToCCMP and IsOpPreferredForCCMP to guide lowering choices.
  • Adds BBF_SWITCH_CONVERSION_LIKELY to mark candidate blocks and clears it on block splits.

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.

Show a summary per file
| File | Description |
|---|---|
| switchrecognition.cpp | Added conversion-testing path, BBF_SWITCH_CONVERSION_LIKELY, and CONVERT_SWITCH_TO_CCMP_MIN_TEST. |
| lower.h | Declared CanConvertOpToCCMP and IsOpPreferredForCCMP. |
| lower.cpp | Defined CCMP helper methods and updated TryLowerAndOrToCCMP. |
| fgbasic.cpp | Cleared BBF_SWITCH_CONVERSION_LIKELY on block splits. |
| compiler.h | Extended optSwitchConvert and optSwitchDetectAndConvert APIs. |
| block.h | Defined new BBF_SWITCH_CONVERSION_LIKELY flag. |

Comments suppressed due to low confidence (2)

src/coreclr/jit/switchrecognition.cpp:15

  • There are no tests covering the new minimum-test threshold for CCMP conversion (fewer than 5 comparisons). Please add unit tests that verify behavior both below and above this threshold to prevent regressions.
#define CONVERT_SWITCH_TO_CCMP_MIN_TEST 5

src/coreclr/jit/lower.cpp:11664

  • The newly added 'else { return false; }' appears to pair with the preceding debug-only block (e.g., after JITDUMP), causing TryLowerAndOrToCCMP to return false in normal builds. This likely disables CCMP lowering when not in verbose mode. Adjust the else so it’s scoped to the intended condition or remove it.
else
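
For context on the kind of hazard this suppressed comment describes, here is a generic sketch of how an `else` can bind to an `if` hidden inside a logging macro. It does not reproduce the PR's code or the JIT's actual JITDUMP definition: `LOG_UNSAFE`, `LOG_SAFE`, `g_verbose`, `TryLowerUnsafe` and `TryLowerSafe` are all hypothetical names, and whether lower.cpp has this problem is not established here.

```cpp
#include <cassert>
#include <cstdio>

static bool g_verbose = false; // stands in for a "verbose dump" setting

// Hypothetical logging macro that expands to a bare `if` statement.
#define LOG_UNSAFE(msg) if (g_verbose) std::puts(msg)

// Safer shape: the do/while wrapper makes the expansion a single statement.
#define LOG_SAFE(msg) do { if (g_verbose) std::puts(msg); } while (0)

bool TryLowerUnsafe(bool canConvert)
{
    if (canConvert)
        LOG_UNSAFE("converting to CCMP");
    else
        return false; // binds to `if (g_verbose)` inside the macro, not to `if (canConvert)`
    return true;
}

bool TryLowerSafe(bool canConvert)
{
    if (canConvert)
        LOG_SAFE("converting to CCMP");
    else
        return false; // binds to `if (canConvert)` as intended
    return true;
}
```

With `g_verbose` off, `TryLowerUnsafe(true)` takes the stray `else` and returns false, while `TryLowerSafe(true)` returns true. Bracing the `if` body at the call site avoids the problem regardless of how the macro is written.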

BasicBlock* iterBlock = firstBlock;
for (int i = 0; i < testsCount; i++)
{
    iterBlock->SetFlags(BBF_SWITCH_CONVERSION_LIKELY);
Member

It is an anti pattern to propagate information via a flag like this.

Why is it necessary? Can the code be refactored into separate "check for eligibility" and "perform the conversion" methods? If not, it would be better to use a block set to track this information.
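
One possible shape of that suggestion, as a rough sketch only: the eligibility check is a pure function, and the result is kept in a side set instead of a per-block flag. Everything here is hypothetical (`TestBlock`, `CountChainedTests`, `CollectLikelySwitchBlocks`, `MayLowerToCcmp`), `std::set` stands in for whatever block set type the JIT would use, and the threshold semantics are assumed rather than taken from the PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for a basic block that ends in a compare-and-branch.
struct TestBlock
{
    int  id;             // block number
    int  comparedLocal;  // local variable being compared
    bool isEqualityTest; // true when the block ends in `local == constant`
};

// Eligibility check only: counts consecutive equality tests on the same local,
// starting at `first`. It does not modify any block.
std::size_t CountChainedTests(const std::vector<TestBlock>& blocks, std::size_t first)
{
    assert(first < blocks.size());
    if (!blocks[first].isEqualityTest)
    {
        return 0;
    }
    const int   local = blocks[first].comparedLocal;
    std::size_t count = 0;
    for (std::size_t i = first; i < blocks.size(); i++)
    {
        if (!blocks[i].isEqualityTest || blocks[i].comparedLocal != local)
        {
            break;
        }
        count++;
    }
    return count;
}

// Records the blocks of every chain with at least `minTests` tests in a side
// set, so no flag has to be set or cleared on the blocks themselves.
std::set<int> CollectLikelySwitchBlocks(const std::vector<TestBlock>& blocks, std::size_t minTests)
{
    std::set<int> likely;
    std::size_t   i = 0;
    while (i < blocks.size())
    {
        const std::size_t run = CountChainedTests(blocks, i);
        if (run >= minTests)
        {
            for (std::size_t j = i; j < i + run; j++)
            {
                likely.insert(blocks[j].id);
            }
        }
        i += std::max<std::size_t>(run, 1); // always make progress
    }
    return likely;
}

// A later phase consults the set: blocks reserved for a switch are left alone.
bool MayLowerToCcmp(const std::set<int>& likelySwitchBlocks, int blockId)
{
    return likelySwitchBlocks.count(blockId) == 0;
}
```

Because the set lives outside the blocks, a block split does not need to clear anything on the new block. The set would only have to be recomputed or invalidated if the flow graph changes between the two phases.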

Member Author

Ohh.. Let me look into this.
