List: llvm-bugs
Subject: [llvm-bugs] [Bug 43508] New: Coroutine symmetric transfer tail call optimization not working on AArch64
From: via llvm-bugs <llvm-bugs () lists ! llvm ! org>
Date: 2019-09-30 16:17:22
Message-ID: bug-43508-206 () http ! bugs ! llvm ! org/
https://bugs.llvm.org/show_bug.cgi?id=43508
Bug ID: 43508
Summary: Coroutine symmetric transfer tail call optimization not working on AArch64
Product: clang
Version: 9.0
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P
Component: C++2a
Assignee: unassignedclangbugs@nondot.org
Reporter: bartde@microsoft.com
CC: blitzrakete@gmail.com, erik.pilkington@gmail.com, llvm-bugs@lists.llvm.org, richard-llvm@metafoo.co.uk
The following code:
task<void> sync_async() { co_return; }
task<void> do_async()
{
    for (int i = 0; i < 1024 * 1024; i++)
    {
        co_await sync_async();
    }
}
causes a stack overflow when built with clang 9 on AArch64 without optimization (-O0). The call stack alternates between the two coroutines:
    sync_async
    do_async
    sync_async
    do_async
    …
The task<T> implementation uses symmetric transfer for the final awaiter:
template <typename Promise>
coroutine_handle_t await_suspend(std::experimental::coroutine_handle<Promise> h) const noexcept
{
    return h.promise().m_waiter;
}
and for its operator co_await implementation:
coroutine_handle_t await_suspend(coroutine_handle_t h) const
{
    m_coro.promise().m_waiter = h;
    return m_coro;
}
I think the task<T> provided by cppcoro should behave the same way, so it can be used for the repro.
The stack overflow doesn't repro on x86/x64, or at higher optimization levels on AArch64. Comparing -O0 builds, the x86 version emits a tail call by means of a jmp:
b68d2: e8 49 bc f7 ff          callq  32520 <_ZNKSt12experimental16coroutine_handleIvE7addressEv>
b68d7: 48 89 c1                mov    %rax,%rcx
b68da: 48 8b 00                mov    (%rax),%rax
b68dd: 48 89 cf                mov    %rcx,%rdi
b68e0: 48 81 c4 a0 00 00 00    add    $0xa0,%rsp
b68e7: 5d                      pop    %rbp
b68e8: ff e0                   jmpq   *%rax
while the AArch64 version emits:
a8be0: 97fe1fbf    bl   30adc <_ZNKSt12experimental16coroutine_handleIvE7addressEv>
a8be4: f9400008    ldr  x8, [x0]
a8be8: d63f0100    blr  x8
a8bec: a9497bfd    ldp  x29, x30, [sp, #144]
a8bf0: 910283ff    add  sp, sp, #0xa0
a8bf4: d65f03c0    ret
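which seems to perform a regular call using blr (followed by the epilogue and ret) instead of a tail call using br, so each transfer leaves a frame on the stack.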