[prev in list] [next in list] [prev in thread] [next in thread]
List: cfe-dev
Subject: Re: [cfe-dev] [PATCH] RE: missing return statement for non-void functions in C++
From: Richard Smith <richard () metafoo ! co ! uk>
Date: 2015-07-31 18:46:02
Message-ID: CAOfiQqmii4pRvzK=Xz=e6zCFK28Ffmv+K9WTSxb2sAWyFOCsNw () mail ! gmail ! com
[Download RAW message or body]
[Attachment #2 (multipart/alternative)]
On Fri, Jul 31, 2015 at 7:35 AM, Sjoerd Meijer <sjoerd.meijer@arm.com>
wrote:
> Hi, I am not sure if we came to a conclusion. Please find attached a
> patch. It simply removes the two lines that insert an unreachable statement
> (which cause removal of the return statement). Please note that at -O0 the
> trap instruction is still generated. Is this something we could live with?
>
I don't think this is an improvement:
This doesn't satisfy the folks who want an 'unreachable' for better code
size and optimization, and it doesn't satisfy the folks who want a
guaranteed trap for security, and it doesn't satisfy the folks who want
their broken code to limp along (because it'll still trap at -O0), and it
is at best a minor improvement for the folks who want missing returns to be
more easily debuggable (with -On, the code goes wrong in the caller, or
appears to work, rather than falling into an unrelated function, and
debugging this with -O0 was already easy).
I think there are three options that are defensible here:
1) The status quo: this is UB and we treat it as such and optimize on that
basis, but provide a trap as a convenience at -O0
2) The secure approach: this is UB but we always trap
3) Define the behavior to return 'undef' for C types: this allows
questionable C code that has UB in C++ to keep working when built with a
C++ compiler
Note that (3) can be combined with either (1) or (2). (2) is already
available via the 'return' sanitizer. So this really reduces to: in those
cases where C says it's OK so long as the caller doesn't look at the
returned value (and where the return type doesn't have a non-trivial copy
constructor or destructor, isn't a reference, and so on), should we attempt
to preserve the C behaviour? I would be OK with putting that behind a `-f`
flag (perhaps `-fstrict-return` or similar) to support those folks who want
to build C code in C++, but I would suggest having that flag be off by
default, since that is not the usual use case for a C++ compiler.
Cheers,
>
> Sjoerd.
>
>
>
> *From:* cfe-dev-bounces@cs.uiuc.edu [mailto:cfe-dev-bounces@cs.uiuc.edu] *On
> Behalf Of *Richard Smith
> *Sent:* 29 July 2015 18:07
> *To:* Hal Finkel
> *Cc:* Marshall Clow; cfe-dev@cs.uiuc.edu Developers
>
> *Subject:* Re: [cfe-dev] missing return statement for non-void functions
> in C++
>
>
>
> On Jul 29, 2015 7:43 AM, "Hal Finkel" <hfinkel@anl.gov> wrote:
> >
> > ----- Original Message -----
> > > From: "David Blaikie" <dblaikie@gmail.com>
> > > To: "James Molloy" <james@jamesmolloy.co.uk>
> > > Cc: "Marshall Clow" <mclow.lists@gmail.com>, "cfe-dev Developers" <
> cfe-dev@cs.uiuc.edu>
> > > Sent: Wednesday, July 29, 2015 9:15:09 AM
> > > Subject: Re: [cfe-dev] missing return statement for non-void functions
> in C++
> > >
> > >
> > > On Jul 29, 2015 7:06 AM, "James Molloy" < james@jamesmolloy.co.uk >
> > > wrote:
> > > >
> > > > Hi,
> > > >
> > > > If we're going to emit a trap instruction (and thus create a broken
> > > > binary), why don't we error instead?
> > >
> > > We warn, can't error, because it may be dynamically unreached, in
> > > which case the program is valid and we can't reject it.
> >
> > I think this also explains why this is useful for optimization.
> >
> > 1. It is a code-size optimization
> > 2. By eliminating unreachable control flow, we can remove branches and
> tests that are not actual necessary
> >
> > int foo(int x) {
> > if (x > 5) return 2*x;
> > else if (x < 2) return 3 - x;
> > }
> >
> > That having been said, there are other ways to express these things, and
> the situation often represents an error. I'd be fine with requiring a
> special flag (-fallow-nonreturning-functions or whatever) in order to put
> the compiler is a truly confirming mode (similar to the situation with
> sized delete).
>
> Note that we already have a flag to trap on this: -fsanitize-trap=return.
> (You may also need -fsanitize=return, I don't remember.) That seems
> consistent with how we treat most other forms of UB.
>
> > -Hal
> >
> > >
> > > >
> > > > James
> > > >
> > > > On Wed, 29 Jul 2015 at 15:05 David Blaikie < dblaikie@gmail.com >
> > > > wrote:
> > > >>
> > > >>
> > > >> On Jul 29, 2015 2:10 AM, "mats petersson" < mats@planetcatfish.com
> > > >> > wrote:
> > > >> >
> > > >> >
> > > >> >
> > > >> > On 28 July 2015 at 23:40, Marshall Clow < mclow.lists@gmail.com
> > > >> > > wrote:
> > > >> >>
> > > >> >>
> > > >> >>
> > > >> >> On Tue, Jul 28, 2015 at 6:14 AM, Sjoerd Meijer <
> > > >> >> sjoerd.meijer@arm.com > wrote:
> > > >> >>>
> > > >> >>> Hi,
> > > >> >>>
> > > >> >>>
> > > >> >>>
> > > >> >>> In C++, the undefined behaviour of a missing return statements
> > > >> >>> for a non-void function results in not generating the
> > > >> >>> function epilogue (unreachable statement is inserted and the
> > > >> >>> return statement is optimised away). Consequently, the
> > > >> >>> runtime behaviour is that control is never properly returned
> > > >> >>> from this function and thus it starts executing "garbage
> > > >> >>> instructions". As this is undefined behaviour, this is
> > > >> >>> perfectly fine and according to the spec, and a compile
> > > >> >>> warning for this missing return statement is issued. However,
> > > >> >>> in C, the behaviour is that a function epilogue is generated,
> > > >> >>> i.e. basically by returning uninitialised local variable.
> > > >> >>> Codes that rely on this are not beautiful pieces of code, i.e
> > > >> >>> are buggy, but it might just be okay if you for example have
> > > >> >>> a function that just initialises stuff (and the return value
> > > >> >>> is not checked, directly or indirectly); some one might argue
> > > >> >>> that not returning from that function might be a bit harsh.
> > > >> >>
> > > >> >>
> > > >> >> I would not be one of those people.
> > > >> >
> > > >> >
> > > >> > Nor me.
> > > >> >>
> > > >> >>
> > > >> >>>
> > > >> >>> So this email is to probe if there would be strong resistance
> > > >> >>> to follow the C behaviour? I am not yet sure how, but would
> > > >> >>> perhaps a compromise be possible/acceptable to make the
> > > >> >>> undefined behaviour explicit and also generate the function
> > > >> >>> epilogue?
> > > >> >>
> > > >> >>
> > > >> >> "undefined behavior" is exactly that.
> > > >> >>
> > > >> >> You have no idea what is going to happen; there are no
> > > >> >> restrictions on what the code being executed can do.
> > > >> >>
> > > >> >> "it just might be ok" means on a particular version of a
> > > >> >> particular compiler, on a particular architecture and OS, at a
> > > >> >> particular optimization level. Change any of those things, and
> > > >> >> you can change the behavior.
> > > >> >
> > > >> >
> > > >> > In fact, the "it works kind of as you expected" is the worst
> > > >> > kind of UB in my mind. UB that causes a crash, stops or other
> > > >> > "directly obvious that this is wrong" are MUCH easier to debug.
> > > >> >
> > > >> > So make this particular kind of UB explicit by crashing or
> > > >> > stopping would be a good thing. Making it explicit by
> > > >> > "returning kind of nicely, but not correct return value" is
> > > >> > about the worst possible result.
> > > >>
> > > >> At -O0 clang emits a trap instruction, making it more explicit as
> > > >> you suggest. At higher optimization levels it just falls
> > > >> through/off.
> > > >>
> > > >> >
> > > >> > --
> > > >> > Mats
> > > >> >>
> > > >> >>
> > > >> >> -- Marshall
> > > >> >>
> > > >> >>
> > > >> >> _______________________________________________
> > > >> >> cfe-dev mailing list
> > > >> >> cfe-dev@cs.uiuc.edu
> > > >> >> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
> > > >> >>
> > > >> >
> > > >> >
> > > >> > _______________________________________________
> > > >> > cfe-dev mailing list
> > > >> > cfe-dev@cs.uiuc.edu
> > > >> > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
> > > >> >
> > > >>
> > > >> _______________________________________________
> > > >> cfe-dev mailing list
> > > >> cfe-dev@cs.uiuc.edu
> > > >> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
> > >
> > > _______________________________________________
> > > cfe-dev mailing list
> > > cfe-dev@cs.uiuc.edu
> > > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
> > >
> >
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
> >
> > _______________________________________________
> > cfe-dev mailing list
> > cfe-dev@cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>
[Attachment #5 (text/html)]
<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Fri, Jul 31, 2015 \
at 7:35 AM, Sjoerd Meijer <span dir="ltr"><<a href="mailto:sjoerd.meijer@arm.com" \
target="_blank">sjoerd.meijer@arm.com</a>></span> wrote:<br><blockquote \
class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc \
solid;padding-left:1ex"><div lang="EN-GB" link="blue" vlink="purple"><p \
class="MsoNormal"><span \
style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Hi, \
I am not sure if we came to a conclusion. Please find attached a patch. It simply \
removes the two lines that insert an unreachable statement (which cause removal of \
the return statement). Please note that at -O0 the trap instruction is still \
generated. Is this something we could live \
with?</span></p></div></blockquote><div><br></div><div>I don't think this is an \
improvement:</div><div><br></div><div>This doesn't satisfy the folks who want an \
'unreachable' for better code size and optimization, and it doesn't \
satisfy the folks who want a guaranteed trap for security, and it doesn't satisfy \
the folks who want their broken code to limp along (because it'll still trap at \
-O0), and it is at best a minor improvement for the folks who want missing returns to \
be more easily debuggable (with -On, the code goes wrong in the caller, or appears to \
work, rather than falling into an unrelated function, and debugging this with -O0 was \
already easy).</div><div><br></div><div>I think there are three options that are \
defensible here:</div><div>1) The status quo: this is UB and we treat it as such and \
optimize on that basis, but provide a trap as a convenience at -O0</div><div>2) The \
secure approach: this is UB but we always trap</div><div>3) Define the behavior to \
return 'undef' for C types: this allows questionable C code that has UB in \
C++ to keep working when built with a C++ compiler</div><div><br></div><div>Note that \
(3) can be combined with either (1) or (2). (2) is already available via the \
'return' sanitizer. So this really reduces to: in those cases where C says \
it's OK so long as the caller doesn't look at the returned value (and where \
the return type doesn't have a non-trivial copy constructor or destructor, \
isn't a reference, and so on), should we attempt to preserve the C behaviour? I \
would be OK with putting that behind a `-f` flag (perhaps `-fstrict-return` or \
similar) to support those folks who want to build C code in C++, but I would suggest \
having that flag be off by default, since that is not the usual use case for a C++ \
compiler.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 \
.8ex;border-left:1px #ccc solid;padding-left:1ex"><div lang="EN-GB" link="blue" \
vlink="purple"><p class="MsoNormal"><span \
style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Cheers,<u></u><u></u></span></p><p \
class="MsoNormal"><span \
style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Sjoerd.<u></u><u></u></span></p><p \
class="MsoNormal"><span \
style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> \
<u></u></span></p><p class="MsoNormal"><b><span lang="EN-US" \
style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span \
lang="EN-US" style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> \
<a href="mailto:cfe-dev-bounces@cs.uiuc.edu" \
target="_blank">cfe-dev-bounces@cs.uiuc.edu</a> [mailto:<a \
href="mailto:cfe-dev-bounces@cs.uiuc.edu" \
target="_blank">cfe-dev-bounces@cs.uiuc.edu</a>] <b>On Behalf Of </b>Richard \
Smith<br><b>Sent:</b> 29 July 2015 18:07<br><b>To:</b> Hal Finkel<br><b>Cc:</b> \
Marshall Clow; <a href="mailto:cfe-dev@cs.uiuc.edu" \
target="_blank">cfe-dev@cs.uiuc.edu</a> Developers</span></p><div><div \
class="h5"><br><b>Subject:</b> Re: [cfe-dev] missing return statement for non-void \
functions in C++<u></u><u></u></div></div><p></p><div><div class="h5"><p \
class="MsoNormal"><u></u> <u></u></p><p>On Jul 29, 2015 7:43 AM, "Hal \
Finkel" <<a href="mailto:hfinkel@anl.gov" \
target="_blank">hfinkel@anl.gov</a>> wrote:<br>><br>> ----- Original Message \
-----<br>> > From: "David Blaikie" <<a \
href="mailto:dblaikie@gmail.com" target="_blank">dblaikie@gmail.com</a>><br>> \
> To: "James Molloy" <<a href="mailto:james@jamesmolloy.co.uk" \
target="_blank">james@jamesmolloy.co.uk</a>><br>> > Cc: "Marshall \
Clow" <<a href="mailto:mclow.lists@gmail.com" \
target="_blank">mclow.lists@gmail.com</a>>, "cfe-dev Developers" <<a \
href="mailto:cfe-dev@cs.uiuc.edu" target="_blank">cfe-dev@cs.uiuc.edu</a>><br>> \
> Sent: Wednesday, July 29, 2015 9:15:09 AM<br>> > Subject: Re: [cfe-dev] \
missing return statement for non-void functions in C++<br>> ><br>> \
><br>> > On Jul 29, 2015 7:06 AM, "James Molloy" < <a \
href="mailto:james@jamesmolloy.co.uk" target="_blank">james@jamesmolloy.co.uk</a> \
><br>> > wrote:<br>> > ><br>> > > Hi,<br>> > \
><br>> > > If we're going to emit a trap instruction (and thus create \
a broken<br>> > > binary), why don't we error instead?<br>> \
><br>> > We warn, can't error, because it may be dynamically unreached, \
in<br>> > which case the program is valid and we can't reject \
it.<br>><br>> I think this also explains why this is useful for \
optimization.<br>><br>> 1. It is a code-size optimization<br>> 2. By \
eliminating unreachable control flow, we can remove branches and tests that are not \
actual necessary<br>><br>> int foo(int x) {<br>> if (x > 5) return \
2*x;<br>> else if (x < 2) return 3 - x;<br>> }<br>><br>> That \
having been said, there are other ways to express these things, and the situation \
often represents an error. I'd be fine with requiring a special flag \
(-fallow-nonreturning-functions or whatever) in order to put the compiler is a truly \
confirming mode (similar to the situation with sized \
delete).<u></u><u></u></p><p>Note that we already have a flag to trap on this: \
-fsanitize-trap=return. (You may also need -fsanitize=return, I don't remember.) \
That seems consistent with how we treat most other forms of \
UB.<u></u><u></u></p><p>> -Hal<br>><br>> ><br>> > ><br>> \
> > James<br>> > ><br>> > > On Wed, 29 Jul 2015 at 15:05 \
David Blaikie < <a href="mailto:dblaikie@gmail.com" \
target="_blank">dblaikie@gmail.com</a> ><br>> > > wrote:<br>> > \
>><br>> > >><br>> > >> On Jul 29, 2015 2:10 AM, \
"mats petersson" < <a href="mailto:mats@planetcatfish.com" \
target="_blank">mats@planetcatfish.com</a><br>> > >> > wrote:<br>> \
> >> ><br>> > >> ><br>> > >> ><br>> > \
>> > On 28 July 2015 at 23:40, Marshall Clow < <a \
href="mailto:mclow.lists@gmail.com" target="_blank">mclow.lists@gmail.com</a><br>> \
> >> > > wrote:<br>> > >> >><br>> > >> \
>><br>> > >> >><br>> > >> >> On Tue, Jul \
28, 2015 at 6:14 AM, Sjoerd Meijer <<br>> > >> >> <a \
href="mailto:sjoerd.meijer@arm.com" target="_blank">sjoerd.meijer@arm.com</a> > \
wrote:<br>> > >> >>><br>> > >> >>> \
Hi,<br>> > >> >>><br>> > >> >>><br>> \
> >> >>><br>> > >> >>> In C++, the undefined \
behaviour of a missing return statements<br>> > >> >>> for a \
non-void function results in not generating the<br>> > >> >>> \
function epilogue (unreachable statement is inserted and the<br>> > >> \
>>> return statement is optimised away). Consequently, the<br>> > \
>> >>> runtime behaviour is that control is never properly \
returned<br>> > >> >>> from this function and thus it starts \
executing "garbage<br>> > >> >>> instructions". As this is \
undefined behaviour, this is<br>> > >> >>> perfectly fine and \
according to the spec, and a compile<br>> > >> >>> warning for \
this missing return statement is issued. However,<br>> > >> >>> \
in C, the behaviour is that a function epilogue is generated,<br>> > >> \
>>> i.e. basically by returning uninitialised local variable.<br>> > \
>> >>> Codes that rely on this are not beautiful pieces of code, \
i.e<br>> > >> >>> are buggy, but it might just be okay if you \
for example have<br>> > >> >>> a function that just initialises \
stuff (and the return value<br>> > >> >>> is not checked, \
directly or indirectly); some one might argue<br>> > >> >>> that \
not returning from that function might be a bit harsh.<br>> > >> \
>><br>> > >> >><br>> > >> >> I would not be \
one of those people.<br>> > >> ><br>> > >> ><br>> \
> >> > Nor me.<br>> > >> >><br>> > >> \
>><br>> > >> >>><br>> > >> >>> So \
this email is to probe if there would be strong resistance<br>> > >> \
>>> to follow the C behaviour? I am not yet sure how, but would<br>> > \
>> >>> perhaps a compromise be possible/acceptable to make the<br>> \
> >> >>> undefined behaviour explicit and also generate the \
function<br>> > >> >>> epilogue?<br>> > >> \
>><br>> > >> >><br>> > >> >> \
"undefined behavior" is exactly that.<br>> > >> \
>><br>> > >> >> You have no idea what is going to happen; \
there are no<br>> > >> >> restrictions on what the code being \
executed can do.<br>> > >> >><br>> > >> >> \
"it just might be ok" means on a particular version of a<br>> > \
>> >> particular compiler, on a particular architecture and OS, at \
a<br>> > >> >> particular optimization level. Change any of those \
things, and<br>> > >> >> you can change the behavior.<br>> > \
>> ><br>> > >> ><br>> > >> > In fact, the \
"it works kind of as you expected" is the worst<br>> > >> > \
kind of UB in my mind. UB that causes a crash, stops or other<br>> > >> \
> "directly obvious that this is wrong" are MUCH easier to \
debug.<br>> > >> ><br>> > >> > So make this particular \
kind of UB explicit by crashing or<br>> > >> > stopping would be a \
good thing. Making it explicit by<br>> > >> > "returning kind of \
nicely, but not correct return value" is<br>> > >> > about the \
worst possible result.<br>> > >><br>> > >> At -O0 clang emits \
a trap instruction, making it more explicit as<br>> > >> you suggest. At \
higher optimization levels it just falls<br>> > >> through/off.<br>> \
> >><br>> > >> ><br>> > >> > --<br>> > \
>> > Mats<br>> > >> >><br>> > >> \
>><br>> > >> >> -- Marshall<br>> > >> \
>><br>> > >> >><br>> > >> >> \
_______________________________________________<br>> > >> >> \
cfe-dev mailing list<br>> > >> >> <a \
href="mailto:cfe-dev@cs.uiuc.edu" target="_blank">cfe-dev@cs.uiuc.edu</a><br>> \
> >> >> <a href="http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev" \
target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev</a><br>> > \
>> >><br>> > >> ><br>> > >> ><br>> > \
>> > _______________________________________________<br>> > >> \
> cfe-dev mailing list<br>> > >> > <a \
href="mailto:cfe-dev@cs.uiuc.edu" target="_blank">cfe-dev@cs.uiuc.edu</a><br>> \
> >> > <a href="http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev" \
target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev</a><br>> > \
>> ><br>> > >><br>> > >> \
_______________________________________________<br>> > >> cfe-dev mailing \
list<br>> > >> <a href="mailto:cfe-dev@cs.uiuc.edu" \
target="_blank">cfe-dev@cs.uiuc.edu</a><br>> > >> <a \
href="http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev" \
target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev</a><br>> \
><br>> > _______________________________________________<br>> > \
cfe-dev mailing list<br>> > <a href="mailto:cfe-dev@cs.uiuc.edu" \
target="_blank">cfe-dev@cs.uiuc.edu</a><br>> > <a \
href="http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev" \
target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev</a><br>> \
><br>><br>> --<br>> Hal Finkel<br>> Assistant Computational \
Scientist<br>> Leadership Computing Facility<br>> Argonne National \
Laboratory<br>><br>> _______________________________________________<br>> \
cfe-dev mailing list<br>> <a href="mailto:cfe-dev@cs.uiuc.edu" \
target="_blank">cfe-dev@cs.uiuc.edu</a><br>> <a \
href="http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev" \
target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev</a><u></u><u></u></p></div></div></div></blockquote></div><br></div></div>
_______________________________________________
cfe-dev mailing list
cfe-dev@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic