On Fri, Jul 31, 2015 at 7:35 AM, Sjoerd Meijer wrote:

> Hi, I am not sure if we came to a conclusion. Please find attached a
> patch. It simply removes the two lines that insert an unreachable statement
> (which cause removal of the return statement). Please note that at -O0 the
> trap instruction is still generated. Is this something we could live with?

I don't think this is an improvement:

This doesn't satisfy the folks who want an 'unreachable' for better code
size and optimization, and it doesn't satisfy the folks who want a
guaranteed trap for security, and it doesn't satisfy the folks who want
their broken code to limp along (because it'll still trap at -O0), and it is
at best a minor improvement for the folks who want missing returns to be
more easily debuggable (with -On, the code goes wrong in the caller, or
appears to work, rather than falling into an unrelated function, and
debugging this with -O0 was already easy).

I think there are three options that are defensible here:

1) The status quo: this is UB and we treat it as such and optimize on that
   basis, but provide a trap as a convenience at -O0
2) The secure approach: this is UB but we always trap
3) Define the behavior to return 'undef' for C types: this allows
   questionable C code that has UB in C++ to keep working when built with a
   C++ compiler

Note that (3) can be combined with either (1) or (2). (2) is already
available via the 'return' sanitizer. So this really reduces to: in those
cases where C says it's OK so long as the caller doesn't look at the
returned value (and where the return type doesn't have a non-trivial copy
constructor or destructor, isn't a reference, and so on), should we attempt
to preserve the C behaviour?
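
To make that question concrete, here is a minimal sketch of the two kinds of
function under discussion. It is only an illustration under my own
assumptions: `Device`, `init_device` and `demo` are made-up names that appear
nowhere in the thread, and `foo` mirrors the example quoted further down. To
keep the sketch well-defined in C++, each fall-off path is spelled out by
hand rather than left missing.

```cpp
#include <cstdlib>

// Hypothetical C-style initialiser (struct and name are illustrative only).
// The questionable C code would declare it as returning int, omit the
// return, and rely on callers never looking at the result.
struct Device {
  int ready;
};

int init_device(Device *d) {
  d->ready = 1;
  return 0;  // the line such code leaves out; omitting it is UB in C++
}

// Valid in C++ as written only because the fall-through path is explicit.
// Without the abort, calling it with 2 <= x <= 5 would flow off the end.
int foo(int x) {
  if (x > 5) return 2 * x;
  else if (x < 2) return 3 - x;
  std::abort();  // hand-written stand-in for option (2): always trap
}

// Exercises only the dynamically reachable returns.
bool demo() {
  Device d{0};
  init_device(&d);  // result deliberately ignored, as in the C idiom
  return d.ready == 1 && foo(7) == 14 && foo(0) == 3;
}
```

Calling `foo` with a value between 2 and 5 aborts here; that is roughly what
the 'return' sanitizer gives you without touching the source.
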

I would be OK with putting that behind a `-f` flag (perhaps `-fstrict-return`
or similar) to support those folks who want to build C code in C++, but I
would suggest having that flag be off by default, since that is not the
usual use case for a C++ compiler.

Cheers,
>
> Sjoerd.
>
> *From:* cfe-dev-bounces@cs.uiuc.edu [mailto:cfe-dev-bounces@cs.uiuc.edu]
> *On Behalf Of* Richard Smith
> *Sent:* 29 July 2015 18:07
> *To:* Hal Finkel
> *Cc:* Marshall Clow; cfe-dev@cs.uiuc.edu Developers
> *Subject:* Re: [cfe-dev] missing return statement for non-void functions
> in C++
>
> On Jul 29, 2015 7:43 AM, "Hal Finkel" wrote:
> >
> > ----- Original Message -----
> > > From: "David Blaikie"
> > > To: "James Molloy"
> > > Cc: "Marshall Clow", "cfe-dev Developers" <cfe-dev@cs.uiuc.edu>
> > > Sent: Wednesday, July 29, 2015 9:15:09 AM
> > > Subject: Re: [cfe-dev] missing return statement for non-void functions in C++
> > >
> > > On Jul 29, 2015 7:06 AM, "James Molloy" <james@jamesmolloy.co.uk> wrote:
> > > >
> > > > Hi,
> > > >
> > > > If we're going to emit a trap instruction (and thus create a broken
> > > > binary), why don't we error instead?
> > >
> > > We warn, can't error, because it may be dynamically unreached, in
> > > which case the program is valid and we can't reject it.
> >
> > I think this also explains why this is useful for optimization.
> >
> >  1. It is a code-size optimization
> >  2. By eliminating unreachable control flow, we can remove branches and
> >     tests that are not actually necessary
> >
> > int foo(int x) {
> >   if (x > 5) return 2*x;
> >   else if (x < 2) return 3 - x;
> > }
> >
> > That having been said, there are other ways to express these things, and
> > the situation often represents an error. I'd be fine with requiring a
> > special flag (-fallow-nonreturning-functions or whatever) in order to put
> > the compiler in a truly conforming mode (similar to the situation with
> > sized delete).
>
> Note that we already have a flag to trap on this: -fsanitize-trap=return.
> (You may also need -fsanitize=return, I don't remember.) That seems
> consistent with how we treat most other forms of UB.
>
> >  -Hal
> >
> > > >
> > > > James
> > > >
> > > > On Wed, 29 Jul 2015 at 15:05 David Blaikie <dblaikie@gmail.com> wrote:
> > > >>
> > > >> On Jul 29, 2015 2:10 AM, "mats petersson" <mats@planetcatfish.com> wrote:
> > > >> >
> > > >> > On 28 July 2015 at 23:40, Marshall Clow <mclow.lists@gmail.com> wrote:
> > > >> >>
> > > >> >> On Tue, Jul 28, 2015 at 6:14 AM, Sjoerd Meijer <sjoerd.meijer@arm.com> wrote:
> > > >> >>>
> > > >> >>> Hi,
> > > >> >>>
> > > >> >>> In C++, the undefined behaviour of a missing return statement
> > > >> >>> for a non-void function results in not generating the
> > > >> >>> function epilogue (unreachable statement is inserted and the
> > > >> >>> return statement is optimised away). Consequently, the
> > > >> >>> runtime behaviour is that control is never properly returned
> > > >> >>> from this function and thus it starts executing “garbage
> > > >> >>> instructions”. As this is undefined behaviour, this is
> > > >> >>> perfectly fine and according to the spec, and a compile
> > > >> >>> warning for this missing return statement is issued. However,
> > > >> >>> in C, the behaviour is that a function epilogue is generated,
> > > >> >>> i.e. basically by returning an uninitialised local variable.
> > > >> >>> Code that relies on this is not a beautiful piece of code, i.e.
> > > >> >>> it is buggy, but it might just be okay if you for example have
> > > >> >>> a function that just initialises stuff (and the return value
> > > >> >>> is not checked, directly or indirectly); someone might argue
> > > >> >>> that not returning from that function might be a bit harsh.
> > > >> >>
> > > >> >> I would not be one of those people.
> > > >> >
> > > >> > Nor me.
> > > >> >>
> > > >> >>>
> > > >> >>> So this email is to probe if there would be strong resistance
> > > >> >>> to follow the C behaviour? I am not yet sure how, but would
> > > >> >>> perhaps a compromise be possible/acceptable to make the
> > > >> >>> undefined behaviour explicit and also generate the function
> > > >> >>> epilogue?
> > > >> >>
> > > >> >> "undefined behavior" is exactly that.
> > > >> >>
> > > >> >> You have no idea what is going to happen; there are no
> > > >> >> restrictions on what the code being executed can do.
> > > >> >>
> > > >> >> "it just might be ok" means on a particular version of a
> > > >> >> particular compiler, on a particular architecture and OS, at a
> > > >> >> particular optimization level. Change any of those things, and
> > > >> >> you can change the behavior.
> > > >> >
> > > >> > In fact, the "it works kind of as you expected" is the worst
> > > >> > kind of UB in my mind. UB that causes a crash, a stop or some
> > > >> > other "directly obvious that this is wrong" outcome is MUCH
> > > >> > easier to debug.
> > > >> >
> > > >> > So making this particular kind of UB explicit by crashing or
> > > >> > stopping would be a good thing. Making it explicit by
> > > >> > "returning kind of nicely, but not correct return value" is
> > > >> > about the worst possible result.
> > > >>
> > > >> At -O0 clang emits a trap instruction, making it more explicit as
> > > >> you suggest. At higher optimization levels it just falls
> > > >> through/off.
> > > >>
> > > >> > --
> > > >> > Mats
> > > >> >>
> > > >> >> -- Marshall
> > > >> >>
> > > >> >> _______________________________________________
> > > >> >> cfe-dev mailing list
> > > >> >> cfe-dev@cs.uiuc.edu
> > > >> >> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
> >
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory