Hi Richard,

I agree with your conclusions and will start preparing a patch for option 3) under a flag that is off by default; this enables folks to build/run C code in C++. I actually think option 2) would be a good one too, but as it is already available under a flag I also don’t see how useful it is combining options 2) and 3) with another (or one more) flag that is off by default.

Cheers.

From: metafoo@gmail.com [mailto:metafoo@gmail.com] On Behalf Of Richard Smith
Sent: 31 July 2015 19:46
To: Sjoerd Meijer
Cc: Hal Finkel; Marshall Clow; cfe-dev@cs.uiuc.edu Developers; cfe commits
Subject: Re: [PATCH] RE: [cfe-dev] missing return statement for non-void functions in C++

On Fri, Jul 31, 2015 at 7:35 AM, Sjoerd Meijer <sjoerd.meijer@arm.com> wrote:

Hi, I am not sure if we came to a conclusion. Please find attached a patch. It simply removes the two lines that insert an unreachable statement (which cause removal of the return statement). Please note that at -O0 the trap instruction is still generated. Is this something we could live with?

I don't think this is an improvement:

This doesn't satisfy the folks who want an 'unreachable' for better code size and optimization, and it doesn't satisfy the folks who want a guaranteed trap for security, and it doesn't satisfy the folks who want their broken code to limp along (because it'll still trap at -O0), and it is at best a minor improvement for the folks who want missing returns to be more easily debuggable (with -On, the code goes wrong in the caller, or appears to work, rather than falling into an unrelated function, and debugging this with -O0 was already easy).

I think there are three options that are defensible here:

1) The status quo: this is UB and we treat it as such and optimize on that basis, but provide a trap as a convenience at -O0
2) The secure approach: this is UB but we always trap
3) Define the behavior to return 'undef' for C types: this allows questionable C code that has UB in C++ to keep working when built with a C++ compiler

Note that (3) can be combined with either (1) or (2). (2) is already available via the 'return' sanitizer. So this really reduces to: in those cases where C says it's OK so long as the caller doesn't look at the returned value (and where the return type doesn't have a non-trivial copy constructor or destructor, isn't a reference, and so on), should we attempt to preserve the C behaviour? I would be OK with putting that behind a `-f` flag (perhaps `-fstrict-return` or similar) to support those folks who want to build C code in C++, but I would suggest having that flag be off by default, since that is not the usual use case for a C++ compiler.

Cheers,
Sjoerd.
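
For readers unfamiliar with the pattern that option 3) is meant to keep working, here is a minimal sketch in C. The names `init_table` and `setup` are hypothetical and do not come from the thread or from any patch. The function is declared to return `int` but never returns a value, and the caller never looks at the result. As Richard notes, C accepts this as long as the returned value is unused, whereas the same source built as C++ has undefined behaviour. A compiler will normally warn about the missing return.

```c
#include <stddef.h>

/* Hypothetical initialisation helper: declared to return int, but the
 * return statement is missing. Valid in C provided no caller uses the
 * result; undefined behaviour if the same code is compiled as C++. */
int init_table(int *table, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        table[i] = (int)i * 2;
    /* control reaches the closing brace without a return */
}

/* Hypothetical caller: it discards the result, which is what keeps the
 * C program well defined. */
void setup(int *table, size_t n)
{
    init_table(table, n);
}
```

If `setup` were changed to read the result of `init_table`, the program would have undefined behaviour in C as well, which is why the discussion restricts option 3) to callers that ignore the value.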

From: cfe-dev-bounces@cs.uiuc.edu [mailto:cfe-dev-bounces@cs.uiuc.edu] On Behalf Of Richard Smith
Sent: 29 July 2015 18:07
To: Hal Finkel
Cc: Marshall Clow; cfe-dev@cs.uiuc.edu Developers
Subject: Re: [cfe-dev] missing return statement for non-void functions in C++

On Jul 29, 2015 7:43 AM, "Hal Finkel" <hfinkel@anl.gov> wrote:
>
> ----- Original Message -----
> > From: "David Blaikie" <dblaikie@gmail.com>
> > To: "James Molloy" <james@jamesmolloy.co.uk>
> > Cc: "Marshall Clow" <mclow.lists@gmail.com>, "cfe-dev Developers" <cfe-dev@cs.uiuc.edu>
> > Sent: Wednesday, July 29, 2015 9:15:09 AM
> > Subject: Re: [cfe-dev] missing return statement for non-void functions in C++
> >
> > On Jul 29, 2015 7:06 AM, "James Molloy" <james@jamesmolloy.co.uk> wrote:
> > >
> > > Hi,
> > >
> > > If we're going to emit a trap instruction (and thus create a broken
> > > binary), why don't we error instead?
> >
> > We warn, can't error, because it may be dynamically unreached, in
> > which case the program is valid and we can't reject it.
>
> I think this also explains why this is useful for optimization.
>
>  1. It is a code-size optimization
>  2. By eliminating unreachable control flow, we can remove branches and tests that are not actually necessary
>
> int foo(int x) {
>   if (x > 5) return 2*x;
>   else if (x < 2) return 3 - x;
> }
>
> That having been said, there are other ways to express these things, and the situation often represents an error. I'd be fine with requiring a special flag (-fallow-nonreturning-functions or whatever) in order to put the compiler in a truly conforming mode (similar to the situation with sized delete).

Note that we already have a flag to trap on this: -fsanitize-trap=return. (You may also need -fsanitize=return, I don't remember.) That seems consistent with how we treat most other forms of UB.
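
One of the "other ways to express these things" that Hal alludes to might look like the sketch below. It is his `foo` with the assumption "x is never between 2 and 5" written out explicitly instead of left to a missing return. `__builtin_unreachable()` is a GCC/Clang builtin. Using it here is an assumption about what he had in mind; the thread does not say.

```c
/* Hal's example with the implicit assumption made explicit.
 * Assumption for illustration: callers never pass 2 <= x <= 5. */
int foo(int x)
{
    if (x > 5)
        return 2 * x;
    else if (x < 2)
        return 3 - x;

    /* GCC/Clang builtin: tells the optimiser this point is never reached.
     * Actually reaching it is still undefined behaviour. */
    __builtin_unreachable();
}
```

With the builtin in place the compiler can still drop the second test and the fall-through path, and the intent is visible in the source. Calling `foo` with a value from 2 to 5 remains undefined, exactly as in the original.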
>  -Hal
>
> > >
> > > James
> > >
> > > On Wed, 29 Jul 2015 at 15:05 David Blaikie <dblaikie@gmail.com> wrote:
> > >>
> > >> On Jul 29, 2015 2:10 AM, "mats petersson" <mats@planetcatfish.com> wrote:
> > >> >
> > >> > On 28 July 2015 at 23:40, Marshall Clow <mclow.lists@gmail.com> wrote:
> > >> >>
> > >> >> On Tue, Jul 28, 2015 at 6:14 AM, Sjoerd Meijer <sjoerd.meijer@arm.com> wrote:
> > >> >>>
> > >> >>> Hi,
> > >> >>>
> > >> >>> In C++, the undefined behaviour of a missing return statement
> > >> >>> for a non-void function results in not generating the
> > >> >>> function epilogue (unreachable statement is inserted and the
> > >> >>> return statement is optimised away). Consequently, the
> > >> >>> runtime behaviour is that control is never properly returned
> > >> >>> from this function and thus it starts executing “garbage
> > >> >>> instructions”. As this is undefined behaviour, this is
> > >> >>> perfectly fine and according to the spec, and a compile
> > >> >>> warning for this missing return statement is issued. However,
> > >> >>> in C, the behaviour is that a function epilogue is generated,
> > >> >>> i.e. basically by returning an uninitialised local variable.
> > >> >>> Codes that rely on this are not beautiful pieces of code, i.e.
> > >> >>> are buggy, but it might just be okay if you for example have
> > >> >>> a function that just initialises stuff (and the return value
> > >> >>> is not checked, directly or indirectly); someone might argue
> > >> >>> that not returning from that function might be a bit harsh.
> > >> >>
> > >> >> I would not be one of those people.
> > >> >
> > >> > Nor me.
> > >> >>
> > >> >>>
> > >> >>> So this email is to probe if there would be strong resistance
> > >> >>> to follow the C behaviour? I am not yet sure how, but would
> > >> >>> perhaps a compromise be possible/acceptable to make the
> > >> >>> undefined behaviour explicit and also generate the function
> > >> >>> epilogue?
> > >> >>
> > >> >> "undefined behavior" is exactly that.
> > >> >>
> > >> >> You have no idea what is going to happen; there are no
> > >> >> restrictions on what the code being executed can do.
> > >> >>
> > >> >> "it just might be ok" means on a particular version of a
> > >> >> particular compiler, on a particular architecture and OS, at a
> > >> >> particular optimization level. Change any of those things, and
> > >> >> you can change the behavior.
> > >> >
> > >> > In fact, the "it works kind of as you expected" is the worst
> > >> > kind of UB in my mind. UB that causes a crash, stops or other
> > >> > "directly obvious that this is wrong" are MUCH easier to debug.
> > >> >
> > >> > So making this particular kind of UB explicit by crashing or
> > >> > stopping would be a good thing. Making it explicit by
> > >> > "returning kind of nicely, but not correct return value" is
> > >> > about the worst possible result.
> > >>
> > >> At -O0 clang emits a trap instruction, making it more explicit as
> > >> you suggest. At higher optimization levels it just falls
> > >> through/off.
> > >>
> > >> >
> > >> > --
> > >> > Mats
> > >> >>
> > >> >> -- Marshall
> > >> >>
> > >> >> _______________________________________________
> > >> >> cfe-dev mailing list
> > >> >> cfe-dev@cs.uiuc.edu
> > >> >> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
> >
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory

_______________________________________________
cfe-commits mailing list
cfe-commits@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits