From cfe-dev Fri Jul 31 14:35:54 2015 From: "Sjoerd Meijer" Date: Fri, 31 Jul 2015 14:35:54 +0000 To: cfe-dev Subject: [cfe-dev] [PATCH] RE: missing return statement for non-void functions in C++ Message-Id: <000b01d0cb9e$351e3db0$9f5ab910$ () arm ! com> X-MARC-Message: https://marc.info/?l=cfe-dev&m=143835398104261 MIME-Version: 1 Content-Type: multipart/mixed; boundary="------=_NextPart_000_000C_01D0CBA6.96E4EFA0" This is a multipart message in MIME format. ------=_NextPart_000_000C_01D0CBA6.96E4EFA0 Content-Type: multipart/alternative; boundary="----=_NextPart_001_000D_01D0CBA6.96E4EFA0" ------=_NextPart_001_000D_01D0CBA6.96E4EFA0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Hi, I am not sure if we came to a conclusion. Please find attached a patch.= It simply removes the two lines that insert an unreachable statement (whic= h cause removal of the return statement). Please note that at -O0 the trap = instruction is still generated. Is this something we could live with? =20 Cheers, Sjoerd. =20 From: cfe-dev-bounces@cs.uiuc.edu [mailto:cfe-dev-bounces@cs.uiuc.edu] On B= ehalf Of Richard Smith Sent: 29 July 2015 18:07 To: Hal Finkel Cc: Marshall Clow; cfe-dev@cs.uiuc.edu Developers Subject: Re: [cfe-dev] missing return statement for non-void functions in C= ++ =20 On Jul 29, 2015 7:43 AM, "Hal Finkel" wrote: > > ----- Original Message ----- > > From: "David Blaikie" > > To: "James Molloy" > > Cc: "Marshall Clow" , "cfe-dev Developers" > > Sent: Wednesday, July 29, 2015 9:15:09 AM > > Subject: Re: [cfe-dev] missing return statement for non-void functions = in C++ > > > > > > On Jul 29, 2015 7:06 AM, "James Molloy" < james@jamesmolloy.co.uk > > > wrote: > > > > > > Hi, > > > > > > If we're going to emit a trap instruction (and thus create a broken > > > binary), why don't we error instead? > > > > We warn, can't error, because it may be dynamically unreached, in > > which case the program is valid and we can't reject it. > > I think this also explains why this is useful for optimization. > > 1. It is a code-size optimization > 2. By eliminating unreachable control flow, we can remove branches and t= ests that are not actual necessary > > int foo(int x) { > if (x > 5) return 2*x; > else if (x < 2) return 3 - x; > } > > That having been said, there are other ways to express these things, and = the situation often represents an error. I'd be fine with requiring a speci= al flag (-fallow-nonreturning-functions or whatever) in order to put the co= mpiler is a truly confirming mode (similar to the situation with sized dele= te). Note that we already have a flag to trap on this: -fsanitize-trap=3Dreturn.= (You may also need -fsanitize=3Dreturn, I don't remember.) That seems cons= istent with how we treat most other forms of UB. > -Hal > > > > > > > > > James > > > > > > On Wed, 29 Jul 2015 at 15:05 David Blaikie < dblaikie@gmail.com > > > > wrote: > > >> > > >> > > >> On Jul 29, 2015 2:10 AM, "mats petersson" < mats@planetcatfish.com > > >> > wrote: > > >> > > > >> > > > >> > > > >> > On 28 July 2015 at 23:40, Marshall Clow < mclow.lists@gmail.com > > >> > > wrote: > > >> >> > > >> >> > > >> >> > > >> >> On Tue, Jul 28, 2015 at 6:14 AM, Sjoerd Meijer < > > >> >> sjoerd.meijer@arm.com > wrote: > > >> >>> > > >> >>> Hi, > > >> >>> > > >> >>> > > >> >>> > > >> >>> In C++, the undefined behaviour of a missing return statements > > >> >>> for a non-void function results in not generating the > > >> >>> function epilogue (unreachable statement is inserted and the > > >> >>> return statement is optimised away). Consequently, the > > >> >>> runtime behaviour is that control is never properly returned > > >> >>> from this function and thus it starts executing =E2=80=9Cgarbage > > >> >>> instructions=E2=80=9D. As this is undefined behaviour, this is > > >> >>> perfectly fine and according to the spec, and a compile > > >> >>> warning for this missing return statement is issued. However, > > >> >>> in C, the behaviour is that a function epilogue is generated, > > >> >>> i.e. basically by returning uninitialised local variable. > > >> >>> Codes that rely on this are not beautiful pieces of code, i.e > > >> >>> are buggy, but it might just be okay if you for example have > > >> >>> a function that just initialises stuff (and the return value > > >> >>> is not checked, directly or indirectly); some one might argue > > >> >>> that not returning from that function might be a bit harsh. > > >> >> > > >> >> > > >> >> I would not be one of those people. > > >> > > > >> > > > >> > Nor me. > > >> >> > > >> >> > > >> >>> > > >> >>> So this email is to probe if there would be strong resistance > > >> >>> to follow the C behaviour? I am not yet sure how, but would > > >> >>> perhaps a compromise be possible/acceptable to make the > > >> >>> undefined behaviour explicit and also generate the function > > >> >>> epilogue? > > >> >> > > >> >> > > >> >> "undefined behavior" is exactly that. > > >> >> > > >> >> You have no idea what is going to happen; there are no > > >> >> restrictions on what the code being executed can do. > > >> >> > > >> >> "it just might be ok" means on a particular version of a > > >> >> particular compiler, on a particular architecture and OS, at a > > >> >> particular optimization level. Change any of those things, and > > >> >> you can change the behavior. > > >> > > > >> > > > >> > In fact, the "it works kind of as you expected" is the worst > > >> > kind of UB in my mind. UB that causes a crash, stops or other > > >> > "directly obvious that this is wrong" are MUCH easier to debug. > > >> > > > >> > So make this particular kind of UB explicit by crashing or > > >> > stopping would be a good thing. Making it explicit by > > >> > "returning kind of nicely, but not correct return value" is > > >> > about the worst possible result. > > >> > > >> At -O0 clang emits a trap instruction, making it more explicit as > > >> you suggest. At higher optimization levels it just falls > > >> through/off. > > >> > > >> > > > >> > -- > > >> > Mats > > >> >> > > >> >> > > >> >> -- Marshall > > >> >> > > >> >> > > >> >> _______________________________________________ > > >> >> cfe-dev mailing list > > >> >> cfe-dev@cs.uiuc.edu > > >> >> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev > > >> >> > > >> > > > >> > > > >> > _______________________________________________ > > >> > cfe-dev mailing list > > >> > cfe-dev@cs.uiuc.edu > > >> > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev > > >> > > > >> > > >> _______________________________________________ > > >> cfe-dev mailing list > > >> cfe-dev@cs.uiuc.edu > > >> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev > > > > _______________________________________________ > > cfe-dev mailing list > > cfe-dev@cs.uiuc.edu > > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev > > > > -- > Hal Finkel > Assistant Computational Scientist > Leadership Computing Facility > Argonne National Laboratory > > _______________________________________________ > cfe-dev mailing list > cfe-dev@cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev ------=_NextPart_001_000D_01D0CBA6.96E4EFA0 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable

Hi, I am = not sure if we came to a conclusion. Please find attached a patch. It simpl= y removes the two lines that insert an unreachable statement (which cause r= emoval of the return statement). Please note that at -O0 the trap instructi= on is still generated. Is this something we could live with?

 

Cheers,

Sj= oerd.

 

From: cfe-dev-bounces@= cs.uiuc.edu [mailto:cfe-dev-bounces@cs.uiuc.edu] On Behalf Of Richar= d Smith
Sent: 29 July 2015 18:07
To: Hal Finkel
C= c: Marshall Clow; cfe-dev@cs.uiuc.edu Developers
Subject: Re:= [cfe-dev] missing return statement for non-void functions in C++

 

On Jul 29, 2015 7:= 43 AM, "Hal Finkel" <hfinke= l@anl.gov> wrote:
>
> ----- Original Message -----
&g= t; > From: "David Blaikie" <dblaikie@gmail.com>
> > To: "James Molloy"= <james@jamesmolloy.co.uk= >
> > Cc: "Marshall Clow" <mclow.lists@gmail.com>, "cfe-dev Developers&q= uot; <cfe-dev@cs.uiuc.edu>=
> > Sent: Wednesday, July 29, 2015 9:15:09 AM
> > Subjec= t: Re: [cfe-dev] missing return statement for non-void functions in C++
= > >
> >
> > On Jul 29, 2015 7:06 AM, "James Mo= lloy" < james@jamesmollo= y.co.uk >
> > wrote:
> > >
> > > Hi= ,
> > >
> > > If we're going to emit a trap instruc= tion (and thus create a broken
> > > binary), why don't we erro= r instead?
> >
> > We warn, can't error, because it may b= e dynamically unreached, in
> > which case the program is valid an= d we can't reject it.
>
> I think this also explains why this i= s useful for optimization.
>
>  1. It is a code-size optim= ization
>  2. By eliminating unreachable control flow, we can re= move branches and tests that are not actual necessary
>
> int f= oo(int x) {
>   if (x > 5) return 2*x;
>   else if= (x < 2) return 3 - x;
> }
>
> That having been said, = there are other ways to express these things, and the situation often repre= sents an error. I'd be fine with requiring a special flag (-fallow-nonretur= ning-functions or whatever) in order to put the compiler is a truly confirm= ing mode (similar to the situation with sized delete).

Not= e that we already have a flag to trap on this: -fsanitize-trap=3Dreturn. (Y= ou may also need -fsanitize=3Dreturn, I don't remember.) That seems consist= ent with how we treat most other forms of UB.

>  -= Hal
>
> >
> > >
> > > James
> = > >
> > > On Wed, 29 Jul 2015 at 15:05 David Blaikie <= dblaikie@gmail.com >
> = > > wrote:
> > >>
> > >>
> > &= gt;> On Jul 29, 2015 2:10 AM, "mats petersson" < mats@planetcatfish.com
> > >= ;> > wrote:
> > >> >
> > >> >
= > > >> >
> > >> > On 28 July 2015 at 23:40= , Marshall Clow < mclow.lists@g= mail.com
> > >> > > wrote:
> > >> &= gt;>
> > >> >>
> > >> >>
&g= t; > >> >> On Tue, Jul 28, 2015 at 6:14 AM, Sjoerd Meijer &l= t;
> > >> >> = sjoerd.meijer@arm.com > wrote:
> > >> >>>> > >> >>> Hi,
> > >> >>>
= > > >> >>>
> > >> >>>
> = > >> >>> In C++, the undefined behaviour of a missing ret= urn statements
> > >> >>> for a non-void function r= esults in not generating the
> > >> >>> function ep= ilogue (unreachable statement is inserted and the
> > >> >= ;>> return statement is optimised away). Consequently, the
> &g= t; >> >>> runtime behaviour is that control is never properl= y returned
> > >> >>> from this function and thus i= t starts executing =E2=80=9Cgarbage
> > >> >>> inst= ructions=E2=80=9D. As this is undefined behaviour, this is
> > >= ;> >>> perfectly fine and according to the spec, and a compile<= br>> > >> >>> warning for this missing return statemen= t is issued. However,
> > >> >>> in C, the behaviou= r is that a function epilogue is generated,
> > >> >>&= gt; i.e. basically by returning uninitialised local variable.
> > = >> >>> Codes that rely on this are not beautiful pieces of c= ode, i.e
> > >> >>> are buggy, but it might just be= okay if you for example have
> > >> >>> a function= that just initialises stuff (and the return value
> > >> &g= t;>> is not checked, directly or indirectly); some one might argue> > >> >>> that not returning from that function migh= t be a bit harsh.
> > >> >>
> > >> >= >
> > >> >> I would not be one of those people.
= > > >> >
> > >> >
> > >> &g= t; Nor me.
> > >> >>
> > >> >>> > >> >>>
> > >> >>> So this= email is to probe if there would be strong resistance
> > >>= ; >>> to follow the C behaviour? I am not yet sure how, but would<= br>> > >> >>> perhaps a compromise be possible/accepta= ble to make the
> > >> >>> undefined behaviour expl= icit and also generate the function
> > >> >>> epil= ogue?
> > >> >>
> > >> >>
>= > >> >> "undefined behavior" is exactly that.
= > > >> >>
> > >> >> You have no idea= what is going to happen; there are no
> > >> >> restr= ictions on what the code being executed can do.
> > >> >&= gt;
> > >> >> "it just might be ok" means on= a particular version of a
> > >> >> particular compil= er, on a particular architecture and OS, at a
> > >> >>= ; particular optimization level. Change any of those things, and
> &g= t; >> >> you can change the behavior.
> > >> >= ;
> > >> >
> > >> > In fact, the "= it works kind of as you expected" is the worst
> > >> &= gt; kind of UB in my mind. UB that causes a crash, stops or other
> &= gt; >> > "directly obvious that this is wrong" are MUCH = easier to debug.
> > >> >
> > >> > So m= ake this particular kind of UB explicit by crashing or
> > >>= ; > stopping would be a good thing. Making it explicit by
> > &= gt;> > "returning kind of nicely, but not correct return value&q= uot; is
> > >> > about the worst possible result.
>= > >>
> > >> At -O0 clang emits a trap instruction,= making it more explicit as
> > >> you suggest. At higher op= timization levels it just falls
> > >> through/off.
> = > >>
> > >> >
> > >> > --
&= gt; > >> > Mats
> > >> >>
> > >= ;> >>
> > >> >> -- Marshall
> > >= > >>
> > >> >>
> > >> >>= _______________________________________________
> > >> >= > cfe-dev mailing list
> > >> >> cfe-dev@cs.uiuc.edu
> > >> >>= http://lists= .cs.uiuc.edu/mailman/listinfo/cfe-dev
> > >> >>> > >> >
> > >> >
> > >> &= gt; _______________________________________________
> > >> &= gt; cfe-dev mailing list
> > >> > cfe-dev@cs.uiuc.edu
> > >> > http://lists.cs.uiuc= .edu/mailman/listinfo/cfe-dev
> > >> >
> > &= gt;>
> > >> _____________________________________________= __
> > >> cfe-dev mailing list
> > >> cfe-dev@cs.uiuc.edu
> > >&g= t; http://lis= ts.cs.uiuc.edu/mailman/listinfo/cfe-dev
> >
> > _____= __________________________________________
> > cfe-dev mailing lis= t
> > cfe-dev@cs.uiuc.edu
> >
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
> >
&g= t;
> --
> Hal Finkel
> Assistant Computational Scientist<= br>> Leadership Computing Facility
> Argonne National Laboratory>
> _______________________________________________
> cfe-= dev mailing list
> cfe-dev@cs.= uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev

------=_NextPart_001_000D_01D0CBA6.96E4EFA0-- ------=_NextPart_000_000C_01D0CBA6.96E4EFA0 Content-Transfer-Encoding: base64 Content-Type: application/octet-stream; name="missingreturn.diff" Content-Disposition: attachment; filename="missingreturn.diff" ZGlmZiAtLWdpdCBhL2xpYi9Db2RlR2VuL0NvZGVHZW5GdW5jdGlvbi5jcHAgYi9saWIvQ29kZUdl bi9Db2RlR2VuRnVuY3Rpb24uY3BwCmluZGV4IGVjM2M3NWMuLjM1YTVmMTUgMTAwNjQ0Ci0tLSBh L2xpYi9Db2RlR2VuL0NvZGVHZW5GdW5jdGlvbi5jcHAKKysrIGIvbGliL0NvZGVHZW4vQ29kZUdl bkZ1bmN0aW9uLmNwcApAQCAtOTMzLDggKzkzMyw2IEBAIHZvaWQgQ29kZUdlbkZ1bmN0aW9uOjpH ZW5lcmF0ZUNvZGUoR2xvYmFsRGVjbCBHRCwgbGx2bTo6RnVuY3Rpb24gKkZuLAogICAgIH0gZWxz ZSBpZiAoQ0dNLmdldENvZGVHZW5PcHRzKCkuT3B0aW1pemF0aW9uTGV2ZWwgPT0gMCkgewogICAg ICAgRW1pdFRyYXBDYWxsKGxsdm06OkludHJpbnNpYzo6dHJhcCk7CiAgICAgfQotICAgIEJ1aWxk ZXIuQ3JlYXRlVW5yZWFjaGFibGUoKTsKLSAgICBCdWlsZGVyLkNsZWFySW5zZXJ0aW9uUG9pbnQo KTsKICAgfQogCiAgIC8vIEVtaXQgdGhlIHN0YW5kYXJkIGZ1bmN0aW9uIGVwaWxvZ3VlLgo= ------=_NextPart_000_000C_01D0CBA6.96E4EFA0 Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ cfe-dev mailing list cfe-dev@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev ------=_NextPart_000_000C_01D0CBA6.96E4EFA0--