List: cfe-dev
Subject: Re: [cfe-dev] [Analyzer] Obtain MemRegion corresponding to an pointer expression that has been cast
From: scott constable via cfe-dev <cfe-dev () lists ! llvm ! org>
Date: 2015-08-24 20:49:04
Message-ID: CADYF24ffuUnft6sHfO5Sd1jwEqOHt_jCf19yEjvzhx4xJMfoVg () mail ! gmail ! com
Thanks Ted,
The solution was to write the "dereference" function like this:
const MemRegion *
Util::getPointedToRegion(SVal addrVal, bool ignoreElemCast) {
  Optional<Loc> l = addrVal.getAs<Loc>();
  if (!l) // not a location (e.g. an unknown or undefined value)
    return nullptr;

  const MemRegion *MR = l->getAsRegion();
  if (!MR) // no region behind the location, e.g. a null pointer
    return nullptr;

  // A pointer cast shows up as an ElementRegion layered on the original region.
  const ElementRegion *ER = dyn_cast<ElementRegion>(MR);
  if (ER && ignoreElemCast)
    MR = ER->getSuperRegion();

  return MR;
}
It's essentially just stripping off the ElementRegion, just like you
suggested.
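
A caveat on the function above, offered as a hedged aside: the analyzer also
uses ElementRegion for ordinary array indexing such as &buf[3], so taking the
super-region unconditionally would throw away a real index. The sketch below
is a toy model in plain C++ (ToyRegion and stripCastLayers are made-up names,
not clang::ento classes) that only peels layers whose index is zero. As far as
I can tell, Clang's MemRegion::StripCasts() applies a similar zero-index
check, so it may be worth a look before hand-rolling this.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Toy stand-ins for the analyzer's region classes. Hypothetical names;
// these are NOT clang::ento::MemRegion and friends.
struct ToyRegion {
  enum Kind { Var, Element } kind;
  std::string label;       // variable name or element type, for illustration
  const ToyRegion *super;  // enclosing region (nullptr for a variable)
  std::int64_t index;      // element index (unused for Var)
};

// Peel off Element layers that look like pointer casts (index 0), but stop
// at a layer with a nonzero index, which models genuine array indexing.
const ToyRegion *stripCastLayers(const ToyRegion *r) {
  while (r && r->kind == ToyRegion::Element && r->index == 0)
    r = r->super;
  return r;
}
```

With x modeled as a Var region and the (uint8_t *) cast as an Element layer at
index 0, stripCastLayers returns the region for x, while a layer with index 3
is returned unchanged. Note that &buf[0] is indistinguishable from a cast in
this model.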
~Scott Constable
On Wed, Aug 19, 2015 at 11:57 AM, Ted Kremenek via cfe-dev <
cfe-dev@lists.llvm.org> wrote:
> Hi Scott,
>
> I don't actually see a reason why you need to even look at the
> structure of the AST here. The analyzer does a full symbolic execution, so
> there is a powerful separation between syntax and semantics right at your
> fingertips.
>
> I would approach this from a different angle. Once you have the location,
> in this case, 'l', it should be an ElementRegion. That will represent the
> cast from the original MemRegion (a VarRegion) to uint8_t*. Then just strip
> off the ElementRegion. The MemRegion design captures how the casts were
> used to change the interpretation of a piece of memory. It's all right
> there in the MemRegion hierarchy.
>
> AST-based approaches like this are fundamentally very brittle. For
> example, you would need to do something different if the code was instead
> written like this:
>
> void foo() {
>   struct S x;
>   uint8_t *y = (uint8_t *)&x;
>   bar(y);
> }
>
> If you just use the MemRegions directly, these syntactic differences are
> irrelevant. The MemRegions capture the actual semantics of the value you
> are working with. In this case, the analyzer knows that the original
> memory address is for the VarRegion for 'x'.
>
> Typically if you find yourself going to the AST itself to do these kinds of
> operations, the approach is inherently wrong. Syntactic approaches work
> reasonably well for the compiler, where cheap local analysis is all you
> have. For the static analyzer, there is so much semantics captured in the
> ProgramState that you can go far beyond the reasoning power of syntactic
> checks like this.
>
> Cheers,
> Ted
>
> > On Aug 19, 2015, at 8:44 AM, scott constable via cfe-dev <cfe-dev@lists.llvm.org> wrote:
> >
> > Hi All,
> >
> > I'm analyzing something like the following code:
> >
> > struct S {
> >   int a;
> >   char b;
> >   int c;
> > };
> >
> > void foo() {
> >   struct S x;
> >   bar((uint8_t *)&x);
> > }
> >
> > When I reach the CallEvent corresponding to the call to bar(), I would
> > like to extract the MemRegion corresponding to x, i.e. by ignoring the
> > (uint8_t *) cast. My code looks something like this:
> >
> > const Expr *arg = Call.getArgExpr(0);
> > SVal addrVal = State->getSVal(arg, LCtx);
> > Optional<Loc> l = addrVal.getAs<Loc>();
> > if (!l) // not a location
> >   return nullptr;
> >
> > QualType T = getPointedToType(arg);
> > return State->getSVal(*l, T).getAsRegion();
> >
> > where getPointedToType() is defined as
> >
> > QualType getPointedToType(const Expr *E) {
> >   assert(E);
> >   if (!isPointer(E))
> >     return QualType();
> >   if (const CastExpr *cast = dyn_cast<CastExpr>(E))
> >     return getPointedToType(cast->getSubExpr());
> >
> >   const PointerType *Ty =
> >       dyn_cast<PointerType>(E->getType().getCanonicalType().getTypePtr());
> >   if (Ty)
> >     return Ty->getPointeeType();
> >   return QualType();
> > }
> >
> > Everything seems to work just fine, until the call to State->getSVal(*l,
> > T), which returns a NonLoc. If I instead call State->getSVal(*l) without
> > the pointed-to type, then I do get a MemRegion, but it's an element region
> > of type uint8_t, NOT what I want.
> >
> > Am I doing something wrong? Is there a much easier way to do this?
> >
> > ~Scott Constable
> > _______________________________________________
> > cfe-dev mailing list
> > cfe-dev@lists.llvm.org
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev@lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>