Peeter Joot's (OLD) Blog.

Math, physics, perl, and programming obscurity.

C pointer aliasing violations and aggressive compiler optimizations.

Posted by peeterjoot on November 14, 2009

In our product (DB2), performance is given priority over all else in many cases. We have performance regression testing on all our platforms (lots), policing of build to build and release to release instruction counts in critical codepaths, benchmark and customer workload driven performance features, and a lot of other similar performance driven activity.

It is probably fair to say that there are folks in our performance team who are willing to sacrifice small children and dogs for a 3% improvement in various workloads. Some of those sacrificed children are developer time and sanity. In particular, we build our product with options like the xlC (AIX powerpc compiler) -qansialias option (there are similar options on other compilers, like the gcc -fstrict-aliasing). Additionally, we build with options like -qipa (inter-procedural analysis) and -qpdf (profile directed feedback). These last two options allow the compiler to do very aggressive optimization, using profiling information to guide a second pass of compilation, and allowing that optimization to span separate compilation modules. In a nutshell, these are the very scary optimization options. When things go wrong, often because of incorrect code, the effects can be very mysterious, hard to isolate, and even non-deterministic (Thursday’s build can be busted, even if the code didn’t change, when Wednesday’s build was fine). What do these profile guided optimizations have to do with aliasing? I’ll return to that.

Incorrect pointer aliasing is probably the most prominent trigger for unexpected compiler behaviour under these aggressive optimizations. Aliasing problems can also mess up plain old optimization (whenever strict aliasing is explicitly enabled, or is on by default and has not been disabled). These issues generally become worse with profile guided optimization since the compiler has access to more state. To get an idea of how much state we are talking about, for our product, in the -qipa “re-link” phase of our libdb2e.a library, the compiler actually has ALL the source code available to it for the whole library in a metadata form of some sort. That’s a shitload of code. Our library, containing the bulk of our product, is now something like 100 MB, and that is without debug symbols. It is so big that we break linkers on most new platform ports. The profiling information that the compiler uses for this second phase of compilation tells it where to spend time on additional optimization. Just as important, the compiler also uses this profile information to decide where NOT to spend time on additional optimization. The compiler’s optimization problem is so big when it has all of the source available en masse that, unless you want to spend 3 weeks waiting for your build to complete, the compiler needs a mechanism for throttling its optimizations down.

I know I haven’t said what aliasing means. Sorry about that, I’ll try to get to it now. Reading the man pages may not be that enlightening. Here, for example, is the gcc man page content for its aliasing checking options:

       -Wstrict-aliasing
           This option is only active when -fstrict-aliasing is active.  It
           warns about code which might break the strict aliasing rules that
           the compiler is using for optimization. The warning does not catch
           all cases, but does attempt to catch the more common pitfalls. It
           is included in -Wall.

If you don’t already know what this means it isn’t going to be of much help. The help text for their option to enable strict aliasing is a bit better:

       -fstrict-aliasing
           Allows the compiler to assume the strictest aliasing rules
           applicable to the language being compiled.  For C (and C++), this
           activates optimizations based on the type of expressions.  In
           particular, an object of one type is assumed never to reside at the
           same address as an object of a different type, unless the types are
           almost the same.  For example, an "unsigned int" can alias an
           "int", but not a "void*" or a "double".  A character type may alias
           any other type.

           Pay special attention to code like this:

                   union a_union {
                     int i;
                     double d;
                   };

                   int f() {
                     a_union t;
                     t.d = 3.0;
                     return t.i;
                   }

           The practice of reading from a different union member than the one
           most recently written to (called ``type-punning'') is common.  Even
           with -fstrict-aliasing, type-punning is allowed, provided the
           memory is accessed through the union type.  So, the code above will
           work as expected.  However, this code might not:

                   int f() {
                     a_union t;
                     int* ip;
                     t.d = 3.0;
                     ip = &t.i;
                     return *ip;
                   }

           Every language that wishes to perform language-specific alias
           analysis should define a function that computes, given an "tree"
           node, an alias set for the node.  Nodes in different alias sets are
           not allowed to alias.  For an example, see the C front-end function
           "c_get_alias_set".

But in all honesty, this is written by a compiler developer, and another compiler developer will probably look at this and say, “sure”. As a C or C++ developer, the real impact of this option is still probably not clear. Here’s what it boils down to.

  • You can potentially get a lot of performance improvement by telling the compiler that you do not alias pointers of different types.
  • The subset of your developers who haven’t worked on the compiler probably don’t know what this means.
  • When they break the rules that they don’t know about, things blow up.
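
To make the first bullet concrete, here is a minimal sketch of the kind of code where the promise pays off. The function name scaleIndexes and its parameters are made up for illustration; nothing here is taken from our product.

```c
/*
 * Hypothetical example.  'out' points at ints and 'scale' points at a float.
 *
 * Under strict aliasing the compiler may assume that a store through 'out'
 * cannot change '*scale', so it is free to load '*scale' once, before the
 * loop.  Without that assumption it has to allow for the two overlapping,
 * and reload '*scale' on every iteration.
 */
void scaleIndexes( int * out, const float * scale, int n )
{
   for ( int i = 0 ; i < n ; i++ )
   {
      out[i] = (int)( *scale * (float)i ) ;
   }
}
```

Called with n = 4 and a scale of 2.0f this fills out with 0, 2, 4, 6. If a caller ever handed it pointers that really did overlap (say, by casting the address of the float to an int *), the value loaded before the loop could be stale, and the results would depend on what the optimizer happened to do.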

Ironically, the alias options are a lot like a cast in C.  When you cast something in C you are usually instructing the compiler “I know what I am doing, so please believe me that this is what I want, and do it for me”.  The aliasing options are very much like this: they are an explicit declaration to the compiler that you are following the rules it needs followed in order to optimize the code.

These rules are in fact a pact with the compiler that you are not going to change the type of a pointer with a cast, or that if you do, you won’t dereference the result.  There are certain types of casts that are allowed, but many that do not cause compilation errors are in fact not allowed when these options are, knowingly or unknowingly, in effect.  Here is an example:

int32_t u = INT32_MAX ;
short * ps = (short *)&u ;

printf( "0x%hx\n", *ps ) ;

What expectation do you have for this code? INT32_MAX is 0x7fffffff, so the low order 16 bits are 0xffff and the high order 16 bits are 0x7fff. On a little-endian machine most people would expect 0xffff to be printed, and on a big-endian machine 0x7fff. You may be rudely surprised if you got 0x0 printed or something else, perhaps something random. You’d say “whoa, that’s one buggy compiler!”. But here’s the thing: if you’ve enabled the strict pointer aliasing options of the compiler, you’ve told it,

“Hey man, I’m friendly to your optimizer, I’d never do something as evil as casting one type to something that it is not. If I do, then I won’t actually dereference this pointer. Please go ahead and optimize away assuming something like this would never happen. If it does happen, your optimization pass should probably just treat such a thing as an impossibility that could be optimized away.”
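
For contrast, here is a sketch of one way to look at the same bytes without making that promise a lie. It copies the bytes with memcpy instead of dereferencing a pointer of the wrong type. The helper name firstTwoBytes is hypothetical, and this is only one possible approach, not necessarily what we do in our product.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: copy the first two bytes (in memory order) of a
 * 32-bit value into a 16-bit object.
 *
 * memcpy copies bytes, so the int32_t object is never accessed through a
 * pointer to a different (non-character) type, and there is no aliasing
 * violation for the optimizer to trip over.
 */
uint16_t firstTwoBytes( int32_t v )
{
   uint16_t h ;

   memcpy( &h, &v, sizeof( h ) ) ;

   return h ;
}
```

Which half of the value you get still depends on the byte order of the machine. A value like 0x12341234, where both halves are the same, comes back as 0x1234 either way.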

Any strict aliasing option is pretty much the most dangerous thing you can tell the compiler, especially when the developers don’t know the rules that the compiler optimizer wants followed. A product like ours has hundreds or thousands of developers all over the world, with constant turnover and role changes. Subtleties like this aren’t taught in school. When it comes down to it, most developers assume that code that compiles is legal (albeit possibly buggy). If the language allows a cast, why would a subsequent dereference not be allowed?

It is the dereference that is really what is not allowed. You can cast away to your heart’s content, but if the final result of that cast changes the type, then you are not allowed to dereference it. Here’s another example:

void printIt( const void * const    p )
{
   short * ps = (short *)p ;

   printf( "0x%hx\n", *ps ) ;
}

void ThisIsAllowed( void )
{
   int16_t u = INT16_MAX ;

   printIt( &u ) ;
}

void ThisIsNotAllowed( void )
{
   int32_t u = INT32_MAX ;

   printIt( &u ) ;
}

Do you look at this example and also say “huuhh?” Isn’t that the whole point of a ‘void *’? You can cast something to a ‘void *’, and then you can cast it from a ‘void *’ and use it. Right?

Nope.

If you talk to the compiler guru who points out that your code is buggy since you’ve agreed to follow the rules they want, then the response, rudely violating your expectations, will be something like:

“You may cast a ‘TYPE pointer’ to a ‘void *’, and then cast a ‘void *’ back to a ‘TYPE pointer’. You may not use this as an opportunity to change the type from what it was originally.”
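
Here is a minimal sketch of what following that rule can look like: carry enough information along with the ‘void *’ that it only ever gets cast back to the type it started as. The names valueType and formatIt are invented for this illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical tag recording what a void pointer really points at. */
typedef enum { IS_INT16, IS_INT32 } valueType ;

/*
 * Format the pointed-to value as hex into buf.
 *
 * The void pointer is only ever cast back to the type named by the tag, so
 * the object is always read with its original type.  Returns what snprintf
 * returns, or -1 for an unknown tag.
 */
int formatIt( char * buf, size_t len, const void * p, valueType t )
{
   switch ( t )
   {
      case IS_INT16:
         return snprintf( buf, len, "0x%lx",
                          (unsigned long)(uint16_t)*(const int16_t *)p ) ;
      case IS_INT32:
         return snprintf( buf, len, "0x%lx",
                          (unsigned long)(uint32_t)*(const int32_t *)p ) ;
   }

   return -1 ;
}
```

The caller is still trusted to pass the right tag, so this is a convention and not something the compiler enforces. It does at least make the real type visible at the point of the cast.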

Oh. Shit. We probably do that all the time. Does this mean all our code is probably busted?

“Yes.”

It gets worse. Consider this fragment of code again. If those functions that dereference things are in different source code modules, then the compiler can’t know the type gets changed. Right? Wrong. Here’s the code split into two modules as an attempted workaround:

/* BLAH.C
 *
 * like it or not, we believe we are given a pointer to a short
 */
void printIt( const void * const    p )
{
   short * ps = (short *)p ;

   printf( "0x%hx\n", *ps ) ;
}

and

/* GOO.C 
 *
 */
void ThisIsAllowed( void )
{
   int16_t u = INT16_MAX ;

   printIt( &u ) ;
}

void ThisWasntAllowedButIsNowBelievedFixed( void )
{
   int32_t u = INT32_MAX ;

   printIt( &u ) ; // BLAH.C can't know this wasn't a short.
}

That doesn’t hide anything. Not only do we lie to the compiler, telling it that we follow the aliasing rules, but we also now let it do cross module optimization, even inlining non-inline functions that are called and defined in completely different places. This is why strict aliasing doesn’t get along with the even more aggressive optimization techniques.

So, you walk downstairs one floor to the compiler guys and ask, “How can I even find where our code is breaking the rules?”

These guys do want your code to work, but they have a hard problem, and you have hard problems, and here’s what they say:

“Here, I’ve got this friendly compiler option for you that will give you an error, whenever you cast anything. Does that help?”

Ah. Simply run that on millions of lines of legacy and constantly changing new code, and remove all the casting. I’ll get back to you tomorrow and let you know how that worked ;)

One of the options to fix this is simple. Stop lying to the compiler. Your code probably doesn’t follow the aliasing rules, so stop telling it that it does. This is what I think we ought to do with our code, but I’m now an old jaded programmer who doesn’t have the energy to fight that battle. All those baby sacrificers in the performance team aren’t going to like losing the free performance they are getting, and we’ve now found all the problematic parts of the code. Right? Sure, until next time.

I’m actually quite laid back writing this, so if it sounds like I am ranting, that’s not the intention. I’m just enjoying playing up the drama a bit. If I were asked what the right thing to do is, it would be to turn off the aliasing options by default, and only turn them back on for subsets of the code identified as worthy. Then impose a requirement for strongly educating all developers working in that kernel of performance sensitive code about what they may or may not do, and audit that subset of the code thoroughly. That’s the right way I think, but you have to ante up the people to deal with the performance regressions this will cause, and everybody is busy working on the new features. The people who pay the salaries of the developers and cater to marketing demands have a whole different idea of what the right thing to do is, and they are also perfectly correct from their point of view.

Realistically, even if we could selectively restrict our use of these strong optimizations, I wouldn’t have high expectations that it would really work. At least not permanently. Tools are required to enforce the rules (ideally the compiler), so that things do not regress. A developer who cannot compile the code is much more likely to spend the time required to understand the rules being broken.

It is unfortunate that many of the options that are available for diagnosis are too noisy, and not granular enough. Void pointer and char pointer casting are allowed in some circumstances, so alias detection options that warn about these along with all the other bad stuff mean that the bad stuff gets lost in the diagnostics barrage. When you are faced with humongous amounts of legacy code that no single person understands, and code that is only understood by people who have retired or moved on, then the only hope is incremental change, gradually increasing the scope of what is disallowed.
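
As a small illustration of the char pointer case that is allowed (the gcc text quoted earlier says a character type may alias any other type), here is a sketch of a helper that reads an arbitrary object one byte at a time. The name hexBytes and its signature are made up for this example.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: write the bytes of any object, in memory order, as
 * a hex string into out.
 *
 * Reading an object through an unsigned char pointer is permitted whatever
 * the object's real type is, so this is not an aliasing violation.
 * Returns the number of bytes that were formatted.
 */
size_t hexBytes( char * out, size_t outLen, const void * obj, size_t objLen )
{
   const unsigned char * b = (const unsigned char *)obj ;
   size_t i ;

   /* each byte needs two hex digits, plus room for the trailing NUL */
   for ( i = 0 ; i < objLen && ( 2 * i + 3 ) <= outLen ; i++ )
   {
      snprintf( out + 2 * i, 3, "%02x", (unsigned)b[i] ) ;
   }

   if ( 0 == i && outLen )
   {
      out[0] = '\0' ;
   }

   return i ;
}
```

A diagnostic option that flags a cast like this one alongside a genuinely bad ‘int32_t *’ to ‘short *’ cast is exactly the sort of noise that buries the real problems.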

There is so much pressure for performance that the compiler guys (at least for our compiler) now deliberately weaken their optimizations when it appears there is an aliasing rule violation. Unfortunately even this is not easily fed back to us for code change, since it happens in the bowels of the deepest layers of the optimizer, where the infrastructure for producing external and meaningful messages to the developer is not readily available. With the compiler doing this deliberate optimization weakening so that we can continue to lie to it that we follow its rules, we do squeak by.

I don’t know a good way of dealing with these sorts of problems. I suspect that the status quo, tackling these things one at a time as they happen, will be with us for a while. Every few months, usually as a product or fixpack cycle approaches its end, we’ll hit these sorts of optimization issues, and have the hard task of debugging them. The cost of this in developer time is absorbed silently and is not part of any sizing of effort or time.

I’d love it if we had at least one of the platform compilers that we use deal with strict aliasing in a good way, making it an error to break the rules when we say we are following them. Then we can sleep at night knowing that we do not have a timebomb, and also get the performance that we desire.

Some references

I’ve glossed over a lot of details. There are a number of ways that type changing casts are allowed, and I haven’t said how you would fix these issues when you hit them. It’s my intention to blog on these separately some time later. Until then I’ve found a couple of good tidbits that do some of what I have not:

what-is-the-strict-aliasing-rule?

and here:

understanding-strict-aliasing

3 Responses to “C pointer aliasing violations and aggressive compiler optimizations.”

  1. […] by peeterjoot on July 9, 2010 In C pointer aliasing violations and aggressive compiler optimizations, I discussed some examples of aliasing violations. Some types of aliasing violations can be fixed […]

  2. Alan Silverstein said

    Great article, very useful, thanks.
