Monday Sep 17, 2012

C++11 Tidbits: access control under SFINAE conditions

Lately I have been spending quite some time on the SFINAE ("Substitution failure is not an error") features of C++, fixing and tweaking various bits of the GCC implementation.

An important missing piece was the implementation of the resolution of DR 1170 which, in a nutshell, mandates that access checking is done as part of the substitution process. Consider:

class C {
  typedef int type;
};

template <class T, class = typename T::type>
auto f(int) -> char;

template <class>
auto f(...) -> char (&)[2];

static_assert (sizeof(f<C>(0)) == 2, "Ouch");

According to the resolution, the static_assert should not fire, and the snippet should compile successfully, the reason being that the first f overload must be removed from the candidate set because C::type is private to C. On the other hand, before the resolution of DR 1170, the expected behavior was for the first overload to remain in the candidate set, win over the second one, to eventually lead to an access control error (*).

In GCC mainline (would be 4.8) the DR is finally implemented, thus benefiting the many modern programming techniques exploiting SFINAE, among which certainly the GNU C++ runtime library itself, which relies on it for the internals of <type_traits> and in several other places.

Note that the resolution of the DR is active even in C++98 mode, not just in C++11 mode, because it turned out that the traditional behavior, as implemented in GCC, wasn't fully consistent in all the possible circumstances.
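As a minimal sketch of the kind of technique that benefits from this, consider a detection trait built on the same two-overload pattern. The names has_accessible_type, Pub, Priv and None are made up for illustration and are not taken from the GCC sources; the sketch assumes a compiler implementing the resolution of DR 1170 (GCC 4.8 or later):

```cpp
#include <type_traits>

// Hypothetical trait: true when T::type names a nested type that is
// accessible from the trait, false otherwise.
template <class T>
class has_accessible_type
{
  // Viable only if U::type exists and can be accessed from here.
  template <class U, class = typename U::type>
  static std::true_type test(int);

  // Fallback, selected when the overload above is removed by SFINAE.
  template <class>
  static std::false_type test(...);

public:
  static constexpr bool value = decltype(test<T>(0))::value;
};

struct Pub  { typedef int type; };   // public nested type
class  Priv { typedef int type; };   // private nested type
struct None { };                     // no nested type at all

static_assert(has_accessible_type<Pub>::value, "public typedef is detected");
static_assert(!has_accessible_type<Priv>::value, "private typedef is a substitution failure");
static_assert(!has_accessible_type<None>::value, "missing typedef is a substitution failure");
```

With the resolution in place the private case is simply a substitution failure, exactly like the missing one, instead of a hard access control error.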

(*) In practice, GCC never really implemented this behavior: the static_assert triggered instead.

Wednesday Jun 06, 2012

C++11 Tidbits: Decltype (Part 2, trailing return type)

Following on from the last tidbit, which showed how the decltype operator essentially queries the type of an expression, the second part of this overview discusses how decltype can be syntactically combined with auto (itself the subject of the March 2010 tidbit). This combination can be used to specify trailing return types, also known informally as "late specified return types".

Leaving aside the technical jargon, a simple example from section 8.3.5 of the C++11 standard usefully introduces this month's topic. Let's consider a template function like:

  template <class T, class U>
    ???
    foo(T t, U u)
    {
      return t + u;
    }

The question is: what should replace the question marks?

The problem is that we are dealing with a template, thus we don't know at the outset the types of T and U. Even if they were restricted to be arithmetic builtin types, non-trivial rules in C++ relate the type of the sum to the types of T and U. In the past - in the GNU C++ runtime library too - programmers used to address these situations by way of rather ugly tricks involving __typeof__ which now, with decltype, could be rewritten as:

  template <class T, class U>
    decltype((*(T*)0) + (*(U*)0))
    foo(T t, U u)
    {
      return t + u;
    }

On the other hand, in C++11 you can use auto:

  template <class T, class U>
    auto
    foo(T t, U u) -> decltype(t + u)
    {
      return t + u;
    }

This is much better. It's generic and a construct fully supported by the language.
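To see what the trailing return type actually buys, here is a small self-contained sketch. The function name add is hypothetical (it is just foo under a more telling name), and the static_asserts merely spell out the usual arithmetic conversions:

```cpp
#include <type_traits>

// Same shape as foo above: the return type is whatever t + u yields.
template <class T, class U>
  auto
  add(T t, U u) -> decltype(t + u)
  {
    return t + u;
  }

// int + double is computed in double.
static_assert(std::is_same<decltype(add(1, 2.5)), double>::value,
              "int + double yields double");

// Both chars are promoted to int before the addition.
static_assert(std::is_same<decltype(add('a', 'b')), int>::value,
              "char + char yields int");

// float + long double is computed in long double.
static_assert(std::is_same<decltype(add(1.0f, 2.0L)), long double>::value,
              "float + long double yields long double");
```

None of these return types had to be written down by hand: the compiler derives each one from the expression in the trailing return type.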

Finally, let's see a real-life example directly taken from the C++11 runtime library as implemented in GCC:

  template<typename _IteratorL, typename _IteratorR>
    inline auto
    operator-(const reverse_iterator<_IteratorL>& __x,
	      const reverse_iterator<_IteratorR>& __y)
    -> decltype(__y.base() - __x.base())
    { return __y.base() - __x.base(); }

By now it should appear completely straightforward.

The availability of trailing return types in C++11 allowed fixing a real bug in the C++98 implementation of this operator (and many similar ones). In GCC, C++98 mode, this operator is:

  template<typename _IteratorL, typename _IteratorR>
    inline typename reverse_iterator<_IteratorL>::difference_type
    operator-(const reverse_iterator<_IteratorL>& __x,
	      const reverse_iterator<_IteratorR>& __y)
    { return __y.base() - __x.base(); }

This was guaranteed to work well with heterogeneous reverse_iterator types only if difference_type was the same for both types.
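A short sketch of the heterogeneous case, assuming a C++11 library: a reverse_iterator over const int* is subtracted from one over int*. The helper name mixed_distance and the array contents are made up for illustration:

```cpp
#include <cstddef>
#include <iterator>

// Hypothetical helper: distance between two reverse iterators whose
// underlying iterator types differ (int* versus const int*).
inline std::ptrdiff_t
mixed_distance()
{
  int data[] = { 1, 2, 3, 4, 5 };

  std::reverse_iterator<int*>       first(data + 5);  // refers to data[4]
  std::reverse_iterator<const int*> last(data + 2);   // refers to data[1]

  // operator-(x, y) returns y.base() - x.base(), here
  // first.base() - last.base() = (data + 5) - (data + 2).
  return last - first;
}
```

In C++11 mode the result type is simply that of the pointer subtraction, std::ptrdiff_t here, without relying on the difference_type of either reverse_iterator.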

Wednesday May 09, 2012

C++11 Tidbits: Decltype (Part 1)

Decltype was among the first C++11 features implemented in GCC. It has roots in a very old GNU extension named __typeof__, also usable in C and well known to users of the GNU Compiler Collection. The C++11 conforming implementation of the idea landed in GCC 4.3.x in 2008. It overcame some defects of __typeof__. Both decltype and __decltype are available in GCC; the former only in c++11 mode, the latter both in c++11 and c++98 mode. The legacy __typeof__ is now unmaintained, meaning no problematic aspects with it will be addressed in future, besides of course making sure that what used to work keeps on working.

That said, let's introduce the basic concept of decltype. In its essence it's quite simple: the decltype operator queries the type of an expression. When this is integrated with other C++11 features it can enable rather interesting programming constructs.

A range of basic examples follows, taken from ISO document N2343 (see some of the references therein for historical background and rationale):

  const int&& foo();
  int i;
  struct A { double x; };
  const A* a = new A();

  decltype(foo())  x1;  //  type is const int&&      (1)
  decltype(i)      x2;  //  type is int              (2)
  decltype(a->x)   x3;  //  type is double           (3)
  decltype((a->x)) x4;  //  type is const double&    (4) 

As shown by these, decltype is not affected by the limitations of the traditional __typeof__ with respect to reference types. See cases (1) and (4), which show that such types are handled in a consistent way. It is, as such, fully integrated into the modern C++11 type system.
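The four cases can be checked mechanically with std::is_same. This is only a sketch: foo is declared but never defined, and the pointer a is initialized to nullptr instead of new A(), both of which are fine only because decltype never evaluates its operand:

```cpp
#include <type_traits>

const int&& foo();          // declaration only: never called
int i;
struct A { double x; };
const A* a = nullptr;       // assumption: never dereferenced at run time

// (1) a function call keeps the exact return type, reference included
static_assert(std::is_same<decltype(foo()), const int&&>::value, "(1)");

// (2) the name of a variable gives its declared type
static_assert(std::is_same<decltype(i), int>::value, "(2)");

// (3) an unparenthesized member access gives the declared member type
static_assert(std::is_same<decltype(a->x), double>::value, "(3)");

// (4) the extra parentheses make it an lvalue expression, and the
//     constness of *a propagates to the member
static_assert(std::is_same<decltype((a->x)), const double&>::value, "(4)");
```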

More complex expressions are possible. For example:

  int    i;
  float  f;
  double d;
  
  typedef decltype(i + f)  type1;  // float
  typedef decltype(f + d)  type2;  // double
  typedef decltype(f < d)  type3;  // bool

Decltype turns out to be very useful with templates in the context of generic programming, when the same generic function or class can be instantiated for different types. Via decltype the code can easily query the type of generic expressions, for example:

  template<typename T, typename U>
    void
    foo(T t, U u)
    {
       // ...

       typedef decltype(t * u) ptype;

       // use ptype...
    }

The second part of this tidbit will present more possibilities that decltype enables.

Tuesday Apr 24, 2012

C++11 Tidbits: Explicit overrides and final

GCC 4.7.0 was released on March 22 and, among many other new C++11 features (http://gcc.gnu.org/gcc-4.7/cxx0x_status.html), it makes it possible to use the identifiers (1) 'final' and 'override' in a special way. For instance (see override*.C in the C++ testsuite for many other examples):

  struct B1 final { };

  struct D1 : B1 { };              // "cannot derive from 'final' base"

The code is rejected, because D1 tries to derive from B1, which is decorated as 'final'. Likewise, for virtual functions:

  struct B2
  {
    virtual void f() final {}      // "overriding final function"
  };

  struct D2 : B2
  {
    virtual void f() {}
  };

This is rejected too, because D2::f tries to override B2::f, which is decorated with 'final'. Also this is rejected:

  struct B3
  {
    virtual void g(int) {}
  };

  struct D3 : B3
  {
    virtual void g(double) override {}    // "does not override"
  };

This shows that thanks to the 'override' identifier, another whole class of programming errors can be easily avoided. These errors occurred where, due to a wrong signature, a new virtual function was inadvertently declared instead of overriding an existing one.

Note, however, that all of this isn't just syntactic sugar, useful to avoid many common programming errors. A call to a virtual function marked 'final', like B2::f above, is devirtualized by the GNU C++ front-end, thus leading to much more efficient code. Likewise, this occurs for any virtual function declared in a class decorated with 'final' as a whole, for example B1 above.
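For contrast with the rejected snippets, here is a sketch of code that uses both identifiers correctly. The names Shape, Triangle and count_sides are hypothetical, and whether the call is really devirtualized depends on the compiler and on the optimization level:

```cpp
struct Shape
{
  virtual ~Shape() { }
  virtual int sides() const { return 0; }
};

// 'final' on the class: nothing can derive from Triangle.
struct Triangle final : Shape
{
  // 'override' makes the compiler check that Shape::sides() const
  // really exists with this exact signature.
  int sides() const override { return 3; }
};

inline int
count_sides(const Triangle& t)
{
  // The dynamic type of t can only be Triangle, so the call can be
  // resolved statically to Triangle::sides.
  return t.sides();
}
```

Calling count_sides on a Triangle gives 3, and so does a call to sides() through a const Shape& bound to the same object, which still goes through the normal virtual dispatch.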

(1) Technically those are existing identifiers which acquire a special meaning in some situations in C++11.

Wednesday Mar 07, 2012

C++11 Tidbits: Template Aliases

In terms of ISO C++ Standardization the concept of Template Aliases dates back to 2002 although, at that time, many people used the name "Typedef Templates". In subsequent years Gabriel Dos Reis and Bjarne Stroustrup, in particular, developed the idea into the form which is now part of the C++11 Standard (n2258 is the last free standing paper describing it).

In GCC the implementation is, unfortunately, rather recent. Only the forthcoming GCC 4.7 release series (release candidates are out for testing!) will include it. Maybe the delay is also the fault of library people not pushing for it strongly enough: it's relatively simple to implement and it helps the C++11 library in important areas such as containers and memory.

As it happens the core idea is pretty easy to explain: provide an alias for a family of types. Note that there are no new keywords involved: the using keyword is 'recycled' for the new syntax. For example (directly from n2258):

  #include <vector>

  template<class T> struct Alloc { /* ... */ };
  template<class T>
    using Vec = std::vector<T, Alloc<T>>;

  Vec<int> v;   // same as std::vector<int, Alloc<int>> v;

Here using declares a name Vec for a family of types, which are all the possible std::vector<T, Alloc<T>>, for any T. Then the template alias can be used as any other template, thus by providing between angle brackets an argument for T.
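A further sketch, with made-up names StringMap and entry_count, shows two properties worth knowing: an alias never introduces a new type, and template argument deduction sees straight through it:

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <type_traits>

// Hypothetical alias: fixes the key type, leaves the mapped type open.
template<class T>
  using StringMap = std::map<std::string, T>;

// The alias is just another name for the very same type.
static_assert(std::is_same<StringMap<int>,
                           std::map<std::string, int>>::value,
              "an alias template does not create a new type");

// T is deduced from the underlying std::map specialization.
template<class T>
  std::size_t
  entry_count(const StringMap<T>& m)
  { return m.size(); }
```

A StringMap<int> holding two entries can be passed to entry_count without spelling out T, and the call returns 2.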

Much more complex examples are possible but there isn't much more to the syntax. It may be interesting to see another simple one which historically motivated the proposals. It is directly inspired by the rebind member of the std::allocator class - which many novice programmers find quite obscure:

  template<typename T>
    class allocator
    {
      //...
 
      template<typename U>
        struct rebind { typedef allocator<U> other; };
    };

   allocator<T>::rebind<U>::other x;  // sample usage

this can now be rewritten:

  template<typename T>
    class allocator
    {
      //...

      template<typename U>
        using rebind = allocator<U>;
    };

   allocator<T>::rebind<U> x;         // sample usage

Indeed, in the C++11 library there are many rebind-style template aliases in the areas of memory allocation, pointer traits, and elsewhere: just grep inside the current GCC headers for concrete examples!
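One such alias is part of the standard interface: std::allocator_traits<Alloc>::rebind_alloc<U>. The sketch below uses it directly and then inside a template, where the typename and template disambiguators become necessary; the names IntAlloc, DoubleAlloc and rebound are made up for illustration:

```cpp
#include <memory>
#include <type_traits>

typedef std::allocator<int> IntAlloc;

// Non-dependent use: no disambiguators needed.
typedef std::allocator_traits<IntAlloc>::rebind_alloc<double> DoubleAlloc;

static_assert(std::is_same<DoubleAlloc, std::allocator<double>>::value,
              "rebinding std::allocator<int> to double");

// Hypothetical helper: inside a template the alias is a dependent
// name, hence 'typename' and 'template'.
template<class Alloc, class U>
  struct rebound
  {
    typedef typename std::allocator_traits<Alloc>::template rebind_alloc<U> type;
  };

static_assert(std::is_same<rebound<IntAlloc, char>::type,
                           std::allocator<char>>::value,
              "same result through the dependent form");
```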

Monday Feb 13, 2012

C++11 Tidbits: User-defined Literals

Let's have a look at a nifty small feature, which people learning to program in C++11 will like a lot.

Consider the following self contained program:

   #include <iostream>

   long double operator"" _mm(long double x) { return x / 1000; }
   long double operator"" _m(long double x)  { return x; }
   long double operator"" _km(long double x) { return x * 1000; }

   int main()
   {
     std::cout << 1.0_mm << '\n';
     std::cout << 1.0_m  << '\n';
     std::cout << 1.0_km << '\n';
   }

and its output:

   0.001
   1
   1000

The new syntax should be rather self-explanatory: the first three lines of the code after the #include define three user-defined operators of a new kind, called literal operators. In the example, the three operators know the power of ten corresponding to each SI prefix, that is 1 mm = 10^-3 m, 1 km = 10^3 m, plus the trivial 1 m = 1 m, and are thus able to automatically compute how many meters correspond to each SI length.

In order to start experimenting with this feature, it's useful to know that there are restrictions to the number and type of the parameters of the operators. Essentially (see C++11 or N2765 for details) only the following signatures are legal:

   char const*
   unsigned long long
   long double
   char const*, std::size_t
   wchar_t const*, std::size_t
   char16_t const*, std::size_t
   char32_t const*, std::size_t

where the last four are useful for strings because the second argument is automatically deduced as the length of the string. For example:

   std::size_t operator"" _len(char const*, std::size_t l) 
   { 
     return l;
   }

   int main()
   {
     std::cout << "ABCDEFGH"_len << '\n';
   }

gives the output:

   8
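The unsigned long long signature from the list plays the same role for integer literals as long double does for floating-point ones. A minimal sketch, with a made-up suffix _kib, which also shows that a literal operator can be constexpr:

```cpp
// Hypothetical suffix: converts a count of KiB into bytes.
constexpr unsigned long long
operator"" _kib(unsigned long long n)
{ return n * 1024; }

// Being constexpr, the operator is usable in constant expressions.
static_assert(4_kib == 4096, "4 KiB are 4096 bytes");

char buffer[1_kib];   // array of 1024 chars
```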

The first signature type I listed, not to be confused with the signatures for strings, is used for so-called raw literal operators. For example:

   char const* operator"" _r(char const* s) 
   {
     return s;
   }

   int main()
   {
     std::cout << 12_r << '\n';
   }

which outputs:

   12

The return type of a literal operator, on the other hand, is not restricted. For example it could be a full-fledged class type like std::string:

   std::string operator"" _rs(char const* s)
   { 
     return 'x' + std::string(s) + 'y'; 
   }

   int main()
   {
     std::cout << 5_rs << '\n';
   }

which outputs:

   x5y

As always, if you are looking for rationale, relationships to similar features in other languages, compatibility with C99, and all the technical details which guided the evolution of the facility until the final standardization in C++11, the early discussion papers are very useful, in particular N1892, in 2005.

User-defined literals are brand new in GCC. Thus, if you are interested in experimenting, a mainline snapshot or a tree checked out via SVN is necessary. On the other hand, you have a chance to help the project by fixing the remaining bugs. In any case the release of 4.7.0 is very close - it is currently aimed for March or April 2012.

C++11 Tidbits: Lambda Expressions (Part 2)

Previously I briefly introduced Lambda expressions. To show some more examples, consider the following:

  std::cout << [](float f) { return std::abs(f); } (-3.5);

The output is, of course, "3.5", because the function object is applied to the argument -3.5 and returns 3.5 as the value computed by std::abs. The return type is automatically deduced to be a float but could also be explicitly specified:

  std::cout << [](float f) -> int { return std::abs(f); } (-3.5);	      

What happens here is that the float 3.5 is truncated to the integer 3. The return type can also be preceded by an exception specification with the exact same meaning as in function declarations (this will be a good topic for a future tidbit because C++1x has interesting news in this area).

The open/closed squared brackets which introduce a lambda are called in jargon the lambda-introducer. Many forms of it are possible adding a lot of flexibility. For example:

  float f0 = 1.0;
  std::cout << [=](float f) { return f0 + std::abs(f); } (-3.5);

outputs "4.5". The equal sign enables the lambda to capture - also a term of art - all the automatic variables in scope by value. Changing the example to

  float f0 = 1.0;
  std::cout << [&](float f) { return f0 += std::abs(f); } (-3.5);
  std::cout << '\n' << f0 << '\n';

outputs "4.5", followed on the next line by "4.5". The ampersand lambda-introducer indicates all the automatic variables are captured by reference. Compare it to this variant:

  float f0 = 1.0;
  std::cout << [=](float f) mutable { return f0 += std::abs(f); } (-3.5);
  std::cout << '\n' << f0 << '\n';

where the automatic variables in scope are captured by value as in the first example but can also be modified in the lambda body, which is not normally the case for variables captured by value and would otherwise give a compilation error. This example outputs "4.5" followed by "1", because only the copy held by the lambda is modified while the original f0 is left untouched.

All the above capture syntaxes establish only the default: the code can specify different modes for individual variables. For example:

  float f0 = 1.0f;
  float f1 = 10.0f;
  std::cout << [=, &f0](float a) { return f0 += f1 + std::abs(a); } (-3.5);
  std::cout << '\n' << f0 << '\n';

outputs "14.5" followed again by "14.5" on the next line. In this case f0 is captured by reference while all the other automatics, i.e. f1, are captured by value.

The gist of the various available syntaxes should be clear enough by now. As always, the web is a precious resource for many more. Experimenting with lambdas together with std::sort, std::for_each and the other algorithms in the standard library is also highly recommended.
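As a starting point for such experiments, here is a small sketch combining std::for_each with a lambda that captures a single variable by reference. The function name sum_of_abs is made up for illustration:

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical helper: sums the absolute values of the elements.
inline float
sum_of_abs(const std::vector<float>& v)
{
  float total = 0.0f;

  // Only 'total' is captured, and by reference, so that the lambda
  // can accumulate into the local variable.
  std::for_each(v.begin(), v.end(),
                [&total](float x) { total += std::abs(x); });

  return total;
}
```

For the elements -1.5, 2 and -0.5 the function returns 4. Writing [total] instead of [&total] would be rejected, because a variable captured by value cannot be modified unless the lambda is declared mutable.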

C++11 Tidbits: Lambda Expressions (Part 1)

Completely new in C++1x, lambda expressions may sound a bit esoteric but in fact many programming languages already offer support for defining unnamed functions on-the-fly. Of course many are functional programming languages but there are also Java, Python and other languages in common use. The GCC 4.5.x release series and Microsoft Visual Studio delivered lambdas more than one year ago. In GCC many bugs have been fixed since then and the facility can be considered pretty stable by now. The following, taken directly from the FDIS of the new Standard (*), works as a first introductory example:

#include <algorithm>
#include <cmath>

void
abssort (float *x, unsigned N)
{
  std::sort (x, x + N,
   [](float a, float b) { return std::abs(a) < std::abs(b); } );
}

The lambda expression, introduced by the open square bracket, defines an unnamed function object taking the two parameters a and b and returning the value - of boolean type, automatically deduced - of the comparison between the absolute values of a and b themselves. Then, in the example, the std::sort algorithm can use the so-called closure object to which the lambda expression evaluates.
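The closure object does not have to be a temporary: it can be given a name via auto and then reused. A minimal sketch, where the function name sorted_by_abs and the variable by_abs are made up for illustration:

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical helper: returns a copy of v sorted by absolute value.
inline std::vector<float>
sorted_by_abs(std::vector<float> v)
{
  // The closure has a unique, unnamed type: auto is the natural way
  // to store it in a variable.
  auto by_abs = [](float a, float b) { return std::abs(a) < std::abs(b); };

  std::sort(v.begin(), v.end(), by_abs);
  return v;
}
```

Given the elements 3, -1 and -2 the result is -1, -2, 3.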

A future tidbit will explain, with more practical applications, what the open/closed square brackets are really about. Until then, you are all encouraged to start experimenting right now!

(*) ISO Document N3242. For a large set of examples the older document N2529 is recommended. See some of the references therein for historical background and rationale.

C++11 Tidbits: Introducing generalized constant expressions and constexpr

Consider the following code snippet:

template<int M> struct F { };
F<std::numeric_limits<int>::max()> f; // Error!

Of course the problem is that numeric_limits<>::max is a function and in C++03 its return value cannot be used to instantiate the class template F. Thus the C-style way of using limits, via macros like INT_MAX, is still unavoidable in C++, at least when templates are involved. Also, code like

const int z = numeric_limits<int>::max();

is legal in C++03 but z is dynamically (ie, at run-time), rather than statically initialized. However, the above max function can be considered 'constant', since its return value is certainly known at compile-time. The rationale for generalized constant expressions, that is 'sufficiently simple' functions generalizing constant expressions, should be now pretty clear. Among the goals of the new idea are also improved type-safety (eg, no macros) and portability for code using compile-time evaluation; improved support for system programming, library building, and generic programming. The old paper N2235 summarizes pretty well these issues, and is still recommended reading even if in the meanwhile quite a few technical details have changed (see N3225 and other recent papers).

Thus, looking inside the current C++ runtime library in GCC reveals that functions like max above are decorated with the new constexpr keyword, i.e., simplifying irrelevant details:

static constexpr
int max()
{ return __INT_MAX__; }

that is, max is declared as a constexpr function. Only sufficiently simple functions (eg, the body must consist of a single return statement, no iteration, no changes to the arguments, etc.) can be declared as such but then (assuming the arguments are in turn constant expressions) the function is completely computed and the return value inlined at each call site at compile-time.
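Going back to the opening snippet, this is enough to make it compile in C++11 mode. In the sketch below the member named value is an addition, not part of the original F, there only to make the result observable:

```cpp
#include <climits>
#include <limits>

// Same F as in the opening snippet, plus a member exposing the argument.
template<int M> struct F { static constexpr int value = M; };

// Fine in C++11: max() is constexpr, so the call is a constant
// expression and can be used as a template argument.
F<std::numeric_limits<int>::max()> f;

static_assert(decltype(f)::value == INT_MAX,
              "same value as the C macro, without the macro");
```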

The following are other examples:

constexpr int
square(int x)
{ return x * x; }

constexpr int
abs(int x)
{ return x < 0 ? -x : x; }

constexpr int
fac(int x)
{ return x > 1 ? x * fac(x - 1) : 1; }

float array[square(9)]; // Ok (not C99 VLA)

std::bitset<abs(-87)> s; // Ok

enum { Max = fac(5) }; // Ok

Note that, per the latest specifications, recursion is also allowed. Another clarification, code like:

extern const int medium;
const int high = square(medium); // Ok, dynamic init

is also legal but the call to square boils down to a normal function call, thus high is initialized at run-time because at compile-time the value of medium is not known - it isn't a C++03 constant expression, in other terms.

In C++1x there is also the concept of constant expression data:

constexpr int s = square(5);    // Ok
constexpr int high = square(medium); // error!

And of constant expression constructor:

struct complex
{
  constexpr complex(double r, double i): re(r), im(i) {}
  constexpr double real() const { return re; }
  constexpr double imag() const { return im; }
private:
  double re; double im;
};

constexpr complex I(0, 1);     // Ok
constexpr double i = I.imag(); // Ok

In GCC 4.6 generalized constant expressions already work pretty well (library bits included), but remember that this is still uncharted territory. Thus, please help with testing, file bugs in the GCC Bugzilla, and provide feedback. Hopefully these introductory notes are enough to get you interested.

Wednesday Feb 08, 2012

C++11 Tidbits: Non-static Data Member Initializers

Hi!

Starting this month, thanks also to Chris (Jones)'s help and encouragement, I'm posting here the "C++11 tidbits" which I usually contribute to the Tools group Linux & VM partner newsletter. Enjoy! ;)

Despite the long name, Non-static Data Member Initializers are a rather straightforward new feature. In fact the GCC Bugzilla reveals novice C++ users often tried to use it in C++98, when the syntax was illegal! It must be said that the same feature is also available in Java, so adding it to C++ makes life easier for people using both languages.

The GCC implementation is brand new. It will be available soon in gcc-4.7.0 but it seems already quite stable and ready to play with.

People looking for self-contained specifications, outside the Standard itself, may consider fetching paper N2756 (and its earlier versions) from the ISO web site for more rationale.

In a nutshell, in C++11 the following are both legal:

  struct A
  {
    int m;
    A() : m(7) { }
  };

  struct B
  {
    int m = 7;   // non-static data member initializer
  };

thus the code:

  A a;
  B b;

  std::cout << a.m << '\n';
  std::cout << b.m << std::endl;

prints '7' followed by '7', because both the 'm' member of 'a' and the 'm' member of 'b' are initialized to 7.

A non-static data member initializer can always be overridden, thus:

  struct C
  {
    int m = 7;
    C() : m(14) { }
  };

  C c;

  std::cout << c.m << std::endl;

prints '14', not '7'.

This is actually very useful in practice, because it allows concisely written classes with many constructors, most relying on non-static initializers while default values are overridden for a few, selected data members. For interesting examples see the ISO papers.
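A minimal sketch of that pattern follows. The class Connection and its members are entirely made up for illustration and are not taken from the ISO papers:

```cpp
#include <string>

// Hypothetical class: three constructors, defaults written only once.
struct Connection
{
  std::string host = "localhost";
  int port = 80;
  int timeout_s = 30;

  // Relies on all three initializers.
  Connection() { }

  // Overrides host only; port and timeout_s keep their defaults.
  explicit Connection(const std::string& h) : host(h) { }

  // Overrides host and port; timeout_s keeps its default.
  Connection(const std::string& h, int p) : host(h), port(p) { }
};
```

Constructing Connection("example.org", 8080) gives port 8080 and timeout_s 30, while a default-constructed object has host "localhost". In C++98 each constructor would have had to repeat every default in its own mem-initializer list.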

In the examples we have been using built-in integer types, but the feature works with any kind of data member, for example std::string, std::vector, or any user-defined type. It also integrates nicely with other C++11 features like initializer lists. For example, the following is perfectly legal:

  struct D
  {
    std::vector<int> m{4, 5, 6};
  };

The code:

  D d;

  std::cout << d.m[0] << '\n';
  std::cout << d.m[1] << '\n';
  std::cout << d.m[2] << std::endl;

then prints '4', '5', and '6', on separate lines.

More non-trivial examples are available in the GCC test suite under g++.dg/cpp0x/nsdmi* and also in the C++ runtime library internals where the new construct is already exploited for the implementation of <mutex>. See, for example, once_flag, __mutex_base, and <condition_variable>.

As for all the other new C++ features, please don't hesitate to report bugs!

Monday Jun 27, 2011

A short but intense GCC Gathering in London

About one week ago I joined many long-time GCC friends and acquaintances in London for a gathering organized by Google (in particular, I guess Diego and Ian should be thanked). It lasted only a weekend, and I wasn't able to attend on Sunday morning, but it was a very good occasion to raise some issues in a very relaxed way, in particular those at the border between areas of competence, which are the most difficult to discuss during normal work days. If you are interested in a general overview and some notes this is a good link:

http://gcc.gnu.org/wiki/GCCGathering2011

As you may easily guess, the third topic is mine, which I managed to have up quite early on Friday morning thanks to the votes of some good friends like Dodji (the ordering of the topics resulted from democratic voting on Friday evening!). I learned a lot from the discussion: for example, that the new C++11 'final' should certainly be exploited extensively in the C++ front-end; the various reasons why devirtualization can be quite tricky (but I'm really confident that Martin and Honza are going to make good progress, also based on a set of short testcases which I promised to collect); that, as explained by Ian, the gold linker already implements the nice --icf (Identical Code Folding) facility, which some friends of mine are definitely going to like (however, see: http://sourceware.org/bugzilla/show_bug.cgi?id=12919). I also enjoyed the observations made by Lawrence, who remarked that in C++11 we are going to see more pointer iterations implicitly produced by the new range-based for-loop, and we really want to make sure the loop optimizers are able to deal with those as well as with loops explicitly using a counter.

All in all, I really hope we are going to do it again!

Wednesday May 25, 2011

@CERN

Hi,

just a short note, to tell you that a couple of weeks ago I had the pleasure to be invited to give a lecture at CERN, in the "Computing Seminar" series:

http://indico.cern.ch/conferenceDisplay.py?confId=131493

As you may imagine, it was also a great occasion to learn more about the ongoing experiments - @LHC mainly, but elsewhere too - and meet many groups of researchers running almost exclusively Linux, and Oracle Linux in many cases on their computers (*).

Among the many visits to interesting sites in the campus, I have also been to the Computing Center, sponsored by Oracle, where I saw the huge StorageTek machines:

http://itknowledgeexchange.techtarget.com/eye-on-oracle/oracle-with-new-5-terabyte-storagetek-tape-drive/

In the same building, I afterwards met the researchers belonging to the OpenLab project:

https://proj-openlab-datagrid-public.web.cern.ch/proj-openlab-datagrid-public/

and discussed various topics having to do with parallel computing in C++ on Intel and AMD machines. Separately, I also met other C++ power users, belonging to the Geant 4 team:

http://www.geant4.org/geant4/

and to the Root team:

http://root.cern.ch/drupal/

In both cases I learned about their needs vs C++ and its runtime library, and sometimes was able to suggest specific C++0x features to try together with the latest GCC releases, which could potentially improve their software, from the performance point of view or in some other way.

Overall, I had a great time and enjoyed fantastic hospitality over there (I have to thank Vincenzo Innocente, in particular, for that!)

... only, where I live in Italy, I can have far better perch, believe me! ;)

If you have specific curiosities, don't hesitate to contact me!

Cheers,
Paolo.

(*) As an amusing anecdote, at some point I joined a tour of the LHC Control Room. A physicist said to a bunch of high school students coming from Milan: "We mostly use off-the-shelf computers very similar to what you have at home, only we use Linux, not Windows, because we like to be able to control and change the OS we run". Eh! ;)

Saturday Apr 09, 2011

... FDIS, finally!


Wednesday Jan 27, 2010

A couple of nice C++0x readings


Monday Dec 21, 2009

An old and new std::locale vs multithreading issue

About

C++ enthusiasts only, please! ;)
