Over the downtime this past week I've been dabbling again at writing a generic enough function object that I plan to bring into the Boost C++ Library collection. I found out that programming every day with C++, I still find some relatively simple things really hard to do even with the STL. One of these simple(r) problems is selecting the first or second element of a pair contained in STL containers.
The problem that recurs pretty much in every program that uses STL maps/multi-maps is to be able to first select the first or second element from every pair in the container, and the second to be able to apply a function only on the first or second element to every pair in the container. There's already an SGI extension to the STL called select1st and select2nd but these extensions require you to give the type of the pair it will take upon instantiation. This isn't very generic and tends to be a very rigid coupling between the solution that is selecting the element and the type(s) of the elements contained in a pair.
Rudimentary Solution
The first solution I came up with is a simple select object. This first solution is really simple -- it's a function object which determines the result of getting the first/second element of a pair, and then takes a pair as an argument to return the correct element chosen. It looks a little like this:
namespace helper {
template <int N, class T0, class T1>
struct result;
template <class T0, class T1>
struct result<0, T0, T1> {
typedef T0 type;
};
template <class T0, class T1>
struct result<1, T0, T1> {
typedef T1 type;
};
};
template <int N>
struct select;
template <>
struct select<0> {
template <class T0, class T1>
typename helper::result<0, T0, T1>::type
operator() (std::pair<T0, T1> const & input)
const {
return input.first;
};
};
template <>
struct select<1> {
template <class T0, class T1>
typename helper::result<1, T0, T1>::type
operator() (std::pair<T0, T1> const & input)
const {
return input.second;
};
};
Whew! That was a handful. So if you notice, there are a few problems with this approach, and let me go into detail into some of them.
- Repetitive code with full specialization of select. It is obvious that the repetitive nature of the code makes it really brittle and "ugly". However, technically we've captured the required functionality if only for the simpler cases.
- Reference and const awareness in the result type computer. Advanced C++ developers will see that the implementation of the result computer will break in cases where the pair contains references or non-copyable types. This is because we return copies of the element instead of references -- in certain cases as well, this is a performance problem especially if we're copying huge objects around.
- We can only select elements, not immediately perform operations on them. This is because this implementation pretty much makes it impossible to chain this with the other STL function objects (binder comes to mind, among others).
- Reliance on STL pair's. Obviously, if we used TR1 or Boost (Fusion) Tuples, we wouldn't be able to extend this readily -- unless we're prepared to write out all the specializations of the select function object from select<3> upwards.[0]
Let me then try to address these issues in the next sections.
Take 2: Remove Explicit RepititionOur first bullet item talks about repetitive code, and is one of the relatively easier things we can deal with. In this approach, we will require three things: a wrapper type to return a convertible object (convertible to the correct type), a generator function which will determine the types for us at compile time, and a "generalized" element getter. So in this step, we add another layer of indirection we can deal with.
The implementation would then include the following additions/changes:
namespace helper {
template <int N, class T0, class T1>
struct get_impl;
template <class T0, class T1>
struct get_impl<0, T0, T1> {
typedef typename
result<0, T0, T1>::type
result_type;
explicit get_impl(
std::pair<T0, T1> const & pair
) : _pair(pair) { };
operator result_type const & () {
return _pair.first;
};
private:
std::pair<T0, T1> const &
_pair;
};
template <class T0, class T1>
struct get_impl<1, T0, T1> {
typedef typename
result<1, T0, T1>::type
result_type;
explicit get_impl(
std::pair<T0, T1> const & pair
) : _pair(pair) { };
operator result_type const & () {
return _pair.second;
};
private:
std::pair<T0, T1> const &
_pair;
};
// generator function
template <int N, class T0, class T1>
get_impl<N, T0, T1>
get(std::pair<T0, T1> const & pair) {
return get_impl<N, T0, T1>(pair);
};
};
template <int N>
struct select {
template <class T0, class T1>
typename result<N, T0, T1>::type const &
operator() (std::pair<T0, T1> const & pair)
const {
return get<N>(pair);
};
};
The above looks like more code and it is, but it actually allows us to scale our implementation of the act of getting the 'Nth' element [1]. But let me go through the approach we've taken here.
If you notice,
helper::get_impl<...> is convertible to
result_type const &. This means when we return an object of type
helper::get_impl<0, int, double>, it can be converted to an
int that is encapsulated in the pair passed in the constructor of the wrapper object. This approach allows us to make the implementation of
template <int N> struct select { ... } simpler.
The next problem we need to deal with is the issue of reference and const-awareness.
Take 3: Handling References and Const AwarenessSo now we'll have problems if we had pairs that had references included instead of just plain types. It is not impossible to have std::pair
's passed to instances of the selector -- even though highly unlikely to be part of containers. So we'll deal with references, so that we have a proper solution.
To deal with this, we need to check if the type we're going to return in helper::result<...>::type will be the correct reference. So we need to remove the references from T0 or T1, in the typedef. We can do this with Boost.Type_Traits:
namespace helper {
template <class T0, class T1>
struct result<0, T0, T1> {
typedef typename
boost::remove_reference<T0>::type
type;
};
template <class T0, class T1>
struct result<1, T0, T1> {
typedef typename
boost::remove_reference<T1>::type
type;
};
};
So it's getting more and more complex as we go, and admittedly the solution to this seemingly simple problem touches some of the subtleties of the genericity we achieve with C++ and the STL.
To Be Continued...
In the next article in this line, I will dive a little deeper into the problems we're trying to avoid and real-life use cases for this simple utility. For now, you can imagine a simple solution involving an STL algorithm and the select<#>. You can even try it for yourself by getting the latest version of this implementation from the Boost C++ Developers Mailing List (or if you'd like to get it privately, send me an email).
Basically, the idea I want to be able to convey is that in C++, with great power comes great flexibility -- and this flexibility borders between being really generic and being very useful. What this flexibility and power gives us C++ developers are the tools we can use and manipulate to be able to make simple things really simple -- even though implemented in a not so simple manner.
If you haven't been diving deeper into C++'s features to make your solutions more generic and more usable outside the context of your application, this may be a good time to be able to solve bigger problems that not only you are encountering, but also what others are solving. You never know what simple problems you'd run into and what kinds of challenges you'll be facing. Even in this simple case of implementing a generic selector, we're already running into issues you might not be facing in your everyday programming -- but it allows us to get more acquainted with the very powerful C++ features already available to you at your disposal.
Notes & References
[0] - Surprisingly enough, the solution underneath any tuple supporting approach pre-C++0x will require having to use Preprocessor Programming and to some extent code generation/repitition, albeit in a "managed" manner.
[1] - I'm still currently testing the implementation of the extension to supporting TR1/Boost.Fusion tuple selection. The next article should follow soon with more details as to what's involved in making that implementaiton possible.