PrevUpHomeNext

Self Joins

The problem ...

Suppose we want to generate all possible double features. We might try this:

const query<std::tuple<movie, movie>> double_features = join(movies, movies);  // wrong

But let's recall one of the characteristics of the join() free function:

Its value mapper is a new tuple_mapper<T0, T1, ...), in which each member mapper is identical to the respective qi's value mapper.

In our case, then, both member mappers are identical to movies's member mapper -- and hence identical to each other. That's disappointing, because we need two different member mappers: one for the first movie in each pair and one for the second, so that code like:

for (const std::tuple<movie, movie> &df: double_features)
    std::cout << std::get<0>(df).title << " with " << std::get<1>(df).title << std::endl;

can extract one movie into element 0 of df and another movie into element 1.

Here's another way to see the problem. Suppose we want to find all double features that are worth watching, i.e. we want to exclude those that consist of the same movie twice over. Then we might try building the query this way:

const query<std::tuple<movie, movie>> heterogeneous_double_features =
    movies.inner_join(movies, movies->id != movies->id);  // wrong

but it's clear that heterogeneous_double_features, if it could run at all, would not produce any output, since the condition movies->id != movies->id is always patently false.

... and the solution: table_alias

Every table has an alias() method, which creates and returns a table_alias object. Let's start with that:

table_alias<movie> movies_alias = movies.alias();

A table_alias is very much like the table that made it. movies_alias represents the same SQL table that movies represents, and, like movies, it is a kind of query. So it can be executed, or used to build more complex queries. It has a value type (the same as movies's), and it has a value mapper.

And its value mapper is very much like movies's value mapper: it has the same type (class_mapper<movie>), and it knows how to map between movie objects and columns in the target SQL table. But the two mappers are not identical: when movies_alias and movies are used in the same query, their mappers behave as if two distinct SQL tables were involved, and each mapper belonged to a different one.

This lets us solve the first problem case:

const query<std::tuple<movie, movie>> double_features = join(movies, movies_alias);

for (const std::tuple<movie, movie> &df: double_features)
    std::cout << std::get<0>(df).title << " with " << std::get<1>(df).title << std::endl;

And the second:

const query<std::tuple<movie, movie>> heterogeneous_double_features =
    movies.inner_join(movies_alias, movies->id != movies_alias->id);

We can have as many aliases of a table as we want: each call to alias() creates a new one:

table_alias<movie> movies_alias2 = movies.alias();

const query<std::tuple<movie, movie, movie>> triple_features =
    join(movies, movies_alias, movies_alias2);

What we cannot do, however, is make aliases of arbitrary queries. Only tables have aliases. So if we wanted to generate all possible double features of local hit movies, using the query local_hits, we could not do so directly. The solution would be to build a query local_hits2, exactly like local_hits except built using movies_alias instead of movies. Then we would call join(local_hits, local_hits2).

I like to think that a future version of quince will provide alias() on arbitrary queries, but until then, I concede that this is a lapse of compositionality.


PrevUpHomeNext