PostgreSQL generate_series() Tricks

Before my current job, I actually had not heard of PostgreSQL. But it turns out to actually be a pretty prominent SQL server. There are some weird quirks to Postgres and the management tools are pretty meh compared to SQL Server Management Studio, but there are a lot of neat features I keep bumping into. One such feature is the generate_series() function. Although a table with one column of consecutive integers sounds boring, there are a lot of interesting uses for having a “numbers table.” For example, when you run a SELECT sum(data) FROM table GROUP BY date query, you might have missing dates where the sum is zero. If you use your numbers table to add days to a start date, you can join that to your query to make sure no days are missed. However, Postgres makes a numbers table obsolete with the generate_series() function. The syntax is simple and the result is what you would expect:

select generate_series(1,4);

 generate_series
-----------------
               1
               2
               3
               4
(4 rows)

So here is the example from above where you want to view grouped data and you want to be sure you don’t miss any days without data. We’re also going to use generate_series() to make some simulated data!

with simul_data as(
    --Give me a random date betwen 8/1 and 8/7
    select (cast(trunc(random() * 7) as int)) + date '8/1/2013' as myDate
    --Give me a random number
    ,cast(trunc(random() * 500 + 1) as int) as data
    --Make 10 rows
    from generate_series(1,10)
)
,full_dates as (
    --Select every date between 8/1 and 8/7
    select generate_series(0,6) + date '8/1/2013' as fulldate
)

--If we do a regular aggregate, here's what you get:
select mydate,coalesce(sum(data),0) as data_sum
from simul_data
group by mydate
order by mydate;

  fulldate  | data_sum
------------+----------
 2013-08-01 |      566
 2013-08-02 |      455
 2013-08-03 |      291
 2013-08-04 |      598
 2013-08-06 |      741
 2013-08-07 |      476
(6 rows)

Notice the missing date? To force it in place, use a join.

select fulldate,coalesce(sum(data),0) as data_sum
from full_dates
left join simul_data on full_dates.fulldate=simul_data.mydate
group by fulldate
order by fulldate;

  fulldate  | data_sum
------------+----------
 2013-08-01 |      566
 2013-08-02 |      455
 2013-08-03 |      291
 2013-08-04 |      598
 2013-08-05 |        0
 2013-08-06 |      741
 2013-08-07 |      476
(7 rows)

I have one more example that is a bit esoteric, but I actually used it to generate a report the other day. One of our database tables has a unique two-digit identifier that consists of two letters. I wanted to see which of the 26² two-letter codes were still available. To do this, I used generate_series() and chr() to give me a list of letters. I then created a Cartesian product of the data which I could join with the live data.

with list as(
    --65 in ASCII is "A" and 90 is "Z"
    select chr(generate_series(65,90)) letter
)
select t1.letter||t2.letter combo
from list t1
--join every letter with every other letter
cross join (select * from list) t2;

 combo
-------
 AA
 AB
 AC
 AD
 AE
 [...]
 ZV
 ZW
 ZX
 ZY
 ZZ
(676 rows)

Know any other nice uses of generate_series() or Postgres in general? Share them in the comments!