Before my current job, I actually had not heard of PostgreSQL. But it turns out to actually be a pretty prominent SQL server.
There are some weird quirks to Postgres and the management tools are pretty meh compared to SQL Server Management Studio, but there are a lot of neat features I keep bumping into. One such feature is the generate_series()
function. Although a table with one column of consecutive integers sounds boring, there are a lot of interesting uses for having a “numbers table.” For example, when you run a SELECT sum(data) FROM table GROUP BY date
query, you might have missing dates where the sum is zero. If you use your numbers table to add days to a start date, you can join that to your query to make sure no days are missed. However, Postgres makes a numbers table obsolete with the generate_series()
function.
The syntax is simple and the result is what you would expect:
select generate_series(1,4); generate_series ----------------- 1 2 3 4 (4 rows)
So here is the example from above where you want to view grouped data and you want to be sure you don’t miss any days without data. We’re also going to use generate_series()
to make some simulated data!
with simul_data as( --Give me a random date betwen 8/1 and 8/7 select (cast(trunc(random() * 7) as int)) + date '8/1/2013' as myDate --Give me a random number ,cast(trunc(random() * 500 + 1) as int) as data --Make 10 rows from generate_series(1,10)) ,full_dates as ( --Select every date between 8/1 and 8/7 select generate_series(0,6) + date '8/1/2013' as fulldate ) --If we do a regular aggregate, here's what you get: select mydate,coalesce(sum(data),0) as data_sum from simul_data group by mydate order by mydate; fulldate | data_sum ------------+------------ 2013-08-01 | 566 2013-08-02 | 455 2013-08-03 | 291 2013-08-04 | 598 2013-08-06 | 741 2013-08-07 | 476 (6 rows) --Notice the missing date? To force it in place, use a join. select fulldate,coalesce(sum(data),0) as data_sum from full_dates left join simul_data on full_dates.fulldate=simul_data.mydate group by fulldate order by fulldate; fulldate | data_sum ------------+------------ 2013-08-01 | 566 2013-08-02 | 455 2013-08-03 | 291 2013-08-04 | 598 2013-08-05 | 0 2013-08-06 | 741 2013-08-07 | 476 (7 rows)
I have one more example that is a bit esoteric, but I actually used it to generate a report the other day. One of our database tables has a unique two-digit identifier that consists of two letters. I wanted to see which of the 262 two-letter codes were still available. To do this, I used generate_series()
and chr()
to give me a list of letters. I then created a Cartesian product of the data which I could join with the live data.
with list as( --65 in ASCII is "A" and 90 is "Z" select chr(generate_series(65,90)) letter ) select t1.letter||t2.letter combo from list t1 --join every letter with every other letter cross join (select * from list) t2; combo ------- AA AB AC AD AE [...] ZV ZW ZX ZY ZZ (676 rows)
Know any other nice uses of generate_series()
or Postgres in general? Share them in the comments!