PostgreSQL generate_series() Tricks

Before my current job, I actually had not heard of PostgreSQL. But it turns out to actually be a pretty prominent SQL server.

There are some weird quirks to Postgres and the management tools are pretty meh compared to SQL Server Management Studio, but there are a lot of neat features I keep bumping into. One such feature is the generate_series() function. Although a table with one column of consecutive integers sounds boring, there are a lot of interesting uses for having a “numbers table.” For example, when you run a SELECT sum(data) FROM table GROUP BY date query, you might have missing dates where the sum is zero. If you use your numbers table to add days to a start date, you can join that to your query to make sure no days are missed. However, Postgres makes a numbers table obsolete with the generate_series() function.

The syntax is simple and the result is what you would expect:

select generate_series(1,4);
 generate_series 
-----------------
               1
               2
               3
               4
(4 rows)

So here is the example from above where you want to view grouped data and you want to be sure you don’t miss any days without data. We’re also going to use generate_series() to make some simulated data!

with simul_data as(
	--Give me a random date betwen 8/1 and 8/7
	select (cast(trunc(random() * 7) as int)) + date '8/1/2013' as myDate
	--Give me a random number
	,cast(trunc(random() * 500 + 1) as int) as data
	--Make 10 rows
	from generate_series(1,10))
,full_dates as (
	--Select every date between 8/1 and 8/7
	select generate_series(0,6) + date '8/1/2013' as fulldate
)

--If we do a regular aggregate, here's what you get:
select mydate,coalesce(sum(data),0) as data_sum 
from simul_data 
group by mydate 
order by mydate;

  fulldate  | data_sum 
------------+------------
 2013-08-01 |        566
 2013-08-02 |        455
 2013-08-03 |        291
 2013-08-04 |        598
 2013-08-06 |        741
 2013-08-07 |        476
(6 rows)

--Notice the missing date? To force it in place, use a join.

select fulldate,coalesce(sum(data),0) as data_sum 
from full_dates 
	left join simul_data on full_dates.fulldate=simul_data.mydate 
group by fulldate 
order by fulldate;

  fulldate  | data_sum 
------------+------------
 2013-08-01 |        566
 2013-08-02 |        455
 2013-08-03 |        291
 2013-08-04 |        598
 2013-08-05 |          0
 2013-08-06 |        741
 2013-08-07 |        476
(7 rows)

I have one more example that is a bit esoteric, but I actually used it to generate a report the other day. One of our database tables has a unique two-digit identifier that consists of two letters. I wanted to see which of the 262 two-letter codes were still available. To do this, I used generate_series() and chr() to give me a list of letters. I then created a Cartesian product of the data which I could join with the live data.

with list as(
    --65 in ASCII is "A" and 90 is "Z"
    select chr(generate_series(65,90)) letter
)
select t1.letter||t2.letter combo 
from list t1
    --join every letter with every other letter 
    cross join (select * from list) t2;

 combo 
-------
 AA
 AB
 AC
 AD
 AE
[...]
 ZV
 ZW
 ZX
 ZY
 ZZ
(676 rows)

Know any other nice uses of generate_series() or Postgres in general? Share them in the comments!

Tagged on: ,

2 thoughts on “PostgreSQL generate_series() Tricks

  1. red

    Thanks for this article!

    I used generate_series this week to quickly populate a table with X of the (quasi-)same row with the following pseudo-query:

    INSERT INTO singular_items (catalog_item_id, tracking_id)
    SELECT
    1 AS catalog_item_id,
    RANDOM() AS tracking_id
    FROM
    generate_series(1, X)

    X had to be crafted manually into the SQL query string every time but this worked wonderfully and took about 30m to insert 1000 rows at once when inserting 1000 rows with 1000 SQL statements took close to five minutes.

Leave a Reply

Your email address will not be published. Required fields are marked *