Oleg (obartunov) wrote,

Hstore development for 9.4 release

In this post I summarize what we have done already for hstore on the way to PostgreSQL 9.4 and what we plan to do. Most results of our development we already presented at PGCon-2013 in Ottawa this May (check our slides, thanks Engine Yard for support !), so here are some comments.

1. Original hstore is a datatype for storing key-value pairs with rich set of functions and operators. Hstore quickly became very popular because of its good performance (provided by GIN, GiST indexes) and flexibility of KV-model. It is mature (10 years) and stable PostgreSQL extension.

2. PostgreSQL needs an efficient support for document-based model (CouchDB, MongoDB,...), which is currently can be modelled using json datatype. Json is a relatively new datatype with one major drawback - it's basically a string, which should be parsed and analysed every time it accessed, so query can be awfully slow.

3. Hstore performance is based on its binary representation, which means that hstore doesn't stored on disk as raw data (string), but in structure, contained keys and values, so access to them doesn't require parsing. Moreover, quick access to keys and values, gave us an opportunity to develop indexes support.

There is another benefit of using hstore in compare with simple (id,key,value) model, connected with header overhead (36-bytes?) in PostgreSQL. Instead of storing, say 20 rows (id,key,value), hstore can store all of them in one row and save storage.


1. Json is a text string
SELECT '{"a":1,"b":2}'::json;
0000: 7B 22 61 22 | 3A 31 2C 22 {"a":1,"
0008: 62 22 3A 32 | 7D b":2}

2. Hstore is a binary string
SELECT 'a=>1, b=>2'::hstore;
0000: 02 00 00 80 │ 01 00 00 80 ☻ А☺ А
0008: 02 00 00 00 │ 03 00 00 00 ☻ ♥
0010: 04 00 00 00 │ 61 31 62 32 ♦ a1b2

Quick performance comparison using synthetic data:

CREATE TABLE hstore_test AS (SELECT 'a=>1, b=>2, c=>3, d=>4, e=>5'::hstore AS v FROM generate_series(1,1000000));

CREATE TABLE json_test AS (SELECT '{"a":1, "b":2, "c":3, "d":4, "e":5}'::json
AS v FROM generate_series(1,1000000));

SELECT sum((v->'a')::text::int) FROM json_test;
1291,060 ms
SELECT sum((v->'a')::int) FROM hstore_test;
303,267 ms

We can see, that just binary representation (no indexes) of hstore gave us about four times win in compare with json.

4. Json is the widely used standart  and PostgreSQL really needs an efficient json datatype (already popular even as a string),  so we decided to develop a binary representation for nested objects with array support, which can be used for hstore and eventually for json.

Why did we choose hstore and didn't not start with json ? Json is a core datatype, any changes in core usually requires more efforts, while hstore is an extension we could play with. It was not clear in advance if we will succeed, so we wanted to have a freedom in our prototyping. Once we have an efficient binary representation,  other developers could start working on json support.

5. So the results of our development are:

a) hstore now can be array-like '{a,b}', '{a}', hash-like 'a=>1, b=>c', '{a=>b, b=>"c"}' and their combination '{a=>1}, {1,2,3}, {c=>{d,f}}'. See slides 12-18 for hstore internals.

b) it has rather weak limitations (see slide 19)

c) There are several GUC-variables, which configure the look of hstore at output to easy integration with programming languages - which kind of brackets (curly or square) use for arrays and how to embed hstore into brackets.

d) There are several operators for hstore to access values by key, by path, array element can be accessed by its index. Operators can be chained, for example:

=# select 'a=>1,b=>{c=>3,d=>{4,5,6}},1=>f'::hstore %> 'b'->'c';
=# select 'a=>1,b=>{c=>3,d=>{4,5,6}},1=>f'::hstore #%> '{b,d}'->0;

Path is an array of keys, which describes a specific position in hstore, an operator starts to work from.

e) hstore can be converted to json and vice versa, for example:

=# select hstore_to_json('a=>1,b=>{c=>3,d=>{4,5,6}}'::hstore);
{"a": "1", "b": {"c": "3", "d": ["4", "5", "6"]}}

=# select json_to_hstore('{"a": "1", "b": {"c": "3", "d": ["4", "5",
"a"=>"1", "b"=>{"c"=>"3", "d"=>["4", "5", "6"]}

These functions are used for casting - json::hstore, hstore::json. There is one problem with such conversion - right now hstore has no types support (value is a string) as json, so number 3.14 in json became a string "3.14"

=# select '{"a":3.14}'::json::hstore::json;
{"a": "3.14"}

and scalar '3.14' became an array with string "3.14"

=# select '3.14'::json::hstore::json;

Lossless conversion json <-> hstore will be possible, once hstore gets types support, which we plan to do for 9.4.

To facilitate working with nested hstore (can be very complex), we provide an ability of pretty printing of hstore in psql, controlled by GUC-variable hstore.pretty_print, see slide 38 for example.

Hstore provides many other operators and functions, which I will not describe here, see more details in presentation.

f) Hstore indexes. Hstore has two entities - keys and values. Index of all combinations (key, values) pairs will be very big, so we index keys and values separately, using prefixes K for keys, V for values and E for array element. Hstore has indexing support (GIN, GiST) for @>, ?, ?|, ?& operators.

e) Performance test. We used 1,252,973 Delisious bookmarks, which we load into database as json (table js), text (table tx) and hstore (table hs) data types. We tested input performance (copy data into table), access performance (get value by key) and search performance (contains operator) on desktop machine running Ubuntu with 8 Gb of RAM and 4-core Xeon 3.2 Ghz.

The bookmarks looks like:
 =# select h from hs limit 1;
     "id"=>"http://delicious.com/url/b5b3cbf9a9176fe43c27d7b4af94a422#mcasas1",                +
     "link"=>"http://www.theatermania.com/broadway/",                                          +
     "tags"=>                                                                                  +
     {                                                                                         +
         {                                                                                     +
             "term"=>"NYC",                                                                    +
             "label"=>NULL,                                                                    +
             "scheme"=>"http://delicious.com/mcasas1/"                                         +
         }                                                                                     +
     },                                                                                        +
     "links"=>                                                                                 +
     {                                                                                         +
         {                                                                                     +
             "rel"=>"alternate",                                                               +
             "href"=>"http://www.theatermania.com/broadway/",                                  +
             "type"=>"text/html"                                                               +
         }                                                                                     +
     },                                                                                        +
     "title"=>"TheaterMania",                                                                  +
     "author"=>"mcasas1",                                                                      +
     "source"=>NULL,                                                                           +
     "updated"=>"Tue, 08 Sep 2009 23:28:55 +0000",                                             +
     "comments"=>"http://delicious.com/url/b5b3cbf9a9176fe43c27d7b4af94a422",                  +
     "guidislink"=>"false",                                                                    +
     "title_detail"=>                                                                          +
     {                                                                                         +
         "base"=>"http://feeds.delicious.com/v2/rss/recent?min=1&count=100",                   +
         "type"=>"text/plain",                                                                 +
         "value"=>"TheaterMania",                                                              +
         "language"=>NULL                                                                      +
     },                                                                                        +

(1 row)

The results are below:

Input performance (copy tt from '/path/to/test.dump'):

text: 57s, json: 61s, hstore: 76s

It was clear, that hstore should be slowest on input, since hstore requires preprocessing, but it shouldn't be such slow, so we will work in iput improving.

=# \dt+
List of relations
Schema | Name | Type | Owner | Size | Description
public | hs | table | postgres | 1379 MB |
public | js | table | postgres | 1322 MB |
public | tx | table | postgres | 1322 MB |

We see about 4% overhead of hstore, which is expected, since we have to store more informations than json and text, which are just strings.

Access performance

 base: 0.3s, hstore: 0.5s, json: 11s, regexp: 19.8 s.

Here base is time of sequentual scan of table:
    select h from hs;
    select h->'updated' from hs;
    select j->>'updated' from js;

We can use regexp to get key from text:
    select (regexp_matches(t,'"updated":"([^"]*)"'))[1] from tx;

One can see, that binary representaion of hstore provides 50x speedup in compare with json !  speedup = (hstore-base)/(json-base)

Search performance

We tested operator contains @> for hstore (seqscan, GiST, GIN), json (estimation time, functional GiST and GIN indexes using conversion functions ).

    select count(*) from hs where h @> 'tags=>{{term=>NYC}}';
    select count(*) from js where j#>>'{tags,0,term}' = 'NYC';

We don't know how to search over all array elements (tags) in json, so we search just the first element and the resulting time will be lowest estimation of sequentual search time.

hstore:  seqscan    GiST  GIN
            0.98s   0.3s  0.1s
                    3X    10X
json:    estimation GiST  GIN 
            12.0s   2s    0.1s   
                    6X    120X

We see that json could get 120X speedup using GIN index and not much from GiST, since the last is lossy index and recheck calls json_to_hstore() function many times.

So, the summary of our results:

1. Hstore is now nested and supports arrays, which means moving from simple key-value model to rich document-based model.
2. Hstore access to specified field is fast (thanks to binary representation)
3. Hstore operators can use GiST and GIN indexes
4. Json users can use functional GIN index and get considerable speedup
5. Hstore's binary representation can be used by json

Plans for future development:

We have a lot to work :)  For 9.4 release we plan to work on adding types support in binary representation, so it can be used by json and improving indexing. Also, we need to develop hstore query language, which allows better manipulation with hstore elements.

Below are some projects (not in any specific order).

- Speedup hstore input
- Better indexing - SP-GiST-GIN hybrid index
- Statistics support (challenging task)
- Documentation (9.4, need help)
- Submit patch for 9.4
- Add binary representation to json
- Add index support for json

One very interesting improvement can be obtained if our patch to GIN will be accepted. More details about this can be find in other post  Yet another performance improvement for hstore @> hstore ! , so here is only result  - improved GIN+ index can provide an additional speedup for @> operator.  It is still discussed in -hackers list, but I hope it will be eventually  accepted for 9.4.

hstore:  seqscan    GiST  GIN   GIN+
            0.98s   0.3s  0.1s  0.017s
                    3X    10X   57X

I should say, that this work would not be done without support of Engine Yard company and we hope the support will continue.

Tags: hstore, pg, pgen

  • What is the size of jsonb ?

    It's interesting to know how big is your jsonb and it's not trivial as it looks, since jsonb is a binary format and also may be TOASTed (compressed).…

  • Partial TOAST decompression for jsonb

    Inspired by commit support for partial TOAST decompression "When asked for a slice of a TOAST entry, decompress enough to return the slice instead…

  • Bloom index for bigint

    Bloom index by default works for int4 and text, but other types with hash function and equality operator could be supported. Just use opclass…

  • Post a new comment


    Anonymous comments are disabled in this journal

    default userpic

    Your IP address will be recorded