As the number of active Instagram users has grown continuously, Postgres has remained our solid foundation and the canonical data store for most of the data our users create. Although less than a year ago we wrote about how we handled a large volume of data on Instagram at 90 likes per second, we now process more than 10,000 likes per second, and our primary data storage technology has not changed. Over the past two and a half years we have learned a few things and picked up a few tools for scaling Postgres, and we want to share them: the things we wish we had known when we started Instagram. Some of them are specific to Postgres; others apply to other databases as well. To learn how we scale Postgres horizontally, see our post Sharding and IDs at Instagram.

1. Partial Indexes

If you often filter your queries on a particular characteristic, and that characteristic is present in only a minority of your rows, a partial index can be a big win. For example, when searching tags on Instagram, we try to surface the tags that are likely to contain many photos. While we use technologies such as ElasticSearch for fancier searches in our application, this is one case where the database does well enough on its own. Let's see how Postgres behaves when searching tag names and sorting by number of photos:

EXPLAIN ANALYZE SELECT id FROM tags WHERE name LIKE 'snow%' ORDER BY media_count DESC LIMIT 10;

QUERY PLAN
Limit  (cost=1780.73..1780.75 rows=10 width=32) (actual time=215.211..215.228 rows=10 loops=1)
  ->  Sort  (cost=1780.73..1819.36 rows=15455 width=32) (actual time=215.209..215.215 rows=10 loops=1)
        Sort Key: media_count
        Sort Method: top-N heapsort  Memory: 25kB
        ->  Index Scan using tags_search on tags_tag  (cost=0.00..1446.75 rows=15455 width=32) (actual time=0.020..162.708 rows=64572 loops=1)
              Index Cond: (((name)::text ~>=~ 'snow'::text) AND ((name)::text ~<~ 'snox'::text))
              Filter: ((name)::text ~~ 'snow%'::text)
Total runtime: 215.275 ms
(8 rows)

Notice that Postgres has to sort through about 15,000 rows to get the correct result. Since tags (for example) follow a long-tail pattern, we can instead try to surface only the tags that have 100 or more photos first, so we create:

CREATE INDEX CONCURRENTLY ON tags (name text_pattern_ops) WHERE media_count >= 100;

And our query plan now looks like this:

EXPLAIN ANALYZE SELECT * FROM tags WHERE name LIKE 'snow%' AND media_count >= 100 ORDER BY media_count DESC LIMIT 10;

QUERY PLAN
Limit  (cost=224.73..224.75 rows=10 width=32) (actual time=3.088..3.105 rows=10 loops=1)
  ->  Sort  (cost=224.73..225.15 rows=169 width=32) (actual time=3.086..3.090 rows=10 loops=1)
        Sort Key: media_count
        Sort Method: top-N heapsort  Memory: 25kB
        ->  Index Scan using tags_tag_name_idx on tags_tag  (cost=0.00..221.07 rows=169 width=32) (actual time=0.021..2.360 rows=924 loops=1)
              Index Cond: (((name)::text ~>=~ 'snow'::text) AND ((name)::text ~<~ 'snox'::text))
              Filter: ((name)::text ~~ 'snow%'::text)
Total runtime: 3.137 ms
(8 rows)

Notice that Postgres now only has to visit around 169 rows, which is much faster. Postgres' query planner is also good at evaluating constraints: if you later decide that you want only tags with at least 500 photos, i.e. a subset of this index, it will still use the correct partial index.
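For instance, a later query that asks only for tags with at least 500 photos (a hypothetical follow-up, not one of the measurements above) still satisfies the index predicate, since media_count >= 500 implies media_count >= 100:

EXPLAIN ANALYZE SELECT id FROM tags WHERE name LIKE 'snow%' AND media_count >= 500 ORDER BY media_count DESC LIMIT 10;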

2. Functional Indexes

On some of our tables we need to index strings (for example, 64-character base64 tokens) that are fairly long, and creating an index on the full strings would duplicate a large amount of data. In this case, Postgres' functional indexes can be very useful:

CREATE INDEX CONCURRENTLY ON tokens (substr(token, 0, 8));

This way, Postgres uses the prefix index to narrow down to a small set of rows and then filters them to find the exact match. The resulting index is roughly ten times smaller than if we had indexed the entire string.
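As a quick sketch of how such an index is used (the token value below is just a placeholder), a lookup repeats the exact expression from the index definition to narrow the candidates, then rechecks the full string:

SELECT * FROM tokens
WHERE substr(token, 0, 8) = substr('<full 64-char token>', 0, 8)
  AND token = '<full 64-char token>';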

3. pg_reorg for compaction

Over time, Postgres tables can become fragmented on disk (due to Postgres' MVCC concurrency model, among other things). Also, rows are usually not inserted in the order in which you want to retrieve them. For example, if you often request all likes created by one user, it is nice to have those likes written contiguously on disk, to minimize disk seeks. Our solution for this is the pg_reorg utility, which performs the following steps to optimize a table:

Acquires an exclusive lock on the table
Creates a temporary table to accumulate changes, and adds a trigger on the original table that replicates any changes into this temporary table
Runs a CREATE TABLE using a SELECT FROM ... ORDER BY, which creates a new table in index order on disk
Syncs the changes from the temporary table that happened after the SELECT FROM was started
Switches over to the new table

There are some subtleties around how the locks are acquired and the like, but that is the general approach. We vetted this tool and ran a number of tests before running it in production, and we have since performed many reorganizations across hundreds of machines without any problems.
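To make the idea concrete, here is a rough manual sketch of the reordering step (the likes table and user_id column are hypothetical examples, and this is not literally what pg_reorg executes):

BEGIN;
-- unlike pg_reorg, this naive version blocks writes for the whole copy;
-- pg_reorg instead captures concurrent changes with the trigger and temporary table described above
LOCK TABLE likes IN ACCESS EXCLUSIVE MODE;
-- rewrite the rows in the order we usually read them back
CREATE TABLE likes_reorg AS
    SELECT * FROM likes ORDER BY user_id, id;
-- recreate indexes and constraints on likes_reorg here, then swap the tables
ALTER TABLE likes RENAME TO likes_old;
ALTER TABLE likes_reorg RENAME TO likes;
COMMIT;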

4. WAL-E for WAL archiving and backups

We use and contribute to WAL-E, a toolkit developed at Heroku for continuous archiving of Postgres WAL (Write-Ahead Log) files. Using WAL-E has greatly simplified our backup process and the process of bringing up a new read replica. At its core, WAL-E is a program that archives every WAL file generated by your Postgres server to Amazon S3, using Postgres' archive_command. These WAL files can then be used in combination with a base backup to restore the database to any point in time since that backup. The combination of regular base backups and archived WAL files gives us the ability to quickly bootstrap a new read-only replica or a failover slave (a replica to take over if the primary database fails). We wrote a simple wrapper script to monitor for repeated failures to archive a single file, and it is available on GitHub.
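As a minimal sketch of the Postgres side, archiving is driven by a few settings in postgresql.conf; the envdir path below is the conventional place to keep WAL-E's S3 credentials and is only illustrative:

# postgresql.conf (illustrative)
wal_level = archive
archive_mode = on
archive_command = 'envdir /etc/wal-e.d/env wal-e wal-push %p'
archive_timeout = 60

A base backup is then pushed to S3 with wal-e backup-push, and a new replica can be seeded from it with wal-e backup-fetch plus the archived WAL files.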

5. Autocommit mode and async mode in psycopg2

Over time, we started using more advanced features of psycopg2, the Python driver for Postgres. The first is autocommit mode. In this mode, psycopg2 does not issue BEGIN/COMMIT around queries; instead, every query runs in its own single-statement transaction. This is particularly useful for read-only queries, where wrapping them in explicit transactions makes no sense. Turning it on is simple:

connection.autocommit = True

This significantly reduced the chatter between our application servers and our databases, and lowered CPU overhead on the database machines as well.
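To illustrate where the savings come from (a sketch, not a captured trace): in psycopg2's default mode even a simple read is wrapped in an explicit transaction, while in autocommit mode only the statement itself crosses the wire:

-- default psycopg2 behaviour: the driver opens a transaction and the
-- application has to commit it
BEGIN;
SELECT id FROM tags WHERE media_count >= 100 LIMIT 10;
COMMIT;

-- with connection.autocommit = True, just the statement is sent
SELECT id FROM tags WHERE media_count >= 100 LIMIT 10;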
