PHP & Redis at mig33

Timothee Groleau @ mig33

2012-07-11

About me

French! Living in Singapore since 1999

Software Engineer at mig33, caring for the web team

Started in web frontends (Flash 5,6,7,8 + ActionScript)

Moved to JS + PHP

Enjoys servers, linux, cli, open source

Shameless plug: http://panocraft.com

About mig33

asset_logo_mig33_cropped.png

Social networking platform targeting emerging markets, primarily on low-end mobile devices

Founded 2005, yet still (somewhat) in startup mode

  • small (but growing) engineering team (~20)
  • rapid iterations (weekly to biweekly releases)
  • experimentations encouraged
  • nothing perfect, nothing frozen, constantly improve and move on
  • we work in shorts and T-shirts!

Presenter Notes

Experimentation on all fronts: technology, processes, development practices, e.g. practicing with internal REST services before publishing a public REST API.

About mig33 – Some numbers

66m Registered Users

1.4m Daily Active Users

130k Concurrent Users at peak

Operations:

  • Over 10k gateway requests per second at peak (830m requests daily)
  • Over 1k HTTP requests per second at peak (60m requests per day)
  • 2m account transactions daily

Data:

  • Main database is at 1TB, growing at 2GB per day.
  • Membase cluster at 97GB, growing at 1.7GB per day
  • Redis cluster at 110GB, growing at 0.5GB per day

About mig33 – Infra

Data Center in San Jose, California (SJC)

  • ~100 linux servers, grouped in functional clusters
  • DB for core mig33 data, core services
  • Usual stuff: RAID, redundancy over separate racks, independent power supplies, etc.

AWS

  • S3 for backups and archives
  • S3 + Cloudfront for CDN
  • EC2 (~60 servers) for miniblog, devcenter

About mig33 – Infra

mig33_arch_diagram.png

Managing the infra

To be blunt

The best software is worth nothing if your infra and ops suck

Who's doing it?

Jedi-Level ops team in Australia ( Spry: http://www.sprybts.com/ )

mig33's own ops team in KL

DevOps team in Engineering in SG, bridges app and infra knowledge

Some tools we use

Puppet for the (almost) entire infra (both SJC + AWS) http://puppetlabs.com/

Nagios for monitoring and alerts http://www.nagios.org/

Cacti for near real time visualization http://www.cacti.net/

About mig33 – Software stack

  • J2EE, JBoss
  • MySQL
  • Apache
  • PHP
  • Memcached
  • Redis
  • MogileFS
  • Varnish
  • SVN, Git
  • Membase
  • RabbitMQ
  • ElasticSearch
  • Hadoop

mig33 is an open source friendly company! And we're running all these on CentOS boxes

Web and PHP

16+2+2+2+2 web servers

Vanilla Apache + modules (e.g. mod_security, mod_rewrite, mod_usertrack)

Vanilla PHP packages + php.ini tweaks + some PHP modules (gettext, xcache, imagick)

Multiple usage:

  • Presentation
  • Interaction with data sources (CRUD) (mysql/memcached/redis/java services)

Presenter Notes

web server clusters: migcore, login, migbo, wordpress, devcenter

Web and PHP

Custom PHP framework served us well

Custom framework has multiple view support built-in (wap/ajax/midlet/touch)

Introducing CodeIgniter

Usual practices

  • remove server+php identification headers
  • hotlinking prevention
  • libs and app files non servable by web servers
  • stateless servers
  • dependencies through packages and shared libs

shared packages help reduce duplicated code

Reduce Disk IO + latency

  • build tools
  • PHP opcode caching
  • on server variable caching (Xcache)

2 enemies of performance: io and latency

Caches

Several types of cache:

  • In process cache (statics, singletons)
  • Cross-processes cache (Xcache / APC)
  • Cross-servers cache (memcached / redis)

Juggling what we put in what cache

conditions change: what seemed like a good idea might not hold true in the future, so be willing to shuffle things around

Usual Cache strategies:

  • Time expiry
  • Content change invalidation
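The two strategies above can be sketched with a tiny in-process cache. This is a minimal illustration, not mig33's code: the `SimpleCache` class, its method names, and the injectable `clock` parameter are all assumptions made for the example.

```python
import time


class SimpleCache:
    """Tiny in-process cache illustrating time expiry and invalidation."""

    def __init__(self, clock=time.time):
        self._clock = clock      # injectable clock (hypothetical, eases testing)
        self._entries = {}       # key -> (value, expires_at or None)

    def set(self, key, value, ttl=None):
        expires_at = None if ttl is None else self._clock() + ttl
        self._entries[key] = (value, expires_at)

    def get(self, key, default=None):
        entry = self._entries.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            # Time expiry: the entry is stale, drop it lazily on read.
            del self._entries[key]
            return default
        return value

    def invalidate(self, key):
        # Content-change invalidation: call this when the source data changes.
        self._entries.pop(key, None)
```

Time expiry needs no coordination between writers and readers, while invalidation requires every code path that changes the data to know which cache keys depend on it.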

Caches

Cache run-down!

                                      XCache                    Memcached     Redis
cross-server synchronization          no                        yes           yes
latency                               no                        yes           yes
data structure                        yes                       no            yes
nested data structure                 yes                       no            no
requires serialization                no                        yes (mostly)  yes (sometimes)
supports expiry-based invalidation    yes                       yes           yes
supports content-change invalidation  not easily cross servers  yes           yes

XCache still has a place as local server cache, best suited for:

  • almost static data (very long expiry)
  • rapidly changing data (very short expiry)

Content change invalidation really only works with synchronized caches

Presenter Notes

  • example of very long expiry: country lists
  • example of very short expiry: settings system

Some Web Lessons

Leave as little as possible for the web server to do. Let your app framework work for you.

example: application logic in url rewrite rules

Create a (safe) system to let you manage out-of-release changes (scales your team)

  • wordpress with json plugin
  • S3 bucket (with local caching)
  • consider pull vs. push

examples of out-of-release changes: marketing messages, landing pages' content, promotions, email templates

Small specific apps/services

Lock down access to internal services to what's needed ONLY.

redis-300dpi.png

A fast NoSQL key/value store, with lots of nifty features

Sponsored by VMware

http://redis.io

designed to address some shortcomings of existing systems (e.g. memcached)

remote data structure + replication are probably the biggest gains

Redis

Used by:

  • mig33 (!!)
  • Flickr
  • GitHub
  • Stack Overflow
  • Digg
  • Craigslist
  • Guardian

Redis – App Benefits

Persistence

  • allows you to use redis as authoritative data store for some things

Remote data structures

  • Maps to native data structures (Simplify code!)
  • Reduces race conditions! (Simplify your life!)
  • Reduces data transfer: get and send what you need!

example: removing an element from a list or set. In memcached, you read, remove, then write. If another process added an element in the meantime, the write overwrites the newly added element. Workaround in that case: distributed locks.
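The lost update described in this example can be reproduced with plain dictionaries standing in for the remote stores. It is a sketch under assumptions: the function names, the `concurrent_write` hook and the `likes` key are invented for illustration, and the second function only mimics what a server-side command such as SREM achieves.

```python
def remove_read_modify_write(store, key, member, concurrent_write=None):
    """memcached-style removal: fetch the whole set, edit it, write it back."""
    snapshot = set(store[key])      # 1. read a private copy
    if concurrent_write is not None:
        concurrent_write(store)     # another process writes in between
    snapshot.discard(member)        # 2. modify the copy
    store[key] = snapshot           # 3. write back, clobbering the other write


def remove_on_server(store, key, member, concurrent_write=None):
    """Redis-style removal: the store edits its own copy in a single step."""
    if concurrent_write is not None:
        concurrent_write(store)
    store[key].discard(member)


def add_carol(store):
    """Simulated second process adding a member to the same set."""
    store["likes"].add("carol")


racy = {"likes": {"alice", "bob"}}
remove_read_modify_write(racy, "likes", "bob", concurrent_write=add_carol)

safe = {"likes": {"alice", "bob"}}
remove_on_server(safe, "likes", "bob", concurrent_write=add_carol)
```

In the first case the concurrent addition disappears; in the second it survives because the removal never ships the whole collection to the client and back.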

persistence allows us to store some authoritative data in redis, but we only do that for data that does not require ACID and that we can afford to lose or miss for a period of time

Expiry

Atomic operations + Transactions

transactions are performed with an optimistic locking mechanism

Pipelining

Search/List keys (including wildcards)

Redis – Ops benefits

Dead easy Master/Slave setup

Customizable persistence behavior: AOF / Bg Save rules

Statistics: num ops, memory used, etc.

Single threaded, control over cpu core utilization

Solid! 2 years running, no problem

Redis – Challenges

memory bound, no built-in system for horizontal scaling

responsibility of apps to implement consistent hashing, or something else

burden on libs and ops to maximize resource usage

persistence compromise: AOF is slow(er), snapshots risk data loss

No value search

No joins

No read-only slaves (being introduced in 2.6!)

Redis Data Types

Types

String/Blobs

number/strings

this is where you store your json, sir!

Hashes

key/value inside a key/value store! nesting is not supported, but numeric field operations are

Lists

mostly useful for queues

Sets

great for banned lists, user likes; incredible performance with operations like SISMEMBER, funky things like SPOP

supports huge number of entries

remote set operations! union, intersection, etc.

Sorted Sets

power of both lists and sets!

go and die "ORDER BY"

Commands on those types (and more)

http://redis.io/commands

take note!

expiry is at key level only!

Redis at mig33 – Client Libs

PHP: predis: https://github.com/nrk/predis/

Python: redis-py: https://github.com/andymccurdy/redis-py/

Java: jredis: https://github.com/alphazero/jredis

More clients here: http://redis.io/clients/

sample PHP code:

/* query string parameters optional */
$instance = new Predis_Client('redis://myhost:12345?connection_timeout=2&read_write_timeout=3');
$instance->set("min_level", 10);

$new_count = $instance->incr("landing_page_view");

$key = "visitors";
$value = "gurmit";
$max_age_seconds = 7 * 24 * 60 * 60;
$now = time();

/* pipeline support for insert value, trim by age, and mark for expiration */
$pipe = $instance->pipeline()
    ->zremrangebyscore($key, '-inf', $now - $max_age_seconds)
    ->zadd($key, $now, $value)
    ->expire($key, $max_age_seconds)
    ->execute();

Presenter Notes

Complicated examples, you don't need to worry about it yet

Redis at mig33 – Sorted Set Use case

Leaderboards

  • Keys:
    • LB:GAME:LOWCARD:PLAYS
    • LB:GAME:LOWCARD:WINS
  • Values: username

value could contain more data provided it is deterministic, e.g. '{"username":"bob", "userid":1234, "country":"france"}'

  • Scores: Ints, incremented at every relevant action

keys management:

  • principle above applied to multiple keys:
    • LB:GAME:LOWCARD:PLAYS:DAILY
    • LB:GAME:LOWCARD:PLAYS:WEEKLY
    • ...
  • move the keys every day/week
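The key layout above can be tried out with an in-memory stand-in for sorted sets. This is only a sketch: the `Leaderboards` class, its methods, and the dated archive key used in the usage note are assumptions; against a real server the same steps would map to commands such as ZINCRBY, ZREVRANGE and RENAME.

```python
from collections import defaultdict


class Leaderboards:
    """In-memory stand-in for Redis sorted sets keyed like LB:GAME:..."""

    def __init__(self):
        # key -> member -> score
        self._boards = defaultdict(lambda: defaultdict(int))

    def record(self, game, metric, username, amount=1):
        # One increment per period key, mirroring the :DAILY / :WEEKLY keys.
        base = f"LB:GAME:{game}:{metric}"
        for key in (base, base + ":DAILY", base + ":WEEKLY"):
            self._boards[key][username] += amount

    def top(self, key, count=10):
        # Highest score first; ties broken by member name for a stable order.
        board = self._boards.get(key, {})
        ranked = sorted(board.items(), key=lambda item: (-item[1], item[0]))
        return ranked[:count]

    def rotate(self, key, archive_key):
        # "Move the keys": archive the finished period and start from empty.
        self._boards[archive_key] = self._boards.pop(key, defaultdict(int))
```

For example, calling `rotate("LB:GAME:LOWCARD:WINS:DAILY", "LB:GAME:LOWCARD:WINS:DAILY:20120710")` at the end of the day leaves the all-time board untouched and lets the daily board restart from zero; the dated archive name is a hypothetical convention.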

Presenter Notes

values could contain more data provided it is deterministic, e.g. '{"username":"bob", "userid":1234, "country":"france"}'; just need to ensure that the data blob is always serialized in a fixed manner.
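One way to get a fixed serialization is to sort the fields and pin the separators, so the same data always produces the same member string. The helper name `member_blob` and its parameters are assumptions for illustration.

```python
import json


def member_blob(username, userid, country):
    """Serialize a sorted-set member so equal data always yields equal text."""
    record = {"username": username, "userid": userid, "country": country}
    # sort_keys fixes the field order; separators removes optional whitespace.
    return json.dumps(record, sort_keys=True, separators=(",", ":"))
```

This matters because a sorted set identifies a member by its exact bytes: two serializations of the same user would otherwise count as two different entries with separate scores.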

Demo

Quick, Robin, to the command line!

Redis at mig33 – Sorted Set Use case

Footprints (time ordered)

  • Keys: 2 keys per user:
    • U:<UID>:ProfileViewed
    • U:<UID>:ProfileViewedBy
  • Values: username
  • Scores: timestamp!

key content management:

  • trim content on write
  • update key expiry on write

Presenter Notes

  • in redis <2.2, once a key had an expiry, it was immutable; that seriously limited the designs we could make
  • for footprints, it meant we never set an expiry and ran a garbage collection process on the side

Redis at mig33 – Sorted Set Use case

Game: Fashion Show (avatar beauty contest)

  • large pool of contestants from db (600K+, growing daily)
  • random order
  • no repeat for each user
  • not allowed to vote for same avatar twice in 24 hours

Redis design, part 1:

  • Key: GAME:FSHOW_POOL
  • Values: usernames
  • Scores: random values (!)

Details:

  • query data set daily from mysql
  • push updated set to new key, then use RENAME command (= atomic replace)

Presenter Notes

there's a complex slow query to get the contestants; we run it once a day and iterate through the result set to inject it into the redis pool

Redis at mig33 – Sorted Set Use case

Redis design, part 2:

  • One new field in the user hash: "FShowIdx"
    • HSET("U:<UID>:Settings", "FShowIdx", 34);
    • for new player, index is selected randomly between 0 and ZCARD(GAME:FSHOW_POOL)
    • pick 3 avatars, use HINCRBY to increment FShowIdx, cycles back to beginning when hitting end of list
  • One key per user per day
    • U:<UID>:FSHOW:VOTES:20120710
    • key type: set
    • values: usernames
    • key expiry: 24 hours
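The per-user cursor can be sketched as a pure function over the pool. Assumptions are flagged here and in the comments: the pool is shown as a plain list already ordered by its random scores, `already_voted` stands for the user's daily votes set, and skipping recently voted avatars during the walk is a guess at how the two keys combine, not something the slides spell out.

```python
def next_contestants(pool, start_idx, already_voted, count=3):
    """Walk the pool from start_idx, wrapping around, skipping recent votes.

    Returns (picked usernames, new index to store back in FShowIdx).
    """
    if not pool:
        return [], 0
    picked = []
    idx = start_idx % len(pool)
    for _ in range(len(pool)):      # never walk more than one full cycle
        if len(picked) == count:
            break
        candidate = pool[idx]
        if candidate not in already_voted:   # assumed use of the votes set
            picked.append(candidate)
        idx = (idx + 1) % len(pool)          # cycle back to the beginning
    return picked, idx
```

Because every user starts at a random index and only moves forward, no avatar repeats for a user until the whole pool has been seen.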

Redis scaling at mig33

Redis is memory bound: we need an architecture that scales horizontally (more RAM, more servers)

Typical approach to clustering:

  • consistent hashing
  • directory-based partitioning
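For comparison with the directory-based cluster described next, the consistent hashing approach can be sketched as a hash ring with virtual nodes. This is a generic illustration, not what mig33 runs; the `HashRing` class, the `replicas` count and the choice of SHA-256 are all assumptions.

```python
import bisect
import hashlib


class HashRing:
    """Consistent hashing ring: each node owns many points on a circle."""

    def __init__(self, nodes, replicas=100):
        ring = []
        for node in nodes:
            for i in range(replicas):
                # Several virtual points per node even out the distribution.
                ring.append((self._hash(f"{node}#{i}"), node))
        ring.sort()
        self._points = [point for point, _ in ring]
        self._nodes = [node for _, node in ring]

    @staticmethod
    def _hash(value):
        return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

    def node_for(self, key):
        # First point clockwise from the key's hash, wrapping at the end.
        idx = bisect.bisect(self._points, self._hash(key)) % len(self._points)
        return self._nodes[idx]
```

The useful property is that adding a node only moves keys onto the new node; every other key stays where it was. The trade-off against a directory is that placement is computed rather than stored, so an individual entity cannot be pinned or moved by hand.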

Custom directory-based cluster

mig33_directory_based_sharding.png

Redis scaling at mig33

Oversharding

mig33_shard_scaling.png

Scaling the shards

mig33_shards_replication.png

Redis scaling at mig33

2 huge servers: 24 cores, 192GB RAM

64 shards + directory instances

Master/Slave across the 2 boxes

IPVS to hide infra complexity:

  • redism.vip: master “host”, for writes
  • rediss.vip: slave “host”, for reads
    • slave vip uses both master and slave servers!
  • Ability to add more slaves where needed, transparently (e.g. more slaves for directory)

Redis scaling at mig33

Self-contained cluster definition, 3 redis keys in the directory is all it takes:

  • R:MASTERS: hash
  • R:SLAVES: hash
  • R:WEIGHTS: list

Client lib need only be aware of the directory master and slaves

  • directory_master_host:port
  • directory_slave_host:port

The shard assignment for an entity in the directory is a blob

  • R:{Entity} => 4
  • shard address (host:port) for the entity is then found in the R:MASTERS/R:SLAVES keys
  • Client lib typically does some local caching of cluster def and shard assignments

Redis scaling at mig33

Cluster and lib are entity-agnostic

Lib exposes a minimalistic api to interact with the cluster

  • getMasterShard({Entity}) -> always returns
  • getSlaveShard({Entity}) -> may return null
  • some utility functions
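A rough in-memory sketch of the two lookups, with dictionaries standing in for the directory keys. Several things here are assumptions rather than the real lib: the `ShardDirectory` class, the snake_case method names, the port numbers, the reading of R:WEIGHTS as one weight per shard used to place new entities, and the idea that a master lookup assigns a shard on first use.

```python
import random


class ShardDirectory:
    """In-memory stand-in for the directory instance and its three keys."""

    def __init__(self, masters, slaves, weights, rng=None):
        self.masters = masters      # R:MASTERS: shard id -> "host:port"
        self.slaves = slaves        # R:SLAVES:  shard id -> "host:port"
        self.weights = weights      # R:WEIGHTS: one weight per shard (assumed)
        self.assignments = {}       # R:{Entity} -> shard id
        self._rng = rng or random.Random()

    def get_master_shard(self, entity):
        # Always returns: an unknown entity gets a shard on first use (assumed).
        shard = self.assignments.get(entity)
        if shard is None:
            shard_ids = list(range(len(self.weights)))
            shard = self._rng.choices(shard_ids, weights=self.weights)[0]
            self.assignments[entity] = shard
        return self.masters[shard]

    def get_slave_shard(self, entity):
        # May return None: no assignment means nothing was ever stored.
        shard = self.assignments.get(entity)
        return None if shard is None else self.slaves[shard]
```

Under this reading, setting a shard's weight to 0 would stop new entities from landing on it without touching existing assignments.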

applications should always have sane default values!

Keys with strict naming convention to allow movement of all keys belonging to an entity

Presenter Notes

  • Getting a null for the handle to a slave shard means the user has not stored any data.

Demo (if time permits)

Quick, Robin, to the command line!

Sorry PHP, this one's using python :$

Redis Lessons

Fast

redis-benchmark ftw!

Memory fragmentation is a problem (<2.4)

Live restart of instances in a master/slave setup is hard

Delete on read/write + garbage collection

Multiple instances on the same server: disable all bg save rules – make your own save routine

Key names take memory too, use an efficient naming convention

Don't use auto-serialization in blobs, be explicit

Expiry is at key level, not for hash members

Redis can't give you memory info per key or per key namespace.

  • Workaround: copy the .rdb file locally, boot up a new redis instance from it, use the INFO command to read memory used, delete all the keys you want, then check memory used again (y).

Last notes

Getting rid of memcached -> redis in caching-only mode ftw!

redis is a superset of memcached: apps gain speed from atomic ops on data types, on top of fast set/get

http://antirez.com/post/redis-memcached-benchmark.html/

Services! Multiple versions of the same lib is a bad idea

  • Maintenance nightmare
  • App separation more important than the (small) performance gain

Migrations of services / features

  1. Dual app code (new flow, fallback to old flow)

live migration: migrate data as it's being used

  2. Migration Script (optional)

speed up process and handle data migration for inactive users

  3. Removal of dual code
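The first two steps can be sketched with dictionaries standing in for the old and new data stores. The function names and the settings example are assumptions; a real migration would also need to cover writes and error handling.

```python
def load_settings(user_id, new_store, old_store):
    """Dual flow: prefer the new store, fall back to the old one and migrate."""
    value = new_store.get(user_id)
    if value is not None:
        return value
    value = old_store.get(user_id)
    if value is not None:
        new_store[user_id] = value   # live migration: copy on first use
    return value


def migrate_inactive(new_store, old_store):
    """Optional batch script: move whatever live traffic never touched."""
    moved = 0
    for user_id, value in old_store.items():
        if user_id not in new_store:
            new_store[user_id] = value
            moved += 1
    return moved
```

Once the batch script reports nothing left to move, the fallback branch in `load_settings` is dead code and can be removed, which is the third step.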

The future:

  • Lua scripting server side (>=2.6)! Even fewer race conditions, huzzah!
  • native clustering/ha capability... still some way to go
