Redis, Memcached, Tokyo Tyrant and MySQL comparision

I wanted to compare the following DBs, NoSQLs and caching solutions for speed and connections. Tested the following

My test had the following criteria

  • 2 client boxes
  • All clients connecting to the server using Python
  • Used Python’s threads to create concurrency
  • Each thread made 10,000 open-close connections to the server
  • The server was
    • Intel(R) Pentium(R) D CPU 3.00GHz
    • Fedora 10 32bit
    • Intel(R) Pentium(R) D CPU 3.00GHz
    • 2.6.27.38-170.2.113.fc10.i686 #1 SMP
    • 1GB RAM
  • Used a md5 as key and a value that was saved
  • Created an index on the key column of the table
  • Each server had SET and GET requests as a different test at same concurrency

Results please !

Work sheet

throughput set

throughput get

I wanted to simulate a situation where I had 2 servers (clients) serving my code, which connected to the 1 server (memcached, redis, or whatever). Another thing to note was that I used Python as the client in all the tests, definately the tests would give a different output had I used PHP. Again the test was done to check how well the clients could make and break the connections to the server, and I wanted the overall throughput after making and breaking the connections. I did not monitor the response times. I didnt change absolutely any parameters for the servers, eg didn’t change the innodb_buffer_pool_size or key_buffer_size.

MySQL

MySQL lacked the whole scene terribly, I monitored the MySQL server via the MySQL Administrator and found that hardly there were any conncurrent inserts or selects, I could see the unauthenticated users, which meant that the client had connected to MySQL and was doing a handshake using MySQL authentication (using username and password). As you could see I didn’t even perform the 40 and 60 thread tests.

I truncated the table before I swtiched my tests from MyISAM to InnoDB. And always started the tests from lesser threads. My table was as follows

CREATE TABLE `comp_dump` (
  `k` char(32) DEFAULT NULL,
  `v` char(32) DEFAULT NULL,
  KEY `ix_k` (`k`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

NoSQL

For Tokyo Tyrant I used a file.tch as the DB, which is a hash database. I also tried MongoDB as u may find if u have opened the worksheet, But the server kept failing or actually the mongod failed after coming at an unhandled Exception. I found something similar over here. I tried 1.0.1, 1.1.3 and the available Nightly build, but all failed and I lost my patience.

Now what

If you need speed just to fetch a data for a given combination or key, Redis is a solution that you need to look at. MySQL can no way compare to Redis and Memcache. If you find Memcache good enough, you may want to look at Tokyo Tyrant as it does a synchronous writes. But you need to check for your application which server/combination suits you the best. In Marathi there is a saying “मेल्या शिवाय स्वर्ग दिसत नाही”, which means “You can’t see heaven without dieing” or need to do your hard work, can’t escape that ;)

I’ve attached the source code used to test, if anybody has any doubts, questions feel free to ask

19 thoughts on “Redis, Memcached, Tokyo Tyrant and MySQL comparision”

  1. For the purpose of this test an explicitly defined PK would not make any difference. When you don’t define a PK innodb creates a hidden one. The reads would all be secondary key reads anyway with or without a defined PK. However defining innodb_buffer_pool_size large enough to fit the dataset, as opposed to the default 8MB, would be a big help to MySQL results.

  2. You are not testing what you think you are testing. What you are testing is the client interface speed, not the server speed. No one uses the memcache interface to tokyo tyrant in python, use pytyrant. Do not try to compare in-memory databases (redis and memcache) with persistent to disk databases (tyrant and mysql.) If you want to see tyrant run in-memory than the database to use is “*”, which creates an in-memory hash database.

  3. Sorry just curious, why not test with a table using the MEMORY engine for MySQL to be more consistent with memcache and Redis (both in-memory data stores)?

  4. CREATE TABLE comp_dump(
    k binary(20) DEFAULT NULL,
    v char(32) DEFAULT NULL,
    Primary key(`k`) ) ENGINE=InnoDB;

    set @ct:=1;
    insert into comp_dump select unhex(sha1(@ct:=@ct+1)), ‘aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa’ ;

    repeat the next query n time
    insert into comp_dump select unhex(sha1(@ct:=@ct+1)), ‘aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa’ from comp_dump;

    select count(*) from comp_dump;
    | 2097152 |

    time /Users/stephanevaroqui/local/mysql-5.4.1-beta-osx10.5-x86_64/bin/mysqlslap –create-schema=test -c 10 -i10000 -q benchget.sql
    Benchmark
    Average number of seconds to run all queries: 0.001 seconds
    Minimum number of seconds to run all queries: 0.000 seconds
    Maximum number of seconds to run all queries: 0.007 seconds
    Number of clients running queries: 10
    Average number of queries per client: 1

    real 0m23.337s
    user 0m3.988s
    sys 0m12.059s

    cat benchget.sql
    select v from comp_dump where k=unhex(cast(sha1(cast((2000000 ) as unsigned))as char));

    time /Users/stephanevaroqui/local/mysql-5.4.1-beta-osx10.5-x86_64/bin/mysqlslap –create-schema=test -c 10 -i10000 -q benchset.sql
    Benchmark
    Average number of seconds to run all queries: 0.001 seconds
    Minimum number of seconds to run all queries: 0.000 seconds
    Maximum number of seconds to run all queries: 0.028 seconds
    Number of clients running queries: 10
    Average number of queries per client: 1

    real 0m26.644s
    user 0m3.708s
    sys 0m12.229s
    macbook-pro-de-stephane-varoqui:~ stephanevaroqui$ cat benchset.sql
    update comp_dump set v=’bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb’ where k=unhex(cast(sha1(cast((2000000 ) as unsigned))as char));

  5. As pointed by one of the comments, I’d missed on setting that param, due to which MySQL took enough time to do a reverse DNS lookup.

    Each of my thread made a break and create connection for the 10,000 of the requests through each thread, I’m not sure if mysqlslap does a break and connect.

    I enabled the param, and did a re-test and the response from 230s was down to 51s.

  6. It would make difference.

    As the matter of fact innodb keeps all records in the b-tree by pk. Secondary indexes are b-tree with pk as data.

    It means – if you find record by primary key – it’s one search in the b-tree. If you find the record by secondary index – first search is to find pk in the sec. index b-tree, next search is to find the very record by pk in the table b-tree.

    So if you use secondary index you double the work.

  7. InnoDB table must have primary key. Also the non-primary keys are actually indexes on primary key. You must perform the test again.

    I am also surprised that in your test Redis is faster than Memcached. May be it’s true in your case, but in real live, Memcached is bit faster.

    Also I would like to see how TokuDB (www.tokutech.com) and PostgreSQL performing – According Reddit, PostgreSQL is fastest key value store if you use it for that.

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>