Ask HN: Anyone Using CockroachDB in Production?
121 points by sergiotapia on July 18, 2019 | 47 comments
What do you like about its killer features?

How interchangeable with Postgres was it?

How is performance?




Tried it before. Compatibility depends on how complicated your SQL is. There's a detailed list of supported syntax which is getting better every release, but it's best for simpler OLTP scenarios. [1] Horizontal scaling and reliability are great. Security setup and certificate management are horrible, and if you don't use certificates then you also can't set passwords for any user, so it's all or nothing for some reason. [2]

Performance for high-selectivity OLTP queries using primary keys/indexes is pretty good. It will not be as fast as PostgreSQL because consensus is required to read the latest data, although it now supports timestamped queries so you can read stale data from replicas for faster reads. Complex queries can slow down or stall the server. Deletes are handled by tombstones, so heavy update/delete workloads will cause increasing slowness in queries until compaction happens; it needs tuning for your workload. [3]
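
As a concrete sketch of the timestamped/stale-read escape hatch (the table and columns here are made-up examples):

    -- read data as it was 10 seconds ago; this can be served by a nearby
    -- replica without waiting on consensus for the very latest value
    SELECT id, status
    FROM orders
    AS OF SYSTEM TIME '-10s'
    WHERE customer_id = 42;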

It treats all the servers as one giant global cluster instead of multiple sub-clusters per regional location, so there will be very high latency for writes if you're running in multiple regions. I recommend sticking to a single region with multiple zones instead, or very close regions like US east + central.
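
If you do need multiple regions but want a database's writes to stay cheap, you can pin its replicas with a zone config; roughly like this (a sketch, the database name and region label are assumptions):

    -- keep all replicas for this database in one region so writes
    -- don't pay cross-region consensus round trips
    ALTER DATABASE app CONFIGURE ZONE USING
        constraints = '[+region=us-east1]';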

Overall it's a good, solid database choice for core operational data that you want kept safe. Skip it if you need fast low-latency key/value (use Redis or Scylla) or large analytical queries (use one of the numerous data warehouse options).

1. https://www.cockroachlabs.com/docs/stable/detailed-sql-suppo...

2. https://github.com/cockroachdb/cockroach/issues/32448

3. https://github.com/cockroachdb/cockroach/issues/32522


> Anyone Using Cockroach DB in Production?

Yes, used it in 2017 for a fairly small dataset (order of GBs) and 5-7 nodes running in Kubernetes.

> What do you like about its killer features?

Easy admin and good at scaling horizontally.

> How interchangeable with Postgres was it?

Wire compatibility helps, but there were many minor differences at the syntax level. I tried for as long as possible to maintain compatibility with Postgres syntax and it was painful.

> How is performance?

Back then it was garbage, probably 10-100x slower than Postgres. This was pre-2.0, mind you; I heard that things have improved noticeably since then. And most of the problems can be circumvented by not doing stupid things and optimizing your queries/data model. You might want to throw a cache in front of it anyway.


This is what I love about HN - a simple question about a somewhat rare technology and a detailed answer.

I tried running Cockroach DB a few years ago too, also on Kubernetes. It worked but today we just use Postgres. Turned out we didn’t really have the problem Cockroach DB is designed to solve. We just needed a reliable performant SQL database.


Using it in production since early 2018 as our data backbone. Other systems are fed using CDC.

Killer features: change data capture (to feed ELK for full-text search and as an event bus for microservices), follower reads, horizontal scaling, easy to manage/operate, compliance by letting users decide where their data should reside (row-level partitioning).
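
For a flavour of the CDC piece, a changefeed into Kafka looks roughly like this (a sketch; the table and broker address are made up):

    -- stream row changes for a table into Kafka; downstream services
    -- (search indexing, event consumers) read from the topic
    CREATE CHANGEFEED FOR TABLE users
    INTO 'kafka://kafka-broker:9092'
    WITH updated, resolved;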

Before choosing CRDB we did a deep investigation into the people involved, the activity on GitHub, the quality of the process/PRs, and contributions to the Go ecosystem. It's incredible how high the quality bar is for these folks.

The first version wasn't that good, but since 2.0 things have changed A LOT.

Really enjoying it


We used it in production for a few months with around 4k QPS and a P99 of ~100ms. We specifically were using it because it has a "follow the workload" feature where it moves ranges of data to the server/region where they're most used.

We ran into a few panics (crashes) but support was great and they managed to fix the issues within a few days. Ultimately we stopped using it after multiple cascading failures that ended up bringing down the whole cluster. Granted, we were running 4-core nodes with high CPU utilization, so I'm sure it would've performed better with more cores, but that wasn't in our budget.


We have used it in production for over a year in a highly threaded environment, with multiple terabytes of data, and have had virtually no issues. The support is excellent, but it practically takes care of itself. The Postgres compatibility is very good, and we rarely run into compatibility issues. If you have a highly concurrent OLTP workload, I would suggest using CockroachDB.


I wanted to try it before, but the clock synchronization thing scares me too much to dare. How do you even keep the clocks in line? What happens if I'm monitoring and see elevated clock skew? Can I even do anything, or do things just start to fail mysteriously?

I think CockroachDB will be amazing when cloud providers start offering APIs for high-resolution clocks as a service, or even better, if I got to opt in to the system clock being synced to high resolution, much like in GCP you can upgrade your instance's network to premium tier on the fly.


NTP is plenty accurate for Cockroach to work correctly.

> What happens if I'm monitoring and see an elevated clock-skew?

If you have monitoring that can detect that, NTP should also be able to run and correct it, apart from some very weird failure modes of the network itself (like traffic dropping in one direction).

> I think CockroachDB will be amazing when cloud providers start offering APIs for High-Resolution Clocks as a service

With the hybrid logical clocks that Cockroach uses (basically wall time + vector clock), they don't need to be all that precise. The trade-off it makes is giving up linearizability (aka external serializability).

Spanner gets that by making every transaction wait out the maximum clock skew (7ms, with their atomic and GPS clock systems). But in order to run on commodity hardware, Cockroach forwent it in favor of allowing a larger acceptable skew.


I'll read up on that, thanks for pointing these things out.

One question though, doesn't CockroachDB also have a holdout period of 250ms (default) when reading across shards, kind of similar to Spanner's 7ms holdout on writes? I remember reading something vaguely similar to that somewhere.


As far as I'm aware there is no explicit holdout period on reads; however, on a read you do need a quorum of the range holders for consistency's sake. The exception here is if you're talking to the "leaseholder" of that data, in which case it can respond to you directly.

If you have a workflow that consistently exercises the same ranges, leaseholders will be moved to the nodes that you’re connected to, in a process called “follow the workload”.

If you run a load generator against a cluster you'll see that: for the first 30-90 seconds or so it'll have some latency, but after some period of time the leases should be moved closer to your generator and you should see latencies drop.
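
If you want to watch where the leases actually sit while this happens, you can inspect range placement; something like this (the exact statement varies a bit by version, and the table name is just an example):

    -- shows each range of the table and which node currently holds its lease
    SHOW RANGES FROM TABLE orders;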

When you’re looking at data access patterns and partitioning you should keep this in mind, as it can be a big performance improvement if properly considered.


NTP works fine. This is a very minor risk. VMs on cloud providers are already preconfigured to use NTP with the provider's own time servers, so they are very reliable [1][2], and you can always add your own monitoring [3]. Also, CRDB is tested with clock drift, and you can read the Jepsen report [4].

1. https://aws.amazon.com/blogs/aws/keeping-time-with-amazon-ti...

2. https://developers.google.com/time/

3. https://docs.datadoghq.com/integrations/ntp/

4. https://jepsen.io/analyses/cockroachdb-beta-20160829


>NTP works fine. This is a very minor risk.

Until it doesn't. Low probability, large impact doesn't directly translate to minor risk, imo.


I'm terrified of meteors too!


Great; so have you used CockroachDB in production? I bet you haven't!


Yes, read my other comment at the top of this page.


Can we expand this question to include other so-called NewSQL database technologies like TiDB, YugaByte, NuoDB, Citus, etc.?


TiDB is MySQL-compatible, written in Go, and runs on top of TiKV, which is a custom key/value store written in Rust. It's developing rapidly and serves horizontally scaled, simpler OLTP workloads well. It can handle OLAP queries better than CRDB, but neither is great at it. It also has more moving pieces with the placement driver, TiKV, and TiDB layers, so installation is harder.

Yugabyte is a C++ document-store engine called DocDB running on top of RocksDB. It offers access through multiple APIs: Redis, Cassandra CQL, and now PostgreSQL. A given dataset can only use a single API, so you pick one when creating it. The PG part is getting the most focus right now and is very capable since it takes the planning layer from PG itself. It supports more advanced queries than CRDB. More moving pieces again, with separate Master and Tablet layers. Great UI with cloud integration and multi-regional management built in. They also recently made all enterprise features free and open source.

Citus is an extension, basically an automatic sharding system for PG that works across servers. It's bottlenecked on a single master node working with multiple workers, but there's an experimental branch called Citus MX to make every node master-capable. Since all servers are just running PG, you get all the expected functionality and extensibility. All normal tables in a database live on the master (which can be limiting) and only "distributed" tables are stretched across worker nodes. If you have a good scheme for sharding (like TenantID) then all data for tables sharing the same shard key can live on the same server and you get 100% query functionality. Distributed SQL queries are not as good and can have strange failures. There was a big recent update to improve this, but you will still have issues with very complex queries and joins. It's an interesting option for OLAP workloads that need OLTP-like updates with indexes and constant writes. Microsoft uses this for a project and now owns the company, so it's available on Azure.
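
For a flavour of the Citus model, distributing a table by its shard key looks roughly like this (the table and column are made-up examples):

    -- shard the table across workers by tenant; rows with the same
    -- tenant_id are co-located, so per-tenant queries stay on one node
    SELECT create_distributed_table('events', 'tenant_id');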

No experience with NuoDB, but another mention is MemSQL, a distributed relational data warehouse with a unique in-memory rowstore and on-disk columnstore model for fast OLTP and OLAP query support. It has the best support for complex SQL of all of these and will probably run anything you send it. It's MySQL-compatible and one of the most polished database products. It's expensive and proprietary, although it now has a free limited community edition, but it's extremely fast for analytics while maintaining usability.


Impressively informative! Regarding install/deployment of TiDB, we have built open-source tools such as TiDB-Ansible for on-prem and TiDB-Operator for Kubernetes. Here is a list of deployment instructions:

Ansible: https://pingcap.com/docs/dev/how-to/deploy/orchestrated/ansi...

Kubernetes (using TiDB-Operator): https://pingcap.com/docs/dev/how-to/deploy/orchestrated/kube...

Terraform on AWS: https://github.com/pingcap/tidb-operator/tree/master/deploy/...

Docker: https://pingcap.com/docs/dev/how-to/deploy/orchestrated/dock...

Binary: https://pingcap.com/docs/dev/how-to/deploy/from-tarball/prod...


Adding on: there's also Vitess for anyone looking for automatic sharding for MySQL. It's not an extension but works as middleware that operates MySQL instances for you, and it has great support on Kubernetes. Lots of big companies are users, and it originated at YouTube. The system is written in Go and has unique features like table materialization and resharding.


Vitess is far too complex to run, imo, unless you want to become a Vitess admin. We tried it for a good 6 weeks to get a working PoC, then moved to CRDB and had one working in hours.


Curious how/why you know what seems to be pretty esoteric knowledge/experience?


It's all personal experience and research from using these database products. I've also talked to the devs for each one, including here on HN sometimes.

What's esoteric about it though? I consider this to be standard knowledge if you're going to be investing into a database.


Disclaimer: I work for TiDB.

An incomplete list of TiDB adopters: https://pingcap.com/docs/v3.0/adopters/ and some of their adoption stories: https://pingcap.com/success-stories/

If you read Mandarin, here are more: https://pingcap.com/cases-cn/


Sure - but I asked because I'm specifically interested in crdb.


I tried to use CockroachDB for a side project. Gave up after a month of on-and-off trying to migrate a x00 GB MySQL database to it.

If you are starting a new project and willing to build on it, it looks pretty awesome. But there are zero migration capabilities, which in my mind would be the core focus of a product that pitches drop-in scalability.


It's Postgres-compatible, so MySQL will be more difficult to migrate seamlessly, but they do have documentation detailing the process and functionality to restore a database dump from object storage: https://www.cockroachlabs.com/docs/stable/migrate-from-mysql...
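
If it helps, the dump-based path from those docs looks roughly like this (a sketch; the bucket URL is made up and the exact statement depends on your version):

    -- import a mysqldump file straight from object storage
    IMPORT MYSQLDUMP 'https://example-bucket.s3.amazonaws.com/dump.sql';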

Did this not work?


Sure - but the majority of customers with cash who want to migrate to something better are not going to be coming from Postgres.

The MySQL import does not work for a lot of edge cases. They are aware of this and have open bugs that don't seem to be moving quickly. :/


If you want a MySQL-compatible NewSQL database, maybe you can try TiDB.


Used in DC/OS for the Identity and Access Manager https://docs.mesosphere.com/1.13/overview/architecture/compo...


Not directly related.

But you can definitely find some case studies from here - https://www.cockroachlabs.com/community/tech-talks/


Does anyone know how much the support costs?


Yeah, without at least a few lower-level support pricing examples, I just assume it's too much and move on when I've looked at it... the backup limitations for community installs are a no-go for me.


We are using it in production on Kubernetes. As a former MS SQL Server DBA, here are some thoughts:

Firstly, if you're using SQLAlchemy as anything more than a query builder, rethink your approach and act accordingly. That being said, maybe rethink the use of an ORM at all -- but that's my pet peeve; it might not be yours. SQLAlchemy in particular, at least in versions 2.x, just doesn't like the retryable transaction approach that CRDB takes. We've had to modify SQLAlchemy in our application to get it to cooperate. Had we been using stored procedures or raw SQL this would not have been a problem.
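
For context, the client-side retry protocol CRDB documents looks roughly like this at the SQL level, and it's what the ORM has to be coaxed into doing (a sketch; the statement in the middle is a placeholder):

    BEGIN;
    SAVEPOINT cockroach_restart;
    -- ... your application's statements, e.g.:
    UPDATE accounts SET balance = balance - 10 WHERE id = 1;
    -- if the server returns a 40001 retry error, issue
    -- ROLLBACK TO SAVEPOINT cockroach_restart; and run the statements again
    RELEASE SAVEPOINT cockroach_restart;
    COMMIT;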

We started with v2.1.x and are now on v19.1 with a commercial license, and overall we are happy with the support: we have direct access to the engineers via Slack, a ticket escalation process with quick turnaround times, and monthly meetings -- all of which help us with the edge cases one is likely to hit with a new database.

Even the smallest requests get heard: I had been using one of the betas to check out what would be coming in the admin UI, didn't like the SQL query plan at the time, and mentioned it in passing; a product manager reached out and picked my brain, and my input, along with others', made it into the shipping release. The turnaround and the professionalism were impressive.

Our ops team loves CRDB. We set it up in K8s and it just works. It heals itself if a node goes down, and replication (a huge pain in the ass with MySQL) is seamless. That said, they have a managed CRDB option, which I would have signed our company up for instead of rolling it ourselves if we didn't insist on running our own infrastructure.

I'd say in our use case, properly set up schemas with good indexes, running queries that are OLTP rather than OLAP in nature, perform very well.

We went with CRDB about 6 months ago because it had the best Kubernetes support at the time and was much easier to get started with.

We recently did a MySQL 5.7 to CRDB migration (actually a number of them, for our partners) and it went smoothly. We did many trial runs, but I wrote some MySQL dump and import scripts in Python and we just waited for it to complete. The documentation helped a lot there.

It's not perfect; nothing is. We had hoped for a simple drop-in replacement for a single-node Postgres instance, but one that could scale and be "cloud native". It is sort of like that. With more nodes you have the overhead of consensus, but performance scales almost linearly as you add nodes, and it's robust -- we can kill a node or two (your replication factor determines how many nodes you can afford to lose) and bring them back up again and it's fine. There are novel approaches with interleaving that help performance, the documentation is great, and the support is as well.
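
For reference, the replication factor mentioned above is just a zone config; roughly (a sketch, the value is an example):

    -- keep five replicas of every range, so the cluster tolerates
    -- losing two nodes without losing quorum
    ALTER RANGE default CONFIGURE ZONE USING num_replicas = 5;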

Best thing to do is try it out -- you can play around with a paid license for free -- and try it on your data and workloads.


What other solutions are you evaluating?


As a DBA, I can say that it takes 5 to 10 years for a new database or file system to be ready (reliable, few global mutexes, third-party tools, programming language support, etc.) for production with paid customers.

So you should be using MySQL or Postgres for the Source of Truth (SoT) and things like Cassandra or CockroachDB for specialized use cases (caches, end-user metrics with TTL, etc.). But I would never put important data in a distributed database, since there are too many moving parts and it's impossible to verify that the input matches the output over a decade.

If you need multi-master for failover today, I would pick Percona XtraDB Cluster (I've used it in production for 5 years) or MySQL 8.

https://dev.mysql.com/doc/refman/8.0/en/group-replication-mu...

If you don't have paying clients, then it really doesn't matter what tools you choose.

An example of the above is MongoDB, irrationally beloved by HN. Until recently it had a global write mutex and a POS (MyISAM-class) storage engine. After most companies migrated off of it for production, there are now two InnoDB-class storage engines (native and WiredTiger) that add credibility ... after a decade of data loss and poor write performance. Those new engines will be ready for heavy-duty use in ... about 5 years.


Exactly. I also migrated companies off of MongoDB after a total data loss that even the vendor could not recover from or explain.

Postgres is a surprisingly capable, rock-solid service that is always my first choice. I also used Riak in production for several years, and the biggest problem with it was that the operations team forgot the root password because they had not needed to log in for a long time. :) I have heard great things about Percona XtraDB but I never used it. RocksDB is also interesting, but it is just a data store, not a full-fledged service; based on it there is one service called TiDB that I have heard is very reliable.


TiDB is not based on RocksDB; it runs on top of TiKV, which is a custom distributed key/value store written in Rust.



TiKV is built on top of RocksDB; see our chief engineer's talk on how we use RocksDB in TiKV: https://pingcap.com/blog/2017-09-15-rocksdbintikv/

Because of the write amplification issue of RocksDB, we built a RocksDB plugin, Titan. Here are its design and implementation details: https://pingcap.com/blog/titan-storage-engine-design-and-imp...


Excellent, thank you.


One small data point: our small MongoDB cluster serves around 2gb/s of traffic at times. It's much more reliable than it was three years ago.


Found your post interesting. That's it :-)


Also, how "exciting" is it?


Meaning what exactly?


"Exciting" technologies tend to keep you up on the weekends. The opposite of: BoringSSL.


Something like: I'd like to pretend people don't make technology choices out of boredom, but I'm tired of pretending.


Of course not. They make tech choices for their CV.




