#256 — May 31, 2019

Database Weekly

A Look at MetricsDB: A Time Series Database for Storing Metrics at Twitter — Twitter’s time series ingestion service handles 83 million metrics a second, and to scale into the future they had to seek a new approach. MetricsDB, which went live in 2017, reduces overall cost by 10x and latency by 5x compared to traditional key value stores.

Satish Kotha and Ilho Ye (Twitter)

▶  An Introduction to GridDB: A Scalable In-Memory, NoSQL Time Series Database — GridDB (homepage, in case you find videos annoying) hit our radar briefly in 2016 but appears to have grown up somewhat into an interesting IoT-optimized time series database. It’s open source (AGPL licensed) with the GitHub repo here.

Fixstars Corporation

Studio 3T Makes SQL Migration to MongoDB, Powerfully Simple — Now you can import an entire SQL database to MongoDB using Studio 3T and its new SQL Migration feature.

Studio 3T sponsor

Links to 756 (and Counting) Interesting Datasets — This is a rather varied grab bag of delights from datasets of every death in Tarantino movies and cruise ship inspections to cricket scores and visitors to National Parks.

Data Is Plural

Assessing Your Options for Real-Time Message Buses — Modern real-time data processing systems often rely on message buses with stream processing systems sitting on top. This post rounds up some of the former, including Apache Kafka, RabbitMQ, and ActiveMQ.



💻 Jobs

Senior Site Reliability Engineer - Invoca (Santa Barbara, CA or Remote) — Join our team of Operations Engineers deploying code to our production SaaS platform & public cloud infrastructure multiple times per day.


Find a DB Job on Vettery — Vettery specializes in tech roles and is completely free for job seekers.


📒 Tutorials and Stories

Using Docker Hub's PostgreSQL Images — Very helpful advice if you want to use the Docker-hosted images to spin up Postgres for things like CI jobs or development.

Craig Ringer

Migrating Data from SQL to Dgraph, a Graph Database — Dgraph is an advanced graph database, so the task of turning an SQL-oriented relational schema into something suitable is quite interesting.

Lucas Wang

What Are All These Specialized Databases All About? — This article explores the tools developers and DBAs can use to deliver the insights their business stakeholders demand.

Turbonomic sponsor

How to Automate Daily DevOps Database Tasks with Chef — Translate your manual DBA tasks into an automated set of rules with Chef.

Paul Namuag

How I Decimated the PostgreSQL Response Times for My SaaS — A tale of taking Postgres query response times from an average of ~100ms, with peaks of up to 1 second, down to a steady 1-10ms.

Tim Nolet

A Tale of SQL Query Optimization — “We went from a query time of ~24 mins to 2 seconds, an extremely dramatic performance improvement.”

Manish Gill

The Polyglot Problem: Solving the Paradox of the 'Right' Database — One monolithic database can rarely address all of an organization’s data requirements, so the idea of ‘polyglot persistence’ caught on, but running multiple databases has costs and downsides too.

Max Neunhoeffer (ArangoDB)

🛠 Code and Tools

PugSQL: A Python Interface for Using Parameterized SQL from Files — Define SQL queries in separate files and then use them on the fly from normal Python code. An interesting idea inspired by Clojure’s HugSQL.
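
To make the queries-in-files idea concrete, here is a minimal standard-library sketch of the pattern. It is not PugSQL's API: `load_queries`, the inline SQL text standing in for a file, and the `users` table are all illustrative assumptions, and the `-- :name` header simply mimics the HugSQL-style naming comment.

```python
import re
import sqlite3

# Hypothetical contents of a .sql file (inlined here so the sketch is
# self-contained). Each query is named by a "-- :name" comment header.
SQL_FILE_TEXT = """
-- :name create_users
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);

-- :name add_user
INSERT INTO users (name) VALUES (:name);

-- :name find_user
SELECT id, name FROM users WHERE name = :name;
"""


def load_queries(text):
    """Split SQL text into a {name: statement} dict keyed by the :name header."""
    # With one capture group, re.split yields
    # [preamble, name1, body1, name2, body2, ...].
    parts = re.split(r"^-- :name (\w+)[ \t]*$", text, flags=re.MULTILINE)
    return {name: body.strip() for name, body in zip(parts[1::2], parts[2::2])}


queries = load_queries(SQL_FILE_TEXT)

# An in-memory SQLite database is an assumption made for the demo.
conn = sqlite3.connect(":memory:")
conn.execute(queries["create_users"])
conn.execute(queries["add_user"], {"name": "ada"})  # named parameter binding
row = conn.execute(queries["find_user"], {"name": "ada"}).fetchone()
print(row)
```

The SQL stays plain SQL that can be edited and reviewed on its own, while the Python side only refers to queries by name and passes parameters as a dict.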


test_db: A Sample MySQL Database with an Integrated Test Suite — A sample database containing about 300,000 ‘employee’ records with 2.8 million salary entries. At 167MB it’s complex enough to be used for fair tests (or practicing your SQL!) without being overwhelming.

Giuseppe Maxia

modssl: An SSL Module for Redis and KeyDB — Any old-school Apache httpd users might bristle at the name, but this is an interesting attempt to promote the idea of bringing SSL to Redis and to expand the capabilities of Redis’s module API.

John Sully