#337 — January 15, 2021


Database Weekly

When Your Legacy Database Is Outgrowing Itself — The tale of scaling a database to support chess.com, an extremely popular site where you can play chess (surprisingly!). A lot of scaling issues popped up during lockdown as everyone rushed online to play games, and here’s how they dealt with them using their homebrew MySQL clustering approach.

Aleksandar Ikonic

Have The Tables Turned on NoSQL? — After you get over the clever headline, this is a reflection on the growth of NoSQL solutions in the prior decade, their weaknesses, and the resurgence of SQL as a ‘gold standard.’

John Biggs and Ryan Donovan (Stack Overflow)

A Global Serverless Database for All Your Applications — Modernize your OLTP database to a 100% operations-free approach without compromising on the capabilities that you need for building industrial grade enterprise applications. Fauna is the “Data API” for modern applications.

Fauna sponsor

Cockroach Labs Raises $160M on $2B Valuation — A big raise for the creators of CockroachDB, a scalable, widely distributed ‘cloud native’ transactional SQL database system. They intend to use the cash to double down on product development, and I wouldn’t be surprised to see an IPO in the medium term.

Cockroach Labs

An Unlikely Database Migration — We’ve all used text files (whether CSV or a lump of JSON) as ad hoc databases, and such was the case at Tailscale, a networking company. Here’s how they migrated from a single JSON blob ‘database’ to etcd to help their control plane scale into the future.

Brad Fitzpatrick and David Crawshaw

License Changes Coming to Elasticsearch and Kibana — Echoing similar moves by MongoDB a couple of years ago, the creators of Elasticsearch are keen to ensure anyone offering Elasticsearch as a service contribute code back or license it commercially. This poses risks for Elasticsearch users, however, says VM Brasseur.

Shay Banon (Elastic)

The Always-On Time-Series Database: Keeping Up Where There's No Way to Catch Up — A sort of written round-table discussion where three data architects/scientists talk about time series databases that can handle huge amounts of overlapping data without conflicts. This is a good example of thinking through a problem and would probably have made a good podcast session.

ACM

Cross-Account Replication with Amazon DynamoDB — Demonstration of a cost-effective way to migrate and sync DynamoDB tables across AWS accounts while having no impact on the source table performance and availability.

Ahmed Saef Zamzam and Dragos Pisaro

Re-Introducing Hash Indexes in Postgres — Billed as the ‘ugly duckling of index types’, this is a look at an uncommonly used type of index whose use was discouraged in the past but whose performance can sometimes beat the trusty B-tree as we see here.

Haki Benita

Simulating Amazon DynamoDB Unique Constraints using Transactions — DynamoDB has no built-in mechanism for ensuring the uniqueness of non-key attributes, but here’s a pattern for managing this from the application side.

Chad Tindel and Brett Hensley (AWS)
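
For a rough idea of how a transaction-based uniqueness pattern can look, here is a minimal Python sketch. It is not taken from the linked post: it assumes a table whose partition key is named pk, and the key prefixes user# and email# and the helper build_unique_user_put are hypothetical. The idea is to write the real item plus a second "marker" item keyed on the value that must be unique, each guarded by an attribute_not_exists condition, so that the whole transaction fails if either key is already present.

```python
def build_unique_user_put(table_name, user_id, email):
    """Build TransactItems for a user whose email must be unique.

    Assumptions (not from the linked article): the table's partition key
    is called "pk", and the "user#" / "email#" prefixes are illustrative.
    """
    not_exists = "attribute_not_exists(pk)"
    return [
        {
            # The item we actually want to store.
            "Put": {
                "TableName": table_name,
                "Item": {"pk": {"S": f"user#{user_id}"}, "email": {"S": email}},
                "ConditionExpression": not_exists,
            }
        },
        {
            # Marker item keyed on the email; its key uniqueness stands in
            # for a unique constraint on the email attribute.
            "Put": {
                "TableName": table_name,
                "Item": {"pk": {"S": f"email#{email}"}},
                "ConditionExpression": not_exists,
            }
        },
    ]


items = build_unique_user_put("users", "42", "a@example.com")
# With boto3 installed and AWS credentials configured, the request could be sent as:
#   boto3.client("dynamodb").transact_write_items(TransactItems=items)
```

Because both writes sit in one transaction, a duplicate email causes the marker write to fail its condition and the user item is not written either.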

Four SQL Window Functions and Examples for a Data Scientist Interview in 2021 — This is billed as being suitable for job interview practice, but reviewing window functions is never a bad idea as they can come in useful when you least expect it.

Leon Wei
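
As a quick refresher on what a window function does, here is a small self-contained Python sketch using the standard library's sqlite3 module (it needs SQLite 3.25 or later, which added window functions). The sales table and its rows are made up for illustration, and RANK and SUM are not necessarily among the four functions the linked article covers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (rep TEXT, region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("ann", "east", 300),
        ("bob", "east", 200),
        ("cat", "west", 500),
        ("dan", "west", 100),
    ],
)

# Unlike GROUP BY, a window function keeps every row and adds a value
# computed over the row's partition (here, its region).
query = """
SELECT rep, region, amount,
       RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS region_rank,
       SUM(amount) OVER (PARTITION BY region) AS region_total
FROM sales
ORDER BY region, region_rank
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each output row carries the rep's own amount alongside their rank within the region and the region's total, which is awkward to express with plain aggregates.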

Free eBook: Efficient Search in Rails with Postgres — Speed up a search query from seconds to milliseconds and learn about exact matches, trigrams, ILIKE, and full-text search.

pganalyze sponsor

How MongoDB’s Drivers Team Makes It Easier for Developers to Innovate — Kaitlin Mahar, Lead Engineer at MongoDB, shares more about their Drivers team and how they support developers within the open source community.

MongoDB, Inc.

Cloud Data Warehouses and Cloud Data Lakes: There’s No Need to Choose

Venkat Chandra

How To Install and Use SQLite on Ubuntu 20.04

Gareth Dwyer

🔨 Code and Tools

MindsDB: A Predictive AI Layer for Existing Databases — Some of the code behind a company aiming to bring ML-based predictive models to existing databases using regular SQL.

MindsDB Inc

Irmin: A Distributed Database Built on Git's Principles — An OCaml library for building mergeable, branchable distributed data stores. We first mentioned this several months ago but it’s seen some releases since.

MirageOS

Loading Complex CSV Files into BigQuery using Google Sheets

Google Cloud Blog