#337 — January 15, 2021 |
Database Weekly |
When Your Legacy Database Is Outgrowing Itself — The tale of scaling a database to support chess.com, an extremely popular site where you can play chess (surprisingly!). A lot of scaling issues popped up during lockdown as everyone rushed online to play games, and here’s how they dealt with them using their homebrew MySQL clustering approach. Aleksandar Ikonic |
Have The Tables Turned on NoSQL? — After you get over the clever headline, this is a reflection on the growth of NoSQL solutions in the prior decade, their weaknesses, and the resurgence of SQL as a ‘gold standard.’ John Biggs and Ryan Donovan (Stack Overflow) |
A Global Serverless Database for All Your Applications — Modernize your OLTP database to a 100% operations-free approach without compromising on the capabilities that you need for building industrial grade enterprise applications. Fauna is the “Data API” for modern applications. Fauna sponsor |
Cockroach Labs Raises $160M on $2B Valuation — A big raise for the creators of CockroachDB, a scalable, widely-distributed ‘cloud native’ transactional SQL database system. They intend to use the cash to double down on product development, and I wouldn’t be surprised to see an IPO in the medium term. Cockroach Labs |
An Unlikely Database Migration — We’ve all used plain text files (whether CSV or a lump of JSON) as ad hoc databases, and such was the case at Tailscale, a networking company. Here’s how they migrated from a single JSON blob ‘database’ to etcd to help their control plane scale into the future. Brad Fitzpatrick and David Crawshaw |
License Changes Coming to Elasticsearch and Kibana — Echoing similar moves by MongoDB a couple of years ago, the creators of Elasticsearch are keen to ensure that anyone offering Elasticsearch as a service either contributes code back or licenses it commercially. This poses risks for Elasticsearch users, however, says VM Brasseur. Shay Banon (Elastic) |
The Always-On Time-Series Database: Keeping Up Where There's No Way to Catch Up — A sort of written round-table discussion in which three data architects/scientists talk about time series databases that can handle huge amounts of overlapping data without conflicts. This is a good example of thinking through a problem and would probably have made a good podcast session. ACM |
Cross-Account Replication with Amazon DynamoDB — A demonstration of a cost-effective way to migrate and sync DynamoDB tables across AWS accounts with no impact on the source table’s performance or availability. Ahmed Saef Zamzam and Dragos Pisaro |
Re-Introducing Hash Indexes in Postgres — Billed as the ‘ugly duckling of index types’, this is a look at an uncommonly used type of index whose use was discouraged in the past but whose performance can sometimes beat the trusty B-tree as we see here. Haki Benita |
Simulating Amazon DynamoDB Unique Constraints using Transactions — DynamoDB has no built-in mechanism for ensuring the uniqueness of attributes outside the primary key, but here’s a pattern for managing this from the application side. Chad Tindel and Brett Hensley (AWS) |
Four SQL Window Functions and Examples for a Data Scientist Interview in 2021 — This is billed as being suitable for job interview practice, but reviewing window functions is never a bad idea as they can come in useful when you least expect it. Leon Wei |
Free eBook: Efficient Search in Rails with Postgres — Speed up a search query from seconds to milliseconds and learn about exact matches, trigrams, ILIKE, and full-text search. pganalyze sponsor |
How MongoDB’s Drivers Team Makes It Easier for Developers to Innovate — Kaitlin Mahar, Lead Engineer at MongoDB, shares more about their Drivers team and how they support developers within the open source community. MongoDB, Inc. |
Cloud Data Warehouses and Cloud Data Lakes: There’s No Need to Choose Venkat Chandra |
How To Install and Use SQLite on Ubuntu 20.04 Gareth Dwyer |
🔨 Code and Tools |
MindsDB: A Predictive AI Layer for Existing Databases — Some of the code behind a company aiming to bring ML-based predictive models to existing databases using regular SQL. MindsDB Inc |
Irmin: A Distributed Database Built on Git's Principles — An OCaml library for building mergeable, branchable distributed data stores. We first mentioned this several months ago but it’s seen some releases since. MirageOS |
Loading Complex CSV Files into BigQuery using Google Sheets Google Cloud Blog |