#247 — March 29, 2019

Database Weekly

Going Deep on the Design of Amazon Aurora — the morning paper recently featured two papers that dig into how Amazon’s Aurora database system works. Aurora is a MySQL- and PostgreSQL-compatible database that uses a lot of optimizations under the hood, and these papers explain them. The second paper covers the distributed consensus model.

Adrian Colyer

Was MongoDB Ever the Right Choice? — If you’re looking for a place to rant and rave about MongoDB, this isn’t it. As an innovation in the database world, MongoDB has received a lot of criticism over the years but the problems it solves are important, says the author. (And if you're a MongoDB user, head over to our MongoDB newsletter :-))

Justin Etheredge

Why to Use a Relational Database for Your IoT Applications — Don’t make the mistake of creating data silos. With a relational database like TimescaleDB, you can unify time-series, metadata, & geospatial data in a single database system that scales from the cloud to the edge. Which is why Azure partnered with Timescale to power IoT & time-series workloads.

Timescale sponsor

How to Build and Run the Open Distro For Elasticsearch SQL Plugin with Elasticsearch OSS — We recently mentioned Amazon’s controversial new distribution of Elasticsearch and now we get to see one part of it in action: a plugin that lets you query an Elasticsearch store using SQL.

Jon Handler

McDonald's Bites on Big Data With $300M Acquisition — McDonald’s brings Big Data to its Big Macs with its largest acquisition in 20 years. The plan is to use data analysis to suggest products customers are more likely to buy.

Brian Barrett

Neo4j Unveils Its 'Startup Program' Offering Graph Database Tech to Small Companies — The company behind the Neo4j graph database has started a program opening up its enterprise-level offerings to certain types of startups and small companies.


📖 Tutorials

Redis Streams as a Pure Data Structure — The creator of Redis, the data structure server, looks at streams as a pure data structure and how they can be like “CSV files on steroids.”

Salvatore Sanfilippo

The Best Way to Count Distinct Indexed Things — It’s significantly more efficient to do a count of a subquery’s results than to try to do a count with DISTINCT in a single query.

Peter Bengtsson
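
To make the two query shapes in that item concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table, column, and index names are hypothetical, and SQLite is only a stand-in: the performance claim belongs to the linked article, and this sketch shows only that both forms return the same count, not which is faster.

```python
import sqlite3

# Hypothetical table and column names, chosen only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE songs (id INTEGER PRIMARY KEY, artist TEXT)")
conn.execute("CREATE INDEX songs_artist_idx ON songs (artist)")
conn.executemany(
    "INSERT INTO songs (artist) VALUES (?)",
    [("abba",), ("abba",), ("beck",), ("cher",), ("beck",)],
)

# Form 1: COUNT with DISTINCT in a single query.
single_query = conn.execute(
    "SELECT COUNT(DISTINCT artist) FROM songs"
).fetchone()[0]

# Form 2: count the rows produced by a DISTINCT subquery.
subquery = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT artist FROM songs) AS t"
).fetchone()[0]

print(single_query, subquery)
```

One caveat if you try this on your own data: COUNT(DISTINCT col) ignores NULLs, while the subquery form counts NULL as one distinct row, so the two results can differ by one on a nullable column.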

pganalyze eBook: How to Get a 3x Performance Improvement on Your Postgres Database — Learn our best practices for optimizing Postgres query performance for customers like Atlassian and how to reduce data loaded from disk by 500x.

pganalyze sponsor

Indexes in Postgres: A Look at B-Trees — The latest in a series of extensive posts taking a deep dive into how indexes work.

Egor Rogov

Speeding Up GROUP BY in PostgreSQL — SQL’s GROUP BY groups the records in a result set, often for summary or aggregation purposes, and there’s a way to speed up such queries and gain some ‘free’ extra performance, claims the author!

Hans-Jürgen Schönig

💬 Stories and Opinions

How We Moved a Massively Parallel Postgres DB onto Kubernetes

Oz Basarir (Pivotal)

Some Numbers You'll Know by Heart If You Have Been Working with SQL Server for A While — A light-hearted, tongue-in-cheek piece for SQL Server users. I think every database could have an equivalent article!

Denis Gobo

A Tale Of Two Queries: Standing Up for ANSI-89 SQL JOIN Syntax — I’ve gotta admit, this is how I still do joins too. 👴🏻

Allan Hirt
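
For anyone who hasn't seen the older style, here is a rough sketch of the two join syntaxes side by side, run against an in-memory SQLite database from Python. The tables and data are made up for illustration and are not taken from the linked article.

```python
import sqlite3

# Hypothetical schema and rows, used only to compare the two syntaxes.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO posts VALUES (1, 1, 'first'), (2, 1, 'second'), (3, 2, 'third');
""")

# ANSI-89 style: tables separated by commas, join condition in WHERE.
ansi89 = conn.execute(
    "SELECT a.name, p.title FROM authors a, posts p "
    "WHERE a.id = p.author_id ORDER BY p.id"
).fetchall()

# ANSI-92 style: explicit JOIN ... ON.
ansi92 = conn.execute(
    "SELECT a.name, p.title FROM authors a "
    "JOIN posts p ON a.id = p.author_id ORDER BY p.id"
).fetchall()

print(ansi89 == ansi92)
```

For an inner join like this one, both forms return the same rows; the difference is where the join condition lives.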

🛠 Code & Tools

automl-gs: Quickly Perform Machine Learning on CSV Files — OK, the title doesn’t quite get at what this is, but it’s neat. Give automl-gs a CSV file and it’ll create a model for predicting the values of a field of your choice that you can work with from Python.

Max Woolf

Bitraft: A Bitcask Distributed Key/Value Store using Raft for Consensus with a Redis Compatible API — Bitcask is both a high performance Go key/value store and a storage format used by Riak. Bitraft adds Raft-powered consensus to distribute the store.

James Mills

csvq: Use an SQL-like Query Language on CSV Files — Includes an interactive REPL.