#289 — January 31, 2020

Database Weekly

The Rise and Fall of the OLAP Cube — A post about the shift away from building ‘data cubes’ to running OLAP workloads on columnar databases, complete with a look at the history and motivations.

Cedric Chin

Powering Pinterest Ads Analytics with Apache Druid — A look at why Pinterest moved from HBase to Druid (a database designed specifically for high performance real-time analytics).

Pinterest Engineering

Why a Gaming App Migrated Off Cassandra — When Cassandra’s data modeling limitations started influencing and restricting higher-level design choices, these MMOG developers looked to CockroachDB.

Cockroach Labs sponsor

Starting Out with Data Puddles, Then We’ll Think About Data Lakes — Last year Comic Relief, a major British charity, wrote about their journey to ‘90% serverless’. Today we see how they’re re-thinking their data ingestion, storage and query stack with Lambda, S3 and Athena.

Adam Clark

Amazon Relational Database Service (RDS) Can Now Export Snapshots to S3 — You can now export Amazon Relational Database Service (Amazon RDS) or Amazon Aurora snapshots to Amazon S3 as Apache Parquet, an efficient open columnar storage format for analytics.

Amazon Web Services

Engineering SQL Support on Apache Pinot at Uber — The story of how Uber has worked on adding full SQL support on Apache Pinot to enable quick analysis and reporting on aggregated data.

Haibo Wang

💻 Jobs

Full Stack Engineer — Expensify seeks a self-driven individual passionate about making code effective, with an understanding of algorithms and design patterns.

Expensify

MongoDB Consultant - Remote Americas — Work on projects with a wide variety of companies and on any database architecture you can imagine.

Percona

📄 And the rest

Migrating from Oracle to Postgres: Tips and Tricks — Covers a handful of common tripping points like checking for NOT NULL columns and the GRANT command, plus using Orafce, an extension containing numerous compatibility functions to make an Oracle to Postgres transition smoother.

Yorvi Arias

How I Write/Format SQL Code — “Most of this comes from my time as a Data Engineer at Facebook.”

Marton Trencseni

Easy Fixes For SQL Queries — A handful of rules of thumb (indeed, five thumbs) for querying any traditional relational database.

Kovid Rathee

Using SQL's EXISTS and NOT EXISTS — EXISTS has been part of the SQL standard since SQL:86 but it’s frequently underused.

Vlad Mihalcea

MongoDB Still a Mystery to You? Try Studio 3T Today for a Full 30 Days — Instant driver code for JavaScript, Python, Ruby, and more. Build fast queries with our drag & drop editor.

Studio 3T sponsor

Billy: How VictoriaMetrics Deals with More Than 500 Billion Rows — Re-running ScyllaDB’s ‘Billy’ benchmark on VictoriaMetrics, a scalable time-series database.

Aliaksandr Valialkin

Distributed SQL vs. 'NewSQL' — The term ‘NewSQL’ was coined to describe relational database systems with NoSQL-esque features and OLTP scalability.

Sid Choudhury (YugaByte)

pg_timetable: Advanced Postgres Job Scheduling — Looking at a new job scheduler for Postgres implemented from scratch in Go that’s not just about running single queries at set times but that can also execute more complicated sequences of operations. GitHub repo.

Hans-Jürgen Schönig

Recommending GNU Recutils — GNU Recutils is a set of tools and libraries to access human-editable, plain text databases called ‘recfiles’.

James Tomasino

Streams and Tables in Apache Kafka: Elasticity, Fault Tolerance and Advanced Concepts

Michael Noll

How to Remove Times from Dates in SQL Server

Brent Ozar