#324 — October 2, 2020


Database Weekly

Amazon Timestream Goes GA: Time Series Data 'at Any Scale' — Given Timescale’s big announcements last week, Amazon’s formal GA release of its Timestream time series data store comes at an interesting time. The popularity of time series databases shows no sign of waning and AWS clearly wants a piece of the action.

Amazon Web Services

🤓 Time-Series Compression Algos, Explained » — Get a deep dive into the history of compression algorithms, how they work, and when and why to apply them to your projects (✨ fun fact: TimescaleDB applies several of them to get 90%+ storage efficiency).

Timescale sponsor

Pandemic Driving ‘Back to Basics’ in Big Data, Study Suggests — Much as riskier equities tend to get abandoned during crisis times, so too with more fanciful big data projects, it seems. BI and data visualization use is up at the expense of AI and machine learning.

Datanami

ClickHouse, Redshift and 2.5 Billion Rows of Time Series Data — Rounding out our focus on time series data this week, here Brandon uses AWS to generate 2.5 billion rows of true time series data and uses ClickHouse (a ‘big data scale OLAP RDBMS’) to demonstrate some very impressive query performance.

Brandon Harris

⚡️ Quick bytes:

Using AWS Lambda as a Consumer for Amazon Kinesis — Some best practices when using Lambda with Kinesis for high-throughput, low-latency data stream processing.

James Beswick

Prisma’s Data Guide — A growing library of articles making databases more approachable. Topics include data modeling, Postgres, and DB basics.

Prisma sponsor

Build a Data Streaming Pipeline using Kafka Streams and Quarkus — One for team Java. Build a data streaming and processing pipeline using Kafka concepts like joins, windows, processors, state stores, punctuators, and interactive queries.

Kapil Shukla (Red Hat Developer)

Tips for Running MongoDB in Production Using Change Streams — Real-time tracking and auditing functionality has crept into a lot of databases that didn’t have it by default. ‘Change Streams’ are MongoDB’s approach, letting applications access real-time data changes without endless polling.

Onyancha Brian Henry

⚙️ Code and Tools

dbcrossbar: Move Large Datasets Between Different Databases and Formats — Copy tabular data between databases, CSV files and cloud storage. Written in Rust.

Faraday, Inc.

rqlite 5.5: A Distributed Relational Database Built on SQLite — Think SQLite but turned into a ‘proper’ distributed database (using Raft consensus) and that’s what you get here. v5.5.0 adds support for parameterized SQL statements.

rqlite

sqlbench: Measures and Compares The Execution Time of SQL Queries — Only for Postgres right now, though pull requests for other databases are welcome. Written in Go.

Felix Geisendörfer

Jailer 10.0: A Database Subsetting and Relational Data Browsing Tool — Navigate bidirectionally through databases by following foreign-key-based or user-defined relationships. Built in Java, it works with any relational database that has a JDBC driver.

Wisser