Issue 178 — October 27, 2017
TileDB: A Multi-Dimensional Array Data Management System
— A database that started life at MIT and Intel, designed for storing massive dense and sparse multi-dimensional array data.
ELI5: How Does A Database Handle 1B Users?
— It was interesting to see a popular question on database scaling on Reddit’s ‘Explain Like I’m 5’ subreddit this week.
Diagnose query latency with Datadog
— To get to the bottom of database performance issues, you need high-resolution data from all your systems. See metrics from all your database instances, client applications, load balancers, and more with Datadog.
Postgres 10 vs. TimescaleDB for Time-Series Data
— PostgreSQL 10 promises an easier partitioning mechanism to help scale for big data. But how does that compare to TimescaleDB for time-series workloads?
Amazon Aurora Now with PostgreSQL Compatibility
— AWS Aurora, which helps you grow beyond the 6TB limit of RDS, is now generally available for Postgres on top of RDS.
How to Dump Google Analytics for SQL in 3 Hours
— A look at using cloud services to track requests to a tracking pixel and then using BigQuery to analyze the results.
Open-Source ML Server 'PredictionIO' Gets Apache Promotion
— Apache PredictionIO, an open source machine learning server and toolset, is now a ‘top level’ Apache project.
Postgres Domain Integrity In Depth
— An extensive write-up about ways to keep your data ‘squeaky clean’ by way of integrity constraints. There’s a lot to learn from here.
World-Class Data Engineering with Amazon Redshift
— Join us on Nov 17 in San Francisco for 2.5 hours of private training. Get $100 off with the code “DBWEEKLY”.
SQLite 3.21.0 Released
— Mostly tweaks, fixes, and optimizations.
7 Keys to Better MySQL Performance
Real-Time Databases Explained: Why Meteor, RethinkDB, Parse and Firebase Don't Scale
— There is also an article covering the same ground.
The Best Bits of Postgres 10 for Developers
— A brief overview of some favourite developer-focused features in Postgres 10.
Data Squeeze: Hadoop Utility to Compact Small Files
GoCD - Open Source Continuous Delivery Server
— Open source continuous delivery tool specializing in advanced workflow modeling and dependency management.
Reflow: A Language and Runtime for Distributed, Incremental Data Processing in the Cloud
Hyrise: An In-Memory Hybrid Storage Engine
Spydra: Ephemeral Hadoop Clusters using Google Compute Platform