#344 — March 5, 2021

Database Weekly

sq: A General Swiss-Army Knife for Data — There are a lot of ‘Swiss Army knife’-esque data 'multitools' around (such as xsv for CSV or jq for JSON) – think of this one as being like jq but for all sorts of formats and systems. You can import an Excel sheet into a Postgres table, move query results from SQL Server to SQLite, or export an entire database to CSV, among other things. Many formats are supported, and Postgres, SQLite, SQL Server and MySQL are the supported RDBMSes so far.

Neil O'Toole

Clickhouse as an Alternative to ElasticSearch and MySQL — Clickhouse is an open-source column-oriented OLAP system and the author talks about how and, importantly, why his team are using it for log storage and analytics for their project.

Anton Sidashin

[Guide] How to Calculate The True Cost of a Database — Use this guide to add up your license costs, operational overhead costs, infrastructure costs and everything in between - so that you have a clear picture of what you're spending (and where you can spend less).

CockroachDB sponsor

Google Cloud Memorystore: It's Managed memcached — memcached is a long-standing memory object caching system that was originally created at LiveJournal(!), and Google Cloud Platform has its own managed memcached system called Memorystore, which is now generally available.

Google Cloud

A Look Back at 2011 and the Emergence of Hadoop — Datanami is a source we often link to as a strong news site in the data science and analytics space, and they’re celebrating their 10th birthday with a look back at how things were in 2011. They say Hadoop is almost a ‘dirty word’ today, but back in 2011, it was the cutting edge of ‘big data coolness..’


Quick Bits

How to Use a Machine Learning Model from a Google Sheet using BigQuery ML — Spreadsheets aren’t going away any time soon, so any project that involves bringing modern data science practices to spreadsheets immediately gets my interest.

Karl Weinmeister (Google Cloud)

Building a Recommendation Engine Inside Postgres with Python and Pandas — Learn how you can leverage Python and Pandas (a popular data analysis tool) from directly inside Postgres to build your own recommendation engine.

Craig Kerstiens

Creating Amazon Timestream Interpolated Views using Amazon Kinesis Data Analytics for Apache Flink — Sounds like a bit of buzzword bingo in the title, eh? The idea here is building a streaming data pipeline where aggregations are generated during ingestion to enable faster eventual querying by a dashboard (QuickSight, in this case).

Will Taff and John Gray (AWS)

Migrating to Aurora: Easy, Except The Bill — “Migrating our production database from Postgres to Aurora was easy, until we noticed that our daily database costs more than doubled.” Luckily they went on to mitigate this, though they seem mildly ambivalent about the move.

Kimberly Nicholls

How to Efficiently Choose the Right Database for Your Apps — Note that this is on PingCAP’s blog (the creators of TiDB, an open source distributed SQL database) so be aware of the bias. Nonetheless, some reasonable questions are asked here and the author uses several databases in their systems (including MySQL, Redis, and Couchbase).

Leitao Guo

AWS Claims 'Better Performance for Less' versus Azure for SQL Server — Is this AWS tooting their own horn? Yes, it is. But it’s just another small step in their ongoing rivalry with Microsoft over running SQL Server workloads. It would be fun to see Azure’s comeback to this.

Fred Wurden (AWS)

Delivering an Even Better Redis Experience on Azure — We don’t want to be seen as biased, so let’s let Azure do a little cheerleading too – this post covers the new Enterprise level offering of Azure Cache for Redis (Azure’s managed Redis service).

Kyle Teegarden (Microsoft)

Building an Inventory Management System with Google BigQuery and Cloud Run

Aja Hammerly (Google Cloud)

🛠 Projects and Tools

OrbitDB: Peer to Peer Databases for the Decentralized Web — A serverless, distributed, peer-to-peer database that uses IPFS for storage and automatically syncs across peers. It’s limited to Node and browser use cases for now, and an introductory guide is available.

OrbitDB Community

Import a SQL Database to MongoDB in 5 Steps with Studio 3T Enterprise — The easiest way to import an entire SQL database to MongoDB is with Studio 3T and its innovative SQL Migration feature.

Studio 3T sponsor

TerminusDB 4.2: Open Source Graph Database and Document Store — TerminusDB is aimed at ‘knowledge base’ type use cases where things like immutability, data lineage and collaboration are important. It’s not your typical graph database, especially as it’s built in Prolog!

Luke Feeney

ScalarDB 3.0: A Java Library That Makes Non-ACID Distributed Databases ACID-Compliant — ScalarDB is not a database in its own right but a Java client that extends the functionality of other data stores you might be using (such as Cassandra).


💻 Jobs

DevOps Engineer at X-Team (Remote) — Join the most energizing community for developers and work on projects for Riot Games, FOX, Sony, Coinbase, and more.