#358 — June 11, 2021

✍️  As part of our winding down towards our hiatus in July, we're looking back at some of the best things we've covered over the past few years. This week it's a bumper roundup of data manipulation, processing, or transformation tools — all of which are still available or actively maintained :-)
Peter Cooper, editor

Database Weekly

The Data Processing Tools Special Edition

The focus in this issue is on small or single-use tools rather than large enterprise-y systems: things you can run on the Web or on your own machine for slicing and dicing various forms of data, or for connecting to and managing the databases and queries you already use. Enjoy!

Superintendent.app: Load CSV Files and Perform SQL Queries on Them — The only 'news' item in this issue, but it fits so well. Superintendent is a new desktop app (for Windows, macOS and Linux) that pulls in CSV files so you can query them using SQL. The results can then be saved back to CSV, if you wish.

Superintendent Team
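
The general idea of querying CSV with SQL can be sketched in a few lines of Python using only the standard library. This is not how Superintendent itself works internally (the source doesn't say); it is a minimal illustration that loads CSV text into an in-memory SQLite table, runs a query, and writes the result back out as CSV. The function name `query_csv`, the default table name `data`, and the sample data are all hypothetical.

```python
import csv
import io
import sqlite3


def query_csv(csv_text: str, sql: str, table: str = "data") -> str:
    """Load CSV text into an in-memory SQLite table, run `sql`,
    and return the result set as CSV text (header row included)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]

    conn = sqlite3.connect(":memory:")
    # Columns are created without types, so every value is stored as text.
    columns = ", ".join(f'"{name}"' for name in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', body)

    cursor = conn.execute(sql)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([col[0] for col in cursor.description])
    writer.writerows(cursor.fetchall())
    conn.close()
    return out.getvalue()


# Made-up sample data, purely for illustration.
SAMPLE = "city,population\nOslo,700000\nBergen,285000\nTromso,77000\n"

result = query_csv(
    SAMPLE,
    "SELECT city FROM data "
    "WHERE CAST(population AS INTEGER) > 100000 ORDER BY city",
)
print(result)
```

Because every column is loaded as text, numeric comparisons need an explicit `CAST`, as in the query above.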

Datasette: An Open Source 'Multi-Tool' for Exploring and Publishing Data — A tool aimed at anyone who has data they wish to share with the world (e.g. data journalists, curators, archivists). It has become a mature and popular tool over the years and is well worth exploring. Simon's Personal Data Warehouses talk is also worth a watch.

Simon Willison

Fauna - A Flexible, Developer-Friendly, Serverless Database — Fauna is the “Data API” for modern applications. Whether you’re building new microservices or augmenting existing services or applications, Fauna lets you simplify code, reduce costs and ship faster. Learn more...

Fauna sponsor

SQL Fiddle: A Tool for Quick Testing of Database Queries — Set up your schema (which can be empty) on the left then issue queries on the right. Supports specific (slightly old) versions of MySQL, Oracle, Postgres, and SQL Server. DB Fiddle is a similar tool covering newer versions of MySQL and Postgres.

ZZZ Projects

jOOQ SQL Translation: Translate SQL From One Dialect to Another — While SQL has been standardized numerous times, its implementation does tend to vary, so this tool continues to provide an interesting way to see the differences on your own queries. I love it.

Lukas Eder

SQLPad: A Web-Based SQL Editor to Run in Your Private Cloud — SQLPad supports a vast array of systems from standard RDBMS to things like BigQuery, Cassandra, and Apache Pinot.

Rick Bergfalk

VisiData 2: A Terminal Spreadsheet Multitool for Discovering and Arranging Data — Here’s an introductory tutorial on what you can use it for.

Saul Pwanson

Fx: A Command-Line JSON Processing Tool — If you’ve got some files full of JSON that you want to process, Fx will slice and dice it however you want, including using JavaScript one-liners to add a bit of logic to the process. I’ve used this a lot.

Anton Medvedev

sqlfmt: An Opinionated Online SQL Formatter — Provide sqlfmt with some SQL, specify a target line width, and get back some better formatted SQL that takes SQL’s syntax into account. Ideal for blog posts or documentation.

Matt Jibson

Avoiding Integer Overflows with Zero Downtime at Buildkite — Learn how Buildkite migrated over 2 billion rows in one of its largest tables.

Buildkite sponsor

Unix Command Line Tricks For Data Scientists — OK, we’re kinda cheating here, but Unix favorites like sort, split, head, grep, sed and awk make up a serious part of my own data processing arsenal, at least.

Kade Killary

A List of CLI Tools for Manipulating Structured Text Data — A list of around 80 tools for working with DSV/CSV, XML, HTML, JSON, YAML, INI and other formats.

Danyil Bohdan

usql: A Universal CLI Tool for Databases — A CLI tool (written in Go) for working with Postgres, SQL Server, MySQL, SQLite3, Oracle Database, CockroachDB, and many more. A database Swiss Army knife, if you will.

Kenneth Shaw

Mockaroo: A Random Data Generator — Specify your schema and get back ‘realistic’ test data in CSV, JSON, SQL or Excel format. You can generate up to 1,000 rows for free. GenerateData.com is a similar tool in this space.

Mockaroo, LLC
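
To give a feel for the "specify a schema, get rows back" idea, here is a toy sketch in Python. It has nothing to do with Mockaroo's own implementation or API; the field kinds (`id`, `first_name`, `age`), the name list, and the `generate_csv` function are all invented for illustration.

```python
import csv
import io
import random

# Hypothetical pool of values for the "first_name" field kind.
FIRST_NAMES = ["Ada", "Grace", "Linus", "Margaret"]

# Each field kind maps to a function of (random generator, row index).
FIELD_GENERATORS = {
    "id": lambda rng, i: i + 1,
    "first_name": lambda rng, i: rng.choice(FIRST_NAMES),
    "age": lambda rng, i: rng.randint(18, 90),
}


def generate_csv(schema, count, seed=0):
    """Return `count` rows of fake data as CSV text.

    `schema` is a list of (column_name, field_kind) pairs; a fixed
    seed keeps the output reproducible between runs.
    """
    rng = random.Random(seed)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([name for name, _ in schema])
    for i in range(count):
        writer.writerow([FIELD_GENERATORS[kind](rng, i) for _, kind in schema])
    return out.getvalue()


schema = [("user_id", "id"), ("name", "first_name"), ("age", "age")]
fake_csv = generate_csv(schema, 5)
print(fake_csv)
```

Real generators offer far richer field types and output formats; the sketch only shows the basic shape of the technique.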

SQLSmith: A Random SQL Query Generator — It hasn’t been updated in a few years, but this tool went beyond merely being interesting to actually discovering bugs in real database systems. SQLancer is another, newer tool in this space.

Andreas Seltenreich


Find Data Engineering Jobs with Hired — Take 5 minutes to build your free profile & start getting interviews for your next job. Companies on Hired are actively hiring right now.

DBDiagram: A Database Designer for Developers and Analysts — A Web-based tool to help you draw database relationship diagrams and flows quickly using a simple markup language/DSL.

Holistics Software

dadbod.vim: A Modern Database Interface for Vim — A Vim plugin for interacting with numerous databases, including Postgres.

Tim Pope

q: Run SQL on CSV or TSV Files — A perennially useful utility.

Harel Ben-Attia

xsv: A Fast CSV Command Line Toolkit Written in Rust — Another ‘Swiss Army knife’ for your slightly structured data.

Andrew Gallant

Dbmate: A Lightweight, Framework-Agnostic Database Migration Tool — Written in Go but can be used alongside database-using apps written in any language. Supports MySQL, Postgres, SQLite and ClickHouse.

Adrian Macneil

dbcrossbar: Move Large Datasets Between Different Databases and Formats — Copy tabular data between databases, CSV files and cloud storage. Written in Rust.

Faraday, Inc.

PostgREST: Serve a RESTful API from a Postgres Database — We’ve covered this popular tool numerous times in Postgres Weekly but never here. It’s a standalone app that turns a PostgreSQL database into a RESTful API. Prefer GraphQL? PostGraphile does similar things in that area.

Joe Nelson and Steve Chavez

gron: Make JSON Greppable — A tool written in Go that transforms JSON into more easily greppable assignments, so you can use grep and see the context/path of the result. Useful.

Tom Hudson
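
If "greppable assignments" sounds abstract, this rough Python sketch shows the kind of transformation involved. It is a simplified imitation, not gron's actual code: it does not sort keys or quote keys that aren't valid identifiers, and the `flatten` function and sample document are made up for illustration.

```python
import json


def flatten(value, path="json"):
    """Yield one assignment line per node of a parsed JSON value,
    so that every line carries the full path to its value."""
    if isinstance(value, dict):
        yield f"{path} = {{}};"
        for key, child in value.items():
            yield from flatten(child, f"{path}.{key}")
    elif isinstance(value, list):
        yield f"{path} = [];"
        for index, child in enumerate(value):
            yield from flatten(child, f"{path}[{index}]")
    else:
        # Scalars are rendered back as JSON literals (strings stay quoted).
        yield f"{path} = {json.dumps(value)};"


# Made-up sample document.
doc = json.loads('{"user": {"name": "Ada", "tags": ["admin", "dev"]}}')
lines = list(flatten(doc))
print("\n".join(lines))

# A grep-style filter now shows where the match lives in the document.
matches = [line for line in lines if "admin" in line]
print(matches)
```

Since each output line is self-contained, a plain text search returns both the value and the path that leads to it.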