Hamilton Ulmer

Tools for Data Analysis

duckdb wishlist, two months in: 'is the duck rude?', cancelable queries, memory footprints, & javascript UDFs
Feb 21, 2022

This is an update to an earlier quick post about my duckdb feature wishlist. We’ve been using duckdb’s node library very productively for almost two months now (basically since I started my new job).

I’ve come to realize that duckdb can do much more than run analytical queries; you can build entire analytical systems that do exploratory data analysis for you. People responded enthusiastically to one of my automatic EDA demos on Twitter. We’re going to go very deep on this idea in the context of “data modeling”, meaning figuring out how we transform data throughout a data pipeline. Anyone who has been deep into data science or data engineering knows that a lot of our day-to-day involves cleaning and profiling datasets. Why aren’t our tools helping us do this more efficiently? Boggles the mind. I have some ideas about how to make this better.

But first – “is the duck rude?” Before I get to my updated wishlist, one point of order: a colleague of mine wondered why the duck on the duckdb website seems to be turning away from her. It seems that the brain may register it this way because the beak is the same color as the body.

Once you see the duck turning away, you can’t unsee it.

The solution is simple: add a triangular easing function for the beak color that peaks as the head faces forward. It’ll then be clear that the duck is not in fact turning away from the viewer.
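For the curious, here is a rough sketch of what that easing might look like. Everything in it is my own assumption and not how the duckdb site is actually built: `phase` is a made-up parameter for how far through its turn the head is (0 and 1 are the extremes, 0.5 is facing the viewer), and both colors are hypothetical placeholders.

```typescript
// Triangular easing: 0 at either end of the turn, rising linearly to 1 at the midpoint.
// Assumption: `phase` is the head's rotation progress in [0, 1], with 0.5 = facing forward.
function triangularEase(phase: number): number {
  const p = Math.min(1, Math.max(0, phase)); // clamp to [0, 1]
  return 1 - Math.abs(2 * p - 1);
}

type RGB = [number, number, number];

// Hypothetical colors, chosen only for illustration.
const BODY_COLOR: RGB = [255, 240, 0];
const BEAK_ACCENT: RGB = [255, 128, 0];

// Linear blend between two colors; weight 0 gives `from`, weight 1 gives `to`.
function mixColor(from: RGB, to: RGB, weight: number): RGB {
  const mix = (i: number): number => Math.round(from[i] + (to[i] - from[i]) * weight);
  return [mix(0), mix(1), mix(2)];
}

// Beak color for a given point in the head turn, as a CSS rgb() string.
function beakColor(phase: number): string {
  const [r, g, b] = mixColor(BODY_COLOR, BEAK_ACCENT, triangularEase(phase));
  return `rgb(${r}, ${g}, ${b})`;
}
```

With this, the beak matches the body at the extremes of the turn and is most distinct exactly when the duck looks at you, which is the cue the brain seems to need.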

At any rate, here are the features I’m generally hoping for today, about two months since I went head-first into duckdb land: