How does it compare with duckdb, which I usually resort to?
What I like with duckdb is that it's a single binary, no server needed, and it's been happy so far with all the CSV files I've thrown at it.
clickhouse-local is similar to duckdb: you don't need a clickhouse-server running in order to use it. You just download the clickhouse binary and start using it.
clickhouse local
ClickHouse local version 25.4.1.1143 (official build).
:)
There are a few benefits to using clickhouse-local, since ClickHouse can do a lot more than DuckDB. One example is handling compressed files: ClickHouse can read compressed files and archives in formats including zstd, lz4, snappy, gz, xz, bz2, zip, tar, and 7zip.
clickhouse local --query "SELECT count() FROM file('top-1m-2018-01-10.csv.zip :: *.csv')"
1000000
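For a plain compressed file (as opposed to an archive, which uses the `::` syntax above), here is a minimal sketch you can try. The sample data and file names are made up for illustration, and I'm assuming the compression is picked up from the `.gz` extension while the format is passed explicitly as `CSVWithNames`. The query only runs if `clickhouse` is on your PATH.

```shell
# Build a small gzip-compressed CSV in a scratch directory (sample data is made up).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'rank,domain\n1,example.com\n2,example.org\n3,example.net\n' > sample.csv
gzip -c sample.csv > sample.csv.gz

# Count the data rows straight from the compressed file.
# CSVWithNames tells ClickHouse that the first line is a header.
if command -v clickhouse >/dev/null 2>&1; then
    clickhouse local --query "SELECT count() FROM file('sample.csv.gz', 'CSVWithNames')"
else
    echo "clickhouse not found on PATH; skipping query" >&2
fi
```

No separate decompression step is needed; the file is read as-is.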
Also, clickhouse-local is much more efficient at handling big CSV files [0].
The Debian package is of poor quality: I'm not even sure clickhouse local is included in it (I believe so), but there is no manpage, no doc at all, and no `clickhouse-server -h`.
Went to the official page looking for a tarball to download, found only the `curl|sh` joke.
Went to github looking for tagged tarballs, couldn't find any. Looked for INSTALL.md, couldn't find any.
Will try harder later, have to weep my tears for now.
ClickHouse is a single binary. It can be invoked as clickhouse-server, clickhouse-client, and clickhouse-local. The help is available as `clickhouse-local --help`. clickhouse-local also has a shorthand alias, `ch`.
This binary is packaged inside .deb, .rpm, and .tgz, and it is also available for direct download. The curl|sh script selects the platform (x86_64 or aarch64, on Linux, Mac, or FreeBSD) and downloads the appropriate binary.
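To give an idea of what that platform selection amounts to, here is a rough sketch. It is not the actual install script; the function name `detect_platform` and the output labels are hypothetical, and the optional arguments exist only so the mapping can be tried without switching machines.

```shell
# Map an OS name and CPU architecture (as reported by uname) to a platform label.
# Both arguments are optional and default to the current machine.
detect_platform() {
    os=${1:-$(uname -s)}
    arch=${2:-$(uname -m)}

    case "$os" in
        Linux)   os_label=linux ;;
        Darwin)  os_label=macos ;;
        FreeBSD) os_label=freebsd ;;
        *)       echo "unsupported OS: $os" >&2; return 1 ;;
    esac

    # Different systems report the same CPU family under different names.
    case "$arch" in
        x86_64|amd64)  arch_label=x86_64 ;;
        aarch64|arm64) arch_label=aarch64 ;;
        *)             echo "unsupported architecture: $arch" >&2; return 1 ;;
    esac

    echo "${os_label}-${arch_label}"
}
```

Calling `detect_platform` with no arguments prints the label for the machine it runs on; a real installer would then use that label to pick which binary to fetch.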