bits on data

opensource

the night sky is lit up over the water

Photo by Michail Dementiev on Unsplash

TL;DR: I believe Apache Iceberg won the table format wars, not because of a feature race, but primarily because of the open Iceberg spec. There are some features only available in Iceberg due to the breaking of compatibility with Hive, which was also a contributing factor to the adoption of the implementation.

Read more...

There’s something I have to get off my chest. If you really need to, just read the TLDR and listen to the Justin Bieber parody posted below. If you’re confused by the lingo, the rest of the post will fill in any gaps.

TL;DR: Benchmarketing, the practice of using benchmarks for marketing, is bad. Consumers should run their own benchmarks and ideally open-source them instead of relying on an internal and biased report.

Read more...