Great!
Knowing that you can support all 22 queries of TPC-H would be of huge value to us. And if you still manage to get great performance across the board, even better.
Knowing that you can support all 99 queries of TPC-DS would be absolutely fantastic, and I would not wait for SQL support for that. SQL introduces all kinds of complexities that I don't think you should focus on right now. SQL is not the future. DataFrames and Ibis are. And if you want to be fancy, you might want to take a look at Malloy and offer support for it on top of your DataFrames (we can probably help a lot on that front).
In other words, I would not try to distribute SQL. I would focus on distributing DataFrames first, Ibis second. In parallel, I would add local (Map-only) support for SQL through DuckDB. Once you have full support for Ibis, the need for SQL distribution will be dramatically diminished, and you're better off sticking to your mission of providing a "*fast and scalable Python dataframe* for Complex Data and Machine Learning workloads." This will include adding support for graphs, which will add a lot more value to your community than adding support for SQL.
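To make the "local (Map-only)" idea concrete, here is a minimal sketch of what I have in mind: the same SQL query runs independently on each partition, and the results are simply concatenated, with no shuffle or reduce step. Everything in it is an assumption for illustration: the helper names `run_sql_on_partition` and `map_only_sql`, the table name `t`, and the two-column schema are hypothetical, and the standard-library `sqlite3` module stands in for DuckDB only so that the sketch runs without extra dependencies.

```python
import sqlite3

def run_sql_on_partition(rows, query):
    """Run `query` on one partition, loaded into a private in-memory table `t`.

    sqlite3 is a stand-in for DuckDB here (assumption for illustration only).
    """
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical schema: (id, amount).
        conn.execute("CREATE TABLE t (id INTEGER, amount REAL)")
        conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
        return conn.execute(query).fetchall()
    finally:
        conn.close()

def map_only_sql(partitions, query):
    """Apply the same query to every partition and concatenate the outputs.

    No data moves between partitions, so there is nothing to distribute.
    """
    out = []
    for part in partitions:
        out.extend(run_sql_on_partition(part, query))
    return out

# Two partitions of a hypothetical table.
partitions = [
    [(1, 10.0), (2, 250.0)],
    [(3, 75.0), (4, 500.0)],
]

# Row-wise queries (filters, projections) give the same answer as a global run.
filtered = map_only_sql(partitions, "SELECT id, amount FROM t WHERE amount > 100 ORDER BY id")

# Aggregations do not: each partition reports its own count.
counts = map_only_sql(partitions, "SELECT COUNT(*) FROM t")
```

In this sketch, `filtered` is `[(2, 250.0), (4, 500.0)]`, while `counts` is `[(2,), (2,)]` rather than a single global count. That gap is exactly the part I would leave to the DataFrame and Ibis layers instead of building distributed SQL for it.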