Error accessing an iceberg table hosted in Azure blob storage #194
Comments
Hi @jhatcher1. Thank you for reporting. Are you able to test querying this Iceberg metadata file from a local instance rather than from Azure Blob Storage and let us know whether that works or not? It'll make it easier for us to help.
Hi @philippemnoel, I'll try to test this with a local Iceberg table, but it won't be the exact same metadata, since the metadata contains references to the data files in Azure Blob Storage.
Makes sense. Still informative, as I'm trying to see whether the issue is in the Iceberg extension's integration with Azure or in the Iceberg extension itself.
I've tried testing with a local iceberg table:
That error seems to be caused by an upstream issue in DuckDB, where it doesn't handle the file protocol: duckdb/duckdb#13669. However, I think it shows that the extension was at least able to read the metadata.json file to determine the list of snapshot files to read.
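For anyone following along, here is a rough sketch of why a local copy of metadata.json does not fully isolate the problem. In the Iceberg table metadata format, each entry under `snapshots` carries a `manifest-list` path, and that path keeps whatever URI scheme the table was written with. The sample metadata, paths, and the helper name `manifest_list_schemes` below are all made up for illustration; this is not the metadata from this issue.

```python
from urllib.parse import urlparse

# Hypothetical, trimmed-down metadata.json content. The container, account,
# and file names are invented for illustration only.
SAMPLE_METADATA = {
    "format-version": 2,
    "location": "abfss://container@account.dfs.core.windows.net/warehouse/tbl",
    "current-snapshot-id": 2,
    "snapshots": [
        {
            "snapshot-id": 1,
            "manifest-list": "abfss://container@account.dfs.core.windows.net"
                             "/warehouse/tbl/metadata/snap-1.avro",
        },
        {
            "snapshot-id": 2,
            "manifest-list": "file:///tmp/warehouse/tbl/metadata/snap-2.avro",
        },
    ],
}


def manifest_list_schemes(metadata):
    """Map each snapshot id to the URI scheme of its manifest-list path."""
    return {
        snap["snapshot-id"]: urlparse(snap["manifest-list"]).scheme or "(none)"
        for snap in metadata.get("snapshots", [])
    }


# A reader has to be able to open every scheme listed here, no matter
# where the metadata.json file itself was loaded from.
for snapshot_id, scheme in manifest_list_schemes(SAMPLE_METADATA).items():
    print(snapshot_id, scheme)
```

Running something like this against a real metadata.json shows at a glance which storage protocols the reader will have to resolve after the metadata file itself has been parsed.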
Got it. Thank you for testing. Iceberg support in DuckDB is still rather limited, and is unfortunately not seeing as much movement as we'd like. We may need to wait until it improves.
What happens?
We are trying to connect to an Iceberg table in Azure Blob Storage using the Iceberg foreign data wrapper. When creating the foreign table, we observe the error:
We do not see this error when using DuckDB to access the Iceberg table directly, or when using the Parquet foreign data wrapper to access the Iceberg table's Parquet files directly.
To Reproduce
These are the steps performed to reproduce the error:
We do not see this error when using the DuckDB CLI directly:
We also do not see this error when using the parquet foreign data wrapper:
OS:
macOS (aarch64)
ParadeDB Version:
v0.13.2
Are you using ParadeDB Docker, Helm, or the extension(s) standalone?
ParadeDB Docker Image
Full Name:
Jordan Hatcher
Affiliation:
MindBridge AI
Did you include all relevant data sets for reproducing the issue?
N/A - The reproduction does not require a data set
Did you include the code required to reproduce the issue?
Did you include all relevant configurations (e.g., CPU architecture, PostgreSQL version, Linux distribution) to reproduce the issue?