Double slash (//) in table location path #15908

@PavelkoSemen

Apache Iceberg version

1.10.1 (latest release)

Query engine

Spark

Please describe the bug 🐞

Problem

When Iceberg tables are created through Spark with an S3 location, the table location contains a double slash (//) in the path. This happens whether or not the table is partitioned. The double slash causes the OPTIMIZE operation to behave incorrectly and sometimes delete files, which corrupts the table.
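To make the symptom easier to check, here is a minimal Python sketch that detects and collapses repeated slashes in a location string while leaving the `://` after the scheme untouched. The helper names `has_double_slash` and `collapse_slashes` are hypothetical and are not part of Iceberg or Spark. S3 object keys may legitimately contain empty segments, so collapsing slashes is only a diagnostic aid here, not a proposed fix.

```python
import re


def _split_scheme(location: str):
    """Split 'scheme://rest' into (prefix, rest); prefix is '' if no scheme."""
    scheme, sep, rest = location.partition("://")
    if not sep:
        # No '://' found: treat the whole string as the path part.
        return "", location
    return scheme + sep, rest


def has_double_slash(location: str) -> bool:
    """Hypothetical helper: True if the part after 'scheme://' contains '//'."""
    _, rest = _split_scheme(location)
    return "//" in rest


def collapse_slashes(location: str) -> str:
    """Hypothetical helper: collapse runs of '/' after the scheme into one '/'."""
    prefix, rest = _split_scheme(location)
    return prefix + re.sub(r"/{2,}", "/", rest)


if __name__ == "__main__":
    loc = "s3a://test//test_8262bea6c787"  # location from the first example
    print(has_double_slash(loc))
    print(collapse_slashes(loc))
```

Running a check like this over the `location` values reported by the catalog could help show which tables are affected before OPTIMIZE is run on them.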

Example

CREATE TABLE test.test.test (
   test_rk integer
)
WITH (
   compression_codec = 'ZSTD',
   format = 'PARQUET',
   format_version = 2,
   location = 's3a://test//test_8262bea6c787'  -- ⚠️ Double slash after 'test/'
)

Example 2

CREATE TABLE test.test.test (
   test_rk decimal(21, 0),
   test real,
   test_l varchar,
   test_stat real,
   test_id integer
)
WITH (
   compression_codec = 'ZSTD',
   format = 'PARQUET',
   format_version = 2,
   location = 's3a://test//test',
   partitioning = ARRAY['test_id']
)
Component               Version
iceberg-spark-runtime   3.5_2.12-1.10.1
iceberg-aws-bundle      1.10.1
Spark                   3.5.6
Filesystem              S3A

Simple Spark query: df.write.format('iceberg').mode('overwrite').saveAsTable("test.test.test")

Willingness to contribute

  • I can contribute a fix for this bug independently
  • I would be willing to contribute a fix for this bug with guidance from the Iceberg community
  • I cannot contribute a fix for this bug at this time


Labels: bug (Something isn't working)
