-
Notifications
You must be signed in to change notification settings - Fork 13
Description
Environment
Attribute | Value |
---|---|
library_version | v1.3.1 |
source_id | 2063dda3e6 |
platform | windows_amd64 |
excel extension_version | 7e97933 |
Issue
When querying sheets with many columns and providing header=false
, the generated column names experience an off-by-one error at every rollover from "Z" to "AA", or "ZZ" to "AAA", etc. Currently "Z" is followed by "BA" instead of "AA". I created a sheet column_name.xlsx to test this behavior.
Excerpt:
1 | 2 | ... | 704 | 705 |
A | B | ... | AAB | AAC |
From my testing this erroneous behavior only applies to read_xlsx
and not write_xlsx
.
Z -> AA Rollover Error
FROM read_xlsx(
'column_name.xlsx',
all_varchar=true,
header=false,
range="Z1:AA2"
);
Current behavior
Z varchar |
BA varchar |
---|---|
26 | 27 |
Z | AA |
Expected behavior
Z varchar |
AA varchar |
---|---|
26 | 27 |
Z | AA |
ZZ -> AAA Rollover Error
FROM read_xlsx(
'column_name.xlsx',
all_varchar=true,
header=false,
range="ZZ1:AAA2"
);
Current behavior
BAZ varchar |
BBA varchar |
---|---|
702 | 703 |
ZZ | AAA |
Expected behavior
ZZ varchar |
AAA varchar |
---|---|
702 | 703 |
ZZ | AAA |
Possible Resolution
As I'm unable to compile the project locally, I didn't feel comfortable issuing a pull request but I believe I've found the source of the inconsistency.
Suspected Code
duckdb-excel/src/excel/include/xlsx/xlsx_parts.hpp
Lines 86 to 95 in 9189a80
inline string XLSXCellPos::GetColumnName() const { | |
D_ASSERT(col != 0); | |
string result; | |
idx_t col = this->col - 1; | |
do { | |
result = static_cast<char>('A' + col % 26) + result; | |
col /= 26; | |
} while (col > 0); | |
return result; | |
} |
Suggestion
From some testing outside this project, I believe moving the decrement of col
inside the loop body resolves this issue.
inline string XLSXCellPos::GetColumnName() const {
D_ASSERT(col != 0);
string result;
idx_t col = this->col;
do {
col -= 1
result = static_cast<char>('A' + col % 26) + result;
col /= 26;
} while (col > 0);
return result;
}
With the decrement of col
moved inside the loop body, the do-while
could be rewritten as just a while
loop.
inline string XLSXCellPos::GetColumnName() const {
D_ASSERT(col != 0);
string result;
idx_t col = this->col;
while (col > 0) {
col -= 1
result = static_cast<char>('A' + col % 26) + result;
col /= 26;
}
return result;
}
Thank you so much for your time and effort on this awesome extension!