Skip to content

XLSXCellPos::GetColumnName off-by-one error #58

@genericallyterrible

Description

@genericallyterrible

Environment

Attribute Value
library_version v1.3.1
source_id 2063dda3e6
platform windows_amd64
excel extension_version 7e97933

Issue

When querying sheets with many columns and providing header=false, the generated column names experience an off-by-one error at every rollover from "Z" to "AA", or "ZZ" to "AAA", etc. Currently "Z" is followed by "BA" instead of "AA". I created a sheet column_name.xlsx to test this behavior.

Excerpt:

1 2 ... 704 705
A B ... AAB AAC

From my testing this erroneous behavior only applies to read_xlsx and not write_xlsx.

Z -> AA Rollover Error

FROM read_xlsx(
	'column_name.xlsx',
	all_varchar=true,
	header=false,
	range="Z1:AA2"
);

Current behavior

Z
varchar
BA
varchar
26 27
Z AA

Expected behavior

Z
varchar
AA
varchar
26 27
Z AA

ZZ -> AAA Rollover Error

FROM read_xlsx(
	'column_name.xlsx',
	all_varchar=true,
	header=false,
	range="ZZ1:AAA2"
);

Current behavior

BAZ
varchar
BBA
varchar
702 703
ZZ AAA

Expected behavior

ZZ
varchar
AAA
varchar
702 703
ZZ AAA

Possible Resolution

As I'm unable to compile the project locally, I didn't feel comfortable issuing a pull request but I believe I've found the source of the inconsistency.

Suspected Code

inline string XLSXCellPos::GetColumnName() const {
D_ASSERT(col != 0);
string result;
idx_t col = this->col - 1;
do {
result = static_cast<char>('A' + col % 26) + result;
col /= 26;
} while (col > 0);
return result;
}

Suggestion

From some testing outside this project, I believe moving the decrement of col inside the loop body resolves this issue.

inline string XLSXCellPos::GetColumnName() const {
	D_ASSERT(col != 0);
	string result;
	idx_t col = this->col;
	do {
		col -= 1
		result = static_cast<char>('A' + col % 26) + result;
		col /= 26;
	} while (col > 0);
	return result;
}

With the decrement of col moved inside the loop body, the do-while could be rewritten as just a while loop.

inline string XLSXCellPos::GetColumnName() const {
	D_ASSERT(col != 0);
	string result;
	idx_t col = this->col;
	while (col > 0) {
		col -= 1
		result = static_cast<char>('A' + col % 26) + result;
		col /= 26;
	}
	return result;
}

Thank you so much for your time and effort on this awesome extension!

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions