@jmarrec (Contributor) commented Oct 2, 2025

Pull request overview

Description of the purpose of this PR

  • Improve CsvParser:
    • Skip extra columns (but register a warning)
    • Emplace a null when two consecutive delimiters are found (but register a warning)
    • Neither of these issues is fatal yet: as long as you aren't requesting a column past the parsed ones, or trying to get a number from a column containing a null, you're fine for now
  • Improve ScheduleManager:
    • Print the new warnings
    • Fatal out gracefully:
      • if trying to dereference a column with a null
      • if trying to access a column that is past the number of columns we parsed (instead of throwing a cryptic nlohmann json message)
    • Schedule:File:Shading: fatal out in the edge case where the number of headers is greater than the number of actual parsed columns.
  • Add lots of tests for ScheduleManager, and a new test file CsvParser.unit.cc to test out lower level functionality

Pull Request Author

  • Title of PR should be user-synopsis style (clearly understandable in a standalone changelog context)
  • Label the PR with at least one of: Defect, Refactoring, NewFeature, Performance, and/or DoNoPublish
  • Pull requests that impact EnergyPlus code must also include unit tests to cover enhancement or defect repair
  • Author should provide a "walkthrough" of relevant code changes using a GitHub code review comment process
  • If any diffs are expected, author must demonstrate they are justified using plots and descriptions
  • If changes fix a defect, the fix should be demonstrated in plots and descriptions
  • If any defect files are updated to a more recent version, upload new versions here or on DevSupport
  • If IDD requires transition, transition source, rules, ExpandObjects, and IDFs must be updated, and add IDDChange label
  • If structural output changes, add to output rules file and add OutputChange label
  • If adding/removing any LaTeX docs or figures, update that document's CMakeLists file dependencies

Reviewer

  • Perform a Code Review on GitHub
  • If branch is behind develop, merge develop and build locally to check for side effects of the merge
  • If defect, verify by running develop branch and reproducing defect, then running PR and reproducing fix
  • If feature, test running new feature, try creative ways to break it
  • CI status: all green or justified
  • Check that performance is not impacted (CI Linux results include performance check)
  • Run Unit Test(s) locally
  • Check any new function arguments for performance impacts
  • Verify IDF naming conventions and styles, memos and notes and defaults
  • If new idf included, locally check the err file and other outputs

@jmarrec self-assigned this Oct 2, 2025
@jmarrec added the Defect (includes code to repair a defect in EnergyPlus) and NotIDDChange (code does not impact IDD; can be merged after IO freeze) labels Oct 2, 2025
Comment on lines +67 to +75
std::vector<std::pair<std::string, bool>> const &CsvParser::warnings()
{
    return warnings_;
}

bool CsvParser::hasWarnings()
{
    return !warnings_.empty();
}

Putting back warnings in CsvParser

Comment on lines +283 to +289
if (column_num < num_columns) {
    columns.at(column_num).push_back(parse_value(csv, index));
} else {
    // Just parse and ignore the value
    parse_value(csv, index);
    has_extra_columns = true;
}

Avoid crashing here if a row ends up with more values than the number of columns we determined by parsing the first data row (after the header, if present).

We set has_extra_columns to true here, so that a warning can be registered at the end of the line.
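
The idea can be sketched outside the parser as follows. This is only an illustration under simplifying assumptions: `RowResult` and `appendRow` are hypothetical names, values are kept as strings, and the row is split with `std::getline` rather than with the tokenizer the real CsvParser uses.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the per-row bookkeeping; not the actual CsvParser API.
struct RowResult
{
    std::size_t parsed_values = 0;
    bool has_extra_columns = false;
};

// Append the comma-separated values of `row` to `columns`.
// Values past columns.size() are consumed but not stored, and only flagged.
RowResult appendRow(std::string const &row, std::vector<std::vector<std::string>> &columns)
{
    RowResult result;
    std::istringstream stream(row);
    std::string value;
    std::size_t column_num = 0;
    while (std::getline(stream, value, ',')) {
        if (column_num < columns.size()) {
            columns[column_num].push_back(value);
        } else {
            // Would have been an out-of-range access: ignore the value instead
            result.has_extra_columns = true;
        }
        ++column_num;
        ++result.parsed_values;
    }
    return result;
}
```

With two known columns, a row such as `1,0.04,0.33` stores the first two values and comes back with `has_extra_columns` set, leaving it to the caller to decide how to report it.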

Comment on lines +242 to +257
if (has_extra_columns) {
    warnings_.emplace_back(
        fmt::format("CsvParser - Line {} - Expected {} columns, got {}. Ignored extra columns. Error in following line.",
                    this_cur_line_num,
                    num_columns,
                    parsed_values),
        false);
    warnings_.emplace_back(getCurrentLine(), true);
} else if (parsed_values != num_columns) {
    success = false;

    size_t found_index = csv.find_first_of("\r\n", this_beginning_of_line_index);
    std::string line;
    if (found_index != std::string::npos) {
        line = csv.substr(this_beginning_of_line_index, found_index - this_beginning_of_line_index);
    }
    errors_.emplace_back(
        fmt::format(
            "CsvParser - Line {} - Expected {} columns, got {}. Error in following line.", this_cur_line_num, num_columns, parsed_values),
        false);
    errors_.emplace_back(line, true);
    errors_.emplace_back(getCurrentLine(), true);

When we reach the end of the line, we issue a warning if has_extra_columns is set; otherwise we issue an error if the number of parsed values is not the expected one.
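
A standalone sketch of that end-of-line decision, assuming a hypothetical free function `finishRow` and plain string concatenation in place of `fmt::format`. The message texts follow the ones in the diff; everything else is illustrative.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// (text, isContinued) pairs, the same shape the PR uses for warnings_ and errors_
using Messages = std::vector<std::pair<std::string, bool>>;

// Hypothetical helper: returns false only when the row has to be treated as an error.
bool finishRow(std::size_t line_num,
               std::size_t num_columns,
               std::size_t parsed_values,
               bool has_extra_columns,
               std::string const &line,
               Messages &warnings,
               Messages &errors)
{
    std::string const counts = "CsvParser - Line " + std::to_string(line_num) + " - Expected " + std::to_string(num_columns) +
                               " columns, got " + std::to_string(parsed_values) + ".";
    if (has_extra_columns) {
        // Too many values: recoverable, the extras were simply not stored
        warnings.emplace_back(counts + " Ignored extra columns. Error in following line.", false);
        warnings.emplace_back(line, true);
        return true;
    }
    if (parsed_values != num_columns) {
        // Too few values: nothing sensible to store, so this stays an error
        errors.emplace_back(counts + " Error in following line.", false);
        errors.emplace_back(line, true);
        return false;
    }
    return true;
}
```

The asymmetry is deliberate in this sketch: surplus values can be dropped without affecting the stored columns, while a short row would leave the columns with unequal lengths.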

Comment on lines 261 to +280
} else if (token == Token::DELIMITER) {
    next_token(csv, index);
    token = look_ahead(csv, index);
    if (token == Token::DELIMITER) {
        // Two delimiters in a row means a blank value
        // This is not yet an error, in case the user is not using this column... It will crash later if they do try to cast it to a number
        size_t const next_col = column_num + 1;
        if (next_col < num_columns) {
            // Push a null for blank value
            columns.at(next_col).push_back(json::value_t::null);
            warnings_.emplace_back(fmt::format("CsvParser - Line {} Column {} - Blank value found, setting to null. Error in following line.",
                                               this_cur_line_num,
                                               next_col + 1),
                                   false);
            warnings_.emplace_back(getCurrentLine(), true);
        } else {
            has_extra_columns = true;
        }
        ++parsed_values;
    }

In the Delimiter case, we scan ahead to check if another delimiter is coming up, in which case we emplace a null.

This is a warning, because unless you actually try to use that column, it's fine.
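
As a rough illustration of the blank-to-null idea, here is a minimal sketch that uses `std::optional` where the parser uses a JSON null. `Cell` and `splitKeepingBlanks` are hypothetical names, and the splitting is done with `std::string::find` instead of the token look-ahead shown in the diff.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// std::nullopt plays the role of the JSON null in this sketch
using Cell = std::optional<std::string>;

// Split a row on `delimiter`. A blank between two delimiters becomes a null cell.
// A blank after the last delimiter is not stored at all, so the row simply ends up short.
std::vector<Cell> splitKeepingBlanks(std::string const &row, char delimiter = ',')
{
    std::vector<Cell> cells;
    std::size_t start = 0;
    while (true) {
        std::size_t const end = row.find(delimiter, start);
        bool const is_last = (end == std::string::npos);
        std::string const text = is_last ? row.substr(start) : row.substr(start, end - start);
        if (!text.empty()) {
            cells.push_back(text);
        } else if (!is_last) {
            cells.push_back(std::nullopt); // two consecutive delimiters
        }
        if (is_last) {
            break;
        }
        start = end + 1;
    }
    return cells;
}
```

For `1,,0.33` this yields three cells with a null in the middle, so a caller that only reads the third column is unaffected, while `1,` yields a single cell and would be caught by a column-count check.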

Comment on lines +618 to +624
for (const auto &[warning, isContinued] : csvParser.warnings()) {
    if (isContinued) {
        ShowContinueError(state, warning);
    } else {
        ShowWarningError(state, warning);
    }
}

Schedule:File:Shading, print the new warnings if any.

Comment on lines +1518 to +1527
TEST_F(EnergyPlusFixture, ScheduleFile_Blanks_OnlyWarnIfNotUsingThatColumn)
{
// On the third line (second data record after header), there is a blank in the second column
// Hour,Value1,Value2
// 0,0.01,0.01
// 1,,0.33
// 2,0.37,0.37
fs::path scheduleFile = FileSystem::makeNativePath(configured_source_directory() / "tst/EnergyPlus/unit/Resources/schedule_file_with_blank.csv");

// Here I am requesting a column that is properly filled, and it should work fine

New test. It works if I have a blank value but don't try to use that column specifically.

Comment on lines +1566 to +1576
TEST_F(EnergyPlusFixture, ScheduleFile_MissingValue)
{
// On the third line (second data record after header), there is a blank in the second column, no extra delimiter.
// Hour,Value1
// 0,0.01
// 1,
// 2,0.37
fs::path scheduleFile =
FileSystem::makeNativePath(configured_source_directory() / "tst/EnergyPlus/unit/Resources/schedule_file_with_missing_value.csv");

// In this case, the csvParser registers an error and that one is thrown

I have a completely missing value here, so I throw

Comment on lines +1620 to +1631
TEST_F(EnergyPlusFixture, ScheduleFile_ExtraColumn)
{
// On the third line (second data record after header), there is an extra column

// Hour,Value1,
// 0,0.01,
// 1,0.04,0.33
// 2,0.37,
fs::path scheduleFile =
FileSystem::makeNativePath(configured_source_directory() / "tst/EnergyPlus/unit/Resources/schedule_file_with_extra_column.csv");

// I am requesting column 2, so it should warn but work

I have an extra column, I warn and move on since it's not being used.

Comment on lines +1670 to +1715
TEST_F(EnergyPlusFixture, ScheduleFile_RequestNonExistingColumn)
{

// This is a properly formed CSV file with two columns
// Datetime,Value
// 1/1 01:00:00,1.0
// 1/1 02:00:00,1.0
// [...]
// 12/31 23:00:00,0.0
// 12/31 24:00:00,0.0

fs::path scheduleFile = FileSystem::makeNativePath(configured_source_directory() / "tst/EnergyPlus/unit/Resources/schedule_file1.csv");

// I am requesting column 100, so it should NOT work

std::string const idf_objects = delimited_string({
"ScheduleTypeLimits,",
" Any Number; !- Name",

"Schedule:File,",
" Test1, !- Name",
" Any Number, !- Schedule Type Limits Name",
" " + scheduleFile.string() + ", !- File Name",
" 100, !- Column Number",
" 1, !- Rows to Skip at Top",
" 8760, !- Number of Hours of Data",
" Comma, !- Column Separator",
" No, !- Interpolate to Timestep",
" 60, !- Minutes per item",
" Yes; !- Adjust Schedule for Daylight Savings",
});
ASSERT_TRUE(process_idf(idf_objects));

auto &s_glob = state->dataGlobal;

s_glob->TimeStepsInHour = 4; // must initialize this to get schedules initialized
s_glob->MinutesInTimeStep = 15; // must initialize this to get schedules initialized
s_glob->TimeStepZone = 0.25;
s_glob->TimeStepZoneSec = s_glob->TimeStepZone * Constant::rSecsInHour;
state->dataEnvrn->CurrentYearIsLeapYear = false;

EXPECT_THROW(state->init_state(*state), EnergyPlus::FatalError); // read schedules

const std::string expected_error = delimited_string({
" ** Severe ** ProcessScheduleInput: Schedule:File = TEST1",
" ** ~~~ ** Requested column number 100, but found only 2 columns.",

Requesting a column that doesn't exist: fatal
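
A minimal sketch of checking before dereferencing, covering both fatal cases listed in the description (column out of range, null in the requested column). `Column` and `getColumn` are hypothetical; exceptions stand in for the EnergyPlus severe/fatal machinery, and the wording of the null message is an assumption. The out-of-range text follows the expected error in the test.

```cpp
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// A parsed column in which std::nullopt marks a blank value
using Column = std::vector<std::optional<double>>;

// Hypothetical accessor; column_number is 1-based, like the Column Number field.
std::vector<double> getColumn(std::vector<Column> const &columns, std::size_t column_number)
{
    if (column_number == 0 || column_number > columns.size()) {
        throw std::out_of_range("Requested column number " + std::to_string(column_number) + ", but found only " +
                                std::to_string(columns.size()) + " columns.");
    }
    std::vector<double> values;
    for (auto const &cell : columns[column_number - 1]) {
        if (!cell.has_value()) {
            // Assumed wording, not the message used in the PR
            throw std::invalid_argument("Column " + std::to_string(column_number) + " contains a blank value.");
        }
        values.push_back(*cell);
    }
    return values;
}
```

Validating up front is what turns a cryptic library exception into a message that names the requested column and the number of columns actually found.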

Comment on lines +1726 to +1762
TEST_F(EnergyPlusFixture, ShadowCalculation_CSV_broken)
{
// This file has one more header than data columns
// Surface Name,EAST SIDE TREE,WEST SIDE TREE
// 01/01 00:15,0,
// 01/01 00:30,0,

// a CSV exported with the extra '()' at the end (22.2.0 and below) should still be importable in E+ without crashing
const fs::path scheduleFile = configured_source_directory() / "tst/EnergyPlus/unit/Resources/shading_data_2220_broken.csv";

std::string const idf_objects = delimited_string({
"Schedule:File:Shading,",
" " + scheduleFile.string() + "; !- Name of File",
});
ASSERT_TRUE(process_idf(idf_objects));

auto &s_glob = state->dataGlobal;

s_glob->TimeStepsInHour = 4; // must initialize this to get schedules initialized
s_glob->MinutesInTimeStep = 15; // must initialize this to get schedules initialized
s_glob->TimeStepZone = 0.25;
s_glob->TimeStepZoneSec = s_glob->TimeStepZone * Constant::rSecsInHour;
state->dataEnvrn->CurrentYearIsLeapYear = false;

EXPECT_THROW(state->init_state(*state), EnergyPlus::FatalError); // read schedules

const std::string expected_error = delimited_string({
" ** Severe ** ProcessScheduleInput: Schedule:File:Shading = shading_data_2220_broken.csv",
" ** ~~~ ** For header 'WEST SIDE TREE', Requested column number 3, but found only 2 columns.",
" ** ~~~ ** Error Occurred in " + scheduleFile.string(),
" ** Fatal ** Program terminates due to previous condition.",
" ...Summary of Errors that led to program termination:",
" ..... Reference severe error count=1",
" ..... Last severe error=ProcessScheduleInput: Schedule:File:Shading = shading_data_2220_broken.csv",
});
compare_err_stream(expected_error);
}

Test edge case for Schedule:File:Shading where you have more headers than data columns
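
The header check itself can be sketched as below. `findHeaderWithoutData` is a hypothetical helper that only builds the message; the text mirrors the expected error in the test above, and reporting only the first offending header is an assumption.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Returns a message for the first header that has no matching data column,
// or std::nullopt when every header is backed by a parsed column.
std::optional<std::string> findHeaderWithoutData(std::vector<std::string> const &headers, std::size_t num_columns)
{
    if (headers.size() <= num_columns) {
        return std::nullopt;
    }
    // Headers are reported with 1-based column numbers
    return "For header '" + headers[num_columns] + "', Requested column number " + std::to_string(num_columns + 1) +
           ", but found only " + std::to_string(num_columns) + " columns.";
}
```

For the three headers in the test file and two parsed columns, this produces the message naming 'WEST SIDE TREE' as column 3.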


Development

Successfully merging this pull request may close these issues.

Schedule:File crash

2 participants