Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Smaller ReleaseSmall by default #22270

Open
1 of 6 tasks
wooster0 opened this issue Dec 19, 2024 · 4 comments
Open
1 of 6 tasks

Smaller ReleaseSmall by default #22270

wooster0 opened this issue Dec 19, 2024 · 4 comments
Labels
enhancement Solving this issue will likely involve adding new logic or components to the codebase.
Milestone

Comments

@wooster0
Copy link
Contributor

wooster0 commented Dec 19, 2024

#6676 in the past helped make ReleaseSmall smaller by default.
However there is more that can be done when ReleaseSmall is enabled.
When you use ReleaseSmall you're asking for a small binary size, not debuggability.
Here are some things that could help achieve smaller binary size:

  • Set unwind_tables to false by default. This should consistently result in smaller binary size. Unwind tables seem to be used for stack traces (debuggability) which I'm not asking for when I'm compiling with ReleaseSmall. Would be nice to fix Unable to omit unwind tables from binary when using ReleaseSmall + LLVM  #21942 before doing this.
  • (Module: keep frame pointer in ReleaseSmall on x86 #22271) Omit or keep the frame pointer based on the architecture. On some architectures in many cases keeping the frame pointer actually decreases binary size, such as on x86_64.
    This is purely a case-by-case thing and doesn't always apply so I think it's fine to omit frame pointers for most architectures except for some exceptions like x86_64 where in many (again, not all) cases keeping the frame pointer results in smaller binary size. The user can always fine-tune this if they feel the need to so I think a rough "estimate" here would be an improvement over the default in many cases.
    The current default seems to be to always omit the frame pointer if the optimize mode is not Debug.
  • (ELF-specific) don't emit section headers (including the .comment section) for binaries in the self-hosted ELF linker. As far as I understand in most cases section headers are not required for execution of the program itself and are usually useful for debugging and to figure out what a section's name is etc.
    • Do the same for LLVM-emitted binaries. I don't know if this is configurable upfront or if it requires an additional objcopy (see below also) to remove the undesired sections after the binary is emitted.

Additionally, here are some things that should help users achieve smaller binary size on their own:

  • (ELF-specific) support --strip-section-headers from GNU objcopy in Zig objcopy. This option seems to remove section name strings as well as the .comment section from a binary.
  • Support --remove-section from GNU objcopy in Zig objcopy. It should be possible to remove only specific sections from a binary and keep all others. This should be useful in specific cases where undesirable sections are being emitted. Can also be used to omit only .comment and keep section names, etc.
@alexrp
Copy link
Member

alexrp commented Dec 19, 2024

Set unwind_tables to false by default. This should consistently result in smaller binary size. Unwind tables seem to be used for stack traces (debuggability) which I'm not asking for when I'm compiling with ReleaseSmall. Would be nice to fix Unable to omit unwind tables from binary when using ReleaseSmall + LLVM  #21942 before doing this.

I'm not sure yet if I'm on board with this.

Remember, unwind tables as emitted by -funwind-tables into .eh_frame are a different beast from unwind tables emitted into .debug_frame due to debug info. The latter is indeed just there to aid debugging and stack traces, but the former may be necessary for C++ exceptions to work, or just unwindability in general that may be required by program semantics - e.g. if you're implementing a GC that uses DWARF tables to do precise stack scanning.

I'm not outright saying no to this, I'm just saying that this point needs to be carefully considered. It's not for nothing that C compilers default to emitting unwind tables even with -Os.

The current default seems to be to always omit the frame pointer if the optimize mode is not Debug.

Not anymore: #22180

If you'd like to add some targeted exceptions to omitting the FP for ReleaseSmall, I'm fine with that as long as it comes with some basic data to back up the change. The size of the Zig compiler built with that configuration might be a nice metric here.

@wooster0
Copy link
Contributor Author

wooster0 commented Dec 19, 2024

may be necessary for C++ exceptions to work, or just unwindability in general that may be required by program semantics - e.g. if you're implementing a GC that uses DWARF tables to do precise stack scanning

This is a good point.
I guess my opinion is that if the user needs this, they need to re-enable unwind tables on their own.
Could also consider doing this only compilations that don't use C++. That should eliminate at least the C++ case.
But honestly I think ReleaseSmall should aim for the smallest binary size in any way and then the user has to configure their program on top of that. In many cases it should be a simple additional .unwind_tables = true, line in your build.zig. And if what you're doing explicitly depends on unwind tables then it's probably a good idea to explicitly enable them anyway, instead of relying on defaults in the first place.

I guess it might also make sense to document flag defaults based on the optimize mode somewhere to make users more aware of it.

Not anymore: #22180

Oh, I was looking at an old fork.

If you'd like to add some targeted exceptions to omitting the FP for ReleaseSmall, I'm fine with that as long as it comes with some basic data to back up the change. The size of the Zig compiler built with that configuration might be a nice metric here.

Yes I was thinking about taking a few applications for metrics and if the changes make binary size decrease for most of those, go with that as a slightly better default.

@andrewrk andrewrk added the enhancement Solving this issue will likely involve adding new logic or components to the codebase. label Dec 19, 2024
@andrewrk andrewrk added this to the unplanned milestone Dec 19, 2024
@wooster0
Copy link
Contributor Author

@alexrp just in case you happen to be knowledgeable on this:
Do you think we could fence off this code

zig/lib/std/start.zig

Lines 556 to 568 in 0ff0bdb

const opt_init_array_start = @extern([*]*const fn () callconv(.c) void, .{
.name = "__init_array_start",
.linkage = .weak,
});
const opt_init_array_end = @extern([*]*const fn () callconv(.c) void, .{
.name = "__init_array_end",
.linkage = .weak,
});
if (opt_init_array_start) |init_array_start| {
const init_array_end = opt_init_array_end.?;
const slice = init_array_start[0 .. init_array_end - init_array_start];
for (slice) |func| func();
}
when it's not going to be needed for a lot of programs? Reading #20852 I don't really understand a whole lot about why it was added but it seems to be related to sanitizing threads? So maybe if (builtin.sanitize_thread)? I doubt that's it though.

@andrewrk
Copy link
Member

It's known at link time whether that logic is needed. When a compilation is pure Zig code, then that logic is never needed. Additionally, there could be a flag added to disallow constructors in object files, causing a link error to occur if these sections are discovered in linker inputs. So, I could see there being a corresponding @import("builtin") flag that tells whether this startup logic is potentially needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Solving this issue will likely involve adding new logic or components to the codebase.
Projects
None yet
Development

No branches or pull requests

3 participants