Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support to OTP-28 (master branch) #1490

Merged
merged 4 commits into from
Feb 14, 2025
Merged

Conversation

bettio
Copy link
Collaborator

@bettio bettio commented Jan 23, 2025

Enable again tests against OTP from master branch.

In order to support what will be released as OTP-28 some changes have been required:

  • New encoding for atoms
  • Uncompressed literals
  • to_atom NIFs (binary_to_atom, etc...) are now BIFs when used in guards (hence support for GCBIFs in CALL_EXT related opcodes has been introduced for OTP < 28 support and when this optimization is not applied)

These changes are made under both the "Apache 2.0" and the "GNU Lesser General
Public License 2.1 or later" license terms (dual license).

SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later

"Unsupported %i bytes atom:\n"
"AtomVM unlike OTP >= 28 doesn't support atoms containing up to 255"
"codepoints, but it is limited to 255 bytes.\n"
"If you see this unlikely error please let the developers know.\n",
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Help me rephrase this, I can write it in a better way.

Copy link
Collaborator

@UncleGrumpy UncleGrumpy Feb 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe:
"Unsupported atom length %i bytes.\n"
"Unlike OTP >= 28, AtomVM supports a maximum of 255 codepoints.\n"
"If you are seeing this error please open an issue on GitHub:\n"
"https://github.com/atomvm/AtomVM/issues\n"

@bettio bettio marked this pull request as ready for review February 8, 2025 16:05
Copy link
Collaborator

@UncleGrumpy UncleGrumpy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a suggestion on how to rephrase that error message.

@UncleGrumpy
Copy link
Collaborator

That looks great. I obviously did not explain the limitation correctly ;-)

Copy link
Collaborator

@UncleGrumpy UncleGrumpy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spotted a few things that look suspicious to me, but maybe I missed something.

@bettio bettio requested a review from UncleGrumpy February 9, 2025 12:57
@bettio bettio force-pushed the otp28-support branch 2 times, most recently from cd56b45 to b193643 Compare February 12, 2025 13:33
@@ -482,7 +540,14 @@ int atom_table_ensure_atoms(
}
}

unsigned long hash = sdbm_hash(current_atom, atom_string_len(current_atom));
if (is_long_format) {
uint8_t *atom_copy = malloc(atom_len + 1);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no check that allocation succeeds here.

Could we somehow avoid the copy as we do if !is_long_format? For example we could alter the structure of a node (struct HNode).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HNode is meant to be a fixed size struct so I cannot save it there.
My opinion here is to add a PackBEAM step that "optimizes" atoms by reverting them to the fixed length encoding, so no additional memory will be used.
On the long term I would like to have a shared atom table for each .avm file that uses the right format and that saves space from a number of duplicated atoms, but this is another (OT) story.

TL;DR: let's keep this as is just for .beam files that didn't receive any further processing.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no check that allocation succeeds here.

Fixed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little bit confused about HNode is a fixed size struct.

Couldn't we do:

struct HNode
{
    struct HNode *next;
    uint8_t len;
    const char* key;
    long index;
};

This would of course mean more RAM for the default case, but we could eventually save some without altering avm format by sharing pointer to key with common prefix search or substrings.

Also we could also limit the number of atoms to 2^24 and pack index and len on the same uint32_t.

Copy link
Collaborator Author

@bettio bettio Feb 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, ok, maybe we've been talking about 2 different things.
It sounds like a clever change, but I think it might turn into a very big change way beyond this PR scope that is just adding OTP-28 support.

@bettio bettio requested a review from pguyot February 13, 2025 13:44
Atom length prefix now is a varlen int, so up to 255 codepoints can be
supported.

AtomVM will not support this. When going embedded some features should
be left out, and this is one of them.

See also:
- erlang/otp#8913
- erlang/otp#9336

Signed-off-by: Davide Bettio <[email protected]>
Starting from OTP-28 literals are not compressed anymore, so `LITT` will
contain uncompressed data.
OTP PR and issues contain further information about how to distinguish
between compressed and uncompressed sections.

See also:
- erlang/otp#8940
- erlang/otp#8967
- erlang/otp#8988

Signed-off-by: Davide Bettio <[email protected]>
`binary_to_atom`, `binary_to_existing_atom`, `list_to_atom`,
`list_to_existing_atom` used to be NIFs, starting from OTP-28 are
optimized to BIFs.

Add support to `call_ext` and other call instructions to `GCBIFFunctionType` in
order to make those NIFs work with OTP < 28 too.
Also these functions are used as BIFs when used as guards, while they
are used as NIFs in other cases.

See also: erlang/otp@ad10c97

Signed-off-by: Davide Bettio <[email protected]>
Enabled again with master (that will be OTP-28).

Signed-off-by: Davide Bettio <[email protected]>
Copy link
Collaborator

@UncleGrumpy UncleGrumpy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a couple allocations that are not checked.

i += codepoint_size;
}

atom = malloc(atom_string_len + 1);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no allocation check here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #1535 for release-0.6, I will fix the conflict when I'll merge release-0.6 into main, after this PR is merged.

} else {
// * 2 is the worst case size
size_t buf_len = atom_string_len * 2;
atom = malloc(buf_len + 1);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No allocation check here either.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, this code is old, I will open a PR against release-0.6.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #1535 for release-0.6, I will fix the conflict when I'll merge release-0.6 into main, after this PR is merged.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good plan. I approved #1535.

@bettio bettio requested a review from UncleGrumpy February 14, 2025 00:20
Copy link
Collaborator

@UncleGrumpy UncleGrumpy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the fix in #1535, this looks good to me now.

Copy link
Collaborator

@UncleGrumpy UncleGrumpy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the fix in #1535 this looks good to me now.

@pguyot
Copy link
Collaborator

pguyot commented Feb 14, 2025

I confirm this crashes after this PR but succeeded with previous binaries, when compiled with OTP27.

-module(test_list_to_existing_atom_1).

-export([
start/0,
g/2
]).

start() ->
    R1 = erlang:list_to_existing_atom("start"),
    Fun = fun erlang:list_to_existing_atom/1,
    g(Fun, atom_to_list(R1)).

g(Fun, R1) ->
    erlang:display(Fun(R1)).

Indeed, there are some paths where existence of bifs isn't checked.
I tried the same thing with OTP24 compiler for max/2, but it doesn't crash, the compiler generates a call_ext or call_ext_last which are handled.

@bettio
Copy link
Collaborator Author

bettio commented Feb 14, 2025

-module(test_list_to_existing_atom_1). -export([ start/0, g/2 ]). start() -> R1 = erlang:list_to_existing_atom("start"), Fun = fun erlang:list_to_existing_atom/1, g(Fun, atom_to_list(R1)). g(Fun, R1) -> erlang:display(Fun(R1)).

I think the problem is here:

1159     if (term_is_atom(index_or_function)) {                              \
1160         term module = boxed_value[1];                                   \
1161         fun_arity = term_to_int(boxed_value[3]);                        \
1162         AtomString module_name = globalcontext_atomstring_from_term(glb, module); \
1163         AtomString function_name = globalcontext_atomstring_from_term(glb, index_or_function); \
1164         struct Nif *nif = (struct Nif *) nifs_get(module_name, function_name, fun_arity); \
1165         if (!IS_NULL_PTR(nif)) {                                        \

@bettio
Copy link
Collaborator Author

bettio commented Feb 14, 2025

I confirm this crashes after this PR but succeeded with previous binaries, when compiled with OTP27.

-module(test_list_to_existing_atom_1).

-export([
start/0,
g/2
]).

start() ->
    R1 = erlang:list_to_existing_atom("start"),
    Fun = fun erlang:list_to_existing_atom/1,
    g(Fun, atom_to_list(R1)).

g(Fun, R1) ->
    erlang:display(Fun(R1)).

Indeed, there are some paths where existence of bifs isn't checked. I tried the same thing with OTP24 compiler for max/2, but it doesn't crash, the compiler generates a call_ext or call_ext_last which are handled.

Good catch, this is a bug in release-0.6 as well, I'm going to make a fix for it:

  9 start() ->
 10     Fun = fun erlang:'not'/1,
 11     ?MODULE:h(Fun, true, false, 1) - 1.

[...]

 21 h(Fun, In, Exp, V) ->
 22     case Fun(In) of
 23         Exp -> V;
 24         _ -> 0
 25     end.

@bettio
Copy link
Collaborator Author

bettio commented Feb 14, 2025

I confirm this crashes after this PR but succeeded with previous binaries, when compiled with OTP27.

-module(test_list_to_existing_atom_1).

-export([
start/0,
g/2
]).

start() ->
    R1 = erlang:list_to_existing_atom("start"),
    Fun = fun erlang:list_to_existing_atom/1,
    g(Fun, atom_to_list(R1)).

g(Fun, R1) ->
    erlang:display(Fun(R1)).

Indeed, there are some paths where existence of bifs isn't checked. I tried the same thing with OTP24 compiler for max/2, but it doesn't crash, the compiler generates a call_ext or call_ext_last which are handled.

Opened PR for fixing this in release-0.6: #1538

@bettio bettio merged commit 2c82f47 into atomvm:main Feb 14, 2025
110 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants