Skip to content

Conversation

@alejandro-colomar
Copy link
Collaborator

@alejandro-colomar alejandro-colomar commented Jul 18, 2025

Cc: @Karlson2k


Revisions:

v1b
  • Rebase
$ git range-diff 17a3bb51df92^..gh/strncpytail shadow/master..strncpytail 
1:  17a3bb51 = 1:  52ed2fb5 lib/string/strcpy/: strncpytail(), STRNCPYTAIL(): Add APIs
2:  0747fecc = 2:  25ed13db lib/utmp.c: Use STRNCPYTAIL() instead of its pattern
v1c
  • Rebase
$ git rd 
1:  52ed2fb5 = 1:  a9207ca3 lib/string/strcpy/: strncpytail(), STRNCPYTAIL(): Add APIs
2:  25ed13db ! 2:  22ce0a47 lib/utmp.c: Use STRNCPYTAIL() instead of its pattern
    @@ lib/utmp.c
      #include "string/strdup/xstrdup.h"
      #include "string/strdup/xstrndup.h"
     @@ lib/utmp.c: prepare_utmp(const char *name, const char *line, const char *host,
    -   if (NULL != ut) {
    +       && ('\0' != ut->ut_id[0])) {
                STRNCPY(utent->ut_id, ut->ut_id);
        } else {
     -          STRNCPY(utent->ut_id, strnul(line) - MIN(strlen(line), countof(utent->ut_id)));
v1d
  • Rebase
$ git range-diff a9207ca3^..22ce0a47 5f7630f8^..6b0a0ac0
1:  a9207ca3 = 1:  5f7630f8 lib/string/strcpy/: strncpytail(), STRNCPYTAIL(): Add APIs
2:  22ce0a47 = 2:  6b0a0ac0 lib/utmp.c: Use STRNCPYTAIL() instead of its pattern
v1e
  • Rebase
$ git rd 
1:  5f7630f8 = 1:  4fc3f696 lib/string/strcpy/: strncpytail(), STRNCPYTAIL(): Add APIs
2:  6b0a0ac0 = 2:  df8b604d lib/utmp.c: Use STRNCPYTAIL() instead of its pattern
v1f
  • Rebase
$ git rd 
1:  4fc3f696 = 1:  163d8535 lib/string/strcpy/: strncpytail(), STRNCPYTAIL(): Add APIs
2:  df8b604d ! 2:  95ab3cb6 lib/utmp.c: Use STRNCPYTAIL() instead of its pattern
    @@ Commit message
     
      ## lib/utmp.c ##
     @@
    - #include "string/strcmp/streq.h"
    + #include "string/strcmp/strneq.h"
      #include "string/strcmp/strprefix.h"
      #include "string/strcpy/strncpy.h"
     +#include "string/strcpy/strncpytail.h"
v1g
  • Rebase
$ git rd 
1:  163d8535 ! 1:  4638da01 lib/string/strcpy/: strncpytail(), STRNCPYTAIL(): Add APIs
    @@ lib/Makefile.am: libshadow_la_SOURCES = \
     +  string/strcpy/strncpytail.h \
        string/strcpy/strtcpy.c \
        string/strcpy/strtcpy.h \
    -   string/strdup/strndupa.c \
    +   string/strdup/strdup.c \
     
      ## lib/string/strcpy/strncpytail.c (new) ##
     @@
2:  95ab3cb6 ! 2:  cd8af171 lib/utmp.c: Use STRNCPYTAIL() instead of its pattern
    @@ lib/utmp.c
      #include "string/strcpy/strncpy.h"
     +#include "string/strcpy/strncpytail.h"
      #include "string/strcpy/strtcpy.h"
    - #include "string/strdup/xstrdup.h"
    - #include "string/strdup/xstrndup.h"
    + #include "string/strdup/strdup.h"
    + #include "string/strdup/strndup.h"
     @@ lib/utmp.c: prepare_utmp(const char *name, const char *line, const char *host,
            && ('\0' != ut->ut_id[0])) {
                STRNCPY(utent->ut_id, ut->ut_id);

@alejandro-colomar alejandro-colomar force-pushed the strncpytail branch 2 times, most recently from 9acb3e0 to 0747fec Compare July 18, 2025 13:31
@Karlson2k
Copy link
Contributor

Nice.
Improves code readability.

@Karlson2k
Copy link
Contributor

Karlson2k commented Jul 18, 2025

For the records.
The next code could be faster:

inline char *
strncpytail(char *restrict dst, const char *restrict src, size_t dsize)
{
  size_t src_len;
  src_len = strlen(src);

  if (dsize > src_len)
    return memcpy (dst, src, src_len + 1);

  return memcpy (dst, src + (src_len - dsize), dsize);
}

Large strings / buffers can be processed much faster.

But if performance is not an issue, the code in the PR will work fine.

Note: the binary is still a few bytes larger.

@alejandro-colomar
Copy link
Collaborator Author

For the records. The next code could be faster:

inline char *
strncpytail(char *restrict dst, const char *restrict src, size_t dsize)
{
  size_t src_len;
  src_len = strlen(src);

  if (dsize > src_len)
    return memcpy (dst, src, src_len + 1);

  return memcpy (dst, src + (src_len - dsize), dsize);
}

Large strings / buffers can be processed much faster.

But if performance is not an issue, the code in the PR will work fine.

Note: the binary is still a few bytes larger.

Indeed, since this API is designed for short strings (ut_id), I'll optimize for readability for now.

@alejandro-colomar alejandro-colomar added the Simpler A good issue for a new beginner label Jul 20, 2025
@alejandro-colomar
Copy link
Collaborator Author

@hallyn The term nonstring comes from https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html#index-nonstring-variable-attribute. (To avoid you having to read the lengthy discussion in the other thread.)

@hallyn
Copy link
Member

hallyn commented Aug 11, 2025

@hallyn The term nonstring comes from https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html#index-nonstring-variable-attribute. (To avoid you having to read the lengthy discussion in the other thread.)

Thanks

@Karlson2k

This comment was marked as off-topic.

@hallyn
Copy link
Member

hallyn commented Aug 12, 2025

@hallyn The term nonstring comes from https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html#index-nonstring-variable-attribute. (To avoid you having to read the lengthy discussion in the other thread.)

Also need to mention that GCC means by this term "do not use strlen() and other string-based function because the data MAY be not NUL-terminated or could be binary data", but here, in this project, it means different: "it is a text data in fixed size buffer/array, which terminates at the end of the array without NUL-termination or terminated early by first NUL".

Hm, do we need to define a new term?

@Karlson2k
Copy link
Contributor

Karlson2k commented Aug 12, 2025

@hallyn The term nonstring comes from https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html#index-nonstring-variable-attribute. (To avoid you having to read the lengthy discussion in the other thread.)

Also need to mention that GCC means by this term "do not use strlen() and other string-based function because the data MAY be not NUL-terminated or could be binary data", but here, in this project, it means different: "it is a text data in fixed size buffer/array, which terminates at the end of the array without NUL-termination or terminated early by first NUL".

Hm, do we need to define a new term?

Probably.
@alejandro-colomar insisting that it is perfectly described by "fixed size array with zero-padded data". It also assumed that zero-padding must not be checked and data reading must be stopped at the first zero. Also assumed that the data cannot have zeros because zero is used for padding.
I'm trying to convince Alejandro that what is described is not zero-padding, but is just optional early zero-termination.

I any case, I think both terms "non-string" and "zero-padded array" are confusing and not fully describe this kind of entities.

@Karlson2k

This comment was marked as off-topic.

@alejandro-colomar
Copy link
Collaborator Author

Please, do not continue the discussion unthreaded. That makes it impossible to follow the page. If you want to have a lengthy discussion, open a comment on any line of code, which can be hidden easily.

size_t dsize);


// nonstring copy tail-of-string
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hallyn

I'm standing by this term. For the mnemonic comment explaining the name, nonstring is the most appropriate thing to use. You can think of strn as "string-non".

Copy link
Collaborator Author

@alejandro-colomar alejandro-colomar Aug 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also need to mention that GCC means by this term "do not use strlen() and other string-based function because the data MAY be not NUL-terminated or could be binary data", but here, in this project, it means different: "it is a text data in fixed size buffer/array, which terminates at the end of the array without NUL-termination or terminated early by first NUL".

Hm, do we need to define a new term?

We already have "fixed-width null-padded character array", which is what utmp(5) members are.

nonstring is a more generic thing, which englobes this and other creatures. But for this mnemonic use case, nonstring is easier to remember.

@Karlson2k Please do not respond here. I already know your opinion, and am not going to agree with it. Anyone interested in it, can read the lengthy thread above.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hallyn Please merge when you're happy with the name. Feel free to close all conversations, if you need that to be able to merge.

This works similar to strncpy(3), except that it truncates from the
start of the string if the string doesn't fit.  It is useful for utmp(5)
ut_id, where the tail of the string is more useful and distinctive.

Signed-off-by: Alejandro Colomar <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Simpler A good issue for a new beginner

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants