Skip to content

Conversation

@alejandro-colomar
Copy link
Collaborator

@alejandro-colomar alejandro-colomar commented Aug 11, 2025

I've optimized the algorithm in several ways (both performance and readability):

  • If there's a match-by-pid, we don't spend time copying a match-by-line, which is considered worse.
  • By using the stack, we don't have superfluous allocations.
  • The enum and the ordered levels of matching make the algorithm much more readable than it originally was.

Reported-by: @Karlson2k

Queued after #1296 .


Revisions:

v2
$ git range-diff shadow/master gh/ut  ut
1:  722a3baa = 1:  722a3baa lib/utmp.c: get_current_utmp(): Rename local variable
2:  34b2ed3b = 2:  34b2ed3b lib/utmp.c: get_current_utmp(): Refactor conditional
3:  8d47dcb6 = 3:  8d47dcb6 lib/utmp.c: get_current_utmp(): Use an enum for readability
4:  9e8fd525 = 4:  9e8fd525 lib/utmp.c: get_current_utmp(): Use the stack for temporary copies
-:  -------- > 5:  9a3e362e lib/string/strdup/: memdup(), MEMDUP(): Add APIs
-:  -------- > 6:  1e72a0b8 lib/utmp.c: get_current_utmp(): Don't exit(3) from library code
-:  -------- > 7:  eb01f5ac lib/utmp.c: get_current_utmp(): Use MEMDUP() instead of its pattern
-:  -------- > 8:  9f55b18b lib/utmp.c: get_current_utmp(): Move MEMDUP() to the return statement
v2b
  • Rebase
$ git rd 
1:  722a3baa = 1:  4750a0d2 lib/utmp.c: get_current_utmp(): Rename local variable
2:  34b2ed3b = 2:  d6d8b16f lib/utmp.c: get_current_utmp(): Refactor conditional
3:  8d47dcb6 = 3:  544d7b16 lib/utmp.c: get_current_utmp(): Use an enum for readability
4:  9e8fd525 = 4:  1c8dc8d1 lib/utmp.c: get_current_utmp(): Use the stack for temporary copies
5:  9a3e362e = 5:  8624ca55 lib/string/strdup/: memdup(), MEMDUP(): Add APIs
6:  1e72a0b8 = 6:  36941b6b lib/utmp.c: get_current_utmp(): Don't exit(3) from library code
7:  eb01f5ac = 7:  e1a28feb lib/utmp.c: get_current_utmp(): Use MEMDUP() instead of its pattern
8:  9f55b18b = 8:  791e81b9 lib/utmp.c: get_current_utmp(): Move MEMDUP() to the return statement
v2c
  • Rebase
$ git rd 
1:  4750a0d2 = 1:  c5b432f6 lib/utmp.c: get_current_utmp(): Rename local variable
2:  d6d8b16f = 2:  0d98f40b lib/utmp.c: get_current_utmp(): Refactor conditional
3:  544d7b16 = 3:  38661891 lib/utmp.c: get_current_utmp(): Use an enum for readability
4:  1c8dc8d1 = 4:  eb3f0659 lib/utmp.c: get_current_utmp(): Use the stack for temporary copies
5:  8624ca55 = 5:  c1820b1b lib/string/strdup/: memdup(), MEMDUP(): Add APIs
6:  36941b6b = 6:  6d10069a lib/utmp.c: get_current_utmp(): Don't exit(3) from library code
7:  e1a28feb = 7:  b9702a86 lib/utmp.c: get_current_utmp(): Use MEMDUP() instead of its pattern
8:  791e81b9 = 8:  7e0ea2a8 lib/utmp.c: get_current_utmp(): Move MEMDUP() to the return statement
v2d
  • Rebase
$ git rd 
1:  c5b432f6 = 1:  2ac8ca73 lib/utmp.c: get_current_utmp(): Rename local variable
2:  0d98f40b = 2:  e3c226af lib/utmp.c: get_current_utmp(): Refactor conditional
3:  38661891 = 3:  a210a977 lib/utmp.c: get_current_utmp(): Use an enum for readability
4:  eb3f0659 = 4:  a0b06c75 lib/utmp.c: get_current_utmp(): Use the stack for temporary copies
5:  c1820b1b ! 5:  e96d999c lib/string/strdup/: memdup(), MEMDUP(): Add APIs
    @@ lib/Makefile.am: libshadow_la_SOURCES = \
        string/strcpy/strtcpy.h \
     +  string/strdup/memdup.c \
     +  string/strdup/memdup.h \
    +   string/strdup/strdup.c \
    +   string/strdup/strdup.h \
        string/strdup/strndupa.c \
    -   string/strdup/strndupa.h \
    -   string/strdup/xstrdup.c \
     
      ## lib/string/strdup/memdup.c (new) ##
     @@
6:  6d10069a ! 6:  c67333a5 lib/utmp.c: get_current_utmp(): Don't exit(3) from library code
    @@ Commit message
         Signed-off-by: Alejandro Colomar <[email protected]>
     
      ## lib/utmp.c ##
    -@@
    - #include <string.h>
    - #include <fcntl.h>
    - 
    -+#include "alloc/malloc.h"
    - #include "alloc/x/xcalloc.h"
    --#include "alloc/x/xmalloc.h"
    - #include "attr.h"
    - #include "sizeof.h"
    - #include "string/strchr/strnul.h"
     @@ lib/utmp.c: get_current_utmp(pid_t main_pid)
        }
      
7:  b9702a86 ! 7:  832bfc52 lib/utmp.c: get_current_utmp(): Use MEMDUP() instead of its pattern
    @@ Commit message
     
      ## lib/utmp.c ##
     @@
    - #include <string.h>
      #include <fcntl.h>
      
    + #include "alloc/calloc.h"
     -#include "alloc/malloc.h"
    - #include "alloc/x/xcalloc.h"
      #include "attr.h"
      #include "sizeof.h"
    + #include "string/strchr/strnul.h"
     @@
      #include "string/strcmp/strprefix.h"
      #include "string/strcpy/strncpy.h"
      #include "string/strcpy/strtcpy.h"
     +#include "string/strdup/memdup.h"
    - #include "string/strdup/xstrdup.h"
    - #include "string/strdup/xstrndup.h"
    + #include "string/strdup/strdup.h"
    + #include "string/strdup/strndup.h"
      
     @@ lib/utmp.c: get_current_utmp(pid_t main_pid)
                }
8:  7e0ea2a8 = 8:  a9f85f1b lib/utmp.c: get_current_utmp(): Move MEMDUP() to the return statement
v3
$ git range-diff shadow/master..gh/ut gh/memdup..ut 
1:  2ac8ca73c < -:  --------- lib/utmp.c: get_current_utmp(): Rename local variable
2:  e3c226af5 = 1:  84d29f190 lib/utmp.c: get_current_utmp(): Refactor conditional
3:  a210a9773 = 2:  579b7aaa4 lib/utmp.c: get_current_utmp(): Use an enum for readability
4:  a0b06c75e ! 3:  ecee78c80 lib/utmp.c: get_current_utmp(): Use the stack for temporary copies
    @@ lib/utmp.c: get_current_utmp(pid_t main_pid)
      
     -  if (NULL == ut)
     -          ut = ut_by_pid ?: ut_by_line;
    --
    --  if (NULL != ut) {
    --          struct utmpx  *dup;
    --
    --          dup = XMALLOC(1, struct utmpx);
    --          memcpy(dup, ut, sizeof(*ut));
    --          ut = dup;
    -+  if (match) {
    -+          ut = XMALLOC(1, struct utmpx);
    -+          *ut = ut_copy;
    -+  } else {
    -+          ut = NULL;
    -   }
    ++  ut = match ? MEMDUP(&ut_copy, struct utmpx) : NULL;
      
    +-  if (NULL != ut)
    +-          ut = MEMDUP(ut, struct utmpx);
    +-
     -  free(ut_by_line);
     -  free(ut_by_pid);
        endutxent();
5:  e96d999c3 < -:  --------- lib/string/strdup/: memdup(), MEMDUP(): Add APIs
6:  c67333a54 < -:  --------- lib/utmp.c: get_current_utmp(): Don't exit(3) from library code
7:  832bfc52f < -:  --------- lib/utmp.c: get_current_utmp(): Use MEMDUP() instead of its pattern
8:  a9f85f1b1 = 4:  7c64a9b26 lib/utmp.c: get_current_utmp(): Move MEMDUP() to the return statement
$ git diff gh/ut..ut 
diff --git a/lib/string/strdup/memdup.h b/lib/string/strdup/memdup.h
index df21d624b..cff575b9a 100644
--- a/lib/string/strdup/memdup.h
+++ b/lib/string/strdup/memdup.h
@@ -22,7 +22,7 @@ ATTR_MALLOC(free)
 inline void *memdup(const void *p, size_t size);
 
 
-// memory duplicate
+// memdup - memory duplicate
 inline void *
 memdup(const void *p, size_t size)
 {
diff --git a/lib/utmp.c b/lib/utmp.c
index a965bfa1d..7c0c1cc10 100644
--- a/lib/utmp.c
+++ b/lib/utmp.c
@@ -27,6 +27,7 @@
 #include <fcntl.h>
 
 #include "alloc/calloc.h"
+#include "alloc/malloc.h"
 #include "attr.h"
 #include "sizeof.h"
 #include "string/strchr/strnul.h"

@alejandro-colomar alejandro-colomar changed the title ut match algorithm improvements. ut-match algorithm improvements. Aug 11, 2025
@alejandro-colomar alejandro-colomar marked this pull request as ready for review August 11, 2025 12:34
Copy link
Contributor

@Karlson2k Karlson2k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one looks fine

Copy link
Contributor

@Karlson2k Karlson2k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one looks bad.
Separating conditions by moving them to the nested "if()" does not improve readability.

@alejandro-colomar
Copy link
Collaborator Author

alejandro-colomar commented Aug 11, 2025

This one looks bad. Separating conditions by moving them to the nested "if()" does not improve readability.

IMO, it's more readable in this specific case, because it aligns the similar conditions. It also makes the other commits smaller.

(In fact, it made me finally fully understand it, which I hadn't understood from the previous implementation.)

@Karlson2k
Copy link
Contributor

Karlson2k commented Aug 11, 2025

This one looks bad. Separating conditions by moving them to the nested "if()" does not improve readability.

IMO, it's more readable in this specific case, because it aligns the similar conditions. It also makes the other commits smaller.

"Smaller commits" should not be a reason for "worse code".
Try with the initial order or conditions and enum check at the first place.
It is even more readable then this.

		} else if (   (match < UT_MATCH_BY_LINE)
			   && (LOGIN_PROCESS == ut->ut_type) /* Be more picky when matching by 'ut_line' only */
			   && is_my_tty(ut->ut_line))
		{
			match = UT_MATCH_BY_LINE;
			ut_copy = *ut;
		}

vs

		} else if (   LOGIN_PROCESS == ut->ut_type /* Be more picky when matching by 'ut_line' only */
			   && is_my_tty(ut->ut_line)) {
			if (match < UT_MATCH_BY_LINE)
			{
				match = UT_MATCH_BY_LINE;
				ut_copy = *ut;
			}
		}

Simpler, shorter, better and more efficient.

Copy link
Contributor

@Karlson2k Karlson2k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nested if() should be avoided.

@alejandro-colomar
Copy link
Collaborator Author

This one looks bad. Separating conditions by moving them to the nested "if()" does not improve readability.

IMO, it's more readable in this specific case, because it aligns the similar conditions. It also makes the other commits smaller.

"Smaller commits" should not be a reason for "worse code". Try with the initial order or conditions and enum check at the first place. It is even more readable then this.

		} else if (   (match < UT_MATCH_BY_LINE)
			   && (LOGIN_PROCESS == ut->ut_type) /* Be more picky when matching by 'ut_line' only */
			   && is_my_tty(ut->ut_line))
		{
			match = UT_MATCH_BY_LINE;
			ut_copy = *ut;
		}

vs

		} else if (   LOGIN_PROCESS == ut->ut_type /* Be more picky when matching by 'ut_line' only */
			   && is_my_tty(ut->ut_line)) {
			if (match < UT_MATCH_BY_LINE)
			{
				match = UT_MATCH_BY_LINE;
				ut_copy = *ut;
			}
		}

Simpler, shorter, better

I disagree.

I like much better the latter (but I'd fix the braces). Both UT_MATCH_BY_LINE appear together, which makes more sense. It is also more consistent with the other uses of the enumerators in the loop.

and more efficient.

@Karlson2k
Copy link
Contributor

Karlson2k commented Aug 11, 2025

Besides inconsistent styles, questionable readability improvements in some places, the code is fine.
It should work.
A single enum variable is better then two boolean variables.

This function already returned NULL on some errors.  It didn't make any
sense to exit(3) on allocation failure.  Instead, just return NULL.

Signed-off-by: Alejandro Colomar <[email protected]>
Cc: "Evgeny Grin (Karlson2k)" <[email protected]>
Signed-off-by: Alejandro Colomar <[email protected]>
This aligns similar code, enhancing readability of the algorithm.

Signed-off-by: Alejandro Colomar <[email protected]>
This tells in a more readable way what kind of match we have.
I'll use this enum for simplifying other stuff in the following commit.

Signed-off-by: Alejandro Colomar <[email protected]>
As Evgeny reported, malloc(3) is overkill here.  He suggested using
other methods, which are less than ideal, but it's true that it's
overkill, and with the stack we can get something readable and more
efficient.

Reported-by: Evgeny Grin (Karlson2k) <[email protected]>
Signed-off-by: Alejandro Colomar <[email protected]>
This removes the reuse of 'ut', with which I wasn't entirely happy.
It also reduces the scope between setutxent()/endutxent().

Signed-off-by: Alejandro Colomar <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants