Skip to content

Investigate weird serialization logic #277

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: develop
Choose a base branch
from
Open
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
20 changes: 11 additions & 9 deletions includes/object-cache.php
Original file line number Diff line number Diff line change
Expand Up @@ -1976,11 +1976,16 @@ protected function maybe_unserialize( $original ) {
return igbinary_unserialize( $original );
}

// Don't attempt to unserialize data that wasn't serialized going in.
// Unserialize data that was serialized
if ( $this->is_serialized( $original ) ) {
// phpcs:ignore WordPress.PHP.NoSilencedErrors.Discouraged, WordPress.PHP.DiscouragedPHPFunctions.serialize_unserialize
$value = @unserialize( $original );

// Just in case the data was serialized twice
if ( is_string( $value ) && $this->is_serialized( $value ) ) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is_string test is the first test in the is_serialized method - we can drop the first condition.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For reference the WordPress maybe_serialize function is as follows:

function maybe_serialize( $data ) {
    if ( is_array( $data ) || is_object( $data ) ) {
        return serialize( $data );
    }
 
    /*
     * Double serialization is required for backward compatibility.
     * See https://core.trac.wordpress.org/ticket/12930
     * Also the world will end. See WP 3.6.1.
     */
    if ( is_serialized( $data, false ) ) {
        return serialize( $data );
    }
 
    return $data;
}

Don't think we should follow the backward compatibility discussed in https://core.trac.wordpress.org/ticket/12930

$value = @unserialize( $original );
}

return is_object( $value ) ? clone $value : $value;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we cloning the unserialized object right away?

}

Expand All @@ -2006,17 +2011,13 @@ protected function maybe_serialize( $data ) {
return igbinary_serialize( $data );
}

if ( is_array( $data ) || is_object( $data ) ) {
// phpcs:ignore WordPress.PHP.DiscouragedPHPFunctions.serialize_serialize
return serialize( $data );
}

// Don't serialize data that's already serialized
if ( $this->is_serialized( $data, false ) ) {
// phpcs:ignore WordPress.PHP.DiscouragedPHPFunctions.serialize_serialize
return serialize( $data );
return $data;
}

return $data;
// phpcs:ignore WordPress.PHP.DiscouragedPHPFunctions.serialize_serialize
return serialize( $data );
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Always serialize, except if it already was serialized? @naxvog thoughts?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does Redis currently contain double serialized data? If so, what were the use cases?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<?php
class Naxvog_Test {
        private $creation_time;
        private $serialization_time;
        public $some_prop;
        private $some_secret_prop;
        public function __construct() {
                $this->creation_time = time();
                $this->some_secret_prop = 'baz';
                $this->some_prop = 'bar';
        }
        public function __serialize() {
                return [
                        'serialization_time' => time(),
                        'creation_time' => $this->creation_time,
                ];
        }
        public function __unserialize( $data ) {
                $this->serialization_time = $data['serialization_time'];
                $this->creation_time = $data['creation_time'];
        }
}
add_action( 'init', function() {
        $a = new Naxvog_Test();
        $s = serialize( $a );
        #var_dump( $a, $s );
        #var_dump( unserialize( $s ) );
        if ( isset( $_GET['debug-set'] ) ) {
                wp_cache_set( 'naxvog_test_obj1', $a );
                wp_cache_set( 'naxvog_test_obj2', $s );
        }
        if ( isset( $_GET['debug-get'] ) ) {
                var_dump( wp_cache_get( 'naxvog_test_obj1' ) );
                var_dump( wp_cache_get( 'naxvog_test_obj2' ) );
        }
} );

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we check is_serialized() twice in maybe_unserialize()?

Copy link
Collaborator

@naxvog naxvog Sep 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather prefer a conversion script running on plugin update resolving such issues but this might be a non trivial approach, would most likely require a LUA script and will take some time (best suited for action scheduler).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't think of any valid way to serialize twice. Sure there might be serialized data within a serialized object but this is expected behaviour.

Have to look again but I'm fairly sure that I have not found any double serialized key our docker dev environment. Will have a look on my production redis instance. Should be fairly easy to find.

Copy link
Collaborator

@naxvog naxvog Sep 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK found some instances (ran vim searching for [[:cntrl:]]s:[[:digit:]]\+:"O on a copy of the rdb database):

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we simply solve this by running this twice?

// Don't attempt to unserialize data that wasn't serialized going in.
if ( $this->is_serialized( $original ) ) {
// phpcs:ignore WordPress.PHP.NoSilencedErrors.Discouraged, WordPress.PHP.DiscouragedPHPFunctions.serialize_unserialize
$value = @unserialize( $original );
return is_object( $value ) ? clone $value : $value;
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have to look in the code of those plugins to confirm that this is not intended. If not we should find the cause for this double serialization instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running unserialize() twice seems risky, in case plugins do their own serialization.

}

/**
Expand Down Expand Up @@ -2073,6 +2074,7 @@ protected function is_serialized( $data, $strict = true ) {
return false;
}
}

$token = $data[0];

switch ( $token ) {
Expand Down