POC: Pure Python NFS client #997

Open
wants to merge 14 commits into
base: main

twiggler
Contributor

@twiggler twiggler commented Jan 17, 2025

This is an experimental NFS V3 client which supports:

  • Querying of PortMapper to discover ports of programs
  • Mounting of a root directory
  • Reading of directory entries
  • Reading (downloading) of a file.
  • Authentication using AuthNull & AuthUnix

A demo is also included.
Some coarse tests are included.

The code is considered POC quality and thus a bit rough around the edges.

References:

@twiggler twiggler force-pushed the nfs branch 3 times, most recently from 3e96e96 to cddedf5 Compare January 17, 2025 16:15
@twiggler twiggler requested a review from Miauwkeru January 17, 2025 16:15
@twiggler twiggler force-pushed the nfs branch 2 times, most recently from 7f8cf31 to 41060bb Compare January 17, 2025 16:24

codecov bot commented Jan 17, 2025

Codecov Report

Attention: Patch coverage is 82.78805% with 121 lines in your changes missing coverage. Please review.

Project coverage is 77.89%. Comparing base (aab863c) to head (0314b25).
Report is 17 commits behind head on main.

Files with missing lines                       Patch %   Lines
dissect/target/helpers/nfs/demo.py             0.00%     44 Missing ⚠️
dissect/target/helpers/sunrpc/serializer.py    85.09%    31 Missing ⚠️
dissect/target/helpers/nfs/client.py           70.37%    16 Missing ⚠️
dissect/target/helpers/nfs/serializer.py       84.26%    14 Missing ⚠️
dissect/target/helpers/sunrpc/client.py        88.34%    12 Missing ⚠️
dissect/target/helpers/nfs/nfs3.py             98.44%    2 Missing ⚠️
dissect/target/helpers/sunrpc/sunrpc.py        97.36%    2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #997      +/-   ##
==========================================
+ Coverage   77.71%   77.89%   +0.17%     
==========================================
  Files         326      334       +8     
  Lines       28543    29362     +819     
==========================================
+ Hits        22183    22872     +689     
- Misses       6360     6490     +130     
Flag Coverage Δ
unittests 77.89% <82.78%> (+0.17%) ⬆️


dissect/target/helpers/nfs/client.py Outdated
Comment on lines 35 to 36
if mount_stat != MountStat.MNT3_OK:
    return mount_stat
Contributor

Maybe just raising an exception would be better, so we have a consistent output of the deserialize function?

Contributor Author

@twiggler twiggler Jan 21, 2025

I think I will keep this as-is to follow the RFC closely here, and then throw an exception in a MountClient (to be created), which can be built on top of the sunrpc.Client. Or perhaps I will add a mount method to the NfsClient, which is simpler.

I will add a comment in the demo
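
The layering described in this reply could look roughly like the sketch below: the deserializer keeps returning the raw status, mirroring the RFC's union, and a thin wrapper in a higher-level client turns a non-OK status into an exception. `MountOK`, `MountError`, and `unwrap_mount_result` are hypothetical names for illustration and are not part of the PR; `MountStat` lists only a few of the values quoted in this review.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List, Union


class MountStat(IntEnum):
    # Subset of the status codes quoted elsewhere in this review.
    MNT3_OK = 0
    MNT3ERR_PERM = 1
    MNT3ERR_NOENT = 2
    MNT3ERR_ACCES = 13


@dataclass
class MountOK:
    # Hypothetical stand-in for the successful mount result.
    filehandle: bytes
    auth_flavors: List[int]


class MountError(Exception):
    """Hypothetical exception raised by the higher-level client."""

    def __init__(self, stat: MountStat) -> None:
        super().__init__(f"mount failed: {stat.name}")
        self.stat = stat


def unwrap_mount_result(result: Union[MountOK, MountStat]) -> MountOK:
    # The deserializer may return either shape; callers of the wrapper
    # only ever see a MountOK or an exception.
    if isinstance(result, MountStat):
        raise MountError(result)
    return result
```

With this split, the serializer layer stays a direct transcription of the protocol, while only the wrapper decides what counts as an error.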

Comment on lines +28 to +31
class Serializer(ABC, Generic[Serializable]):
    @abstractmethod
    def serialize(self, _: Serializable) -> bytes:
        pass
Contributor

Maybe it is a better idea to have something like this:

from abc import ABC, abstractmethod
from dataclasses import dataclass


class Serializable(ABC):
    @abstractmethod
    def serialize(self) -> bytes:
        pass


@dataclass
class CookieVerf3(Serializable):
    def serialize(self) -> bytes:
        ...  # Do the serialize code here

Then the serialization can be done in the dataclasses instead of multiple different serializers.

Contributor Author

(see reply on following comment)

Comment on lines 65 to 71
class Deserializer(ABC, Generic[Serializable]):
    def deserialize_from_bytes(self, payload: bytes) -> Serializable:
        return self.deserialize(io.BytesIO(payload))

    @abstractmethod
    def deserialize(self, _: io.BytesIO) -> Serializable:
        pass
Contributor

Might it be an idea to do something like this:

import io
from abc import ABC, abstractmethod
from dataclasses import dataclass

# Reference to self
from typing_extensions import Self


class Deserializable(ABC):
    @classmethod
    def from_bytes(cls, payload: bytes) -> Self:
        return cls.deserialize(io.BytesIO(payload))

    # abstractmethod must be the innermost decorator
    @classmethod
    @abstractmethod
    def deserialize(cls, _: io.BytesIO) -> Self:
        pass

    ...

@dataclass
class FileAttributes3(Deserializable):
    @classmethod
    def deserialize(cls, payload: io.BytesIO) -> Self:
        ...
        return cls(...)

All the _read_* functions would need to be converted to classmethods to use them in the deserialize functions.

However, it would bring the serialization/deserialization closer to the data it actually serializes/deserializes, which I feel makes more sense.

Contributor Author

@twiggler twiggler Jan 21, 2025

Generally speaking, I think it is not a bad idea to treat serialization to a specific format as a separate concern, especially since the cost seems to be low.

Relatedly, if we make deserialize a class method, then I wonder how we could, for example, deserialize the params and result child members in the Message(Serializer). I don't think Python supports reifying type parameters at run time, and hence we would not know which deserializer to use for fields of generic dataclasses.

Finally, coupling the data to the deserializer makes it harder to compose parsers.
I favored keeping things simple, but for example _read_optional and _read_var_length could be made more flexible by returning parsers themselves, if so required in later stories:

_read_var_length(self, payload: io.BytesIO, deserializer: Deserializer[ElementType]) -> Deserializer[list[ElementType]]

_read_optional(self, payload: io.BytesIO, deserializer: Deserializer[ElementType]) -> Deserializer[ElementType | None]

so that they can be chained further. Another example is an "or" combinator.
When we couple the data to the deserialization, we shut this down.
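
To make the composition argument concrete, here is a minimal sketch of deserializers that wrap other deserializers. It is not the PR's code: `UInt32`, `OptionalOf`, and `VarLengthOf` are hypothetical names, and the sketch assumes the usual XDR encodings (a 4-byte boolean before optional data, a 4-byte count before a variable-length array).

```python
import io
from abc import ABC, abstractmethod
from typing import Generic, List, Optional, TypeVar

T = TypeVar("T")


class Deserializer(ABC, Generic[T]):
    @abstractmethod
    def deserialize(self, payload: io.BytesIO) -> T:
        ...


class UInt32(Deserializer[int]):
    def deserialize(self, payload: io.BytesIO) -> int:
        return int.from_bytes(payload.read(4), byteorder="big", signed=False)


class OptionalOf(Deserializer[Optional[T]]):
    """Optional data: a 4-byte boolean, then the value if it is true."""

    def __init__(self, inner: Deserializer[T]) -> None:
        self._inner = inner

    def deserialize(self, payload: io.BytesIO) -> Optional[T]:
        present = UInt32().deserialize(payload)
        return self._inner.deserialize(payload) if present else None


class VarLengthOf(Deserializer[List[T]]):
    """Variable-length array: a 4-byte count, then that many elements."""

    def __init__(self, inner: Deserializer[T]) -> None:
        self._inner = inner

    def deserialize(self, payload: io.BytesIO) -> List[T]:
        count = UInt32().deserialize(payload)
        return [self._inner.deserialize(payload) for _ in range(count)]


# Combinators nest freely because each one is itself a Deserializer.
parser = VarLengthOf(OptionalOf(UInt32()))
data = bytes.fromhex("00000002" "00000001" "00000007" "00000000")
values = parser.deserialize(io.BytesIO(data))  # two entries: 7, then an absent value
```

Because each wrapper only depends on the `Deserializer` interface, the element type flows through the generics and a type checker can infer `List[Optional[int]]` for `values`.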

Contributor Author

@twiggler twiggler Jan 27, 2025

Discussed with @Miauwkeru IRL:

His proposal is to use polymorphism and replace fields which are dependent on a type parameter with Deserializable, as in the following:

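from __future__ import annotations

from dataclasses import dataclass
from typing import Generic, TypeVar, cast

X = TypeVar("X")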


class Deserializable():
    @classmethod
    def deserialize(cls, data: bytes) -> Deserializable:
        pass


@dataclass
class Z(Deserializable):
    @classmethod
    def deserialize(cls, data: bytes) -> Z:
        return cls(s="Roel", i=5)

    s: str
    i: int


@dataclass
class K(Generic[X], Deserializable):
    z: Deserializable
    a: X


z = Z(s="Roel", i=5)

# Access through either Generic or Deserializable 
k = K[Z](z=z, a=z)

k.z.i   # type error (Any), access through super class
z2: Z = k.z; z2.i  # ok, but tedious
cast(Z, k.a).i  # ok, but tedious

k.a.i   # ok, access through field dependent on type parameter

The downside of this approach is that we lose type inference when we access z through k.z.i, compared to the approach with generics, as in k.a.i.

This type inference is used, for example, in Client::readdirplus, where the result is correctly inferred to be ReadDirPlusResult3 | NfsStat.

It needs to be carefully weighed whether the strong locality of the serializer with its data offsets the loss of type inference.
Besides that, shutting down the composition of deserializers is also a concern.

Comment on lines 252 to 261
if messageType == MessageType.REPLY:
    replyStat = self._read_enum(payload, ReplyStat)
    if replyStat == ReplyStat.MSG_ACCEPTED:
        reply = self._read_accepted_reply(payload)
    elif replyStat == ReplyStat.MSG_DENIED:
        reply = self._read_rejected_reply(payload)

    return sunrpc.Message(xid, reply)

raise NotImplementedError("Only REPLY messages are deserializable")
Contributor

Reduces indentation a bit

Suggested change
- if messageType == MessageType.REPLY:
-     replyStat = self._read_enum(payload, ReplyStat)
-     if replyStat == ReplyStat.MSG_ACCEPTED:
-         reply = self._read_accepted_reply(payload)
-     elif replyStat == ReplyStat.MSG_DENIED:
-         reply = self._read_rejected_reply(payload)
-     return sunrpc.Message(xid, reply)
- raise NotImplementedError("Only REPLY messages are deserializable")
+ if messageType != MessageType.REPLY:
+     raise NotImplementedError("Only REPLY messages are deserializable")
+ replyStat = self._read_enum(payload, ReplyStat)
+ if replyStat == ReplyStat.MSG_ACCEPTED:
+     reply = self._read_accepted_reply(payload)
+ elif replyStat == ReplyStat.MSG_DENIED:
+     reply = self._read_rejected_reply(payload)
+ return sunrpc.Message(xid, reply)

Comment on lines 95 to 104
MNT3_OK = 0 # no error
MNT3ERR_PERM = 1 # Not owner
MNT3ERR_NOENT = 2 # No such file or directory
MNT3ERR_IO = 5 # I/O error
MNT3ERR_ACCES = 13 # Permission denied
MNT3ERR_NOTDIR = 20 # Not a directory
MNT3ERR_INVAL = 22 # Invalid argument
MNT3ERR_NAMETOOLONG = 63 # Filename too long
MNT3ERR_NOTSUPP = 10004 # Operation not supported
MNT3ERR_SERVERFAULT = 10006 # A failure on the server
Contributor

Would it be an idea to remove the MNT3 prefix to reduce visual noise?

pass


class Client(Generic[Credentials, Verifier]):
Contributor

A usability suggestion: would it be an idea to make this class a context manager, or to create a context manager for it?

That would allow you to do:

with Client(...):
   ...

and would automatically clean up the connections it makes.

Contributor Author

Yeah, I thought about it, but it did not make the cut.

It is probably more pressing than I originally thought, given that the demo was eating up ports.

dissect/target/helpers/sunrpc/client.py Outdated


class PortMappingSerializer(Serializer[sunrpc.PortMapping]):
    def serialize(self, portMapping: sunrpc.PortMapping) -> bytes:
Contributor

snake_case

Comment on lines 37 to 39
mount_port = port_mapper_client.call(100000, 2, 3, params_mount, PortMappingSerializer(), UInt32Serializer())
params_nfs = PortMapping(program=NFS_PROGRAM, version=NFS_V3, protocol=Protocol.TCP)
nfs_port = port_mapper_client.call(100000, 2, 3, params_nfs, PortMappingSerializer(), UInt32Serializer())
Contributor

What do the numbers mean?
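
For readers with the same question: in the ONC RPC port mapper protocol (RFC 1833), 100000 is the port mapper program number, 2 is its version, and 3 is the GETPORT procedure. A sketch of how the magic numbers could be named follows. `PMAP_PROGRAM`, `PMAP_VERSION`, `PortMapperProc`, and `pack_port_mapping` are hypothetical names, not the PR's; the numeric values of `NFS_PROGRAM`, `NFS_V3`, and `Protocol.TCP` are the well-known protocol values and are assumed to match the PR's constants.

```python
import struct
from enum import IntEnum

PMAP_PROGRAM = 100000  # port mapper program number
PMAP_VERSION = 2       # port mapper protocol version
NFS_PROGRAM = 100003   # NFS program number
NFS_V3 = 3             # NFS protocol version


class PortMapperProc(IntEnum):
    NULL = 0
    SET = 1
    UNSET = 2
    GETPORT = 3
    DUMP = 4
    CALLIT = 5


class Protocol(IntEnum):
    TCP = 6   # IP protocol number for TCP
    UDP = 17  # IP protocol number for UDP


def pack_port_mapping(program: int, version: int, protocol: int, port: int = 0) -> bytes:
    # The GETPORT argument is a mapping of four big-endian unsigned
    # 32-bit integers: program, version, protocol, port.
    return struct.pack(">IIII", program, version, protocol, port)


# Arguments for asking the port mapper where NFSv3 over TCP listens.
getport_args = pack_port_mapping(NFS_PROGRAM, NFS_V3, Protocol.TCP)
```

With names like these, the call in the demo would read as `call(PMAP_PROGRAM, PMAP_VERSION, PortMapperProc.GETPORT, ...)` instead of `call(100000, 2, 3, ...)`.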

@@ -36,20 +40,24 @@ class ReadDirResult(NamedTuple):
# RdJ Bit annoying that the Credentials and Verifier keep propagating as type parameters of the class.
# Alternatively, we could use type erasure and couple the auth data with the auth serializer,
# and make the auth data in the `CallBody` class opaque.
- class Client(Generic[Credentials, Verifier]):
+ class Client(AbstractContextManager, Generic[Credentials, Verifier]):
Contributor

Technically there is no need to inherit from AbstractContextManager. You could also just define the __enter__ and __exit__ methods; nothing more is needed.

Contributor Author

Yeah I find it convenient, this way you get the __enter__ for free.
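
A small sketch of what "for free" means here: `contextlib.AbstractContextManager` supplies a default `__enter__` that returns `self`, so a subclass only has to implement `__exit__`. `FakeConnection` is a hypothetical stand-in for the socket the real client holds, and this `Client` is a simplified illustration rather than the PR's class.

```python
from contextlib import AbstractContextManager


class FakeConnection:
    """Hypothetical stand-in for the socket the real client would hold."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


class Client(AbstractContextManager):
    def __init__(self, connection: FakeConnection) -> None:
        self._connection = connection

    def close(self) -> None:
        self._connection.close()

    # __enter__ is inherited and returns self; only __exit__ is defined here.
    # Returning None lets any exception raised in the with-body propagate.
    def __exit__(self, exc_type, exc_value, traceback) -> None:
        self.close()


connection = FakeConnection()
with Client(connection) as client:
    closed_inside = connection.closed
closed_after = connection.closed
```

Either approach gives the same behavior at the call site; inheriting mainly saves writing the trivial `__enter__`.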

Comment on lines 253 to 259
replyStat = self._read_enum(payload, ReplyStat)
if replyStat == ReplyStat.MSG_ACCEPTED:
    reply = self._read_accepted_reply(payload)
elif replyStat == ReplyStat.MSG_DENIED:
    reply = self._read_rejected_reply(payload)

return sunrpc.Message(xid, reply)
Contributor

Just to avoid any weird shenanigans, it may still be a good idea to set reply to None above the conditions, so it is at least defined.
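
One alternative to pre-assigning `None`, sketched under the assumption that an unexpected status should be treated as a protocol error: add an explicit fall-through that raises, so the variable can never be read while unbound. `select_reply` is a hypothetical helper, and the two strings stand in for the results of the real `_read_accepted_reply` and `_read_rejected_reply` calls.

```python
from enum import IntEnum


class ReplyStat(IntEnum):
    MSG_ACCEPTED = 0
    MSG_DENIED = 1


def select_reply(reply_stat: int, accepted: str, rejected: str) -> str:
    # Every branch either returns or raises, so there is no code path
    # on which the reply is left undefined.
    if reply_stat == ReplyStat.MSG_ACCEPTED:
        return accepted
    if reply_stat == ReplyStat.MSG_DENIED:
        return rejected
    raise ValueError(f"Unexpected reply status: {reply_stat!r}")
```

Compared with `reply = None`, this fails at the point of the bad input instead of passing a `None` reply further down the call chain.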

@classmethod
def connect(
    cls, hostname: str, port: int, auth: AuthScheme[Credentials, Verifier], local_port: int = 0
) -> "Client":
Contributor

the annotations allow you to not use the "Client" but Client instead, if i r

Suggested change
- ) -> "Client":
+ ) -> Client:

PMAP_PORT = 111

@classmethod
def connect_port_mapper(cls, hostname: str) -> "Client":
Contributor

Suggested change
- def connect_port_mapper(cls, hostname: str) -> "Client":
+ def connect_port_mapper(cls, hostname: str) -> Client:

) -> Results:
    """Synchronously call an RPC procedure and return the result"""

    callBody = sunrpc.CallBody(
Contributor

snake_case

Comment on lines 127 to 128
filehandle: FileHandle3
authFlavors: list[int]
Contributor

snake case

Comment on lines 225 to 228
paramsSerializer: XdrSerializer[ProcedureParams],
resultsDeserializer: XdrDeserializer[ProcedureResults],
credentialsSerializer: AuthSerializer[Credentials],
verifierSerializer: AuthSerializer[Verifier],
Contributor

snake_case

Comment on lines 87 to 90
filehandle=FileHandle3(
    opaque=b"\x01\x00\x07\x00\x02\x00\xec\x02\x00\x00\x00\x00\xb5g\x131&\xf1I\xed\xb8R\rx\\h8\xb4"
),
authFlavors=[1],
Contributor

snake_case

Comment on lines 48 to 60
    def serialize(self, i: int) -> bytes:
        return i.to_bytes(length=4, byteorder="big", signed=True)

    def deserialize(self, payload: io.BytesIO) -> int:
        return int.from_bytes(payload.read(4), byteorder="big", signed=True)


class UInt32Serializer(Serializer[int], Deserializer[int]):
    def serialize(self, i: int) -> bytes:
        return i.to_bytes(length=4, byteorder="big", signed=False)

    def deserialize(self, payload: io.BytesIO) -> int:
        return int.from_bytes(payload.read(4), byteorder="big", signed=False)
Contributor

Suggested change
-     def serialize(self, i: int) -> bytes:
-         return i.to_bytes(length=4, byteorder="big", signed=True)
-     def deserialize(self, payload: io.BytesIO) -> int:
-         return int.from_bytes(payload.read(4), byteorder="big", signed=True)
- class UInt32Serializer(Serializer[int], Deserializer[int]):
-     def serialize(self, i: int) -> bytes:
-         return i.to_bytes(length=4, byteorder="big", signed=False)
-     def deserialize(self, payload: io.BytesIO) -> int:
-         return int.from_bytes(payload.read(4), byteorder="big", signed=False)
+     _signed = True
+     def serialize(self, i: int) -> bytes:
+         return i.to_bytes(length=4, byteorder="big", signed=self._signed)
+     def deserialize(self, payload: io.BytesIO) -> int:
+         return int.from_bytes(payload.read(4), byteorder="big", signed=self._signed)
+ class UInt32Serializer(Int32Serializer):
+     _signed = False

wouldn't something like this work?
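
As a quick check that the idea works, here is a standalone sketch of the suggestion. It drops the `Serializer[int]` and `Deserializer[int]` base classes from the PR for brevity (an assumption made only to keep the example self-contained) and shows how the same four bytes decode differently depending on `_signed`.

```python
import io


class Int32Serializer:
    _signed = True  # class attribute read by both methods

    def serialize(self, i: int) -> bytes:
        return i.to_bytes(length=4, byteorder="big", signed=self._signed)

    def deserialize(self, payload: io.BytesIO) -> int:
        return int.from_bytes(payload.read(4), byteorder="big", signed=self._signed)


class UInt32Serializer(Int32Serializer):
    _signed = False  # the only difference from the signed variant


all_ones = b"\xff\xff\xff\xff"
as_signed = Int32Serializer().deserialize(io.BytesIO(all_ones))
as_unsigned = UInt32Serializer().deserialize(io.BytesIO(all_ones))
```

One trade-off to keep in mind: with this hierarchy, `isinstance(UInt32Serializer(), Int32Serializer)` is true, which may or may not be desirable for code that distinguishes the two.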

Comment on lines 10 to 17
ProcedureParams = TypeVar("ProcedureParams")
ProcedureResults = TypeVar("ProcedureResults")
Credentials = TypeVar("Credentials")
Verifier = TypeVar("Verifier")
Serializable = TypeVar("Serializable")
AuthProtocol = TypeVar("AuthProtocol")
EnumType = TypeVar("EN", bound=Enum)
ElementType = TypeVar("ET")
Contributor

Maybe an idea to add bound for every type var?

Contributor Author

On second thought, there is no bound on any of the other TypeVars
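
For context on what a bound buys, here is a minimal sketch modeled on the `_read_enum(payload, ReplyStat)` calls quoted in this review. `read_enum` is a hypothetical free function, not the PR's method, and the sketch assumes enums are encoded as 4-byte signed big-endian integers, as XDR specifies.

```python
import io
from enum import Enum, IntEnum
from typing import Type, TypeVar

# With bound=Enum, a type checker rejects calls that pass a non-enum class,
# and the return type is narrowed to the specific enum that was passed in.
EnumType = TypeVar("EnumType", bound=Enum)


def read_enum(payload: io.BytesIO, enum_cls: Type[EnumType]) -> EnumType:
    value = int.from_bytes(payload.read(4), byteorder="big", signed=True)
    return enum_cls(value)  # raises ValueError for values the enum lacks


class ReplyStat(IntEnum):
    MSG_ACCEPTED = 0
    MSG_DENIED = 1


stat = read_enum(io.BytesIO(b"\x00\x00\x00\x01"), ReplyStat)
```

For the other type variables there is no common base class to bind to, which is consistent with leaving them unbounded.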

@twiggler twiggler requested a review from Miauwkeru January 28, 2025 09:56