Factor stream resource producing code into a separate class #290
Conversation
from ._hdfdataset import _HDFDataset


@dataclass
class _HDFDataset:
Per @coretl:

This should be slimmed to only what's needed to describe a file, not what's needed to write it. The pattern detector is different from the others because it is actually writing the HDF5 file, not describing a file written by an external program (IOC). For the pattern detector, a separate dataclass should extend this to add `maxshape` and `fillvalue`.

With that separation, we can make all of these fields required.

Some of this information is used to construct the EventDescriptor (`shape`, `dtype`, `name`?, `multiplier`). Other information is used to construct the StreamResource (`path`, meaning the "path" of the dataset inside the layout of the HDF5 file, distinct from the file itself; and optionally `swmr`).

Add `dtype_str`, the numpy-style dtype like `"<i4"` or `"<f8"`.

This should also include the `uri`. (For old-style StreamResource we can convert this into `root` and `resource_path` by ignoring the hostname, hard-coding `root` to `"/"` as discussed, and using the rest of the path as `resource_path`.)
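The old-style conversion described here could be sketched as follows; the helper name `uri_to_resource` is hypothetical, not part of the PR:

```python
from urllib.parse import urlparse


def uri_to_resource(uri: str) -> tuple[str, str]:
    # Hypothetical helper: derive the old-style StreamResource fields
    # from a file URI by ignoring the hostname, hard-coding root to "/",
    # and using the rest of the path as resource_path.
    parsed = urlparse(uri)
    root = "/"
    resource_path = parsed.path.lstrip("/")
    return root, resource_path
```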
bundler_composer(
    mimetype="application/x-hdf5",
    uri=uri,
    data_key=ds.name.replace("/", "_"),
Which would make this:

- data_key=ds.name.replace("/", "_"),
+ data_key=ds.data_key,
I would make data_keys with `/` in them an error, not a silent replacement.
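A minimal sketch of that check (the helper name is illustrative):

```python
def validate_data_key(name: str) -> str:
    # Illustrative check: reject data_keys containing "/" outright
    # instead of silently rewriting them with replace("/", "_").
    if "/" in name:
        raise ValueError(f"data_key {name!r} must not contain '/'")
    return name
```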
I have a question regarding the removal of `replace("/", "_")`. If I remove it, I have to replace `DATA_PATH = /entry/data/data` with `DATA_PATH = _entry_data_data` in `pattern_generator.py`, and so on; otherwise I get a regex-not-matching error from the bluesky RunEngine. I tentatively replaced the paths this way in the code, but would like to ask for suggestions.
They are different things: `data_key` is the name within the descriptor, and `dataset` is the path within the HDF file. So for the pattern generator I think we want something like:

_HDFDataset(
    data_key=f"{detector_name}-data",
    dataset=DATA_PATH,
)

I'm not quite sure how you pass the detector name through and into the pattern generator, though...
In `pattern_generator.py`, the PATH parameters were assigned to `name` (now `data_key`). What I meant is that deleting the `replace("/", "_")` resulted in errors, so I had to modify the PATH parameters. The other thing is that I am not sure I understand passing the detector name to `_HDFDataset`, since I am not experienced with the usage of this class. Is a device going to assign the parameters in this class? I would be happy to have more context.
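For illustration only, one way the owning device could feed its name into dataset construction might look like this; the two-field dataclass is a stand-in for the real `_HDFDataset`, and `make_datasets` is a hypothetical helper:

```python
from dataclasses import dataclass


# Stand-in for the real _HDFDataset; only the two fields under discussion
@dataclass
class _HDFDataset:
    data_key: str   # name within the Descriptor document
    dataset: str    # path of the dataset inside the HDF5 file


DATA_PATH = "/entry/data/data"  # HDF-internal path, as in pattern_generator.py


def make_datasets(detector_name: str) -> list[_HDFDataset]:
    # The device passes its own name down, so data_key stays "/"-free
    # while dataset keeps the real HDF5 path.
    return [_HDFDataset(data_key=f"{detector_name}-data", dataset=DATA_PATH)]
```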
    uri=uri,
    data_key=ds.name.replace("/", "_"),
    parameters={
        "path": ds.path,
And this would be:

- "path": ds.path,
+ "dataset": ds.dataset,
#: Name of the data_key within the Descriptor document
data_key: str
dtype_numpy: Optional[str] = None
swmr: bool = False
shape: Optional[List[int]] = None
multiplier: Optional[int] = 1
#: Name of the dataset within the HDF file
dataset: Optional[str] = None
device_name: Optional[str] = None
block: Optional[str] = None
maxshape: tuple[Any, ...] = (None,)
dtype: Optional[Any] = None
fillvalue: Optional[int] = None
-#: Name of the data_key within the Descriptor document
-data_key: str
-dtype_numpy: Optional[str] = None
-swmr: bool = False
-shape: Optional[List[int]] = None
-multiplier: Optional[int] = 1
-#: Name of the dataset within the HDF file
-dataset: Optional[str] = None
-device_name: Optional[str] = None
-block: Optional[str] = None
-maxshape: tuple[Any, ...] = (None,)
-dtype: Optional[Any] = None
-fillvalue: Optional[int] = None
+#: Name of the data_key within the Descriptor document
+data_key: str
+# Numpy dtype representation of the type
+dtype_numpy: str
+# Is the h5 file written in SingleWriter, MultipleReader mode
+swmr: bool = False
+# Shape of a frame of the h5 file (mutable default needs default_factory)
+shape: List[int] = field(default_factory=list)
+# How many frames per StreamDatum index
+multiplier: int = 1
+#: Name of the dataset within the HDF file
+dataset: Optional[str] = None
I think?
that looks right to me
thanks for explaining!
  # pixel sum path
- SUM_PATH = "/entry/sum"
+ SUM_PATH = "_entry_sum"

  MAX_UINT8_VALUE = np.iinfo(np.uint8).max

  SLICE_NAME = "AD_HDF5_SWMR_SLICE"
@dataclass
class PatternDataset(_HDFDataset):
    maxshape: tuple[Any, ...] = (None,)
    fillvalue: Optional[int] = None
    dtype: Optional[type] = None  # Or whatever this was on the previous class
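Putting the suggested split together, a self-contained sketch might look like this; field names follow the discussion in this thread, and the defaults are illustrative assumptions rather than the final API:

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class _HDFDataset:
    # Describes a dataset in an HDF5 file, whether written by an IOC
    # or by the pattern detector itself
    data_key: str            # name within the Descriptor document
    dtype_numpy: str         # numpy-style dtype, e.g. "<i4" or "<f8"
    dataset: str             # path of the dataset inside the HDF5 file
    swmr: bool = False       # written in SingleWriter MultipleReader mode?
    shape: List[int] = field(default_factory=list)  # shape of one frame
    multiplier: int = 1      # frames per StreamDatum index


@dataclass
class PatternDataset(_HDFDataset):
    # Extends the base with the fields the pattern detector needs
    # because it actually writes the file
    maxshape: tuple[Any, ...] = (None,)
    fillvalue: Optional[int] = None
```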
  def get_full_file_description(
-     datasets: List[DatasetConfig], outer_shape: tuple[int, ...]
+     datasets: List[_HDFDataset], outer_shape: tuple[int, ...]
- datasets: List[_HDFDataset], outer_shape: tuple[int, ...]
+ datasets: List[PatternDataset], outer_shape: tuple[int, ...]
@@ -255,24 +173,24 @@ def _get_new_path(self, directory: DirectoryProvider) -> Path:
      new_path: Path = info.root / info.resource_dir / filename
      return new_path

- def _get_datasets(self) -> List[DatasetConfig]:
-     raw_dataset = DatasetConfig(
+ def _get_datasets(self) -> List[_HDFDataset]:
- def _get_datasets(self) -> List[_HDFDataset]:
+ def _get_datasets(self) -> List[PatternDataset]:
- def _get_datasets(self) -> List[DatasetConfig]:
-     raw_dataset = DatasetConfig(
+ def _get_datasets(self) -> List[_HDFDataset]:
+     raw_dataset = _HDFDataset(
- raw_dataset = _HDFDataset(
+ raw_dataset = PatternDataset(
Closes #173