Intelligent preflagging of obviously bad data/weights #182

JSKenyon · 2022-07-22T11:42:03Z

Describe the problem that the feature should address
Currently, QuartiCal endeavours to catch np.inf, np.nan and 0 values in the input data. Values may, however, be obviously bad without falling into one of these categories. This includes data/weight values with peculiarly large or small magnitudes, e.g. a weight of 1e30, or a data point with 100x the amplitude of its neighbours. The difficulty comes from identifying these points on the fly without introducing something as inflexible as an arbitrary threshold.

Describe the solution you'd like
I am still a little unsure of how I want to handle this. The simplest approach may be a very cautious thresholding operation based on the median of a chunk of weights/data. This should only fail if the data is very poorly flagged (flagged points will not be considered in the evaluation of the median).
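
As a rough illustration of the median-based idea, here is a minimal sketch and not QuartiCal code. The function name `median_preflag`, its signature, and the default `factor` are all hypothetical. It assumes a chunk arrives as a NumPy array of values plus a boolean flag array of the same shape, and it leaves non-finite and zero values to the existing checks.

```python
import numpy as np


def median_preflag(values, flags, factor=1e6):
    """Flag points whose magnitude is far from the chunk median.

    Hypothetical helper: `factor` is a deliberately cautious, assumed
    default, not a value taken from QuartiCal.
    """
    # Work in float64 so the threshold arithmetic itself cannot overflow
    # for single-precision inputs. np.abs also handles complex data.
    amp = np.abs(values).astype(np.float64)

    # Only unflagged, finite, non-zero points contribute to the median.
    good = ~flags & np.isfinite(amp) & (amp > 0)
    if not good.any():
        return flags.copy()

    med = float(np.median(amp[good]))

    # Cautious two-sided test: far above or far below the median.
    bad = (amp > med * factor) | (amp < med / factor)

    # Existing flags are preserved; new flags are only raised on points
    # that passed the basic checks above.
    return flags | (bad & good)


weights = np.array([1.0, 1.2, 0.9, 1e30, 1.1, 1e-20], dtype=np.float32)
flags = np.zeros(weights.shape, dtype=bool)
new_flags = median_preflag(weights, flags)
```

Because the threshold is relative to the chunk median, it scales with the data. The median itself only becomes unreliable if a large fraction of the unflagged points are bad, which is the failure case noted above.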

Describe alternatives you've considered
It may be possible to accomplish this using a filter, but I suspect it will be too sensitive/unreliable near chunk edges and in regions where many points are affected.

Additional context
This was motivated by a segfault reported by @ulricharmel. The segfault was related to values of 1e37 in the weight column causing problems in the inversion code (likely an overflow).
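
To show how a weight of this size could overflow, here is a small sketch. It assumes single-precision arithmetic and a product of two such values somewhere in the computation. Neither assumption has been confirmed against the inversion code, and `overflows_when_squared` is a hypothetical helper.

```python
import numpy as np


def overflows_when_squared(weight, dtype=np.float32):
    """Return True if weight * weight overflows to inf in `dtype`."""
    w = dtype(weight)
    # Silence the overflow warning; we only care about the result.
    with np.errstate(over="ignore"):
        return bool(np.isinf(w * w))


# float32 tops out near 3.4e38, so 1e37 is representable but its square
# (1e74) is not. float64 handles the same product without trouble.
single = overflows_when_squared(1e37)
double = overflows_when_squared(1e37, dtype=np.float64)
```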

JSKenyon added the enhancement New feature or request label Jul 22, 2022

JSKenyon self-assigned this Jul 22, 2022
