Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle problem with dff auto-percentile calculation #1288

Merged
merged 2 commits into from
Feb 22, 2024
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
47 changes: 31 additions & 16 deletions caiman/utils/stats.py
Original file line number Diff line number Diff line change
Expand Up @@ -169,6 +169,8 @@ def df_percentile(inputData, axis=None):
Extracting the percentile of the data where the mode occurs and its value.
Used to determine the filtering level for DF/F extraction. Note that
computation can be inaccurate for short traces.

If errors occur, will just return fallback value of median (50% percentile)
Copy link
Member

@pgunn pgunn Feb 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If errors occur, returns fallback value ...

"""
if axis is not None:

Expand All @@ -181,24 +183,37 @@ def fnc(x):
else:
# Create the function that we can use for the half-sample mode
err = True
while err:
err_count = 0
max_err_count = 10
while err==True and err_count < max_err_count:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while err and err_count...

Testing boolean values for ==True is an antipattern

try:
bandwidth, mesh, density, cdf = kde(inputData)
err = False
except:
logging.warning('Percentile computation failed. Duplicating ' + 'and trying again.')
except ValueError:
err_count += 1
logging.warning(f"kde() failure {err_count}. Concatenating traces and recomputing.")
if not isinstance(inputData, list):
inputData = inputData.tolist()
inputData += inputData

data_prct = cdf[np.argmax(density)] * 100
val = mesh[np.argmax(density)]
# There are three ways the above calculation can go wrong.
# For each case: just use median.
# if cdf was never defined, it means kde never ran without error
try:
cdf
except NameError:
logging.warning('Max err count reached. Reverting to median.')
data_prct = 50
val = np.median(np.array(inputData))
else:
data_prct = cdf[np.argmax(density)] * 100
val = mesh[np.argmax(density)]

if data_prct >= 100 or data_prct < 0:
logging.warning('Invalid percentile computed possibly due ' + 'short trace. Duplicating and recomuputing.')
if not isinstance(inputData, list):
inputData = inputData.tolist()
inputData *= 2
err = True
logging.warning('Invalid percentile (<0 or >100) computed. Reverting to median.')
data_prct = 50
val = np.median(np.array(inputData))

if np.isnan(data_prct):
logging.warning('NaN percentile computed. Reverting to median.')
data_prct = 50
Expand All @@ -215,8 +230,6 @@ def fnc(x):
Daniel B. Smith, PhD
Updated 1-23-2013
"""


def kde(data, N=None, MIN=None, MAX=None):

# Parameters to set up the mesh on which to calculate
Expand All @@ -241,12 +254,13 @@ def kde(data, N=None, MIN=None, MAX=None):
SqDCTData = (DCTData[1:] / 2)**2

# The fixed point calculation finds the bandwidth = t_star
# fixed_point is function defined below
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's this for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I edited it a little. Will be nice to come back at some point and document better. The matlab code that it's taken from is slightly better documented: it's a useful method for finding the optimal width of a gaussian kernel to smooth data so you can get a kernel density estimate (estimate of a probability density). Original Matlab code:
https://www.mathworks.com/matlabcentral/fileexchange/14034-kernel-density-estimator

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I just meant the comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh: it was leftover debugging note to myself I removed. fixed_point was argument to function, and I was wondering where it came from

guess = 0.1
try:
t_star = scipy.optimize.brentq(fixed_point, 0, guess, args=(M, I, SqDCTData))
except ValueError:
print('Oops!')
return None
t_star = scipy.optimize.brentq(fixed_point, 0, guess, args=(M, I, SqDCTData))
except ValueError as e:
#
raise ValueError(f"Unable to find root: {str(e)}")

# Smooth the DCTransformed data using t_star
SmDCTData = DCTData * np.exp(-np.arange(N)**2 * np.pi**2 * t_star / 2)
Expand All @@ -262,6 +276,7 @@ def kde(data, N=None, MIN=None, MAX=None):


def fixed_point(t, M, I, a2):
# TODO: document this
l = 7
I = np.float64(I)
M = np.float64(M)
Expand Down
Loading