Suggestion for further improvement of compute_metrics_dataset.py #22

Open
swickner opened this issue Aug 11, 2021 · 2 comments

@swickner

This is a suggestion for further improvement of the file dosed/functions/compute_metrics_dataset.py. While using the framework to detect two different types of events, I found it necessary to adapt the function to achieve good detection results.

The current implementation averages the metrics for each event type. If there are ten records with ten events of each event type and the network detects all ten events in one record but none in the other nine, the metrics still look good. This is because a record with no predicted events results in an empty list, which is never appended to metrics_test[event_num][metric] and is therefore excluded when the result is averaged over the records.

# Average each metric over the per-record scores; records without
# predictions contributed no entry, so they are silently ignored.
for event_num in range(network.number_of_classes - 1):
    for metric in metrics.keys():
        metrics_test[event_num][metric] = np.nanmean(np.array(metrics_test[event_num][metric]))

return metrics_test
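
To make the pitfall concrete, here is a minimal sketch of the scenario described above (hypothetical numbers, not taken from the repo), assuming the per-record scores are collected exactly as in the loop:

import numpy as np

# Ten records with ten true events each: the network finds all ten
# events in a single record and predicts nothing in the other nine.
# Records without predictions contribute no entry, so the score list
# holds only the one perfect record.
f1_per_record = [1.0]

print(np.nanmean(np.array(f1_per_record)))  # 1.0, despite 90 missed events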

I suggest adapting the function as follows. This weights each metric by the proportion of records in which events of that type were found. With this adaptation, the network learns to find as many events as possible in every record instead of only maximizing the result on a single record. In addition, the suggestion assigns a value of -1 when no events of an event type are predicted at all, to encourage a more balanced prediction of the two event types.

for event_num in range(network.number_of_classes - 1):
    for metric in metrics.keys():
        # Weight the averaged metric by the fraction of records in which
        # events of this type were found at all.
        num_records_with_found_events = len(metrics_test[event_num][metric])
        num_records = len(test_dataset.records)
        if num_records_with_found_events > 0:
            metrics_test[event_num][metric] = np.nanmean(
                np.array(metrics_test[event_num][metric])
            ) * (num_records_with_found_events / num_records)
        else:
            # No record produced any event of this type: penalize with -1.
            metrics_test[event_num][metric] = -1
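
For the same hypothetical scenario, a quick check of what the proposed weighting reports instead:

import numpy as np

f1_per_record = [1.0]   # one record with predictions, as above
num_records = 10        # total records in the test dataset
num_records_with_found_events = len(f1_per_record)

weighted = np.nanmean(np.array(f1_per_record)) * (
    num_records_with_found_events / num_records
)
print(weighted)  # 0.1 instead of 1.0, penalizing the nine empty records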

Maybe this suggestion could improve the prediction process.

@agramfort
Contributor

It has been a long time since I looked at this code. Feel free to open PRs and we'll merge quickly.

@vthorey, who is in charge of this repo at Dreem now?

@swickner
Author

Alright, I will open a pull request.
