Skip to content

Missing escape of square brackets causes diff_jsons.py in some cram tests to be false negative #1665

@corneliusroemer

Description

@corneliusroemer

While writing a new cram test for #1664, I copy/pasted part of an existing cram test:

$ python3 "$TESTDIR/../../../../scripts/diff_jsons.py" \
> --exclude-regex-paths "['seqid']" -- \
> "$TESTDIR/../data/ancestral_mutations_with_root_sequence.json" \
> "$CRAMTMP/$TESTFILE/ancestral_mutations.json"
{}

Out of curiosity or boredom, I tested the test, editing the diff target to see if the test would fail. To my surprise, the diff_jsons didn't report a change. That led me down a bit of a rabbit hole.

TLDR: in multiple tests, we wrongly use the --exclude-regex-paths feature of diff_jsons.py (wrapping around DeepDiff) causing those test to always pass.

The problem is that if square brackets are not escaped with backslashes, they are interpreted as character classes containing ' which is in all paths, hence all paths are excluded. For a longer explanation see: https://stackoverflow.com/a/79173188/7483211

I've identified 4 places where we exclude everything, but we might want to review the other places where we use exclude-regex-paths as well.

Here are the cases where we exclude everything:

cram tests/functional
......................!
--- tests/functional/translate/cram/translate-with-gff-and-gene-name.t
+++ tests/functional/translate/cram/translate-with-gff-and-gene-name.t.err
@@ -29,4 +29,6 @@
   >  --exclude-regex-paths "['seqid']" -- \
   >  "${DATA}/zika/aa_muts_gff.json" \
   >  aa_muts.json
+  Exclude regex ['seqid'] matches `'` which means this diff will always pass which is probably not what you want.
+  You probably forgot to escape something in your regex. See for example: https://stackoverflow.com/a/79173188/7483211
   {}
!
--- tests/functional/translate/cram/translate-with-gff-and-gene.t
+++ tests/functional/translate/cram/translate-with-gff-and-gene.t.err
@@ -26,4 +26,6 @@
   >  --exclude-regex-paths "['seqid']" -- \
   >  "${DATA}/zika/aa_muts_gff.json" \
   >  aa_muts.json
+  Exclude regex ['seqid'] matches `'` which means this diff will always pass which is probably not what you want.
+  You probably forgot to escape something in your regex. See for example: https://stackoverflow.com/a/79173188/7483211
   {}
!
--- tests/functional/translate/cram/translate-with-gff-and-locus-tag.t
+++ tests/functional/translate/cram/translate-with-gff-and-locus-tag.t.err
@@ -26,4 +26,6 @@
   >  --exclude-regex-paths "['seqid']" -- \
   >  "${DATA}/zika/aa_muts_gff.json" \
   >  aa_muts.json
+  Exclude regex ['seqid'] matches `'` which means this diff will always pass which is probably not what you want.
+  You probably forgot to escape something in your regex. See for example: https://stackoverflow.com/a/79173188/7483211
   {}
..................................!
--- tests/functional/ancestral/cram/infer-amino-acid-sequences-with-root-sequence.t
+++ tests/functional/ancestral/cram/infer-amino-acid-sequences-with-root-sequence.t.err
@@ -23,4 +23,6 @@
   >   --exclude-regex-paths "['seqid']" -- \
   >   "$TESTDIR/../data/ancestral_mutations_with_root_sequence.json" \
   >   "$CRAMTMP/$TESTFILE/ancestral_mutations.json"
+  Exclude regex ['seqid'] matches `'` which means this diff will always pass which is probably not what you want.
+  You probably forgot to escape something in your regex. See for example: https://stackoverflow.com/a/79173188/7483211
   {}
......................................................................................................................
# Ran 178 tests, 0 skipped, 4 failed.

Tests that should have failed but didn't:

I think @jameshadfield and @huddlej are best placed to check those test failures - I've made them pass assuming the current behaviour is correct (and was just not fixed at the time the change happened due to test failure being masked).

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions