PDBFixer cannot handle nucleic acids with residue name N #281

jamesmkrieger · 2023-11-14T17:57:09Z

    fixer.addMissingAtoms()
  File "/home/jkrieger/software/miniconda/envs/prody-github/lib/python3.9/site-packages/pdbfixer/pdbfixer.py", line 902, in addMissingAtoms
    (newTopology, newPositions, newAtoms, existingAtomMap) = self._addAtomsToTopology(True, True)
  File "/home/jkrieger/software/miniconda/envs/prody-github/lib/python3.9/site-packages/pdbfixer/pdbfixer.py", line 400, in _addAtomsToTopology
    self._addMissingResiduesToChain(newChain, insertHere, startPosition, endPosition, loopDirection, residue, newAtoms, newPositions, firstIndex)
  File "/home/jkrieger/software/miniconda/envs/prody-github/lib/python3.9/site-packages/pdbfixer/pdbfixer.py", line 511, in _addMissingResiduesToChain
    template = self.templates[residueName]
KeyError: 'N'

This was triggered by 7s7b.cif, downloaded from the PDB.


peastman · 2023-11-14T21:11:21Z

That's one I haven't seen before. What is N supposed to mean? Is this file trying to use the nucleotide sequence search codes, where N means "accept any nucleotide at this position"?

jamesmkrieger · 2023-11-14T21:42:10Z

Perhaps it means they don’t know which nucleotide is there because they don’t have enough resolution.

peastman · 2023-11-14T21:47:49Z

Maybe, but the sequence in a PDB file is supposed to be a real sequence, not IUPAC codes. Oh well, I guess someone has figured out yet another way to make a messed-up PDB file!

What should we do in this situation? You've told it to add missing residues based on the sequence, but the sequence doesn't tell us what to add at that position.

jamesmkrieger · 2023-11-15T07:48:53Z

Yeah, it is another strange thing to have.

Perhaps raise a warning and skip that residue rather than completely stopping?
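
A minimal sketch of that warn-and-skip idea, written against plain dictionaries rather than PDBFixer internals. It assumes the input has the same shape as `fixer.missingResidues` (a dict mapping `(chain index, residue index)` to a list of residue names) and that the known names are the keys of `fixer.templates`. The helper name `drop_unknown_missing_residues` is hypothetical and not part of PDBFixer.

```python
import warnings


def drop_unknown_missing_residues(missing_residues, known_names):
    """Return a copy of missing_residues without gaps that contain unknown names.

    missing_residues: dict mapping (chain_index, residue_index) -> list of names
    known_names: container of residue names that have a template (assumption:
                 something like the keys of fixer.templates)
    """
    kept = {}
    for key, names in missing_residues.items():
        unknown = sorted({name for name in names if name not in known_names})
        if unknown:
            # Skip the whole gap rather than building only part of it.
            warnings.warn(
                f"Skipping missing residues at chain {key[0]}, position {key[1]}: "
                f"no template for {unknown}"
            )
            continue
        kept[key] = list(names)
    return kept
```

With a real `PDBFixer` instance this might be applied as `fixer.missingResidues = drop_unknown_missing_residues(fixer.missingResidues, fixer.templates)` after `findMissingResidues()` and before `addMissingAtoms()`. That usage is untested against 7s7b.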

sukritsingh · 2024-06-10T01:02:40Z

Out of curiosity I dug through the source paper, and it's not even clear that just a single nucleotide is being skipped (it truly seems like a bit of a sloppy entry).

I think skipping on "N" seems a bit dangerous because it's not even clear how many nucleotides are supposed to take its place. Perhaps the safest bet is to just throw a clearer error message indicating that the entry has unrecognized single-letter codes for nucleotides, and indicating the index where that happens?
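
As a rough illustration of that suggestion, the check below raises a descriptive error instead of a bare `KeyError`. It is only a sketch: it assumes the input is shaped like `fixer.missingResidues` (a dict mapping `(chain index, residue index)` to a list of residue names) and that known names can be tested with `in`, as with the keys of `fixer.templates`. The function name `check_missing_residue_names` is hypothetical and not part of PDBFixer.

```python
def check_missing_residue_names(missing_residues, known_names):
    """Raise ValueError listing every missing residue whose name has no template."""
    problems = []
    for (chain_index, residue_index), names in sorted(missing_residues.items()):
        for offset, name in enumerate(names):
            if name not in known_names:
                # Record where in the gap the unrecognized code appears.
                problems.append((chain_index, residue_index, offset, name))
    if problems:
        details = "; ".join(
            f"chain {c}, insertion point {r}, offset {o}: {n!r}"
            for c, r, o, n in problems
        )
        raise ValueError(
            "Cannot add missing residues: unrecognized residue name(s) in the "
            f"sequence ({details})"
        )
```

Calling it before `addMissingAtoms()` would stop early with the chain and position of each unrecognized code, leaving it to the user to decide how to edit the sequence.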
