Taking care of each region when creating a gninatype #109

Open
drorhunvural opened this issue Mar 17, 2023 · 1 comment

drorhunvural commented Mar 17, 2023

Hi,

I'm converting a PDB file to a gninatypes file. My process is similar to the gninatype function in the link:

import os
import pathlib
import struct

import molgrid
import numpy as np

def gninatype(file):
    # Creates a .gninatypes file for model input from a .pdb file.
    # ExampleProvider is populated from a temporary .types file that lists the PDB.
    train_types = file.replace('.pdb', '.types')
    with open(train_types, 'w') as f:
        f.write(file)
    atom_map = molgrid.FileMappedGninaTyper(f'{pathlib.Path(os.path.realpath(__file__)).resolve().parent}/gninamap')
    dataloader = molgrid.ExampleProvider(atom_map, shuffle=False, default_batch_size=1)
    dataloader.populate(train_types)
    example = dataloader.next()
    coords = example.coord_sets[0].coords.tonumpy()
    types = np.int_(example.coord_sets[0].type_index.tonumpy())
    # Each atom is written as three floats (x, y, z) followed by its integer type index.
    out_name = file.replace('.pdb', '.gninatypes')
    with open(out_name, 'wb') as fout:
        for i in range(coords.shape[0]):
            fout.write(struct.pack('fffi', coords[i][0], coords[i][1], coords[i][2], types[i]))
    os.remove(train_types)
    return out_name
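
To check what actually ends up in the file, the records can be read back using the same 'fffi' layout the function above writes. This is a minimal sketch, assuming native byte order and the resulting 16-byte records; `read_gninatypes` is a hypothetical helper name, not part of molgrid:

```python
import struct

# Same layout as the writer: x, y, z as floats, then the integer type index.
RECORD = struct.Struct('fffi')

def read_gninatypes(path):
    """Return a list of (x, y, z, type_index) tuples from a .gninatypes file."""
    with open(path, 'rb') as f:
        data = f.read()
    # iter_unpack raises struct.error if the file is not a whole number of records.
    return list(RECORD.iter_unpack(data))
```

The number of tuples returned is the number of atoms stored, which makes it easy to see whether the file holds the whole structure or only part of it.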

Are the features in gninamap (28 different features) applied to each row of x, y, z coordinates (that is, to each pocket)?

To ask my question more clearly: for example, I have a 1a4h.pdb file, and I am generating 1a4h.gninatypes with the gninatype function above.

I have a data file like the one below:

18.5426 -3.5417 -4.3501 1a4h.gninatypes
16.4473 -2.0545 -9.2645 1a4h.gninatypes
11.5426 -5.5317 -7.3222 1a4h.gninatypes
17.5426 -6.5419 -1.6552 1a4h.gninatypes
...

The characteristics of each region are important to me. Are the individual features of each region (each row in the dataset) retained with a single gninatypes file? Or do I need to set up a structure like the one below?

18.5426 -3.5417 -4.3501 1a4h_pocket1.gninatypes
16.4473 -2.0545 -9.2645 1a4h_pocket2.gninatypes
11.5426 -5.5317 -7.3222 1a4h_pocket3.gninatypes
17.5426 -6.5419 -1.6552 1a4h_pocket4.gninatypes

If you advise the second dataset structure, how do I set it up?

Contributor

dkoes commented Mar 17, 2023

It's up to you what data you put in the gninatypes file. Typically we store the entire structure. If ExampleProvider is being populated with a list of PDBs, it will provide all the coordinates that are in the PDB (after all, at no point have you defined the binding site for it to prune around).
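
If per-pocket files are wanted, one option is to prune the atoms around each pocket center yourself before writing. This is a minimal sketch, assuming the whole structure has already been written as 'fffi' records; the helper name `write_pocket_gninatypes` and the 10 Å default cutoff are illustrative assumptions, not molgrid functions or defaults:

```python
import math
import struct

# Same record layout as the whole-structure file: x, y, z floats plus int type index.
RECORD = struct.Struct('fffi')

def write_pocket_gninatypes(src, dst, center, radius=10.0):
    """Copy the records of src that lie within radius of center into dst.

    Returns the number of atoms kept. The radius is an assumed cutoff.
    """
    with open(src, 'rb') as f:
        records = list(RECORD.iter_unpack(f.read()))
    # Keep only atoms whose Euclidean distance to the pocket center is within the cutoff.
    kept = [r for r in records if math.dist(r[:3], center) <= radius]
    with open(dst, 'wb') as f:
        for x, y, z, type_index in kept:
            f.write(RECORD.pack(x, y, z, type_index))
    return len(kept)
```

For the first row of the example data file, a call might look like `write_pocket_gninatypes('1a4h.gninatypes', '1a4h_pocket1.gninatypes', (18.5426, -3.5417, -4.3501))`. Whether pruning helps depends on the model; storing the entire structure remains the typical approach.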
