Add a converter from PDB to Zarr to the DatasetFactory #171

zhu0619 · 2024-08-09T17:40:26Z

Changelogs

Added PDBConverter, which converts PDB files to a Zarr file.
Added ARRAY_TO_PDB, which loads a PDB structure from a Zarr file.
Allowed add_from_file to handle multiple files.
Added a simple tutorial for creating a dataset from a PDB file.

Checklist:

Was this PR discussed in an issue? It is recommended to first discuss a new feature in a GitHub issue before opening a PR.
Add tests to cover the fixed bug(s) or the newly introduced feature(s) (if appropriate).
Update the API documentation if a new function is added, or an existing one is deleted.
Write concise and explanatory changelogs above.
If possible, assign one of the following labels to the PR: feature, fix, chore, documentation or test (or ask a maintainer to do it for you).

Issue #172

During the conversion, only the most essential structural information is retained, including 3D coordinates, chain ID, residue ID, insertion code, residue name, heteroatom indicator, atom name, element, atom ID, B-factor, occupancy, and charge.
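
For reference, the fields listed above correspond to the fixed-width columns of a PDB ATOM/HETATM record. The sketch below is illustrative only: it uses plain Python rather than fastpdb, and AtomRecord, parse_atom_line and SAMPLE are hypothetical names that are not part of this PR or the Polaris API. The alternate-location column (17) is skipped, since it is not in the list of retained fields.

```python
from typing import NamedTuple


class AtomRecord(NamedTuple):
    """Hypothetical container for the fields retained during conversion."""

    hetero: bool       # True for HETATM records
    atom_id: int       # columns 7-11
    atom_name: str     # columns 13-16
    res_name: str      # columns 18-20
    chain_id: str      # column 22
    res_id: int        # columns 23-26
    ins_code: str      # column 27
    coord: tuple       # x, y, z in columns 31-54
    occupancy: float   # columns 55-60
    b_factor: float    # columns 61-66
    element: str       # columns 77-78
    charge: str        # columns 79-80, kept as text (e.g. "2+")


def parse_atom_line(line: str) -> AtomRecord:
    """Parse one ATOM/HETATM line using the fixed PDB column layout."""
    line = line.rstrip("\n").ljust(80)  # pad short lines so slicing is safe
    record = line[0:6].strip()
    if record not in ("ATOM", "HETATM"):
        raise ValueError(f"not an atom record: {record!r}")
    return AtomRecord(
        hetero=(record == "HETATM"),
        atom_id=int(line[6:11]),
        atom_name=line[12:16].strip(),
        res_name=line[17:20].strip(),
        chain_id=line[21].strip(),
        res_id=int(line[22:26]),
        ins_code=line[26].strip(),
        coord=(float(line[30:38]), float(line[38:46]), float(line[46:54])),
        occupancy=float(line[54:60]),
        b_factor=float(line[60:66]),
        element=line[76:78].strip(),
        charge=line[78:80].strip(),
    )


# A made-up sample record, assembled field by field to make the columns explicit.
SAMPLE = "".join([
    "ATOM  ",                            # 1-6   record name
    "    1",                             # 7-11  atom serial number
    " ",                                 # 12    unused
    " N  ",                              # 13-16 atom name
    " ",                                 # 17    alternate location
    "MET",                               # 18-20 residue name
    " ",                                 # 21    unused
    "A",                                 # 22    chain ID
    "   1",                              # 23-26 residue sequence number
    " ",                                 # 27    insertion code
    "   ",                               # 28-30 unused
    "  27.340", "  24.430", "   2.614",  # 31-54 x, y, z
    "  1.00",                            # 55-60 occupancy
    "  9.67",                            # 61-66 B-factor
    " " * 10,                            # 67-76 unused
    " N",                                # 77-78 element
    "  ",                                # 79-80 charge
])

atom = parse_atom_line(SAMPLE)
print(atom.atom_name, atom.res_name, atom.chain_id, atom.coord)
```

Each of these per-atom fields can be stored as its own array, which is what makes an array store such as Zarr a natural target for the conversion.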

zhu0619 · 2024-08-09T18:59:53Z

Currently, fastpdb can only be installed via pip. I created an issue in their repository to request support for conda installation.

cwognum

Thanks @zhu0619 !

I know it took some searching, but I think the solution you came up with using fastpdb is very polished! 💅

I did have some comments. In addition to these comments, would you also mind adding test cases?

env.yml

polaris/dataset/_adapters.py

polaris/dataset/_adapter_utils.py

polaris/dataset/_factory.py

polaris/dataset/_adapter_utils.py

polaris/dataset/converters/_pdb.py

cwognum · 2024-08-16T19:35:34Z

FYI - We'll hold off on merging this to give #121 priority!

zhu0619 and others added 8 commits July 26, 2024 17:07

wip

77c04af

prototype

cdced54

Merge branch 'main' into feat/pdb

7d1e5ea

add fastpdb prototype

c60c8c0

wip

d7c9fd9

wip

2e47d70

wip

b61dfdd

add pdb converter

977b095

zhu0619 requested a review from cwognum as a code owner August 9, 2024 17:40

zhu0619 marked this pull request as draft August 9, 2024 17:40

zhu0619 changed the title ~~PDB converter~~ feature/PDB converter Aug 9, 2024

zhu0619 added the feature Annotates any PR that adds new features; Used in the release process label Aug 9, 2024

zhu0619 added 5 commits August 9, 2024 13:43

missing imports

3261d2d

minor changes

d9bce7f

add tutorial

876138d

remove dev files

7e0c127

add dep

0f2c6e9

zhu0619 linked an issue Aug 9, 2024 that may be closed by this pull request

Adding a pdbConverter #172

Closed

zhu0619 marked this pull request as ready for review August 9, 2024 18:56

env

ce52de0

zhu0619 requested a review from Andrewq11 August 9, 2024 19:12

zhu0619 added 2 commits August 9, 2024 15:15

update api

b3eafc6

update docs

6962a31

cwognum requested changes Aug 12, 2024

zhu0619 added 5 commits August 13, 2024 22:12

add opt dep

6eb4bcc

update adaptor name

0377289

wip

474244c

update import

a847d3e

refactor to add_from_files

50d0b80

zhu0619 and others added 12 commits August 13, 2024 22:54

refactor

8edf177

add tests

779b91b

update deps

9ecae2b

update load_to_memeory

0e7fb6e

refactor pdb pointer

ec337c2

add create_dataset_from_files

19ebae7

add info

d5c1242

rename tutorials

46cacb6

fix mkdocs

f7d9dcb

ruff

24580fb

Merge branch 'main' into feat/pdb

d0de43a

format notebooks

6e8c66d

zhu0619 requested a review from cwognum August 15, 2024 13:57

cwognum added 2 commits August 15, 2024 12:19

Revert some formatting changes

8b588d9

Revert some more formatting changes

8eda8f5

cwognum changed the title ~~feature/PDB converter~~ Add a converter from PDB to Zarr to the DatasetFactory Aug 15, 2024

cwognum added 2 commits August 16, 2024 15:25

Addressed minor feedback

b026bba

Comment

569bb0b

cwognum approved these changes Aug 16, 2024

Merge branch 'main' into feat/pdb

5222948

zhu0619 merged commit e23c4a1 into main Aug 19, 2024
4 checks passed

cwognum mentioned this pull request Aug 20, 2024

Visualization Layer: Data Parsing Service #176

Merged

5 tasks

zhu0619 deleted the feat/pdb branch August 27, 2024 16:06

cwognum mentioned this pull request Aug 27, 2024

XL Datasets: Minimal Zarr-only dataset implementation #186

Merged

5 tasks
