Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

extracting the substructure patterns of CircularFingerprint. #224

Closed
wants to merge 114 commits into from
Closed

extracting the substructure patterns of CircularFingerprint. #224

wants to merge 114 commits into from

Conversation

vedina
Copy link
Contributor

@vedina vedina commented Aug 13, 2016

No description provided.

ntk73 and others added 30 commits April 20, 2016 09:30
Signed-off-by: Egon Willighagen <egon.willighagen@gmail.com>
Signed-off-by: Egon Willighagen <egon.willighagen@gmail.com>
…Here we simply store a visit map and only bend/stretch a bond if it's the first pair it's seen in.

Signed-off-by: Egon Willighagen <egon.willighagen@gmail.com>
Signed-off-by: Egon Willighagen <egon.willighagen@gmail.com>
Signed-off-by: Egon Willighagen <egon.willighagen@gmail.com>
…epresentation, for example (H2O)8.

Signed-off-by: Egon Willighagen <egon.willighagen@gmail.com>
…error handling.

Signed-off-by: Egon Willighagen <egon.willighagen@gmail.com>
kaibioinfo and others added 24 commits August 13, 2016 17:07
…t everything into one class with local access modifier
…ash wedges) as we also use it to assign wiggly bonds.
…erwrite aromaticity information of the configured atom (see bug #1322)
@@ -1258,6 +1262,14 @@ private int findBond(int a1, int a2) {
if (atomAdj[a1][n] == a2) return bondAdj[a1][n];
return -1;
}

// convecience: find value within array
private int findArrayIndex(int value, int array[]){
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can use Ints.indexOf()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree

@johnmay
Copy link
Member

johnmay commented Aug 13, 2016

I need to rebase this but looks like interesting functionality. Late here but the only things that stick out a little at the moment is the SMARTS generation in the fingerprint, okay for now but I would prefer to centralise this but obviously would need to write that/use AMBITs one. Also wondering how you get rings in a circular fingerprint? Shouldn't all the subgraphs be trees, will probably become clear once I look more in depth.

@vedina
Copy link
Contributor Author

vedina commented Aug 14, 2016

  • Fine to centralize SMARTS generation if possible, as here it uses internal CircularFingerprint structures;
  • There are no AMBIT dependencies in this code (intentionally) - of course SMARTS should be valid at the end;
  • Rings appear if the depth is large enough. @ntk73 may explain implementation details.
    As this feature is modeled after JCompoundMapper here is JCompoundMapper output for benzene

ECFP6 ( :1 means count , it's not part of the SMARTS)

[*]=1=C=1C=C2[*]2:1 
[*]=1=C=12([*]2):1  
[*]1C1=2(=[*]=2):1  
[*]1C1=CC=CC=2=[*]=2:1  
[*]=1=C=1C=CC=C2[*]2:1  
c1ccccc1:1  
[*]1c1=CC=Cc=2=[*]=2:1  
[*]1c1=CC=2=[*]=2:1 
[*]1C1=CC=CC=2=[*]=2:1

ECFP4 (no ring structures)

[*]1C1=2(=[*]=2):1  
[*]=1=C=1C=CC=C2[*]2:1  
[*]1C1=CC=CC=2=[*]=2:1  
[*]1C1=CC=2=[*]=2:1 
[*]=1=C=1C=C2[*]2:1 
[*]=1=C=12([*]2):1

our code generates these fragments for benzene (default CLASS_ECFP6)

c(:a):a
c(c:a)c:a
c(cc:a):a
c(:a)cc:a
c(cc:a)cc:a
c(ccc:a)c:a
c(cccc:a):a
c(:a)cccc:a
c(c:a)ccc:a
c1ccccc1

and for CLASS_ECFP4

c(:a):a
c(c:a)c:a
c(cc:a):a
c(:a)cc:a
c(cc:a)cc:a
c(ccc:a)c:a
c(cccc:a):a
c(:a)cccc:a
c(c:a)ccc:a

@johnmay
Copy link
Member

johnmay commented Aug 18, 2016

Rebased - #231

@johnmay johnmay closed this Aug 18, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants