Patch/cxsmiles read #200

Merged
merged 13 commits into from Apr 21, 2016

johnmay
Member

@johnmay johnmay commented Mar 29, 2016

  1. A few patches on rendering that we left behind from another branch.
  2. Read CXSMILES layers and set information in CDK data-structures (Main part).

@egonw
Member

egonw commented Mar 30, 2016

BTW, what is a CXSMILES?

@johnmay
Member Author

johnmay commented Mar 30, 2016

CXSMILES (ChemAxon Extended SMILES) adds layers of information in the SMILES title that carry additional semantics. Think of it like an InChI AuxInfo, except that, unlike the AuxInfo, you can still canonicalise it.

CCC |(x,y,z;x,y,z;x,y,z)| coords
*C* |$_AP1;;Y'$| atom labels

It allows SMILES to be extended with additional features whilst maintaining backwards compatibility. Indigo (GGA/EPAM) already has support for some features such as fragment grouping and atom labels.

From discussions with a former ChemAxon dev, it wasn't intended for external consumption, just for JChemBase. Despite some horrid syntax (reusing ',' for both field and record separation) it is actually quite useful. Projects such as HELM use the atom labels (although they could have done that another way).
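To make the layer syntax above concrete, here is a minimal sketch in plain Python. It is not the CDK implementation from this pull request; the function names `split_cxsmiles` and `parse_layers` are hypothetical, and it assumes only the two layers shown above (coordinates in parentheses, atom labels between `$` signs). Because ',' separates both layers and coordinate fields, the sketch scans for the layer delimiters rather than splitting on ','.

```python
def split_cxsmiles(line):
    """Split 'SMILES |layers| title' into (smiles, layers, title).

    Hypothetical helper: assumes the layer block, if present, is the
    first thing after the SMILES and is wrapped in a pair of '|'.
    """
    parts = line.strip().split(None, 1)
    if not parts:
        return "", "", ""
    smiles = parts[0]
    rest = parts[1].strip() if len(parts) > 1 else ""
    layers = ""
    if rest.startswith("|"):
        end = rest.find("|", 1)
        if end != -1:
            layers = rest[1:end]
            rest = rest[end + 1:].strip()
    return smiles, layers, rest


def parse_layers(layers):
    """Return (atom_labels, coords) from the text between the '|' marks.

    Only '$...$' (atom labels) and '(...)' (coordinates) are handled;
    anything else is skipped. Raises ValueError on an unterminated layer.
    """
    labels, coords = None, None
    i = 0
    while i < len(layers):
        ch = layers[i]
        if ch == "$":
            end = layers.index("$", i + 1)
            # one label per atom, ';' separated, empty means "no label"
            labels = layers[i + 1:end].split(";")
            i = end + 1
        elif ch == "(":
            end = layers.index(")", i + 1)
            coords = []
            for record in layers[i + 1:end].split(";"):
                # ',' separates x, y, z; an empty field is read as 0.0
                fields = record.split(",")
                coords.append(tuple(float(f) if f.strip() else 0.0 for f in fields))
            i = end + 1
        else:
            i += 1  # the ',' between layers, or a layer this sketch ignores
    return labels, coords


smiles, layers, title = split_cxsmiles("*C* |$_AP1;;Y'$| atom labels")
print(smiles, parse_layers(layers), title)
```

The same functions accept a combined block such as `|$a;b$,(1,2,;3,4,)|`, which is where splitting on ',' alone would go wrong.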

@egonw
Member

egonw commented Mar 30, 2016

Did you create an IChemFormat subclassing the SMILESFormat?

@johnmay
Member Author

johnmay commented Mar 30, 2016

They're not really a different format. Every CXSMILES is a valid SMILES. Would you consider reaction SMILES and SMILES different? It's perfectly valid to mix these in a single file:

CN1C=NC2=C1C(=O)N(C(=O)N2C)C caffeine
CCO.CC(=O)O>[H+]>CC(=O)OCC.O 
*CC |$R'$|
>>*CC |$R'$|
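As an illustration of why the four lines above can share one file, here is a small sketch in plain Python (not CDK code; `classify` is a hypothetical name). It assumes the usual convention that everything after the first whitespace is the title, so a reader that knows nothing about CXSMILES simply sees the `|...|` block as title text. Counting two '>' characters is used here as a simple heuristic for reaction SMILES.

```python
def classify(line):
    """Describe one line of a mixed SMILES file.

    Hypothetical helper: the SMILES is the first whitespace-delimited
    token, the remainder is the title. A title starting with '|' is
    taken to carry CXSMILES layers.
    """
    parts = line.strip().split(None, 1)
    smiles = parts[0] if parts else ""
    title = parts[1] if len(parts) > 1 else ""
    return {
        "smiles": smiles,
        "is_reaction": smiles.count(">") == 2,  # reactants>agents>products
        "has_cx_layers": title.startswith("|"),
    }


lines = [
    "CN1C=NC2=C1C(=O)N(C(=O)N2C)C caffeine",
    "CCO.CC(=O)O>[H+]>CC(=O)OCC.O ",
    "*CC |$R'$|",
    ">>*CC |$R'$|",
]
for line in lines:
    print(classify(line))
```

In each case the `smiles` field is something an ordinary SMILES parser can be handed unchanged; the layers only matter to a reader that chooses to interpret the title.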

@egonw
Member

egonw commented Mar 30, 2016

If every CXSMILES is indeed a valid SMILES, then it's fine, I guess... yet, not every full SMILES parser may parse full CXSMILES, I guess? But if any SMILES parser will safely ignore the CX bits, then it should indeed be no problem.

@johnmay
Member Author

johnmay commented Mar 30, 2016

Yep, the coordinate output will be very useful. I've previously documented how to do it in a custom way in the JavaDoc of SmilesGenerator, but having a more portable way makes this much easier.

…action. We use a new property which is picked up during depiction generation. Some minor tweaks were made to the sizing calculations, in particular ensuring -1 is not used, which was accidentally making some spacing smaller than needed.
@johnmay
Member Author

johnmay commented Apr 18, 2016

Some updates:

Minimum arrow width padding, before:
[image]
Now:
[image]

Conditions drawn below the arrow:

[image]

@egonw egonw merged commit 18639d9 into master Apr 21, 2016
@johnmay
Member Author

johnmay commented Apr 21, 2016

Cool, thanks. Do you have any commits pending? Otherwise I think it's time for a release.

@egonw
Member

egonw commented Apr 21, 2016

No, but over the next two weeks I'm taking a number of days off and plan to work on the CDK paper and possibly my CDK book... that will likely result in some patches, but that's fine for a later release.

@johnmay johnmay deleted the patch/cxsmiles-read branch April 24, 2016 10:52

2 participants