Patch/even faster atom types #176

Merged: 3 commits merged on Nov 15, 2015

egonw commented Nov 15, 2015

OK, this brings the computation time down further:

Benchmark                          Mode  Cnt       Score       Error  Units
CDKBenchmark.testPerceiveAtomType  avgt   10  124423.327 ± 22679.255  ns/op
CDKBenchmark.testPerceiveOneByOne  avgt   10  182389.348 ± 40002.280  ns/op
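
To put the two scores side by side, the arithmetic can be sketched as follows. This is only an illustration: the class and method names are made up for this note, and it assumes the `±` column is the usual JMH error margin, so that score ± error can be read as an interval.

```java
// Hypothetical helper for reading the two benchmark lines; not part of CDK or JMH.
class BenchmarkReading {

    /** Converts an average time in ns/op into operations per second. */
    static double opsPerSecond(double nsPerOp) {
        return 1e9 / nsPerOp;
    }

    /** True if [a - aErr, a + aErr] and [b - bErr, b + bErr] share at least one point. */
    static boolean intervalsOverlap(double a, double aErr, double b, double bErr) {
        return (a - aErr) <= (b + bErr) && (b - bErr) <= (a + aErr);
    }

    public static void main(String[] args) {
        // Scores and error margins copied from the benchmark output (ns/op).
        double perceiveAtomType = 124423.327, perceiveAtomTypeErr = 22679.255;
        double perceiveOneByOne = 182389.348, perceiveOneByOneErr = 40002.280;

        System.out.printf("testPerceiveAtomType: %.0f ops/s%n", opsPerSecond(perceiveAtomType));
        System.out.printf("testPerceiveOneByOne: %.0f ops/s%n", opsPerSecond(perceiveOneByOne));
        System.out.printf("time ratio (one-by-one / atom type): %.2f%n",
                perceiveOneByOne / perceiveAtomType);
        System.out.printf("error intervals overlap: %b%n",
                intervalsOverlap(perceiveAtomType, perceiveAtomTypeErr,
                        perceiveOneByOne, perceiveOneByOneErr));
    }
}
```

With these numbers the one-by-one variant takes roughly 1.47 times as long per operation. The two error intervals overlap slightly (upper bound of the first about 147103 ns/op, lower bound of the second about 142387 ns/op), so more iterations might be needed to pin down the gap.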

I also tried passing the full map all the way down to the methods that determine whether a nitrogen is part of an amide or thioamide, but that undid basically all of the speed-up above, so I must have done something wrong.

For now, please have a look at this. I'm looking forward to hearing how much it helps when parsing the 100k random ChEMBL compounds...

@johnmay

johnmay commented Nov 15, 2015

Looking good!

100000 2.98 s (33560.20 s-1)
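
The line above follows the `printf` format used in the code further down: molecule count, accumulated atom-typing time, and molecules per second. As a quick check on how to read it, here is a small sketch; the class and method names are hypothetical.

```java
// Hypothetical helper for reading the progress line; names are illustrative only.
class ThroughputReading {

    /** Molecules per second, given a count and the accumulated time in nanoseconds. */
    static double perSecond(int count, long nanos) {
        return count / (nanos / 1e9);
    }

    /** Average microseconds per molecule for the same inputs. */
    static double microsPerMolecule(int count, long nanos) {
        return (nanos / 1e3) / count;
    }

    public static void main(String[] args) {
        int count = 100000;
        long nanos = 2_980_000_000L; // 2.98 s as printed; the measured value had more digits
        System.out.printf("%.2f s-1%n", perSecond(count, nanos));
        System.out.printf("%.1f us per molecule%n", microsPerMolecule(count, nanos));
    }
}
```

The rounded 2.98 s gives about 33557 molecules per second, slightly below the reported 33560.20, because the printed time is itself rounded. Either way, that is roughly 30 µs of atom typing per molecule.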

I'm also on a MacBook Air, which can be a little underpowered. Here's the code:

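        import java.io.BufferedReader;
        import java.io.FileReader;
        import org.openscience.cdk.atomtype.CDKAtomTypeMatcher;
        import org.openscience.cdk.interfaces.IAtomContainer;
        import org.openscience.cdk.interfaces.IAtomType;
        import org.openscience.cdk.silent.SilentChemObjectBuilder;
        import org.openscience.cdk.smiles.SmilesParser;

        // The statements below belong in a method that may throw IOException and CDKException.
        SmilesParser smipar = new SmilesParser(SilentChemObjectBuilder.getInstance());
        CDKAtomTypeMatcher atm = CDKAtomTypeMatcher.getInstance(SilentChemObjectBuilder.getInstance());

        long t = 0;  // nanoseconds spent in atom typing only (SMILES parsing is excluded)
        int cnt = 0; // molecules processed so far

        // try-with-resources closes the reader even if parsing or typing fails
        try (BufferedReader br = new BufferedReader(new FileReader("/data/chembl_20_subset.smi"))) {
            String line;
            while ((line = br.readLine()) != null) {
                IAtomContainer mol = smipar.parseSmiles(line);
                long t0 = System.nanoTime();
                IAtomType[] atypes = atm.findMatchingAtomTypes(mol);
                long t1 = System.nanoTime();
                t += (t1 - t0);
                if (++cnt % 2500 == 0) { // show progress
                    System.err.printf("\r %d %.2f s (%.2f s-1)", cnt, t / 1e9, cnt / (t / 1e9));
                }
            }
        }

        // final summary: molecule count, total typing time, throughput
        System.err.printf("\r %d %.2f s (%.2f s-1)\n", cnt, t / 1e9, cnt / (t / 1e9));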
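
The loop above times only the `findMatchingAtomTypes` call and leaves SMILES parsing outside the measured window. The same pattern can be tried without CDK or the ChEMBL file by using stand-in work. In the sketch below, `prepare` and `measuredWork` are hypothetical placeholders for parsing and atom typing, and the discarded warm-up pass is an extra step that the code above does not have.

```java
import java.util.Collections;
import java.util.List;
import java.util.function.Function;
import java.util.function.ToIntFunction;

// Stand-alone sketch of the "time only the inner call" pattern; nothing here is CDK API.
class InnerCallTimer {

    /** Outcome of one run: inputs processed, nanoseconds in the measured step, sum of its results. */
    static final class Result {
        final int count;
        final long nanos;
        final long checksum;

        Result(int count, long nanos, long checksum) {
            this.count = count;
            this.nanos = nanos;
            this.checksum = checksum;
        }

        double perSecond() {
            return count / (nanos / 1e9);
        }
    }

    /** Applies prepare (untimed) and then measured (timed) to every input. */
    static <I, P> Result run(List<I> inputs, Function<I, P> prepare, ToIntFunction<P> measured) {
        long nanos = 0;
        long checksum = 0;
        int count = 0;
        for (I input : inputs) {
            P prepared = prepare.apply(input);         // stands in for parsing; not timed
            long t0 = System.nanoTime();
            int value = measured.applyAsInt(prepared); // the only timed call
            long t1 = System.nanoTime();
            nanos += (t1 - t0);
            checksum += value;                         // use the result so the work is not discarded
            count++;
        }
        return new Result(count, nanos, checksum);
    }

    public static void main(String[] args) {
        List<String> inputs = Collections.nCopies(100_000, "CC(=O)NC1=CC=CC=C1");
        Function<String, char[]> prepare = String::toCharArray; // placeholder for parsing
        ToIntFunction<char[]> measuredWork = chars -> {         // placeholder for atom typing
            int carbons = 0;
            for (char c : chars) {
                if (c == 'C') carbons++;
            }
            return carbons;
        };
        run(inputs, prepare, measuredWork);                     // warm-up pass, result discarded
        Result r = run(inputs, prepare, measuredWork);
        System.err.printf("%d %.2f s (%.2f s-1)%n", r.count, r.nanos / 1e9, r.perSecond());
    }
}
```

Because each timed call is very short, the overhead of `System.nanoTime()` itself can be a noticeable share of the total, which is one reason the JMH numbers and this kind of hand-rolled loop are not directly comparable.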

johnmay added a commit that referenced this pull request Nov 15, 2015
johnmay merged commit 424cb00 into cdk:master Nov 15, 2015