Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PICARD-1379: Port _astrcmp to new Python Unicode API #1010

Merged

Conversation

phw
Copy link
Member

@phw phw commented Oct 18, 2018

Summary

  • This is a…
    • Bug fix
    • Feature addition
    • Refactoring
    • Minor / simple change (like a typo)
    • Other
  • Describe this change in 1-2 sentences:

Problem

Fix deprecation warning during compilation by porting _astrcmp to the new Python Unicode API introduced with Python 3.3, see https://docs.python.org/3/c-api/unicode.html

Solution

Action

@phw
Copy link
Member Author

phw commented Oct 18, 2018

See also #1008

Ensures thread safety of the LevenshteinDistance call
@phw
Copy link
Member Author

phw commented Oct 18, 2018

The final commit changed the code again a bit so that it looks more closely to the old one. But there is a small difference: We create an actual copy of the unicode data. This is a bit slower of course, but it ensures we can safely call Py_UNBLOCK_THREADS without risking the string data being altered while we iterate over it. This actually was a potential issue with the old code, which operated on the internal unicode representation directly after Py_UNBLOCK_THREADS got called.

My first approach in ac8d804 worked also without copy, but I think we have to take the performance hit here.

@phw phw merged commit 6c69347 into metabrainz:master Oct 22, 2018
@phw phw deleted the PICARD-1379-fix-astrcmp-deprecation-warnings branch October 22, 2018 07:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants