Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FFI] Use the correct function in DecRef #15999

Closed
wants to merge 1 commit into from

Conversation

yelite
Copy link
Contributor

@yelite yelite commented Oct 27, 2023

Found this problem when investigating for #15996

cc @Lunderberg

@github-actions github-actions bot requested a review from Lunderberg October 27, 2023 03:12
Copy link
Contributor

@Lunderberg Lunderberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh my, that's a good catch, thank you! This would effectively mean that any python exception thrown across the FFI border would never be garbage-collected, and that any local variables in its stack trace would be in memory until the program exits. This is primarily used to improve debugging in cases where the exception is causing the program to exit anyways, but there's no guarantee that that's the case.

@yelite yelite force-pushed the fix-py-decref-main branch from f37c3e5 to d9d837b Compare October 27, 2023 21:33
@yelite yelite force-pushed the fix-py-decref-main branch from d9d837b to b8d09a1 Compare October 30, 2023 13:12
@yelite
Copy link
Contributor Author

yelite commented Oct 30, 2023

The fix in this PR exposes an actual problem of making C++ wrapper of a Python object. Running pytest tests/python/unittest/test_crt.py will result in error "the function must be called with the GIL held, but the GIL is released (the current Python thread state is NULL)".

In addition, I think we should also consider the stopping order of runtime here. After the Python runtime shuts down, DecRef can no longer be called.

@Lunderberg Since this goes far beyond my initial intention of fixing a typo, can you take this over and make proper fix?

@Lunderberg
Copy link
Contributor

@yelite Certainly, and I have PR #16021 opened which includes both the typo fix and the GIL handling. It looks like it was an issue not with the underlying API usage, that the GIL wasn't explicitly acquired before using the python API. This wasn't an issue in single-threaded usage, because the GIL was held by the parent scope. It also didn't trigger an error previously, because the calls to Py_IncRef couldn't ever bring the reference count to zero.

Regarding stopping order, we should be set. The main concern would be the unspecified order of static destructors across compilation units, where both the CPython implementation and the last_error use static objects. However, the last_error field only stores a python object during the FFI handoff, and never retains it beyond that. For error propagation from Python to C++, it is set here in Python, then retrieved and re-thrown here in C++. For error propagation from C++ to Python, it is set here in C++, as part of the API_END() macro, and is then retrieved here in python. Since the last_error doesn't contain a wrapped python object when the static destructors are being called, we shouldn't have any ordering issues with CPython teardown.

@yelite yelite closed this Oct 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants