Core dump when building on Ubuntu24.04 #32

Open
AlexanderWells-diamond opened this issue Jan 10, 2025 · 12 comments

@AlexanderWells-diamond
Contributor

AlexanderWells-diamond commented Jan 10, 2025

The PythonSoftIOC CI weekly run, which runs using "ubuntu-latest" and the master branches of all its various dependencies, is seeing a core dump in its latest runs.

The issue appears to be that "ubuntu-latest" now resolves to "24.04", upgraded from "22.04". When epicscorelibs is built on this system, we seem to get a core dump the first time we call into EPICS C code.

I'm unfortunately not sure what the actual cause of the failure is. My guess is that the version of the C runtime has updated, but I would have expected building epicscorelibs on the system itself to not have an issue with that.

I have a branch and PR where I have been investigating the issue here.

@mdavidsaver
Member

This is probably another manifestation of epics-base/epics-base#514

Un-defining the C preprocessor macro _FORTIFY_SOURCE would be one conclusive test. Including ci-core-dumper should also give clear signs in the stack trace.

@AlexanderWells-diamond
Contributor Author

I've run ci-core-dumper here and have found this failing thread, which confirms the issue is in dbAllocRecord:

Thread 1 (Thread 0x7ff339573080 (LWP 4898)):
  #0  0x00007ff33929eb1c in pthread_kill () from /lib/x86_64-linux-gnu/libc.so.6
  #1  0x00007ff33924526e in raise () from /lib/x86_64-linux-gnu/libc.so.6
  #2  <signal handler called>
  #3  0x00007ff33929eb1c in pthread_kill () from /lib/x86_64-linux-gnu/libc.so.6
  #4  0x00007ff33924526e in raise () from /lib/x86_64-linux-gnu/libc.so.6
  #5  0x00007ff3392288ff in abort () from /lib/x86_64-linux-gnu/libc.so.6
  #6  0x00007ff3392297b6 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
  #7  0x00007ff339336c19 in __fortify_fail () from /lib/x86_64-linux-gnu/libc.so.6
  #8  0x00007ff3393365d4 in __chk_fail () from /lib/x86_64-linux-gnu/libc.so.6
  #9  0x00007ff336aca9a9 in dbAllocRecord () from /home/runner/.local/lib/python3.12/site-packages/epicscorelibs/lib/libdbCore.so.7.0.7.99.1
  #10 0x00007ff336ab8613 in dbCreateRecord () from /home/runner/.local/lib/python3.12/site-packages/epicscorelibs/lib/libdbCore.so.7.0.7.99.1
  #11 0x00007ff336abd5b3 in dbRecordHead.part.0 () from /home/runner/.local/lib/python3.12/site-packages/epicscorelibs/lib/libdbCore.so.7.0.7.99.1
  #12 0x00007ff336ac0042 in dbReadCOM () from /home/runner/.local/lib/python3.12/site-packages/epicscorelibs/lib/libdbCore.so.7.0.7.99.1
  #13 0x00007ff336a93cb2 in dbLoadDatabase () from /home/runner/.local/lib/python3.12/site-packages/epicscorelibs/lib/libdbCore.so.7.0.7.99.1
...

I'm afraid I don't know how to undefine a macro in EPICS from the epicscorelibs build process.

Does that linked merged PR imply that this issue will be fixed in the future, with the next EPICS release? Is there anything I can do in the meantime to resolve this issue (aside from not using Ubuntu 24.04)?

@mdavidsaver
Member

I'm afraid I don't know how to undefine a macro in EPICS from the epicscorelibs build process.

Hmm. I don't think that I have had to deal with this either, until now. Annoyingly, this is handled by undef_macros=, which is separate from define_macros=. None of the existing mechanics in epicscorelibs.config pass undef_macros= through to dependent module builds. I'm not even sure if setuptools-dso will handle this correctly. Sigh...

@mdavidsaver
Member

... I'm not even sure if setuptools-dso will handle this correctly. Sigh...

Surprisingly, it looks like this may not be difficult. Apparently, for define_macros=[X], in addition to the documented forms ('NAME', 'value') -> -DNAME=value and ('NAME', None) -> -DNAME, there is an undocumented third form ('NAME',) -> -UNAME.
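
To make the three tuple forms concrete, here is a minimal sketch of the mapping described above. The function macros_to_flags is a hypothetical helper written purely for illustration; it is not the setuptools or setuptools-dso implementation, only a restatement of the described behaviour in plain Python.

```python
from typing import List, Sequence, Tuple


def macros_to_flags(macros: Sequence[Tuple]) -> List[str]:
    """Translate define_macros-style tuples into GCC-style flags.

    Hypothetical helper for illustration only; it mirrors the three
    tuple forms discussed in this thread, not any library's real code.
    """
    flags = []
    for macro in macros:
        if len(macro) == 1:
            # ('NAME',) -> -UNAME (the undocumented "undefine" form)
            flags.append(f"-U{macro[0]}")
        elif len(macro) == 2 and macro[1] is None:
            # ('NAME', None) -> -DNAME
            flags.append(f"-D{macro[0]}")
        elif len(macro) == 2:
            # ('NAME', 'value') -> -DNAME=value
            flags.append(f"-D{macro[0]}={macro[1]}")
        else:
            raise TypeError(f"expected a 1- or 2-tuple, got {macro!r}")
    return flags


# Undefine first, then redefine with a lower level.
print(macros_to_flags([("_FORTIFY_SOURCE",), ("_FORTIFY_SOURCE", "2")]))
```

The sketch preserves list order, so an undefine entry has to come before the matching redefine for the pair to come out as -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2.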

@mdavidsaver
Member

@AlexanderWells-diamond Could you test #33 ?

@AlexanderWells-diamond
Contributor Author

No obvious luck when just dumping it into the existing CI run, see here.

No success either when testing in my own Ubuntu 24.04 image: exactly the same error appears at exactly the same point.

@mdavidsaver
Member

exactly the same error appears at exactly the same point.

Please re-test. When attempting to test for _FORTIFY_SOURCE I forgot that this builtin C macro is only defined when optimization is enabled.

With pip install -v ... you should see:

Detect _FORTIFY_SOURCE 3
Bypass _FORTIFY_SOURCE

then later many repetitions of -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2.
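
The detection step can be approximated outside the build. This is a rough sketch, assuming a GCC-compatible compiler is available as cc and accepts -dM -E to dump its predefined macros. The helper names parse_fortify_level and detect_fortify_level are hypothetical, and this is not the code from #33. It passes -O2 because, as noted above, the macro is only predefined when optimization is enabled.

```python
import re
import subprocess
from typing import Optional


def parse_fortify_level(macro_dump: str) -> Optional[int]:
    """Return the _FORTIFY_SOURCE level found in `-dM -E` output, or None."""
    match = re.search(
        r"^#define[ \t]+_FORTIFY_SOURCE[ \t]+(\d+)[ \t]*$",
        macro_dump,
        re.MULTILINE,
    )
    return int(match.group(1)) if match else None


def detect_fortify_level(cc: str = "cc", opt: str = "-O2") -> Optional[int]:
    """Ask the compiler which _FORTIFY_SOURCE level it predefines.

    Assumes `cc` is GCC-compatible; raises if the compiler is missing
    or exits with an error.
    """
    result = subprocess.run(
        [cc, opt, "-dM", "-E", "-x", "c", "-"],  # preprocess empty stdin
        input="",
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_fortify_level(result.stdout)
```

On a toolchain that does not predefine the macro, or when called with opt="-O0", detect_fortify_level() would be expected to return None, in which case no bypass would be needed.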

@AlexanderWells-diamond
Contributor Author

AlexanderWells-diamond commented Jan 16, 2025

We no longer see an abort, and the tests pass fine. Thank you for fixing this!

@mdavidsaver
Member

#33 is merged. FYI, this is a workaround. The real fix will come with the next merge from epics-base into epicscorelibs.

@AlexanderWells-diamond
Contributor Author

FYI, I've had another report of this issue in PythonSoftIOC. Is it worth doing an epicscorelibs release, or is there likely to be a new EPICS release in the near future to fix this properly? Thanks.

@mdavidsaver
Member

... a new EPICS release in the near future to fix this properly? ...

@anjohnson Any thoughts on a Base release with the _FORTIFY_SOURCE=3 fixes?

@anjohnson
Member

@mdavidsaver Making a Base 7.0.9 release in time for the ISIS meeting (or sooner?) seems like a good target, and ought to be possible.

The Release Notes currently only mention that we override that to _FORTIFY_SOURCE=2 though, which we added just before the 7.0.8.1 release. They also seem rather light for adding 135 commits since then. Please go through your recently merged PRs and add notes about anything significant that has been merged but you didn't already document there; that will save me from having to do it myself when I make the release.
