parallelize field_transform #38
Comments
Parallelization might not work together with compression.
Right, that was one of the reasons not to try parallel NetCDF some years ago. Switching compression off is probably not a problem for kernel applications, where the wave fields are never moved, but it is a problem for any application where databases are sent to IRIS or elsewhere.
The only way I can think of to avoid this: use the old round-robin IO and write the correct chunking, already compressed, in the SOLVER :/
But didn't we test that writing the correct chunking in the SOLVER is a bazillion times slower?
Yes, but there might be room to optimize it: only include those processors that actually have to write something, use threading, and reduce the number of dumps by buffering as many time steps as possible in memory. I have been waiting for a week for field_transform on a 10 TB database, which I computed in a few hours, and it's only 30% done.
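To make the buffering idea concrete, here is a minimal sketch in Python, not the SOLVER's actual dump code. `DumpBuffer`, `capacity` and `write_block` are hypothetical names; `write_block` stands in for whatever routine writes a contiguous block of time steps to disk.

```python
class DumpBuffer:
    """Collect time steps in memory and write them in blocks.

    Hypothetical illustration: `write_block(first_step, snapshots)` is an
    assumed callback that writes a contiguous run of time steps in one call.
    """

    def __init__(self, capacity, write_block):
        self.capacity = capacity        # time steps held in memory per dump
        self.write_block = write_block  # assumed writer callback
        self.first_step = 0             # index of the first buffered step
        self.steps = []                 # buffered snapshots

    def add(self, snapshot):
        """Buffer one time step; dump once the buffer is full."""
        self.steps.append(snapshot)
        if len(self.steps) >= self.capacity:
            self.flush()

    def flush(self):
        """Write all buffered steps as one block and reset the buffer."""
        if not self.steps:
            return
        self.write_block(self.first_step, self.steps)
        self.first_step += len(self.steps)
        self.steps = []
```

With `capacity` steps per dump, the number of write calls drops by roughly that factor, at the cost of holding `capacity` snapshots in memory on each writing process. A final `flush()` is needed at the end of the run to write the partial last block.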
Well, that is annoying. When trying to increase the dump buffer, keep in mind the low memory on most HPC machines. But I'm curious...
I guess we would need to control which part of the mesh goes where on the cluster: if each node has only one processor that holds crust, it might fit larger time chunks.
So here we go: system maintenance, and field_transform was killed. We should at least have a restart capability. This should be really easy to implement.
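One possible shape for such a restart capability, sketched in Python rather than in the Fortran of field_transform itself: record the index of the next unprocessed chunk in a small state file after each chunk completes, and resume from there on startup. All names here (`run_transform`, `process_chunk`, the JSON checkpoint file) are hypothetical and only illustrate the pattern.

```python
import json
import os


def load_checkpoint(path):
    """Return the index of the next chunk to process (0 if no checkpoint)."""
    try:
        with open(path) as f:
            return json.load(f)["next_chunk"]
    except FileNotFoundError:
        return 0


def save_checkpoint(path, next_chunk):
    """Write the checkpoint to a temp file, then rename it into place."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"next_chunk": next_chunk}, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)  # replaces any previous checkpoint in one step


def run_transform(n_chunks, process_chunk, checkpoint_path):
    """Process chunks in order, skipping those finished in an earlier run."""
    start = load_checkpoint(checkpoint_path)
    for i in range(start, n_chunks):
        process_chunk(i)                         # assumed: transform chunk i
        save_checkpoint(checkpoint_path, i + 1)  # chunk i is now complete
```

If the job is killed in the middle of a chunk, that chunk is processed again after the restart, so this assumes that rewriting a chunk is harmless (for example, because each chunk is written to a fixed location in the output file).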
Once going to very large databases (~10 TB), field_transform becomes a serious bottleneck. While the computation can be done in a few hours, the serial field transform takes multiple days to weeks (serial read/write rates on parallel file systems are really bad, about 30 MB/s on CSCS machines).
The workload should be limited, because this refers to a single loop with very few lines:
https://github.com/geodynamics/axisem/blob/master/SOLVER/UTILS/field_transform.F90#L851-L909
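As a rough sanity check on the numbers in the description, the following Python snippet estimates how long a single serial pass over the database takes at the quoted rate. It assumes decimal units (10 TB = 10e12 bytes, 30 MB/s = 30e6 bytes/s) and a sustained rate. The real transform both reads and writes the data, so its total time would be longer than one pass.

```python
SECONDS_PER_DAY = 86400.0


def transfer_days(size_bytes, rate_bytes_per_s):
    """Days needed to move size_bytes once at a sustained serial rate."""
    return size_bytes / rate_bytes_per_s / SECONDS_PER_DAY


DB_SIZE = 10e12   # ~10 TB database (decimal units, an assumption)
RATE = 30e6       # ~30 MB/s serial rate quoted for CSCS machines

one_pass = transfer_days(DB_SIZE, RATE)
print(f"one serial pass over the database: {one_pass:.1f} days")
```

A single pass already comes out at close to four days, which is consistent with the "multiple days to weeks" observed for the full read-and-write transform.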