Question | Could mixin/extensions be used for supporting incremental build and test? #213
Comments
I don't think that you can make this assumption. Maybe the user does want to build and test all packages. So first of all this needs an explicit statement that this is what the user would like to do imo.
Your assumption that a package only needs to be rebuilt if files within the package itself change isn't correct. There could be changes in the environment or in dependencies used by the package, such as header files on the system, which still require the package to be rebuilt. Again, I think an explicit statement of intent is necessary here: that this is what the user wants to do, and that they are aware of the limitations.
A mixin doesn't have any logic. A mixin name is simply expanded into a set of command line arguments. So it can't determine the package names needed in the current situation (based on whatever state). This kind of functionality needs to be in an extension with Python code to implement the logic.
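To illustrate why a mixin cannot select packages based on state, here is a minimal sketch of mixin expansion as plain substitution. The `MIXINS` table and the `expand_mixins` helper are hypothetical names made up for this example; they are not the colcon-mixin implementation, which reads its definitions from files.

```python
# Hypothetical mixin table: each name maps to a fixed list of arguments.
MIXINS = {
    "debug": ["--cmake-args", "-DCMAKE_BUILD_TYPE=Debug"],
    "release": ["--cmake-args", "-DCMAKE_BUILD_TYPE=Release"],
}


def expand_mixins(base_args, mixin_names, mixins=MIXINS):
    """Append the arguments each mixin name stands for to the base arguments.

    The expansion is a pure lookup: nothing here inspects the workspace,
    so the result cannot depend on which files changed.
    """
    args = list(base_args)
    for name in mixin_names:
        args.extend(mixins[name])
    return args


if __name__ == "__main__":
    print(expand_mixins(["build"], ["debug"]))
```

Because the lookup never sees the workspace, anything that depends on state (such as a list of changed packages) would have to be computed elsewhere, e.g. in an extension.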
The existing
The question here is "what-means-modified"? Any file under the package directory (how about
The question here is "rebuilt-since-when"? The last
The following question applies to both cases: would it be better to add a new option for this (like
I don't understand the question - maybe you can elaborate on it. |
Agreed, this incremental behavior should be explicitly specified and opt-in. I was thinking that providing a mixin would serve as the explicit statement, but as you pointed out, a mixin name simply expands to a set of command line arguments. Can an extension add sub-arguments to the command line argument parser, or can it only add behaviors to existing args? Do any of the common extensions demonstrate this? https://github.com/colcon/colcon-common-extensions/blob/master/stdeb.cfg#L3
Yes, changes in the system could necessitate a fresh rebuild. The eventual use case I had in mind was for a speedy local CI agent in a container that could use the container image as a cache-break event, thus if the same docker image binary was used for the build step, then the same build artifacts could be reused. Here is an example of where we use a chaining checksum as the key to cache workspaces, seeded using the last touched file in the docker image build and system dependencies installed:
Aside from
Excellent, that's a good detail to know. Is that documented anywhere in the docs? https://github.com/colcon/colcon.readthedocs.org/blob/f827f8a/user/how-to.rst
Yes, the extension would have to store a checksum or digest manifest for each package to remember the state of the source directory. The extension could also make use of the .*ignore files in the source folder to prevent IDE and VCS (git) meta files from breaking caches.
Would the respective package subdirectories in the build folder be a good place to store this state information, like where the return codes of the build and test commands for each package are kept?
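A rough sketch of the per-package digest idea, assuming a single ignore file per package and simple glob matching (this does not implement full gitignore semantics). The function names `load_ignore_patterns` and `package_digest` are hypothetical and not part of colcon.

```python
import fnmatch
import hashlib
from pathlib import Path


def load_ignore_patterns(pkg_dir, ignore_name=".gitignore"):
    """Read glob patterns from an ignore file, skipping blanks and comments."""
    path = Path(pkg_dir) / ignore_name
    if not path.is_file():
        return []
    stripped = (line.strip() for line in path.read_text().splitlines())
    return [s for s in stripped if s and not s.startswith("#")]


def package_digest(pkg_dir, patterns=()):
    """Hash relative paths and contents of all non-ignored files in a package."""
    pkg_dir = Path(pkg_dir)
    digest = hashlib.sha256()
    for path in sorted(p for p in pkg_dir.rglob("*") if p.is_file()):
        rel = path.relative_to(pkg_dir).as_posix()
        # Simplification: match each pattern against the relative path or file name.
        if any(fnmatch.fnmatch(rel, pat) or fnmatch.fnmatch(path.name, pat)
               for pat in patterns):
            continue
        digest.update(rel.encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
    return digest.hexdigest()
```

Comparing the digest stored from the previous run with a freshly computed one would then tell the extension whether a package's sources changed. Where that stored value lives (for example a file in the package's build subdirectory) is left open here.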
I guess I'd lean towards the former. As long as it would play nicely with
If we can entirely account for the state of the system outside the workspace (e.g. detect when the workspace and system fall out of sync given changes to the system), how necessary is it to If I were to Perhaps I'm overthinking things, but if IIRC, |
Several of the extension points allow an extension to contribute to the parser. Just search for
The current docs are focused on examples of what a user might want to do and atm only contain cases with a single
Yes, I think that would be the right location.
Yes, for more complex sets of packages it is possible (and intended, e.g. you can use that to easily perform the
Well, it depends 😉 usually you should be fine but there are corner cases. E.g. a leftover artifact in the build directory could have an undesired effect or a downstream package has cached information in its
Usually a full build will take less than a second per previously built and unchanged package. Whether that is fast enough is certainly subjective. If you have only done a single build of a subset of packages before the option |
When re-building packages per I'm guessing removing packages from the install folder would not be as straightforward: if the source changes, we might not have the same uninstall behavior (is that even a thing here?). Thus the conservative route would be to wipe the install folder entirely, and call build for all packages to recreate the
Yep, we're already using ccache to speed up the CI: we pre-heat the ccache from the nightly docker image and keep it warm by limiting the cache's scope to the particular PR, so hits for the branch are less likely to age out. BTW, if you know what non-determinism might be causing the miserable hit rates for the debug build, as opposed to the 100% rate we have for release builds, let me know what you think: |
Yes, deleting the build directories of all packages about to be processed will result in correct behavior.
Your uninstall steps shouldn't be based on the current state of the source directory (since, as you mentioned, that might have changed). The necessary steps to uninstall the package need to be generated at build time and when triggering an
If you use an isolated install (which is the default when building from source) you can simply delete the package specific install directories |
Awesome!
Ok, so assuming an isolated install, and the hypothetical
Minimal caching but Correct build: Wipe both build and install spaces for changed and dependent packages
Moderate caching but Imprecise build: Wipe only install spaces for changed packages
Maximum caching but Imprecise build: Wipe only install spaces for changed and dependent packages
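The three options can be sketched as a function that returns which directories to remove. This is only an illustration under the assumption of an isolated layout with `build/<package>` and `install/<package>` directories; the `Strategy` enum and `dirs_to_wipe` are hypothetical names, and the function only computes paths without deleting anything.

```python
from enum import Enum
from pathlib import Path


class Strategy(Enum):
    MINIMAL = "minimal"    # build + install spaces, changed and dependent packages
    MODERATE = "moderate"  # install spaces only, changed packages
    MAXIMUM = "maximum"    # install spaces only, changed and dependent packages


def dirs_to_wipe(strategy, changed, dependents,
                 build_base="build", install_base="install"):
    """Return the per-package directories to remove for the given strategy."""
    changed, dependents = set(changed), set(dependents)
    if strategy is Strategy.MODERATE:
        pkgs = changed
    else:
        pkgs = changed | dependents
    dirs = [Path(install_base) / name for name in sorted(pkgs)]
    if strategy is Strategy.MINIMAL:
        # Only the "correct build" option also discards the build spaces.
        dirs += [Path(build_base) / name for name in sorted(pkgs)]
    return dirs
```

For example, with `changed={"a"}` and `dependents={"b"}`, the moderate option yields only `install/a`, while the minimal option yields the install and build directories of both packages.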
|
Yes, these processes should be fine. For an extra safe guard (in the last case where you didn't wipe
You should be able to use
You want to watch out for the exact verb / term here. In #37 where such a functionality is discussed there is some controversy about the term to be used: |
Would using
Here is a DRY'er mock up using in-place args to avoid tracking around two variables:
Minimal caching but Correct build
Moderate caching but Imprecise build
Maximum caching but Imprecise build
|
It will only result in a full CMake configure run. The actual build doesn't change if the source files / all included headers didn't change. |
@ruffsl If the question has been answered can you go ahead and close the ticket? |
Say a user has built a workspace and subsequently runs colcon test. The user then modifies a number of files in the workspace and again invokes colcon build and then colcon test. Ideally, one could expect packages that were modified, as well as packages above those modified, to be rebuilt. Then for testing, only those packages that were rebuilt would be retested. Thus the caching of build and test would be maximized, while still encompassing all changes to the workspace since the previous build and test.
Would this be achievable using mixins and extensions? Or would core changes be required?
I'm thinking an extension could retain a digest-manifest for all files in the workspace, so it could be fully aware of file level changes. Those file changes could then be associated to package names that then could be passed to colcon build and test arg, e.g.
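As a sketch of that association step, the following maps changed file paths to package names and then adds every package that transitively depends on a changed one. The inputs (`package_dirs`, `depends_on`) and both function names are hypothetical; a real extension would obtain package locations and dependencies from colcon rather than from hand-written dictionaries.

```python
from collections import deque
from pathlib import PurePosixPath


def packages_for_files(changed_files, package_dirs):
    """Map changed files to package names.

    package_dirs: package name -> directory relative to the workspace root.
    """
    hit = set()
    for changed in changed_files:
        parents = PurePosixPath(changed).parents
        for name, directory in package_dirs.items():
            if PurePosixPath(directory) in parents:
                hit.add(name)
    return hit


def with_dependents(changed, depends_on):
    """Return the changed packages plus everything that depends on them.

    depends_on: package name -> set of direct dependencies.
    """
    reverse = {}
    for pkg, deps in depends_on.items():
        for dep in deps:
            reverse.setdefault(dep, set()).add(pkg)
    result = set(changed)
    queue = deque(changed)
    while queue:
        for downstream in reverse.get(queue.popleft(), ()):
            if downstream not in result:
                result.add(downstream)
                queue.append(downstream)
    return result
```

The resulting names could then be handed to a package selection argument such as `--packages-select`. Note that a file inside a nested package directory would be attributed to both the inner and the outer package in this sketch.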
BTW, are the sets of packages captured by each arg OR'd or AND'd together?
And would something like incremental clean vs incremental dirty be helpful for specifying whether affected packages should be cleaned or not prior to build or test? Related: #37