-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[rewrite] rewrite all repo to properly use DVC, MLFlow and data engineering practices #15
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @Chouffe,
Thanks a lot Arthur for this amazing work !
I left you a comment on something that's buggy, but the rest is fine with me.
Just one thing, I'm not sure about the relevance of having all three stages:
- train_yolov8_baseline_small_dataset
- train_yolov8_baseline_full_dataset
- train_yolov8_best
For me, the first two are simple tests. You can explain how to launch them in the README, but I don't think they should be there. I suggest keeping train_yolov8_best and renaming it train_yolov8. What do you think?
--n 10 \ | ||
--loglevel "info" | ||
|
||
yolov8_benchmark: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got an error here :
(.venv) mateo@mateo:~/pyronear/vision/mlops/pyro-mlops$ make yolov8_benchmark
python ./scripts/model/yolov8/benchmark.py \
--input-dir ./data/04_models/yolov8/ \
--output-dir ./data/06_reporting/yolov8/ \
--loglevel "info"
INFO:root:{'input_dir': PosixPath('data/04_models/yolov8'), 'output_dir': PosixPath('data/06_reporting/yolov8'), 'loglevel': 'info'}
INFO:root:Loading files data/04_models/yolov8/best/args.yaml and data/04_models/yolov8/best/results.csv
/home/mateo/pyronear/vision/mlops/pyro-mlops/./scripts/model/yolov8/benchmark.py:77: PerformanceWarning: DataFrame is highly fragmented. This is usually the result of calling `frame.insert` many times, which has poor performance. Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
df[k] = str(args[k])
/home/mateo/pyronear/vision/mlops/pyro-mlops/./scripts/model/yolov8/benchmark.py:77: PerformanceWarning: DataFrame is highly fragmented. This is usually the result of calling `frame.insert` many times, which has poor performance. Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
df[k] = str(args[k])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it only log warning?
If that's the case I would not mind TBH. Or we could spend some time optimizing the dataframe operations to comply with the warning?
Thanks for taking the time to review this @MateoLostanlen. The small stages were used to tighten the feedback loop to training yolo models with a smaller dataset. That usually saves me a lot of time instead of burning GPU compute for something that is not sure to work well. I think we can get rid of them if you think that's confusing or the other options would be to add some comments? |
@MateoLostanlen I have been working directly on my github repo here to add hyperparameter search for yolov9 and yolo10. |
Description of the PR:
dvc pull
awayHow to test:
Follow the README file to test the repo and dvc setup.
I just copied and pasted all code from the repo developed here: https://github.com/earthtoolsmaker/pyronear-mlops