data_pipe.md

| input | operation | output | notes |
| --- | --- | --- | --- |
| raw/metadata.csv | clean metadata | interim/{dataset}/metadata.csv | |
| interim/metadata.csv | download images | raw/{dataset}/images/ | |
| raw/{dataset}/images/ | preprocess images | processed/{dataset}/images/ | |
| processed/{dataset}/images/ | calculate features | interim/{dataset}/features.csv | |
| interim/{dataset}/features.csv<br>processed/{dataset}/images/ | create_recommender | models/retrieval<br>models/retrieval_exclusion | images are for testing and tracing the model graph |
| interim/{dataset}/metadata.csv | process_metadata | processed/fixtures/*.json<br>processed/metadata.csv | |
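
The stages above can be written down as data, which makes it easy to see which concrete paths each operation reads and writes for a given dataset. The following is a minimal sketch, not the project's actual code: the `Stage` type, the `resolve` helper, and the dataset name `demo` are hypothetical, and only the path templates and operation names are taken from the table.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Stage:
    """One row of the pipeline table (hypothetical container type)."""
    operation: str
    inputs: Tuple[str, ...]
    outputs: Tuple[str, ...]


# Path templates copied verbatim from the table; {dataset} is the placeholder.
STAGES = (
    Stage("clean metadata",
          ("raw/metadata.csv",),
          ("interim/{dataset}/metadata.csv",)),
    Stage("download images",
          ("interim/metadata.csv",),
          ("raw/{dataset}/images/",)),
    Stage("preprocess images",
          ("raw/{dataset}/images/",),
          ("processed/{dataset}/images/",)),
    Stage("calculate features",
          ("processed/{dataset}/images/",),
          ("interim/{dataset}/features.csv",)),
    Stage("create_recommender",
          ("interim/{dataset}/features.csv", "processed/{dataset}/images/"),
          ("models/retrieval", "models/retrieval_exclusion")),
    Stage("process_metadata",
          ("interim/{dataset}/metadata.csv",),
          ("processed/fixtures/*.json", "processed/metadata.csv")),
)


def resolve(stage: Stage, dataset: str) -> Stage:
    """Return a copy of the stage with {dataset} filled in for every path."""
    return Stage(
        stage.operation,
        tuple(p.format(dataset=dataset) for p in stage.inputs),
        tuple(p.format(dataset=dataset) for p in stage.outputs),
    )


if __name__ == "__main__":
    # "demo" is a made-up dataset name used only for illustration.
    for stage in STAGES:
        resolved = resolve(stage, "demo")
        print(resolved.operation, resolved.inputs, "->", resolved.outputs)
```

Paths without a `{dataset}` placeholder, such as `raw/metadata.csv` or `models/retrieval`, pass through `resolve` unchanged.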

_Tables will eventually be written to Spark tables instead of CSVs._