Create a virtual environment, then run `pip install -r requirements.txt` to install the project dependencies.
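For example (a minimal sketch, assuming Python 3 on a Unix-like shell; the environment name `venv` is an arbitrary choice):

```bash
# Create and activate a virtual environment
# (on Windows, use venv\Scripts\activate instead of source).
python -m venv venv
source venv/bin/activate

# Install the project dependencies
pip install -r requirements.txt
```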
The following folder structure needs to be created (a shell sketch for creating it in one command follows the list):

- `data/`
  - `no1_original` - the full, original NELA-GT-2019 dataset (csv)
  - `no2_original_split/` - the split dataset with the original features (csv)
    - `fake_split`
    - `real_split`
  - `no3_all_features_split/` - the split dataset with all features except word embeddings (pkl)
    - `fake_split`
    - `real_split`
  - `no4_embeddings_split/` - the split dataset with only word embeddings (pkl)
    - `fake_split`
    - `real_split`
  - `no5_embeddings` - the full dataset with only word embeddings (pkl)
  - `no6_numerical` - the full dataset with numerical features and true labels (pkl and csv)
  - `testset` - the cleaned test set (csv)
  - `weak_labeling/` - scores for the Snorkel weak labeling systems
    - `analysis`
    - `confusion_matrix`
  - `describe` - dataset description, histograms, and boxplots
  - `sources` - the sources from NELA-GT-2019
  - `snuba/` - scores for the Snuba weak labeling system
    - `goal`
    - `result`
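A minimal sketch for creating the whole tree in one command, assuming bash (or another shell with brace expansion):

```bash
# Create the expected data/ directory tree; -p also creates parent
# folders and is a no-op for folders that already exist.
mkdir -p data/no1_original \
         data/no2_original_split/{fake_split,real_split} \
         data/no3_all_features_split/{fake_split,real_split} \
         data/no4_embeddings_split/{fake_split,real_split} \
         data/no5_embeddings \
         data/no6_numerical \
         data/testset \
         data/weak_labeling/{analysis,confusion_matrix} \
         data/describe \
         data/sources \
         data/snuba/{goal,result}
```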