Beyond Human-Like Processing: Large Language Models Perform Equivalently on Forward and Backward Scientific Text

Work with this repo locally:

git clone https://github.com/braingpt-lovelab/backwards --recursive
  • Recursively fetches the submodule for human participant data from this repo.
  • Recursively fetches the submodule for BrainBench testcases from this repo.
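If the clone was made without --recursive, the submodules can still be fetched afterwards with standard git (plain git behavior, nothing repo-specific):

cd backwards
git submodule update --init --recursive   # fetch and check out all submodules declared in .gitmodules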

Repo structure

  • model_training/: training scripts for both forward and backward GPT-2 models.
  • analyses/: post-training analysis scripts for reproducing the results in the paper.

Training

cd model_training

  • The entry point is launch_training.sh, which calls train.py or train_backwards.py depending on the configuration (a hypothetical manual launch is sketched after this list).
  • Training configurations can be set in configs/ and accel_config.yaml.
  • Forward and backward tokenizers can be trained from scratch with tokenizer.py and tokenizer_backwards.py.
  • Training data is hosted here.
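A hypothetical manual launch, assuming accel_config.yaml is a Hugging Face Accelerate config and that the training scripts pick up their settings from configs/ (the authoritative flags live in launch_training.sh and may differ):

cd model_training
accelerate launch --config_file accel_config.yaml train.py             # forward model
accelerate launch --config_file accel_config.yaml train_backwards.py   # backward model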

Reproduce the analyses from scratch (a run-order sketch follows the list):

cd analyses

  • Produce model responses: run_choice.py and run_choice_backwards.py.
  • Statistical analyses: anova_stats.R.
  • Fig. 3: plot_model_vs_human.py.
  • Fig. 4 & Table 1: get_ppl_final_val.py to obtain validation results and plot_ppl_val_and_test.py for plotting.
  • Fig. 5: plot_error_correlation_model_vs_human.py.
  • Fig. S1: neuro_term_tagging.py to obtain raw results and plot_token_analyses.py for plotting.
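Assuming each script runs with its defaults and takes no extra arguments (not verified here), the full pipeline could be executed in order as:

cd analyses
python run_choice.py && python run_choice_backwards.py            # model responses
Rscript anova_stats.R                                             # statistics
python plot_model_vs_human.py                                     # Fig. 3
python get_ppl_final_val.py && python plot_ppl_val_and_test.py    # Fig. 4 & Table 1
python plot_error_correlation_model_vs_human.py                   # Fig. 5
python neuro_term_tagging.py && python plot_token_analyses.py     # Fig. S1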

Attribution

@article{luo2024beyond,
  title={Beyond Human-Like Processing: Large Language Models Perform Equivalently on Forward and Backward Scientific Text},
  author={Luo, X. and Ramscar, M. and Love, B. C.},
  journal={arXiv preprint arXiv:2411.11061},
  year={2024}
}
