Commit

Update documentation to use uv
timsaucer committed Jan 14, 2025
1 parent 9c0c8dc commit 755a8f5
Showing 7 changed files with 48 additions and 56 deletions.
14 changes: 9 additions & 5 deletions README.md
@@ -175,13 +175,17 @@ over `pip`.

Bootstrap (`uv`):

By default, `uv` will attempt to build the datafusion Python package. For development we prefer to build it manually. This means
that when creating your virtual environment with `uv sync` you need to pass the additional flag `--no-install-package datafusion`,
and for `uv run` commands the additional flag `--no-project`.

```bash
# fetch this repo
git clone git@github.com:apache/datafusion-python.git
# create the virtual environment
uv sync --dev --no-install-package datafusion
# activate the environment
source venv/bin/activate
source .venv/bin/activate
```

Bootstrap (`pip`):
@@ -190,9 +194,9 @@ Bootstrap (`pip`):
# fetch this repo
git clone git@github.com:apache/datafusion-python.git
# prepare development environment (used to build wheel / install in development)
python3 -m venv venv
python3 -m venv .venv
# activate the venv
source venv/bin/activate
source .venv/bin/activate
# update pip itself if necessary
python -m pip install -U pip
# install dependencies
@@ -217,8 +221,8 @@ Alternatively if you are using `uv` you can do the following without
needing to activate the virtual environment:

```bash
uv maturin delelop --uv
uv pytest .
uv run --no-project maturin develop --uv
uv run --no-project pytest .
```

### Running & Installing pre-commit hooks
2 changes: 1 addition & 1 deletion dev/python_lint.sh
@@ -21,6 +21,6 @@
# DataFusion CI does

set -e
source venv/bin/activate
source .venv/bin/activate
flake8 --exclude venv,benchmarks/db-benchmark --ignore=E501,W503
black --line-length 79 .
4 changes: 2 additions & 2 deletions dev/release/README.md
@@ -172,8 +172,8 @@ git checkout 40.0.0-rc1
git submodule update --init --recursive

# create the env
python3 -m venv venv
source venv/bin/activate
python3 -m venv .venv
source .venv/bin/activate

# install release candidate
pip install --extra-index-url https://test.pypi.org/simple/ datafusion==40.0.0
4 changes: 2 additions & 2 deletions dev/release/verify-release-candidate.sh
@@ -125,8 +125,8 @@ test_source_distribution() {
git clone https://github.com/apache/arrow-testing.git testing
git clone https://github.com/apache/parquet-testing.git parquet-testing

python3 -m venv venv
source venv/bin/activate
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install -U pip
python3 -m pip install -r requirements-310.txt
maturin develop
32 changes: 11 additions & 21 deletions docs/README.md
@@ -26,42 +26,32 @@ when changes are merged to the main branch.
## Dependencies

It's recommended to install build dependencies and build the documentation
inside a Python `venv`.
inside a Python `venv` using `uv`.

To prepare for building the documentation, run the following from the root of the project:

1. Set up virtual environment if it was not already created
```bash
python3 -m venv venv
```
1. Activate virtual environment
```bash
source venv/bin/activate
```
1. Install Datafusion's Python dependencies
```bash
pip install -r requirements-310.txt
```
1. Install documentation dependencies
```bash
pip install -r docs/requirements.txt
```
```bash
# Set up a virtual environment with the documentation dependencies
uv sync --dev --group docs --no-install-package datafusion
```

## Build & Preview

Run the provided script to build the HTML pages.

```bash
cd docs
./build.sh
# Build the repository
uv run --no-project maturin develop --uv
# Build the documentation
uv run --no-project docs/build.sh
```

The HTML will be generated into a `build` directory.
The HTML will be generated into a `build` directory in `docs`.

Preview the site on Linux by running this command.

```bash
firefox build/html/index.html
firefox docs/build/html/index.html
```

## Release Process
6 changes: 6 additions & 0 deletions docs/build.sh
@@ -20,6 +20,10 @@

set -e

original_dir=$(pwd)
script_dir=$(dirname "$(realpath "$0")")
cd "$script_dir" || exit

if [ ! -f pokemon.csv ]; then
curl -O https://gist.githubusercontent.com/ritchie46/cac6b337ea52281aa23c049250a4ff03/raw/89a957ff3919d90e6ef2d34235e6bf22304f3366/pokemon.csv
fi
@@ -33,3 +37,5 @@ rm -rf temp 2> /dev/null
mkdir temp
cp -rf source/* temp/
make SOURCEDIR=`pwd`/temp html

cd "$original_dir" || exit
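
The `docs/build.sh` change above makes the script independent of the caller's working directory: it records where it was invoked, changes into its own directory, and changes back at the end. A minimal sketch of the same save/change/restore pattern as a reusable function follows. `run_in_dir` is a hypothetical helper used only for illustration and is not part of this commit.

```shell
# Hypothetical helper (not part of docs/build.sh): run a command from a
# given directory, then return to the directory the caller started in.
run_in_dir() {
    local target_dir="$1"
    shift
    local original_dir
    original_dir=$(pwd)

    cd "$target_dir" || return 1

    # Run the remaining arguments as a command and remember its exit status.
    local status=0
    "$@" || status=$?

    # Restore the caller's working directory before reporting the status.
    cd "$original_dir" || return 1
    return "$status"
}

# Example: run `pwd` from the filesystem root without moving the caller.
start_dir=$(pwd)
run_in_dir / pwd
echo "back in: $(pwd)"
```

When a script is executed as a child process rather than sourced, its directory changes never reach the calling shell, so the final `cd` mainly matters if the script is sourced or extended with later steps. A subshell such as `( cd "$dir" && some_command )` is a common alternative that needs no explicit restore.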
42 changes: 17 additions & 25 deletions docs/source/contributor-guide/introduction.rst
@@ -29,22 +29,24 @@ Doing so is a great way to help the community as well as get more familiar with
How to develop
--------------

This assumes that you have rust and cargo installed. We use the workflow recommended by `pyo3 <https://github.com/PyO3/pyo3>`_ and `maturin <https://github.com/PyO3/maturin>`_.
This assumes that you have Rust and Cargo installed. We use the workflow recommended by
`pyo3 <https://github.com/PyO3/pyo3>`_ and `maturin <https://github.com/PyO3/maturin>`_. We recommend using
`uv <https://docs.astral.sh/uv/>`_ for Python package management.

By default, ``uv`` will attempt to build the datafusion Python package. For development we prefer to build it manually. This means
that when creating your virtual environment with ``uv sync`` you need to pass the additional flag ``--no-install-package datafusion``,
and for ``uv run`` commands the additional flag ``--no-project``.

Bootstrap:

.. code-block:: shell
# fetch this repo
git clone git@github.com:apache/arrow-datafusion-python.git
# prepare development environment (used to build wheel / install in development)
python3 -m venv venv
# activate the venv
source venv/bin/activate
# update pip itself if necessary
python -m pip install -U pip
# install dependencies (for Python 3.8+)
python -m pip install -r requirements-310.txt
git clone git@github.com:apache/datafusion-python.git
# create the virtual environment
uv sync --dev --no-install-package datafusion
# activate the environment
source .venv/bin/activate
The tests rely on test data in git submodules.

@@ -58,8 +60,8 @@ Whenever rust code changes (your changes or via `git pull`):

.. code-block:: shell
# make sure you activate the venv using "source venv/bin/activate" first
maturin develop
# make sure you activate the venv using "source .venv/bin/activate" first
maturin develop --uv
python -m pytest
Running & Installing pre-commit hooks
@@ -86,20 +88,10 @@ Mostly, the ``python`` code is limited to pure wrappers with type hints and good
Update Dependencies
-------------------

To change test dependencies, change the `requirements.in` and run

.. code-block:: shell
# install pip-tools (this can be done only once), also consider running in venv
python -m pip install pip-tools
python -m piptools compile --generate-hashes -o requirements-310.txt
To change test dependencies, change the ``pyproject.toml`` and run

To update dependencies, run with `-U`
To update dependencies, run

.. code-block:: shell
python -m piptools compile -U --generate-hashes -o requirements-310.txt
More details about pip-tools `here <https://github.com/jazzband/pip-tools>`_
uv sync --dev --no-install-package datafusion
