Commit
Merge pull request #1437 from cityofaustin/md-patch-cris-import-prsn
Skip CRIS import record updates if a key column is missing a value
mddilley authored Apr 15, 2024
2 parents 5b69bab + fa587d9 commit 0a66f60
Showing 3 changed files with 28 additions and 0 deletions.
9 changes: 9 additions & 0 deletions atd-etl/cris_import/README.md
@@ -12,3 +12,12 @@ The contents of this directory define a Docker image that can be used in an Airf
### Zip file acquisition

The CRIS import normally pulls a zip archive down from the SFTP endpoint. During development, however, it is much easier to point it at a zip file on your local machine. To do so, place a zip file (still encrypted with the "CRIS extract" password) in a folder named `atd-etl/cris_import/development_extracts/`, creating the directory if needed. If that folder contains no zip files, the program automatically falls back to inspecting the SFTP endpoint.
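The local-file fallback described above can be sketched as a small helper; this is an illustration of the behavior, not the repo's actual implementation, and the function name and default path are assumptions:

```python
from pathlib import Path


def find_local_extracts(dev_dir="atd-etl/cris_import/development_extracts"):
    """Return any local zip extracts, sorted by name.

    An empty list means no local files were found, in which case the
    import would fall back to the SFTP endpoint.
    """
    # Path.glob() yields nothing (without raising) if the directory is absent
    return sorted(Path(dev_dir).glob("*.zip"))
```

Calling `find_local_extracts()` before opening the SFTP connection keeps the development path entirely offline when a local extract is present.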

### Local testing

Make a copy of `env-template` and name it `env`. Fill in the values using the 1Password Connect Server secrets (see entries titled `Endpoint for 1Password Connect Server API` and `Vault ID of API Accessible Secrets vault`) and your personal access token.
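A filled-in `env` file would look something like the following sketch; the variable names match `env-template`, and the bracketed values are placeholders to be replaced with your 1Password secrets:

```shell
# env — placeholders only; real values come from the 1Password entries noted above
OP_API_TOKEN=<your-personal-access-token>
OP_CONNECT=<endpoint-for-1password-connect-server-api>
OP_VAULT_ID=<vault-id-of-api-accessible-secrets-vault>
```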

Drop a CRIS extract zip file into your development folder as described above, and run the import script:
```bash
docker compose run cris-import
```
16 changes: 16 additions & 0 deletions atd-etl/cris_import/cris_import.py
@@ -612,7 +612,23 @@ def align_records(map_state):
    input_column_names = util.get_input_column_names(pg, map_state["import_schema"], table, target_columns)

    # iterate over each imported record and determine the correct action
    for source in imported_records:
        # Check that every unique key column has a value; reset the flag per record
        should_skip_update = False
        for key_column in key_columns:
            if source[key_column] is None:
                print("\nSkipping because a unique key column is missing a value")
                print(f"Table: {table}")
                print(f"Missing key column: {key_column}")
                should_skip_update = True

        # If a column that uniquely identifies the record is missing, skip the update
        if should_skip_update:
            for key_column in key_columns:
                print(f"{key_column}: {source[key_column]}")
            continue

        # generate some record-specific SQL fragments to identify the record in larger queries
        record_key_sql, import_key_sql = util.get_key_clauses(table_keys, output_map, table, source, map_state["import_schema"])
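The guard in the diff above amounts to filtering out records that cannot be uniquely identified. A minimal standalone sketch, assuming dict-shaped records and illustrative column names (not the repo's actual API):

```python
def filter_complete_records(records, key_columns):
    """Yield only records whose unique-key columns all have values.

    Records missing any key column are logged and skipped, since an
    update cannot safely target a row it cannot uniquely identify.
    """
    for record in records:
        missing = [col for col in key_columns if record.get(col) is None]
        if missing:
            # Log and skip rather than attempting an un-targetable update
            print(f"Skipping record with missing key column(s): {missing}")
            continue
        yield record
```

Checking all key columns before acting keeps the skip decision local to each record, so one incomplete record never affects how later records are processed.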
3 changes: 3 additions & 0 deletions atd-etl/cris_import/env-template
@@ -0,0 +1,3 @@
OP_API_TOKEN=
OP_CONNECT=
OP_VAULT_ID=
