Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure that temporary records are not removed by the import script #1446

Merged
merged 2 commits into from
May 9, 2024

Conversation

frankhereford
Copy link
Member

@frankhereford frankhereford commented May 6, 2024

Associated issues

This PR aims to close cityofaustin/atd-data-tech#17143.

Testing

Test this one locally by:

  • Fire up your local stack, at least so that you have a DB & graphql-engine endpoint running.
  • Drop a export from CRIS you want to test with into the development_extracts folder inside the CRIS import folder.
cd atd-etl/cris_import;
docker compose build;
docker compose run cris-import;
  • Did it run?

Ship list

  • Check migrations for any conflicts with latest migrations in master branch
  • Confirm Hasura role permissions for necessary access
  • Code reviewed
  • Product manager approved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing the two functions we're not calling anymore.

Comment on lines -672 to -686
try:
# this seemingly violates the principal of treating each record source equally, however, this is
# really only a reflection that we create incomplete temporary records consisting only of a crash record
# and not holding place entities for units, persons, etc.
if table == "crash" and util.has_existing_temporary_record(pg, source["case_id"]):
print("\b🛎: " + str(source["crash_id"]) + " has existing temporary record")
time.sleep(5)
util.remove_existing_temporary_record(pg, source["case_id"])
except:
# Trap the case of a missing case_id key error in the RealDictRow object.
# A RealDictRow, returned by the psycopg2 cursor, is a dictionary-like object,
# but lacks has_key() and other methods.
print("Skipping checking on existing temporary record for " + str(source["crash_id"]))
pass

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the core of the PR; dropping this whole try/except which used to check for and remove temp records.

Comment on lines -730 to +715
if file.endswith(".xml"):
if file.endswith(".xml") or file == "crashReports":
Copy link
Member Author

@frankhereford frankhereford May 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm cherry picking this into this PR from the S3 one because some exports I have floating around (and we all will or already do) which have the PDF containing folder in them, and this checks for that so they don't treated like CSVs. This let me keep using the same test extract from the rest of my work today, and makes it safe for you to pick any you have laying around as well.

Copy link
Member

@johnclary johnclary left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great. thank you! 🚢

Copy link
Contributor

@mddilley mddilley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this locally with an extract that I had lying around, and the import process completed successfully. Thanks for tidying this up! 🧹 🙌 🚢

@frankhereford frankhereford merged commit 4993d6e into master May 9, 2024
9 checks passed
@frankhereford frankhereford deleted the ensure-temp-records-persist branch May 9, 2024 04:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Tech debt] Remove temporary crash deduping logic from CRIS import ETL
3 participants