This repository has been archived by the owner on Sep 23, 2024. It is now read-only.

Perform logical replication after initial sync (continuation) #218

Closed
wants to merge 33 commits into from

@josescuderoh josescuderoh commented Apr 24, 2023

Problem

Describe the problem your PR is trying to solve

This PR aims to complete the work proposed in PR #144, since the original contributor will not be able to do so.

If you select new tables to add to an existing log based extract, it'll do a "logical_initial" full table sync on those new tables each and every time the tap runs. It'll never switch over to log based replication for them.

Another example of this issue: if I intend to run with `break_at_end_lsn: False`, I have to run the tap twice, once to perform the initial sync and again to start the logical replication.

Proposed changes

Describe the big picture of your changes here to communicate to the maintainers why we should accept this pull request.
If it fixes a bug or resolves a feature request, be sure to link to that issue.

This Pull Request contains changes which are simpler and, I believe, more correct than those in #130. Both Pull Requests aim to solve #107.

My changes ensure that whenever a LOG_BASED replication runs the bookmark is initialised correctly and any logical replication which is required is performed.
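The intended behaviour can be sketched roughly as below. This is a minimal illustration, not the code in this PR: `run_log_based`, `needs_initial_sync`, the `full_table_sync` and `logical_sync` callbacks, and the plain-dict state layout are all hypothetical names chosen for the example, and `current_lsn` is assumed to be the WAL position captured by the caller before the table copy starts.

```python
from typing import Callable, Dict, List

# Hypothetical state layout: {"bookmarks": {stream_id: {"lsn": int}}}
State = Dict[str, Dict[str, Dict[str, int]]]


def needs_initial_sync(state: State, stream_id: str) -> bool:
    """Assumption: a stream without an `lsn` bookmark has not completed its initial sync."""
    return "lsn" not in state.get("bookmarks", {}).get(stream_id, {})


def run_log_based(
    state: State,
    stream_ids: List[str],
    current_lsn: int,
    full_table_sync: Callable[[str], None],
    logical_sync: Callable[[List[str], int], int],
) -> State:
    """Initial-sync any new streams, then run logical replication in the same run."""
    if not stream_ids:
        return state

    bookmarks = state.setdefault("bookmarks", {})

    # Step 1: copy tables that have no LSN bookmark yet, then initialise the
    # bookmark so the next run does not repeat the full-table sync.
    for stream_id in stream_ids:
        if needs_initial_sync(state, stream_id):
            full_table_sync(stream_id)
            bookmarks.setdefault(stream_id, {})["lsn"] = current_lsn

    # Step 2: without waiting for a second run, replicate from the oldest
    # bookmarked position across all LOG_BASED streams.
    start_lsn = min(bookmarks[s]["lsn"] for s in stream_ids)
    end_lsn = logical_sync(stream_ids, start_lsn)

    # Step 3: advance every bookmark to the position logical replication reached.
    for stream_id in stream_ids:
        bookmarks[stream_id]["lsn"] = max(bookmarks[stream_id]["lsn"], end_lsn)

    return state
```

With one already-bookmarked stream and one newly selected stream, a single call performs the full-table copy only for the new stream and then runs logical replication for both, which is the single-run behaviour the description above is aiming for.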

Types of changes

What types of changes does your code introduce to PipelineWise?
Put an x in the boxes that apply

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation Update (if none of the other choices apply)

Checklist

  • Description above provides context of the change
  • I have added tests that prove my fix is effective or that my feature works
  • Unit tests for changes (not needed for documentation changes)
  • CI checks pass with my changes
  • Bumping version in setup.py is an individual PR and not mixed with feature or bugfix PRs
  • Commit message/PR title starts with [AP-NNNN] (if applicable. AP-NNNN = JIRA ID)
  • Branch name starts with AP-NNN (if applicable. AP-NNN = JIRA ID)
  • Commits follow "How to write a good git commit message"
  • Relevant documentation is updated including usage instructions

Perform logical replication after initial sync
This option removes the need to use `write-in-chunks`.
Enable use of `wal2json` `format-version` 2
Use a more useful value for `extracted_at`
Add Postgres range types to schema
@josescuderoh josescuderoh marked this pull request as ready for review April 24, 2023 15:30
@josescuderoh
Author

@Samira-El I recreated this PR from a stale PR created by @judahrand and reviewed by you last year. This one includes those changes plus the latest changes from the master branch. Any chance you could review it so we can avoid running off a fork? Thanks!

@josescuderoh josescuderoh deleted the issue-107b branch May 1, 2023 18:32
@judahrand
Contributor

😞 why closed?

@josescuderoh
Author

I'm going to modify the forked repo based on your solution so we can run off it instead, since the review process here might take some time. When it's ready, I might open a new PR.
