storcon: signal LSN wait to pageserver during live migration #10452

Open: wants to merge 5 commits into main

VladLazar
Contributor

@VladLazar VladLazar commented Jan 20, 2025

Problem

We've seen the ingest connection manager get stuck shortly after a
migration.

Summary of changes

A speculative mitigation is to kick LSN ingest using the same mechanism
that get page requests use. The connection manager monitors LSN waits
and queries the broker if no updates are received for the timeline.
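
The rule described above (query the broker when someone is waiting on an LSN and the timeline has gone quiet) can be illustrated with a minimal sketch. This is not the pageserver's actual code: `LsnWaitMonitor`, its fields, and the `stale_after` threshold are hypothetical names chosen for illustration, and the real connection manager may track this state quite differently.

```rust
use std::time::{Duration, Instant};

/// Hypothetical per-timeline state (an assumption, not a pageserver type).
struct LsnWaitMonitor {
    /// Number of callers currently waiting for an LSN to arrive.
    pending_waits: usize,
    /// When the last update for this timeline was observed.
    last_update: Instant,
    /// How long silence is tolerated while a wait is pending (assumed knob).
    stale_after: Duration,
}

impl LsnWaitMonitor {
    fn new(now: Instant, stale_after: Duration) -> Self {
        Self { pending_waits: 0, last_update: now, stale_after }
    }

    /// Called when an LSN wait begins (a get page request, or a wait
    /// signalled during migration).
    fn wait_started(&mut self) {
        self.pending_waits += 1;
    }

    /// Called when an LSN wait completes or is cancelled.
    fn wait_finished(&mut self) {
        self.pending_waits = self.pending_waits.saturating_sub(1);
    }

    /// Called whenever an update for the timeline is received.
    fn update_received(&mut self, now: Instant) {
        self.last_update = now;
    }

    /// True when at least one wait is pending and no update has been
    /// seen for `stale_after` or longer.
    fn should_query_broker(&self, now: Instant) -> bool {
        self.pending_waits > 0
            && now.saturating_duration_since(self.last_update) >= self.stale_after
    }
}
```

In this sketch an idle timeline with no waiters never triggers a broker query, so the extra traffic is limited to cases where something is actually blocked on ingest.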

Closes #10351

@VladLazar VladLazar force-pushed the vlad/migration-wait-lsn branch from c0a9d24 to 53767db Compare January 20, 2025 17:28
@VladLazar VladLazar changed the title from "Vlad/migration wait lsn" to "storcon: signal LSN wait to pageserver during live migration" Jan 20, 2025

github-actions bot commented Jan 20, 2025

7370 tests run: 6985 passed, 0 failed, 385 skipped (full report)


Flaky tests (8), across Postgres 17, 16, 15, and 14

Code coverage* (full report)

  • functions: 33.4% (8461 of 25326 functions)
  • lines: 49.2% (70931 of 144277 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
f797495 at 2025-01-22T14:29:24.504Z

@VladLazar VladLazar force-pushed the vlad/migration-wait-lsn branch from 53767db to 191294e Compare January 21, 2025 17:18
@VladLazar VladLazar marked this pull request as ready for review January 22, 2025 12:39
@VladLazar VladLazar requested a review from a team as a code owner January 22, 2025 12:39
@VladLazar VladLazar requested review from erikgrinaker and jcsp and removed request for erikgrinaker January 22, 2025 12:39
Successfully merging this pull request may close these issues.

pageserver: WAL receiver did not spawn after shard migration