Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(pageserver): handle dup layers during gc-compaction #10430

Merged
merged 11 commits into from
Jan 23, 2025

Conversation

skyzh
Copy link
Member

@skyzh skyzh commented Jan 16, 2025

Problem

If gc-compaction decides to rewrite an image layer, it will now cause index_part to lose reference to that layer. In details,

  • Assume there's only one image layer of key 0000...AAAA at LSN 0x100 and generation 0xA in the system.
  • gc-compaction kicks in at gc-horizon 0x100, and then produce 0000...AAAA at LSN 0x100 and generation 0xB.
  • It submits a compaction result update into the index part that unlinks 0000-AAAA-100-A and adds 0000-AAAA-100-B

On the remote storage / local disk side, this is fine -- it unlinks things correctly and uploads the new file. However, the index_part.json itself doesn't record generations. The buggy procedure is as follows:

  1. upload the new file
  2. update the index part to remove the old file and add the new file
  3. remove the new file

Therefore, the correct update result process for gc-compaction should be as follows:

  • When modifying the layer map, delete the old one and upload the new one.
  • When updating the index, uploading the new one in the index without deleting the old one.

Summary of changes

  • Modify finish_gc_compaction to correctly order insertions and deletions.
  • Update the way gc-compaction uploads the layer files.
  • Add new tests.

@skyzh skyzh requested review from jcsp and VladLazar January 16, 2025 16:47
@skyzh skyzh requested a review from a team as a code owner January 16, 2025 16:47
Copy link

github-actions bot commented Jan 16, 2025

7414 tests run: 7027 passed, 0 failed, 387 skipped (full report)


Flaky tests (5)

Postgres 17

Postgres 16

Postgres 15

Postgres 14

Code coverage* (full report)

  • functions: 33.5% (8495 of 25343 functions)
  • lines: 49.3% (71428 of 144828 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
1bb54ef at 2025-01-23T22:03:06.285Z :recycle:

@jcsp
Copy link
Collaborator

jcsp commented Jan 16, 2025

This deserves a test, as it was a near-miss data loss issue. I think we talked about doing something like just gc-compacting, then restarting and doing it again being enough to repro the issue?

@skyzh skyzh force-pushed the skyzh/gc-compaction-ignore-key branch 2 times, most recently from c6d2e49 to 2a2ecce Compare January 16, 2025 21:11
@skyzh
Copy link
Member Author

skyzh commented Jan 16, 2025

@jcsp I added two test cases plus one chaos test. The two test cases ensures discard layer due to duplicated layer key warning is hit.

@skyzh skyzh force-pushed the skyzh/gc-compaction-ignore-key branch from 2a2ecce to 097e0a6 Compare January 16, 2025 21:15
Copy link
Contributor

@VladLazar VladLazar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's more risky to fully fix the index-part bug so I'd like to work around it by asking gc-compaction not producing such updates.

Shard ancestor compaction already does this in a safe manner, with the important caveat that the current generation is greater than the generation of the layer being re-written (see compact_shard_ancestors and the call to rewrite_layers).

The trick is in how you call schedule_compaction_update. compacted_from arg must contain only dropped layers (no re-writes) and compacted_to must contain only re-writes or new layers. I see that you were already discarding re-writes within the same generation in KeyHistoryRetention::discard_key, so this should work fine.

@skyzh
Copy link
Member Author

skyzh commented Jan 17, 2025

compacted_from arg must contain only dropped layers (no re-writes) and compacted_to must contain only re-writes or new layers. I see that you were already discarding re-writes within the same generation in KeyHistoryRetention::discard_key, so this should work fine.

Yeah I thought I was doing it in the correct way that same layer in same generations are discarded... so I assume there's still a bug around remote_client

skyzh added 4 commits January 17, 2025 11:24
Signed-off-by: Alex Chi Z <chi@neon.tech>
Signed-off-by: Alex Chi Z <chi@neon.tech>
Signed-off-by: Alex Chi Z <chi@neon.tech>
Signed-off-by: Alex Chi Z <chi@neon.tech>
@skyzh skyzh force-pushed the skyzh/gc-compaction-ignore-key branch 2 times, most recently from fc494a6 to 9b1a107 Compare January 17, 2025 20:40
@skyzh
Copy link
Member Author

skyzh commented Jan 17, 2025

The trick is in how you call schedule_compaction_update. compacted_from arg must contain only dropped layers (no re-writes) and compacted_to must contain only re-writes or new layers. I see that you were already discarding re-writes within the same generation in KeyHistoryRetention::discard_key, so this should work fine.

Oh well on a second read I realized that I misread something. So if the code rewrites a layer, it should only appear in the compact_to and not appear in compact_from?

@skyzh
Copy link
Member Author

skyzh commented Jan 17, 2025

...even if the rewritten layer has a different generation than the original one?

Signed-off-by: Alex Chi Z <chi@neon.tech>
@skyzh skyzh force-pushed the skyzh/gc-compaction-ignore-key branch from 9b1a107 to 60310f6 Compare January 17, 2025 21:29
skyzh added 2 commits January 20, 2025 14:11
Signed-off-by: Alex Chi Z <chi@neon.tech>
Signed-off-by: Alex Chi Z <chi@neon.tech>
@skyzh skyzh requested a review from VladLazar January 20, 2025 21:27
@skyzh
Copy link
Member Author

skyzh commented Jan 20, 2025

fixed all test flakyness and ready for review :)

@skyzh skyzh force-pushed the skyzh/gc-compaction-ignore-key branch 2 times, most recently from b3743ca to 1bd9d6a Compare January 21, 2025 19:34
…mote_client

Signed-off-by: Alex Chi Z <chi@neon.tech>

fix overwriting layers

Signed-off-by: Alex Chi Z <chi@neon.tech>
@skyzh skyzh force-pushed the skyzh/gc-compaction-ignore-key branch 2 times, most recently from a8c8769 to 20c5edc Compare January 21, 2025 20:20
@skyzh skyzh force-pushed the skyzh/gc-compaction-ignore-key branch 3 times, most recently from 888d744 to bfbd1c2 Compare January 21, 2025 21:13
Signed-off-by: Alex Chi Z <chi@neon.tech>
@skyzh skyzh force-pushed the skyzh/gc-compaction-ignore-key branch from bfbd1c2 to bd921de Compare January 21, 2025 21:41
@skyzh skyzh changed the title fix(pageserver): discard dup layers during gc-compaction fix(pageserver): handle dup layers during gc-compaction Jan 21, 2025
Signed-off-by: Alex Chi Z <chi@neon.tech>
@skyzh skyzh requested a review from VladLazar January 22, 2025 15:31
Copy link
Contributor

@VladLazar VladLazar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see my comment on the duplication of rewrite_layers logic.

Signed-off-by: Alex Chi Z <chi@neon.tech>
@skyzh skyzh enabled auto-merge January 23, 2025 21:24
@skyzh skyzh added this pull request to the merge queue Jan 23, 2025
Merged via the queue into main with commit 8d47a60 Jan 23, 2025
82 checks passed
@skyzh skyzh deleted the skyzh/gc-compaction-ignore-key branch January 23, 2025 22:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants