Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What's the purpose of the meta file? #7

Open
davebryson opened this issue Aug 6, 2018 · 6 comments
Open

What's the purpose of the meta file? #7

davebryson opened this issue Aug 6, 2018 · 6 comments

Comments

@davebryson
Copy link

This is really great work! I've started a port in go to get a better understanding of the code base.

What's the purpose of the separate meta file? It appears that it's only opened and closed to read/write a key on start-up. However (if I'm reading the code correctly) any new meta information on commit is written to the actual log file. I don't see where the meta file itself is updated on subsequent commits. Is the purpose of the meta file simply to hold the key? Thanks!

@chjj
Copy link
Contributor

chjj commented Aug 7, 2018

Is the purpose of the meta file simply to hold the key?

Yes, and the key is there to prevent a very specific kind of attack on the tree. I called it "meta" because maybe it would hold some other information in the future other than just the key.

Full expanation of the attack prevented by having a random key:

The idea is that we write a "meta root" on every commit. This meta root contains a pointer to the previous meta root and a pointer to the latest root node. Attached to it is a 20 byte checksum. The meta root is written as the very last part of the write call. When we boot up, we can: 1. Find the meta root since it's always written on a particular byte boundary with a special 4 byte prefix. 2. Verify that the write succeeded by verifying the checksum. This gives us good crash consistency.

This is where the edge-case comes in: if people are allowed to add arbitrary data to the tree, an attacker may try to construct a piece of data which looks like a meta root. A node which has crashed and needs to reboot by finding a meta root may end up using an attacker's piece of data as the meta root, which could have pointers to anywhere on disk and ultimately cause consensus failures.

So, to combat this, everytime an urkel tree is instantiated for the first time, a random key is generated. The 20 byte checksum/hash for each meta root is calculated by permuting the key with the meta root's data.

This way, everyone on the network computes different checksums for their meta roots, and it becomes impossible for an attacker to try to corrupt the tree.

@chjj
Copy link
Contributor

chjj commented Aug 7, 2018

Also note that this could probably also be fixed by ensuring that values are never written on the same boundary that meta roots are (pad them by 1 byte or something). This was just one way of solving the issue.

@chjj
Copy link
Contributor

chjj commented Aug 7, 2018

I've started a port in go

That's totally awesome by the way. :)

Are you porting the radix tree or the simplified trie?

@davebryson
Copy link
Author

davebryson commented Aug 7, 2018

Ahhh... ok. So the meta key is protecting the meta roots in the log file.

Are you porting the radix tree or the simplified trie?

I'm porting the optimized version of the simplified trie. _get and _insert functional, now working on the store piece.

@davebryson
Copy link
Author

I'm porting the optimized version of the simplified trie

Hmmm.... looks like I probably should have started with the radix tree. Is that the future direction of urkel?

@chjj
Copy link
Contributor

chjj commented Aug 7, 2018

I'm porting the optimized version of the simplified trie. _get and _insert functional, now working on the store piece.

Cool.

Hmmm.... looks like I probably should have started with the radix tree. Is that the future direction of urkel?

Not sure yet whether it's the future direction. But I think once you have the data management working, it shouldn't be too hard to port it to the merklix/radix tree. You just have to store colliding bits on each internal node (if there are any), and also add a new type of absence proof.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants