-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proxy WS requests to chromium instead of returning its URL directly #8
Conversation
Runner integration tests also now pass when using a build from this branch:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of minor things.
// Currently, creating a new session will terminate other existing ones, if present. Clients should not rely on this | ||
// behavior and should delete their sessions when they finish. If a session has to be terminated when a new one is | ||
// created, an error is logged. | ||
func (s *Supervisor) Session() (SessionInfo, error) { | ||
func (s *Supervisor) Create() (SessionInfo, error) { | ||
s.sessionsMtx.Lock() | ||
defer s.sessionsMtx.Unlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I'm trying to follow what you said elsewhere.
This whole function runs with the mutex held.
That means that adding the session to the map on line 137 only has an effect if this function returns early.
The goroutine on lines 147-181 does not interact with the session.
The entry is removed on line 189.
On line 193 onwards is the only place where sess is mutated.
So the only place where it matters that this is already in place is on line 120.
It's OK to call Delete
with an id that is not present on the map, because Delete
also takes the mutex.
It looks like it should be possible to defer adding the session to the map down to line ~ 193.
So, having a pointer in the map allows me to mutate the object, but I'm missing a place where mutating the object in the map is necessary.
Basically what I'm trying to wrap my head around is whether returning from line 166 with an error causes the context to be cancelled. Unless I'm missing something, that doesn't happen. The context will be cancelled after the timeout, which will cause the session to be removed from the map, but that's it (line 213). What I think is that in that case the session shouldn't have been added to the map in the first place.
I'm just trying to clarify whether I'm missing something here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This whole function runs with the mutex held.
That means that adding the session to the map on line 137 only has an effect if this function returns early.
Correct
The goroutine on lines 147-181 does not interact with the session.
Basically what I'm trying to wrap my head around is whether returning from line 166 with an error causes the context to be cancelled. Unless I'm missing something, that doesn't happen. The context will be cancelled after the timeout, which will cause the session to be removed from the map, but that's it (line 213). What I think is that in that case the session shouldn't have been added to the map in the first place.
Also correct. This is not needed because if creating the process fails, then connecting to chromium will necessarily fail and trigger deletion anyway:
Lines 186 to 191 in e8f7d6a
version, err := s.cclient.Version(versionCtx, net.JoinHostPort("localhost", s.opts.ChromiumPort)) | |
if err != nil { | |
logger.Error("could not get chromium info", "err", err) | |
s.delete(id) // We were not able to connect to chrome, the session is borked. | |
return SessionInfo{}, err | |
} |
The entry is removed on line 189.
On line 193 onwards is the only place where sess is mutated.
It's OK to call Delete with an id that is not present on the map, because Delete also takes the mutex.
Correct
It looks like it should be possible to defer adding the session to the map down to line ~ 193.
Right now, the cancel func needs to be on the map so line 189 can cancel it if connecting to chromium fails:
Line 189 in e8f7d6a
s.delete(id) // We were not able to connect to chrome, the session is borked. |
I think you are implying that we could simplify this by not registering the object on the map before registering the AfterFunc, and I think you are right. Let me prototype that and see if the tests pass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tried this out in the last commit, tests are happy and I think it makes more sense. LMK what you think!
s.sessions[id] = cancel | ||
sess := &session{cancel: cancel} | ||
|
||
s.sessions[id] = sess | ||
|
||
context.AfterFunc(ctx, func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cancel on line 216 causes this function to be called, which causes (L#144) Delete
to be called, which calls delete
again. What is preventing the loop? (and the deadlock) Shouldn't the order of lines 216 and 217 be swapped?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Supervisor.delete
is always called with the mutex held, so cancelling the context is safe and won't race with Supervisor.Delete
on the AfterFunc
on L139. When the mutex is released. Supervisor.Delete
will be a noop as L217 removed the entry from the map.
This is not particularly great but we really really want timeouts to cancel and remove the session from the map, and I wasn't able to figure out a less confusing way of doing it unfortunately :(
On recent chromium versions, allegedly at some point after
126.0.6478.126-r0
, chromium has stopped honoring--remote-debugging-address
, specifically--remote-debugging-address=0.0.0.0
. Before this PR, Crocochrome assumed chromium would honor this flag and forwarded chromium's debugging URL to clients untouched, confident that clients would be able to reach that URL.As this is no longer true, we now need to do some plumbing to somehow forward requests to Chromium.
This PR implements a websocket proxy that, given a session ID present in the URL, looks up the chromium URL for that session and proxies the connection to that URL. To make this possible, a couple satellite changes were required:
Session(id string)
endpoint toSupervisor
to fetch existing sessions by ID after they are createdSession()
method that created the session toCreate()
Session(id string)
can retrieve it, in addition to the cancel func that was already storedI have tested this manually for now, until we add the integration test to CI/CD. This will happen in a subsequent PR as there is still some plumbing left to do to get there.
Related slack thread: https://raintank-corp.slack.com/archives/C073DULU8TE/p1721663571781409