-
Notifications
You must be signed in to change notification settings - Fork 920
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Threading support #4559
base: dev
Are you sure you want to change the base?
Threading support #4559
Conversation
Size difference with the dev branch: Binary size differencenot the same command! tinygo build -size short -o ./build/test.hex -target=feather-rp2040 ./examples/adafruit4650 go: downloading tinygo.org/x/tinyfont v0.3.0 flash ram before after diff before after diff 16636 16636 0 0.00% 4172 4172 0 0.00% tinygo build -size short -o ./build/test.hex -target=feather-rp2040 ./examples/adafruit4650 61440 61440 0 0.00% 6180 6180 0 0.00% tinygo build -size short -o ./build/test.hex -target=itsybitsy-m0 ./examples/adt7410/main.go 9600 9600 0 0.00% 4748 4748 0 0.00% tinygo build -size short -o ./build/test.hex -target=itsybitsy-m0 ./examples/adxl345/main.go 13560 13560 0 0.00% 6788 6788 0 0.00% tinygo build -size short -o ./build/test.hex -target=pybadge ./examples/amg88xx 8816 8816 0 0.00% 4740 4740 0 0.00% tinygo build -size short -o ./build/test.hex -target=itsybitsy-m0 ./examples/apa102/main.go 11836 11836 0 0.00% 6580 6580 0 0.00% tinygo build -size short -o ./build/test.hex -target=nano-33-ble ./examples/apds9960/proximity/main.go 9888 9888 0 0.00% 4752 4752 0 0.00% tinygo build -size short -o ./build/test.hex -target=itsybitsy-m0 ./examples/apa102/itsybitsy-m0/main.go 8376 8376 0 0.00% 2320 2320 0 0.00% tinygo build -size short -o ./build/test.hex -target=microbit ./examples/at24cx/main.go 8160 8160 0 0.00% 4740 4740 0 0.00% tinygo build -size short -o ./build/test.hex -target=itsybitsy-m0 ./examples/bh1750/main.go 7464 7464 0 0.00% 4740 4740 0 0.00% tinygo build -size short -o ./build/test.hex -target=itsybitsy-m0 ./examples/blinkm/main.go 70688 70688 0 0.00% 3660 3660 0 0.00% tinygo build -size short -o ./build/test.hex -target=pinetime ./examples/bma42x/main.go 63948 63948 0 0.00% 6196 6196 0 0.00% tinygo build -size short -o ./build/test.hex -target=itsybitsy-m0 ./examples/bmi160/main.go 27796 27796 0 0.00% 4780 4780 0 0.00% tinygo build -size short -o ./build/test.hex -target=itsybitsy-m0 ./examples/bmp180/main.go 64008 64008 0 0.00% 6228 6228 0 0.00% tinygo build -size short -o ./build/test.hex -target=itsybitsy-m0 ./examples/bmp280/main.go 12232 12232 0 0.00% 4812 4812 0 0.00% tinygo build -size short -o ./build/test.hex -target=trinket-m0 ./examples/bmp388/main.go 8324 8324 0 0.00% 3344 3344 0 0.00% tinygo build -size short -o ./build/test.hex -target=bluepill ./examples/ds1307/sram/main.go 22260 22260 0 0.00% 3540 3540 0 0.00% tinygo build -size short -o ./build/test.hex -target=bluepill ./examples/ds1307/time/main.go 69688 69688 0 0.00% 6368 6368 0 0.00% tinygo build -size short -o ./build/test.hex -target=itsybitsy-m0 ./examples/ds3231/main.go 4620 4620 0 0.00% 2280 2280 0 0.00% tinygo build -size short -o ./build/test.hex -target=microbit ./examples/easystepper/main.go 69232 69232 0 0.00% 6968 6968 0 0.00% tinygo build -size short -o ./build/test.hex -target=itsybitsy-m0 ./examples/flash/console/spi 65756 65756 0 0.00% 9004 9004 0 0.00% tinygo build -size short -o ./build/test.hex -target=pyportal ./examples/flash/console/qspi 7224 7224 0 0.00% 2280 2280 0 0.00% tinygo build -size short -o ./build/test.hex -target=microbit ./examples/gc9a01/main.go 67668 67668 0 0.00% 6360 6360 0 0.00% tinygo build -size short -o ./build/test.hex -target=feather-m0 ./examples/gps/i2c/main.go 68212 68212 0 0.00% 6504 6504 0 0.00% tinygo build -size short -o ./build/test.hex -target=feather-m0 ./examples/gps/uart/main.go 8008 8008 0 0.00% 4740 4740 0 0.00% tinygo build -size short -o ./build/test.hex -target=itsybitsy-m0 ./examples/hcsr04/main.go 5832 5832 0 0.00% 2280 2280 0 0.00% tinygo build -size short -o ./build/test.hex -target=microbit ./examples/hd44780/customchar/main.go 5784 5784 0 0.00% 2280 2280 0 0.00% tinygo build -size short -o ./build/test.hex -target=microbit ./examples/hd44780/text/main.go 10560 10560 0 0.00% 4748 4748 0 0.00% tinygo build -size short -o ./build/test.hex -target=arduino-nano33 ./examples/hd44780i2c/main.go 14732 14732 0 0.00% 6580 6580 0 0.00% tinygo build -size short -o ./build/test.hex -target=nano-33-ble ./examples/hts221/main.go 16152 16152 0 0.00% 2360 2360 0 0.00% tinygo build -size short -o ./build/test.hex -target=microbit ./examples/hub75/main.go 10252 10252 0 0.00% 6916 6916 0 0.00% tinygo build -size short -o ./build/test.hex -target=pyportal ./examples/ili9341/basic 10776 10776 0 0.00% 4868 4868 0 0.00% tinygo build -size short -o ./build/test.hex -target=xiao ./examples/ili9341/basic 29588 29588 0 0.00% 38076 38076 0 0.00% tinygo build -size short -o ./build/test.hex -target=pyportal ./examples/ili9341/pyportal_boing 10276 10276 0 0.00% 6916 6916 0 0.00% tinygo build -size short -o ./build/test.hex -target=pyportal ./examples/ili9341/scroll 10864 10864 0 0.00% 4868 4868 0 0.00% tinygo build -size short -o ./build/test.hex -target=xiao ./examples/ili9341/scroll 263408 263408 0 0.00% 46748 46748 0 0.00% tinygo build -size short -o ./build/test.hex -target=pyportal ./examples/ili9341/slideshow 11764 11764 0 0.00% 4780 4780 0 0.00% tinygo build -size short -o ./build/test.hex -target=circuitplay-express ./examples/lis3dh/main.go 14096 14096 0 0.00% 6580 6580 0 0.00% tinygo build -size short -o ./build/test.hex -target=nano-33-ble ./examples/lps22hb/main.go 26492 26492 0 0.00% 2328 2328 0 0.00% tinygo build -size short -o ./build/test.hex -target=microbit ./examples/lsm303agr/main.go 12476 12476 0 0.00% 4788 4788 0 0.00% tinygo build -size short -o ./build/test.hex -target=arduino-nano33 ./examples/lsm6ds3/main.go 10756 10756 0 0.00% 4740 4740 0 0.00% tinygo build -size short -o ./build/test.hex -target=itsybitsy-m0 ./examples/mag3110/main.go 9932 9932 0 0.00% 4780 4780 0 0.00% tinygo build -size short -o ./build/test.hex -target=itsybitsy-m0 ./examples/mcp23017/main.go 10380 10380 0 0.00% 4788 4788 0 0.00% tinygo build -size short -o ./build/test.hex -target=itsybitsy-m0 ./examples/mcp23017-multiple/main.go 9916 9916 0 0.00% 4740 4740 0 0.00% tinygo build -size short -o ./build/test.hex -target=itsybitsy-m0 ./examples/mcp3008/main.go 68560 68560 0 0.00% 6188 6188 0 0.00% tinygo build -size short -o ./build/test.hex -target=itsybitsy-m0 ./examples/mcp2515/main.go 27104 27104 0 0.00% 3632 3632 0 0.00% tinygo build -size short -o ./build/test.hex -target=microbit ./examples/microbitmatrix/main.go 26936 26936 0 0.00% 5680 5680 0 0.00% tinygo build -size short -o ./build/test.hex -target=microbit-v2 ./examples/microbitmatrix/main.go 8272 8272 0 0.00% 4748 4748 0 0.00% tinygo build -size short -o ./build/test.hex -target=itsybitsy-m0 ./examples/mma8653/main.go 8180 8180 0 0.00% 4740 4740 0 0.00% tinygo build -size short -o ./build/test.hex -target=itsybitsy-m0 ./examples/mpu6050/main.go 75448 75448 0 0.00% 7448 7448 0 0.00% tinygo build -size short -o ./build/test.hex -target=p1am-100 ./examples/p1am/main.go 12236 12236 0 0.00% 3352 3352 0 0.00% tinygo build -size short -o ./build/test.hex -target=pico ./examples/pca9685/main.go 6228 6228 0 0.00% 3288 3288 0 0.00% tinygo build -size short -o ./build/test.hex -target=microbit ./examples/pcd8544/setbuffer/main.go 5256 5256 0 0.00% 2280 2280 0 0.00% tinygo build -size short -o ./build/test.hex -target=microbit ./examples/pcd8544/setpixel/main.go 10524 10524 0 0.00% 3328 3328 0 0.00% tinygo build -size short -o ./build/test.hex -target=feather-rp2040 ./examples/seesaw 2841 2841 0 0.00% 558 558 0 0.00% tinygo build -size short -o ./build/test.hex -target=arduino ./examples/servo 13744 13744 0 0.00% 3400 3400 0 0.00% tinygo build -size short -o ./build/test.hex -target=pico ./examples/sgp30 8188 8188 0 0.00% 6788 6788 0 0.00% tinygo build -size short -o ./build/test.hex -target=pybadge ./examples/shifter/main.go 57460 57460 0 0.00% 3684 3684 0 0.00% tinygo build -size short -o ./build/test.hex -target=microbit ./examples/sht3x/main.go 57436 57436 0 0.00% 3692 3692 0 0.00% tinygo build -size short -o ./build/test.hex -target=microbit ./examples/sht4x/main.go 57424 57424 0 0.00% 3684 3684 0 0.00% tinygo build -size short -o ./build/test.hex -target=microbit ./examples/shtc3/main.go 6696 6696 0 0.00% 2288 2288 0 0.00% tinygo build -size short -o ./build/test.hex -target=microbit ./examples/ssd1306/i2c_128x32/main.go 6144 6144 0 0.00% 2280 2280 0 0.00% tinygo build -size short -o ./build/test.hex -target=microbit ./examples/ssd1306/spi_128x64/main.go 5844 5844 0 0.00% 2280 2280 0 0.00% tinygo build -size short -o ./build/test.hex -target=microbit ./examples/ssd1331/main.go 6764 6764 0 0.00% 2280 2280 0 0.00% tinygo build -size short -o ./build/test.hex -target=microbit ./examples/st7735/main.go 6692 6692 0 0.00% 2280 2280 0 0.00% tinygo build -size short -o ./build/test.hex -target=microbit ./examples/st7789/main.go 17536 17536 0 0.00% 4740 4740 0 0.00% tinygo build -size short -o ./build/test.hex -target=circuitplay-express ./examples/thermistor/main.go 10700 10700 0 0.00% 4540 4540 0 0.00% tinygo build -size short -o ./build/test.hex -target=circuitplay-bluefruit ./examples/tone 10204 10204 0 0.00% 4740 4740 0 0.00% tinygo build -size short -o ./build/test.hex -target=arduino-nano33 ./examples/tm1637/main.go 10820 10820 0 0.00% 3340 3340 0 0.00% tinygo build -size short -o ./build/test.hex -target=pico ./examples/touch/capacitive 9624 9624 0 0.00% 6780 6780 0 0.00% tinygo build -size short -o ./build/test.hex -target=pyportal ./examples/touch/resistive/fourwire/main.go 12568 12568 0 0.00% 6976 6976 0 0.00% tinygo build -size short -o ./build/test.hex -target=pyportal ./examples/touch/resistive/pyportal_touchpaint/main.go 15368 15368 0 0.00% 4748 4748 0 0.00% tinygo build -size short -o ./build/test.hex -target=itsybitsy-m0 ./examples/vl53l1x/main.go 13848 13848 0 0.00% 4748 4748 0 0.00% tinygo build -size short -o ./build/test.hex -target=itsybitsy-m0 ./examples/vl6180x/main.go 24696 24696 0 0.00% 13720 13720 0 0.00% tinygo build -size short -o ./build/test.hex -target=feather-nrf52840-sense ./examples/waveshare-epd/epd1in54/main.go 6488 6488 0 0.00% 2320 2320 0 0.00% tinygo build -size short -o ./build/test.hex -target=microbit ./examples/waveshare-epd/epd2in13/main.go 6152 6152 0 0.00% 2312 2312 0 0.00% tinygo build -size short -o ./build/test.hex -target=microbit ./examples/waveshare-epd/epd2in13x/main.go 6400 6400 0 0.00% 2320 2320 0 0.00% tinygo build -size short -o ./build/test.hex -target=microbit ./examples/waveshare-epd/epd4in2/main.go 26032 26032 0 0.00% 16412 16412 0 0.00% tinygo build -size short -o ./build/test.hex -target=pico ./examples/waveshare-epd/epd2in66b/main.go 6984 6984 0 0.00% 4780 4780 0 0.00% tinygo build -size short -o ./build/test.hex -target=circuitplay-express ./examples/ws2812 5772 5772 0 0.00% 9522 9522 0 0.00% '-xesppie' is not a recognized feature for this target (ignoring feature) 62732 62732 0 0.00% 5952 5952 0 0.00% tinygo build -size short -o ./build/test.hex -target=feather-nrf52840 ./examples/is31fl3731/main.go 1581 1581 0 0.00% 598 598 0 0.00% tinygo build -size short -o ./build/test.hex -target=arduino ./examples/ws2812 1056 1056 0 0.00% 180 180 0 0.00% tinygo build -size short -o ./build/test.hex -target=digispark ./examples/ws2812 32256 32256 0 0.00% 4780 4780 0 0.00% tinygo build -size short -o ./build/test.hex -target=trinket-m0 ./examples/bme280/main.go 2038270 2038270 0 0.00% 472394 472394 0 0.00% |
bcee341
to
28f51ca
Compare
Updated after some refactoring (moved internal/llsync into internal/task) and added GC support (stop-the-world scanning by pausing all threads and scanning their stacks).
|
0a39dc9
to
d328dd4
Compare
Previously the assembler was reordering this code: jal tinygo_scanstack move $a0, $sp Into this: jal tinygo_scanstack nop move $a0, $sp So it was "helpfully" inserting a branch delay slot, even though this was already being taken care of. Somehow this didn't break, but it does break in the WIP threading branch (#4559) where this bug leads to a crash.
Previously the assembler was reordering this code: jal tinygo_scanstack move $a0, $sp Into this: jal tinygo_scanstack nop move $a0, $sp So it was "helpfully" inserting a branch delay slot, even though this was already being taken care of. Somehow this didn't break, but it does break in the WIP threading branch (#4559) where this bug leads to a crash.
Previously the assembler was reordering this code: jal tinygo_scanstack move $a0, $sp Into this: jal tinygo_scanstack nop move $a0, $sp So it was "helpfully" inserting a branch delay slot, even though this was already being taken care of. Somehow this didn't break, but it does break in the WIP threading branch (#4559) where this bug leads to a crash.
Previously the assembler was reordering this code: jal tinygo_scanstack move $a0, $sp Into this: jal tinygo_scanstack nop move $a0, $sp So it was "helpfully" inserting a branch delay slot, even though this was already being taken care of. Somehow this didn't break, but it does break in the WIP threading branch (#4559) where this bug leads to a crash.
d8cf15f
to
0ca0db4
Compare
Whoa, looks like all tests are passing! |
@aykevl Awesome! Thanks for your hard work |
3d77025
to
5823015
Compare
I don't think there's a whole lot more I can extract and make separate PRs for, so let's mark this as ready for review! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very much look forward to this work!
src/internal/task/task_threads.c
Outdated
sem_init(&state.startlock, 0, 0); | ||
int result = pthread_create(thread, NULL, &start_wrapper, &state); | ||
|
||
// Wait until the thread has been crated and read all state_pass variables. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: "created"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, fixed!
} | ||
t.state.QueueNext = activeTasks | ||
activeTasks = t | ||
activeTaskLock.Unlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: defer the unlock?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I avoid defer
in places like these since:
defer
can cause heap allocations in some cases (mainly in loops, so not really relevant here, but it's not very visible).- Using
defer
currently makes every call in the function slower and larger, since it needs to be able to recover. - We don't need the run-defer-on-panic behavior here: if we panic, something is very wrong and it doesn't help to unlock.
Admittedly it could be used here, but in many other places (e.g. GCScan
) it can lead to unexpected side effects if it allocates.
So in short, defer
as it is currently implemented is inefficient and can have unexpected side effects so I avoid its use in the runtime.
(Same goes for all the other comments regarding defer
).
} | ||
|
||
// Allow goroutines to start and exit again. | ||
activeTaskLock.Unlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: defer?
// GC scan phase. Because we need to stop the world while scanning, this kinda | ||
// needs to be done in the tasks package. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I find this structure backwards. I would espect the package runtime
to have GCScan
that then calls into this package task
. Especially in light of runtime.gcScanGlobals
and runtime.scanStack
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with you, and I first tried to put the equivalent of GCScan
in the runtime. But GCScan
also needs access to various private symbols like activeTaskLock
, activeTasks
, scanDoneFutex
, etc so to avoid exporting many private details of the tasks package I put it there.
I can however make scanCurrentStack
and gcScanGlobals
callbacks, so they don't directly call back to the runtime. But then I don't know what to do with scanCurrentStack
in tinygo_task_gc_pause
. In the end, the GC and the scheduler do need some tight integration, I can't see a way around that to be honest.
runGC() | ||
gcLock.Unlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: defer?
@@ -733,6 +743,7 @@ func ReadMemStats(m *MemStats) { | |||
m.Sys = uint64(heapEnd - heapStart) | |||
m.HeapAlloc = (gcTotalBlocks - gcFreedBlocks) * uint64(bytesPerBlock) | |||
m.Alloc = m.HeapAlloc | |||
gcLock.Unlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: defer?
QueueNext *Task | ||
|
||
// Semaphore to pause/resume the thread atomically. | ||
pauseSem Semaphore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't Semaphore
be used with pointer receiver?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. The state
struct (or actually the Task
struct where it is embedded) will always be accessed via a pointer, like *Task
. So essentially when calling a function like pauseSem.Post()
, what happens is something like SemaphorePost(&t.state.pauseSem)
.
Also, I think you mean to say that pauseSem
should have the type *Semaphore
? Since the methods on the Semaphore
type do in fact have a pointer receiver:
func (s *Semaphore) Post() { ... }
func (s *Semaphore) Wait() { ... }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My point was about a pointer (pun intended 😄) on *Semaphore
. While Semaphore
methods do have pointer receiver, the instances of pauseSem
and gcSem
members would get copied when state
struct instance is ever passed around.
I've created playground snippet which uses stdlib sync.Mutex
as an example to even trigger go vet
warnings: https://go.dev/play/p/cV7h16_JS1R
src/runtime/scheduler.go
Outdated
@@ -27,6 +29,34 @@ func scheduleLogChan(msg string, ch *channel, t *task.Task) { | |||
} | |||
} | |||
|
|||
func timerQueueAdd(tim *timerNode) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think naming the timerNode tim
is confusing. We should name it either timer
, tn
, or just t
.
I see that name throughout the changes, is it an abbreviation or common term that I am not familiar with?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I've updated the naming of some variables here to be more consistent.
(I think tim
is just short for timer
).
rwMutexStateWLocked = ^uint32(0) | ||
rwMutexMaxReaders = rwMutexStateWLocked - 1 | ||
) | ||
const rwMutexMaxReaders = 1 << 30 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason you chose this specific value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No reason, except that it's the same value as used in the upstream RWMutex
implementation:
const rwmutexMaxReaders = 1 << 30
This is a small refactor to prepare GC marking for multithreaded stop-the-world.
Using a global lock may be slow, but it is certainly simple and safe. If this global lock becomes a bottleneck, we can of course look into making the GC truly support multithreading.
Move common functions to scheduler.go. They will be used both from the cooperative and from the threads scheduler.
Somewhat surprisingly, this results in smaller code than the old code with the cooperative (tasks) scheduler. Probably because the new RWMutex is also simpler.
This is not a scheduler in the runtime, instead every goroutine is mapped to a single OS thread - meaning 1:1 scheduling. While this may not perform well (or at all) for large numbers of threads, it greatly simplifies many things in the runtime. For example, blocking syscalls can be called directly instead of having to use epoll or similar. Also, we don't need to do anything special to call C code - the default stack is all we need.
Updated the PR after some review comments. And rebased on dev to incorporate #4435. |
Work in progress. Many necessary parts are not yet implemented, but this gives context for PRs I want to make in the future.
See #2446 for motivation (in particular, why this focuses on Linux first) and for a list of things that need to be implemented to make multithreading work in TinyGo.