“client side js is single threaded”
“erm actually using web-workers you can create multi threaded applications in client side js” - 🤓
Setup #
The Web Workers API lets users create dedicated worker threads in the browser to run a specified script.
To use SharedArrayBuffer (the primary way web workers can share memory), the Cross-Origin-Embedder-Policy and Cross-Origin-Opener-Policy headers must be set to require-corp and same-origin, respectively.
python -m http.server is typically my go-to for quickly running an HTTP server, but I couldn't find an easy way to set headers, so I ended up using FastAPI.
# server.py
import os

from fastapi import FastAPI
from fastapi.staticfiles import StaticFiles
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.requests import Request
from starlette.responses import Response

app = FastAPI()


class AddHeadersMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next):
        response = await call_next(request)
        # both headers are required before the browser exposes SharedArrayBuffer
        response.headers["Cross-Origin-Embedder-Policy"] = "require-corp"
        response.headers["Cross-Origin-Opener-Policy"] = "same-origin"
        return response


app.add_middleware(AddHeadersMiddleware)

# serve the files sitting next to server.py under /static
current_dir = os.path.dirname(os.path.abspath(__file__))
app.mount("/static", StaticFiles(directory=current_dir), name="static")
Now just run fastapi dev server.py to get started. With the mount above, the files in this directory are served under /static/.
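Once the page is being served, it can be worth confirming that the headers actually took effect before debugging anything else. Browsers expose a crossOriginIsolated flag for this. Below is a minimal sketch; the function name sharedMemoryStatus and its scope parameter are my own, added so the check can also be run against a plain object outside a browser.

```javascript
// Sketch: report whether shared memory looks usable in a given global scope.
// `scope` is a hypothetical parameter: pass `window` in a page or `self` in a worker.
function sharedMemoryStatus(scope) {
  const hasSab = typeof scope.SharedArrayBuffer === "function"
  const isolated = scope.crossOriginIsolated === true
  if (hasSab && isolated) return "ok"
  if (!isolated) return "missing COOP/COEP headers"
  return "SharedArrayBuffer unavailable"
}

// In the browser console you would call: sharedMemoryStatus(window)
```

If this reports missing headers, the response headers in the network tab are the first thing to check.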
NOTE: The Web Workers API also doesn't exist in Node.js (gh issue here), so running node main.js will throw:
let worker = new Worker("worker.js")
ReferenceError: Worker is not defined
Main Thread #
Now that we have an environment to use web workers:
Our problem: summing the integers from 1 to n. Each worker stores its partial sum in a shared data structure, and once all workers have finished, the main thread adds the partial sums together.
This makes a good dummy example, as it's easy to check the output against (n * (n + 1)) / 2.
First we can define the main function that will spawn workers, give them the info needed to do their part of the work, then await and collect their results.
// main.js
function sumToN(n) {
  return (n * (n + 1)) / 2
}

async function main() {
  const t1 = performance.now()
  const workerStatuses = []
  const workerCount = 5
  const upperLimit = 1_000
  // allocate 4 bytes/32 bits for each worker to store its sum into
  const bytesToAllocate = 4 * workerCount
  const sab = new SharedArrayBuffer(bytesToAllocate)

  for (let i = 0; i < workerCount; i++) {
    const worker = new Worker("worker.js")
    const p = new Promise((resolve) => {
      worker.onmessage = (e) => {
        const [workerId, message] = e.data
        // resolve promise upon worker's task completion
        if (message === "done") {
          console.log(`worker ${workerId} done`)
          resolve()
        }
      }
      // give the worker the info it needs to complete work
      worker.postMessage([i, sab, upperLimit, workerCount])
    })
    workerStatuses.push(p)
  }

  // wait for all workers to complete
  // (outside the loop, so every worker is spawned before we start waiting)
  await Promise.all(workerStatuses)

  const sabUint32 = new Uint32Array(sab)
  let total = 0
  for (let i = 0; i < workerCount; i++) {
    // don't _really_ need atomic load here,
    // all workers are done, only thread that could access it is the main thread
    total += Atomics.load(sabUint32, i)
  }
  const t2 = performance.now()
  const target = sumToN(upperLimit)
  console.log(`total: ${total}, target: ${target}, match: ${total === target}`)
  console.log("time: ", t2 - t1)
}

main()
Each worker communicates its result by storing it in a SharedArrayBuffer (variable sab) at the index matching its workerId.
The buffer is shared among all worker threads, so reads and writes go through the Atomics API to avoid data races.
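To make the Atomics calls concrete, here is a small single-threaded sketch of the three operations that appear in this post. There is no contention here, so it only illustrates the call shapes and results, not the race protection itself. The names demoSab and demoView are placeholders of mine.

```javascript
// 2 slots * 4 bytes each, viewed as unsigned 32-bit integers
const demoSab = new SharedArrayBuffer(8)
const demoView = new Uint32Array(demoSab)

Atomics.store(demoView, 0, 40)           // write 40 into slot 0
Atomics.add(demoView, 0, 2)              // atomic read-modify-write: slot 0 becomes 42
const slot0 = Atomics.load(demoView, 0)  // read slot 0
const slot1 = Atomics.load(demoView, 1)  // untouched slots start at 0

console.log(slot0, slot1) // 42 0
```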
To ensure we only return from main() after every worker is done, we create one promise per worker that resolves when the main thread receives a “done” from it, and add that promise to the workerStatuses array.
By using await Promise.all(workerStatuses) we ensure we don’t return until all workers confirm completion.
Checking that the message is === “done” isn’t necessary here, as we only expect each worker to respond once, after it has stored its sum, but if you had multiple communication events, this is one way to go about it.
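As a rough illustration of the "multiple communication events" case, the sketch below dispatches on the message type. The "progress" message type, the optional payload, and the createTracker helper are all hypothetical; only "done" exists in the code above.

```javascript
// Hypothetical protocol: workers send [workerId, type, payload?],
// where type is "progress" or "done". Only "done" is used in this post.
function createTracker(workerCount) {
  const done = new Set()
  const progress = new Map()
  return {
    // would be called from worker.onmessage with e.data
    handle([workerId, type, payload]) {
      switch (type) {
        case "progress":
          progress.set(workerId, payload)
          break
        case "done":
          done.add(workerId)
          break
        default:
          throw new Error(`unknown message type: ${type}`)
      }
    },
    allDone: () => done.size === workerCount,
    progressOf: (workerId) => progress.get(workerId),
  }
}

const tracker = createTracker(2)
tracker.handle([0, "progress", 0.5])
tracker.handle([0, "done"])
const doneAfterOne = tracker.allDone() // false, worker 1 is still running
tracker.handle([1, "done"])
console.log(doneAfterOne, tracker.allDone()) // false true
```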
Worker(s) #
Each worker calculates its part of the total in a round-robin fashion based on its workerId, the upperLimit, and the workerCount: worker k sums k, k + workerCount, k + 2 * workerCount, and so on up to upperLimit.
Once the total is calculated we use the Atomics API to store the value into the array buffer, which is viewed as a list of uint32’s. Array buffers are just blocks of memory with no associated type (besides []byte I suppose), so new Uint32Array tells our code to treat it as a list of numbers, each of which takes up 4 bytes.
// worker.js
onmessage = ((e) => {
  const [workerId, sab, upperLimit, workerCount] = e.data
  const sabUint32 = new Uint32Array(sab)
  let total = 0
  // round-robin: worker k handles k, k + workerCount, k + 2 * workerCount, ...
  for (let i = workerId; i <= upperLimit; i += workerCount) {
    total += i
  }
  // don't _really_ need atomic store here,
  // each worker accesses different section of the buffer
  Atomics.store(sabUint32, workerId, total)
  // Alternatively:
  // Atomics.add(sabUint32, 0, total)
  postMessage([workerId, "done"])
})
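The round-robin split can be sanity-checked without spawning any workers. The sketch below reuses the same loop as the worker in a plain function and confirms that the partial sums add up to the closed form. partialSum and roundRobinTotal are names I made up for this check.

```javascript
// Same loop as the worker body, as a plain function
function partialSum(workerId, upperLimit, workerCount) {
  let total = 0
  for (let i = workerId; i <= upperLimit; i += workerCount) {
    total += i
  }
  return total
}

// Add up what each of the `workerCount` workers would have produced
function roundRobinTotal(upperLimit, workerCount) {
  let total = 0
  for (let id = 0; id < workerCount; id++) {
    total += partialSum(id, upperLimit, workerCount)
  }
  return total
}

console.log(partialSum(0, 10, 5)) // 0 + 5 + 10 = 15
console.log(partialSum(1, 10, 5)) // 1 + 6 = 7
console.log(roundRobinTotal(1_000, 5) === (1_000 * 1_001) / 2) // true
```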
To run this application you of course need a webpage:
<!DOCTYPE html>
<html lang="en">
<script src="main.js"></script>
</html>
Loading this in your browser, you should see the following in the console (alongside the per-worker and timing lines):
total: 500500, target: 500500, match: true
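Timing runs with performance.now(), we can see a speedup as we increase the upper limit of the sum (N): the threads are working! The multi-threaded version carries roughly 120 ms of fixed overhead in these runs, so it only pulls ahead of the single-threaded loop somewhere between N = 10_000_000 and N = 100_000_000.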
| N | multi thread (5 workers), ms | single thread, ms |
|---|---|---|
| 1_000 | 119.65 | 0.0250 |
| 10_000 | 120.74 | 0.1349 |
| 92_682 | overflow begins | overflow begins |
| 100_000 | 126.285 | 1.1850 |
| 1_000_000 | 128.765 | 7.5849 |
| 10_000_000 | 129.135 | 48.1750 |
| 100_000_000 | 191.680 | 395.14 |
| 1_000_000_000 | 792.775 | 3897.515 |
All times are in ms.
single is a for loop that runs N times, with total += i in the body.
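The "overflow begins" row comes from the 4-byte slots: a Uint32 can hold at most 2^32 - 1 = 4_294_967_295, and the sum up to 92_682 is the first to exceed that. The sketch below shows the wraparound for a single slot holding the whole sum. It is a simplification: in the multi-worker version each slot holds only a partial sum, so an individual slot may wrap at a different N.

```javascript
const sumToN = (n) => (n * (n + 1)) / 2

// one 4-byte slot, like a single entry of the shared buffer
const slot = new Uint32Array(1)

slot[0] = sumToN(92_681)
const fits = slot[0] === sumToN(92_681) // 4_294_930_221 still fits

slot[0] = sumToN(92_682)
const wrapped = slot[0] // 4_295_022_903 wraps modulo 2^32

console.log(fits, wrapped) // true 55607
```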