Web Workers

“client side js is single threaded”

“erm actually using web-workers you can create multi threaded applications in client side js” - 🤓


Setup

The Web Workers API lets users create dedicated worker threads in the browser to run a specified script.

To use SharedArrayBuffer (the primary way web workers can share memory), the Cross-Origin-Embedder-Policy and Cross-Origin-Opener-Policy headers must be set to require-corp and same-origin respectively.
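A quick way to confirm the headers took effect is the crossOriginIsolated global, which browsers expose on pages and workers. The sketch below is only an illustration: the helper name canShareMemory is made up, and the scope parameter exists purely so the check can be exercised outside a browser.

```javascript
// Hypothetical helper: reports whether shared memory looks usable in `scope`.
// In a browser page or worker, `scope` defaults to the global object.
function canShareMemory(scope = globalThis) {
    const hasSab = typeof scope.SharedArrayBuffer === "function"
    // crossOriginIsolated is expected to be true once the two headers are set
    const isolated = scope.crossOriginIsolated === true
    return hasSab && isolated
}

if (!canShareMemory()) {
    console.warn("SharedArrayBuffer unavailable: check the COOP/COEP headers")
}
```

Running this at the top of main.js should stay silent when the page is served with both headers.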

python -m http.server is typically my go-to for quickly running an http server, but i couldn’t find an easy way to set headers, so i ended up using FastAPI.

# server.py
from fastapi import FastAPI  
from fastapi.staticfiles import StaticFiles  
from starlette.middleware.base import BaseHTTPMiddleware  
from starlette.requests import Request  
from starlette.responses import Response  
import os  
  
app = FastAPI()  
  
class AddHeadersMiddleware(BaseHTTPMiddleware):  
    async def dispatch(self, request: Request, call_next):  
        response = await call_next(request)  
  
        response.headers["Cross-Origin-Embedder-Policy"] = "require-corp"  
        response.headers["Cross-Origin-Opener-Policy"] = "same-origin"  
  
        return response  
  
app.add_middleware(AddHeadersMiddleware)  
  
current_dir = os.path.dirname(os.path.abspath(__file__))  
app.mount("/static", StaticFiles(directory=current_dir), name="static")

now just run fastapi dev server.py to get started.

NOTE: the Web Workers API also doesn’t exist within Node.js (gh issue here), so running node main.js will throw:

let worker = new Worker("worker.js")
ReferenceError: Worker is not defined

Main Thread

Now that we have an environment to use web workers:

Our problem: summing the integers from 1 → n. Each worker stores its partial sum into a shared data structure, and once all workers are done the main thread adds the partial sums together.

This problem is a good dummy example as it’s easy to check the output against (n * (n + 1)) / 2

First we can define the main function that will spawn workers, give them the info needed to do their part of the work, then await and collect their results.

// main.js
function sumToN(n) {
    return (n * (n+1)) / 2
}

async function main() {
    const t1 = performance.now()
    let workerStatuses = []
    const workerCount = 5
    const upperLimit = 1_000

    // allocate 4 bytes/32 bits for each worker to store its sum into
    const bytesToAllocate = 4 * workerCount
    const sab = new SharedArrayBuffer(bytesToAllocate)

    for (let i = 0; i < workerCount; i++) {
        let worker = new Worker("worker.js")

        const p = new Promise((resolve) => {
            worker.onmessage = ((e) => {
                // declare locally instead of leaking implicit globals
                const [workerId, message] = e.data

                // resolve promise upon worker's task completion
                if (message === "done") {
                    console.log(`worker ${workerId} done`)
                    resolve()
                }
            })

            // give the worker the info it needs to complete work
            worker.postMessage([i, sab, upperLimit, workerCount])
        })

        workerStatuses.push(p)

    }

    // wait for all workers to complete; awaiting inside the loop
    // would run the workers one after another
    await Promise.all(workerStatuses)

    const sabUint32 = new Uint32Array(sab)

    let total = 0
    for (let i = 0; i < workerCount; i++) {
        // don't _really_ need atomic load here,
        // all workers are done, only thread that could access it is the main thread
        total += Atomics.load(sabUint32, i)
    }

    const t2 = performance.now()

    const target = sumToN(upperLimit)
    console.log(`total: ${total}, target: ${target}, match: ${total === target}`)
    console.log("time: ", t2-t1)

}

main()

Each worker communicates the result of its work by storing the value in a SharedArrayBuffer (variable sab) at the index matching its workerId.

The buffer is shared amongst all worker threads, so reads and writes go through the Atomics API to avoid data races.
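As a rough illustration of the Atomics calls this post touches, here is a single-threaded sketch. The values are arbitrary, and in real use the other side of each operation would be a different thread.

```javascript
// One 32-bit slot, shared between whoever holds a reference to `sab`
const sab = new SharedArrayBuffer(4)
const view = new Uint32Array(sab)

// write 10 into slot 0
Atomics.store(view, 0, 10)
// add 5 to slot 0; Atomics.add returns the value that was there before
const before = Atomics.add(view, 0, 5)
// read the current value
const after = Atomics.load(view, 0)

console.log(before, after) // 10 15
```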

To ensure we only return from main() after every worker is done, we create a promise per worker that resolves when the main thread receives a “done” from it, and add it to the workerStatuses array. By using await Promise.all(workerStatuses) we ensure we don’t return until all workers confirm completion.

Checking if the message is === “done” isn’t strictly necessary here, as each worker only sends a single message once its sum is stored, but if you had multiple communication events this is one way to go about it.
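The promise-per-worker idea can be tried without a browser. In this sketch waitForDone, fakeWorker and runAll are all made-up names, and fakeWorker is a stand-in object rather than a real Worker: it sends a "progress" message before "done", to show why the message check matters once there is more than one event.

```javascript
// Hypothetical helper: resolves once `worker` posts [id, "done"].
function waitForDone(worker) {
    return new Promise((resolve) => {
        worker.onmessage = (e) => {
            const [workerId, message] = e.data
            // ignore anything that isn't the completion signal
            if (message === "done") {
                resolve(workerId)
            }
        }
    })
}

// Stand-in for a real Worker so the sketch runs anywhere:
// it reports "progress" first, then "done", each on a later tick.
function fakeWorker(id) {
    const worker = { onmessage: null }
    setTimeout(() => worker.onmessage({ data: [id, "progress"] }), 0)
    setTimeout(() => worker.onmessage({ data: [id, "done"] }), 5)
    return worker
}

async function runAll(count) {
    const pending = []
    for (let i = 0; i < count; i++) {
        pending.push(waitForDone(fakeWorker(i)))
    }
    // awaited once, after every worker has been started
    return Promise.all(pending)
}

// resolves to the worker ids in spawn order: 0, 1, 2
runAll(3).then((ids) => console.log(ids))
```

Promise.all keeps results in the order the promises were pushed, regardless of which worker finishes first.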

Worker(s)

Each worker calculates its part of the total in a round-robin fashion based on its workerId, the upperLimit and the workerCount: worker k sums k, k + workerCount, k + 2 * workerCount, and so on up to upperLimit.

Once the total is calculated we use the Atomics API to store the value into the array buffer, which is viewed as a list of uint32’s. Array buffers are just blocks of memory with no associated type (besides []byte I suppose), so new Uint32Array tells our code to treat it as a list of numbers, each of which takes up 4 bytes.
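To make the "untyped block of memory" point concrete, here is a small sketch using a plain ArrayBuffer (the same idea should carry over to SharedArrayBuffer). The buffer size and the value written are arbitrary.

```javascript
// 8 untyped bytes, zero-initialised
const buffer = new ArrayBuffer(8)

// Two views over the same memory, no copying involved
const asBytes = new Uint8Array(buffer)
const asUint32 = new Uint32Array(buffer)

console.log(asBytes.length, asUint32.length) // 8 2

// 0x01010101 is four identical bytes, so byte order doesn't matter here
asUint32[1] = 0x01010101
console.log(Array.from(asBytes).join(",")) // 0,0,0,0,1,1,1,1
```

Writing through the Uint32Array view changes what the Uint8Array view sees, because both are just interpretations of the same 8 bytes.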

// worker.js
onmessage = ((e) => {  
    // declare locally instead of leaking implicit globals
    const [workerId, sab, upperLimit, workerCount] = e.data
    const sabUint32 = new Uint32Array(sab)  
    let total = 0  
  
    for (let i = workerId; i <= upperLimit; i += workerCount) {  
        total += i  
    }  
  
    // don't _really_ need atomic store here,  
    // each worker accesses different section of the buffer
    Atomics.store(sabUint32, workerId, total)  
  
    // Alternatively:  
    // Atomics.add(sabUint32, 0, total)  
    postMessage([workerId, "done"])  
})
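The round-robin split can be sanity-checked without any threads. This sketch reuses the worker's loop in a plain function (partialSum is a name made up here) and confirms the partial sums add up to the closed form.

```javascript
// Same loop as worker.js, minus the shared buffer
function partialSum(workerId, upperLimit, workerCount) {
    let total = 0
    for (let i = workerId; i <= upperLimit; i += workerCount) {
        total += i
    }
    return total
}

const upperLimit = 1_000
const workerCount = 5

const parts = []
for (let id = 0; id < workerCount; id++) {
    parts.push(partialSum(id, upperLimit, workerCount))
}

const total = parts.reduce((a, b) => a + b, 0)
console.log(parts.join(", ")) // 100500, 99700, 99900, 100100, 100300
console.log(total === (upperLimit * (upperLimit + 1)) / 2) // true
```

Worker 0 starts at 0, which adds nothing to the sum but keeps the indexing simple.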

To run this application you of course need a webpage:

<!DOCTYPE html>  
<html lang="en">  
    <script src="main.js"></script>  
</html>

loading this in your browser, you should see the following in the console:

total: 500500, target: 500500, match: true

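When doing performance tests via performance.now() to estimate duration we can see a speedup as we increase the upper limit of the sum (N) - the threads are working! For small N the multi-threaded version loses, since it never drops below roughly 120 ms (presumably the cost of starting the workers).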

N               multi thread (5)    single
1_000           119.65              0.0250
10_000          120.74              0.1349
92_682          overflow begins     overflow begins
100_000         126.285             1.1850
1_000_000       128.765             7.5849
10_000_000      129.135             48.1750
100_000_000     191.680             395.14
1_000_000_000   792.775             3897.515

all times in ms

single is a for loop N times, with total += i in the body
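On the overflow row: 92_682 is the first N whose full sum no longer fits in an unsigned 32-bit integer, which is presumably what that row marks. A small sketch to check this, where UINT32_MAX is just a local constant:

```javascript
const UINT32_MAX = 2 ** 32 - 1

function sumToN(n) {
    return (n * (n + 1)) / 2
}

// Walk N upwards until the closed-form sum stops fitting in 32 bits
let n = 1
while (sumToN(n) <= UINT32_MAX) {
    n++
}
console.log(n) // 92682

// Storing that sum in a Uint32Array silently wraps modulo 2^32
const cell = new Uint32Array(1)
cell[0] = sumToN(n)
console.log(sumToN(n), cell[0]) // 4295022903 55607
```

The wraparound is silent, so a total that looks plausible can still be wrong once N passes this point.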