Thursday, March 20, 2014

Microsoft 70-480: Create a web worker process

Exam Objectives


Start and stop a web worker; pass data to a web worker; configure timeouts and intervals on the web worker; register an event listener for the web worker; limitations of a web worker


Quick Overview of Training Materials



Programming in HTML5 with JavaScript and CSS3 - Training Guide - Chapter 9 (a whole 2 pages)
Advanced Windows Store App Development with HTML5 Jump Start - Module 1
MSDN - Introduction to HTML5 Web Workers: The JavaScript Multi-threading Approach

Because the browser user interface and the JavaScript interpreter run in the same execution thread, achieving 'concurrency' has often meant using timers, intervals, or other asynchronous calls to avoid locking up the UI while JavaScript works on something heavy.  However, web workers now add a thread-safe mechanism for concurrent execution in JavaScript programs. The spec does point out that web workers are relatively heavyweight and shouldn't be spawned in great numbers.

JSFiddle demo of web workers. Because I didn't feel like hosting a .js file elsewhere for my workers, I used the inline method.  I found out the hard way that just because a worker is on a different thread, crazy code can still crash the browser (like, oh, I don't know, continuously calculating prime numbers...). So yeah, if you want to try the example from the spec, put some kind of safeguard in place (like while(n<100000) or something):
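
For what it's worth, a bounded version of that prime search might look roughly like the sketch below. The function name findPrimes and the report callback are my own placeholders (not from the spec); inside a real worker the callback would simply forward each prime with postMessage().

```javascript
// Sketch only: a prime search that stops at `limit` instead of running forever.
// `findPrimes` and `report` are hypothetical names used for illustration.
function findPrimes(limit, report) {
  var count = 0;
  var n = 1;
  search: while (n < limit) {        // the safeguard: never go past `limit`
    n += 1;
    for (var i = 2; i <= Math.sqrt(n); i += 1) {
      if (n % i === 0) {
        continue search;             // n has a divisor, try the next number
      }
    }
    report(n);                       // n is prime
    count += 1;
  }
  return count;                      // how many primes were reported
}

// Inside a worker script this could be used as:
//   findPrimes(100000, function (p) { self.postMessage(p); });
```

Because the loop has a hard upper bound, the worker eventually goes quiet on its own instead of hammering the CPU until the tab dies.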



Starting and Stopping the Web Worker


Starting a web worker is really not all that difficult. All you need is a JavaScript file to point the Worker() constructor at:

var worker = new Worker('myWorker.js');

Boom, you created a web worker. Now, if calling an external file is not ideal (like in the demo above), you can create a web worker out of a bunch of text by following a few steps:
  • First, create a <script> element with type="javascript/webworker". Or just "text", it really doesn't matter as long as the browser doesn't interpret it as JavaScript. Give it a unique id (say "worker" for this explanation).
     
  • In your main JS file, create a new variable of type Blob, passing to the Blob constructor the text content of the above <script> tag as an array. Basically, this means you have to wrap it in square brackets:

    var workBlob = new Blob([document.getElementById("worker").textContent]);
     
  • Once the Blob is created, we can use a nifty new function to create a URL from it:

    var blobURL = (window.URL || window.webkitURL).createObjectURL(workBlob);
     
  • Now we just create the worker more or less the same as before:

    var worker = new Worker(blobURL);
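
The steps above can be rolled into one helper, sketched below. The name createInlineWorker and the env parameter are my own inventions; env only exists so the browser objects (Blob, URL, Worker) can be swapped out when trying the function outside a browser, and it defaults to the page's window.

```javascript
// Sketch only: build a worker from a string of source text.
// `createInlineWorker` and `env` are hypothetical names, not part of any API.
function createInlineWorker(sourceText, env) {
  env = env || (typeof window !== "undefined" ? window : globalThis);

  // Wrap the text in an array and hand it to the Blob constructor.
  var blob = new env.Blob([sourceText], { type: "application/javascript" });

  // Use the standard URL object, falling back to the prefixed one.
  var urlApi = env.URL || env.webkitURL;
  var blobURL = urlApi.createObjectURL(blob);

  // From here on it is the same as pointing Worker() at a file.
  var worker = new env.Worker(blobURL);
  return { worker: worker, blobURL: blobURL };
}

// In a page, assuming a <script id="worker"> element like the one above:
//   var made = createInlineWorker(document.getElementById("worker").textContent);
//   made.worker.postMessage("Hello World");
```

Returning the blob URL alongside the worker makes it possible to reuse the same URL for more workers later.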
     

Whatever code is sitting in "myWorker.js" will run until you terminate the worker process by either calling  worker.terminate() from the main thread, or by sending a message to the worker that instructs it to call self.close(). One point that is important to note is that if you want to maintain control of your workers, you have to keep track of them (much like timing intervals on the window).  In the above example, if we call var worker = new Worker(blobURL); then whatever is in our <script> tag will be executed in another thread, perfect. But if we call worker = new Worker(blobURL); again, then the first worker will continue to execute and the variable worker will only point to the second web worker, so we won't be able to communicate with or terminate the first worker.  The above demo avoids this by creating an array of workers, so each new worker is pushed onto the array and they are all accessible by the array index.
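
Here is a rough sketch of that bookkeeping idea. The names createWorkerRegistry, start, stop, and stopAll are hypothetical (the demo just uses a plain array), and the spawn argument is assumed to be any function that returns a new worker.

```javascript
// Sketch only: keep every worker reachable so none of them gets orphaned.
// `createWorkerRegistry` and its method names are hypothetical.
function createWorkerRegistry(spawn) {
  var workers = [];

  function stop(index) {
    var w = workers[index];
    if (!w) {
      return false;            // unknown index or already stopped
    }
    w.terminate();             // kill the worker from the main thread
    workers[index] = null;     // keep indexes stable, drop the reference
    return true;
  }

  return {
    start: function () {
      workers.push(spawn());
      return workers.length - 1;   // the index is the handle for this worker
    },
    get: function (index) {
      return workers[index] || null;
    },
    stop: stop,
    stopAll: function () {
      var stopped = 0;
      for (var i = 0; i < workers.length; i += 1) {
        if (stop(i)) {
          stopped += 1;
        }
      }
      return stopped;
    }
  };
}

// In a page:
//   var registry = createWorkerRegistry(function () { return new Worker(blobURL); });
//   var id = registry.start();
//   registry.stop(id);
```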

If you want to see the thread creation/destruction as it happens, you have to get a little fancier than task manager. I downloaded and installed Process Monitor v3.1 and filtered for Chrome processes and thread operations, and when I created and killed a worker with the above demo, it looked like this:



Passing messages to and from the worker


Messages are passed between the main processing thread and the web worker using the postMessage() method and the onmessage event.  In the main JavaScript file, we would pass a message to a web worker like this:

worker.postMessage("Hello World");


Within the web worker, we would define a handler function for the onmessage event. When the main program posts a message, it will trigger this event.  In the code below, the contents of the message are stored in a string variable:

var rcvdMessage = "";
self.onmessage = function (e) { rcvdMessage = e.data; };


Now that the web worker has our message, what to do with it? Chances are we want the web worker to do SOMETHING, so we'll have it send us a message to confirm it got our message:

self.postMessage("You said: " + rcvdMessage);
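
One thing to watch: code at the top level of the worker script runs before any message has arrived, so the reply really needs to be sent from inside the handler. A minimal sketch of the worker side, assuming a hypothetical helper named installEchoHandler that takes the worker's global scope as its argument:

```javascript
// Sketch only: the worker-side half of the conversation.
// `installEchoHandler` is a hypothetical helper; `scope` stands in for `self`.
function installEchoHandler(scope) {
  scope.onmessage = function (e) {
    // Reply from inside the handler, once e.data actually holds the message.
    scope.postMessage("You said: " + e.data);
  };
}

// Inside a real worker script this would be called as:
//   installEchoHandler(self);
```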


Finally, we want our main JavaScript to listen for messages back from the worker. This looks a lot like what the worker did to listen for messages from the main page. In this case, we're just going to do an alert with the message contents:

worker.onmessage = function (e) { alert(e.data); };


Which should just tell us "You said: Hello World" ... in case we forgot.  Messages can also be objects (JSON-style data), not just strings.

The article The Basics of Web Workers points out that message data is copied from the main thread to the web worker.  What this means is that large messages (think tens of megabytes) can create a lot of overhead. One way to improve performance with these large chunks of data is to use a transferable object (spec). A transferable object can be an array buffer or a canvas proxy. The array buffer lets you pass a large amount of data, and the canvas proxy turns over control of a canvas element to another document (which includes a web worker, but could also be an iframe from another origin) (cite). Transferable objects aren't supported in every browser; in unsupported browsers the fallback is copying (structured cloning). Passing an array buffer looks like this:

worker.postMessage(arraybuffer, [arraybuffer]);


Using a transferable object involves a slightly different call to postMessage() than a normal message: the objects to transfer are listed in an array passed as the second argument. Transferable objects act a little like pass by reference, except the original arraybuffer is cleared (its byteLength drops to 0) once it is transferred to the web worker.
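
That "cleared" behavior also gives a cheap way to tell whether a transfer really happened. A small sketch, assuming a hypothetical helper named sendBuffer and a target that is a worker (or anything else with a postMessage() method):

```javascript
// Sketch only: send a buffer as a transferable and report what happened.
// `sendBuffer` is a hypothetical helper, not part of the worker API.
function sendBuffer(target, buffer) {
  var bytes = buffer.byteLength;

  // Second argument: the list of objects whose ownership moves to the receiver.
  target.postMessage(buffer, [buffer]);

  // After a real transfer the sender's copy is empty (byteLength 0).
  // If the buffer still has its contents, it was copied instead.
  return {
    bytes: bytes,
    transferred: bytes > 0 && buffer.byteLength === 0
  };
}

// In a page:
//   var result = sendBuffer(worker, new ArrayBuffer(32 * 1024 * 1024));
//   if (!result.transferred) { /* the browser fell back to copying */ }
```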


Web workers and timers


Web workers can set their own timers using the following syntax:

self.setInterval(function () {}, timeout);
self.setTimeout(function () {}, timeout);


These timers are unique to the web worker, and when the web worker is terminated, they go away. They are not seen at all by the main window (I added an interval sniffer extension to Chrome, but it didn't see the intervals created by the above JSFiddle demo).
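
As a slightly fuller sketch, here is a worker-side interval that reports back to the page and then cleans up after itself. The helper name startTicker and the "tick N" message text are made up for illustration; scope stands in for the worker's self.

```javascript
// Sketch only: an interval inside a worker that stops itself after maxTicks.
// `startTicker` and the "tick N" messages are hypothetical.
function startTicker(scope, intervalMs, maxTicks) {
  var ticks = 0;
  var id = scope.setInterval(function () {
    ticks += 1;
    scope.postMessage("tick " + ticks);   // report progress to the main page
    if (ticks >= maxTicks) {
      scope.clearInterval(id);            // stop the timer from inside the worker
    }
  }, intervalMs);
  return id;
}

// Inside a real worker script:
//   startTicker(self, 1000, 5);
```

Calling worker.terminate() from the page would stop the ticking too, since the timer dies with the worker.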


Limitations of web workers


Web workers have some limitations in what they are able to do. Workers do not have access to the DOM (node objects, document objects, etc).  Their access to the window object is very limited (they can use the timing functions above, plus the atob(), btoa(), and dump() methods), and it is in a different scope (that is, it is NOT the same window object as the rest of the page) (MozDN). Workers also cannot use local storage, since it is exposed only on the page's window object.

Web workers are able to import external .js files using the importScripts() method, though these scripts would still be subject to the same limitations as the web worker itself (at least while executed inside the web worker).  The web worker can also read from the navigator object, which includes descriptive and state information on the browser.  
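
If you want to see these limitations for yourself, a quick probe like the sketch below can be run both in the page and inside a worker. The helper name describeScope and the property names it returns are my own; it only uses typeof checks, so it never touches anything that isn't there.

```javascript
// Sketch only: report which of the features discussed above a scope offers.
// `describeScope` and its result keys are hypothetical names.
function describeScope(scope) {
  return {
    hasDocument: typeof scope.document !== "undefined",          // DOM access
    hasLocalStorage: typeof scope.localStorage !== "undefined",  // local storage
    canImportScripts: typeof scope.importScripts === "function", // worker-only loader
    hasNavigator: typeof scope.navigator !== "undefined",        // browser info
    hasTimers: typeof scope.setTimeout === "function" &&
               typeof scope.setInterval === "function"
  };
}

// Inside a worker script:
//   self.postMessage(describeScope(self));
// In the page, for comparison:
//   console.log(describeScope(window));
```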

And one final reminder: even though a web worker won't lock up your browser UI, JavaScript that would crash the browser in the main window will still crash the browser in a web worker!

