Firefox on OSX with dtrace – Part II

March 2nd, 2009
nonken

In the last article I have given an introduction on how to get dtrace running in your development environment, how to compile firefox and how a simple dtrace script looks like.
Today we will take a look at how to write more useful dtrace scripts and I will show how to analyze the resulting data in a simple dojox.charting chart.

Dtrace is a very powerful tool to take a deep and exact look into your systems internals. For us webdevelopers it is a great tool if want to investigate performance issues or bottlenecks within the browser.

A few dtrace details

Dtrace works by using probes provided by so called “providers” which are scattered throughout your system and also in important places in firefox. When starting a dtrace session those probes get activated and you can handle the information given by the probes.

Besides that, dtrace’s internal language D gives you many options of handling the data, making aggregations, predicative constructs and other features to give you a good view of what is happening.

Probe activation

To activate a dtrace probe first of all you need to know what probes are available on your system, in our case which probes are available through firefox. A simple

1
$ dtrace -l | grep -i "javascript"

should return a list of available probes looking like this:

1
2
3
4
5
6
7
8
9
24728 javascript13762    libmozjs.dylib                        js_Execute execute-done
24729 javascript13762    libmozjs.dylib             jsdtrace_execute_done execute-done
24730 javascript13762    libmozjs.dylib                        js_Execute execute-start
24731 javascript13762    libmozjs.dylib            jsdtrace_execute_start execute-start
24732 javascript13762    libmozjs.dylib                      js_Interpret function-args
24733 javascript13762    libmozjs.dylib            jsdtrace_function_args function-args
24734 javascript13762    libmozjs.dylib                      js_Interpret function-entry
24735 javascript13762    libmozjs.dylib           jsdtrace_function_entry function-entry
...

The first column is the probe ID, the second the name of the probe provider, the third the location of the probe, the fourth the name of the program function the probe is located at, and the last is the probe name.
Now when you want to activate a probe you use following schema:

1
provider:module:function:name

So to activate the “function-entry” probe we can write

1
javascript13762:libmozjs.dylib:jsdtrace_function_entry:function-entry

This is long to write, especially in our case since we would want to activate all function-entry probes. Se we can shorten this to

1
javascript*:::function-entry

A simple dtrace script

Now that we know how to activate a probe lets take a look at the dtrace script I was showing in the last blogpost:

1
2
sudo dtrace -n javascript*:::function-entry'{ @[copyinstr(arg1), copyinstr(arg2)] = count(); }'
    -n END'{ printa("n%-6s %-35s  %5@u", @); }'

lets split the dtrace call up in its most important parts:

1
sudo dtrace

here we execute dtrace with root privileges

1
-n javascript*:::function-entry'{ @[copyinstr(arg1), copyinstr(arg2)] = count(); }'

The parameter “-n” tells dtrace that we are about to enable a probe matching the name following the parameter. This is “javascript*:::function-entry”.

Now after enabling the probe, we are telling dtrace what to do with the information we get everytime that probe gets fired. In this case we do an aggregation on the times a function gets called. Similar to an associative array we store a “key” and call the dtrace internal function count() to increment the counter for every specific function call.

When activating a probe and that probe gets called you have access to several probe specific arguments which you refer to using arg0, arg1, arg2, argX. The function-entry arguments are:

arg0: filename
arg1: classname
arg2: funcname

Essentially in the above example, we are storing the class name and function name as the key for the aggregation.

The last unclear bit is the copyinstr call on both arg1 and arg2. copyinstr is used to copy the data from the user-space to the kernels address space. In short, copyinstr makes the information available to the dtrace script so we can output the info in a usable format.

The last part of the command actually is also activating a probe

1
-n END'{ printa("n%-6s %-35s  %5@u", @); }'

Here we are activating the dtrace internal END probe

1
END

An exact same way of writing this would be:

1
dtrace:::END

which gets called at the very end of the dtrace execution. We use the dtrace function “printa” do return the data aggregation in a human readable format.
“printa” almost works the same as “printf”, most developers should know and basically takes a format string as the first argument and aggregation variable as the second string. In this example we use following format string:

1
"n%-6s %-35s  %5@u"

Our aggregation variable has two touple identifiers, the classname and the function name. As a third available parameter we have the aggregation value itself (@[copyinstr(arg1), copyinstr(arg2)] = count();).

At first we make sure we print a “n” for each new element in the aggregation, then left-align the first touple identifier (@[copyinstr(arg1), copyinstr(arg2)] = count();), fill up missing spaces up until 6 characters. The same we do with the second touple identifier (@[copyinstr(arg1), copyinstr(arg2)] = count();), though fill up more remaining space (up to 35 whitespaces) and at the end we display the actual aggregate information (@[copyinstr(arg1), copyinstr(arg2)] = count();). Note the “@” in “%5@u” which is used to access the aggregation information (count();).

To get a full overview of possibilities to format your aggregation using printa have a look at the dtrace documentation.

A step further

When debugging web-applications, we don’t want to type command line scripts every time but rather want to use a set of ready made scripts. Lets put the previous script into a file called “functioncall_count.d”:

1
2
3
4
5
6
7
8
9
javascript*:::function-entry
{
    @[copyinstr(arg1), copyinstr(arg2)] = count();
}

dtrace:::END
{
    printa("n%-6s %-35s  %5@u", @);
}

And execute the script using

1
$ sudo dtrace -s functioncall_count.d

When you load up a webpage, execute a few javascript function calls and then press CTRL+C you should get the same list of function calls as we did before.

Hands on optimisation

Now that we have a little bit of insight into how dtrace works lets look at a bottleneck occuring very often when building web applications: interaction with the DOM

A very often found bottleneck is the miss usage of appending DOM nodes into the DOM when being in an iteration.

Lets take a look at an example:

1
2
3
4
5
6
7
8
9
10
function buildMenu(){
    var cnt = 1000, node
        resultNode = document.getElementById("resultNodes");
    while (cnt>0){
        node = document.createElement("li");
        node.innerHTML = "Iteration "+cnt;
        resultNode.appendChild(node);
        cnt--;
    }
}

and compare the same with following code:

1
2
3
4
5
6
7
8
9
10
11
12
function buildMenu(){
    var cnt = 1000, node
        resultNode = document.getElementById("resultNodes"),
        frgm = document.createDocumentFragment();
    while (cnt>0){
        node = document.createElement("li");
        node.innerHTML = "Iteration "+cnt;
        frgm.appendChild(node);
        cnt--;
    }
    resultNode.appendChild(frgm);
}

Lets write a simple dtrace script which allows us to measure the execution time of such a iteration:

1
2
3
4
5
6
7
8
9
10
11
12
13
javascript*:::function-entry
/copyinstr(arg2) == $$1/
{
        self->ts = timestamp;
}

javascript*:::function-return
/copyinstr(arg2) == $$1/
{
        @times["time: ", copyinstr(arg2)] = avg(timestamp-self->ts);
        @counts["counts: ", copyinstr(arg2)] = count();
        self->ts=0;
}

At first we see that our dtrace script activates two probes or the javascript* provider:

function-entry
function-return

1
2
3
4
5
javascript*:::function-entry
/copyinstr(arg2) == $$1/
{
        self->ts = timestamp;
}

We see two new things:

1
/copyinstr(arg2) == $$1/

and

1
self->ts = timestamp;

Dtrace doesn’t know control structuures such as if, else, if else but is using a predicative syntax.
Basically all we are saying here is: “If copyinstr(arg2) equals the passed first parameter from the shell, execute the probe.”. This allows us to only watch specific functions within application execution and can save us a lot of time, analyzing data returned from a profiling/debuggin session.

The second unknown construct is self->ts = timestamp.
“self” allows us to store data in thread local variables so we can be sure that there won’t be any race conditions in multi threaded applications. This is not really an issue in the JavaScript environment but nevertheless its good practice to work this way. “timestamp” gives us the current timestamp in nanoseconds, this is way more accurate than most of the todays browser profilers.

Now the code for our second probe looks like following:

1
2
3
4
5
6
7
javascript*:::function-return
/copyinstr(arg2) == $$1/
{
        @times["time: ", copyinstr(arg2)] = avg(timestamp-self->ts);
        @counts["counts: ", copyinstr(arg2)] = count();
        self->ts=0;
}

We are creating two aggregations on the function name. One to calculate the average time of execution for the function and one for the amount of times that function got called.

See it in action

Lets take a look at a video demonstration so you can see what difference each of the two approaches make in your application.


JavaScript profiling with dtrace from uxebu on Vimeo.

To watch the video with better quality visit the vimeo site.

Results

After profiling the two different approaches with dtrace we can clearly see that using DocumentFragment within an iteration is faster than appending the node to the DOM straight away.

Appending nodes using DocumentFragment: 252ms
Appending nodes to the DOM within the iteration: 260ms

In our example, the complexity within the loop was still very low. If you are dealing with more complex constructs the performance gain can be significant.

What’s next

Until now we have seen ways to compare different implementations of the same functionality – meaning that we already have to know about a bottleneck. In the next article I will explain how we can spot potential weak spots in your application code and show you how you can use dtrace to make sure that your code optimizations are as effective as they can be.

We at uxebu use these techniques, when we are investigating peformance bottle necks, to find the most efficient way to interact with the DOM.

Comments

Leave a Reply