|
This tutorial gives an overview of using the Coverage and Hotspot Analyzer to
check the code coverage of a test suite, and to optimize code by identifying
hotspots.
The sample data is available in the Analyzer's default database
(example_db), which includes experiment data for the Ana-Find
program and the Data::Dump module.
The Example Ana-Find project illustrates the use of hotspot analysis to optimize code.
The Ana-Find script parses dictionary files looking for anagrams. The
project contains numerous experiments, each holding coverage and
hotspot information gathered from the command 'perlcov-perl
ana-find >/dev/null' (perlcov-perl is the replacement
interpreter used by the Analyzer to collect statistical data about
your program at runtime). Experiments in this project get gradually
faster as hotspots are identified and the code is revised.
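To reproduce a comparable workload from a shell, the command can be repeated in a loop. The following is a minimal sketch, not the Analyzer's own mechanism: `repeat_cmd` is a hypothetical helper, and it assumes that perlcov-perl and ana-find are reachable from the current directory. The count of 4 in the usage comment matches the four repetitions reported for this project's experiments.

```shell
# repeat_cmd COUNT COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND the given number of times and
# discards its standard output, mirroring the '>/dev/null' redirection.
repeat_cmd() {
    count=$1
    shift
    i=0
    while [ "$i" -lt "$count" ]; do
        "$@" >/dev/null || return 1   # stop at the first failing run
        i=$((i + 1))
    done
}

# Example (assumes perlcov-perl and ana-find are available):
#   repeat_cmd 4 perlcov-perl ana-find
# In bash, prefixing the call with the 'time' keyword reports the
# wall-clock time for all repetitions together.
```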
Expand the project tree for the Example Ana-Find project by clicking the "+"
on the left. A number of experiments appear, sorted by experiment name, which
happens to be the file's source code control change number.
 With each successive experiment, the runtime (displayed in seconds
in the Runtime column) decreases. The number of subroutines (the # Subs column)
decreases from seven in the first experiment to only two in the last.
One of the most significant performance improvements in the Ana-Find
project can be seen between the experiments "Change 269172" and
"Change 269173". We can see that the test (four repetitions of the
command 'perlcov-perl ana-find >/dev/null') took 129.391 seconds for
Change 269172 and only 90.519 seconds for Change 269173. Ana-Find went
from 34 lines of code (# LOC) to 33, and from five subroutines (#
Subs) to four. We can find out where in the code this optimization
happened by comparing the two versions.
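To put the two runtimes in proportion, the saving can be worked out directly. This is a small sketch using awk; `speedup` is a hypothetical helper name, and the two arguments are the runtimes quoted above for Change 269172 and Change 269173.

```shell
# speedup BEFORE AFTER
# Prints the seconds saved and the percentage reduction in runtime.
speedup() {
    awk -v before="$1" -v after="$2" 'BEGIN {
        saved = before - after
        printf "saved %.3f s (%.1f%% reduction)\n", saved, 100 * saved / before
    }'
}

speedup 129.391 90.519   # prints: saved 38.872 s (30.0% reduction)
```

In other words, this single change removed roughly 30% of the total runtime of the test.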
Click on Change 269173 to select it, then click "Mark for Compare" in
the toolbar, View menu or right-click context menu. This marks the
experiment with a check. Now click on Change 269172 to select it, then
click "Compare Hotspots". While an experiment is selected, applicable
keyboard shortcuts (such as the space bar for "Mark for Compare") are
displayed in the status bar.
The Comparison View window appears showing Change 269172 on the left
and Change 269173 on the right.
The Sub column displays a list of subroutines in the code. The T1
(selected experiment) and T2 (experiment marked for comparison)
columns show the time taken by each subroutine, in seconds.
In the Sub list we see that the 'signature' subroutine has no time
information in the T2 column (corresponding to Change 269173). This is
the missing subroutine we noticed in the Project View. Click on the
signature subroutine to jump to the relevant code in the Change 269172
version.
The "%" column on the left indicates the percentage of time spent on
each line of code. In this case, sub signature took 17% of the total
runtime of the script.
There is no corresponding sub in the Change 269173 version (the
right code display pane is blank), so removing this subroutine
appears to have saved us 9.149 seconds. However, the program still
needs the join that signature performed. Click on the 'find_anagrams'
sub in the top pane to see where it went.
Change 269172 has the following on lines 47 through 49:
 Change 269173 incorporates the join in line 43, so the signature
sub is not needed:
Adding the join to find_anagrams increased the time spent on
that line from 22% to 24% of the total runtime, but this is negligible
compared with the time saved by not calling a separate signature
subroutine.
The Example Data-Dump project shows the evolution of the test suite for
the Data::Dump module. Click the "+" next to Example Data-Dump to expand
the project tree and view the experiments.
The experiments in this project were generated by running `make test`
against successive versions of Data::Dump. In version 0.01 only 75 of
190 lines of code were covered, but by version 1.06 this had increased
to 333 of 378 lines.
 To see if the code coverage of Data::Dump's test suite has improved in
a more recent version, we will download version 1.07 of the module, and
create a new experiment on this codebase.
Data::Dump 1.07 can be downloaded here:
http://cpan.org/authors/id/G/GA/GAAS/Data-Dump-1.07.tar.gz
Unpack the tarball in a convenient directory. In the Analyzer, click
the Add Experiment button.
To keep the naming scheme consistent, we will call this experiment
"1.07", but you can specify any name you like here. For "Working
Directory", select the directory where you have unpacked
Data-Dump-1.07.tar.gz.
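If you prefer to unpack the tarball from a shell, something like the following could be used. It is only a sketch: `unpack_dist` is a hypothetical helper, the destination directory is an arbitrary choice, and the printed path assumes the tarball unpacks into a directory named after itself (the usual CPAN convention). The printed path is what you would then select as the "Working Directory".

```shell
# unpack_dist TARBALL [DEST]
# Hypothetical helper: unpacks a .tar.gz distribution into DEST
# (default: current directory) and prints the resulting directory,
# assuming the tarball unpacks into a directory named after itself.
unpack_dist() {
    tarball=$1
    dest=${2:-.}
    mkdir -p "$dest" || return 1
    tar -xzf "$tarball" -C "$dest" || return 1
    printf '%s/%s\n' "$dest" "$(basename "$tarball" .tar.gz)"
}

# Example (not run here; assumes the tarball has been downloaded):
#   unpack_dist Data-Dump-1.07.tar.gz "$HOME/src"
```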
Unlike the Ana-Find experiments, perlcov-perl does not need to be
specified in the experiment's "Run Command" field. The Analyzer
automatically uses it when 'make test' is invoked. Click OK to run
the experiment.
The "Active Runs" dialog box appears, displaying the standard output of
'make test'. If anything goes wrong with the specified command, you
can see it here. Click "Dismiss" to close the dialog box.
A new experiment named "1.07" should appear which shows us an overview
of the coverage information for the version we've added. We can see it
has 11 files, 10 invocations of perl, 13 subroutines, and 389 lines
of code in total -- 379 of which were covered by 'make test'.
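The line counts quoted for versions 0.01, 1.06 and 1.07 are easier to compare as percentages. The snippet below is a small sketch using awk; `coverage_pct` is a hypothetical helper name, and the inputs are the covered and total line counts given in this tutorial.

```shell
# coverage_pct COVERED TOTAL
# Prints the covered share of lines as a percentage with one decimal.
coverage_pct() {
    awk -v covered="$1" -v total="$2" 'BEGIN {
        printf "%.1f\n", 100 * covered / total
    }'
}

coverage_pct 75 190    # version 0.01, prints 39.5
coverage_pct 333 378   # version 1.06, prints 88.1
coverage_pct 379 389   # version 1.07, prints 97.4
```

By this measure, line coverage rose from about 39.5% in version 0.01 to 88.1% in version 1.06 and 97.4% in version 1.07.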
Select experiment 1.07 and click "View Coverage".
 The Experiment View window appears showing us the coverage information
for experiment 1.07. We see a larger version of the Coverage histogram
shown in the Project View. Though almost all the code has been covered
in the experiment, by clicking on the light green part of the
histogram at the far left (or by sliding the black triangle over to
it), we can focus on the subroutines where coverage is not quite
complete. With "0-89%" selected in the histogram, our Sub list is
limited to "Data::Dump::tied_str" and "Global code in
t/quote-unicode.t" -- the only parts of the module with less than 98%
coverage.
Select "Data::Dump::tied_str" to jump to sub tied_str in the code.
Lines 339 and 340 are marked in light blue (indicating "cold" areas of
the code). If we're curious about why these lines were not covered by
the test, we can hover the mouse pointer over the bolded if
statement on line 335.
 A hover tip appears telling us that the else was never hit because
the initial test was always true. In this particular case, it is
probably not worth changing the test suite to cover this condition.
Have a look through previous versions of Data::Dump to see how the test
suite has been improved over the course of several revisions. The
experiments can be compared the same way we did when examining
different versions of Ana-Find, highlighting where test coverage has
improved.
|