Fun source code facts

A while ago, for my own amusement, I went through the Solaris source base and searched for the source files with the most lines. For some unknown reason this popped into my head yesterday, so I decided to try it again. Here are the top 10 longest files in OpenSolaris:

Length  Source File
29944   usr/src/uts/common/io/scsi/targets/sd.c
25920   [closed]
25429   usr/src/uts/common/inet/tcp/tcp.c
22789   [closed]
16954   [closed]
16339   [closed]
15667   usr/src/uts/common/fs/nfs4_vnops.c
14550   usr/src/uts/sfmmu/vm/hat_sfmmu.c
13931   usr/src/uts/common/dtrace/dtrace.c
13027   usr/src/uts/sun4u/starfire/io/idn_proto.c

You can see that some of the largest files are still closed source. Note that the length of a file doesn't necessarily indicate anything about the quality of its code; this is just idle curiosity. Knowing the quality of online journalism these days, I'm sure this will get turned into "Solaris source reveals completely unmaintainable code" ...
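For anyone who wants to try a ranking like this on their own tree, here is a minimal sketch in Python. It is not the method used for the list above; the function names (`count_lines`, `longest_files`) and the choice to look only at `.c` files are assumptions made for illustration.

```python
import os


def count_lines(path):
    """Count newline characters in a file, the same way `wc -l` does."""
    with open(path, "rb") as f:
        # Read in fixed-size chunks so very large files don't need to fit in memory.
        return sum(chunk.count(b"\n") for chunk in iter(lambda: f.read(1 << 16), b""))


def longest_files(root, suffixes=(".c",), top=10):
    """Return the `top` longest files under `root` as (line_count, path) pairs.

    `suffixes` is an assumed filter; widen it to include other file types.
    """
    results = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffixes):
                path = os.path.join(dirpath, name)
                results.append((count_lines(path), path))
    # Longest first; break ties by path so the output is stable.
    results.sort(key=lambda r: (-r[0], r[1]))
    return results[:top]


if __name__ == "__main__":
    for length, path in longest_files("usr/src"):
        print(f"{length:7d}  {path}")
```

Running it from the top of a source checkout prints a table in the same shape as the one above, with lengths first and paths second.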

After looking at this, I decided a much more interesting question was "which source files are the most commented?" To answer this question, I ran every source file through a script I found that counts the number of commented lines in each file. I filtered out files that were less than 500 lines long, and ran the results through another script to calculate the percentage of lines that were commented. A line that has a comment along with source counts as a commented line, so some of the ratios were quite high. I filtered out files that were mostly tables (like uwidth.c), as these comments didn't really count. I also ignored header files, because they tend to be far more commented than the implementation itself. In the end I had the following list:

Percentage  File
62.9%       usr/src/cmd/cmd-inet/usr.lib/mipagent/snmp_stub.c
58.7%       usr/src/cmd/sgs/libld/amd64/amd64unwind.c
58.4%       usr/src/lib/libtecla/common/expand.c
56.7%       usr/src/cmd/lvm/metassist/common/volume_nvpair.c
56.6%       usr/src/lib/libtecla/common/cplfile.c
55.6%       usr/src/lib/libc/port/gen/mon.c
55.4%       usr/src/lib/libadm/common/devreserv.c
55.1%       usr/src/lib/libtecla/common/getline.c
54.5%       [closed]
54.3%       usr/src/uts/common/io/ib/ibtl/ibtl_mem.c
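The counting rules described above (any line touched by a comment counts, files under 500 lines are skipped) can be sketched in Python roughly as follows. This is an illustration under assumptions, not the script that produced the list: the names `commented_lines` and `comment_percentage` are made up here, and the scanner only understands C-style `/* */` and `//` comments plus simple string and character literals.

```python
def commented_lines(text):
    """Return (commented, total) line counts for C source text.

    A line counts as commented if any part of a comment appears on it,
    including lines that also carry code and blank lines inside a block comment.
    """
    lines = text.splitlines()
    in_block = False  # True while inside an unterminated /* ... */ comment
    commented = 0
    for line in lines:
        has_comment = in_block
        quote = None  # the active quote character while inside a literal
        i, n = 0, len(line)
        while i < n:
            ch = line[i]
            if in_block:
                has_comment = True
                end = line.find("*/", i)
                if end == -1:
                    break  # comment continues on the next line
                in_block = False
                i = end + 2
            elif quote:
                if ch == "\\":
                    i += 2  # skip the escaped character
                    continue
                if ch == quote:
                    quote = None
                i += 1
            elif ch in "\"'":
                quote = ch
                i += 1
            elif line.startswith("/*", i):
                in_block = True
                has_comment = True
                i += 2
            elif line.startswith("//", i):
                has_comment = True
                break  # rest of the line is a comment
            else:
                i += 1
        if has_comment:
            commented += 1
    return commented, len(lines)


def comment_percentage(text, min_lines=500):
    """Percentage of commented lines, or None for files shorter than min_lines."""
    commented, total = commented_lines(text)
    if total < min_lines:
        return None
    return 100.0 * commented / total
```

Filtering out table-heavy files and headers, as described above, would still have to be done separately (for example by skipping `.h` files and checking suspicious results by hand).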

Now, when I write code I tend to hover in the 20-30% comments range (my best of those in the gate is gfs.c, which with Dave's help is 44% comments). Some of the above are rather over-commented (especially snmp_stub.c, which likes to repeat comments above and within functions).

I found this little experiment interesting, but please don't base any conclusions on these results. They are for entertainment purposes only.
