Hacker News
Ask HN: Got any tips for profiling and tuning C?
2 points by ghotli on July 17, 2010 | hide | past | favorite | 5 comments
I'm a casual C developer. Up until recently it's been kind of a love/hate relationship. I mostly attribute that to being ignorant of the breadth of debugging, tuning, and profiling toolsets that are out there.

We use an open source tile rendering engine that's entirely written in C. My goal is to identify its data access patterns. This will help me determine which functions need to be optimized, or how to reorder the data on disk to optimize for those access patterns.

I'm going to eventually end up reading the whole codebase, but I'm certain that there are best practices for determining this kind of information that I am just unaware of. I'm vaguely familiar with gdb and valgrind, but I feel like I'm only scratching the surface of their capabilities.

What kind of tooling is everyone else using these days? My specific use case is on linux, but I'd appreciate tips across the board.

I'm also interested to see if recompiling with llvm and clang would give me any performance increase. I see there are malloc replacements like tcmalloc and hoard. Does anyone have experience with these?



For profiling: usually as a first pass I turn to PG. :) Seriously though, add '-pg' to your CFLAGS and LDFLAGS, recompile, run, and look at the output in gprof. It's a pretty good way to easily identify bottlenecks. You can pipe the output to dot and graph the call graphs, etc. -- but I've found that less useful than running the code that I'm trying to make more performant and studying the first 20-30 lines of the gprof output.

http://sourceware.org/binutils/docs/gprof/Compiling.html#Com...

There are better alternatives as well. But adding -pg first is just so easy, and usually (I've found for my stuff) is enough...

For code discovery: I've experimented with strace as others mention (and ltrace). And there are awesome things like Fenris in theory:

http://lcamtuf.coredump.cx/fenris/devel.shtml

But I could never really get them to work in practice. Still, I learned a lot about what good terminal-level integration could look like by browsing them. At some point I hacked together vim and gdb integration pretty well for my purposes (or, I should say, improved on the clewn project; I'm pretty happy with it). I wonder if others have done similar things. Anyway, I'm curious what others say as well.


AMD CodeAnalyst is free and pretty good if you have an AMD CPU. If you have an Intel processor it still works; it just does less.

The two main data access performance tips are:

1) Make sure your loops work right to left: for an array declared [10][9][8], the innermost loop should iterate over the 8-element dimension, the next over the 9, and the outermost over the 10 (that is the cache-optimal ordering, since C arrays are row-major).

2) Prefer SoA (structure of arrays) to AoS (array of structures). Say you have an array of structures and need to loop over the array updating one field of each structure: you increase cache hits if you instead make a single structure that holds one array per field.
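Both tips can be sketched in C (the array sizes and the particle-style record below are just illustrative):

```c
#include <stddef.h>

enum { NI = 10, NJ = 9, NK = 8, N = 64 };

/* Tip 1: for a[NI][NJ][NK], the rightmost index is contiguous in
 * memory, so it belongs in the innermost loop. */
double sum3(double a[NI][NJ][NK]) {
    double s = 0.0;
    for (size_t i = 0; i < NI; i++)          /* slowest-varying index outermost */
        for (size_t j = 0; j < NJ; j++)
            for (size_t k = 0; k < NK; k++)  /* unit stride: cache-friendly */
                s += a[i][j][k];
    return s;
}

/* Tip 2: AoS interleaves fields; SoA keeps each field contiguous. */
struct aos { float x, y, mass; };            /* array of structures */
struct soa { float x[N], y[N], mass[N]; };   /* structure of arrays */

void scale_mass_aos(struct aos *p, size_t n, float f) {
    for (size_t i = 0; i < n; i++)
        p[i].mass *= f;   /* stride = sizeof(struct aos): drags x and y
                             through the cache even though only mass is used */
}

void scale_mass_soa(struct soa *p, size_t n, float f) {
    for (size_t i = 0; i < n; i++)
        p->mass[i] *= f;  /* stride = sizeof(float): every byte fetched is used */
}
```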


Start with Drepper's "What every programmer should know about memory":

http://people.redhat.com/drepper/cpumemory.pdf


strace might help you, perhaps with the -e trace=file option. Depends on how mapserver is implemented.

It's not clear to me that this is a C-specific problem.


It's not, really; anything needs to be profiled in high-load situations. I'm mostly looking for an overview of the tooling to see if I'm just unaware of a vital tool. Thanks for the strace info, I'll check it out.



