How deep a recurse?

Chris has been exploring various limits of a lab M8000. Inspired (well, umm, also maybe board on a conf call) by this and prompted by a Twitter update on Google and recursion from Alec (don't recall if I read it first on his blog or Twitter) got me thinking about how deep you can recurse on a modern system? So wrote some code. The marginally tricky bit was setting up the alternate stack to handle the signal on with sigaltstack.

#include < unistd.h >
#include < stdio.h >
#include < signal.h >
#include < stdlib.h >
#include < sys/resource.h >
#include < sys/mman.h >

static void handler();
static void recurse(void);

static int depth = 0;

int main()
{
  struct sigaction act;

  struct rlimit rlp;
  stack_t ss;

  getrlimit(RLIMIT_STACK, &rlp);
  printf("RLIMIT_STACK = %u:%u\\n", rlp.rlim_cur, rlp.rlim_max);

  act.sa_handler = handler;
  sigemptyset(&act.sa_mask);
  act.sa_flags = 0;
  act.sa_flags |= SA_RESETHAND|SA_SIGINFO|SA_ONSTACK;

  if (sigaction(SIGSEGV, &act, NULL) < 0) {
    perror("sigaction failed");
    exit(1);
  }

  if ((ss.ss_sp = mmap(NULL, SIGSTKSZ, PROT_READ | PROT_WRITE,
		      MAP_PRIVATE | MAP_ANON, -1, 0)) == MAP_FAILED) {
    perror("mmap failed");
    exit(1);
  }

  ss.ss_size = SIGSTKSZ;
  ss.ss_flags = 0;

  if (sigaltstack(&ss, NULL) < 0) {
    perror("sigaltstack failed");
    exit(1);
  }

  recurse();
}

static void recurse(void)
{
  depth++;
  recurse();
}

void handler(void)
{
  printf("depth = %u\\n", depth);
  exit(0);
}

1st attempt on a Macbook with OSX gave a number of 524030. We then moved to Solaris Nevada 110 running on one of our x86 lab system. Also tried the S10 Sparc stable server. The Sparc numbers are a lot smaller than the x86 numbers. The Sparc numbers are similar on Solaris 10 or Nevada. What a great microbenchmark this would make to base purchases on, how deep can a system recurse with no function arguments passed. Many purchasing decisions have been made on the results of benchmarks of similar relevance to the business problem in hand so lets not dismiss totally. Anyway, back to reality.

On a Solaris 10 Sparc box we get

ebusy(5.10)$ cc -o recurse recurse.c                               
ebusy(5.10)$ ./recurse
RLIMIT_STACK = 1000000000:1000000000
depth = 10416627
ebusy(5.10)$ cc  -m64 -o recurse recurse.c   
ebusy(5.10)$ ./recurse                       
RLIMIT_STACK = 1000000000:1000000000
depth = 5681792
ebusy(5.10)$ uname -a
SunOS ebusy 5.10 Generic_137137-09 sun4u sparc SUNW,Sun-Fire

and on the Nevada x86 lab system we get

exdev(5.11)$ cc  -o recurse recurse.c          
exdev(5.11)$ ./recurse
RLIMIT_STACK = 2147483647:2147483647
depth = 16812966
exdev(5.11)$ cc  -m64 -o recurse recurse.c     
exdev(5.11)$ ./recurse                         
RLIMIT_STACK = 1000000000:1000000000
depth = 62499512
exdev(5.11)$ uname -a
SunOS exdev 5.11 snv_110 i86pc i386 i86pc

I am sure there are some games to play with increasing the hard stack limit or allocating the alternate stack a huge segment of memory and recursing through that as well. However, over 62 million stack frames is adequate for most recursive situations which will complete.

Interesting that compilation with -x02 or higher leads to assembler that does nothing and the code just sits in a loop without ever calling the function below.

exdev(5.11)$ pstack `pgrep recur`
17659:  ./recurse
 0000000000400f80 recurse ()
exdev(5.11)$ 

So a few interesting questions that will have to wait for the next conf call where I don't need to pay too much attention.

Comments:

ALl the includes have gone. You build it 64 bit then you will be limited just by the amount of memory in the system.

Posted by Chris Gerhard on August 01, 2009 at 11:18 AM BST #

Thanks Chris, includes fixed. Even a 64 bit build still has an hard rlimit for the stack, though bigger. The second set of results were built 64 bit.

Posted by Clive King on August 01, 2009 at 11:38 AM BST #

Just don't ustack() it :-) . Actually, we'd stop at 2048 frames by default so long way from where you got to...

Posted by Jon Haslam on August 01, 2009 at 03:08 PM BST #

In the spirit of knowing understatement, the bit Chris neglected to impart was the need to change the process resource process.max-stack-size. With that suitably adjusted we get just over 1,000,000,000 frames, before ran out of memory of a 4GB system. Making the stack size larger than memory turns the system into a warm brick for a while as to be expected.

Posted by Clive King on August 03, 2009 at 05:13 AM BST #

Post a Comment:
Comments are closed for this entry.
About

clive

Search

Categories
Archives
« April 2014
MonTueWedThuFriSatSun
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
    
       
Today