dd story

One rather interesting difference between Linux and Solaris is how a command like dd if=/dev/zero of=/dev/null bs=900M count=1 behaves, especially if the box has something like 256M of RAM. Solaris behaves rather naturally: if swap is big enough, it swaps heavily and becomes less responsive. Linux, on the other hand, finishes unbelievably fast and easily handles sizes even bigger than total RAM + swap. I got curious about how that happens.
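For anyone who wants to poke at this without dd itself, here is a minimal C sketch of what that command boils down to: one page-aligned buffer, one read() from /dev/zero, one write() to /dev/null. The helper name copy_zero_once is made up for this illustration, and this is not dd's actual source, just an approximation of its behaviour with count=1.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, roughly "dd if=/dev/zero of=/dev/null bs=SIZE count=1".
 * Returns the number of bytes read, or -1 on error.
 */
static ssize_t copy_zero_once(size_t size)
{
	void *buf = NULL;
	ssize_t got = -1;
	int in, out;
	long page = sysconf(_SC_PAGESIZE);

	assert(size > 0);

	/* Page-align the buffer, as dd does. */
	if (page <= 0 || posix_memalign(&buf, (size_t)page, size) != 0)
		return -1;

	in = open("/dev/zero", O_RDONLY);
	out = open("/dev/null", O_WRONLY);
	if (in >= 0 && out >= 0) {
		/* A single read, like count=1; a short read is reported as-is. */
		got = read(in, buf, size);
		if (got > 0 && write(out, buf, (size_t)got) != got)
			got = -1;
	}

	if (in >= 0)
		close(in);
	if (out >= 0)
		close(out);
	free(buf);
	return got;
}
```

Calling copy_zero_once() with a size larger than physical RAM and timing it should show the difference described above, though the result depends heavily on the kernel and its version.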

There was no significant magic in userland (in fileutils, where dd came from) other than page alignment of the buffer. So I expected the worst and looked into the kernel. And my guess was correct: drivers/char/mem.c contained code like this:

/*
 * For fun, we are using the MMU for this.
 */
static inline size_t read_zero_pagealigned(char __user * buf, size_t size)
{
	struct mm_struct *mm;
	struct vm_area_struct * vma;
	unsigned long addr=(unsigned long)buf;

	mm = current->mm;
	/* Oops, this was forgotten before. -ben */
	down_read(&mm->mmap_sem);

	/* For private mappings, just map in zero pages. */
	for (vma = find_vma(mm, addr); vma; vma = vma->vm_next) {
		unsigned long count;

		if (vma->vm_start > addr || (vma->vm_flags & VM_WRITE) == 0)
			goto out_up;
		if (vma->vm_flags & (VM_SHARED | VM_HUGETLB))
			break;
		count = vma->vm_end - addr;
		if (count > size)
			count = size;

		zap_page_range(vma, addr, count, NULL);
		zeromap_page_range(vma, addr, count, PAGE_COPY);

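So Linux copies nothing at all: it just maps the kernel's zero page, copy-on-write, all over the place in the userland address space. To see the difference for yourself, add the conv=ibm argument to dd. The conversion makes dd write to the buffer, so every page has to be allocated for real.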
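The copy-on-write behaviour can also be observed from plain userland. The sketch below is only an illustration under assumptions, not the /dev/zero code path: it maps anonymous private memory, reads one byte from every page, and then writes to the first page. On Linux kernels that back untouched anonymous reads with the shared zero page, only the written page should need a frame of its own. The helper name zero_page_demo is invented for this example.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical demo. Returns 0 if every page reads as zero and a write to
 * the first page stays confined to that page, -1 otherwise.
 */
static int zero_page_demo(size_t npages)
{
	long page = sysconf(_SC_PAGESIZE);
	volatile unsigned char *p;
	size_t len, i;
	int ok = 1;
	void *map;

	assert(npages > 1);
	if (page <= 0)
		return -1;

	len = npages * (size_t)page;
	map = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED)
		return -1;
	p = map;

	/* Read-only pass: touches every page without dirtying any of them. */
	for (i = 0; i < npages; i++)
		if (p[i * (size_t)page] != 0)
			ok = 0;

	/* First write: this page now needs a private copy. */
	p[0] = 0xAA;

	/* The write must be visible on page 0 and nowhere else. */
	if (p[0] != 0xAA)
		ok = 0;
	for (i = 1; i < npages; i++)
		if (p[i * (size_t)page] != 0)
			ok = 0;

	munmap(map, len);
	return ok ? 0 : -1;
}
```

Watching the process's resident set size (for instance in top) between the read pass and a loop that writes to every page is one way to see how much memory really gets allocated; the exact numbers will vary by kernel version.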

PS: FreeBSD behaves the same way Solaris does, so it seems this was one of Linus's easter eggs :).

About

nike
