
Oracle Linux, virtualization, Enterprise and Cloud Management Cloud technology musings

  • October 16, 2011

Containers on Linux

At Oracle OpenWorld we talked about Linux Containers. Here is an example of getting a Linux container going with Oracle Linux 6.1, UEK2 beta and btrfs. This is just an example: it is not released, not production-ready and not bug-free... for those who don't read README files ;-)

This container example uses the existing Linux cgroups features in the mainline kernel (also present in UEK and UEK2) and the lxc tools to create the environments.
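
A quick way to see which cgroup controllers the running kernel offers is to read /proc/cgroups. The snippet below is a minimal sketch and not part of the original setup: the helper name enabled_cgroups is made up for illustration, and it assumes the usual four-column layout of that file (subsys_name, hierarchy, num_cgroups, enabled).

```shell
# Print the names of enabled cgroup controllers.
# Reads text in /proc/cgroups format from stdin (hypothetical helper).
enabled_cgroups() {
    awk '!/^#/ && $4 == 1 { print $1 }'
}

# On a Linux host, list what the running kernel offers.
if [ -r /proc/cgroups ]; then
    enabled_cgroups < /proc/cgroups
fi
```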

Example assumptions :

- Host OS is Oracle Linux 6.1 with UEK2 beta.

- using btrfs filesystem for containers (to make use of snapshot capabilities)

- mounting the fs in /container

- use Oracle VM templates as a base environment

- Oracle Linux 5 containers

I have a second disk on my test machine (/dev/sdb) which I will use for this exercise.

# mkfs.btrfs  -L container  /dev/sdb
# mkdir -p /container
# mount /dev/sdb /container
# mount
/dev/mapper/vg_wcoekaersrv4-lv_root on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
/dev/sda1 on /boot type ext4 (rw)
/dev/mapper/vg_wcoekaersrv4-lv_home on /home type ext4 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
/dev/mapper/loop0p2 on /mnt type ext3 (rw)
/dev/mapper/loop1p2 on /mnt2 type ext3 (rw)
/dev/sdb on /container type btrfs (rw)
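
Before going further it can be worth confirming that /container really is on btrfs, since the snapshot step depends on it. The following is a small sketch under that assumption; fstype_of is a hypothetical helper that parses text in the same format as the mount listing above.

```shell
# Print the filesystem type of a mount point.
# Reads lines in `mount` output format from stdin:
#   <device> on <mountpoint> type <fstype> (<options>)
fstype_of() {
    awk -v mp="$1" '$2 == "on" && $3 == mp && $4 == "type" { print $5 }'
}

# Sample lines taken from the listing above.
sample='/dev/sda1 on /boot type ext4 (rw)
/dev/sdb on /container type btrfs (rw)'

printf '%s\n' "$sample" | fstype_of /container
```

On a live system the same check would be `mount | fstype_of /container`.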



lxc tools installed...


# rpm -qa|grep lxc
lxc-libs-0.7.5-2.x86_64
lxc-0.7.5-2.x86_64
lxc-devel-0.7.5-2.x86_64

lxc tools come with template config files :

# ls /usr/lib64/lxc/templates/

lxc-altlinux lxc-busybox lxc-debian lxc-fedora lxc-lenny
lxc-ol4 lxc-ol5 lxc-opensuse lxc-sshd lxc-ubuntu

I created one for Oracle Linux 5 : lxc-ol5.



Download Oracle VM template for OL5 from http://edelivery.oracle.com/linux. I used OVM_EL5U5_X86_PVM_10GB.

We want to be able to create one environment that can be used in both container and VM mode, to avoid duplicating effort.

Untar the VM template.

# tar zxvf OVM_EL5U5_X86_PVM_10GB.tar.gz

These are the steps needed (to be automated in the future)...

Copy the content of the VM virtual disk's root filesystem into a btrfs subvolume in order to easily clone the base template.

My template configure script defines :

template_path=/container/ol5-template



- create subvolume ol5-template on /container

# btrfs subvolume create /container/ol5-template
Create subvolume '/container/ol5-template'

- loopback mount the Oracle VM template System image / partition
# kpartx -a System.img 
# kpartx -l System.img
loop0p1 : 0 192717 /dev/loop0 63
loop0p2 : 0 21607425 /dev/loop0 192780
loop0p3 : 0 4209030 /dev/loop0 21800205

I need to mount the second partition of the virtual disk image. kpartx sets up a loopback device mapping for each of the virtual disk's partitions, so let's mount loop0p2, which contains the Oracle Linux 5 / filesystem of the template.
# mount /dev/mapper/loop0p2 /mnt
# ls /mnt
bin boot dev etc home lib lost+found media misc mnt opt proc
root sbin selinux srv sys tftpboot tmp u01 usr var
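
If this step gets scripted, the partition to mount could be picked from the kpartx listing instead of by eye. The sketch below assumes that the root filesystem is the largest partition in the image, which holds for the layout shown here but is a heuristic, not something kpartx reports. largest_partition is a hypothetical helper; field 4 of each `kpartx -l` line is the size in 512-byte sectors.

```shell
# Print the /dev/mapper name of the largest partition.
# Reads `kpartx -l` output from stdin; field 4 is the size in sectors.
largest_partition() {
    awk '$2 == ":" && $4 + 0 > max { max = $4 + 0; name = $1 }
         END { if (name != "") print name }'
}

# Sample taken from the kpartx listing above.
sample='loop0p1 : 0 192717 /dev/loop0 63
loop0p2 : 0 21607425 /dev/loop0 192780
loop0p3 : 0 4209030 /dev/loop0 21800205'

part=$(printf '%s\n' "$sample" | largest_partition)
echo "/dev/mapper/$part"
```

For scale, 21607425 sectors of 512 bytes is roughly 10.3 GiB, which matches the 10GB template name.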

Great, now we have the entire template / filesystem available. Let's copy this into our subvolume. This subvolume will then become the basis for all OL5 containers.
# cd /mnt
# tar cvf - * | ( cd /container/ol5-template ; tar xvf - ; )
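
The copy is a classic tar pipe: one tar writes the archive to stdout and a second tar, running in the destination directory, reads it from stdin. One detail worth knowing is that the shell's `*` does not match dotfiles at the top level of the source. The variant below is a sketch, not the command used in this post: copy_tree is a hypothetical helper that archives `.` instead of `*` and adds `p` to preserve permissions. It is demonstrated on throwaway directories so it can be tried without root.

```shell
# Copy a directory tree with a tar pipe, preserving permissions.
# Archiving "." instead of "*" also picks up top-level dotfiles.
copy_tree() {
    src=$1
    dst=$2
    ( cd "$src" && tar cf - . ) | ( cd "$dst" && tar xpf - )
}

# Demonstration on temporary directories (no root needed).
demo_src=$(mktemp -d)
demo_dst=$(mktemp -d)
mkdir -p "$demo_src/etc"
echo "ol5" > "$demo_src/etc/release"
echo "hidden" > "$demo_src/.autorelabel"
copy_tree "$demo_src" "$demo_dst"
ls -A "$demo_dst"
```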

In the near future we will put some automation around the above steps.
# pwd
/container/ol5-template
# ls
bin boot dev etc home lib lost+found media misc mnt opt proc
root sbin selinux srv sys tftpboot tmp u01 usr var
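
As a rough idea of what such automation might look like, here is a dry-run sketch that strings the manual steps together. It is an illustration only, not the tooling that was planned: import_template and run are hypothetical names, and the final umount and `kpartx -d` cleanup steps are additions that do not appear in the walkthrough above. With DRY_RUN=1 the commands are printed rather than executed.

```shell
# Sketch: wrap the manual template-import steps in one function.
# DRY_RUN=1 only prints the commands; set it to 0 to run them as root.
DRY_RUN=1

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# import_template <System.img> <mapper partition> <subvolume path>
import_template() {
    img=$1
    part=$2
    subvol=$3
    run btrfs subvolume create "$subvol"
    run kpartx -a "$img"
    run mount "/dev/mapper/$part" /mnt
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ ( cd /mnt && tar cf - . ) | ( cd $subvol && tar xpf - )"
    else
        ( cd /mnt && tar cf - . ) | ( cd "$subvol" && tar xpf - )
    fi
    # Cleanup steps (assumed, not shown in the original walkthrough).
    run umount /mnt
    run kpartx -d "$img"
}

import_template System.img loop0p2 /container/ol5-template
```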

From this point on, the lxc-create script, using the template config as an argument, should be able to automatically create a snapshot and set up the filesystem correctly.
# lxc-create -n ol5test1 -t ol5
Cloning base template /container/ol5-template to /container/ol5test1 ...
Create a snapshot of '/container/ol5-template' in '/container/ol5test1'
Container created : /container/ol5test1 ...
Container template source : /container/ol5-template
Container config : /etc/lxc/ol5test1
Network : eth0 (veth) on virbr0
'ol5' template installed
'ol5test1' created
# ls /etc/lxc/ol5test1/
config fstab
# ls /container/ol5test1/
bin boot dev etc home lib lost+found media misc mnt opt proc
root sbin selinux srv sys tftpboot tmp u01 usr var
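
The contents of the generated config file are not shown here, so the following is only a guess at its general shape. The "Create a snapshot of ..." message is what `btrfs subvolume snapshot /container/ol5-template /container/ol5test1` prints, so the template script presumably runs that and then writes a config along these lines. The key names are the legacy ones from lxc.conf(5) as used by LXC 0.7.x (later LXC releases renamed several of them); make_lxc_config is a hypothetical helper, and the tty count of 4 is an assumption.

```shell
# Print a minimal legacy-format LXC config for a container.
# Key names follow lxc.conf(5) as shipped with LXC 0.7.x.
make_lxc_config() {
    name=$1
    rootfs=$2
    bridge=$3
    cat <<EOF
lxc.utsname = $name
lxc.rootfs = $rootfs
lxc.mount = /etc/lxc/$name/fstab
lxc.tty = 4
lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = $bridge
lxc.network.name = eth0
EOF
}

make_lxc_config ol5test1 /container/ol5test1 virbr0
```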

Now that it's created and configured, we should be able to simply start it :
# lxc-start -n ol5test1
INIT: version 2.86 booting
Welcome to Enterprise Linux Server
Press 'I' to enter interactive startup.
Setting clock (utc): Sun Oct 16 06:08:27 EDT 2011 [ OK ]
Loading default keymap (us): [ OK ]
Setting hostname ol5test1: [ OK ]
raidautorun: unable to autocreate /dev/md0
Checking filesystems
[ OK ]
mount: can't find / in /etc/fstab or /etc/mtab
Mounting local filesystems: [ OK ]
Enabling local filesystem quotas: [ OK ]
Enabling /etc/fstab swaps: [ OK ]
INIT: Entering runlevel: 3
Entering non-interactive startup
Starting sysstat: Calling the system activity data collector (sadc):
[ OK ]
Starting background readahead: [ OK ]
Flushing firewall rules: [ OK ]
Setting chains to policy ACCEPT: nat mangle filter [ OK ]
Applying iptables firewall rules: [ OK ]
Loading additional iptables modules: no [FAILED]
Bringing up loopback interface: [ OK ]
Bringing up interface eth0:
Determining IP information for eth0... done.
[ OK ]
Starting system logger: [ OK ]
Starting kernel logger: [ OK ]
Enabling ondemand cpu frequency scaling: [ OK ]
Starting irqbalance: [ OK ]
Starting portmap: [ OK ]
FATAL: Could not load /lib/modules/2.6.39-100.0.12.el6uek.x86_64/modules.dep: No such file or directory
Starting NFS statd: [ OK ]
Starting RPC idmapd: Error: RPC MTAB does not exist.
Starting system message bus: [ OK ]
Starting o2cb: [ OK ]
Can't open RFCOMM control socket: Address family not supported by protocol
Mounting other filesystems: [ OK ]
Starting PC/SC smart card daemon (pcscd): [ OK ]
Starting HAL daemon: [FAILED]
Starting hpiod: [ OK ]
Starting hpssd: [ OK ]
Starting sshd: [ OK ]
Starting cups: [ OK ]
Starting xinetd: [ OK ]
Starting crond: [ OK ]
Starting xfs: [ OK ]
Starting anacron: [ OK ]
Starting atd: [ OK ]
Starting yum-updatesd: [ OK ]
Starting Avahi daemon... [FAILED]
Starting oraclevm-template...
Regenerating SSH host keys.
Stopping sshd: [ OK ]
Generating SSH1 RSA host key: [ OK ]
Generating SSH2 RSA host key: [ OK ]
Generating SSH2 DSA host key: [ OK ]
Starting sshd: [ OK ]
Regenerating up2date uuid.
Setting Oracle validated configuration parameters.
Configuring network interface.
Network device: eth0
Hardware address: 52:19:C0:EF:78:C4
Do you want to enable dynamic IP configuration (DHCP) (Y|n)?
...

This will run the well-known Oracle VM template configure scripts and set up the container the same way as it would an Oracle VM guest.



The session that runs lxc-start is the local console. It is best to run this session inside screen so you can disconnect and reconnect.



At this point, I can use lxc-console to log into the local console of the container or, since the container has its network up and sshd is running, I can simply ssh into the guest.
# lxc-console -n ol5test1 -t 1
Enterprise Linux Enterprise Linux Server release 5.5 (Carthage)
Kernel 2.6.39-100.0.12.el6uek.x86_64 on an x86_64
host login:

I can simply exit the console by entering ctrl-a q.



From inside the container :
# mount
proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
# /sbin/ifconfig
eth0 Link encap:Ethernet HWaddr 52:19:C0:EF:78:C4
inet addr:192.168.122.225 Bcast:192.168.122.255 Mask:255.255.255.0
inet6 addr: fe80::5019:c0ff:feef:78c4/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:141 errors:0 dropped:0 overruns:0 frame:0
TX packets:19 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:8861 (8.6 KiB) TX bytes:2476 (2.4 KiB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:8 errors:0 dropped:0 overruns:0 frame:0
TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:560 (560.0 b) TX bytes:560 (560.0 b)
# ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 2124 656 ? Ss 06:08 0:00 init [3]
root 397 0.0 0.0 1780 596 ? Ss 06:08 0:00 syslogd -m 0
root 400 0.0 0.0 1732 376 ? Ss 06:08 0:00 klogd -x
root 434 0.0 0.0 2524 368 ? Ss 06:08 0:00 irqbalance
rpc 445 0.0 0.0 1868 516 ? Ss 06:08 0:00 portmap
root 469 0.0 0.0 1920 740 ? Ss 06:08 0:00 rpc.statd
dbus 509 0.0 0.0 2800 576 ? Ss 06:08 0:00 dbus-daemon --system
root 578 0.0 0.0 10868 1248 ? Ssl 06:08 0:00 pcscd
root 610 0.0 0.0 5196 712 ? Ss 06:08 0:00 ./hpiod
root 615 0.0 0.0 13520 4748 ? S 06:08 0:00 python ./hpssd.py
root 637 0.0 0.0 10168 2272 ? Ss 06:08 0:00 cupsd
root 651 0.0 0.0 2780 812 ? Ss 06:08 0:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
root 660 0.0 0.0 5296 1096 ? Ss 06:08 0:00 crond
root 745 0.0 0.0 1728 580 ? SNs 06:08 0:00 anacron -s
root 753 0.0 0.0 2320 340 ? Ss 06:08 0:00 /usr/sbin/atd
root 817 0.0 0.0 25580 10136 ? SN 06:08 0:00 /usr/bin/python -tt /usr/sbin/yum-updatesd
root 819 0.0 0.0 2616 1072 ? SN 06:08 0:00 /usr/libexec/gam_server
root 830 0.0 0.0 7116 1036 ? Ss 06:08 0:00 /usr/sbin/sshd
root 2998 0.0 0.0 2368 424 ? Ss 06:08 0:00 /sbin/dhclient -1 -q -lf /var/lib/dhclient/dhclient-eth0.leases -pf /var/run/dhc
root 3102 0.0 0.0 5008 1376 ? Ss 06:09 0:00 login -- root
root 3103 0.0 0.0 1716 444 tty2 Ss+ 06:09 0:00 /sbin/mingetty tty2
root 3104 0.0 0.0 1716 448 tty3 Ss+ 06:09 0:00 /sbin/mingetty tty3
root 3105 0.0 0.0 1716 448 tty4 Ss+ 06:09 0:00 /sbin/mingetty tty4
root 3138 0.0 0.0 4584 1436 tty1 Ss 06:11 0:00 -bash
root 3167 0.0 0.0 4308 936 tty1 R+ 06:12 0:00 ps aux

From the host :
# lxc-info -n ol5test1
state: RUNNING
pid: 16539
# lxc-kill -n ol5test1
# lxc-monitor -n ol5test1
'ol5test1' changed state to [STOPPING]
'ol5test1' changed state to [STOPPED]
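
For scripting around this, the state can be pulled out of the lxc-info output. This is a small sketch that assumes the two-line "state:" / "pid:" format shown above; container_state is a hypothetical helper.

```shell
# Print the state field from `lxc-info` output read on stdin.
container_state() {
    awk '$1 == "state:" { print $2 }'
}

# Sample taken from the lxc-info output above.
sample='state: RUNNING
pid: 16539'

state=$(printf '%s\n' "$sample" | container_state)
if [ "$state" = "RUNNING" ]; then
    echo "container is up"
else
    echo "container is not running (state: $state)"
fi
```

Against a real container this would be `lxc-info -n ol5test1 | container_state`.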

So creating more containers is trivial. Just keep running lxc-create.
# lxc-create -n ol5test2 -t ol5
# btrfs subvolume list /container
ID 297 top level 5 path ol5-template
ID 299 top level 5 path ol5test1
ID 300 top level 5 path ol5test2
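
Since every container is just a snapshot subvolume, the subvolume list doubles as a container inventory. The sketch below filters out the base templates, assuming they keep the "-template" suffix used in this post; list_containers is a hypothetical helper.

```shell
# Print subvolume paths that are not base templates.
# Reads `btrfs subvolume list` output from stdin; the path is the last field.
list_containers() {
    awk '{ print $NF }' | grep -v -- '-template$'
}

# Sample taken from the listing above.
sample='ID 297 top level 5 path ol5-template
ID 299 top level 5 path ol5test1
ID 300 top level 5 path ol5test2'

printf '%s\n' "$sample" | list_containers
```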

The lxc tools will be uploaded to the UEK2 beta channel so you can start playing with this.

Oracle Linux 4 example

Here is the same principle for Oracle Linux 4, using the template create script lxc-ol4. I started out with the OVM_EL4U7_X86_PVM_4GB template and followed the same steps.

# kpartx -a System.img 
# kpartx -l System.img
loop0p1 : 0 64197 /dev/loop0 63
loop0p2 : 0 8530515 /dev/loop0 64260
loop0p3 : 0 4176900 /dev/loop0 8594775
# mount /dev/mapper/loop0p2 /mnt
# cd /mnt
# btrfs subvolume create /container/ol4-template
Create subvolume '/container/ol4-template'
# tar cvf - * | ( cd /container/ol4-template ; tar xvf - ; )
# lxc-create -n ol4test1 -t ol4
Cloning base template /container/ol4-template to /container/ol4test1 ...
Create a snapshot of '/container/ol4-template' in '/container/ol4test1'
Container created : /container/ol4test1 ...
Container template source : /container/ol4-template
Container config : /etc/lxc/ol4test1
Network : eth0 (veth) on virbr0
'ol4' template installed
'ol4test1' created
# lxc-start -n ol4test1
INIT: version 2.85 booting
/etc/rc.d/rc.sysinit: line 80: /dev/tty5: Operation not permitted
/etc/rc.d/rc.sysinit: line 80: /dev/tty6: Operation not permitted
Setting default font (latarcyrheb-sun16): [ OK ]
Welcome to Enterprise Linux
Press 'I' to enter interactive startup.
Setting clock (utc): Sun Oct 16 09:34:56 EDT 2011 [ OK ]
Initializing hardware... storage network audio done [ OK ]
raidautorun: unable to autocreate /dev/md0
Configuring kernel parameters: error: permission denied on key 'net.core.rmem_default'
error: permission denied on key 'net.core.rmem_max'
error: permission denied on key 'net.core.wmem_default'
error: permission denied on key 'net.core.wmem_max'
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.core_uses_pid = 1
fs.file-max = 327679
kernel.msgmni = 2878
kernel.msgmax = 8192
kernel.msgmnb = 65536
kernel.sem = 250 32000 100 142
kernel.shmmni = 4096
kernel.shmall = 1073741824
kernel.sysrq = 1
fs.aio-max-nr = 3145728
net.ipv4.ip_local_port_range = 1024 65000
kernel.shmmax = 4398046511104
[FAILED]
Loading default keymap (us): [ OK ]
Setting hostname ol4test1: [ OK ]
Remounting root filesystem in read-write mode: [ OK ]
mount: can't find / in /etc/fstab or /etc/mtab
Mounting local filesystems: [ OK ]
Enabling local filesystem quotas: [ OK ]
Enabling swap space: [ OK ]
INIT: Entering runlevel: 3
Entering non-interactive startup
Starting sysstat: [ OK ]
Setting network parameters: error: permission denied on key 'net.core.rmem_default'
error: permission denied on key 'net.core.rmem_max'
error: permission denied on key 'net.core.wmem_default'
error: permission denied on key 'net.core.wmem_max'
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.core_uses_pid = 1
fs.file-max = 327679
kernel.msgmni = 2878
kernel.msgmax = 8192
kernel.msgmnb = 65536
kernel.sem = 250 32000 100 142
kernel.shmmni = 4096
kernel.shmall = 1073741824
kernel.sysrq = 1
fs.aio-max-nr = 3145728
net.ipv4.ip_local_port_range = 1024 65000
kernel.shmmax = 4398046511104
[FAILED]
Bringing up loopback interface: [ OK ]
Bringing up interface eth0: [ OK ]
Starting system logger: [ OK ]
Starting kernel logger: [ OK ]
Starting portmap: [ OK ]
Starting NFS statd: [FAILED]
Starting RPC idmapd: Error: RPC MTAB does not exist.
Mounting other filesystems: [ OK ]
Starting lm_sensors: [ OK ]
Starting cups: [ OK ]
Generating SSH1 RSA host key: [ OK ]
Generating SSH2 RSA host key: [ OK ]
Generating SSH2 DSA host key: [ OK ]
Starting sshd: [ OK ]
Starting xinetd: [ OK ]
Starting crond: [ OK ]
Starting xfs: [ OK ]
Starting anacron: [ OK ]
Starting atd: [ OK ]
Starting system message bus: [ OK ]
Starting cups-config-daemon: [ OK ]
Starting HAL daemon: [ OK ]
Starting oraclevm-template...
Regenerating SSH host keys.
Stopping sshd: [ OK ]
Generating SSH1 RSA host key: [ OK ]
Generating SSH2 RSA host key: [ OK ]
Generating SSH2 DSA host key: [ OK ]
Starting sshd: [ OK ]
Regenerating up2date uuid.
Setting Oracle validated configuration parameters.
Configuring network interface.
Network device: eth0
Hardware address: D2:EC:49:0D:7D:80
Do you want to enable dynamic IP configuration (DHCP) (Y|n)?
...
...
# lxc-console -n ol4test1
Enterprise Linux Enterprise Linux AS release 4 (October Update 7)
Kernel 2.6.39-100.0.12.el6uek.x86_64 on an x86_64
localhost login:


Comments ( 6 )
  • Joe Hoot Monday, October 17, 2011

    Very Cool Wim. I realize that this is a quick mechanism to test something out on a developer machine (cgroups are very cool. I really need to learn them better.. much better than old limits mechanism). However, for production use, how might you see this fitting in (I'm trying to compare to Xen). Would there ever be a way to "pause" a container or migrate that container to another host? What about shared resources? Is it possible to have a common rootvg and have multiple containers pointing to that same rootvg (just use different /var, /etc/, ...) I'm thinking you could just have symlinks in your /container/ol5-template/usr/ -> /usr/, as an example? I guess this might break any chrooting a bit (if that gets used), but seems like it might be doable.


  • Wim Monday, October 17, 2011

    Well, they're different in many ways.

    If you don't mind a single kernel image for all your containers and the fact that you have to bring down the box to upgrade it (although with ksplice on oracle linux you would be able to avoid that :-) so ksplice makes containers very cool on linux)...

    If you run many isolated apps, or want that isolation without the overhead of extra scheduling and with easy sharing of system resources, then containers are a good thing. That's why they're so widely used in the ISP world: you can host many containers on a server. With hypervisor-based virtualization, the amount of sharing goes down and there's a level of overhead for scheduling VMs (not much, but some). Of course, you have a greater level of isolation and much greater flexibility in terms of what you run in a VM versus a container.

    So it's really a toss up. Basically, if you run one OS, don't need live migrate or so, this is a bit better a model. One is certainly not a replacement of the other, supporting both is quite useful.


  • guest Wednesday, January 4, 2012

    hi wcoekaer ,

    I want to contribute by making a small correction to your tutorial.

    To avoid execution error

    on the line that says:

    # tar cvf - * | ( cd /container/ol5-template ; tar xvf ; )

    should say:

    # tar cvf - * | ( cd /container/ol5-template ; tar xvf - ; )


  • Jon Senger Tuesday, March 20, 2012

    Great tutorial Wim, thanks! One thing to add to the comments for anybody else doing this: libvirt isn't default in the OL6 install, so that has to get installed/started à la carte before starting the container.


  • Joe Hoot Thursday, May 17, 2012

    Wim,

    I started thinking a bit more about this. Do you have any quality tutorials (or any documentation outside of the kernel source readme on it) that might explain the best way to get started with cgroups? I'm not as concerned with lxc, but more so with just standard cgroups. I understand there is a libcgroups (or something like that) which has some configuration in /etc/ to allow the use of cgroups more easily (I think).

    If you have anything to contribute on this, I'd appreciate it.

    Thanks,

    Joe


  • Joseph Hoot Tuesday, May 22, 2012

    Wim,

    I was doing some more research on cgroups and found the following resources that I think are really good. In fact, I got to see the RedHat Summit presentation last year in Boston, and it was great to see the live demos (I don't recall seeing Linda Wang there though):

    * http://www.redhat.com/summit/2011/presentations/summit/in_the_weeds/friday/WangKozdemba_f_1130_cgroups14.pdf

    * http://linux.oracle.com/documentation/EL6/Red_Hat_Enterprise_Linux-6-Resource_Management_Guide-en-US.pdf

    * http://www.oracle.com/technetwork/articles/servers-storage-admin/resource-controllers-linux-1506602.html

    So what I've found is that it doesn't appear that libcgroup is available as an rpm in OL5, which is the only hangup that I currently have. I don't know if it has prereqs that are only available in OL6 or not, but if it doesn't then I would imagine that I can just compile it from sources... seems to be a typical configure/make/make install.. But before I go to that level, do you know if you've seen oracle-compiled rpms that are available in any yum repos? I didn't see them in the OracleLinux/OL5 repos at all.

    Thanks,

    Joe

