Friday Nov 03, 2006

Solaris network virtualization: the firewall part

The Solaris Containers (AKA zones) model first appeared in Solaris 10. Virtualization of CPU, memory, and disk space already provides a basic OS-level virtual environment, but the networking part is still incomplete. A Solaris 10 zone has its own name space (logical IP addresses, TCP/UDP/SCTP ports), but basically all zones on a system share one TCP/IP stack. The following are all shared:

  - Routing
  - ARP
  - Firewall policy
  - Statistics

This means:

  - Security holes between zones: the network configuration and information of one zone may (or certainly will) be visible to other zones.
  - Configuration problems: configuring one zone (for example its routing) may affect other zones, or even stop them from working.

The Old Solaris 10 Network Model

          Zone A               Zone B
    +-----------------+  +-----------------+
    |   APPLICATION   |  |   APPLICATION   |
    |                 |  |                 |
    |   TCP    UDP    |  |   TCP    UDP    |
    +-----------------+  +-----------------+
             |                    |
    +--------------------------------------+
    |        IP/ARP/IPSEC/Firewall         |
    +--------------------------------------+
           |                    |
    +------------+        +------------+
    |    NIC1    |        |    NIC2    |
    +------------+        +------------+

The New Solaris 10 Network Model

          Zone A               Zone B
    +-----------------+  +-----------------+
    |   APPLICATION   |  |   APPLICATION   |
    |                 |  |                 |
    |   TCP    UDP    |  |   TCP    UDP    |
    +-----------------+  +-----------------+
             |                    |
    +-----------------+  +-----------------+
    |     IP/ARP/     |  |     IP/ARP/     |
    | IPSEC/Firewall  |  | IPSEC/Firewall  |
    +-----------------+  +-----------------+
             |                    |
      +------------+       +------------+
      |    NIC1    |       |    NIC2    |
      +------------+       +------------+

I designed and implemented the virtualization of the Solaris firewall. The Solaris firewall consists of the pfhook framework (neti and hook) and the firewall engine, ipf.

Architecture:

    IP                      module: neti                        Firewall
    --------->  net_register/unregister
                net_lookup                              <------
                net_release
                net_walk
    --------->  net_register_family/unregister
    --------->  net_register_event/unregister
                net_register_hook/unregister            <------
                net_getifname                           <------
                net_getmtu                              <------
                net_getpmtuenabled                      <------
                net_lifaddr                             <------
                net_phygetnext                          <------
                net_phylookup                           <------
                net_lifgetnext                          <------
                net_inject                              <------
                net_routeto                             <------
                net_ispartialchecksum                   <------
                net_isvalidchecksum                     <------

                      (neti, hook interaction)

                            module: hook
    --------->  hook_run

Note: a few external functions in module hook are called by neti:

    hook_family_add/remove
    hook_event_add/remove
    hook_register/unregister

The steps:

1. neti and hook initialization

2. ip stack, in ip_ddi_init
   - call ip_neti_init()
     - call net_register(&ipv4info)
     - call net_register(&ipv6info)
     - create task queues for eventq_queue_out, eventq_queue_in, eventq_queue_nic
   - call ipv4_hook_init()
     - call net_register_family(ipv4, &ipv4root)
     - call net_register_event(ipv4, &ip4_physical_in_event)
     - call net_register_event(ipv4, &ip4_physical_out_event)
     - call net_register_event(ipv4, &ip4_physical_forwarding_event)
     - call net_register_event(ipv4, &ip4_loopback_in_event)
     - call net_register_event(ipv4, &ip4_loopback_out_event)
     - call net_register_event(ipv4, &ip4_nic_events)
   - call ipv6_hook_init()
     - call net_register_family(ipv6, &ipv6root)
     - call net_register_event(ipv6, &ip6_physical_in_event)
     - call net_register_event(ipv6, &ip6_physical_out_event)
     - call net_register_event(ipv6, &ip6_physical_forwarding_event)
     - call net_register_event(ipv6, &ip6_loopback_in_event)
     - call net_register_event(ipv6, &ip6_loopback_out_event)
     - call net_register_event(ipv6, &ip6_nic_events)

3. arp
   - call arp_hook_init()
     - call net_register_family(arp, &arproot)
     - call net_register_event(arp, &arp_physical_in_event)
     - call net_register_event(arp, &arp_physical_out_event)
     - call net_register_event(arp, &arp_nic_events)

4. ipf, in iplattach:
   - ipf_ipv4 = net_lookup(NHF_INET);
   - net_register_hook(ipf_ipv4, NH_NIC_EVENTS, &ipfhook_nicevents)
   - net_register_hook(ipf_ipv4, NH_PHYSICAL_IN, &ipfhook_in)
   - net_register_hook(ipf_ipv4, NH_PHYSICAL_OUT, &ipfhook_out)
   - net_register_hook(ipf_ipv4, NH_LOOPBACK_IN, &ipfhook_in)
   - net_register_hook(ipf_ipv4, NH_LOOPBACK_OUT, &ipfhook_out)
   - ipf_ipv6 = net_lookup(NHF_INET6);
   - net_register_hook(ipf_ipv6, NH_NIC_EVENTS, &ipfhook_nicevents)
   - net_register_hook(ipf_ipv6, NH_PHYSICAL_IN, &ipfhook_in)
   - net_register_hook(ipf_ipv6, NH_PHYSICAL_OUT, &ipfhook_out)
   - net_register_hook(ipf_ipv6, NH_LOOPBACK_IN, &ipfhook_in)
   - net_register_hook(ipf_ipv6, NH_LOOPBACK_OUT, &ipfhook_out)

5. When packets come in or go out, or a NIC event happens, IP calls hook_run().

The relationship between the data structures looks like this:

    netd_head
        |
        v
    net_data_t (ip4) ------> net_data_t (ip6) ------> net_data_t (arp)        [neti]
      net_info                 net_info                 net_info
      netd_hooks               netd_hooks               netd_hooks
        |                        |                        |
        v                        v                        v
    hook_family_int_t -----> hook_family_int_t -----> hook_family_int_t       [hook]
    (head: familylist)
        |
        +--> hook_event_int_t --> hook_int_t (wraps a hook)
        |
        +--> hook_event_int_t --> hook_int_t
        |
        +--> hook_event_int_t --> hook_int_t

After virtualization the process is:

1. Kernel module neti has neti_stack_t and neti_stack_init(), which allocates the per-stack local storage.

2. Kernel module hook has hook_stack_t and hook_stack_init(), which allocates the per-stack local storage.

3. arp module changes: move the following from arp_ddi_init to arp_stack_init
   - call arp_neti_init(as)
     - call net_register(&arpinfo)
   - call arp_hook_init()
     - call net_register_family(arp, &arproot)
     - call net_register_event(arp, &arp_physical_in_event)
     - call net_register_event(arp, &arp_physical_out_event)
     - call net_register_event(arp, &arp_nic_events)

4. ip module: move the following from ip_ddi_init to ip_stack_init
   - call ip_neti_init(pfs)
     - call net_register(&ipv4info)
     - call net_register(&ipv6info)
     - create task queues for eventq_queue_out, eventq_queue_in, eventq_queue_nic
   - call ipv4_hook_init()
     - call net_register_family(ipv4, &ipv4root)
     - call net_register_event(ipv4, &ip4_physical_in_event)
     - call net_register_event(ipv4, &ip4_physical_out_event)
     - call net_register_event(ipv4, &ip4_physical_forwarding_event)
     - call net_register_event(ipv4, &ip4_loopback_in_event)
     - call net_register_event(ipv4, &ip4_loopback_out_event)
     - call net_register_event(ipv4, &ip4_nic_events)
   - call ipv6_hook_init()
     - call net_register_family(ipv6, &ipv6root)
     - call net_register_event(ipv6, &ip6_physical_in_event)
     - call net_register_event(ipv6, &ip6_physical_out_event)
     - call net_register_event(ipv6, &ip6_physical_forwarding_event)
     - call net_register_event(ipv6, &ip6_loopback_in_event)
     - call net_register_event(ipv6, &ip6_loopback_out_event)
     - call net_register_event(ipv6, &ip6_nic_events)

The following function interfaces gain a netstack_t * argument:

    net_register(..., netstack_t *)
    net_lookup(..., netstack_t *)
    net_walk(..., netstack_t *)

The other functions remain unchanged:

    net_unregister(...)
    net_release(...)
    net_register_family(...)
    net_unregister_family(...)
    net_register_event(...)
    net_unregister_event(...)
    net_register_hook(...)
    net_unregister_hook(...)
    net_getifname(...)
    net_getmtu(...)
    net_getpmtuenabled(...)
    net_lifaddr(...)
    net_phygetnext(...)
    net_phylookup(...)
    net_lifgetnext(...)
    net_inject(...)
    net_routeto(...)
    net_ispartialchecksum(...)
    net_isvalidchecksum(...)

Correspondingly, these data structures and prototypes changed as well:

--------------------------------------------------------------------
Old:

    typedef struct net_info {
        int      neti_version;
        char     *neti_protocol;
        int      (*neti_getifname)(phy_if_t, char *, const size_t);
        int      (*neti_getmtu)(phy_if_t, lif_if_t);
        int      (*neti_getpmtuenabled)(void);
        int      (*neti_getlifaddr)(phy_if_t, lif_if_t, size_t,
                     net_ifaddr_t [], void *);
        phy_if_t (*neti_phygetnext)(phy_if_t);
        phy_if_t (*neti_phylookup)(const char *);
        lif_if_t (*neti_lifgetnext)(phy_if_t, lif_if_t);
        int      (*neti_inject)(inject_t, net_inject_t *);
        phy_if_t (*neti_routeto)(struct sockaddr *);
        int      (*neti_ispartialchecksum)(mblk_t *);
        int      (*neti_isvalidchecksum)(mblk_t *);
    } net_info_t;

New:

    typedef struct net_info {
        int      neti_version;
        char     *neti_protocol;
        int      (*neti_getifname)(phy_if_t, char *, const size_t, netstack_t *);
        int      (*neti_getmtu)(phy_if_t, lif_if_t);
        int      (*neti_getpmtuenabled)(netstack_t *);
        int      (*neti_getlifaddr)(phy_if_t, lif_if_t, size_t,
                     net_ifaddr_t [], void *);
        phy_if_t (*neti_phygetnext)(phy_if_t, netstack_t *);
        phy_if_t (*neti_phylookup)(const char *, netstack_t *);
        lif_if_t (*neti_lifgetnext)(phy_if_t, lif_if_t);
        int      (*neti_inject)(inject_t, net_inject_t *, netstack_t *);
        phy_if_t (*neti_routeto)(struct sockaddr *, netstack_t *);
        int      (*neti_ispartialchecksum)(mblk_t *);
        int      (*neti_isvalidchecksum)(mblk_t *);
    } net_info_t;

--------------------------------------------------------------------
Old:

    struct net_data {
        LIST_ENTRY(net_data) netd_list;
        net_info_t           netd_info;
        int                  netd_refcnt;
        hook_family_int_t    *netd_hooks;
    };

New:

    struct net_data {
        LIST_ENTRY(net_data) netd_list;
        net_info_t           netd_info;
        int                  netd_refcnt;
        hook_family_int_t    *netd_hooks;
        void                 *netd_netstack;
    };

--------------------------------------------------------------------
Old:

    typedef struct injection_s {
        net_inject_t inj_data;
        boolean_t    inj_isv6;
    } injection_t;

New:

    typedef struct injection_s {
        net_inject_t inj_data;
        boolean_t    inj_isv6;
        void         *inj_ptr;
    } injection_t;

--------------------------------------------------------------------
Old:

    extern net_data_t net_register(const net_info_t *);

New:

    extern net_data_t net_register(const net_info_t *, netstack_t *);

--------------------------------------------------------------------
Old:

    extern net_data_t net_lookup(const char *);

New:

    extern net_data_t net_lookup(const char *, netstack_t *);

--------------------------------------------------------------------
Old:

    extern net_data_t net_walk(net_data_t);

New:

    extern net_data_t net_walk(net_data_t, netstack_t *);
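The effect of adding a netstack_t * argument to net_register and net_lookup can be illustrated with a small userland model. This is only a sketch under stated assumptions: every name prefixed with model_ is hypothetical and invented for illustration, the fixed-size table stands in for whatever per-stack storage the kernel really allocates, and none of it is the actual neti implementation. It shows the one property that matters here: a protocol registered in one stack is not visible from another.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userland model; these are NOT the kernel's types. */
#define MODEL_MAX_PROTOS 4

typedef struct model_net_data {
    const char *protocol;  /* e.g. "inet", "inet6", "arp" */
    void       *netstack;  /* owning stack, in the spirit of netd_netstack */
    int         in_use;
} model_net_data_t;

typedef struct model_netstack {
    int              stackid;
    model_net_data_t protos[MODEL_MAX_PROTOS];  /* per-stack storage */
} model_netstack_t;

/*
 * Register a protocol in exactly one stack.
 * Returns NULL if the protocol is already registered there or the table is full.
 */
model_net_data_t *
model_net_register(const char *protocol, model_netstack_t *ns)
{
    model_net_data_t *free_slot = NULL;

    for (size_t i = 0; i < MODEL_MAX_PROTOS; i++) {
        model_net_data_t *nd = &ns->protos[i];
        if (nd->in_use && strcmp(nd->protocol, protocol) == 0)
            return NULL;            /* duplicate within this stack */
        if (!nd->in_use && free_slot == NULL)
            free_slot = nd;         /* remember the first free slot */
    }
    if (free_slot == NULL)
        return NULL;                /* table full */

    free_slot->protocol = protocol;
    free_slot->netstack = ns;
    free_slot->in_use = 1;
    return free_slot;
}

/* Look a protocol up in the given stack only; other stacks are never searched. */
model_net_data_t *
model_net_lookup(const char *protocol, model_netstack_t *ns)
{
    for (size_t i = 0; i < MODEL_MAX_PROTOS; i++) {
        model_net_data_t *nd = &ns->protos[i];
        if (nd->in_use && strcmp(nd->protocol, protocol) == 0)
            return nd;
    }
    return NULL;
}
```

In this model, a firewall instance attached to zone A's stack would call model_net_lookup("inet", &stack_a) and could never obtain zone B's registration, which is the isolation the per-stack initialization is meant to provide.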

Friday Sep 15, 2006

IP instances for Solaris

Exclusive IP instances for Solaris Zones.

Tuesday Jun 13, 2006

Solaris packet filtering hooks (pfhooks)

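  The Solaris 10 firewall, IP Filter, is based on the open source ipfilter. Sun made some necessary and useful Solaris-specific optimizations and added features such as full IPv6 support, IPv4/IPv6 pools, and IPv6 fragment support.

  In the STREAMS-based network framework of Solaris 10, the Solaris firewall is implemented by two kernel modules, pfil + ipf.

This brings two main problems:

  1. Poor performance.
  2. Loopback traffic cannot be filtered. This problem has become quite prominent, because communication between Solaris containers goes over loopback.


  pfhooks are built into the TCP, IP, and ARP protocol stack, which solves both problems nicely:

    pfil is removed, which improves system performance; loopback traffic passing through IP can also go through pfhooks to ipf for filtering, which solves the second problem.


  For a more detailed description, see the pfhook white paper.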
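  How a hook framework lets one filter see both physical and loopback traffic can be sketched as a small userland model. Everything below is an assumption made for illustration: the model_ names, the fixed-size hook table, and the convention that a nonzero return means "drop" are hypothetical, and this is not the real neti/hook API. The point is only that the stack calls a single dispatch function at each event, and the filter registers the same callback for loopback events as for physical ones.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userland model of hook dispatch; not the kernel API. */
#define MODEL_MAX_HOOKS 4

typedef enum {
    MODEL_EV_PHYSICAL_IN,
    MODEL_EV_PHYSICAL_OUT,
    MODEL_EV_LOOPBACK_IN,
    MODEL_EV_LOOPBACK_OUT,
    MODEL_EV_COUNT
} model_event_t;

typedef struct model_packet {
    unsigned short dst_port;
} model_packet_t;

/* A hook returns 0 to pass the packet and nonzero to drop it (assumed convention). */
typedef int (*model_hook_fn)(model_event_t ev, const model_packet_t *pkt);

typedef struct model_event_hooks {
    model_hook_fn hooks[MODEL_MAX_HOOKS];
    size_t        nhooks;
} model_event_hooks_t;

static model_event_hooks_t model_events[MODEL_EV_COUNT];

/* Attach a callback to one event. Returns 0 on success, -1 on a bad event or full table. */
int
model_register_hook(model_event_t ev, model_hook_fn fn)
{
    if ((int)ev < 0 || ev >= MODEL_EV_COUNT)
        return -1;
    model_event_hooks_t *e = &model_events[ev];
    if (e->nhooks == MODEL_MAX_HOOKS)
        return -1;
    e->hooks[e->nhooks++] = fn;
    return 0;
}

/* Called by the "stack" for every packet: returns 1 if any hook drops it, else 0. */
int
model_hook_run(model_event_t ev, const model_packet_t *pkt)
{
    if ((int)ev < 0 || ev >= MODEL_EV_COUNT)
        return 0;
    const model_event_hooks_t *e = &model_events[ev];
    for (size_t i = 0; i < e->nhooks; i++) {
        if (e->hooks[i](ev, pkt) != 0)
            return 1;
    }
    return 0;
}

/* Example filter: drop anything addressed to port 23, whatever the event. */
int
model_block_telnet(model_event_t ev, const model_packet_t *pkt)
{
    (void)ev;
    return pkt->dst_port == 23;
}
```

  Registering model_block_telnet for both MODEL_EV_PHYSICAL_IN and MODEL_EV_LOOPBACK_IN makes the same rule apply to traffic between containers, which in this model is the case the older pfil-based design could not cover.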

Thursday Jan 19, 2006

Solaris Ethernet bridging

  We are kicking off a project named 'Ethernet bridging' to make Solaris virtualization-ready, and in particular ready to be a Xen domain0. Basically, an Ethernet bridge enables Xen domain0 to work as an 802.1D bridge for the other Xen user domains.

  In parallel, we kicked off a CDDL open source project for it, which is hosted at
  http://www.opensolaris.org/os/project/ethbridge

Tuesday Nov 01, 2005

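Virtualization technology

  Virtualization technology is expected to become the next big thing in the computer industry. Traditionally, one computer system (platform) corresponds to one physical computer (server/workstation/PC, possibly including the software on it).
Basically, virtualization means providing the computing runtime environment (platform) virtually. For example, one server can provide n computer systems that are logically completely independent of each other, or several servers can be presented as one computer system (not the focus here).

  A few analogies:
    1. Multiprogramming virtualizes a computer system to support multitasking;
    2. One physical channel can be split into multiple channels by time division or frequency division;
    3. One person works on several projects at a company at the same time (or several people work on one project);

  I will try to write a series of blog posts introducing the history, uses, and technology of virtualization, as well as Solaris virtualization.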

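Reasons for supporting Wang Yin

  Free thought and free choice should, in any case, be fundamental to the mainstream of society. I can understand that society is diverse, and that its many people are a mixed crowd. But shouldn't society and education be tolerant enough?
  Science should be plain and honest, and scholarship should be done with one's feet on the ground. What is wrong with Wang Yin dropping out in pursuit of pure scholarship?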

Saturday Sep 24, 2005

Supporting Wang Yin

I saw on the BBS that Wang Yin has dropped out. I admire and support him. This is the cry of idealism.

Monday Aug 08, 2005

IPFilter code merge

The Solaris firewall will soon get the up-to-date features of open source IPFilter

  IPFilter is a mature and robust firewall, traditionally popular on BSD-like systems, though it is a multiple-OS product. With the release of Solaris 10, IPFilter is the default firewall, replacing the previous SunScreen. However, it is based on the open source ip_fil4.0.x, which was the version available when the Solaris networking team was evaluating and choosing the firewall and then tuning it for the Solaris operating environment.

  Since then, the open source IPFilter has evolved considerably, and ip_fil4.1.8 is now available. We are planning a bidirectional code merge project to update the Solaris IPFilter to ip_fil4.1.8. The open source community can benefit from our work too: quite a few bug fixes, several of them pretty critical, and one or two features, such as the IPv6-enabled ippool, will be integrated into the open source version.

Friday Jun 24, 2005

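Starting and stopping IPFilter

  IPFilter is the firewall that ships with Solaris 10. On Solaris 10, starting and stopping IPFilter is managed by SMF, which is a change from before.
  Specifically, SMF manages IPFilter through two IPFilter-related services (pfil & ipfilter). The following commands show the properties of these two services:

#> svcprop pfil
#> svcprop ipfilter

 The Solaris firewall is only active when both pfil and ipfilter are in the online state. In a default installation pfil is online but ipfilter is offline, so ipfilter has no effect. This is why IPFilter still does nothing even after an administrator has configured rules in /etc/ipf/ipf.conf. (Rebooting does not help either.)

Starting it is simple:
#> svcadm enable ipfilter
Stopping it must also be done through SMF:
#> svcadm disable ipfilter

Note: enabling and disabling ipfilter through SMF is persistent; the setting is kept even across a reboot.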

Thursday Jun 02, 2005

Apache on AMD64 + Solaris10

  I have learned that quite a few customers are very interested in running Apache on the AMD64 + Solaris10 platform. That is great!
  I am doing an investigation. I am mainly concerned with robustness, performance, and compatibility. What are your concerns? Please tell me your ideas.
  Apache on AMD64 + Solaris 10: do you have any ideas? :) Let's discuss it together.

Friday May 27, 2005

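Lenovo doing real estate!?

  Lenovo is diversifying! It always seems to be (or perhaps is trying to be) keeping pace with the times.
  From the short-sightedness of "technology, industry, trade" (技工贸), to the hollow vanity of Legend -> Lenovo, to the trend-chasing of the PDA/mobile phone boom, to the opportunism of pushing into real estate, to becoming the garbage collector for the IBM PC.
  I wonder how much farther a soulless giant Kuafu, charging in bare-chested, can keep going.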

Monday May 23, 2005

IPFilter status

  IPFilter is very close to finishing IPv6 support in Solaris10. I intend to put the IPv6 code back to onnv (Solaris 11 under development) in a couple of weeks. After 4 weeks of soak time, a Solaris10 update will see IPv6 packet filtering working. :)

  In addition to the functionality available for IPv4, IPFilter can distinguish traffic by matching extension headers, which do not exist in IPv4. NAT, the main use in IPv4, is not available for IPv6.
NAT is mainly a solution to the IP address shortage, and there is no such requirement in IPv6, so we simply skip the feature.

  IP pool has been made IPv6-enabled all the way from the userland command through the kernel module. Pools of IPv4, IPv6, or mixed IPv4 and IPv6 addresses are allowed, which makes management easier.
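
  To make the idea of a mixed IPv4/IPv6 pool concrete, here is a minimal userland sketch of a pool lookup by prefix match. It is an illustration only: the model_ names, the flat array, and the linear search are assumptions chosen for brevity, and the real ippool code and its data structures are not shown in this post.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a mixed-family address pool; not the ippool implementation. */
typedef enum { MODEL_FAM_V4 = 4, MODEL_FAM_V6 = 6 } model_family_t;

typedef struct model_pool_entry {
    model_family_t family;
    uint8_t        addr[16];    /* network prefix; only the first 4 bytes are used for IPv4 */
    unsigned       prefix_len;  /* 0..32 for IPv4, 0..128 for IPv6 */
} model_pool_entry_t;

/* Compare the first `bits` bits of two addresses. Returns 1 if they are equal. */
static int
model_prefix_match(const uint8_t *a, const uint8_t *b, unsigned bits)
{
    unsigned full = bits / 8;   /* whole bytes to compare */
    unsigned rem = bits % 8;    /* leftover bits in the next byte */

    if (memcmp(a, b, full) != 0)
        return 0;
    if (rem == 0)
        return 1;
    uint8_t mask = (uint8_t)(0xFF << (8 - rem));
    return (a[full] & mask) == (b[full] & mask);
}

/*
 * Returns 1 if addr (4 bytes for IPv4, 16 bytes for IPv6) falls inside any
 * entry of the same family, otherwise 0. Entries of the other family and
 * entries with an out-of-range prefix length are skipped.
 */
int
model_pool_contains(const model_pool_entry_t *pool, size_t n,
    model_family_t family, const uint8_t *addr)
{
    unsigned max_bits = (family == MODEL_FAM_V4) ? 32 : 128;

    for (size_t i = 0; i < n; i++) {
        if (pool[i].family != family || pool[i].prefix_len > max_bits)
            continue;
        if (model_prefix_match(pool[i].addr, addr, pool[i].prefix_len))
            return 1;
    }
    return 0;
}
```

  A single pool holding both 192.168.0.0/16 and 2001:db8::/32 can then be referenced from one rule, instead of maintaining two separate address lists.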

  I am wondering whether it makes sense to make IPFilter manageable via SNMP, so that it can easily be managed centrally. I am also interested in the idea of a GUI for IPFilter. Please comment. :)

Sunday May 22, 2005

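To Solaris novices -- notes on Solaris learning materials

 Without quite noticing, I have been using Unix (Solaris/Linux) for n years. Looking back at how I learned over those years, I do have a few observations. Today I will (boldly, and a little nervously) talk about materials for learning Solaris.

1. Most Solaris books and materials are garbage, especially those written by some authors in China (I will say so even if bricks get thrown at me). They either copy from one another or ramble on with only half an understanding.

2. My view is that reading one good book several times beats reading many books.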

3. To Solaris novices and intermediate programmers
  《Advanced Programming in the Unix Environment》    by Richard Stevens
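   This classic from 10 years ago is a bit dated, but it still delivers most of the essence of Unix programming. What makes it especially valuable is Mr. Stevens' outstanding ability to explain and analyze, which lets us grasp the key points and the subtleties easily. Sadly the master has passed away, so we will not get to see a second edition of APUE from him.
   (I also strongly recommend all of Richard's other books; they are all Unix/networking classics.)

   Fortunately we still have 《Solaris Systems Programming》      by Rich Teer
   Rich is an independent UNIX consultant and one of the five members of the OpenSolaris CAB. His best seller provides a newer, more Solaris-specific reference. Look at the Unix heavyweights among the authors and in the acknowledgement list: with Unix networking/security experts like Casper Dik as reviewers, you can have full confidence in the book's authority.
   Rich Teer's homepage: http://www.rite-group.com/rich/
   These two books are the essential general-purpose references for Solaris programming. Is there a Solaris C/C++ programming guru who does not own them?! Don't be silly. :)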

4. Do you want GUI programming? Kernel programming? Networking programming?
   Multithreaded programming? Performance tuning for large scale middleware?
   ...

   These are all specialized topics. I hope the following helps a little:

   GUI programming:     Sorry, I have no idea about it
   Kernel programming:
                       Are you serious?
                       Try to be an OS guru first?
   STREAMS programming:
                       <UNIX System V Network Programming> by Stephen Rago
                        Outdated but informative
   Device driver:
     The most useful material can be downloaded:
     http://developers.sun.com/prodtech/solaris/reference/docs/index.html

  I think the fastest way is to take a training course. Of course you can also teach yourself; you will have plenty to learn. :)

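5. Compilers and Tools
  You need adequate tools:
  a. GNU offers a toolchain you can choose. It is on the (free) Solaris Companion CD, though it is rather second-rate. You have surely hit an error/warning whose message left you completely in the fog? Not to mention efficiency.
  b. The Solaris installation media install every tool except the compiler: as, ld, ... but no compiler. Good things cost money. It is worth it, though.
     You have two choices:
     * Compiler only: C, C++, and Fortran 95 are all available
     * Sun has an integrated development suite called Sun Studio 10 (replacing the earlier Sun Workshop and Sun Forte),
       with versions for Solaris on SPARC, Solaris x86 on AMD64/IA32, Linux on IA32, and so on.
       It contains: compiler
                    the traditional IDE components: editor/debugger/project manager/...
                    test/performance analysis: an extra that is more than worth the price
  c. Also, you need to learn two tools: DTrace and MDB
        These two are required study for a Solaris guru. They are extremely useful. I will not go into detail here.

6. Script programming/web programming/java programming?
   I do not know. I only know C/C++. :(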

 

About

yukun
