linux/include/net
Vasiliy Kulikov c319b4d76b net: ipv4: add IPPROTO_ICMP socket kind
This patch adds IPPROTO_ICMP socket kind.  It makes it possible to send
ICMP_ECHO messages and receive the corresponding ICMP_ECHOREPLY messages
without any special privileges.  In other words, the patch makes it
possible to implement setuid-less and CAP_NET_RAW-less /bin/ping.  In
order not to increase the kernel's attack surface, the new functionality
is disabled by default, but is enabled at bootup by supporting Linux
distributions, optionally with restriction to a group or a group range
(see below).

Similar functionality is implemented in Mac OS X:
http://www.manpagez.com/man/4/icmp/

A new ping socket is created with

    socket(PF_INET, SOCK_DGRAM, IPPROTO_ICMP)

Message identifiers (octets 4-5 of the ICMP header) are interpreted as
local ports. Addresses are stored in struct sockaddr_in. No port numbers
are reserved for privileged processes; port 0 is reserved for the API
("let the kernel pick a free number"). There is no notion of remote
ports: remote port numbers provided by the user (e.g. in connect()) are
ignored.
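
As a rough userspace illustration of the paragraphs above (not part of
the patch), the sketch below opens a ping socket and binds it to port 0
so that the kernel picks a free identifier. The helper names
open_ping_socket() and bind_any_id() are hypothetical, and the sketch
assumes that getsockname() reports the chosen identifier in sin_port,
as it does for other datagram sockets.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open an unprivileged ICMP "ping" socket.
 * Returns the descriptor, or -1 with errno as set by socket(2). */
int open_ping_socket(void)
{
    int fd = socket(PF_INET, SOCK_DGRAM, IPPROTO_ICMP);

    if (fd < 0) {
        int saved = errno;   /* fprintf() may clobber errno */

        fprintf(stderr, "ping socket: %s\n", strerror(saved));
        errno = saved;
    }
    return fd;
}

/* Hypothetical helper: bind to port 0 ("let the kernel pick a free
 * number") and read the chosen identifier back.
 * Returns the identifier in host byte order, or -1 on failure. */
int bind_any_id(int fd)
{
    struct sockaddr_in sa;
    socklen_t len = sizeof(sa);

    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_ANY);
    sa.sin_port = 0;         /* port 0: kernel chooses the id */

    if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0)
        return -1;
    if (getsockname(fd, (struct sockaddr *)&sa, &len) < 0)
        return -1;
    return ntohs(sa.sin_port);
}
```

A caller would typically check for a negative return from
open_ping_socket() and fall back to a raw socket (or give up) when the
functionality is disabled or the caller's group is not permitted.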

Data sent and received include ICMP headers. This is deliberate to:
1) Avoid the need to transport header values like sequence numbers by
other means.
2) Make it easier to port existing programs using raw sockets.

ICMP headers given to send() are checked and sanitized. The type must be
ICMP_ECHO and the code must be zero (future extensions might relax this,
see below). The id is set to the number (local port) of the socket, and
the checksum is always recomputed.
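
A minimal sketch of what a caller might hand to send(), assuming the
glibc definition of struct icmphdr from <netinet/ip_icmp.h>. The helper
build_echo_request() is hypothetical. It leaves the id and checksum
fields zero, relying on the sanitizing described above to fill them in.

```c
#include <arpa/inet.h>
#include <netinet/ip_icmp.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write an ICMP echo request (header + payload)
 * into buf.  Returns the number of bytes written, or 0 if buf is too
 * small.  id and checksum stay zero: per the text above, the kernel
 * overwrites the id and recomputes the checksum. */
size_t build_echo_request(unsigned char *buf, size_t buflen,
                          uint16_t seq,
                          const void *payload, size_t payload_len)
{
    struct icmphdr hdr;

    if (buflen < sizeof(hdr) || payload_len > buflen - sizeof(hdr))
        return 0;

    memset(&hdr, 0, sizeof(hdr));
    hdr.type = ICMP_ECHO;               /* must be ICMP_ECHO */
    hdr.code = 0;                       /* must be zero */
    hdr.un.echo.sequence = htons(seq);  /* caller-chosen sequence */

    memcpy(buf, &hdr, sizeof(hdr));
    if (payload_len > 0)
        memcpy(buf + sizeof(hdr), payload, payload_len);
    return sizeof(hdr) + payload_len;
}
```

The resulting buffer can then be passed to sendto() on the ping socket
together with a struct sockaddr_in holding the destination address.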

ICMP reply packets received from the network are demultiplexed according
to their ids, and are returned by recv() without any modifications.
IP header information and ICMP errors of those packets may be obtained
via ancillary data (IP_RECVTTL, IP_RETOPTS, and IP_RECVERR). ICMP source
quenches and redirects are reported as fake errors via the error queue
(IP_RECVERR); the next hop address for redirects is saved to ee_info (in
network order).
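
To illustrate the ancillary-data path, here is a sketch of how a caller
might request and extract the TTL of a received reply. The helper names
are hypothetical, and the sketch assumes that IP_RECVTTL on a ping
socket delivers an IP_TTL control message carrying an int, as it does
for UDP sockets.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: ask the kernel to attach the TTL of each
 * received packet as ancillary data.  Returns 0 on success. */
int enable_recv_ttl(int fd)
{
    int on = 1;

    return setsockopt(fd, IPPROTO_IP, IP_RECVTTL, &on, sizeof(on));
}

/* Hypothetical helper: scan the control messages filled in by
 * recvmsg() and return the TTL, or -1 if none is present. */
int ttl_from_msghdr(struct msghdr *msg)
{
    struct cmsghdr *c;

    for (c = CMSG_FIRSTHDR(msg); c != NULL; c = CMSG_NXTHDR(msg, c)) {
        if (c->cmsg_level == IPPROTO_IP &&
            c->cmsg_type == IP_TTL &&
            c->cmsg_len >= CMSG_LEN(sizeof(int))) {
            int ttl;

            /* memcpy avoids unaligned access to the cmsg payload */
            memcpy(&ttl, CMSG_DATA(c), sizeof(ttl));
            return ttl;
        }
    }
    return -1;
}
```

ICMP errors would be read in a similar way, by enabling IP_RECVERR and
calling recvmsg() with MSG_ERRQUEUE; that part is omitted here.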

socket(2) is restricted to the group range specified in
"/proc/sys/net/ipv4/ping_group_range".  It is "1 0" by default, meaning
that nobody (not even root) may create ping sockets.  Setting it to "100
100" would grant permission to a single group (either to make
/sbin/ping g+s and owned by this group, or to grant permission to a
"netadmins" group), "0 4294967295" would enable it for the world, and
"100 4294967295" would enable it for users, but not daemons.
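
The range semantics can be sketched in userspace as follows. This is an
illustration only, not the kernel's implementation; the helper
gid_in_ping_range() is hypothetical and assumes the sysctl value is two
decimal numbers, "low high", with both bounds inclusive.

```c
#include <stdio.h>

/* Hypothetical helper: return 1 if gid lies within the inclusive
 * range given as "low high" (the ping_group_range format), else 0.
 * A range with low > high, such as the default "1 0", matches
 * nothing. */
int gid_in_ping_range(const char *range, unsigned long long gid)
{
    unsigned long long low, high;

    if (sscanf(range, "%llu %llu", &low, &high) != 2)
        return 0;   /* malformed input: treat as "nobody" */
    return low <= gid && gid <= high;
}
```

With this helper, "1 0" rejects every gid including 0 (root), "100 100"
accepts only gid 100, and "100 4294967295" accepts gid 1000 but not
gid 50.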

The existing code might be (in the unlikely case anyone needs it)
extended rather easily to handle other similar pairs of ICMP messages
(Timestamp/Reply, Information Request/Reply, Address Mask Request/Reply
etc.).

Userspace ping util & patch for it:
http://openwall.info/wiki/people/segoon/ping

For Openwall GNU/*/Linux it was the last step on the road to the
setuid-less distro.  A revision of this patch (for RHEL5/OpenVZ kernels)
is in use in Owl-current, such as in the 2011/03/12 LiveCD ISOs:
http://mirrors.kernel.org/openwall/Owl/current/iso/

Initially this functionality was written by Pavel Kankovsky for
Linux 2.4.32, but unfortunately it was never made public.

All ping options (-b, -p, -Q, -R, -s, -t, -T, -M, -I) were tested with
the patch.

PATCH v3:
    - switched to flowi4.
    - minor changes to be consistent with raw sockets code.

PATCH v2:
    - changed ping_debug() to pr_debug().
    - removed CONFIG_IP_PING.
    - removed ping_seq_fops.owner field (unused for procfs).
    - switched to proc_net_fops_create().
    - switched to %pK in seq_printf().

PATCH v1:
    - fixed checksumming bug.
    - CAP_NET_RAW may not create icmp sockets anymore.

RFC v2:
    - minor cleanups.
    - introduced sysctl'able group range to restrict socket(2).

Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13 16:08:13 -04:00
..
9p Fix common misspellings 2011-03-31 11:26:23 -03:00
bluetooth Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem 2011-05-05 13:32:35 -04:00
caif caif: code cleanup 2011-04-11 15:08:47 -07:00
irda Fix common misspellings 2011-03-31 11:26:23 -03:00
iucv Fix common misspellings 2011-03-31 11:26:23 -03:00
netfilter net: Remove __KERNEL__ cpp checks from include/net 2011-04-24 10:54:56 -07:00
netns net: ipv4: add IPPROTO_ICMP socket kind 2011-05-13 16:08:13 -04:00
phonet net: dont hold rtnl mutex during netlink dump callbacks 2011-05-02 15:26:28 -07:00
sctp sctp: Store a flowi in transports to provide persistent keying. 2011-05-08 14:05:14 -07:00
tc_act net/sched: add ACT_CSUM action to update packets checksums 2010-08-20 01:42:59 -07:00
act_api.h pkt_sched: gen_kill_estimator() rcu fixes 2010-06-11 18:37:08 -07:00
addrconf.h net: Remove __KERNEL__ cpp checks from include/net 2011-04-24 10:54:56 -07:00
af_ieee802154.h
af_rxrpc.h net: Remove __KERNEL__ cpp checks from include/net 2011-04-24 10:54:56 -07:00
af_unix.h net: Remove __KERNEL__ cpp checks from include/net 2011-04-24 10:54:56 -07:00
ah.h ipsec: update MAX_AH_AUTH_LEN to support sha512 2011-01-13 21:48:25 -08:00
arp.h arp: allow to invalidate specific ARP entries 2011-01-10 16:10:37 -08:00
atmclip.h net: Remove __KERNEL__ cpp checks from include/net 2011-04-24 10:54:56 -07:00
ax25.h
ax88796.h
cfg80211.h {mac|nl}80211: Add station connected time 2011-04-12 16:58:47 -04:00
checksum.h
cipso_ipv4.h
cls_cgroup.h Merge commit 'v2.6.36-rc7' into core/rcu 2010-10-07 09:43:45 +02:00
compat.h net: Add sendmmsg socket system call 2011-05-05 11:10:14 -07:00
datalink.h
dcbevent.h net_dcb: add application notifiers 2010-12-31 10:47:46 -08:00
dcbnl.h dcbnl: add support for retrieving peer configuration - cee 2011-03-02 21:58:55 -08:00
dn.h decnet: Convert to use flowidn where applicable. 2011-03-12 15:08:55 -08:00
dn_dev.h decnet: RCU conversion and get rid of dev_base_lock 2010-11-08 13:50:08 -08:00
dn_fib.h decnet: Convert to use flowidn where applicable. 2011-03-12 15:08:55 -08:00
dn_neigh.h
dn_nsp.h net: use __packed annotation 2010-06-03 03:21:52 -07:00
dn_route.h decnet: Convert to use flowidn where applicable. 2011-03-12 15:08:55 -08:00
dsa.h
dsfield.h
dst.h net: Make dst_alloc() take more explicit initializations. 2011-04-28 22:25:59 -07:00
dst_ops.h net: Implement read-only protection and COW'ing of metrics. 2011-01-26 20:51:05 -08:00
esp.h
ethoc.h
fib_rules.h fib_rules: __rcu annotates ctarget 2010-10-27 11:37:32 -07:00
flow.h net: Order ports in same order as addresses in flow objects. 2011-03-31 18:03:35 -07:00
garp.h garp: remove last synchronize_rcu() call 2011-05-12 17:46:56 -04:00
gen_stats.h Fix common misspellings 2011-03-31 11:26:23 -03:00
genetlink.h include/net/genetlink.h: Allow genlmsg_cancel to accept a NULL argument 2011-02-03 20:47:08 -08:00
gre.h PPTP: PPP over IPv4 (Point-to-Point Tunneling Protocol) 2010-08-21 23:05:39 -07:00
icmp.h inetpeer: Move ICMP rate limiting state into inet_peer entries. 2011-02-04 15:59:53 -08:00
ieee80211_radiotap.h mac80211: add MCS information to radiotap 2011-01-28 15:44:29 -05:00
ieee802154.h
ieee802154_netdev.h
if_inet6.h net: Remove __KERNEL__ cpp checks from include/net 2011-04-24 10:54:56 -07:00
inet6_connection_sock.h inet: Pass flowi to ->queue_xmit(). 2011-05-08 15:28:28 -07:00
inet6_hashtables.h
inet_common.h inet, inet6: make tcp_sendmsg() and tcp_sendpage() through inet_sendmsg() and inet_sendpage() 2010-07-12 20:21:46 -07:00
inet_connection_sock.h inet: Pass flowi to ->queue_xmit(). 2011-05-08 15:28:28 -07:00
inet_ecn.h net: return operator cleanup 2010-09-23 14:33:39 -07:00
inet_frag.h fragment: add fast path for in-order fragments 2010-06-30 13:44:29 -07:00
inet_hashtables.h tproxy: fix hash locking issue when using port redirection in __inet_inherit_port() 2010-10-21 13:06:43 +02:00
inet_sock.h inet: Decrease overhead of on-stack inet_cork. 2011-05-06 15:37:57 -07:00
inet_timewait_sock.h net: optimize INET input path further 2010-12-09 20:05:58 -08:00
inetpeer.h inet: constify ip headers and in6_addr 2011-04-22 11:04:14 -07:00
ip.h ipv4: Pass explicit daddr arg to ip_send_reply(). 2011-05-10 13:32:46 -07:00
ip6_checksum.h
ip6_fib.h net: Remove __KERNEL__ cpp checks from include/net 2011-04-24 10:54:56 -07:00
ip6_route.h net: Remove __KERNEL__ cpp checks from include/net 2011-04-24 10:54:56 -07:00
ip6_tunnel.h tunnels: add _rcu annotations 2010-10-25 13:09:45 -07:00
ip_fib.h ipv4: Call fib_select_default() only when actually necessary. 2011-04-14 15:05:22 -07:00
ip_vs.h ipvs: Remove all remaining references to rt->rt_{src,dst} 2011-05-12 18:24:46 -04:00
ipcomp.h
ipconfig.h
ipip.h tunnels: add __rcu annotations 2010-10-27 11:37:32 -07:00
ipv6.h net: Remove __KERNEL__ cpp checks from include/net 2011-04-24 10:54:56 -07:00
ipx.h net: Remove __KERNEL__ cpp checks from include/net 2011-04-24 10:54:56 -07:00
iw_handler.h Fix common misspellings 2011-03-31 11:26:23 -03:00
lapb.h
lib80211.h lib80211: remove unused host_build_iv option 2010-07-26 15:09:04 -04:00
llc.h
llc_c_ac.h
llc_c_ev.h
llc_c_st.h
llc_conn.h
llc_if.h
llc_pdu.h
llc_s_ac.h
llc_s_ev.h
llc_s_st.h
llc_sap.h
mac80211.h Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem 2011-05-05 13:32:35 -04:00
mip6.h net: use __packed annotation 2010-06-03 03:21:52 -07:00
mld.h
ndisc.h net: Remove __KERNEL__ cpp checks from include/net 2011-04-24 10:54:56 -07:00
neighbour.h Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-11-19 13:13:47 -08:00
net_namespace.h ipvs: move struct netns_ipvs 2011-03-15 09:36:50 +09:00
netdma.h
netevent.h net: Remove __KERNEL__ cpp checks from include/net 2011-04-24 10:54:56 -07:00
netlabel.h
netlink.h netfilter: NFNL_SUBSYS_IPSET id and NLA_PUT_NET* macros 2011-02-01 15:20:14 +01:00
netrom.h
nexthop.h
nl802154.h
p8022.h
ping.h net: ipv4: add IPPROTO_ICMP socket kind 2011-05-13 16:08:13 -04:00
pkt_cls.h net: Fix range checks in tcf_valid_offset(). 2010-12-21 12:43:16 -08:00
pkt_sched.h Fix common misspellings 2011-03-31 11:26:23 -03:00
protocol.h net: change netdev->features to u32 2011-01-24 15:32:47 -08:00
psnap.h
raw.h include/net/raw.h: Convert raw_seq_private macro to inline 2010-09-08 13:42:22 -07:00
rawv6.h net: Remove __KERNEL__ cpp checks from include/net 2011-04-24 10:54:56 -07:00
red.h sched: remove unused backlog in RED stats 2011-01-12 19:00:39 -08:00
regulatory.h cfg80211: Fix regulatory bug with multiple cards and delays 2010-11-22 15:48:51 -05:00
request_sock.h
rose.h rose: Add length checks to CALL_REQUEST parsing 2011-03-27 17:59:04 -07:00
route.h ipv4: Kill rt->rt_{src, dst} usage in IP GRE tunnels. 2011-05-04 12:55:07 -07:00
rtnetlink.h rtnl: make link af-specific updates atomic 2010-11-27 22:56:08 -08:00
sch_generic.h net_sched: fix THROTTLED/RUNNING race 2011-03-24 00:13:14 -07:00
scm.h scm: lower SCM_MAX_FD 2010-11-24 11:16:43 -08:00
slhc_vj.h
snmp.h snmp: SNMP_UPD_PO_STATS_BH() always called from softirq 2011-03-21 18:12:54 -07:00
sock.h Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2011-04-11 13:44:25 -07:00
stp.h
tcp.h tcp: Remove debug macro of TCP_CHECK_TIMER 2011-02-20 11:10:14 -08:00
tcp_states.h
timewait_sock.h timewait_sock: Create and use getpeer op. 2010-12-01 18:09:13 -08:00
transp_v6.h net: Remove __KERNEL__ cpp checks from include/net 2011-04-24 10:54:56 -07:00
udp.h udp: Switch to ip_finish_skb 2011-03-01 12:35:03 -08:00
udplite.h udp: Switch to ip_finish_skb 2011-03-01 12:35:03 -08:00
wext.h
wimax.h net: Remove __KERNEL__ cpp checks from include/net 2011-04-24 10:54:56 -07:00
wpan-phy.h Fix common misspellings 2011-03-31 11:26:23 -03:00
x25.h X25 remove bkl in subscription ioctls 2010-11-28 11:12:20 -08:00
x25device.h
xfrm.h Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2011-05-11 14:26:58 -04:00