linux

Commit Graph

Author	SHA1	Message	Date
Michael Chan	2c61d2117e	bnxt_en: Add helper functions to get firmware CP ring ID. On the new 57500 chips, getting the associated CP ring ID associated with an RX ring or TX ring is different than before. On the legacy chips, we find the associated ring group and look up the CP ring ID. On the 57500 chips, each RX ring and TX ring has a dedicated CP ring even if they share the MSIX. Use these helper functions at appropriate places to get the CP ring ID. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:44:32 -07:00
Michael Chan	50e3ab7836	bnxt_en: Allocate completion ring structures for 57500 series chips. On 57500 chips, the original bnxt_cp_ring_info struct now refers to the NQ. bp->cp_nr_rings refer to the number of NQs on 57500 chips. There are now 2 pointers for the CP rings associated with RX and TX rings. Modify bnxt_alloc_cp_rings() and bnxt_free_cp_rings() accordingly. With multiple CP rings per NAPI, we need to add a pointer in bnxt_cp_ring_info struct to point back to the bnxt_napi struct. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:44:32 -07:00
Michael Chan	41e8d79837	bnxt_en: Modify the ring reservation functions for 57500 series chips. The ring reservation functions have to be modified for P5 chips in the following ways: - bnxt_cp_ring_info structs map to internal NQs as well as CP rings. - Ring groups are not used. - 1 CP ring must be available for each RX or TX ring. - number of RSS contexts to reserve is multiples of 64 RX rings. - RFS currently not supported. Also, RX AGG rings are only used for jumbo frames, so we need to unconditionally call bnxt_reserve_rings() in __bnxt_open_nic() to see if we need to reserve AGG rings in case MTU has changed. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:44:32 -07:00
Michael Chan	9c1fabdf42	bnxt_en: Adjust MSIX and ring groups for 57500 series chips. Store the maximum MSIX capability in PCIe config. space earlier. When we call firmware to query capability, we need to compare the PCIe MSIX max count with the firmware count and use the smaller one as the MSIX count for 57500 (P5) chips. The new chips don't use ring groups. But previous chips do and the existing logic limits the available rings based on resource calculations including ring groups. Setting the max ring groups to the max rx rings will work on the new chips without changing the existing logic. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:44:32 -07:00
Michael Chan	697197e5a1	bnxt_en: Re-structure doorbells. The 57500 series chips have a new 64-bit doorbell format. Use a new bnxt_db_info structure to unify the new and the old 32-bit doorbells. Add a new bnxt_set_db() function to set up the doorbell addreses and doorbell keys ahead of time. Modify and introduce new doorbell helpers to help abstract and unify the old and new doorbells. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:44:32 -07:00
Michael Chan	e38287b72e	bnxt_en: Add 57500 new chip ID and basic structures. 57500 series is a new chip class (P5) that requires some driver changes in the next several patches. This adds basic chip ID, doorbells, and the notification queue (NQ) structures. Each MSIX is associated with an NQ instead of a CP ring in legacy chips. Each NQ has up to 2 associated CP rings for RX and TX. The same bnxt_cp_ring_info struct will be used for the NQ. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:44:32 -07:00
Michael Chan	1b9394e5a2	bnxt_en: Configure context memory on new devices. Call firmware to configure the DMA addresses of all context memory pages on new devices requiring context memory. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:44:32 -07:00
Michael Chan	98f04cf0f1	bnxt_en: Check context memory requirements from firmware. New device requires host context memory as a backing store. Call firmware to check for context memory requirements and store the parameters. Allocate host pages accordingly. We also need to move the call bnxt_hwrm_queue_qportcfg() earlier so that all the supported hardware queues and the IDs are known before checking and allocating context memory. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:44:32 -07:00
Michael Chan	66cca20abc	bnxt_en: Add new flags to setup new page table PTE bits on newer devices. Newer chips require the PTU_PTE_VALID bit to be set for every page table entry for context memory and rings. Additional bits are also required for page table entries for all rings. Add a flags field to bnxt_ring_mem_info struct to specify these additional bits to be used when setting up the pages tables as needed. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:44:31 -07:00
Michael Chan	6fe1988685	bnxt_en: Refactor bnxt_ring_struct. Move the DMA page table and vmem fields in bnxt_ring_struct to a new bnxt_ring_mem_info struct. This will allow context memory management for a new device to re-use some of the existing infrastructure. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:44:31 -07:00
Michael Chan	74706afa71	bnxt_en: Update interrupt coalescing logic. New firmware spec. allows interrupt coalescing parameters, such as maximums, timer units, supported features to be queried. Update the driver to make use of the new call to query these parameters and provide the legacy defaults if the call is not available. Replace the hard-coded values with these parameters. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:44:31 -07:00
Michael Chan	1dfddc41ae	bnxt_en: Add maximum extended request length fw message support. Support the max_ext_req_len field from the HWRM_VER_GET_RESPONSE. If this field is valid and greater than the mailbox size, use the short command format to send firmware messages greater than the mailbox size. Newer devices use this method to send larger messages to the firmware. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:44:31 -07:00
Michael Chan	36e53349b6	bnxt_en: Add additional extended port statistics. Latest firmware spec. has some additional rx extended port stats and new tx extended port stats added. We now need to check the size of the returned rx and tx extended stats and determine how many counters are valid. New counters added include CoS byte and packet counts for rx and tx. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:44:31 -07:00
Michael Chan	31d357c069	bnxt_en: Update firmware interface spec. to 1.10.0.3. Among the new changes are trusted VF support, 200Gbps support, and new API to dump ring information on the new chips. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:44:31 -07:00
David S. Miller	9e983c5898	Merge branch 'selftests-pmtu-Add-test-choice-and-captures' Stefano Brivio says: ==================== selftests: pmtu: Add test choice and captures This series adds a couple of features useful for debugging: 1/2 allows selecting single tests and 2/2 adds optional traffic captures. Semantics for current invocation of test script are preserved. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:37:28 -07:00
Stefano Brivio	bb059fb204	selftests: pmtu: Add optional traffic captures for single tests If --trace is passed as an option and tcpdump is available, capture traffic for all relevant interfaces to per-test pcap files named <test>_<interface>.pcap. Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:37:28 -07:00
Stefano Brivio	55bbc8ff49	selftests: pmtu: Allow selection of single tests As number of tests is growing, it's quite convenient to allow single tests to be run. Display usage when the script is run with any invalid argument, keep existing semantics when no arguments are passed so that automated runs won't break. Instead of just looping on the list of requested tests, if any, check first that they exist, and go through them in a nested loop to keep the existing way to display test descriptions. Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:37:28 -07:00
Heiner Kallweit	2527e4037f	r8169: remove unneeded call to netif_stop_queue in rtl8169_net_suspend netif_device_detach() stops all tx queues already, so we don't need this call. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:35:03 -07:00
Heiner Kallweit	34bc009543	r8169: simplify rtl8169_set_magic_reg Simplify this function, no functional change intended. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:34:34 -07:00
Arnd Bergmann	44eb385bc5	octeontx2-af: remove unused cgx_fwi_link_change The newly added driver causes a warning about a function that is not used anywhere: drivers/net/ethernet/marvell/octeontx2/af/cgx.c:320:12: error: 'cgx_fwi_link_change' defined but not used [-Werror=unused-function] Remove it for now, until a user gets added. If we want to use this function from another module, we also need a declaration in a header file, which is currently missing, so it would have to change anyway. Fixes: `1463f382f5` ("octeontx2-af: Add support for CGX link management") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:31:54 -07:00
Ryan C Goodfellow	5948185b97	nfp: devlink port split support for 1x100G CXP NIC This commit makes it possible to use devlink to split the 100G CXP Netronome into two 40G interfaces. Currently when you ask for 2 interfaces, the math in src/nfp_devlink.c:nfp_devlink_port_split calculates that you want 5 lanes per port because for some reason eth_port.port_lanes=10 (shouldn't this be 12 for CXP?). What we really want when asking for 2 breakout interfaces is 4 lanes per port. This commit makes that happen by calculating based on 8 lanes if 10 are present. Signed-off-by: Ryan C Goodfellow <rgoodfel@isi.edu> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Greg Weeks <greg.weeks@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:29:55 -07:00
David S. Miller	ca0f32d5d9	Merge branch 'dpaa2-eth-code-cleanup' Ioana Ciornei says: ==================== dpaa2-eth: code cleanup There are no functional changes in this patch set, only some cleanup changes such as: unused parameters, uninitialized variables and unnecessary Kconfig dependencies. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:23:20 -07:00
Ioana Radulescu	b948c8c6a7	dpaa2-eth: remove unused FD field According to the hardware ArchDef, the PTV1 field in FD[CTRL] is ignored by WRIOP, so setting it for Tx FDs is pointless. Remove all references to it from the code. Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:23:19 -07:00
Ioana Ciornei	b00c898c00	dpaa2-eth: mark unused parameter in dpaa2_eth_tx_conf The ch parameter is never used in the dpaa2_eth_tx_conf function but since its prototype must match the type defined in the consume field of struct dpaa2_eth_fq, just mark it as __always_unused. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:23:19 -07:00
Ioana Ciornei	fdb6ca9e46	dpaa2-eth: remove unused priv parameter The priv parameter is never used in the build_linear_skb and drain_channel function. Remove it from the function definitions. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:23:19 -07:00
Ioana Ciornei	85b7a342ba	dpaa2-eth: fix uninitialized variable warnings All 3 cases of possible uninitialized variables are false positives since they are used only as output parameters. Nonetheless, fix the warnings. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:23:19 -07:00
Ioana Ciornei	3233c1514f	dpaa2-eth: make dpaa2_eth_set_dist_key static The dpaa2_eth_set_dist_key function is only used in a single file. Make it static. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:23:19 -07:00
Ioana Radulescu	b12cef51b5	dpaa2-eth: Fix Kconfig dependencies Both ARCH_LAYERSCAPE and COMPILE_TEST dependencies are already implied through the FSL_MC_BUS dep, so there's no need to state it explicitly. Also, the fsl-mc bus depends on COMPILE_TEST only for some architectures (arm, arm64, ppc, x86), so it's not correct to claim build support unconditionally. Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:23:19 -07:00
Ivan Khoronzhuk	5b3a5a14f8	net: ethernet: ti: cpsw: use for mcast entries only host port In dual-emac mode the cpsw driver sends directed packets, that means that packets go to the directed port, but an ALE lookup is performed to determine untagged egress only. It means that on tx side no need to add port bit for ALE mcast entry mask, and basically ALE entry for port identification is needed only on rx side. So, add only host port in dual_emac mode as used directed transmission, and no need in one more port. For single port boards and switch mode all ports used, as usual, so no changes for them. Also it simplifies farther changes. In other words, mcast entries for dual-emac should behave exactly like unicast. It also can help avoid leaking packets between ports with same vlan on h/w level if ports could became members of same vid. So now, for instance, if mcast address 33:33:00:00:00:01 is added then entries in ALE table: vid = 1, addr = 33:33:00:00:00:01, port_mask = 0x1 vid = 2, addr = 33:33:00:00:00:01, port_mask = 0x1 Instead of: vid = 1, addr = 33:33:00:00:00:01, port_mask = 0x3 vid = 2, addr = 33:33:00:00:00:01, port_mask = 0x5 With the same considerations, set only host port for unregistered mcast for dual-emac mode in case of IFF_ALLMULTI is set, exactly like it's done in cpsw_ale_set_allmulti(). Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:22:12 -07:00
David S. Miller	ba722f9b6f	Merge branch 'net-ethernet-ti-cpsw-fix-mcast-packet-lost' Ivan Khoronzhuk says: ==================== net: ethernet: ti: cpsw fix mcast packet lost The patchset omits redundant refresh of mcast address table and prevents mcast packet lost. Based on net-next/master tested on am572x evm ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:21:28 -07:00
Ivan Khoronzhuk	5da1948969	net: ethernet: ti: cpsw: fix lost of mcast packets while rx_mode update Whenever kernel or user decides to call rx mode update, it clears every multicast entry from forwarding table and in some time adds it again. This time can be enough to drop incoming multicast packets. That's why clear only staled multicast entries and update or add new one afterwards. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:21:28 -07:00
Ivan Khoronzhuk	58bdeac8b0	net: ethernet: ti: cpsw_ale: use const for API having pointer on mac address It allows to use function under callbacks with same const qualifier of mac address for farther changes. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:21:27 -07:00
David S. Miller	5985d5631d	Merge branch 'net-phy-improve-and-simplify-state-machine' Heiner Kallweit says: ==================== net: phy: improve and simplify state machine Improve / simplify handling of states PHY_RUNNING and PHY_RESUMING in phylib state machine. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:06:38 -07:00
Heiner Kallweit	eb4c470a15	net: phy: simplify handling of PHY_RESUMING in state machine Simplify code for handling state PHY_RESUMING, no functional change intended. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:06:38 -07:00
Heiner Kallweit	74fb5e25a3	net: phy: improve handling of PHY_RUNNING in state machine Handling of state PHY_RUNNING seems to be more complex than it needs to be. If not polling, then we don't have to do anything, we'll receive an interrupt and go to state PHY_CHANGELINK once the link goes down. If polling and link is down, we don't have to go the extra mile over PHY_CHANGELINK and call phy_read_status() again but can set status PHY_NOLINK directly. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:06:38 -07:00
Roopa Prabhu	0813e95760	vxlan: support NTF_USE refresh of fdb entries This makes use of NTF_USE in vxlan driver consistent with bridge driver. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:02:57 -07:00
Justin.Lee1@Dell.com	9771b8ccdf	net/ncsi: Extend NC-SI Netlink interface to allow user space to send NC-SI command The new command (NCSI_CMD_SEND_CMD) is added to allow user space application to send NC-SI command to the network card. Also, add a new attribute (NCSI_ATTR_DATA) for transferring request and response. The work flow is as below. Request: User space application -> Netlink interface (msg) -> new Netlink handler - ncsi_send_cmd_nl() -> ncsi_xmit_cmd() Response: Response received - ncsi_rcv_rsp() -> internal response handler - ncsi_rsp_handler_xxx() -> ncsi_rsp_handler_netlink() -> ncsi_send_netlink_rsp () -> Netlink interface (msg) -> user space application Command timeout - ncsi_request_timeout() -> ncsi_send_netlink_timeout () -> Netlink interface (msg with zero data length) -> user space application Error: Error detected -> ncsi_send_netlink_err () -> Netlink interface (err msg) -> user space application Signed-off-by: Justin Lee <justin.lee1@dell.com> Reviewed-by: Samuel Mendoza-Jonas <sam@mendozajonas.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:00:59 -07:00
Heiner Kallweit	6384e48323	net: phy: trigger state machine immediately in phy_start_machine When starting the state machine there may be work to be done immediately, e.g. if the initial state is PHY_UP then the state machine may trigger an autonegotiation. Having said that I see no need to wait a second until the state machine is run first time. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 22:00:06 -07:00
David S. Miller	a75d1801a9	Merge branch 'veth-XDP-stats-improvement' Toshiaki Makita says: ==================== veth: XDP stats improvement ndo_xdp_xmit in veth did not update packet counters as described in [1]. Also, current implementation only updates counters on tx side so rx side events like XDP_DROP were not collected. This series implements the missing accounting as well as support for ethtool per-queue stats in veth. Patch 1: Update drop counter in ndo_xdp_xmit. Patch 2: Update packet and byte counters for all XDP path, and drop counter on XDP_DROP. Patch 3: Support per-queue ethtool stats for XDP counters. Note that counters are maintained on per-queue basis for XDP but not otherwise (per-cpu and atomic as before). This is because 1) tx path in veth is essentially lockless so we cannot update per-queue stats on tx, and 2) rx path is net core routine (process_backlog) which cannot update per-queue based stats when XDP is disabled. On the other hand there are real rxqs and napi handlers for veth XDP, so update per-queue stats on rx for XDP packets, and use them to calculate tx counters as well, contrary to the existing non-XDP counters. [1] https://patchwork.ozlabs.org/cover/953071/#1967449 ==================== Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 21:58:46 -07:00
Toshiaki Makita	d397b9682c	veth: Add ethtool statistics support for XDP Expose per-queue stats for ethtool -S. As there are only rx queues, and rx queues are used only when XDP is used, per-queue counters are only rx XDP ones. Example: $ ethtool -S veth0 NIC statistics: peer_ifindex: 11 rx_queue_0_xdp_packets: 28601434 rx_queue_0_xdp_bytes: 1716086040 rx_queue_0_xdp_drops: 28601434 rx_queue_1_xdp_packets: 17873050 rx_queue_1_xdp_bytes: 1072383000 rx_queue_1_xdp_drops: 17873050 Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 21:58:46 -07:00
Toshiaki Makita	4195e54aaf	veth: Account for XDP packet statistics on rx side On XDP path veth has napi handler so we can collect statistics on per-queue basis for XDP. By this change now we can collect XDP_DROP drop count as well as packets and bytes coming through ndo_xdp_xmit. Packet counters shown by "ip -s link", sysfs stats or /proc/net/dev is now correct for XDP. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 21:58:46 -07:00
Toshiaki Makita	2131479df6	veth: Account for packet drops in ndo_xdp_xmit Use existing atomic drop counter. Since drop path is really an exceptional case here, I'm thinking atomic ops would not hurt the performance. XDP packets and bytes are not counted in ndo_xdp_xmit, but will be accounted on rx side by the following commit. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 21:58:46 -07:00
Hoang Le	acad76a5f6	tipc: support binding to specific ip address when activating UDP bearer INADDR_ANY is hard-coded when activating UDP bearer. So, we could not bind to a specific IP address even with replicast mode using - given remote ip address instead of using multicast ip address. In this commit, we fixed it by checking and switch to use appropriate local ip address. before: $netstat -plu Active Internet connections (only servers) Proto Recv-Q Send-Q Local Address Foreign Address udp 0 0 0.0.0.0:6118 0.0.0.0:* after: $netstat -plu Active Internet connections (only servers) Proto Recv-Q Send-Q Local Address Foreign Address udp 0 0 10.0.0.2:6118 0.0.0.0:* Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 21:56:56 -07:00
David S. Miller	1986647c2f	mlx5e-updates-2018-10-10 IPoIB netlink support and mlx5e pre-allocated netdevice initialization IP link was broken due to the changes in IPoIB for the rdma_netdev support after commit `cd565b4b51` ("IB/IPoIB: Support acceleration options callbacks"). This patchset fixes IPoIB pkey creation and removal using rtnetlink by adding support in both IPoIB ULP layer and mlx5 layer: From Jason and Denis: 1) Introduces changes in the RDMA netdev code in order to allow allocation of the netdev to be done by the rtnl netdev code. 2) Reworks IPoIB initialization to use the two step rdma_netdev creation. From Feras and Saeed, mlx5e netdev layer refactoring to allow accepting pre-allocated netdevs: 3) Adds support to initialize/cleanup netdevs that are not created by mlx5 driver. 4) Change mlx5e netdevice layer to accept the pre-allocated netdevice queue number. 5) Initialize mlx5e generic structures in one place to be used for all netdevs types NIC/representors/IPoIB (both mlx5 allocated and pre-allocted). -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJbvqHhAAoJEEg/ir3gV/o+TRQH+wYWgvi5AbqYlGpT8xSho8Dg 5tuFE9AbYDGLLWptZgtYXsBsTaltGVqKnwWy//dAuuSI7LvqtjeKqq0G1JKau70L /49+vv8NTQ6vov2yY95yzzEo5B1/zVcYEl9dmJl4y6Xlwc+YIteewReVqRj+tucR xO7ghLWc1o4Pl7gWBAcOULyOsZc+CtG8ElwEaBROXTtYRpsR7FKn7vN+WdWZ2VOG MjtCUaG0vZlK1vaF76J3P9bL2V3y6o6i6S8RZwg1kPxO4jnrzyGrkxWMOX6f32KB JAUcLVlJW5wckcRAKfTwLxsFMrc9vLBPHyj4XneXVWGKh8/dGUGY5gdGIJV5Jlc= =m0tM -----END PGP SIGNATURE----- Merge tag 'mlx5e-updates-2018-10-10' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5e-updates-2018-10-10 IPoIB netlink support and mlx5e pre-allocated netdevice initialization IP link was broken due to the changes in IPoIB for the rdma_netdev support after commit `cd565b4b51` ("IB/IPoIB: Support acceleration options callbacks"). This patchset fixes IPoIB pkey creation and removal using rtnetlink by adding support in both IPoIB ULP layer and mlx5 layer: From Jason and Denis: 1) Introduces changes in the RDMA netdev code in order to allow allocation of the netdev to be done by the rtnl netdev code. 2) Reworks IPoIB initialization to use the two step rdma_netdev creation. From Feras and Saeed, mlx5e netdev layer refactoring to allow accepting pre-allocated netdevs: 3) Adds support to initialize/cleanup netdevs that are not created by mlx5 driver. 4) Change mlx5e netdevice layer to accept the pre-allocated netdevice queue number. 5) Initialize mlx5e generic structures in one place to be used for all netdevs types NIC/representors/IPoIB (both mlx5 allocated and pre-allocted). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 21:49:56 -07:00
David S. Miller	3325cf9e51	Merge branch 'defza-fddi' Maciej W. Rozycki says: ==================== FDDI: DEC FDDIcontroller 700 TURBOchannel adapter support This is an update to <http://patchwork.ozlabs.org/patch/342737/>. I believe I have addressed all the requests made in the previous review round. There is still one `checkpatch.pl' warning remaining: WARNING: quoted string split across lines + pr_info("%s: ROM rev. %.4s, firmware rev. %.4s, RMC rev. %.4s, " + "SMT ver. %u\n", fp->name, rom_rev, fw_rev, rmc_rev, smt_ver); total: 0 errors, 1 warnings, 2458 lines checked however I think the value of staying within 80 columns is higher than the value of having the string on a single line. This is because with all the formatting specifiers there it is not directly greppable based on the final output produced to the kernel log on one hand, e.g.: tc2: ROM rev. 1.0, firmware rev. 1.2, RMC rev. A, SMT ver. 1 while it can be easily tracked down by grepping for an obvious substring such as "RMC rev" on the other. The issue with MMIO barriers I discussed in the course of the original review turned out mostly irrelevant to this driver, because as I have learnt in a recent Alpha/Linux discussion starting here: <https://marc.info/?i=alpine.LRH.2.02.1808161556450.13597%20()%20file01%20!%20intranet%20!%20prod%20!%20int%20!%20rdu2%20!%20redhat%20!%20com> our MMIO API mandates the `readX' and `writeX' accessors to be strongly ordered with respect to each other, even if that is not implicitly enforced by hardware. Consequently I have removed all the explicit ordering barriers and instead submitted a fix for MIPS MMIO implementation, which currently does not guarantee strong ordering (the MIPS architecture does not define bus ordering rules except in terms of SYNC barriers), as recorded here: <https://patchwork.linux-mips.org/project/linux-mips/list/?series=1538>. Enforcing strong MMIO ordering can be costly however and is often unnecessary, e.g. when using PIO to access network frame data in onboard packet memory. I have therefore retained the information that would be lost by the removal of barriers, by defining accessor wrappers suffixed by `_o' and `_u', for accesses that have to be ordered and can be unordered respectively. If we ever have an API defined for weakly-ordered MMIO accesses, then these wrappers can be redefined accordingly. Right now they all expand to the respective `_relaxed' accessors, because, again, enforcing the ordering WRT DMA transfers can be costly and we don't need it here except in one place, where I chose to use explicit `dma_rmb' instead. Similarly I have replaced the completion barriers with a read back from the respective MMIO location (all adapter MMIO registers can be read with no side effects incurred), which will serve its purpose on the basis of MMIO being strongly ordered (although a read from TURBOchannel is going to be slower than `iob', making the delay incurred unnecessarily longer). And last but not least, I have split off the SMT Tx network tap support to a separate change, 2/2 in this series, so that it does not block the driver proper and can be discussed separately. I think it has value in that it makes the view of the outgoing network traffic complete, as if one actually physically tapped into the outgoing line of the ring, between the station being examined and its downstream neighbour. Without this part only traffic passed from applications through the whole protocol stack can be captured and this is only a part of the view. With the `dev_queue_xmit_nit' interface now exported it's only `ptype_all' that remains private, and to define a properly abstracted API I propose to provide am exported `dev_nit_active' predicate that tells whether any taps are active. This predicate is then used accordingly. NB if there is a long-term maintenance concern about the `dev_nit_active' predicate, then well, corresponding inline code currently present in `xmit_one' has to be maintained anyway, and if the resulting changes require `defza' to be updated accordingly, then I am going to handle it; after some 20 years with Linux it's not that I am going to disappear anywhere anytime. And once I am dead, which is inevitably going to happen sooner or later, then the driver can simply be ripped from the kernel. Though I suspect that at that point no DECstation Linux users may survive anymore, even though hardware, being as sturdy as it is, likely will. I have a patch for `tcpdump' to actually decode SMT frames, which I plan to upstream sometime. Here's a sample of SMT traffic captured through the `defza' driver in a small network of 4 stations and no concentrators, printed in the most verbose mode: 01:16:59.138381 4f 00:60:b0:58:41:e7 00:60:b0:58:41:e7 73: SMT NIF ann vid:1 tid:00000270 sid:00-00-00-60-b0-58-41-e7 len:40: UNA: 00 00 00 06 0d 1a 02 ae StationDescr: 00 01 02 00 StationState: 00 00 30 00 MACFrameStatusFunctions.3: 00 00 00 01 01:17:00.332750 4f 08:00:2b:a3:a3:29 08:00:2b:a3:a3:29 73: SMT NIF ann vid:1 tid:0000013b sid:00-00-08-00-2b-a3-a3-29 len:40: UNA: 00 00 00 06 0d 1a 82 e7 StationDescr: 00 01 02 00 StationState: 00 00 30 00 MACFrameStatusFunctions.3: 00 00 00 01 01:17:00.354479 4f 00:60:b0:58:40:75 00:60:b0:58:40:75 73: SMT NIF ann vid:1 tid:0000029c sid:00-00-00-60-b0-58-40-75 len:40: UNA: 00 00 10 00 d4 74 b6 ae StationDescr: 00 01 02 00 StationState: 00 00 31 00 MACFrameStatusFunctions.3: 00 00 00 01 01:17:00.442175 4f 00:60:b0:58:41:e7 Broadcast 73: SMT NIF req vid:1 tid:00000271 sid:00-00-00-60-b0-58-41-e7 len:40: UNA: 00 00 00 06 0d 1a 02 ae StationDescr: 00 01 02 00 StationState: 00 00 30 00 MACFrameStatusFunctions.3: 00 00 00 01 01:17:00.448657 41 08:00:2b:a3:a3:29 00:60:b0:58:41:e7 73: SMT NIF rsp vid:1 tid:00000271 sid:00-00-08-00-2b-a3-a3-29 len:40: UNA: 00 00 00 06 0d 1a 82 e7 StationDescr: 00 01 02 00 StationState: 00 00 30 00 MACFrameStatusFunctions.3: 00 00 00 01 01:17:01.015152 4f 08:00:2b:a3:a3:29 Broadcast 73: SMT NIF req vid:1 tid:0000013c sid:00-00-08-00-2b-a3-a3-29 len:40: UNA: 00 00 00 06 0d 1a 82 e7 StationDescr: 00 01 02 00 StationState: 00 00 30 00 MACFrameStatusFunctions.3: 00 00 00 01 01:17:01.111644 41 08:00:2b:2e:6d:75 08:00:2b:a3:a3:29 73: SMT NIF rsp vid:1 tid:0000013c sid:00-00-08-00-2b-2e-6d-75 len:40: UNA: 00 00 10 00 d4 c5 c5 94 StationDescr: 00 01 01 00 StationState: 00 00 11 00 MACFrameStatusFunctions.2: 00 00 00 01 01:17:04.814603 4f 08:00:2b:2e:6d:75 Broadcast 73: SMT NIF req vid:1 tid:0000013c sid:00-00-08-00-2b-2e-6d-75 len:40: UNA: 00 00 10 00 d4 c5 c5 94 StationDescr: 00 01 01 00 StationState: 00 00 11 00 MACFrameStatusFunctions.2: 00 00 00 01 01:17:04.814939 4f 08:00:2b:2e:6d:75 Broadcast 73: SMT NIF req vid:1 tid:0000013c sid:00-00-08-00-2b-2e-6d-75 len:40: UNA: 00 00 10 00 d4 c5 c5 94 StationDescr: 00 01 01 00 StationState: 00 00 11 00 MACFrameStatusFunctions.2: 00 00 00 01 01:17:04.820960 4f 08:00:2b:2e:6d:75 08:00:2b:2e:6d:75 73: SMT NIF ann vid:1 tid:0000013b sid:00-00-08-00-2b-2e-6d-75 len:40: UNA: 00 00 10 00 d4 c5 c5 94 StationDescr: 00 01 01 00 StationState: 00 00 11 00 MACFrameStatusFunctions.2: 00 00 00 01 Questions, comments? Otherwise, please apply. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 21:46:07 -07:00
Maciej W. Rozycki	9f9a742db4	FDDI: defza: Support capturing outgoing SMT traffic DEC FDDIcontroller 700 (DEFZA) uses a Tx/Rx queue pair to communicate SMT frames with adapter's firmware. Any SMT frame received from the RMC via the Rx queue is queued back by the driver to the SMT Rx queue for the firmware to process. Similarly the firmware uses the SMT Tx queue to supply the driver with SMT frames which are queued back to the Tx queue for the RMC to send to the ring. When a network tap is attached to an FDDI interface handled by `defza' any incoming SMT frames captured are queued to our usual processing of network data received, which in turn delivers them to any listening taps. However the outgoing SMT frames produced by the firmware bypass our network protocol stack and are therefore not delivered to taps. This in turn means that taps are missing a part of network traffic sent by the adapter, which may make it more difficult to track down network problems or do general traffic analysis. Call `dev_queue_xmit_nit' then in the SMT Tx path, having checked that a network tap is attached, with a newly-created `dev_nit_active' helper wrapping the usual condition used in the transmit path. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 21:46:06 -07:00
Maciej W. Rozycki	61414f5ec9	FDDI: defza: Add support for DEC FDDIcontroller 700 TURBOchannel adapter Add support for the DEC FDDIcontroller 700 (DEFZA), Digital Equipment Corporation's first-generation FDDI network interface adapter, made for TURBOchannel and based on a discrete version of what eventually became Motorola's widely used CAMEL chipset. The CAMEL chipset is present for example in the DEC FDDIcontroller TURBOchannel, EISA and PCI adapters (DEFTA/DEFEA/DEFPA) that we support with the `defxx' driver, however the host bus interface logic and the firmware API are different in the DEFZA and hence a separate driver is required. There isn't much to say about the driver except that it works, but there is one peculiarity to mention. The adapter implements two Tx/Rx queue pairs. Of these one pair is the usual network Tx/Rx queue pair, in this case used by the adapter to exchange frames with the ring, via the RMC (Ring Memory Controller) chip. The Tx queue is handled directly by the RMC chip and resides in onboard packet memory. The Rx queue is maintained via DMA in host memory by adapter's firmware copying received data stored by the RMC in onboard packet memory. The other pair is used to communicate SMT frames with adapter's firmware. Any SMT frame received from the RMC via the Rx queue must be queued back by the driver to the SMT Rx queue for the firmware to process. Similarly the firmware uses the SMT Tx queue to supply the driver with SMT frames that must be queued back to the Tx queue for the RMC to send to the ring. This solution was chosen because the designers ran out of PCB space and could not squeeze in more logic onto the board that would be required to handle this SMT frame traffic without the need to involve the driver, as with the later DEFTA/DEFEA/DEFPA adapters. Finally the driver does some Frame Control byte decoding, so to avoid magic numbers some macros are added to <linux/if_fddi.h>. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 21:46:06 -07:00
Serhey Popovych	df52eab23d	tun: Consistently configure generic netdev params via rtnetlink Configuring generic network device parameters on tun will fail in presence of IFLA_INFO_KIND attribute in IFLA_LINKINFO nested attribute since tun_validate() always return failure. This can be visualized with following ip-link(8) command sequences: # ip link set dev tun0 group 100 # ip link set dev tun0 group 100 type tun RTNETLINK answers: Invalid argument with contrast to dummy and veth drivers: # ip link set dev dummy0 group 100 # ip link set dev dummy0 type dummy # ip link set dev veth0 group 100 # ip link set dev veth0 group 100 type veth Fix by returning zero in tun_validate() when @data is NULL that is always in case since rtnl_link_ops->maxtype is zero in tun driver. Fixes: `f019a7a594` ("tun: Implement ip link del tunXXX") Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-15 21:40:31 -07:00
Jakub Kicinski	0b592b5a01	tools: bpftool: add map create command Add a way of creating maps from user space. The command takes as parameters most of the attributes of the map creation system call command. After map is created its pinned to bpffs. This makes it possible to easily and dynamically (without rebuilding programs) test various corner cases related to map creation. Map type names are taken from bpftool's array used for printing. In general these days we try to make use of libbpf type names, but there are no map type names in libbpf as of today. As with most features I add the motivation is testing (offloads) :) Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-10-15 16:39:21 -07:00
Alexei Starovoitov	2f1d774f7d	Merge branch 'bpftool_sockmap' John Fastabend says: ==================== The first patch adds support for attaching programs to maps. This is needed to support sock{map\|hash} use from bpftool. Currently, I carry around custom code to do this so doing it using standard bpftool will be great. The second patch adds a compat mode to ignore non-zero entries in the map def. This allows using bpftool with maps that have a extra fields that the user knows can be ignored. This is needed to work correctly with maps being loaded by other tools or directly via syscalls. v3: add bash completion and doc updates for --mapcompat ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-10-15 16:13:15 -07:00

1 2 3 4 5 ...

785574 Commits All Branches Search

785574 Commits

All Branches