linux

Commit Graph

Author	SHA1	Message	Date
Kelvin Cao	3df54c870f	ntb_hw_switchtec: Allow using Switchtec NTB in multi-partition setups Allow using Switchtec NTB in setups that have more than two partitions. Note: this does not enable having multi-host communication, it only allows for a single NTB link between two hosts in a network that might have more than two. Use following logic to determine the NT peer partition: 1) If there are 2 partitions, and the target vector is set in the Switchtec configuration, use the partition specified in target vector. 2) If there are 2 partitions and target vector is unset use the only other partition as specified in the NT EP map. 3) If there are more than 2 partitions and target vector is set use the other partition specified in target vector. 4) If there are more than 2 partitions and target vector is unset, this is invalid and report an error. Signed-off-by: Kelvin Cao <kelvin.cao@microsemi.com> [logang@deltatee.com: commit message fleshed out] Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2018-01-28 22:17:23 -05:00
Jon Mason	2dd0f6a64a	NTB: switchtec_ntb: Add new line on appropriate printks Trivial addition of "\n" to the dev_* prints where necessary Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2018-01-28 22:17:23 -05:00
Colin Ian King	c5ec8b451a	NTB: switchtec_ntb: fix spelling mistake: "peforming" -> "performing" Trivial fix to spelling mistake in dev_err error message Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-By: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2018-01-28 22:17:23 -05:00
Dave Jiang	3f7756728e	ntb: remove Intel Atom NTB driver support Removing dead code since this is not being used. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2018-01-28 22:17:23 -05:00
Greg Kroah-Hartman	0ed08f829b	ntb: remove unneeded DRIVER_LICENSE #defines There is no need to #define the license of the driver, just put it in the MODULE_LICENSE() line directly as a text string. This allows tools that check that the module license matches the source code license to work properly, as there is no need to unwind the unneeded dereference, especially when the string is defined just a few lines above the usage of it. Reported-and-reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Allen Hubbe <Allen.Hubbe@emc.com> Cc: Gary R Hook <gary.hook@amd.com> Cc: Serge Semin <fancer.lancer@gmail.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2018-01-28 22:17:23 -05:00
Doug Meyer	140eb52277	NTB: ntb_hw_switchtec: Fix peer BAR bug in switchtec_ntb_init_shared_mw This resolves a bug which may incorrectly configure the peer host's LUT for shared memory window access. The code was using the local host's first BAR number, rather than the peer hosts's first BAR number, to determine what peer NT control register to program. The bug will cause the Switchtec NTB link to work only if both peers have the same first NTB BAR configured. In all other configurations, the link will not come up, failing silently. When both hosts have the same first BAR, the configuration works only because the first BAR numbers happent to be the same. When the hosts do not have the same first BAR, then the LUT translation will not be configured in the correct peer LUT and will not give the peer the shared memory window access required for the link to operate. Signed-off-by: Doug Meyer <dmeyer@gigaio.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Fixes: 678784a44ae8 ("NTB: switchtec_ntb: Initialize hardware for memory windows") Signed-off-by: Jon Mason <jdmason@kudzu.us>	2018-01-28 22:17:22 -05:00
Kees Cook	e99e88a9d2	treewide: setup_timer() -> timer_setup() This converts all remaining cases of the old setup_timer() API into using timer_setup(), where the callback argument is the structure already holding the struct timer_list. These should have no behavioral changes, since they just change which pointer is passed into the callback with the same available pointers after conversion. It handles the following examples, in addition to some other variations. Casting from unsigned long: void my_callback(unsigned long data) { struct something ptr = (struct something )data; ... } ... setup_timer(&ptr->my_timer, my_callback, ptr); and forced object casts: void my_callback(struct something ptr) { ... } ... setup_timer(&ptr->my_timer, my_callback, (unsigned long)ptr); become: void my_callback(struct timer_list t) { struct something ptr = from_timer(ptr, t, my_timer); ... } ... timer_setup(&ptr->my_timer, my_callback, 0); Direct function assignments: void my_callback(unsigned long data) { struct something ptr = (struct something )data; ... } ... ptr->my_timer.function = my_callback; have a temporary cast added, along with converting the args: void my_callback(struct timer_list t) { struct something ptr = from_timer(ptr, t, my_timer); ... } ... ptr->my_timer.function = (TIMER_FUNC_TYPE)my_callback; And finally, callbacks without a data assignment: void my_callback(unsigned long data) { ... } ... setup_timer(&ptr->my_timer, my_callback, 0); have their argument renamed to verify they're unused during conversion: void my_callback(struct timer_list unused) { ... } ... timer_setup(&ptr->my_timer, my_callback, 0); The conversion is done with the following Coccinelle script: spatch --very-quiet --all-includes --include-headers \ -I ./arch/x86/include -I ./arch/x86/include/generated \ -I ./include -I ./arch/x86/include/uapi \ -I ./arch/x86/include/generated/uapi -I ./include/uapi \ -I ./include/generated/uapi --include ./include/linux/kconfig.h \ --dir . \ --cocci-file ~/src/data/timer_setup.cocci @fix_address_of@ expression e; @@ setup_timer( -&(e) +&e , ...) // Update any raw setup_timer() usages that have a NULL callback, but // would otherwise match change_timer_function_usage, since the latter // will update all function assignments done in the face of a NULL // function initialization in setup_timer(). @change_timer_function_usage_NULL@ expression _E; identifier _timer; type _cast_data; @@ ( -setup_timer(&_E->_timer, NULL, _E); +timer_setup(&_E->_timer, NULL, 0); \| -setup_timer(&_E->_timer, NULL, (_cast_data)_E); +timer_setup(&_E->_timer, NULL, 0); \| -setup_timer(&_E._timer, NULL, &_E); +timer_setup(&_E._timer, NULL, 0); \| -setup_timer(&_E._timer, NULL, (_cast_data)&_E); +timer_setup(&_E._timer, NULL, 0); ) @change_timer_function_usage@ expression _E; identifier _timer; struct timer_list _stl; identifier _callback; type _cast_func, _cast_data; @@ ( -setup_timer(&_E->_timer, _callback, _E); +timer_setup(&_E->_timer, _callback, 0); \| -setup_timer(&_E->_timer, &_callback, _E); +timer_setup(&_E->_timer, _callback, 0); \| -setup_timer(&_E->_timer, _callback, (_cast_data)_E); +timer_setup(&_E->_timer, _callback, 0); \| -setup_timer(&_E->_timer, &_callback, (_cast_data)_E); +timer_setup(&_E->_timer, _callback, 0); \| -setup_timer(&_E->_timer, (_cast_func)_callback, _E); +timer_setup(&_E->_timer, _callback, 0); \| -setup_timer(&_E->_timer, (_cast_func)&_callback, _E); +timer_setup(&_E->_timer, _callback, 0); \| -setup_timer(&_E->_timer, (_cast_func)_callback, (_cast_data)_E); +timer_setup(&_E->_timer, _callback, 0); \| -setup_timer(&_E->_timer, (_cast_func)&_callback, (_cast_data)_E); +timer_setup(&_E->_timer, _callback, 0); \| -setup_timer(&_E._timer, _callback, (_cast_data)_E); +timer_setup(&_E._timer, _callback, 0); \| -setup_timer(&_E._timer, _callback, (_cast_data)&_E); +timer_setup(&_E._timer, _callback, 0); \| -setup_timer(&_E._timer, &_callback, (_cast_data)_E); +timer_setup(&_E._timer, _callback, 0); \| -setup_timer(&_E._timer, &_callback, (_cast_data)&_E); +timer_setup(&_E._timer, _callback, 0); \| -setup_timer(&_E._timer, (_cast_func)_callback, (_cast_data)_E); +timer_setup(&_E._timer, _callback, 0); \| -setup_timer(&_E._timer, (_cast_func)_callback, (_cast_data)&_E); +timer_setup(&_E._timer, _callback, 0); \| -setup_timer(&_E._timer, (_cast_func)&_callback, (_cast_data)_E); +timer_setup(&_E._timer, _callback, 0); \| -setup_timer(&_E._timer, (_cast_func)&_callback, (_cast_data)&_E); +timer_setup(&_E._timer, _callback, 0); \| _E->_timer@_stl.function = _callback; \| _E->_timer@_stl.function = &_callback; \| _E->_timer@_stl.function = (_cast_func)_callback; \| _E->_timer@_stl.function = (_cast_func)&_callback; \| _E._timer@_stl.function = _callback; \| _E._timer@_stl.function = &_callback; \| _E._timer@_stl.function = (_cast_func)_callback; \| _E._timer@_stl.function = (_cast_func)&_callback; ) // callback(unsigned long arg) @change_callback_handle_cast depends on change_timer_function_usage@ identifier change_timer_function_usage._callback; identifier change_timer_function_usage._timer; type _origtype; identifier _origarg; type _handletype; identifier _handle; @@ void _callback( -_origtype _origarg +struct timer_list t ) { ( ... when != _origarg _handletype _handle = -(_handletype )_origarg; +from_timer(_handle, t, _timer); ... when != _origarg \| ... when != _origarg _handletype _handle = -(void )_origarg; +from_timer(_handle, t, _timer); ... when != _origarg \| ... when != _origarg _handletype _handle; ... when != _handle _handle = -(_handletype )_origarg; +from_timer(_handle, t, _timer); ... when != _origarg \| ... when != _origarg _handletype _handle; ... when != _handle _handle = -(void )_origarg; +from_timer(_handle, t, _timer); ... when != _origarg ) } // callback(unsigned long arg) without existing variable @change_callback_handle_cast_no_arg depends on change_timer_function_usage && !change_callback_handle_cast@ identifier change_timer_function_usage._callback; identifier change_timer_function_usage._timer; type _origtype; identifier _origarg; type _handletype; @@ void _callback( -_origtype _origarg +struct timer_list t ) { + _handletype _origarg = from_timer(_origarg, t, _timer); + ... when != _origarg - (_handletype )_origarg + _origarg ... when != _origarg } // Avoid already converted callbacks. @match_callback_converted depends on change_timer_function_usage && !change_callback_handle_cast && !change_callback_handle_cast_no_arg@ identifier change_timer_function_usage._callback; identifier t; @@ void _callback(struct timer_list t) { ... } // callback(struct something handle) @change_callback_handle_arg depends on change_timer_function_usage && !match_callback_converted && !change_callback_handle_cast && !change_callback_handle_cast_no_arg@ identifier change_timer_function_usage._callback; identifier change_timer_function_usage._timer; type _handletype; identifier _handle; @@ void _callback( -_handletype _handle +struct timer_list t ) { + _handletype _handle = from_timer(_handle, t, _timer); ... } // If change_callback_handle_arg ran on an empty function, remove // the added handler. @unchange_callback_handle_arg depends on change_timer_function_usage && change_callback_handle_arg@ identifier change_timer_function_usage._callback; identifier change_timer_function_usage._timer; type _handletype; identifier _handle; identifier t; @@ void _callback(struct timer_list t) { - _handletype _handle = from_timer(_handle, t, _timer); } // We only want to refactor the setup_timer() data argument if we've found // the matching callback. This undoes changes in change_timer_function_usage. @unchange_timer_function_usage depends on change_timer_function_usage && !change_callback_handle_cast && !change_callback_handle_cast_no_arg && !change_callback_handle_arg@ expression change_timer_function_usage._E; identifier change_timer_function_usage._timer; identifier change_timer_function_usage._callback; type change_timer_function_usage._cast_data; @@ ( -timer_setup(&_E->_timer, _callback, 0); +setup_timer(&_E->_timer, _callback, (_cast_data)_E); \| -timer_setup(&_E._timer, _callback, 0); +setup_timer(&_E._timer, _callback, (_cast_data)&_E); ) // If we fixed a callback from a .function assignment, fix the // assignment cast now. @change_timer_function_assignment depends on change_timer_function_usage && (change_callback_handle_cast \|\| change_callback_handle_cast_no_arg \|\| change_callback_handle_arg)@ expression change_timer_function_usage._E; identifier change_timer_function_usage._timer; identifier change_timer_function_usage._callback; type _cast_func; typedef TIMER_FUNC_TYPE; @@ ( _E->_timer.function = -_callback +(TIMER_FUNC_TYPE)_callback ; \| _E->_timer.function = -&_callback +(TIMER_FUNC_TYPE)_callback ; \| _E->_timer.function = -(_cast_func)_callback; +(TIMER_FUNC_TYPE)_callback ; \| _E->_timer.function = -(_cast_func)&_callback +(TIMER_FUNC_TYPE)_callback ; \| _E._timer.function = -_callback +(TIMER_FUNC_TYPE)_callback ; \| _E._timer.function = -&_callback; +(TIMER_FUNC_TYPE)_callback ; \| _E._timer.function = -(_cast_func)_callback +(TIMER_FUNC_TYPE)_callback ; \| _E._timer.function = -(_cast_func)&_callback +(TIMER_FUNC_TYPE)_callback ; ) // Sometimes timer functions are called directly. Replace matched args. @change_timer_function_calls depends on change_timer_function_usage && (change_callback_handle_cast \|\| change_callback_handle_cast_no_arg \|\| change_callback_handle_arg)@ expression _E; identifier change_timer_function_usage._timer; identifier change_timer_function_usage._callback; type _cast_data; @@ _callback( ( -(_cast_data)_E +&_E->_timer \| -(_cast_data)&_E +&_E._timer \| -_E +&_E->_timer ) ) // If a timer has been configured without a data argument, it can be // converted without regard to the callback argument, since it is unused. @match_timer_function_unused_data@ expression _E; identifier _timer; identifier _callback; @@ ( -setup_timer(&_E->_timer, _callback, 0); +timer_setup(&_E->_timer, _callback, 0); \| -setup_timer(&_E->_timer, _callback, 0L); +timer_setup(&_E->_timer, _callback, 0); \| -setup_timer(&_E->_timer, _callback, 0UL); +timer_setup(&_E->_timer, _callback, 0); \| -setup_timer(&_E._timer, _callback, 0); +timer_setup(&_E._timer, _callback, 0); \| -setup_timer(&_E._timer, _callback, 0L); +timer_setup(&_E._timer, _callback, 0); \| -setup_timer(&_E._timer, _callback, 0UL); +timer_setup(&_E._timer, _callback, 0); \| -setup_timer(&_timer, _callback, 0); +timer_setup(&_timer, _callback, 0); \| -setup_timer(&_timer, _callback, 0L); +timer_setup(&_timer, _callback, 0); \| -setup_timer(&_timer, _callback, 0UL); +timer_setup(&_timer, _callback, 0); \| -setup_timer(_timer, _callback, 0); +timer_setup(_timer, _callback, 0); \| -setup_timer(_timer, _callback, 0L); +timer_setup(_timer, _callback, 0); \| -setup_timer(_timer, _callback, 0UL); +timer_setup(_timer, _callback, 0); ) @change_callback_unused_data depends on match_timer_function_unused_data@ identifier match_timer_function_unused_data._callback; type _origtype; identifier _origarg; @@ void _callback( -_origtype _origarg +struct timer_list unused ) { ... when != _origarg } Signed-off-by: Kees Cook <keescook@chromium.org>	2017-11-21 15:57:07 -08:00
Dave Jiang	4201a9918c	ntb: intel: remove b2b memory window workaround for Skylake NTB The workaround code is never used because Skylake NTB does not need it. Reported-by: Allen Hubbe <allen.hubbe@dell.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2017-11-18 20:54:47 -05:00
Bhumika Goyal	3a814a04e6	NTB: make idt_89hpes_cfg const Make these const as they are only used during a copy operation. Done using Coccinelle. Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2017-11-18 20:37:13 -05:00
Logan Gunthorpe	87d11e645e	NTB: switchtec_ntb: Add memory window support The Switchtec hardware has two types of memory windows: LUTs and Direct. The first area in each BAR is for LUT windows and the remaining area is for the direct region. The total number of LUT entries is set by a configuration setting in hardware and they all must be the same size. (This is fixed by switchtec_ntb to be 64K.) switchtec_ntb enables the LUTs only for the first BAR and enables the highest power of two possible. Seeing the LUTs are at the beginning of the BAR, the direct memory window's alignment is affected. Therefore, the maximum direct memory window size can not be greater than the number of LUTs times 64K. The direct window in other BARs will not have this restriction as the LUTs will not be enabled there. LUTs will only be exposed through the NTB API if the use_lut_mw parameter is set. Seeing the Switchtec hardware, by default, configures BARs to be 4G a module parameter is given to limit the size of the advertised memory windows. Higher layers tend to allocate the maximum BAR size and this has a tendency to fail when they try to allocate 4GB of contiguous memory. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Stephen Bates <sbates@raithlin.com> Reviewed-by: Kurt Schwemmer <kurt.schwemmer@microsemi.com> Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2017-11-18 20:37:12 -05:00
Logan Gunthorpe	b9a4acac28	NTB: switchtec_ntb: Implement scratchpad registers Seeing there is no dedicated hardware for this, we simply add these as entries in the shared memory window. Thus, we could support any number of them but 128 seems like enough, for now. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Stephen Bates <sbates@raithlin.com> Reviewed-by: Kurt Schwemmer <kurt.schwemmer@microsemi.com> Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2017-11-18 20:37:12 -05:00
Logan Gunthorpe	6619bf9549	NTB: switchtec_ntb: Implement doorbell registers Pretty straightforward implementation of doorbell registers. The shift and mask were setup in an earlier patch and this just hooks up the appropriate portion of the IDB register as the local doorbells and the opposite portion of ODB as the peer doorbells. The DB mask is protected by a spinlock to avoid concurrent read-modify-write accesses. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Stephen Bates <sbates@raithlin.com> Reviewed-by: Kurt Schwemmer <kurt.schwemmer@microsemi.com> Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2017-11-18 20:37:12 -05:00
Logan Gunthorpe	0ee28f26f3	NTB: switchtec_ntb: Add link management switchtec_ntb checks for a link by looking at the shared memory window. If the magic number is correct and the other side indicates their link is enabled then we take the link to be up. Whenever we change our local link status we send a msg to the other side to check whether it's up and change their status. The current status is maintained in a flag so ntb_is_link_up can return quickly. We utilize Switchtec's link status notifier to also check link changes when the switch notices a port changes state. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Stephen Bates <sbates@raithlin.com> Reviewed-by: Kurt Schwemmer <kurt.schwemmer@microsemi.com> Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2017-11-18 20:37:12 -05:00
Logan Gunthorpe	e099b45b7c	NTB: switchtec_ntb: Add skeleton NTB driver Add a skeleton NTB driver which will be filled out in subsequent patches. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Stephen Bates <sbates@raithlin.com> Reviewed-by: Kurt Schwemmer <kurt.schwemmer@microsemi.com> Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2017-11-18 20:37:12 -05:00
Logan Gunthorpe	3dd4db475c	NTB: switchtec_ntb: Initialize hardware for doorbells and messages Set up some hardware registers and creates interrupt service routines for the doorbells and messages. There are 64 doorbells in the switch that are shared between all partitions. The upper 4 doorbells are also shared with the messages and are therefore not used. Thus, this provides 28 doorbells for each partition. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Stephen Bates <sbates@raithlin.com> Reviewed-by: Kurt Schwemmer <kurt.schwemmer@microsemi.com> Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2017-11-18 20:37:12 -05:00
Logan Gunthorpe	ec0467ccbd	NTB: switchtec_ntb: Initialize hardware for memory windows Add the code to initialize the memory windows in the hardware. This includes setting up the requester ID table, and figuring out which BAR corresponds to which memory window. (Seeing the switch can be configured with any number of BARs.) Also, seeing the device doesn't have hardware for scratchpads or determining the link status, we create a shared memory window that has these features. A magic number with a version component will be used to determine if the other side's driver is actually up. The shared memory window also informs the other side of the size and count of the local memory windows. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Stephen Bates <sbates@raithlin.com> Reviewed-by: Kurt Schwemmer <kurt.schwemmer@microsemi.com> Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2017-11-18 20:37:12 -05:00
Logan Gunthorpe	33dea5aae0	NTB: switchtec_ntb: Introduce initial NTB driver Seeing the Switchtec NTB hardware shares the same endpoint as the management endpoint we utilize the class_interface API to register an NTB driver for every Switchtec device in the system that has the NTB class code. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Stephen Bates <sbates@raithlin.com> Reviewed-by: Kurt Schwemmer <kurt.schwemmer@microsemi.com> Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2017-11-18 20:37:12 -05:00
Logan Gunthorpe	980c41c86b	NTB: Ensure ntb_mw_get_align() is only called when the link is up With Switchtec hardware it's impossible to get the alignment parameters for a peer's memory window until the peer's driver has configured its windows. Strictly speaking, the link doesn't have to be up for this, but the link being up is the only way the client can tell that the other side has been configured. This patch converts ntb_transport and ntb_perf to use this function after the link goes up. This simplifies these clients slightly because they no longer have to store the alignment parameters. It also tweaks ntb_tool so that peer_mw_trans will print zero if it is run before the link goes up. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2017-11-18 20:37:11 -05:00
Dave Jiang	f3fd2afed8	ntb: transport shouldn't disable link due to bogus values in SPADs It seems that under certain scenarios the SPAD can have bogus values caused by an agent (i.e. BIOS or other software) that is not the kernel driver, and that causes memory window setup failure. This should not cause the link to be disabled because if we do that, the driver will never recover again. We have verified in testing that this issue happens and prevents proper link recovery. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Signed-off-by: Jon Mason <jdmason@kudzu.us> Fixes: `84f766855f` ("ntb: stop link work when we do not have memory")	2017-08-01 13:31:44 -04:00
Logan Gunthorpe	bc240eec4b	ntb: use correct mw_count function in ntb_tool and ntb_transport After converting to the new API, both ntb_tool and ntb_transport are using ntb_mw_count to iterate through ntb_peer_get_addr when they should be using ntb_peer_mw_count. This probably isn't an issue with the Intel and AMD drivers but this will matter for any future driver with asymetric memory window counts. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Acked-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us> Fixes: `443b9a14ec` ("NTB: Alter MW API to support multi-ports devices")	2017-07-17 12:56:15 -04:00
Gary R Hook	32e0f5bfa5	ntb: Add error path/handling to Debug FS entry creation If a failure occurs when creating Debug FS entries, unroll all of the work that's been done. Signed-off-by: Gary R Hook <gary.hook@amd.com> Acked-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2017-07-06 11:30:08 -04:00
Gary R Hook	8407dd6c16	ntb: Add more debugfs support for ntb_perf testing options The ntb_perf tool uses module parameters to control the characteristics of its test. Enable the changing of these options through debugfs, and eliminating the need to unload and reload the module to make changes and run additional tests. Add a new module parameter that forces the DMA channel selection onto the same node as the NTB device (default: true). - seg_order: Size of the NTB memory window; power of 2. - run_order: Size of the data buffer; power of 2. - use_dma: Use DMA or memcpy? Default: 0. - on_node: Only use DMA channel(s) on the NTB node. Default: true. Signed-off-by: Gary R Hook <gary.hook@amd.com> Acked-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2017-07-06 11:30:08 -04:00
Gary R Hook	0b93a6dbec	ntb: Remove debug-fs variables from the context structure The Debug FS entries manage themselves; we don't need to hang onto them in the context structure. Signed-off-by: Gary R Hook <gary.hook@amd.com> Acked-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2017-07-06 11:30:07 -04:00
Gary R Hook	e9410ff810	ntb: Add a module option to control affinity of DMA channels The DMA channel(s)/memory used to transfer data to an NTB device may not be required to be on the same node as the device. Add a module parameter that allows any candidate channel (aside from node assocation) and allocated memory to be used. Signed-off-by: Gary R Hook <gary.hook@amd.com> Acked-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2017-07-06 11:30:07 -04:00
Serge Semin	bf2a952d31	NTB: Add IDT 89HPESxNTx PCIe-switches support IDT 89HPESxNTx device series is PCIe-switches, which support Non-Transparent bridging between domains connected to the device ports. Since new NTB API exposes multi-port interface and messaging API, the IDT NT-functions can be now supported in the kernel. This driver adds the following functionality: 1) Multi-port NTB API to have information of possible NT-functions activated in compliance with available device ports. 2) Memory windows of direct and look up table based address translation with all possible combinations of BARs setup. 3) Traditional doorbell NTB API. 4) One-on-one messaging NTB API. There are some IDT PCIe-switch setups, which must be done before any of the NTB peers started. It can be performed either by system BIOS via IDT SMBus-slave interface or by pre-initialized IDT PCIe-switch EEPROM: 1) NT-functions of corresponding ports must be activated using SWPARTxCTL and SWPORTxCTL registers. 2) BAR0 must be configured to expose NT-function configuration registers map. 3) The rest of the BARs must have at least one memory window configured, otherwise the driver will just return an error. Temperature sensor of IDT PCIe-switches can be also optionally activated by BIOS or EEPROM. (See IDT documentations for details of how the pre-initialization can be done) Signed-off-by: Serge Semin <fancer.lancer@gmail.com> Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2017-07-06 11:30:07 -04:00
Logan Gunthorpe	48ea02184a	ntb_hw_intel: Style fixes: open code macros that just obfuscate code As per a comments in [1] by Greg Kroah-Hartman, the ndev_* macros should be cleaned up. This makes it more clear what's actually going on when reading the code. [1] http://www.spinics.net/lists/linux-pci/msg56904.html Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2017-07-06 11:30:07 -04:00
Logan Gunthorpe	0f9bfb979a	ntb_hw_amd: Style fixes: open code macros that just obfuscate code As per a comments in [1] by Greg Kroah-Hartman, the ndev_* macros should be cleaned up. This makes it more clear what's actually going on when reading the code. [1] http://www.spinics.net/lists/linux-pci/msg56904.html Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2017-07-06 11:30:07 -04:00
Serge Semin	bc3e49adc2	NTB: Add Messaging NTB API Some IDT NTB-capable PCIe-switches have message registers to communicate with peer devices. This patch adds new NTB API callback methods, which can be used to utilize these registers functionality: ntb_msg_count(); - get number of message registers ntb_msg_inbits(); - get bitfield of inbound message registers status ntb_msg_outbits(); - get bitfield of outbound message registers status ntb_msg_read_sts(); - read the inbound and outbound message registers status ntb_msg_clear_sts(); - clear status bits of message registers ntb_msg_set_mask(); - mask interrupts raised by status bits of message registers. ntb_msg_clear_mask(); - clear interrupts mask bits of message registers ntb_msg_read(midx, *pidx); - read message register with specified index, additionally getting peer port index which data received from ntb_msg_write(midx, pidx); - write data to the specified message register sending it to the passed peer device connected over a pidx port ntb_msg_event(); - notify driver context of a new message event Of course there is hardware which doesn't support Message registers, so this API is made optional. Signed-off-by: Serge Semin <fancer.lancer@gmail.com> Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2017-07-06 11:30:07 -04:00
Serge Semin	d67288a395	NTB: Alter Scratchpads API to support multi-ports devices Even though there is no any real NTB hardware, which would have both more than two ports and Scratchpad registers, it is logically correct to have Scratchpad API accepting a peer port index as well. Intel/AMD drivers utilize Primary and Secondary topology to split Scratchpad between connected root devices. Since port-index API introduced, Intel/AMD NTB hardware drivers can use device port to determine which Scratchpad registers actually belong to local and peer devices. The same approach can be used if some potential hardware in future will be multi-port and have some set of Scratchpads. Here are the brief of changes in the API: ntb_spad_count() - return number of Scratchpads per each port ntb_peer_spad_addr(pidx, sidx) - address of Scratchpad register of the peer device with pidx-index ntb_peer_spad_read(pidx, sidx) - read specified Scratchpad register of the peer with pidx-index ntb_peer_spad_write(pidx, sidx) - write data to Scratchpad register of the peer with pidx-index Since there is hardware which doesn't support Scratchpad registers, the corresponding API methods are now made optional. Signed-off-by: Serge Semin <fancer.lancer@gmail.com> Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2017-07-06 11:30:07 -04:00
Serge Semin	443b9a14ec	NTB: Alter MW API to support multi-ports devices Multi-port NTB devices permit to share a memory between all accessible peers. Memory Windows API is altered to correspondingly initialize and map memory windows for such devices: ntb_mw_count(pidx); - number of inbound memory windows, which can be allocated for shared buffer with specified peer device. ntb_mw_get_align(pidx, widx); - get alignment and size restriction parameters to properly allocate inbound memory region. ntb_peer_mw_count(); - get number of outbound memory windows. ntb_peer_mw_get_addr(widx); - get mapping address of an outbound memory window If hardware supports inbound translation configured on the local ntb port: ntb_mw_set_trans(pidx, widx); - set translation address of allocated inbound memory window so a peer device could access it. ntb_mw_clear_trans(pidx, widx); - clear the translation address of an inbound memory window. If hardware supports outbound translation configured on the peer ntb port: ntb_peer_mw_set_trans(pidx, widx); - set translation address of a memory window retrieved from a peer device ntb_peer_mw_clear_trans(pidx, widx); - clear the translation address of an outbound memory window Signed-off-by: Serge Semin <fancer.lancer@gmail.com> Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2017-07-06 11:30:07 -04:00
Serge Semin	4e8c11b7fd	NTB: Alter link-state API to support multi-port devices Multi-port devices permit the NTB connections between multiple domains, so a local device can have NTB link being up with one peer and being down with another. NTB link-state API is appropriately altered to return a bitfield of the link-states between the local device and possible peers. Signed-off-by: Serge Semin <fancer.lancer@gmail.com> Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2017-07-06 11:30:07 -04:00
Serge Semin	1e5301196a	NTB: Add indexed ports NTB API There is some NTB hardware, which can combine more than just two domains over NTB. For instance, some IDT PCIe-switches can have NTB-functions activated on more than two-ports. The different domains are distinguished by ports they are connected to. So the new port-related methods are added to the NTB API: ntb_port_number() - return local port ntb_peer_port_count() - return number of peers local port can connect to ntb_peer_port_number(pdix) - return port number by it index ntb_peer_port_idx(port) - return port index by it number Current test-drivers aren't changed much. They still support two-ports devices for the time being while multi-ports hardware drivers aren't added. By default port-related API is declared for two-ports hardware. So corresponding hardware drivers won't need to implement it. Signed-off-by: Serge Semin <fancer.lancer@gmail.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2017-07-06 11:30:07 -04:00
Allen Hubbe	88931ec3dc	ntb: no sleep in ntb_async_tx_submit Do not sleep in ntb_async_tx_submit, which could deadlock. This reverts commit "8c874cc140d667f84ae4642bb5b5e0d6396d2ca4" Fixes: `8c874cc140` ("NTB: Address out of DMA descriptor issue with NTB") Reported-by: Jia-Ju Bai <baijiaju1990@163.com> Signed-off-by: Allen Hubbe <Allen.Hubbe@dell.com> Acked-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2017-06-19 14:24:41 -04:00
Dave Jiang	5eb449e15d	ntb: ntb_hw_intel: Skylake doorbells should be 32bits, not 64bits Fixing doorbell register length to 32bits per spec. On Skylake NTB, the doorbell registers are 32bit write only registers. The source for the doorbell is a 64bit register that shows the interrupt bits. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Fixes: `783dfa6cc4` ("ntb: Adding Skylake Xeon NTB support") Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2017-06-19 14:24:41 -04:00
Logan Gunthorpe	8e8496e0e9	ntb_transport: fix bug calculating num_qps_mw A divide by zero error occurs if qp_count is less than mw_count because num_qps_mw is calculated to be zero. The calculation appears to be incorrect. The requirement is for num_qps_mw to be set to qp_count / mw_count with any remainder divided among the earlier mws. For example, if mw_count is 5 and qp_count is 12 then mws 0 and 1 will have 3 qps per window and mws 2 through 4 will have 2 qps per window. Thus, when mw_num < qp_count % mw_count, num_qps_mw is 1 higher than when mw_num >= qp_count. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Fixes: `e26a5843f7` ("NTB: Split ntb_hw_intel and ntb_transport drivers") Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2017-06-19 14:24:41 -04:00
Logan Gunthorpe	cb827ee6cc	ntb_transport: fix qp count bug In cases where there are more mw's than spads/2-2, the mw count gets reduced to match the limitation. ntb_transport also tries to ensure that there are fewer qps than mws but uses the full mw count instead of the reduced one. When this happens, the math in 'ntb_transport_setup_qp_mw' will get confused and result in a kernel paging request bug. This patch fixes the bug by reducing qp_count to the reduced mw count instead of the full mw count. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Fixes: `e26a5843f7` ("NTB: Split ntb_hw_intel and ntb_transport drivers") Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2017-06-19 14:24:41 -04:00
Gary R Hook	94fc795454	ntb: Correct modinfo usage statement for ntb_perf The order parameters are powers of 2; adjust the usage information to use correct mathematical representations. Signed-off-by: Gary R Hook <gary.hook@amd.com> Fixes: `8a7b6a778a` ("ntb: ntb perf tool") Acked-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2017-06-19 14:24:41 -04:00
Dave Jiang	939ada5fb5	ntb: ntb_hw_intel: link_poll isn't clearing the pending status properly On Skylake hardware, the link_poll isn't clearing the pending interrupt bit. Adding a new function for SKX that handles clearing of status bit the right way. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Fixes: `783dfa6c` ("ntb: Adding Skylake Xeon NTB support") Signed-off-by: Jon Mason <jdmason@kudzu.us>	2017-02-16 23:11:26 -05:00
Thomas VanSelus	8fcd0950c0	ntb_transport: Pick an unused queue Fix typo causing ntb_transport_create_queue to select the first queue every time, instead of using the next free queue. Signed-off-by: Thomas VanSelus <tvanselus@xes-inc.com> Signed-off-by: Aaron Sierra <asierra@xes-inc.com> Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Fixes: `fce8a7bb5` ("PCI-Express Non-Transparent Bridge Support") Signed-off-by: Jon Mason <jdmason@kudzu.us>	2017-02-16 23:11:26 -05:00
Dave Jiang	9644347c52	ntb: ntb_perf missing dmaengine_unmap_put In the normal I/O execution path, ntb_perf is missing a call to dmaengine_unmap_put() after submission. That causes us to leak unmap objects. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Fixes: `8a7b6a77` ("ntb: ntb perf tool") Signed-off-by: Jon Mason <jdmason@kudzu.us>	2017-02-16 23:11:26 -05:00
Allen Hubbe	dd62245e73	NTB: ntb_transport: fix debugfs_remove_recursive The call to debugfs_remove_recursive(qp->debugfs_dir) of the sub-level directory must not be later than debugfs_remove_recursive(nt_debugfs_dir) of the top-level directory. Otherwise, the sub-level directory will not exist, and it would be invalid (panic) to attempt to remove it. This removes the top-level directory last, after sub-level directories have been cleaned up. Signed-off-by: Allen Hubbe <Allen.Hubbe@dell.com> Fixes: `e26a5843f` ("NTB: Split ntb_hw_intel and ntb_transport drivers") Signed-off-by: Jon Mason <jdmason@kudzu.us>	2017-02-16 23:10:52 -05:00
Steve Wahl	dfb7d24c5a	ntb_transport: Remove unnecessary call to ntb_peer_spad_read The results were previously ignored, anyway. Signed-off-by: Steve Wahl <Steve.Wahl@dell.com> Fixes: `e26a5843f7` Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-12-23 16:11:07 -05:00
Christophe JAILLET	28734e8f69	NTB: Fix 'request_irq()' and 'free_irq()' inconsistancy 'request_irq()' and 'free_irq()' should have the same 'dev_id'. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-12-23 16:11:03 -05:00
Dave Jiang	09e71a6f13	ntb: fix SKX NTB config space size register offsets The offsets for the SZ registers are wrong. Updated. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Reported-by: Sandeep Mann <sandeep@purestorage.com> Tested-by: Zachary Ross <zacharyx.ross@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-12-23 16:10:54 -05:00
Shyam Sundar S K	b17faba03f	ntb_transport: Limit memory windows based on available, scratchpads When the underlying NTB H/W driver advertises more memory windows than the number of scratchpads available to setup MW's, it is likely that we may end up filling the remaining memory windows with garbage. So to avoid that, lets limit the memory windows that transport driver can setup based on the available scratchpads. Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-12-23 16:10:35 -05:00
Shyam Sundar S K	872deb2103	NTB: Register and offset values fix for memory window Due to incorrect limit and translation register values, NTB link was going down when the memory window was setup. Made appropriate changes as per spec. Fix limit register values for BAR1, which was overlapping with the BAR23 address. Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-12-23 16:09:18 -05:00
Xiangliang Yu	e5b0d2d1ba	NTB: add support for hotplug feature AMD NTB support hotplug under B2B mode. NTB will trigger link up/down interrupt event when doing plug add/remove, this patch implements the two interrupt event to support B2B hotplug function. Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-12-23 16:09:15 -05:00
Dave Jiang	783dfa6cc4	ntb: Adding Skylake Xeon NTB support The Skylake Xeon NTB hardware has made some changes to the register name, offset, and the way doorbells work. Adding driver support for the new hardware. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-12-23 16:09:10 -05:00
Dan Carpenter	819baf8859	ntb_perf: potential info leak in debugfs This is a static checker warning, not something I'm desperately concerned about. But snprintf() returns the number of bytes that would have been copied if there were space. We really care about the number of bytes that actually were copied so we should use scnprintf() instead. It probably won't overrun, and in that case we may as well just use sprintf() but these sorts of things make static checkers and code reviewers happier. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-11-13 16:48:30 -05:00
Dave Jiang	25ea9f2bf5	ntb: ntb_hw_intel: init peer_addr in struct intel_ntb_dev The peer_addr member of intel_ntb_dev is not set, therefore when acquiring ntb_peer_db and ntb_peer_spad we only get the offset rather than the actual physical address. Adding fix to correct that. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Acked-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-11-13 16:48:29 -05:00
Nicholas Mc Guire	cdc08982a5	ntb: make DMA_OUT_RESOURCE_TO HZ independent schedule_timeout_* takes a timeout in jiffies but the code currently is passing in a constant which makes this timeout HZ dependent, so pass it through msecs_to_jiffies() to fix this up. Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> Acked-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-11-13 16:48:29 -05:00
Nicholas Mc Guire	c0a88032ef	ntb_transport: make DMA_OUT_RESOURCE_TO HZ independent schedule_timeout_* takes a timeout in jiffies but the code currently is passing in a constant which makes this timeout HZ dependent, so pass it through msecs_to_jiffies() to fix this up. Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-11-13 16:48:29 -05:00
Wei Yongjun	49b89de41f	NTB: ntb_hw_intel: Fix typo in module parameter descriptions Fix typo in module parameter descriptions. Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com> Acked-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-11-13 16:48:29 -05:00
Wei Yongjun	cedecbc5e0	ntb_pingpong: Fix db_init parameter description Fix 'db_init' parameter description. Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com> Acked-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-11-13 16:48:29 -05:00
Dave Jiang	72203572af	ntb: add DMA error handling for RX DMA Adding support on the rx DMA path to allow recovery of errors when DMA responds with error status and abort all the subsequent ops. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Acked-by: Allen Hubbe <Allen.Hubbe@emc.com> Cc: Jon Mason <jdmason@kudzu.us> Cc: linux-ntb@googlegroups.com Signed-off-by: Vinod Koul <vinod.koul@intel.com>	2016-08-08 08:11:42 +05:30
Dave Jiang	9cabc2691e	ntb: add DMA error handling for TX DMA Adding support on the tx DMA path to allow recovery of errors when DMA responds with error status and abort all the subsequent ops. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Acked-by: Allen Hubbe <Allen.Hubbe@emc.com> Cc: Jon Mason <jdmason@kudzu.us> Cc: linux-ntb@googlegroups.com Signed-off-by: Vinod Koul <vinod.koul@intel.com>	2016-08-08 08:11:42 +05:30
Allen Hubbe	95f1464f69	NTB: ntb_hw_intel: use local variable pdev Clean up duplicated expression by replacing it with the equivalent local variable pdev. Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com> Acked-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-08-05 10:34:13 -04:00
Allen Hubbe	4089527388	NTB: ntb_hw_intel: show BAR size in debugfs info It will be useful to know the hardware configured BAR size to diagnose issues with NTB memory windows. Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com> Acked-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-08-05 10:33:47 -04:00
Logan Gunthorpe	35539b54ac	ntb_perf: clear link_is_up flag when the link goes down. When the link goes down, the link_is_up flag did not return to false. This could have caused some subtle corner case bugs when the link goes up and down quickly. Once that was fixed, there was found to be a race if the link was brought down then immediately up. The link_cleanup work would occasionally be scheduled after the next link up event. This would cancel the link_work that was supposed to occur and leave ntb_perf in an unusable state. To fix this we get rid of the link_cleanup work and put the actions directly in the link_down event. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Acked-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-08-05 10:21:08 -04:00
Logan Gunthorpe	20572ee1c5	ntb_pingpong: Add a debugfs file to get the ping count This commit adds a debugfs 'count' file to ntb_pingpong. This is so testing with ntb_pingpong can be automated beyond just checking the logs for pong messages. The count file returns a number which increments every pong. The counter can be cleared by writing a zero. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Acked-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-08-05 10:21:07 -04:00
Logan Gunthorpe	bfcaa39652	ntb_tool: Add link status and files to debugfs In order to more successfully script with ntb_tool it's useful to have a link file to check the link status so that the script doesn't use the other files until the link is up. This commit adds a 'link' file to the debugfs directory which reads boolean (Y or N) depending on the link status. Writing to the file change the link state using ntb_link_enable or ntb_link_disable. A 'link_event' file is also provided so an application can block until the link changes to the desired state. If the user writes a 1, it will block until the link is up. If the user writes a 0, it will block until the link is down. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Acked-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-08-05 10:21:07 -04:00
Logan Gunthorpe	717146a2a8	ntb_tool: Postpone memory window initialization for the user In order to make the interface closer to the raw NTB API, this commit changes memory windows so they are not initialized on link up. Instead, the 'peer_trans' debugfs files are introduced. When read, they return information provided by ntb_mw_get_range. When written, they create a buffer and initialize the memory window. The value written is taken as the requested size of the buffer (which is then rounded for alignment). Writing a value of zero frees the buffer and tears down the memory window translation. The 'peer_mw' file is only created once the memory window translation is setup by the user. Additionally, it was noticed that the read and write functions for the 'peer_mw*' files should have checked for a NULL pointer. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Acked-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-08-05 10:21:07 -04:00
Logan Gunthorpe	26dc638ae6	ntb_perf: Wait for link before running test Instead of returning immediately with an error when the link is down, wait for the link to come up (or the user sends a SIGINT). This is to make scripting ntb_perf easier. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Acked-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-08-05 10:21:07 -04:00
Logan Gunthorpe	58fd0f3b15	ntb_perf: Return results by reading the run file Instead of having to watch logs, allow the results to be retrieved by reading back the run file. This file will return "running" when the test is running and nothing if no tests have been run yet. It returns 1 line per thread, and will display an error message if the corresponding thread returns an error. With the above change, the pr_info calls that returned the results are then changed to pr_debug calls. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Acked-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-08-05 10:21:07 -04:00
Logan Gunthorpe	da573eaa3a	ntb_perf: Improve thread handling to increase robustness This commit accomplishes a few things: 1) Properly prevent multiple sets of threads from running at once using a mutex. Lots of race issues existed with the thread_cleanup. 2) The mutex allows us to ensure that threads are finished before tearing down the device or module. 3) Don't use kthread_stop when the threads can exit by themselves, as this is counter-indicated by the kthread_create documentation. Threads now wait for kthread_stop to occur. 4) Writing to the run file now blocks until the threads are complete. The test can then be safely interrupted by a SIGINT. Also, while I was at it: 5) debugfs_run_write shouldn't return 0 in the early check cases as this could cause debugfs_run_write to loop undesirably. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Acked-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-08-05 10:21:06 -04:00
Logan Gunthorpe	fd2ecd885b	ntb_perf: Schedule based on time not on performance When debugging performance problems, if some issue causes the ntb hardware to be significantly slower than expected, ntb_perf will hang requiring a reboot because it only schedules once every 4GB. Instead, schedule based on jiffies so it will not hang the CPU if the transfer is slow. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Acked-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-08-05 10:21:06 -04:00
Logan Gunthorpe	19645a0771	ntb_transport: Check the number of spads the hardware supports I'm working on hardware that currently has a limited number of scratchpad registers and ntb_ndev fails with no clue as to why. I feel it is better to fail early and provide a reasonable error message then to fail later on. The same is done to ntb_perf, but it doesn't currently require enough spads to actually fail. I've also removed the unused SPAD_MSG and SPAD_ACK enums so that MAX_SPAD accurately reflects the number of spads used. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Acked-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-08-05 10:21:06 -04:00
Logan Gunthorpe	8b71d28506	ntb_tool: Add memory window debug support We allocate some memory window buffers when the link comes up, then we provide debugfs files to read/write each side of the link. This is useful for debugging the mapping when writing new drivers. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Acked-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-08-05 10:21:06 -04:00
Logan Gunthorpe	4aae977721	ntb_perf: Allow limiting the size of the memory windows On my system, dma_alloc_coherent won't produce memory anywhere near the size of the BAR. So I needed a way to limit this. It's pretty much copied straight from ntb_transport. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Acked-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-08-05 10:21:06 -04:00
Dave Jiang	a754a8fcaf	NTB: allocate number transport entries depending on size of ring size Currently we only allocate a fixed default number of descriptors for the tx and rx side. We should dynamically resize it to the number of descriptors resides in the transport rings. We should know the number of transmit descriptors at initializaiton. We will allocate the default number of descriptors for receive side and allocate additional ones when we know the actual max entries for receive. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Acked-by: Allen Hubbe <allen.hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-08-05 10:21:05 -04:00
Logan Gunthorpe	625f0802e8	ntb_tool: BUG: Ensure the buffer size is large enough to return all spads On hardware with 32 scratchpad registers the spad field in ntb tool could chop off the end. The maximum buffer size is increased from 256 to 15 times the number or scratchpads. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Acked-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-08-05 10:05:31 -04:00
Logan Gunthorpe	c792eba12c	ntb_tool: Fix infinite loop bug when writing spad/peer_spad file If you tried to write two spads in one line, as per the example: root@peer# echo '0 0x01010101 1 0x7f7f7f7f' > $DBG_DIR/peer_spad then the CPU would freeze in an infinite loop. This wasn't immediately obvious but 'pos' was not incrementing the buffer, so after reading the second pair of values, 'pos' would once again be 3 and it would re-read the second pair of values ad infinitum. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Acked-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-08-05 10:05:31 -04:00
Allen Hubbe	4f1b50c3e3	NTB: Remove _addr functions from ntb_hw_amd Kernel zero day testing warned about address space confusion. A virtual iomem address was used where a physical address is expected. The offending functions implement an optional part of the api, so they are removed. They can be added later, after testing. Fixes: `a1b3695820` Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com> Acked-by: Xiangliang Yu <Xiangliang.Yu@amd.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-03-26 11:44:33 -04:00
Dave Jiang	838850ee0b	NTB: Fix incorrect clean up routine in ntb_perf The clean up routine when we failed to allocate kthread is not cleaning up all the threads, only the same one over and over again. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Acked-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-03-21 19:28:30 -04:00
Dave Jiang	ddc8f6feec	NTB: Fix incorrect return check in ntb_perf kthread_create_no_node() returns error pointers, never NULL. Fix check so it handles error correctly. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-03-21 19:28:25 -04:00
Sudip Mukherjee	2572c7fb4e	ntb: fix possible NULL dereference kmalloc can fail and we should check for NULL before using the pointer returned by kmalloc. Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk> Acked-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-03-17 20:52:15 -04:00
Dave Jiang	ee5f750f1c	ntb: add missing setup of translation window The perf tool is missing the setup of translation window. Adding call to setup the translation window for backed memory. Signed-off-by: John Kading <john.kading@gd-ms.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-03-17 20:50:15 -04:00
Dave Jiang	84f766855f	ntb: stop link work when we do not have memory Instead of keep trying to go through the init routine when we aren't able to allocate memory, we should just stop and go down. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-03-17 20:38:40 -04:00
Dave Jiang	e902133162	ntb: stop tasklet from spinning forever during shutdown. We can leave tasklet spinning forever if we disable the tasklet during qp shutdown and the tasklets are still being kicked off. This hopefully should avoid that race condition. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Reported-by: Alex Depoutovitch <alex@pernixdata.com> Tested-by: Alex Depoutovitch <alex@pernixdata.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-03-17 20:38:40 -04:00
Arnd Bergmann	1985a88107	ntb: perf test: fix address space confusion The ntb driver assigns between pointers an __iomem tokens, and also casts them to 64-bit integers, which results in compiler warnings on 32-bit systems: drivers/ntb/test/ntb_perf.c: In function 'perf_copy': drivers/ntb/test/ntb_perf.c:213:10: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast] vbase = (u64)(u64 )mw->vbase; ^ drivers/ntb/test/ntb_perf.c:214:14: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast] dst_vaddr = (u64)(u64 )dst; ^ This adds __iomem annotations where needed and changes the temporary variables to iomem pointers to avoid casting them to u64. I did not see the problem in linux-next earlier, but it show showed up in 4.5-rc1. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Dave Jiang <dave.jiang@intel.com> Fixes: `8a7b6a778a` ("ntb: ntb perf tool") Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-03-17 20:38:40 -04:00
Allen Hubbe	03beaec80d	NTB: Fix macro parameter conflict with field name If the parameter given to the macro is replaced throughout the macro as it is evaluated. The intent is that the macro parameter should replace the only the first parameter to container_of(). However, the way the macro was written, it would also inadvertantly replace a structure field name. If a parameter of any other name is given to the macro, it will fail to compile, if the structure does not contain a field of the same name. At worst, it will compile, and hide improper access of an unintended field in the structure. Change the macro parameter name, so it does not conflict with the structure field name. Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com> Acked-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-01-21 19:53:10 -05:00
Xiangliang Yu	a1b3695820	NTB: Add support for AMD PCI-Express Non-Transparent Bridge This adds support for AMD's PCI-Express Non-Transparent Bridge (NTB) device on the Zeppelin platform. The driver connnects to the standard NTB sub-system interface, with modification to add hooks for power management in a separate patch. The AMD NTB device has 3 memory windows, 16 doorbell, 16 scratch-pad registers, and supports up to 16 PCIe lanes running a Gen3 speeds. Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com> Reviewed-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-01-21 19:51:04 -05:00
Dave Jiang	8a7b6a778a	ntb: ntb perf tool Providing raw performance data via a tool that directly access data from NTB w/o any software overhead. This allows measurement of the hardware performance limit. In revision one we are only doing single direction CPU and DMA writes. Eventually we will provide bi-directional writes. The measurement using DMA engine for NTB performance measure does not measure the raw performance of DMA engine over NTB due to software overhead. But it should provide the peak performance through the Linux DMA driver. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Tested-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-01-17 22:08:05 -05:00
Dave Jiang	8c874cc140	NTB: Address out of DMA descriptor issue with NTB The transport right now does not handle the case where we run out of DMA descriptors. We just fail when we do not succeed. Adding code to retry for a bit attempting to use the DMA engine instead of instantly fail to CPU copy. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-01-11 10:01:27 -05:00
Dave Jiang	703872c2c5	NTB: Clear property bits in BAR value The lower bits read from a BAR register will contain property bits that we do not care about. Clear those so that we can use the BAR values for limit and xlat registers. Reported-by: Conrad Meyer <cem@freebsd.org> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-01-11 09:51:17 -05:00
Jon Mason	179f912a39	NTB: ntb_process_tx error path bug The transmit overrun avoidance error path in ntb_process_tx accidentally swapped the first two values being passed to the tx_handler client. This could result in crashes in the ntb_netdev (or other out-of-tree NTB clients). Reported-by: Alex Depoutovitch <alex@pernixdata.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2016-01-11 09:51:17 -05:00
Arnd Bergmann	fdcb4b2e78	NTB: fix 32-bit compiler warning resource_size_t may be 32-bit wide on some architectures, which causes this warning when building the NTB code: drivers/ntb/ntb_transport.c: In function 'ntb_transport_link_work': drivers/ntb/ntb_transport.c:828:46: warning: right shift count >= width of type [-Wshift-count-overflow] The warning is harmless but can be avoided by using the upper_32_bits() macro. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: `e26a5843f7` ("NTB: Split ntb_hw_intel and ntb_transport drivers") Signed-off-by: Jon Mason <jdmason@kudzu.us>	2015-11-08 16:24:43 -05:00
Dave Jiang	8b782fab4d	NTB: unify translation addresses There is no need for the upstream and downstream addresses to be different for the NTB configs. Go to using a single set of address. It is still possible to configure them differently using module parameter override however. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Acked and Tested-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2015-11-08 16:11:21 -05:00
Jon Mason	c92ba3c5d9	NTB: invalid buf pointer in multi-MW setups Order of operations issue with the QP Num and MW count, which would result in the receive buffer pointer being invalid if there are more than 1 MW. Corrected with parenthesis to enforce the proper order of operations. Reported-by: John I. Kading <John.Kading@gd-ms.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2015-11-08 16:11:21 -05:00
Sudip Mukherjee	70d4687d60	NTB: remove unused variable These variables were not used anywhere. So remove them. Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2015-11-08 16:11:21 -05:00
Sudip Mukherjee	d4adee09fd	NTB: fix access of free-ed pointer We were accessing nt->mw_vec after freeing it. Fix the error path so that we free nt->mw_vec after we have finished using it. Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2015-11-08 16:11:21 -05:00
Dave Jiang	04afde45e0	NTB: Fix issue where we may be accessing NULL ptr smatch detected an issue in the function ntb_transport_max_size() where we could be dereferencing a dma channel pointer when it is NULL. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2015-11-08 16:11:21 -05:00
Allen Hubbe	9a07826f99	NTB: Fix range check on memory window index The range check must exclude the upper bound. Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2015-09-07 15:27:12 -04:00
Allen Hubbe	2aa2a77a48	NTB: Improve index handling in B2B MW workaround Check that b2b_mw_idx is in range of the number of memory windows when initializing the device. The workaround is considered to be in effect only if the device b2b_idx is exactly UINT_MAX, instead of any index past the last memory window. Only print B2B MW workaround information in debugfs if the workaround is in effect. Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2015-09-07 15:27:12 -04:00
Dave Jiang	569410ca75	NTB: Use unique DMA channels for TX and RX Allocate two DMA channels, one for TX operation and one for RX operation, instead of having one DMA channel for everything. This provides slightly better performance, and also will make error handling cleaner later on. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2015-09-07 15:17:09 -04:00
Allen Hubbe	905921e748	NTB: Remove dma_sync_wait from ntb_async_rx The dma_sync_wait can hurt the performance of workloads mixed with both large and small frames. Large frames will be copied using the dma engine. Small frames will be copied by the cpu. The dma_sync_wait prevents the cpu and dma engine copying in parallel. In the period where the cpu is copying, the dma engine is stopped. The dma engine is not doing any useful work to copy large frames during that time, and the additional time to restart the dma engine for the next large frame. This will decrease the throughput for the portion of a workload with large frames. In the period where the dma engine is copying, the cpu is held up waiting for dma to complete. The small frames processing will be delayed until the dma is complete. The RX frames are completed in-order, and the processing of small frames takes very little time, so dma_sync_wait may have an insignificant impact on the respose time of frames. The more significant impact is to the system, because the delay in dma_sync_wait is implemented as busy non-blocking wait. This can prevent the delayed core from doing any useful work, even if it could be processing work for other drivers, unrelated to transport RX processing. After applying the earlier patch to fix out-of-order RX acknoledgement, the dma_sync_wait is no longer necessary. Remove it, so that cpu memcpy will proceed immediately for small frames, in parallel with ongoing dma for large frames. Do not hold up the cpu from doing work while dma is in progress. The prior fix will continue to ensure in-order completion of the RX frames to the upper layer, and in-order delivery of the RX acknoledgement. Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2015-09-07 15:17:08 -04:00
Dave Jiang	d98ef99e37	NTB: Clean up QP stats info Make QP stats info more readable for debugging purposes. Also add an entry to indicate whether DMA is being used. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2015-09-07 15:17:08 -04:00
Dave Jiang	315100004f	NTB: Make the transport list in order of discovery The list should be added from the bottom and not the top in order to ensure the transport is provided in the same order to clients as ntb devices are discovered. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2015-09-07 15:17:08 -04:00
Dave Jiang	0a5d19d9f0	NTB: Add PCI Device IDs for Broadwell Xeon Adding PCI Device IDs for B2B (back to back), RP (root port, primary), and TB (transparent bridge, secondary) devices. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2015-09-07 15:17:08 -04:00
Dave Jiang	e74bfeedad	NTB: Add flow control to the ntb_netdev Right now if we push the NTB really hard, we start dropping packets due to not able to process the packets fast enough. We need to st:qop the upper layer from flooding us when that happens. A timer is necessary in order to restart the queue once the resource has been processed on the receive side. Due to the way NTB is setup, the resources on the tx side are tied to the processing of the rx side and there's no async way to know when the rx side has released those resources. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>	2015-09-07 15:17:08 -04:00

1 2 3 4 5

249 Commits