LNEXT processing accounts for ~15% of total cpu time in end-to-end
tty i/o; factor the lnext test/clear from the per-char i/o path.
Instead, attempt to immediately handle the literal next char if not
at the end of this received buffer; otherwise, handle the first char
of the next received buffer as the literal next char, then continue
with normal i/o.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Always pre-figure the space available in the read_buf and limit
the inbound receive request to that amount.
For compatibility reasons with the non-flow-controlled interface,
n_tty_receive_buf() will continue filling read_buf until all data
has been received or receive_room() returns 0.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Convert to modal receive_buf processing; factor char receive
processing for unusual termios settings out of normal per-char
i/o path.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Factor 'special' per-char processing into standalone fn,
n_tty_receive_char_special(), which handles processing for chars
marked in the char_map.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Relocate the IXANY restart tty test to code paths where the
the received char is not START_CHAR, STOP_CHAR, INTR_CHAR,
QUIT_CHAR or SUSP_CHAR.
Fixes the condition when ISIG if off and one of INTR_CHAR,
QUIT_CHAR or SUSP_CHAR does not restart i/o.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Simplify __receive_buf() into a dispatch function; perform per-char
processing for all other modes not already handled.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 20bafb3d23
'n_tty: Move buffers into n_tty_data'
broke the ppc64 build.
Include vmalloc.h for the required function declarations.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Convert to modal receive_buf() processing; factor receive char
processing when tty->closing into n_tty_receive_buf_closing().
Note that EXTPROC when ISTRIP or IUCLC is set continues to be
handled by n_tty_receive_char().
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When EXTPROC is set without ISTRIP or IUCLC, processing is
identical to raw mode; handle this receiving mode as a special-case
of raw mode.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Convert to modal receive_buf() processing; factor raw mode
per-char i/o into n_tty_receive_buf_raw().
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Prepare for modal receive_buf() handling; factor handling for
TTY_BREAK, TTY_PARITY, TTY_FRAME and TTY_OVERRUN into
n_tty_receive_char_flagged().
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reduce the monolithic n_tty_receive_char() complexity; factor the
handling of INTR_CHAR, QUIT_CHAR and SUSP_CHAR into
n_tty_receive_signal_char().
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The char and flag buffer local alias pointers, p and f, are
unnecessary; remove them.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In canonical mode, an EOF which is not the first character of the line
causes read() to complete and return the number of characters read so
far (commonly referred to as EOF push). However, if the previous read()
returned because the user buffer was full _and_ the next character
is an EOF not at the beginning of the line, read() must not return 0,
thus mistakenly indicating the end-of-file condition.
The TTY_PUSH flag is used to indicate an EOF was received which is not
at the beginning of the line. Because the EOF push condition is
evaluated by a thread other than the read(), multiple EOF pushes can
cause a premature end-of-file to be indicated.
Instead, discover the 'EOF push as first read character' condition
from the read() thread itself, and restart the i/o loop if detected.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Separate the head & commit indices from the tail index to avoid
cache-line contention (so called 'false-sharing') between concurrent
threads.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since neither echo_commit nor echo_tail can change for the duration
of __process_echoes loop, substitute index comparison for the
snapshot counter.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Don't have the driver flush received echoes if no echoes were
actually output.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Byte-by-byte echo output is painfully slow, requiring a lock/unlock
cycle for every input byte.
Instead, perform the echo output in blocks of 256 characters, and
at least once per flip buffer receive. Enough space is reserved in
the echo buffer to guarantee a full block can be saved without
overrunning the echo output. Overrun is prevented by discarding
the oldest echoes until enough space exists in the echo buffer
to receive at least a full block of new echoes.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Use output_lock mutex as a memory barrier when storing echo_commit.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Adding data to echo_buf (via add_echo_byte()) is guaranteed to be
single-threaded, since all callers are from the n_tty_receive_buf()
path. Processing the echo_buf can be called from either the
n_tty_receive_buf() path or the n_tty_write() path; however, these
callers are already serialized by output_lock.
Publish cumulative echo_head changes to echo_commit; process echo_buf
from echo_tail to echo_commit; remove echo_lock.
On echo_buf overrun, claim output_lock to serialize changes to
echo_tail.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Prepare for lockless echo_buf handling; compute current byte count
of echo_buf from head and tail indices.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Instead of using a single index to track the current echo_buf position,
use a head index when adding to the buffer and a tail index when
consuming from the buffer. Allow these head and tail indices to wrap
at max representable value; perform modulo reduction via helper
functions when accessing the buffer.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The echo_overrun field is only assigned and never tested; remove it.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Scheduling buffer work on the same cpu as the read() thread
limits the parallelism now possible between the receive_buf path
and the n_tty_read() path.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The pty driver forces ldisc flow control on, regardless of available
receive buffer space, so the writer can be woken whenever unthrottle
is called. However, this 'forced throttle' has performance
consequences, as multiple atomic operations are necessary to
unthrottle and perform the write wakeup for every input line (in
canonical mode).
Instead, short-circuit the unthrottle if the tty is a pty and perform
the write wakeup directly.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Prepare to special case pty flow control; avoid forward declaration.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Prepare for special handling of pty throttle/unthrottle; factor
flow control into helper functions.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Prepare to factor throttle and unthrottle into helper functions;
relocate chars_in_buffer() to avoid forward declaration.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
No tty driver modifies termios during throttle() or unthrottle().
Therefore, only read safety is required.
However, tty_throttle_safe and tty_unthrottle_safe must still be
mutually exclusive; introduce throttle_mutex for that purpose.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If the read buffer indices are in the same cache-line, cpus will
contended over the cache-line (so called 'false sharing').
Separate the producer-published fields from the consumer-published
fields; document the locks relevant to each field.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
User-space read() can run concurrently with receiving from device;
waiting for receive_buf() to complete is not required.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
lnext escapes the next input character as a literal, and must
be reset when canonical mode changes (to avoid misinterpreting
a special character as a literal if canonical mode is changed
back again).
lnext is specifically not reset on a buffer flush so as to avoid
misinterpreting the next input character as a special character.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
n_tty has a single-producer/single-consumer input model;
use lockless publish instead.
Use termios_rwsem to exclude both consumer and producer while
changing or resetting buffer indices, eg., when flushing. Also,
claim exclusive termios_rwsem to safely retrieve the buffer
indices from a thread other than consumer or producer
(eg., TIOCINQ ioctl).
Note the read_tail is published _after_ clearing the newline
indicator in read_flags to avoid racing the producer.
Drop read_lock spinlock.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
canon_data represented the # of lines which had been copied
to the receive buffer but not yet copied to the user buffer.
The value was tested to determine if input was available in
canonical mode (and also to force input overrun if the
receive buffer was full but a newline had not been received).
However, the actual count was irrelevent; only whether it was
non-zero (meaning 'is there any input to transfer?'). This
shared count is unnecessary and unsafe with a lockless algorithm.
The same check is made by comparing canon_head with read_tail instead.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Use termios_rwsem to guarantee safe access to the termios values.
This is particularly important for N_TTY as changing certain termios
settings alters the mode of operation.
termios_rwsem must be dropped across throttle/unthrottle since
those functions claim the termios_rwsem exclusively (to guarantee
safe access to the termios and for mutual exclusion).
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Storing the read_cnt creates an unnecessary shared variable
between the single-producer (n_tty_receive_buf()) and the
single-consumer (n_tty_read()).
Compute read_cnt from head & tail instead of storing.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Wrap read_buf indices (read_head, read_tail, canon_head) at
max representable value, instead of at the N_TTY_BUF_SIZE. This step
is necessary to allow lockless reads of these shared variables
(by updating the variables atomically).
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Prepare for replacing read_cnt field with computed value.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
N_TTY .chars_in_buffer() method requires serialized access if
the current thread is not the single-consumer, n_tty_read().
Separate the internal interface; prepare for lockless read-side.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Instead of pushing one char per loop, pre-compute the data length
to copy and copy all at once.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>