After commit b1bbfe7 (aio / timers: On timer modification, qemu_notify
or aio_notify, 2013-08-21) FreeBSD guests report a huge slowdown.
The problem shows up as soon as FreeBSD turns out its periodic (~1 ms)
tick, but the timers are only the trigger for a pre-existing problem.
Before the offending patch, setting a timer did a timer_settime system call.
After, setting the timer exits the event loop (which uses poll) and
reenters it with a new deadline. This does not cause any slowdown; the
difference is between one system call (timer_settime and a signal
delivery (SIGALRM) before the patch, and two system calls afterwards
(write to a pipe or eventfd + calling poll again when re-entering the
event loop).
Unfortunately, the exit/enter causes the main loop to grab the iothread
lock, which in turns kicks the VCPU thread out of execution. This
causes TCG to execute the next VCPU in its round-robin scheduling of
VCPUS. When the second VCPU is mostly unused, FreeBSD runs a "pause"
instruction in its idle loop which only burns cycles without any
progress. As soon as the timer tick expires, the first VCPU runs
the interrupt handler but very soon it sets it again---and QEMU
then goes back doing nothing in the second VCPU.
The fix is to make the pause instruction do "cpu_loop_exit".
Cc: Richard Henderson <rth@twiddle.net>
Reported-by: Luigi Rizzo <rizzo@iet.unipi.it>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1384948442-24217-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
The instructions CMOVcc, FCMOVcc and F[U]COMI[P] should only be
present if the CMOV feature bit is set. Add missing feature bit
checks so we correctly fault if emulating a 486 or 586.
This fixes bug LP:1201446.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Prepares for changing cpu_single_step() argument to CPUState.
Acked-by: Michael Walle <michael@walle.cc> (for lm32)
Signed-off-by: Andreas Färber <afaerber@suse.de>
Also use bool type while at it.
Prepares for moving singlestep_enabled field to CPUState.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andreas Färber <afaerber@suse.de>
The code reorganization in commit 4a6fd938 broke handling of PREFIX_ADR.
While fixing this, tidy and comment the code so that it's more obvious
what's going on in setting both aflag and dflag.
The TARGET_X86_64 ifdef can be eliminated because CODE64 expands to the
constant zero when TARGET_X86_64 is undefined.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1369855851-21400-1-git-send-email-rth@twiddle.net
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Fix EFLAGS corruption by ROR r8/r16 imm instruction located at the end
of the TB, similarly to commit 089305ac for the non-immediate case.
Reported-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
This replaces the feature-bit fields on both X86CPU and x86_def_t
structs with an array.
With this, we will be able to simplify code that simply does the same
operation on all feature words (e.g. kvm_check_features_against_host(),
filter_features_for_kvm(), add_flagname_to_bitmaps(), CPU feature-bit
property lookup/registration, and the proposed "feature-words" property)
The following field replacements were made on X86CPU and x86_def_t:
(cpuid_)features -> features[FEAT_1_EDX]
(cpuid_)ext_features -> features[FEAT_1_ECX]
(cpuid_)ext2_features -> features[FEAT_8000_0001_EDX]
(cpuid_)ext3_features -> features[FEAT_8000_0001_ECX]
(cpuid_)ext4_features -> features[FEAT_C000_0001_EDX]
(cpuid_)kvm_features -> features[FEAT_KVM]
(cpuid_)svm_features -> features[FEAT_SVM]
(cpuid_)7_0_ebx_features -> features[FEAT_7_0_EBX]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Fixed EFLAGS corruption by ROR r8/r16 instruction located at the end of the TB.
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
gen_op_mov_TN_reg() loads the value in cpu_T[0], so this temporary should
be used instead of cpu_tmp0.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
When starting from CC_OP_DYNAMIC, and issuing adox before adcx,
a typo used the wrong value for the resulting CC_OP.
Cc: Blue Swirl <blauwirbel@gmail.com>
Reported-by: Torbjorn Granlund <tg@gmplib.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Fix various typos and misspellings. The bulk of these were found with
codespell.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The gen_icount_start/end functions are now somewhat misnamed since they
are useful for generic "start/end of TB" code, used for more than just
icount. Rename them to gen_tb_start/end.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
These correspond very closely to the insns that we're emulating.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
The shift and rotate insns use movcond to set CC_OP, and thus
achieve a conditional EFLAGS setting. By discarding CC_OP in
a later flags setting insn, we can discard that movcond.
Signed-off-by: Richard Henderson <rth@twiddle.net>
We weren't computing flags for lzcnt at all. At the same time,
adjust the implementation of bsf/bsr to avoid the local branch,
using movcond instead.
Signed-off-by: Richard Henderson <rth@twiddle.net>
As this is the first of the BMI insns to be implemented,
this carries quite a bit more baggage than normal.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Add another slot in ENV and store two of the three inputs. This lets us
do less work when carry-out is not needed, and avoids the unpredictable
CC_OP after translating these insns.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Pass the data in explicitly, rather than indirectly via env.
This avoids all sorts of unnecessary register spillage.
Signed-off-by: Richard Henderson <rth@twiddle.net>
After a comparison or subtraction, the original value of the LHS will
currently be reconstructed using an addition. However, in most cases
it is already available: store it in a temp-local variable and save 1
or 2 TCG ops (2 if the result of the addition needs to be extended).
The temp-local can be declared dead as soon as the cc_op changes again,
or also before the translation block ends because gen_prepare_cc will
always make a copy before returning it. All this magic, plus copy
propagation and dead-code elimination, ensures that the temp local will
(almost) never be spilled.
Example (cmp $0x21,%rax + jbe):
Before After
----------------------------------------------------------------------------
movi_i64 tmp1,$0x21 movi_i64 tmp1,$0x21
movi_i64 cc_src,$0x21 movi_i64 cc_src,$0x21
sub_i64 cc_dst,rax,tmp1 sub_i64 cc_dst,rax,tmp1
add_i64 tmp7,cc_dst,cc_src
movi_i32 cc_op,$0x11 movi_i32 cc_op,$0x11
brcond_i64 tmp7,cc_src,leu,$0x0 discard loc11
brcond_i64 rax,cc_src,leu,$0x0
Before After
----------------------------------------------------------------------------
mov (%r14),%rbp mov (%r14),%rbp
mov %rbp,%rbx mov %rbp,%rbx
sub $0x21,%rbx sub $0x21,%rbx
lea 0x21(%rbx),%r12
movl $0x11,0xa0(%r14) movl $0x11,0xa0(%r14)
movq $0x21,0x90(%r14) movq $0x21,0x90(%r14)
mov %rbx,0x98(%r14) mov %rbx,0x98(%r14)
cmp $0x21,%r12 | cmp $0x21,%rbp
jbe ... jbe ...
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Placing the CC_OP_DYNAMIC at the join is less effective than
before the branch, as the branch will have forced global registers
to their home locations. This way we have a chance to discard
CC_SRC2 before it gets stored.
Signed-off-by: Richard Henderson <rth@twiddle.net>
A jump that ends a basic block or otherwise falls back to CC_OP_DYNAMIC
will always have to call gen_op_set_cc_op. However, not all jumps end
a basic block, so introduce a variant that does not do this.
This was partially undone earlier (i386: drop cc_op argument of gen_jcc1),
redo it now also to prepare for the introduction of src2.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Replace low-level ops with a higher-level "cmp %al, (A0)" in the case
of scas, and "cmp T0, (A0)" in the case of cmps.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
It is almost unused, and it is simpler to pass a TCG value directly
to gen_shiftd_rm_T1_T3. This value is then written to t2 without
going through a temporary register.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
This simplifies all the jump generation code. CCPrepare allows the
code to create an efficient brcond always, so there is no need to
duplicate the setcc and jcc code.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
This makes the i386 front-end able to create CCPrepare structs for all
condition, not just those that come from a single flag. In particular,
JCC_L and JCC_LE can be optimized because gen_prepare_cc is not forced
to return a result in bit 0 (unlike gen_setcc_slow).
However, for now the slow jcc operations will still go through CC
computation in a single-bit temporary, followed by a brcond if the
temporary is nonzero.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>