Hi all,
Beginning from commits close to v2.6.25-rc2, running lguest always oopses
the host kernel. Oops is at [1].
Bisection led to the following commit:
commit 37cc8d7f96
x86/early_ioremap: don't assume we're using swapper_pg_dir
At the early stages of boot, before the kernel pagetable has been
fully initialized, a Xen kernel will still be running off the
Xen-provided pagetables rather than swapper_pg_dir[]. Therefore,
readback cr3 to determine the base of the pagetable rather than
assuming swapper_pg_dir[].
static inline pmd_t * __init early_ioremap_pmd(unsigned long addr)
{
- pgd_t *pgd = &swapper_pg_dir[pgd_index(addr)];
+ /* Don't assume we're using swapper_pg_dir at this point */
+ pgd_t *base = __va(read_cr3());
+ pgd_t *pgd = &base[pgd_index(addr)];
pud_t *pud = pud_offset(pgd, addr);
pmd_t *pmd = pmd_offset(pud, addr);
Trying to analyze the problem, it seems on the guest side of lguest,
%cr3 has a different value from &swapper_pg-dir (which
is AFAIK fine on a pravirt guest):
Putting some debugging messages in early_ioremap_pmd:
/* Appears 3 times */
[ 0.000000] ***************************
[ 0.000000] __va(%cr3) = c0000000, &swapper_pg_dir = c02cc000
[ 0.000000] ***************************
After 8 hours of debugging and staring on lguest code, I noticed something
strange in paravirt_ops->set_pmd hypercall invocation:
static void lguest_set_pmd(pmd_t *pmdp, pmd_t pmdval)
{
*pmdp = pmdval;
lazy_hcall(LHCALL_SET_PMD, __pa(pmdp)&PAGE_MASK,
(__pa(pmdp)&(PAGE_SIZE-1))/4, 0);
}
The first hcall parameter is global pgdir which looks fine. The second
parameter is the pmd index in the pgdir which is suspectful.
AFAIK, calculating the index of pmd does not need a divisoin over four.
Removing the division made lguest work fine again . Patch is at [2].
I am not sure why the division over four existed in the first place. It
seems bogus, maybe the Xen patch just made the problem appear ?
[2]: The patch:
[PATCH] lguest: fix pgdir pmd index cacluation
Remove an error in index calculation which leads to removing
a not existing shadow page table (leading to a Null dereference).
Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>