* README.txt: New.
* config.h (CYCLE_ACCURATE, CYCLE_STATS): New.
* configure.in (--enable-cycle-accurate, --enable-cycle-stats):
New.  Default to enabled.
* configure: Regenerate.

* cpu.h (regs_type): Add cycle tracking info.
(reset_pipeline_stats): Declare.
(halt_pipeline_stats): Declare.
(pipeline_stats): Declare.
* main.c (done): Call pipeline_stats().
* mem.h (rx_mem_ptr): Moved to here ...
* mem.c (mem_ptr): ... from here.  Rename throughout.
(mem_put_byte): Move LEDs to Port A.  Add Port B to control cycle
statistics.  Move UART to SCI4.
(mem_put_hi): Add TPU 1-2.  TPU 1 and 2 count CPU cycles.
* reg.c (init_regs): Set Rt reg to -1 (no reg).
* rx.c: Add cycle counting and statistics throughout.
(rx_get_byte): Optimize for speed.
(decode_opcode): Likewise.
(reset_pipeline_stats): New.
(halt_pipeline_stats): New.
(pipeline_stats): New.
* trace.c (sim_disasm_one): Print cycle count.

[include/opcode]
* rx.h (RX_Opcode_ID): Add nop2 and nop3 for statistics.
DJ Delorie 2010-07-28 21:58:22 +00:00
parent d61e002c14
commit 933786524e
14 changed files with 2365 additions and 749 deletions


@ -1,3 +1,7 @@
2010-07-27 DJ Delorie <dj@redhat.com>
* rx.h (RX_Opcode_ID): Add nop2 and nop3 for statistics.
2010-07-23 Naveen.H.S <naveen.S@kpitcummins.com>
Ina Pandit <ina.pandit@kpitcummins.com>


@ -57,7 +57,6 @@ typedef enum
RXO_movbir, /* [s,s2] = d (signed) */
RXO_pushm, /* s..s2 */
RXO_popm, /* s..s2 */
RXO_pusha, /* &s */
RXO_xchg, /* s <-> d */
RXO_stcc, /* d = s if cond(s2) */
RXO_rtsd, /* rtsd, 1=imm, 2-0 = reg if reg type */
@ -98,6 +97,8 @@ typedef enum
RXO_jsrrel, /* pc += d */
RXO_rts,
RXO_nop,
RXO_nop2,
RXO_nop3,
RXO_scmpu,
RXO_smovu,


@ -1,3 +1,30 @@
2010-07-27 DJ Delorie <dj@redhat.com>
* README.txt: New.
* config.h (CYCLE_ACCURATE, CYCLE_STATS): New.
* configure.in (--enable-cycle-accurate, --enable-cycle-stats):
New. Default to enabled.
* configure: Regenerate.
* cpu.h (regs_type): Add cycle tracking info.
(reset_pipeline_stats): Declare.
(halt_pipeline_stats): Declare.
(pipeline_stats): Declare.
* main.c (done): Call pipeline_stats().
* mem.h (rx_mem_ptr): Moved to here ...
* mem.c (mem_ptr): ... from here. Rename throughout.
(mem_put_byte): Move LEDs to Port A. Add Port B to control cycle
statistics. Move UART to SCI4.
(mem_put_hi): Add TPU 1-2. TPU 1 and 2 count CPU cycles.
* reg.c (init_regs): Set Rt reg to -1 (no reg).
* rx.c: Add cycle counting and statistics throughout.
(rx_get_byte): Optimize for speed.
(decode_opcode): Likewise.
(reset_pipeline_stats): New.
(halt_pipeline_stats): New.
(pipeline_stats): New.
* trace.c (sim_disasm_one): Print cycle count.
2010-07-07 Kevin Buettner <kevinb@redhat.com>
* gdb-if.c (sim_store_register): Add case for sim_rx_acc_regnum.

121
sim/rx/README.txt Normal file

@ -0,0 +1,121 @@
The RX simulator offers two rx-specific configure options:
--enable-cycle-accurate (default)
--disable-cycle-accurate
If enabled, the simulator will keep track of how many cycles each
instruction takes. While not 100% accurate, it is very close,
including modelling fetch stalls and register latency.
--enable-cycle-stats (default)
--disable-cycle-stats
If enabled, specifying "-v" twice on the simulator command line causes
the simulator to print statistics on how much time was used by each
type of opcode, and what pairs of opcodes tend to happen most
frequently, as well as how many times various pipeline stalls
happened.
The RX simulator offers many command line options:
-v - verbose output. This prints some information about where the
program is being loaded and its starting address, as well as
information about how much memory was used and how many instructions
were executed during the run. If specified twice, pipeline and cycle
information are added to the report.
-d - disassemble output. Each instruction executed is printed.
-t - trace output. Causes a *lot* of printed information about what
every instruction is doing, from math results down to register
changes.
--ignore-*
--warn-*
--error-*
The RX simulator can detect certain types of memory corruption, and
either ignore them, warn the user about them, or error and exit.
Note that valid GCC code may trigger some of these, for example,
writing a bitfield involves reading the existing value, which may
not have been set yet. The options for * are:
null-deref - memory access to address zero. You must modify your
linker script to avoid putting anything at location zero, of
course.
unwritten-pages - attempts to read a page of memory (see below)
before it is written. This is much faster than the next option.
unwritten-bytes - attempts to read individual bytes before they're
written.
corrupt-stack - On return from a subroutine, the memory location
where $pc was stored is checked to see if anything other than
$pc had been written to it most recently.
-i -w -e - these three options change the settings for all of the
above. For example, "-i" tells the simulator to ignore all memory
corruption.
-E - end of options. Any remaining options (after the program name)
are considered to be options for the simulated program, although
such functionality is not supported.
The RX simulator simulates a small number of peripherals, mostly in
order to provide I/O capabilities for testing and such. The supported
peripherals, and their limitations, are documented here.
Memory
Memory for the simulator is stored in a hierarchical tree, much like
the i386's page directory and page tables. The simulator can allocate
memory to individual pages as needed, allowing the simulated program
to act as if it had a full 4 GB of RAM at its disposal, without
actually allocating more memory from the host operating system than
the simulated program actually uses. Note that for each page of
memory, there's a corresponding page of memory *types* (for tracking
memory corruption). Memory is initially filled with all zeros.
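
The lazy allocation described above can be sketched as a two-level
table.  This is only an illustration, not the simulator's own code:
the 10/10/12 bit split matches the L1_BITS, L2_BITS and OFF_BITS
values in mem.c, but demo_dir, demo_pages_allocated and demo_mem_ptr
are hypothetical names, and the per-page type tracking is left out.

```c
#include <stdint.h>
#include <stdlib.h>

/* Bit split taken from mem.c: 10 + 10 + 12 = 32 address bits.  */
#define L1_BITS  10
#define L2_BITS  10
#define OFF_BITS 12

/* Hypothetical top-level directory.  Each slot is either NULL or
   points at a table of 1 << L2_BITS page pointers.  */
unsigned char **demo_dir[1 << L1_BITS];

/* Number of pages actually allocated from the host so far.  */
unsigned long demo_pages_allocated;

/* Return a host pointer to the byte backing ADDRESS, allocating the
   second-level table and the page itself on first touch.  Returns
   NULL if the host runs out of memory.  */
unsigned char *
demo_mem_ptr (uint32_t address)
{
  unsigned pt1 = (address >> (L2_BITS + OFF_BITS)) & ((1u << L1_BITS) - 1);
  unsigned pt2 = (address >> OFF_BITS) & ((1u << L2_BITS) - 1);
  unsigned off = address & ((1u << OFF_BITS) - 1);

  if (demo_dir[pt1] == NULL)
    {
      demo_dir[pt1] = calloc (1u << L2_BITS, sizeof (unsigned char *));
      if (demo_dir[pt1] == NULL)
        return NULL;
    }
  if (demo_dir[pt1][pt2] == NULL)
    {
      /* calloc provides the "initially filled with zeros" behaviour.  */
      demo_dir[pt1][pt2] = calloc (1u << OFF_BITS, 1);
      if (demo_dir[pt1][pt2] == NULL)
        return NULL;
      demo_pages_allocated++;
    }
  return demo_dir[pt1][pt2] + off;
}
```

Touching one byte near the top of the address space and one near the
bottom costs two 4 KB pages plus their second-level tables, rather
than anything close to the full address range.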
GPIO Port A
PA.DR is configured as an output-only port (regardless of PA.DDR).
When written to, a row of colored @ and * symbols is printed,
reflecting a row of eight LEDs being either on or off.
GPIO Port B
PB.DR controls the pipeline statistics. Writing a 0 to PB.DR disables
statistics gathering. Writing a non-0 to PB.DR resets all counters
and enables (even if already enabled) statistics gathering. The
simulator starts with statistics enabled, so writing to PB.DR is not
needed if you want statistics on the entire program's run.
SCI4
SCI4.TDR is connected to the simulator's stdout. Any byte written to
SCI4.TDR is written to stdout. If the simulated program writes the
bytes 3, 3, and N in sequence, the simulator exits with an exit value
of N.
SCI4.SSR always returns "transmitter empty".
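
The exit sequence can be modelled as a small state machine.  The
sketch below is a host-side illustration only: demo_sci4_tdr_write is
a hypothetical name, and it assumes (based on a reading of the
mem_put_byte hunk in this commit) that the two leading 3 bytes are
still echoed to the output while the final byte N is not.

```c
#include <stdio.h>

/* Hypothetical model of a write to SCI4.TDR.  Returns -1 while the
   simulated program keeps running, or the exit value N once the byte
   sequence 3, 3, N has been written.  */
int
demo_sci4_tdr_write (unsigned char value, FILE *out)
{
  /* How many consecutive 3 bytes have been seen so far.  */
  static int pending_exit = 0;

  if (pending_exit == 2)
    {
      /* Third byte of the sequence: this is the exit code.  */
      pending_exit = 0;
      return value;
    }

  if (value == 3)
    pending_exit++;
  else
    pending_exit = 0;   /* any other byte restarts the sequence */

  fputc (value, out);
  return -1;
}
```

A simulated program would therefore end a run with exit value 7 by
writing the bytes 3, 3, 7 to SCI4.TDR in that order.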
TPU1.TCNT
TPU2.TCNT
TPU1 and TPU2 are configured as a chained 32-bit counter which counts
machine cycles. It always runs at "ICLK speed", regardless of the
clock control settings. Writing to either of these 16-bit registers
zeros the counter, regardless of the value written. Reading from
these registers returns the elapsed cycle count, with TPU1 holding the
most significant word and TPU2 holding the least significant word.
Note that, much like the hardware, these values may (TPU2.TCNT *will*)
change between reads, so you must read TPU1.TCNT, then TPU2.TCNT, and
then TPU1.TCNT again, and only trust the values if both reads of
TPU1.TCNT were the same.
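
The read-high, read-low, read-high-again rule could look roughly like
this.  The register addresses are the ones used in mem.c in this
commit; everything prefixed demo_ is hypothetical, including the
accessor callback, which exists only so the sketch can be exercised on
a host.  On the target the accessor would presumably be a volatile
16-bit load from the given address.

```c
#include <stdint.h>

#define TPU1_TCNT 0x00088126u   /* most significant word */
#define TPU2_TCNT 0x00088136u   /* least significant word */

/* Hypothetical accessor type: reads one 16-bit register.  */
typedef uint16_t (*demo_read16_fn) (uint32_t address);

/* Read the chained 32-bit cycle counter, retrying whenever the high
   word changed while the low word was being read.  */
uint32_t
demo_read_cycles (demo_read16_fn read16)
{
  uint16_t hi, lo, hi2;

  do
    {
      hi = read16 (TPU1_TCNT);
      lo = read16 (TPU2_TCNT);
      hi2 = read16 (TPU1_TCNT);
    }
  while (hi != hi2);

  return ((uint32_t) hi << 16) | lo;
}

/* Host-side stand-in for the counter: it advances by one on every
   register read, starting just below a low-word rollover.  */
uint32_t demo_fake_counter = 0x0001fffeu;

uint16_t
demo_fake_read16 (uint32_t address)
{
  uint32_t now = demo_fake_counter;

  demo_fake_counter += 1;
  return address == TPU1_TCNT ? (uint16_t) (now >> 16) : (uint16_t) now;
}
```

With the stand-in above, the first pass sees the high word change from
1 to 2 and is discarded; the second pass is consistent and is the one
returned.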


@ -105,3 +105,9 @@
/* Define to 1 if you have the ANSI C header files. */
#undef STDC_HEADERS
/* --enable-cycle-accurate */
#undef CYCLE_ACCURATE
/* --enable-cycle-stats */
#undef CYCLE_STATS

1891
sim/rx/configure vendored

File diff suppressed because it is too large.


@ -25,6 +25,36 @@ AC_CHECK_HEADERS(getopt.h)
sinclude(../common/aclocal.m4)
AC_ARG_ENABLE(cycle-accurate,
[ --disable-cycle-accurate ],
[case "${enableval}" in
yes | no) ;;
*) AC_MSG_ERROR(bad value ${enableval} given for --enable-cycle-accurate option) ;;
esac])
AC_ARG_ENABLE(cycle-stats,
[ --disable-cycle-stats ],
[case "${enableval}" in
yes | no) ;;
*) AC_MSG_ERROR(bad value ${enableval} given for --enable-cycle-stats option) ;;
esac])
echo enable_cycle_accurate is $enable_cycle_accurate
echo enable_cycle_stats is $enable_cycle_stats
if test "x${enable_cycle_accurate}" != xno; then
AC_DEFINE([CYCLE_ACCURATE])
if test "x${enable_cycle_stats}" != xno; then
AC_DEFINE([CYCLE_STATS])
fi
else
if test "x${enable_cycle_stats}" != xno; then
AC_ERROR([cycle-stats not available without cycle-accurate])
fi
fi
# Bugs in autoconf 2.59 break the call to SIM_AC_COMMON, hack around
# it by inlining the macro's contents.
sinclude(../common/common.m4)


@ -76,8 +76,24 @@ typedef struct
SI r_temp;
DI r_acc;
#ifdef CYCLE_ACCURATE
/* If set, RTS/RTSD take 2 fewer cycles. */
char fast_return;
SI link_register;
unsigned long long cycle_count;
/* Bits saying what kind of memory operands the previous insn had. */
int m2m;
/* Target register for load. */
int rt;
#endif
} regs_type;
#define M2M_SRC 0x01
#define M2M_DST 0x02
#define M2M_BOTH 0x03
#define sp 0
#define psw 16
#define pc 17
@ -219,6 +235,9 @@ extern unsigned int heaptop;
extern unsigned int heapbottom;
extern int decode_opcode (void);
extern void reset_pipeline_stats (void);
extern void halt_pipeline_stats (void);
extern void pipeline_stats (void);
extern void trace_register_changes ();
extern void generate_access_exception (void);


@ -82,6 +82,8 @@ done (int exit_code)
printf ("insns: %14s\n", comma (rx_cycles));
else
printf ("insns: %u\n", rx_cycles);
pipeline_stats ();
}
exit (exit_code);
}


@ -25,6 +25,7 @@ along with this program. If not, see <http://www.gnu.org/licenses/>. */
1. */
#define RDCHECK 0
#include "config.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
@ -37,7 +38,7 @@ along with this program. If not, see <http://www.gnu.org/licenses/>. */
#define L1_BITS (10)
#define L2_BITS (10)
#define OFF_BITS (12)
#define OFF_BITS PAGE_BITS
#define L1_LEN (1 << L1_BITS)
#define L2_LEN (1 << L2_BITS)
@ -70,15 +71,8 @@ init_mem (void)
memset (mem_counters, 0, sizeof (mem_counters));
}
enum mem_ptr_action
{
MPA_WRITING,
MPA_READING,
MPA_CONTENT_TYPE
};
static unsigned char *
mem_ptr (unsigned long address, enum mem_ptr_action action)
unsigned char *
rx_mem_ptr (unsigned long address, enum mem_ptr_action action)
{
int pt1 = (address >> (L2_BITS + OFF_BITS)) & ((1 << L1_BITS) - 1);
int pt2 = (address >> OFF_BITS) & ((1 << L2_BITS) - 1);
@ -240,7 +234,7 @@ e ()
static char
mtypec (int address)
{
unsigned char *cp = mem_ptr (address, MPA_CONTENT_TYPE);
unsigned char *cp = rx_mem_ptr (address, MPA_CONTENT_TYPE);
return "udp"[*cp];
}
@ -254,48 +248,75 @@ mem_put_byte (unsigned int address, unsigned char value)
if (trace)
tc = mtypec (address);
m = mem_ptr (address, MPA_WRITING);
m = rx_mem_ptr (address, MPA_WRITING);
if (trace)
printf (" %02x%c", value, tc);
*m = value;
switch (address)
{
case 0x00e1:
{
case 0x0008c02a: /* PA.DR */
{
static int old_led = -1;
static char *led_on[] =
{ "\033[31m O ", "\033[32m O ", "\033[34m O " };
static char *led_off[] = { "\033[0m · ", "\033[0m · ", "\033[0m · " };
int red_on = 0;
int i;
if (old_led != value)
{
fputs (" ", stdout);
for (i = 0; i < 3; i++)
fputs (" ", stdout);
for (i = 0; i < 8; i++)
if (value & (1 << i))
fputs (led_off[i], stdout);
{
if (! red_on)
{
fputs ("\033[31m", stdout);
red_on = 1;
}
fputs (" @", stdout);
}
else
fputs (led_on[i], stdout);
fputs ("\033[0m\r", stdout);
{
if (red_on)
{
fputs ("\033[0m", stdout);
red_on = 0;
}
fputs (" *", stdout);
}
if (red_on)
fputs ("\033[0m", stdout);
fputs ("\r", stdout);
fflush (stdout);
old_led = value;
}
}
break;
case 0x3aa: /* uart1tx */
#ifdef CYCLE_STATS
case 0x0008c02b: /* PB.DR */
{
if (value == 0)
halt_pipeline_stats ();
else
reset_pipeline_stats ();
}
break;
#endif
case 0x00088263: /* SCI4.TDR */
{
static int pending_exit = 0;
if (value == 0)
if (pending_exit == 2)
{
if (pending_exit)
{
step_result = RX_MAKE_EXITED(value);
return;
}
pending_exit = 1;
step_result = RX_MAKE_EXITED(value);
longjmp (decode_jmp_buf, 1);
}
else if (value == 3)
pending_exit ++;
else
putchar(value);
pending_exit = 0;
putchar(value);
}
break;
@ -314,19 +335,33 @@ mem_put_qi (int address, unsigned char value)
COUNT (1, 1);
}
static int tpu_base;
void
mem_put_hi (int address, unsigned short value)
{
S ("<=");
if (rx_big_endian)
switch (address)
{
mem_put_byte (address, value >> 8);
mem_put_byte (address + 1, value & 0xff);
}
else
{
mem_put_byte (address, value & 0xff);
mem_put_byte (address + 1, value >> 8);
#ifdef CYCLE_ACCURATE
case 0x00088126: /* TPU1.TCNT */
tpu_base = regs.cycle_count;
break;
case 0x00088136: /* TPU2.TCNT */
tpu_base = regs.cycle_count;
break;
#endif
default:
if (rx_big_endian)
{
mem_put_byte (address, value >> 8);
mem_put_byte (address + 1, value & 0xff);
}
else
{
mem_put_byte (address, value & 0xff);
mem_put_byte (address + 1, value >> 8);
}
}
E ();
COUNT (1, 2);
@ -388,7 +423,7 @@ mem_put_blk (int address, void *bufptr, int nbytes)
unsigned char
mem_get_pc (int address)
{
unsigned char *m = mem_ptr (address, MPA_READING);
unsigned char *m = rx_mem_ptr (address, MPA_READING);
COUNT (0, 0);
return *m;
}
@ -399,12 +434,12 @@ mem_get_byte (unsigned int address)
unsigned char *m;
S ("=>");
m = mem_ptr (address, MPA_READING);
m = rx_mem_ptr (address, MPA_READING);
switch (address)
{
case 0x3ad: /* uart1c1 */
case 0x00088264: /* SCI4.SSR */
E();
return 2; /* transmitter empty */
return 0x04; /* transmitter empty */
break;
default:
if (trace)
@ -433,15 +468,28 @@ mem_get_hi (int address)
{
unsigned short rv;
S ("=>");
if (rx_big_endian)
switch (address)
{
rv = mem_get_byte (address) << 8;
rv |= mem_get_byte (address + 1);
}
else
{
rv = mem_get_byte (address);
rv |= mem_get_byte (address + 1) << 8;
#ifdef CYCLE_ACCURATE
case 0x00088126: /* TPU1.TCNT */
rv = (regs.cycle_count - tpu_base) >> 16;
break;
case 0x00088136: /* TPU2.TCNT */
rv = (regs.cycle_count - tpu_base) >> 0;
break;
#endif
default:
if (rx_big_endian)
{
rv = mem_get_byte (address) << 8;
rv |= mem_get_byte (address + 1);
}
else
{
rv = mem_get_byte (address);
rv |= mem_get_byte (address + 1) << 8;
}
}
COUNT (0, 2);
E ();
@ -520,7 +568,7 @@ sign_ext (int v, int bits)
void
mem_set_content_type (int address, enum mem_content_type type)
{
unsigned char *mt = mem_ptr (address, MPA_CONTENT_TYPE);
unsigned char *mt = rx_mem_ptr (address, MPA_CONTENT_TYPE);
*mt = type;
}
@ -537,7 +585,7 @@ mem_set_content_range (int start_address, int end_address, enum mem_content_type
if (sz + ofs > L1_LEN)
sz = L1_LEN - ofs;
mt = mem_ptr (start_address, MPA_CONTENT_TYPE);
mt = rx_mem_ptr (start_address, MPA_CONTENT_TYPE);
memset (mt, type, sz);
start_address += sz;
@ -547,6 +595,6 @@ mem_set_content_range (int start_address, int end_address, enum mem_content_type
enum mem_content_type
mem_get_content_type (int address)
{
unsigned char *mt = mem_ptr (address, MPA_CONTENT_TYPE);
unsigned char *mt = rx_mem_ptr (address, MPA_CONTENT_TYPE);
return *mt;
}


@ -25,10 +25,25 @@ enum mem_content_type {
MC_NUM_TYPES
};
enum mem_ptr_action
{
MPA_WRITING,
MPA_READING,
MPA_CONTENT_TYPE
};
void init_mem (void);
void mem_usage_stats (void);
unsigned long mem_usage_cycles (void);
/* rx_mem_ptr returns a pointer which is valid as long as the address
requested remains within the same page. */
#define PAGE_BITS 12
#define PAGE_SIZE (1 << PAGE_BITS)
#define NONPAGE_MASK (~(PAGE_SIZE-1))
unsigned char *rx_mem_ptr (unsigned long address, enum mem_ptr_action action);
void mem_put_qi (int address, unsigned char value);
void mem_put_hi (int address, unsigned short value);
void mem_put_psi (int address, unsigned long value);
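
The comment on rx_mem_ptr says the returned pointer stays valid only
within one page.  One way a caller might exploit that is a one-entry
pointer cache keyed on the page address, sketched below.  This is an
assumption about usage, not code from this commit: the demo_ names,
the lookup callback and the backing array are all hypothetical, and
only PAGE_BITS, PAGE_SIZE and NONPAGE_MASK come from mem.h.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_BITS 12
#define PAGE_SIZE (1 << PAGE_BITS)
#define NONPAGE_MASK (~(PAGE_SIZE-1))

/* Hypothetical one-entry cache of the most recently used page.  */
typedef struct
{
  uint32_t page;         /* address & NONPAGE_MASK of the cached page */
  unsigned char *base;   /* host pointer to that page, or NULL */
} demo_page_cache;

/* Hypothetical slow lookup, standing in for a call such as
   rx_mem_ptr (page_address, MPA_READING).  */
typedef unsigned char *(*demo_lookup_fn) (uint32_t page_address);

/* Return a host pointer for ADDRESS, calling LOOKUP only when ADDRESS
   falls outside the page that is currently cached.  */
unsigned char *
demo_cached_ptr (demo_page_cache *c, uint32_t address, demo_lookup_fn lookup)
{
  uint32_t page = address & (uint32_t) NONPAGE_MASK;

  if (c->base == NULL || c->page != page)
    {
      c->base = lookup (page);
      c->page = page;
      if (c->base == NULL)
        return NULL;
    }
  return c->base + (address & (PAGE_SIZE - 1));
}

/* Host-side stand-in for the simulator's memory: two pages, selected
   by the lowest page-number bit, plus a call counter.  */
unsigned char demo_backing[2][PAGE_SIZE];
int demo_lookups;

unsigned char *
demo_lookup (uint32_t page_address)
{
  demo_lookups++;
  return demo_backing[(page_address >> PAGE_BITS) & 1];
}
```

Two accesses inside the same 4 KB page cost a single lookup; crossing
a page boundary forces a fresh one, which is exactly the validity
limit the comment describes.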


@ -19,6 +19,7 @@
along with this program. If not, see <http://www.gnu.org/licenses/>. */
#include "config.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
@ -67,6 +68,11 @@ init_regs (void)
{
memset (&regs, 0, sizeof (regs));
memset (&oldregs, 0, sizeof (oldregs));
#ifdef CYCLE_ACCURATE
regs.rt = -1;
oldregs.rt = -1;
#endif
}
static unsigned int

File diff suppressed because it is too large.


@ -19,6 +19,7 @@ You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>. */
#include "config.h"
#include <stdio.h>
#include <stdarg.h>
#include <string.h>
@ -321,7 +322,13 @@ sim_disasm_one (void)
}
opbuf[0] = 0;
printf ("\033[33m%06x: ", mypc);
#ifdef CYCLE_ACCURATE
printf ("\033[33m %04u %06x: ", (int)(regs.cycle_count % 10000), mypc);
#else
printf ("\033[33m %06x: ", mypc);
#endif
max = print_insn_rx (mypc, & info);
for (i = 0; i < max; i++)