
[Sheflug] Re: random crashes - how to prepare bug report?



>
> My linux machine is crashing randomly once every couple of days - it
> freezes up and will not respond to anything (including ctrl-alt-del, or
> ping from another machine) except the on/off switch.  The load on the
> machine is light, and the work it is doing is not particularly unusual.
>

Hmm...kernel 2.4.7-10...that'll be a Red Hat-modified kernel then. I'm running
2.4.13 and on a couple of occasions now it's frozen solid, responding only to
the reset switch :-/ The unusual thing is that it's fine if I use it every
day. It has only frozen on the couple of occasions I've left it running whilst
I've been elsewhere (e.g. over a weekend when I'm not around to use it),
which totally baffles me.

> (1) Can anyone suggest how I could gather useful information about what
>    is going on?
>
>  I put a line like this in /etc/syslog.conf:
>
>  *.debug;mail.none;authpriv.none;cron.none  /var/log/messages
>
>  As far as I understand it, this should get all possible debugging
>  information out of syslogd, although I'm not completely clear whether
>  any more could be squeezed out of klogd.  In any case, I'm not getting
>  any messages around the time of a crash.  I've also turned on all the
>  logging options that I can find in the processes that I am running,

Yep; this line will get all kernel messages, but if you want *everything*,
then just saying:

*.debug     /var/log/messages

will have the desired effect. Depending on mail, cron and authentication
activity, though, this could grow your log file a lot :) But you may get an
idea of what's going on before the freeze.

It may also be an idea to run syslogd as, say, "syslogd -m 10" ... this will
place "-- MARK --" entries in your messages file every 10 minutes (or every
<x> minutes if you choose another number). You can use this to find out at
what time it freezes (assuming it freezes when you're not at your machine):
the last entry before the gap tells you roughly when it stopped. You can then
see whether it's always at the same time, and find out what was running at
that time, using the cron and syslog entries as a guide.
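
To save scrolling through the log by hand, here is a rough Python sketch that
looks for gaps between consecutive entries that are longer than the mark
interval. It assumes the traditional "Mon DD HH:MM:SS" timestamp at the start
of each line, that you supply the year yourself (the log doesn't record it),
and that the log doesn't span a New Year. The function names, the 2-minute
slack and the hostname "box" in the comment are made up for illustration.

```python
from datetime import datetime, timedelta


def parse_timestamp(line, year):
    """Parse the leading 'Mon DD HH:MM:SS' of a traditional syslog line.

    Example line (hostname is hypothetical):
        Dec 14 03:10:00 box -- MARK --
    """
    # The timestamp occupies the first 15 characters; syslog omits the
    # year, so the caller has to supply it.
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def find_gaps(lines, year, mark_minutes=10, slack_minutes=2):
    """Return (last_entry_before, first_entry_after) for each silent period.

    With 'syslogd -m 10' running there should be an entry at least every
    10 minutes, so a longer silence suggests the machine was frozen or off.
    """
    limit = timedelta(minutes=mark_minutes + slack_minutes)
    gaps = []
    previous = None
    for line in lines:
        try:
            current = parse_timestamp(line, year)
        except ValueError:
            continue  # skip lines that don't start with a timestamp
        if previous is not None and current - previous > limit:
            gaps.append((previous, current))
        previous = current
    return gaps


def report(path, year):
    """Print every gap found in the log file at 'path'."""
    with open(path, errors="replace") as log:
        for before, after in find_gaps(log, year):
            print(f"silent from {before} until {after}")
```

Calling report("/var/log/messages", 2001) would then list each silent
period; the first timestamp of each pair is the last thing logged before the
box went quiet, which is where to start looking in the cron and daemon logs.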

>  without any helpful effect.
>
> (2) If I can get any usable information about the problem, does anyone
>    know where I should send it?
>
>  If I knew that it was a kernel problem, I'd try the linux-kernel mailing
>  list.  But that looks pretty intimidating, so I'd want to be sure I knew
>  what I was talking about first!  Also, I guess that some kind of
>  hardware problem is more likely.
>

Linux kernel mailing list. There are a couple of people reporting freezes. I
may add mine to it as well soon, though I'm waiting for 2.4.17 to come out
(currently at -pre7) so I can check that's okay first (the problem may have
been fixed in some 2.4.x with x > 13).

> I'm using Red Hat 7.2, which includes the 2.4.7-10 kernel, on a
> machine with an Intel Pentium 4 CPU running at 1.5 GHz and 512M of RAM.
> Crashes occur even when I am not running X and no users are
> logged on.  The main process that I am running is the Jakarta Tomcat web
> server, which runs a Java servlet, which runs the symbolic
> mathematics program Maple as an external process.  As far as I can tell
> from the logs, when the last crash occurred, there had been no request to
> the web server for some time.  It's just possible that a request
> triggered the crash, which prevented the request from being logged, but I
> doubt it.

Well there's not a lot one can do when a box stops solid. There are other
things I've seen mentioned, including setting a terminal up on the serial
port so you can do some debugging that way, but I've not tried owt like that
yet :)

Seeing as you're toying with a request to Tomcat, if you have lots and lots
and lots of disk space, you could do a "tcpdump -w <file>" and write all the
network traffic to a spool so you can analyse that after the machine comes
back up. This could eat up lots of disk space though, so only do it if you
can afford the space and/or are really sure it is a network problem.
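
One way to keep the capture from filling the disk is to have tcpdump rotate
between a fixed number of size-limited files. The Python sketch below just
builds such a command line; it assumes a tcpdump new enough to support -C
(file size in millions of bytes) together with -W (number of files to rotate
through), which older versions may not have. The interface name, output path
and port 8080 filter in the usage note are guesses for illustration, not
something taken from your setup.

```python
import shlex


def build_tcpdump_command(interface, out_file, file_mb=100, file_count=10,
                          bpf_filter=""):
    """Build a tcpdump command that writes to a bounded ring of files.

    -C limits each file to roughly file_mb million bytes, and -W makes
    tcpdump cycle through file_count files, overwriting the oldest.
    """
    if file_mb <= 0 or file_count <= 0:
        raise ValueError("file_mb and file_count must be positive")
    command = [
        "tcpdump",
        "-i", interface,        # interface to listen on
        "-w", out_file,         # write raw packets instead of printing them
        "-C", str(file_mb),     # start a new file after this many MB
        "-W", str(file_count),  # keep at most this many files
    ]
    if bpf_filter:
        command.append(bpf_filter)  # optional capture filter expression
    return command


def max_disk_usage_mb(file_mb, file_count):
    """Approximate upper bound on the disk space the capture can take."""
    return file_mb * file_count


def as_shell(command):
    """Render the argument list as a string you could paste into a shell."""
    return " ".join(shlex.quote(arg) for arg in command)
```

For example, as_shell(build_tcpdump_command("eth0", "/var/tmp/freeze.pcap",
50, 4, "tcp port 8080")) gives a command that should never use much more
than about 200 MB, at the cost of only keeping the most recent traffic, which
is the part you want after a freeze anyway.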

Chris...

-- 
\ Chris Johnson           \
 \ cej [at] nccnet.co.uk        \
  \ www.nccnet.co.uk/~cej/  ~-----------------------------------------+
   \ Redclaw chat - http://redclaw.org.uk - telnet redclaw.org.uk 2000 \____



___________________________________________________________________

Sheffield Linux User's Group - http://www.sheflug.co.uk . 
To unsubscribe from this list send mail to 
shef-lug-request@list.sheflug.org.uk with the word
"unsubscribe" in the body of the message. 

  GNU the choice of a complete generation.