Apache Segmentation Faults
If you see an error like this:
[Fri Feb 1 01:00:00 2011] [notice] child pid 12345 exit signal Segmentation fault (11)
kernel: grsec: From 126.96.36.199: Segmentation fault occurred at 0000003000003dbc in /usr/sbin/httpd[httpd:15804]
You are experiencing a segmentation fault. This is a generic error that can occur on any Linux system, and it is not caused by ASL. A segmentation fault, also referred to as a segfault (and sometimes loosely as a "bus error", although that is strictly a separate signal), occurs when the system's hardware notifies the operating system that a memory access violation has occurred, and the OS then notifies the application of this condition via a signal. By default, the process then terminates and, if so configured, also "dumps core". One symptom of this with Apache is a "blank screen" or an empty response: the child process or thread was killed before it finished whatever it was doing.
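To see how often this is happening, you can count the matching lines in the Apache error log. The following is a minimal sketch, not part of ASL: the function name count_segfaults is hypothetical, and the log path in the example is an assumption based on a typical package-managed install.

```shell
# Count child-exit segfault entries in an Apache error log.
# count_segfaults is a hypothetical helper; pass the path to your error log.
count_segfaults() {
    log_file="$1"
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c 'exit signal Segmentation fault' "$log_file"
}

# Example (the path is an assumption; adjust it to your system):
# count_segfaults /var/log/httpd/error_log
```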
Many things can cause a segfault. With Apache, for example, it could be a bug in Apache itself, one of its core elements (like APR), PHP, a web application, an external application, or even a library. To find the cause of a segfault you need to generate a core file and perform what is called a backtrace. Disabling pieces of Apache is not a good way to find the cause: segfaults are memory errors, and simple "I disabled this and it stopped" cause-and-effect reasoning is extremely misleading. This article provides guidance on one method you can use to generate a backtrace and find the actual cause of the segfault.
We recommend you contact your OS vendor if you encounter any segmentation faults as this could be caused by any number of components in your Operating System itself.
Core files allow you to diagnose what actually caused the memory fault on your system. Think of them as application logs: they will tell you exactly what was going on when the process stopped working correctly. To determine the cause with a core file, you must first configure your system to allow the creation of core files (this is usually disabled, as core files can use up a considerable amount of disk space).
Check that your system allows cores:
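A minimal way to run this check, assuming a bash-compatible shell (the variable name core_limit is only for illustration):

```shell
# Read the soft limit on core file size for the current shell.
core_limit="$(ulimit -c)"
# Prints "0" when core files are disabled, or "unlimited" when there is no cap.
echo "$core_limit"
```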
If you see "0", core files are disabled, and you will need to set the limit to unlimited. Keep in mind that if Apache is using more memory than you have room for on your disk, you will eat up all your disk space, and that you will get a core for EACH segfault. Do not leave this on for days: watch for cores, and once you have a few, turn off core dump support in Apache. To make core dumps unlimited, run this as root:
ulimit -c unlimited
Then you need to configure the application to "dump core" when it encounters a segmentation fault. For example, if the segfaults are occurring with Apache, configure Apache to dump core:
Put the following in /etc/httpd/conf.d/debug.conf
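The directive below is a reconstruction, assuming the dump directory used in the following steps. CoreDumpDirectory is the standard Apache directive for choosing where core files are written:

```apache
# Write core dumps to the directory prepared in the next step.
CoreDumpDirectory /tmp/apache2-gdb-dump
```

Apache needs to be restarted for the new configuration to take effect.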
Make the directory as root, then set its ownership and permissions so that Apache can write to it:
mkdir -p /tmp/apache2-gdb-dump
chown apache:apache /tmp/apache2-gdb-dump
chmod og+rwx /tmp/apache2-gdb-dump
Install the debuginfo packages for the applications you want to backtrace. These provide the debug symbols that gdb needs to turn the core file into a readable backtrace, which is what lets you track down the source of the segfault. For example, if you are backtracing just Apache, then you need to install its debuginfo packages. If you are debugging something else, like PHP, you need to install its debug symbols, and so on.
Here is an example for a package managed apache install:
yum install httpd-debuginfo
If your OS is missing a debuginfo package for the application you want to backtrace, file a bug report with your OS vendor.
Note: if you have a source-built Apache, such as with cPanel, you will need to rebuild Apache with debugging symbols. For cPanel, the following may work, but contact cPanel for assistance:
1) set the appropriate CFLAGS
2) Run easyapache
If this does not work for you, please contact your apache vendor for assistance.
Note: If you have apache modules loaded, please check to make sure they have the debug symbols installed as well. Contact your OS or control panel vendor for assistance if you are unsure.
Wait for a segfault, and when you get one, generate a backtrace:
gdb /path/to/httpd /path/to/core --batch --quiet -ex "thread apply all bt full" > backtrace.log
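If several cores have accumulated, the same gdb command can be run over each of them. This is a rough sketch under stated assumptions: the helper names list_cores and backtrace_all are hypothetical, the cores are assumed to be named "core" or "core.<pid>" (the actual name depends on your kernel's core pattern settings), and the paths in the example must be adjusted to your system.

```shell
# List core files in a dump directory, sorted by name.
# Skips the .log files that backtrace_all writes next to each core.
list_cores() {
    dump_dir="$1"
    find "$dump_dir" -maxdepth 1 -type f -name 'core*' ! -name '*.log' | sort
}

# Run a full backtrace for every core found, saving one log per core.
backtrace_all() {
    httpd_bin="$1"
    dump_dir="$2"
    list_cores "$dump_dir" | while IFS= read -r core; do
        gdb "$httpd_bin" "$core" --batch --quiet \
            -ex "thread apply all bt full" > "${core}.backtrace.log" 2>&1
    done
}

# Example (both paths are assumptions):
# backtrace_all /usr/sbin/httpd /tmp/apache2-gdb-dump
```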
Then look at the backtrace for the cause. Most often the cause is a bug in Apache or one of its supporting libraries. Please share your backtrace with your Apache vendor first so they can rule out an Apache bug. So far, we have only seen Apache bugs cause segfaults, so if you want your segfault resolved quickly, contact your Apache vendor first. They can tell you whether the issue is with Apache or with mod_security. If the issue is with Apache, we can't help you fix that.
Keep in mind that changes in memory use do not cause segfaults. If you have a webapp, for example, that uses a lot of memory and you get segfaults, the memory use is unfortunately not the cause. The relationship is correlation, not causation: more memory in use means more opportunities for the fault to occur, which raises the probability of hitting a fault that was already possible. So if you segfault when memory use is high and do not when it is low, the higher memory use is still not the cause of the segfault. Think of it as a warning that something else is wrong. You can rule out memory use as the cause.
If Apache was working correctly and now you have an issue, always follow the golden rule: "What changed?" One way to find a potential cause of new errors is to check your system's update logs to see whether any packages were updated around the time the errors began. This is not always the cause of the issue: other parts of the system may have changed, and the logs only show changes that the package management system records, not every change on the system.
Check your yum logs:
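A minimal sketch of that check, assuming yum writes its history to /var/log/yum.log (the usual default on yum-based systems). The function name recent_package_changes is hypothetical:

```shell
# Show the 20 most recent install, update, and erase entries from a yum log.
# Defaults to /var/log/yum.log when no path is given (an assumption).
recent_package_changes() {
    log_file="${1:-/var/log/yum.log}"
    grep -E 'Installed:|Updated:|Erased:' "$log_file" | tail -n 20
}

# Example:
# recent_package_changes /var/log/yum.log
```

On RPM-based systems, `rpm -qa --last` also lists packages by install time, which can serve as a cross-check.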
Ask yourself "When did the issue start, and what changed then?"
This will give you some idea whether an update to a module, the OS, a library, etc. may be the cause. Roll back the change and see if the errors go away; if they do, you have a pretty good idea of what caused them.
If your rollback doesn't give you a root cause, then you need to:
1) Rule out a bug in Apache itself. Set up your system to capture core files, install the debuginfo packages for Apache and its modules, and do a backtrace; the backtrace will show whether the cause is in Apache.
2) Rule out a bug in PHP or some other major component loaded into Apache (the same advice as in 1 applies: get a core and a backtrace).
3) Check your webapps. You'd be surprised what they can do to themselves.
4) Check your .htaccess files to make sure you don't have a monster on your hands. A bad one can kill the server.
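As a starting point for the .htaccess check, the sketch below lists every .htaccess file under a document root with its line count, largest first. It assumes that an unusually long file is a reasonable first suspect, which is only a rough proxy for a problematic one. The helper name htaccess_by_size and the example path are hypothetical.

```shell
# List .htaccess files under a document root with line counts, largest first.
htaccess_by_size() {
    doc_root="$1"
    # Run wc once per file so that no "total" line is mixed into the output.
    find "$doc_root" -type f -name '.htaccess' -exec wc -l {} \; | sort -rn
}

# Example (the path is an assumption):
# htaccess_by_size /var/www/html
```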
Please work with your OS vendor to rule out 1-4 before opening a support case. If you have a really good feeling that a component in ASL is causing an error, we would be happy to look into it; after all, those components are provided by us and we support them. If you have a case where you can rule out your OS, PHP, Apache, etc., please make sure you have a core file for the error (such as a segfault) and a full backtrace. We can't do anything without a core file and backtrace.