Step by Step – How to Respond to a Security Incident
Last month we posted about 6 Common WordPress Exploits You Need To Know and the 7 Telltale signs that your site is out of your control. This week we will get a little more technical and dive into how to respond to a security incident.
The longer you ignore updates, the more likely it is that your site will be hit by a successful exploit. When that happens you are typically unprepared, and restoring the site becomes an emergency. Then comes the analysis to identify how it happened and how to prevent it from happening again. After that you’ll be safe, at least for the time being.
So how should you manage through the crisis?
1. It’s a good idea to keep an open line of communication with your customers as the recovery effort progresses. Depending on how the exploit was carried out, the customer may also need to make some changes on their side to stay safe. If the entire site is compromised and an outside party has gained admin-level access, we should enforce a password change for the site’s accounts once the site is restored. Asking the customer whether they noticed anything out of the ordinary may also help with the investigation.
2. Most successful recoveries depend on having a functional backup. In the past, performing point-in-time backups was fraught with issues. Was it a real backup that included database transactions? Was it just a zip of the folders hosting the data? Can you recover from it quickly? Storing your backups on the same server that was exploited is also not a great idea.
In our experience with the Pantheon hosting platform, backup and restoration are not always seamless, quick, or successful. On occasion, a restore of a live site may fail to complete, leaving the site in a non-functional state. Scheduled daily backups may start but fail to complete within the allotted time. Some of these failures may be attributed to the timing of the backup or the size of the data.
The frequency of backups and the number kept on hand are also important. If you have been on vacation and left your site unattended, and you only have 7 days’ worth of backups but the site was exploited 8 days ago, you won’t have a viable backup to restore from. The same applies to a 30-day retention window (the default for Pantheon scheduled backups): if you ignore your site for 31 days, the restoration process will be more difficult. If your retention is low, look at offloading your backups to cloud storage or long-term off-site storage. Pantheon also supports on-demand backups with 6-month retention.
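The retention problem above comes down to a simple date comparison. Here is a minimal sketch in Python, assuming you have a list of backup dates and an estimated exploit date. The function name and the sample dates are hypothetical and are not part of any hosting provider’s tooling.

```python
from datetime import date, timedelta


def newest_clean_backup(backup_dates, exploit_date):
    """Return the newest backup taken strictly before the exploit, or None."""
    candidates = [d for d in backup_dates if d < exploit_date]
    return max(candidates) if candidates else None


# Hypothetical example: seven daily backups, taken 1 to 7 days ago.
today = date(2024, 1, 31)
backups = [today - timedelta(days=n) for n in range(1, 8)]

# Exploited 8 days ago: every retained backup is already compromised.
exploit = today - timedelta(days=8)
print(newest_clean_backup(backups, exploit))  # None
```

If the exploit had happened 3 days ago instead, the same call would return the backup from 4 days ago, which is the one to restore first.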
3. Export a copy of the web, application, and database logs for review. Knowing how far back you need to go to find the day of the exploit will help you trim the logs down for analysis. In the web server logs, look mainly for POST requests that received a 200 status code. Once you have a shortlist of those, you can review the other requests from the same sources to determine the extent of the compromise.
4. There are several ways to lock down your site so that it’s no longer publicly accessible. We can use a plugin to configure a splash page that is shown while the plugin is activated. This lets you continue to log in and use your site while the content is unavailable to visitors. Another option is to point DNS at a static site with details about the maintenance. Some hosting providers offer splash pages that are easy to set up. Once one is in place, you can continue to reach the site through alternative DNS names. For a simple solution, you could use the WP-CLI maintenance mode command (wp maintenance-mode activate).
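For the log review in step 3, a first pass over the web server logs could look like the sketch below. It assumes the logs use the common/combined access log layout (client IP, timestamp in brackets, quoted request line, status code). Your host’s format may differ, so treat the regular expression and the sample lines as illustrative only.

```python
import re

# Assumed layout: IP ident user [time] "METHOD path protocol" status size ...
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) '
)


def successful_posts(lines):
    """Return (ip, time, path) for every POST request answered with a 200."""
    hits = []
    for line in lines:
        m = LOG_LINE.match(line)
        if m and m["method"] == "POST" and m["status"] == "200":
            hits.append((m["ip"], m["time"], m["path"]))
    return hits


# Made-up sample lines using documentation-only IP ranges.
sample = [
    '203.0.113.9 - - [10/Jan/2024:03:12:45 +0000] "POST /wp-login.php HTTP/1.1" 200 1234 "-" "curl/8.0"',
    '198.51.100.4 - - [10/Jan/2024:03:13:01 +0000] "GET /index.php HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '203.0.113.9 - - [10/Jan/2024:03:13:10 +0000] "POST /xmlrpc.php HTTP/1.1" 403 210 "-" "curl/8.0"',
]

shortlist = successful_posts(sample)
print(shortlist)
```

Only the first sample line ends up on the shortlist. The IP addresses collected this way give you a starting point for pulling every other request from the same sources.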
1. The next phase requires you to decide whether to restore your live site from backup right away or wait until the root cause has been identified. Restoring is destructive: any content or files uploaded after the backup was taken will be lost. If you backed up the compromised live site beforehand, you can restore that backup to a test environment, compare the data, and copy missing content from the test environment back to the restored live site.
The Pantheon hosting platform makes it easy to create additional environments from backups. We can use this functionality to restore daily backups one at a time until we find one that is clean. If we can’t find a clean one, we can fall back to the restored copy of the exploited site and attempt a recovery from there.
2. Hopefully, at this point, we have recovered from a valid backup. From here on we should perform a backup of the environment and note the time. If your investigation shows that the exploit came through a vulnerable plugin or theme, we can go ahead and remediate it, then attempt to resolve any other outstanding issues with the environment.
Test the site to confirm that the updates were applied successfully. Then perform another backup and note the time, in case we need to restore to this point in the future. In a previous post, we documented how you can verify core and plugin checksums to ensure that they match upstream. We recommend running these checks beforehand to see whether the code was tampered with.
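The idea behind a checksum check can be illustrated with a short Python sketch. This is not how WP-CLI implements it (WP-CLI provides its own wp core verify-checksums and wp plugin verify-checksums commands). The manifest here is a hypothetical dictionary of relative paths and known-good SHA-256 digests, and the demo runs against a temporary directory rather than a real site.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_tampered(root, manifest):
    """Return manifest paths that are missing or whose digest differs."""
    root = Path(root)
    bad = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or sha256_of(target) != expected:
            bad.append(rel)
    return sorted(bad)


# Demo against a throwaway directory with a hypothetical manifest.
with tempfile.TemporaryDirectory() as tmp:
    site = Path(tmp)
    (site / "index.php").write_bytes(b"<?php // original\n")
    manifest = {"index.php": sha256_of(site / "index.php")}

    before = find_tampered(site, manifest)  # nothing changed yet
    (site / "index.php").write_bytes(b"<?php // original\n eval($_POST['x']);")
    after = find_tampered(site, manifest)   # file was modified

print(before, after)
```

Note that this sketch only checks files listed in the manifest. It will not notice extra files an attacker dropped into the site, so a real check should also list files that have no manifest entry.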
3. The other possibility is that a user account was compromised and served as the entry point to the system. If the user has minimal access to the site, we may be OK as long as the attacker was not able to elevate that access. In this case, we can update the content created or changed by this user, reset the account, and notify the customer or user about the entry point.
4. Once your site is back to normal, we can restore access to it and monitor for any new events from the same source IP. If your site was flagged as malicious, we need to confirm that it is no longer delivering the same content as before. Confirm that any cache is cleared, then ask the third party that flagged the site to rescan it and remove it from their list.
5. Keep tabs on the environment to see whether any unusual activity appears. You may want to kick off another backup at the end of the day.
So there you have it – your Step by Step Guide to Responding to a Security Incident! Do you do anything differently?
Reach out to Web Teks for assistance with keeping your site updated and secure!