Automated Network Threat Response

I finally got the code uploaded. It still needs some tweaks from a user point of view (I nginx'd a couple of scripts, and need to figure out a simple way to re-integrate them into the jobs that call them).

I decided to put up a statically named ssfw-blocklists/known_urls.txt as well, for those interested in what has been tried over a 6 month period. By changing /blob/ to /raw/ in the URL you can download it, or otherwise view it in its raw text format.
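
If you just want the raw text without clicking around, something like the following works (the repo host/user part of the URL is a placeholder here, not the real path):

```
# Swap /blob/ for /raw/ in whatever page URL you have, then fetch it.
blob_url="https://github.com/<user>/ssfw-blocklists/blob/main/known_urls.txt"  # placeholder URL
curl -L -o known_urls.txt "$(printf '%s' "$blob_url" | sed 's,/blob/,/raw/,')"
```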

I am thinking of doing something similar for sshd “Failed password for invalid user” names, too.
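
For anyone wanting a quick look at that data on their own box, a rough one-liner along these lines pulls the attempted usernames out (this assumes BusyBox syslogd writing sshd lines to /var/log/messages; adjust the path for auth.log or journald systems):

```
# Extract and count the usernames from sshd "invalid user" failures.
sed -n 's/.*Failed password for invalid user \([^ ]*\) from .*/\1/p' /var/log/messages \
  | sort | uniq -c | sort -rn
```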

I hope some of this data makes its way back into the community. A lot of the URLs are from a list (or lists) available in “security testers”: when you check the logs, there are very definitely some “examples” that were supposed to be changed (like using 192.168.1.xxx) in the Jaws, Mozi.a and Mozi.m exploit URLs (script kiddies, gotta love them, or not) :slight_smile:

> Did you look at OpenResty? It’s Nginx with a Lua plugin system, good for rate limiting, etc.

All other solutions have “prerequisites”. If you aren’t running a webserver, and you don’t need log interaction, it’s one script, and that will work on Alpine Linux (the most bare-bones of full system OSes), with negligible impact on a single core, 1 GB (or less) system.

If you do need webserver protection, at its simplest it’s just one script that generates an “include” for any default webserver setup (only testing Nginx atm). There is no overhead for that webserver, no 3rd party items needed to use it, and no “language” needed to understand it, outside of the webserver config format, if you decide to read the “include” before using it.
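
To give a feel for what that means, here is a sketch of the idea (not the actual SSFW script, and the paths are made up):

```
# Turn a list of blocked IPv4s into an Nginx snippet any server block can include.
BLOCKLIST=/var/lib/ssfw/blocked_ips.txt     # hypothetical: one IPv4 per line
INCLUDE=/etc/nginx/conf.d/ssfw_deny.inc     # hypothetical output location

{
  echo "# generated $(date -u '+%Y-%m-%d %H:%M:%S') UTC"
  while read -r ip; do
    [ -n "$ip" ] && echo "deny $ip;"
  done < "$BLOCKLIST"
} > "$INCLUDE"

# Then add "include /etc/nginx/conf.d/ssfw_deny.inc;" inside a server { } block
# and reload with "nginx -s reload".
```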

What I have should also function when webserver software is being used as a reverse proxy, which is how I will be using it, but that has yet to be fully tested.

That said, there are some other monitoring solutions, some of which use an AI approach. Some recent analysis has shown that “certain items” may be worth upstreaming to those aforementioned solutions (e.g. IP address ASN ownership and their related IP address ranges).

Thanks for the mention of another solution by name though; unless you are in the industry (say, a network admin specifically targeting entry-point protection and/or response solutions) it can be difficult to find something good in this arena that suits your needs, unless it is “name dropped”, “recommended” or otherwise “mentioned”.

That said, I use what I have developed every day, and because I don’t use a web control panel frontend, the “extra fluff” makes analysis and management pretty simple, and it gives me the leeway to think about things like “automated range analysis and blocking”. I am also starting to want to know more about why an IP address was blocked, and how I can get that extra information into what I have already developed without doing any “overhaul” of the code or the recording format. I have also recently realised that some of the checks I do can be sped up incredibly by using a different approach (because I record on the filesystem, and not in a file).
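
I read the filesystem part as: when every blocked address is its own marker file, “is it already blocked?” becomes a single stat() rather than a grep through a flat list. A rough sketch of that idea (the directory layout here is invented, not the real SSFW one):

```
BLOCKDIR=/var/lib/ssfw/blocked     # hypothetical layout
ip="203.0.113.45"

mkdir -p "$BLOCKDIR"
if [ ! -e "$BLOCKDIR/$ip" ]; then
    # record when it was blocked, then do the actual block
    date -u '+%Y-%m-%dT%H:%M:%SZ' > "$BLOCKDIR/$ip"
fi
```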

From an uninitiated user point of view, dealing with different log file formats is probably the most restrictive entry point atm (outside of standard Linux sshd and Nginx default formats), but the way I write code means I could come back to this in 10 years without having used it, and know almost immediately how to get it running with something different. There is enough documentation in the code for someone else to easily adapt the standalone “Gerka” to handle other port monitoring (i.e. FTP, SMTP, POP3, SQL, etc.), roughly along the lines sketched below.
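
Not the actual Gerka code, but this is the general shape such a standalone watcher tends to take, for anyone wondering how much work “adapting” really is (log path and match pattern are service specific):

```
LOG=/var/log/messages                 # whatever log the monitored service writes to
PATTERN='Failed password for'         # service-specific match

tail -F "$LOG" | while read -r line; do
    case "$line" in
        *"$PATTERN"*)
            ip=$(printf '%s\n' "$line" | sed -n 's/.* from \([0-9.]*\).*/\1/p')
            [ -n "$ip" ] && echo "would block $ip"    # hand off to the real block logic here
            ;;
    esac
done
```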

As far as making a “super simple firewall” available to the public goes, I have some ideas about “refocusing” the project on the “simple” part, but most of that stems from the need to test other webservers and other internet services, along with the need to create a “unique” presence in a field that is already quite full (especially of outdated or unsupported offerings).

To that end I might add a remote install script that checks for certain services and gives you some sort of checklist that can be modified, and then auto-gets, auto-installs and auto-runs everything you need. Even though I dislike systemd, there is nothing stopping me from using it to allow IPv4 blocking to function across reboots, even when a logging system is not installed, and it’s available by default on my current OS.
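
For the reboot-survival part, on a systemd distro that could be as small as a oneshot unit that replays the recorded blocks at boot. Sketch only, with made-up script and unit names:

```
# Write a oneshot unit that re-applies SSFW blocks early at boot, then enable it.
cat > /etc/systemd/system/ssfw-restore.service <<'EOF'
[Unit]
Description=Re-apply SSFW IPv4 blocks after boot
After=network-pre.target
Wants=network-pre.target

[Service]
Type=oneshot
ExecStart=/usr/local/sbin/ssfw-restore-blocks.sh

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl enable ssfw-restore.service
```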

In all honesty I don’t see many people using the resulting project very much at all, and those that do will probably be looking for a “super lightweight firewall” type solution, and even then possibly only for certain ports, certain server software, or specific log formats, so single “Gerka”s will be more useful.

I do however see an eventual uptick in the use of the block lists over time, especially by security researchers and security analysts, and then by others who find them useful (e.g. like the discontinued Internet Storm Center IP address blocklists). There might be some interesting data that can be expanded upon at “record time” to help with deeper analysis over longer time periods (but who knows).

Like I said elsewhere, I am also thinking about making it integrate with some web control panel frontends, but that will come later when I start using them myself (looking at PiHole and the other web management frontends it plugs into or sits alongside).

Cheers

Paul

After fluffing around with some CIDR and AS scripts, I realised that the extra data I wanted to see is also the same extra data needed to concoct valid DShield logs for submission to the ISC (SANS Internet Storm Center).

A lot of the SSH attacks are singles from a unique IPv4, and that’s it. Sometimes they also probe the web server (and possibly other services and ports; Alpine Linux with Nginx, and the port 443 service off by default, is very light on attack surface). However, when I check these IP addresses on ISC I often see “one report from six days ago” type results.

I got hit by a subdomain probe: 420 incoming connections in less than 1 minute, to about 5-6 URLs each, and each connection was for a unique subdomain. Again, when I checked ISC: “one report from six days ago”.
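
(As an aside, for anyone doing the same lookups by hand, the ISC also publishes an API; it is worth confirming the current endpoint in their docs, but it looks roughly like this:)

```
# Query ISC/DShield for what they have on a single address (JSON output).
curl -s "https://isc.sans.edu/api/ip/203.0.113.45?json"
```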

These (and other web probes) prompted me to reinvestigate the available DShield options. It turns out you can actually get single file Perl scripts, and (as it happens) Alpine Linux has Perl installed by default. This info was hard to find (dated 2004-2007), outside of the Framework and RPi Honey Pot options.

The thing is, the DShield logs require(-ish) the TCP or UDP transport per log entry, plus SYN flags, DNS, etc. From the current data I already know all my stuff is TCP, and I know the ports (there are only 2 atm). The thing with Nginx logs, though, is they don’t store the source port.
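
One possible way around that: Nginx does expose the client source port as the $remote_port variable, so a custom log_format could capture it for DShield-style records. A sketch only; where the generated file needs to land depends on how your nginx.conf does its includes:

```
# Drop a log_format definition into the http { } context; enable it per server/location.
cat > /etc/nginx/conf.d/ssfw_dshield_format.conf <<'EOF'
log_format ssfw_dshield '$remote_addr $remote_port $status "$request" "$http_user_agent"';
# access_log /var/log/nginx/ssfw_dshield.log ssfw_dshield;
EOF
```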

Part of these issues with modern attack surfaces and routes is addressed by expanding how and what can be logged, based on the progress and outcome of the RPi Honey Pot. Which means (atm the only way forward is) I have to install the Honey Pot somewhere (it’s no longer RPi only, BTW) and look at how and what sort of data it is collecting.

But it also means I now have a reason to move the IPv4 range block list entries over to the filesystem, and to extend the data I am storing in log files based on where it came from and who logged it, and intervene with some analytics to dump some extra data too (like whether it was a root or user attack on SSH, and whether the user was unknown - at least one person has seen the SSFW username, but I know they don’t know that).

Outside of the Gerkas, the scripts and the organisation of them are only slightly flawed in their current v1 state. I have come to realise that the individual webserver configuration folders require the presence of the associated log analysis and blocking scripts, to maintain easy porting for new servers and services (I want to add FTP at the same time I add Apache).

However, in providing a custom log for Nginx to capture haxors, I did note it is not the same format as either the Nginx Access or Error log files, and those two are different from each other too, with Access storing less info but also logging Error entries as well as my haxor entries. To make things even worse (from my point of view), because I don’t have PHP set up yet, there are a lot of 405/406 “unsupported” log entries that don’t make it into the Error log (POST & PUT).

Strangely, there is one empty item in the default custom log in Nginx that has never been filled in all the 6 months I have been collecting and analysing data.

I am concerned that providing custom Nginx Error and Access logs will break most 3rd party collection and analysis tools and/or interfaces. It’s annoying to think I will probably have to run 2 sets of log files to do DShield properly.

Nginx, especially when set up as a reverse proxy (as I have it), can handle connection management and access denial, but that does not help SSFW logging, blocking and analysis, it only mitigates the problem (and maybe it would not help the DShield logs either - or maybe it could, by improving the log output?).

It may be that I need to write some proper kernel integration code, and look at something like rsyslog as a basis (which can handle 15 million log entries per second). It seems a bit much for a (supposedly) “super simple firewall”, but it might be useful as the basis for the upstream end point of what the OP proposes.

I am also concerned about the default log rotation management of the syslogd “messages” files: 200 KB rotation and .0 naming, whereas Debian/RPiOS is weekly rotation with .1 naming, and gzipped above that (4).

I will probably need to file a query, issue and/or bug report regarding the lack of file writing with high volume syslogd log messages on BusyBox/Alpine.

I’d prefer to scavenge an .xz package into the log archive when I do my monthly backups/rotations. Speaking of which, the perils of manual grep && sed -i on log files mean I lost all current SSFW log entries for the 30th of January (it was only 47 entries - and they are logs, not blocks).
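
The .xz part is the easy bit; something along these lines during the monthly run would do it (assuming the xz package is installed - the BusyBox applets only cover decompression - and the archive path is made up):

```
ARCHIVE=/var/log/archive            # hypothetical archive location
mkdir -p "$ARCHIVE"
xz -9c /var/log/messages.0 > "$ARCHIVE/messages-$(date +%Y%m).xz"
```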

Some sort of automated log management is called for. I see how some of the older DShield Perl scripts handle it (by storing the last entry they processed). The annoying part is that it really needs to be checked every 10 seconds on a default Alpine setup, because of its 200 KB limit. Actually, maybe it’s just a “wake up, do some maths, adjust sleep timer, sleep” type problem, and (on Alpine) you can test and store the size/date of messages.0 and then copy it when it changes (sort of accurate).
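
Which might look something like this - purely a sketch of that “wake up, check, sleep” idea, with a made-up snapshot directory:

```
ROTATED=/var/log/messages.0          # .0 on BusyBox syslogd, .1 on Debian-style logrotate
SNAPDIR=/var/lib/ssfw/log-snapshots  # hypothetical
last=""

mkdir -p "$SNAPDIR"
while :; do
    cur=$(stat -c '%s %Y' "$ROTATED" 2>/dev/null)   # size + mtime as a change marker
    if [ -n "$cur" ] && [ "$cur" != "$last" ]; then
        cp "$ROTATED" "$SNAPDIR/messages.$(date +%Y%m%d%H%M%S)"
        last=$cur
    fi
    sleep 10
done
```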

Anyway, for some random or unknown reason, SSFragWhare: the super simple firewall with brutality bonus made it into the #devember2021 winners circle as a finalist, and it only took me about 8 hours to comprehend what I saw (guess that’s what happens when you only get 2 hours sleep in 2+ days). I think it might be my “beside the seaside” post that did it.

When I get some more Gerkas going, and after I have made these log changes and got the DShield submissions running, I would like to pair up either the DShield Honey Pot on an RPi, or PiHole on an RPi - actually maybe both, using the PiHole server to watch and react to the Honey Pot server. I need a PiHole here when I change my networking over. This will (at least privately) allow me to investigate the original target of this OP - Automated Network Threat Response.

Argh, well … I guess that’s it for this update (let’s hope the Judges find a juicy 12V/5V device for me).

Cheers

Paul

Meh

It turns out my fears about the Nginx log file thing (above) are a non-issue.

I found the 404project Perl script in the ISC Honey Pot source tree, which takes an Apache log and sequesters certain info into the format ISC wants (which is pretty close to the DShield log format - near identical), and both the default custom Nginx logs and the Access log are the same format, which will be why Access also contains 404 and other errors.

The script looks for, but does not post, the actual error, so I can use the haxor logs as-is (even though they are listed as 200, i.e. not an error, because I catch them in the config). It also appears that the “missing item” is skipped over by the script too, so it is probably a legacy Apache thing and now irrelevant.
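
For reference, pulling the relevant fields out of a combined-format access log is simple enough from the shell too. A rough sketch (the actual Perl script handles the submission side; this only shows the field extraction):

```
# Combined format: $remote_addr - $remote_user [$time_local] "$request" $status $bytes "$referer" "$agent"
# Print client IP, status code, and request path.
awk -F'"' '{
    split($1, pre,  " ");   # pre[1]  = client IP
    split($2, req,  " ");   # req[2]  = request path
    split($3, post, " ");   # post[1] = status code
    print pre[1], post[1], req[2];
}' /var/log/nginx/access.log
```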

Sweet

Strange that the (RPi) Honey Pot was/is pushed as an improvement to the logging situation, when that 404 Perl script is from 2012 (the year the RPi was born). I looked at the Python honeypot script, and it even has a memcached service capture - odd, but I guess without authentication it is an open target (I was just reading their protocol.txt last night).

Anyways, a couple less concerns, and one step closer to some ISC data logging …

… on with the show.

For anyone looking to free up some log entries and help combat failed sshd attempts, I got a mention in risks’ “Sshd on port 22 spamming your logs? here's a solution” post, where I also make a case for SS:FragWhare usage as an assist. I think his approach is quite elegant, and it will work on the majority of Linux distributions (those which have iptables and netfilter-persistent installed by default).

A note on that, and a reminder that there is a monthly updated IPv4 blocklist archive available in the SS:FragWhare repo. You will need to screen it first, to make sure your own IP address and your hosting service are not already in there (no Linode IP ranges are present, but 99% of DigitalOcean, most MS Azure, and some Serverion ranges are).
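
A quick way to do at least the single-address part of that screening (exact-line matches only, so CIDR ranges still need eyeballing; the blocklist filename is a placeholder):

```
MYIP=$(wget -qO- https://icanhazip.com)        # or however you normally find your public IPv4
if grep -Fxq "$MYIP" ipv4_blocklist.txt; then  # -F fixed string, -x whole line, -q quiet
    echo "warning: $MYIP is listed - edit the file before deploying it"
fi
```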

As an update to this thread, the usefulness of what constitutes the analytics is at the point where some sort of AI analysis is now a logical next step, along with various control panel frontend integrations (e.g. PiHole). I still have not added the “bonus check” for a certain type of sshd key exchange attempt, but I have a log file for reference, so it will get added at some point.

I would really love to throw SS:FragWhare at some other server installations (besides Alpine), but the above mentioned post is the 1st time I have ever seen anyone interested in what’s happening, and I know they are not using it. On top of the financial constraints, my current situation does not lend itself to making that happen either, besides the fact that it takes time away from the original reason for SS:FragWhare in the first place. I’ll see what I can do; I might be able to swing 4 servers in total, for a while at least.

The latest post on the project page is more in line with this thread (as opposed to being project specific), and sheds some light on the recent ebb and flow of internet hacking attempts.

As a side note to the original approach I use with SS:FragWhare (the use of ip), I found out last week that, amongst the growing list of deprecated Linux command line tools, ip was introduced as their replacement, so you get “future proofing” by default. :wink:
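
For anyone who has not played with it, ip on its own is already enough to drop an attacker without any extra firewall package. Whether SSFW uses exactly this form is beside the point; it is just one way a plain ip-based block can look:

```
# Null-route a single address (inbound connections from it can no longer complete),
# and the matching removal. The same tool also covers the old ifconfig/route/arp jobs.
ip route add blackhole 203.0.113.45/32
ip route del blackhole 203.0.113.45/32
```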
