After fluffing around with some CIDR and AS scripts, I realised that the extra data I wanted to see is also the same extra data needed to concoct valid DShield logs for submission to the ISC (SANS Internet Storm Center).
A lot of the SSH attacks are singles from a unique IPv4, and that's it. Sometimes they also probe the web server (and possibly other services and ports, though Alpine Linux with Nginx, and the port 443 service off by default, is very light on attack surface). However, when I check these IP addresses on ISC I often see “one report from six days ago” type results.
I got hit by a subdomain probe: 420 incoming connections in less than 1 minute, to about 5-6 URLs each, and each connection was a unique subdomain. Again, when I checked ISC: “one report from six days ago”.
These (and other web probes) prompted me to reinvestigate the available DShield options. Turns out you can actually get single-file Perl scripts, and (as it happens) Alpine Linux has Perl installed by default. This info was hard to find (dated 2004-2007), outside of the FrameWork and RPi Honey Pot options.
The thing is, the DShield logs require(-ish) TCP or UDP transport per log entry, and details like SYN flags, DNS, etc. From the current data I already know all my stuff is TCP, and I know the ports (there are only 2 atm). The thing with Nginx logs, though, is they don't store the source port.
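Nginx can be told to record it, though. A minimal sketch, assuming a custom log_format - the name `dshield` and the field order are my guesses at something DShield-friendly, not the official layout; `$remote_port` is a standard Nginx variable:

```nginx
# custom format capturing the source port Nginx normally drops
log_format dshield '$time_iso8601 $remote_addr $remote_port '
                   '$server_addr $server_port '
                   '$request_method "$request_uri" $status';

server {
    # ... existing config ...
    access_log /var/log/nginx/dshield.log dshield;
}
```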
Part of these issues with modern attack surfaces and routes is addressed by expanding how and what can be logged, based on the progress and outcome of the RPi Honey Pot. Which means (atm the only way forward is) I have to install the Honey Pot somewhere (it's no longer RPi-only BTW) and look at how and what sort of data it is collecting.
But it also means I now have a reason to move the IPv4 range block list entries over to the filesystem, and extend the data I am storing in log files based on where it came from and who logged it, and intervene with some analytics to dump some extra data too (like: was it a root or user attack on SSH, and was the user unknown - at least one person has seen the SSFW username, but I know they don't know that).
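For that root/user/unknown split, something like this might do. A sketch only, assuming BusyBox syslogd is putting sshd lines into /var/log/messages and that sshd logs its usual “Failed password for [invalid user] NAME from IP” lines:

```sh
#!/bin/sh
# Hypothetical classifier for SSH attack types. The log path and the
# sshd message formats are assumptions about a default Alpine setup.
LOG="${1:-/var/log/messages}"

grep 'sshd' "$LOG" | awk '
    /Failed password for root from/    { root++ }
    /Failed password for invalid user/ { unknown++ }   # user not in passwd
    /Failed password for/ && !/invalid user/ && !/for root/ { user++ }
    END { printf "root: %d  known-user: %d  unknown-user: %d\n",
                 root+0, user+0, unknown+0 }'
```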
Outside of the Gerkas, the scripts and the organisation of them are only slightly flawed in their current v1 state. I have come to realise that the individual webserver configuration folders require the presence of associated log analysis and blocking scripts, to maintain easy porting for new servers and services (I want to add FTP at the same time I add Apache).
However, in providing a custom log for Nginx to capture haxors, I noted it is not the same format as either the Nginx Access or Error log files, and those are different too, with Access storing less info but also logging Error entries as well as my haxor entries. To make things even worse (from my point of view), because I don't have PHP set up yet, there are a lot of 405/406 “unsupported” log entries that don't make it into the Error log (POST & PUT).
Strangely, there is one empty item in the default custom log in Nginx that has never been filled in all the 6 months I have been collecting and analysing data.
I am concerned that providing custom Nginx Error and Access logs will break most 3rd-party collection and analysis tools and/or interfaces. It's annoying to think I will probably have to run 2 sets of log files to do DShield properly.
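That said, running both sets may not be too painful: Nginx allows multiple access_log directives in the same context, so the stock combined format can stay for the 3rd-party tools alongside a custom one (reusing the hypothetical dshield format from the sketch above):

```nginx
server {
    access_log /var/log/nginx/access.log combined;   # 3rd-party tools keep working
    access_log /var/log/nginx/dshield.log dshield;   # custom DShield-ish feed
}
```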
Nginx, especially when set up as a reverse proxy (as I have it), can handle connection management and access denial, but that does not help SSFW logging, blocking and analysis; it only mitigates the problem (and maybe? it would not help DShield logs - maybe it could? by improving log output).
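For reference, the kind of mitigation I mean is along these lines - a sketch only, with a made-up zone name and rates:

```nginx
# in the http{} block
limit_req_zone $binary_remote_addr zone=probes:10m rate=10r/s;

server {
    location / {
        limit_req zone=probes burst=20 nodelay;   # throttle floods
    }
    location /wp-login.php {
        deny all;   # common probe target; returns 403 before SSFW gets a say
    }
}
```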
It may be that I need to write some proper kernel integration code, and look at something like RSysLog as a basis (which claims to handle on the order of a million log entries per second). It seems a bit much for a (supposedly) “super simple firewall”, but it might be useful as the basis for the upstream end point of what the OP proposes.
I am also concerned about the default log rotation management of the syslogd “messages” files: 200 KB rotation and .0 naming, whereas Debian/RPiOS is weekly rotation with .1 naming, and gzipped above that (4 kept).
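Those BusyBox defaults can at least be loosened. A sketch, assuming Alpine's OpenRC conf file - check `busybox syslogd --help` on your version before trusting the flags:

```sh
# /etc/conf.d/syslog
# -s = max size in KB before rotation (BusyBox default 200)
# -b = number of rotated files to keep
SYSLOGD_OPTS="-s 4096 -b 4"
```

Followed by `rc-service syslog restart`.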
I will probably need to log some query, issue and/or bug report regarding the lack of file writing with high-volume syslogd log messages on BusyBox/Alpine.
I'd prefer to scavenge an .xz package into the log archive when I do my monthly backups/rotations. Speaking of which, the perils of manual grep && sed -i on log files mean I lost all the SSFW log entries for the 30th of January (it was only 47 entries - and they were logs, not blocks).
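A copy-before-edit habit would have saved those 47 entries, and it folds nicely into the .xz archiving. A hypothetical sketch - the /var/log/ssfw paths are made up for illustration:

```sh
#!/bin/sh
set -e
STAMP=$(date +%Y-%m)
LIVE=/var/log/ssfw/ssfw.log        # assumed live log location
ARCHIVE=/var/log/ssfw/archive

mkdir -p "$ARCHIVE"
cp "$LIVE" "$ARCHIVE/ssfw-$STAMP.log"   # grep/sed the copy, never the live file
xz -9 "$ARCHIVE/ssfw-$STAMP.log"        # produces ssfw-YYYY-MM.log.xz
: > "$LIVE"                             # truncate live log (small race window)
```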
Some sort of automated log management is called for. I see how some of the older DShield Perl scripts handle it (by storing the last entry they processed). The annoying part is that it really needs to be checked every 10 seconds on a default Alpine setup, because of its 200 KB limit. Actually, maybe it's just a “wakeup, do some maths, adjust sleep timer, sleep” type problem, and (on Alpine) you can test and store the size/date of messages.0 and then copy it when it changes (sort of accurate).
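That loop might look something like this - a sketch, assuming BusyBox stat and the .0 rotation name; the spool path is made up:

```sh
#!/bin/sh
WATCH=/var/log/messages.0          # BusyBox's rotated name on Alpine
SPOOL=/var/log/ssfw/spool          # hypothetical holding area
LAST=""

mkdir -p "$SPOOL"
while :; do
    # mtime:size as a cheap change signature (BusyBox stat supports -c)
    SIG=$(stat -c '%Y:%s' "$WATCH" 2>/dev/null || echo none)
    if [ "$SIG" != "$LAST" ] && [ "$SIG" != "none" ]; then
        cp "$WATCH" "$SPOOL/messages.$(date +%s)"
        LAST="$SIG"
    fi
    sleep 10
done
```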
Anyway, for some random or unknown reason SSFragWhare: the super simple firewall with brutality bonus made it into the #devember2021 winners circle as a finalist, and it only took me about 8 hours to comprehend what I saw (guess that's what happens when you only get 2 hours sleep in 2+ days). I think it might have been my “beside the seaside” post that did it.
When I get some more Gerkas going, and after I have made these log changes and got the DShield submissions running, I would like to pair up either the DShield Honey Pot on an RPi, or PiHole on an RPi - actually maybe both: use the PiHole server to watch and react to the Honey Pot server. I need a PiHole here when I change my networking over. This will (at least privately) allow me to investigate the original target of this OP - Automated Network Threat Response.
Argh. Well … I guess that's it for this update (let's hope the Judges find a juicy 12v/5v device for me).
Cheers
Paul