Step-By-Step Guides
Article created 2003-05-09 by
Rainer Gerhards.
Intrusion Detection via the Windows Event Log
The Windows event log provides
many indications of potential intrusions. We will discuss what to look
for when checking the event log.
We have used Windows 2000
Server while creating this text. There may be differences for other
versions, so you might want to check if you are using a different
environment.
Detecting Failed Logons
Numerous failed logons are a
good indication that someone is trying to guess user passwords. This is
typically done using a so-called "dictionary attack", where a list
of words often used as passwords (the dictionary) is simply tried on a
given account. If the account password is carefully chosen, the
dictionary attack not only fails but also produces many failed logon events. Even if
the password is contained in the dictionary, chances are quite good that
it is not among the first 5 to 10 words the attacker tries. Windows allows you to lock
out an account that has too many invalid password attempts. If the
configured threshold is reached, the account will be locked out for a
given period of time and Windows will also log an event in the security
event log.
Configuring Windows
By default, Windows does not
check for this kind of attack. The check must be turned on by the
administrator. This is done in the "Account Lockout Policy" (part of
the "Account Policies"). On a single server, this is configured with
the "Local Security Settings" administrative tool. In a domain
environment, this is part of the group policy set that is to be applied.
Below is a snapshot of a typical configuration for it:
Please note that this policy
does not apply to the built-in administrator account. It will never be
locked out. This is another very good reason to rename it.
Activating this policy does not
automatically write events to the Windows Event Log when a user is
locked out by the system. To see these, you also need to turn on
auditing. This is done with the same tool, under "Local
Policies", "Audit Policy". There, you need to enable at least
"Success" audits for "Audit account management". This is
circled in the screenshot below.
Please note that I have also
enabled Logon-related events in the screenshots. We will later see why I
have done this.
With these settings, we will
receive a security event 644 as soon as an account is locked out.

Screenshot of the 644 Event
Important:
our testing has shown that the 644 security event does not
occur under all Windows versions. While testing with Windows 2000
without a service pack, these events did not occur. After applying
service pack 3, they appeared. So be sure to check that the events occur
in your environment.
If the 644 event is not
generated on your systems and you are not able to patch them to the
service pack level that makes it appear, you can alternatively look at
the 539 logon failure events. When someone tries to use a locked out
account, they look as follows:

Please note the reason text.
This reason is only given when an account is locked out. However, it
appears only after the account has been locked out. The logon
failure leading to the lockout still has the normal "invalid
password" text in it. As such, lockouts may go undetected or be
detected only after the incident when using this event as notification.
For this and other reasons, we highly recommend applying the most recent
service pack.
Creating Rules for MonitorWare Agent
Now that we have the proper
events present in the Windows Security Event Log, we can build
MonitorWare Agent rules to detect unusual patterns. Please keep in mind
that the rule set must be either bound to an event log monitor service
or be included from another rule set that is bound to one. Without that,
the rule set will not be executed. We will not explain that process
in this chapter and focus only on the filters and the rule set itself.
We will create a rule that fires if our 644 event is seen:

We use an email action to notify the admin once this happens:

Of course, we could also have
done other things. A good example might be sending a syslog message to a
syslog server specifically monitoring such events. The proper action
depends mainly on your intended result.
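For readers who prefer to see the logic spelled out, the rule and its email action can be approximated in a few lines of Python. This is only an illustration, not how MonitorWare Agent works internally: the event is assumed to be a plain dictionary with hypothetical keys (source, id, computer, message), and the addresses are placeholders. The message is only built here, not sent.

```python
from email.message import EmailMessage

LOCKOUT_EVENT_ID = 644  # account lockout event in the Security log


def build_lockout_alert(event, recipient="admin@example.com"):
    """Return an EmailMessage for a 644 event, or None for any other event.

    `event` is a hypothetical dict with the keys "source", "id",
    "computer" and "message"; it is not a MonitorWare Agent structure.
    """
    if event.get("source") != "Security" or event.get("id") != LOCKOUT_EVENT_ID:
        return None  # the filter does not match, so no action is taken

    msg = EmailMessage()
    msg["From"] = "monitor@example.com"  # placeholder sender
    msg["To"] = recipient
    msg["Subject"] = f"Account locked out on {event.get('computer', 'unknown host')}"
    msg.set_content(event.get("message", ""))
    return msg
```

A returned message could then be handed to smtplib or, following the alternative mentioned above, be replaced by a syslog message.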
This rule detects attacks that
will lead to an account becoming locked out. It will also fire if a user
actually mistypes his password often enough to become locked out. This
rule does not help against attacks where the user id changes together
with the password. Some attack tools work this way.
Fortunately, we can detect
those attacks, too. The key to it is counting failed logons. If the
number of failed logons reaches a threshold within a given amount of
time, we can suspect that something is wrong. Of course, the threshold
is different for different types of machines. On a web server, for example,
that is just serving web pages and where only administrators and web
authors log on, the number of failed logons should be really low. On a
busy file server, on the other hand, that threshold should probably be
much higher. As such, the actual numbers we use in our sample here
should be treated with care. They need to be replaced by some values
that match your typical environment and expectations. If in doubt,
consult your past event logs to find out what is normal.
We have two different event
ids to look at: the 529 event is generated when a logon to the
machine itself fails. This need not be an interactive logon. It can also be a
logon via the network, via the web server, the ftp server or any other
logon that is done either by the user himself or by a process on his behalf
on the local machine.
There is also the 681 event.
That event is logged whenever the security authority authenticates a
user. This event typically is logged on domain controllers when domain
users authenticate. A domain controller can log this event even when no
local logon happens afterwards. Also, as any domain controller can
authenticate a user, the 681 event can occur on every domain controller.
Thus, the number of those events on a single domain controller cannot
reliably be used to detect whether the threshold has been reached. On a stand-alone server, event
681 is logged together with 529.
For our needs, this means we
should monitor the 529 event if we are interested in the local failed
logon activity and the 681 if our scope is the network. In the latter
case, it might be helpful to ensure that security events from all domain
controllers are passed to a central MWAgent. Only then does
MWAgent have the full overview of network logon activity.
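The following Python sketch illustrates why the central view matters. It is not part of MonitorWare Agent; the function names and the (timestamp, controller, user) tuple layout are assumptions made for this example. Each domain controller on its own stays below a threshold of 10 failures per minute, while the merged stream exceeds it.

```python
import heapq


def merge_failed_logons(*per_controller_events):
    """Merge time-sorted (timestamp, controller, user) tuples into one stream."""
    return list(heapq.merge(*per_controller_events))


def max_failures_in_window(events, window=60.0):
    """Return the largest number of events falling within any `window` seconds."""
    best = 0
    start = 0
    for end, (ts, _controller, _user) in enumerate(events):
        # Advance the window start until it is at most `window` seconds old.
        while ts - events[start][0] > window:
            start += 1
        best = max(best, end - start + 1)
    return best


# Hypothetical 681 events seen on two domain controllers (seconds, name, user).
dc1 = [(t, "DC1", "alice") for t in (0, 10, 20, 30, 40, 50)]
dc2 = [(t, "DC2", "alice") for t in (5, 15, 25, 35, 45, 55)]

per_dc = max(max_failures_in_window(dc1), max_failures_in_window(dc2))
combined = max_failures_in_window(merge_failed_logons(dc1, dc2))
print(per_dc, combined)
```

With this sample data, neither controller alone sees more than 6 failures within a minute, but the combined stream contains 12.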
In our sample, we monitor a
stand-alone server. So our filter looks like this:

Please note the area circled in
red. This is the important part here. The "Fire Only if Event
Occurs" setting means that there must be at least 10 failed logons
within 60 seconds. If there are fewer, the filter will not match, even
though the filter condition would otherwise apply. Similarly, the
"Minimum Wait Time" specifies that at least 120 seconds must have
passed since the last time this filter condition fired. Again, if
the last match was more recent, the filter condition as a whole does not
evaluate to true. So with the above filter, we will receive a
notification at most once every 2 minutes (120 seconds).
Obviously, the two global
filter conditions need to be adjusted to your environment.
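To make the interplay of the two settings concrete, here is a minimal Python sketch of such a filter. It is not MonitorWare Agent's implementation: the class name ThresholdFilter and its parameters are our own assumptions, and the agent may differ in detail (for example, in whether the counter is reset after the filter fires).

```python
from collections import deque


class ThresholdFilter:
    """Fire when `min_count` events occur within `window` seconds,
    but not more often than once every `min_wait` seconds."""

    def __init__(self, min_count=10, window=60.0, min_wait=120.0):
        self.min_count = min_count
        self.window = window
        self.min_wait = min_wait
        self._times = deque()     # timestamps of recent failed logons
        self._last_fired = None   # timestamp of the last notification

    def matches(self, timestamp):
        """Record one failed logon at `timestamp` (in seconds) and
        return True if a notification should be sent now."""
        self._times.append(timestamp)
        # Forget events that are older than the counting window.
        while timestamp - self._times[0] > self.window:
            self._times.popleft()
        if len(self._times) < self.min_count:
            return False  # "Fire Only if Event Occurs" not satisfied
        if (self._last_fired is not None
                and timestamp - self._last_fired < self.min_wait):
            return False  # "Minimum Wait Time" not yet elapsed
        self._last_fired = timestamp
        return True


# One failed logon per second for 200 seconds.
f = ThresholdFilter()
fired_at = [t for t in range(200) if f.matches(t)]
print(fired_at)
```

In this run the filter fires on the tenth failure and then stays silent for 120 seconds, although failures keep arriving. Nine failures per minute would never trigger it.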
Detecting Suspicious Configuration Changes
There are many opinions on
what a suspicious configuration change might be. In this sample, we
assume we are dealing with an already configured web server. It again is
a stand-alone server. There is not much need for configuration changes
once a machine has reached this stage. Obviously, some of the
notifications we generate here are overdone on a typical domain
controller. Nevertheless, the example should provide an idea of what to
look for.
Events we are interested in are these:
- Account Management
  - 624 User Account Created
  - 626 User Account Enabled
  - 627 Password Change Attempted
  - 628 User Account Password Set
  - 629 User Account Disabled
  - 630 User Account Deleted
  - 631 Security Enabled Global Group Created
  - 632 Security Enabled Global Group Member Added
  - 633 Security Enabled Global Group Member Removed
  - 634 Security Enabled Global Group Deleted
  - 635 Security Enabled Local Group Created
  - 636 Security Enabled Local Group Member Added
  - 637 Security Enabled Local Group Member Removed
  - 638 Security Enabled Local Group Deleted
  - 639 Security Enabled Local Group Changed
  - 641 Security Enabled Global Group Changed
  - 642 User Account Changed
  - 643 Domain Policy Changed
- System Events
  - 512 Windows is starting up
  - 513 Windows is shutting down (you will probably not see this event before the system is restarted)
  - 516 Internal resources allocated for queuing of security event messages have been exhausted, leading to the loss of security event messages
  - 517 The security log was cleared
- Policy Change
  - 608 A user right was assigned
  - 609 A user right was removed
  - 610 A trust relationship with another domain was created
  - 611 A trust relationship with another domain was removed
  - 612 An audit policy was changed
  - 768 A collision was detected between a namespace element in one forest and a namespace element in another forest
Events in bold are uncommon on
nearly all types of machines. Depending on the role a server is playing,
events not shown in bold can occur as part of day-to-day operations. On
such servers, they should obviously not trigger alarms. Again, on a
fully configured web server in production, we would like to see none of
them.
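The event list can also be written down as a simple lookup table, which is handy when checking exported logs by hand or in a script. The Python sketch below only restates the IDs and categories listed above; the names MONITORED_EVENTS, classify and required_audit_categories are made up for this example and have nothing to do with MonitorWare Agent itself.

```python
# Event IDs from the list above, grouped by audit category.
MONITORED_EVENTS = {
    "Account Management": {
        624: "User Account Created",
        626: "User Account Enabled",
        627: "Password Change Attempted",
        628: "User Account Password Set",
        629: "User Account Disabled",
        630: "User Account Deleted",
        631: "Security Enabled Global Group Created",
        632: "Security Enabled Global Group Member Added",
        633: "Security Enabled Global Group Member Removed",
        634: "Security Enabled Global Group Deleted",
        635: "Security Enabled Local Group Created",
        636: "Security Enabled Local Group Member Added",
        637: "Security Enabled Local Group Member Removed",
        638: "Security Enabled Local Group Deleted",
        639: "Security Enabled Local Group Changed",
        641: "Security Enabled Global Group Changed",
        642: "User Account Changed",
        643: "Domain Policy Changed",
    },
    "System Events": {
        512: "Windows is starting up",
        513: "Windows is shutting down",
        516: "Security event messages were lost",
        517: "The security log was cleared",
    },
    "Policy Change": {
        608: "A user right was assigned",
        609: "A user right was removed",
        610: "A trust relationship with another domain was created",
        611: "A trust relationship with another domain was removed",
        612: "An audit policy was changed",
        768: "A namespace collision between two forests was detected",
    },
}


def classify(event_id):
    """Return (category, description) for a monitored event ID, else None."""
    for category, events in MONITORED_EVENTS.items():
        if event_id in events:
            return category, events[event_id]
    return None


def required_audit_categories(event_ids):
    """Return the audit categories that must be enabled to see these events."""
    found = (classify(event_id) for event_id in event_ids)
    return {hit[0] for hit in found if hit is not None}


print(classify(517))
print(sorted(required_audit_categories([624, 612, 529])))
```

Event 529 is not part of this list, so it is ignored in the second call. The category names correspond to the audit policies that have to be enabled for the events to be logged at all.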
We create two rules in
MonitorWare Agent, one for the highly suspicious events and one for the
others. Let's start with the highly suspicious ones:

And this one holds the filter
conditions for the other suspicious events:

In order for these rule sets to
work, we also need to tune our auditing settings. We now need to audit
"System Events" and "Policy Change", too. This leads us to
these policy settings:
