May 2008 Archives

Anton Chuvakin

AddThis Social Bookmark Button

Following the new “tradition” of posting a security tip of the week (mentioned here, here ; SANS jumped in as well), I decided to follow along and join the initiative. One of the bloggers called it “pay it forward” to the community.

So, Anton Logging Tip of the Day #15: Fear and Loathing in Event 567

This tip digs into a seemingly simple, but really VERY esoteric subject: monitoring file access and modification via a Windows event log. Now, some people - who never studied this subject - tend to have a very simplistic view of this: just enable Object Access auditing, then right-click on a file or directory, click Security->Advanced->Auditing and then pick what types of events will be logged and by what accessing entities (i.e. users or computers). OK, so this will produce some logs, that is for sure. But are they useful?

First, why are we doing this? We typically need to know the following when we audit file access in Windows (or any other OS for that matter) for security (monitoring and investigation) or compliance:

  • Time/date
  • Computer where it happened
  • User who touched the file
  • Application he used to access the file
  • File name + location (directory, share, etc)
  • Type of access (read, write, create, delete, etc)
  • Status (i.e. success or failure)

Can we get this from the above logs? No.

What? No!?! Really?

Yes, really. We can get some of the above, some of the time, not all of the above, all of the time. Here is an example, we are looking at event ID 560 (picture) and then at an extract from its description field.

Event:

event_log-560_1

Description (selected field):

Object Server: Security

Object Type: File

Object Name: C:\0\TestBed\simple_text_file.txt

Image File Name: C:\WINDOWS\system32\notepad.exe

Primary User Name: Anton

Primary Domain: XXXXXX

Accesses: READ_CONTROL

SYNCHRONIZE

ReadData (or ListDirectory)

WriteData (or AddFile)

AppendData (or AddSubdirectory or CreatePipeInstance)

ReadEA

WriteEA

ReadAttributes

WriteAttributes

WTH is that? Well, we know that the user ‘Anton’ has successfully read? wrote? changed attributes? did something? with a file named “C:\0\TestBed\simple_text_file.txt” using a program named “C:\WINDOWS\system32\notepad.exe.” That’s the best we can get, in this case! We may try to look at event IDs 562 and 567, but this missing information (i.e. the exact action performed) will not be added.

BTW, there will be a few more dozen (sometime hundreds!) of the 560s, 562s and 567s produced - all from just opening the text file in a notepad. The above event is notable for having BOTH “notepad” and “simple_text_file.txt” in the same event; others will have either of the two.

Anything else gets in the way? Yes, lots! MS Office will write to all files, even just opened for reading (with no user modifications to the content whatsoever), which will screw up your log monitoring efforts. If the file is on a share, more information will be missing (e.g. username might be).

So, how to use Windows event logs for file access tracking?

  1. Enable logging (as described above)
  2. Pick events 560 (most useful) and 562, 567 (useful too)
  3. Look for fun filenames that might be touched by the users (have a list of files and users handy)
  4. Figure out what programs were used to access them (this is called “Image File Name” in “WinLogSpeak”)
  5. Ponder the ‘Accesses’ section of each event until your brain turns blue :-) or until you decide whether such access is authorized or not…

Overall, this is still very useful for file access monitoring, but the process is paaaaaainful.

BTW, I am tagging all the tips on my del.icio.us feed. Here is the link: All Security Tips of the Day.

Technorati tags: , , ,


Anton Chuvakin

AddThis Social Bookmark Button

UPDATE: analysis of this poll is posted here.

So, my next poll is up - and it is fun: Which of the types of information are most useful when trying to make sense of a log entry?

Vote here!

Past polls:

  • Poll #7 “What tools do you use for Windows Event Log collection?” (analysis)
  • Poll #6 “Which logs do you LOOK at?” (analysis)
  • Poll #5 “What are your top challenges with logs?” (analysis)
  • Poll #4 “Who looks at logs in your organization?” (analysis)
  • Poll #3 “What do you do with logs?” (analysis)
  • Poll #2 “Why collect logs?” (analysis)
  • Poll #1 “Which logs do you collect?” (analysis)

  • Chris Josephes

    AddThis Social Bookmark Button

    This doesn’t look good, right?

    home2-vol.gif

    Most open source monitoring tools do filesystem health checking by comparing the current percentage of used space against a set value. If it’s is 90% full, send out a warning page; if it’s 89%, send the all clear.

    Notice that I said filesystem, and not actual disk. A single disk that’s 90% full can be a bad thing, because there are fewer free blocks available for writing, which leads to longer write times and file fragmentation. Not all filesystems are restricted to a single disk: there may be a back-end RAID solution, or the filesystem may be a shared filesystem served over NFS.

    Unfortunately, you could be the receiver of flapping alert pages where a filesystem sits between 90% and 89%, but it still performs fine. Unlike a broken Ethernet cable, the resolution for a filesystem threshold may not be so easy. Sometimes there are files that can’t be deleted, or there may not be any additional storage to allocate. You may have a filesystem that sits at 91% full for months simply because a new disk shelf won’t arrive until the next budget cycle.

    Everything comes down to disk blocks, even SAN and NAS solutions. That brings back the concern regarding fragmentation and performance. But what if your filesystem is a read-only OS image? Or what if it turns out 10% equates to 500 gigabytes on a huge disk appliance? If the filesystem is never being written to, or if the amount of writes equates to 0.001% of the entire filesystem, then where’s the fire?

    What about the inverse? What if your filesystem never reaches 90% full? Can there still be problems?

    In the above graph, nobody would have been paged by Nagios or other tools, because the filesystem never reached 90%. For the past few months it averaged 40% full, shot up to 75%, and then went back down. A newly released application was behaving incorrectly, and the issue was caught by the programmer. The next morning he stealthily re-released the application and corrected the issue. Nobody in systems administration noticed until the graph was checked in relation to another issue. If the programming error was never discovered, the filesystem would have filled up, probably at the most inconvenient time possible for a systems administrator.

    I would like to recommend to people developing filesystem or disk monitoring solutions change their way of thinking about filesystem health. Hard limits on allocated space may still be required, but those warnings should be optional. Measuring fullness makes assumptions about block structure that may not be correct.

    At the same time, the monitoring system should compare the standard deviation for the filesystem percentage over the past 24 hours, and compare it to the standard deviation for the past hour. Actually, you’d probably want to compare the first 23 hours out of 24, grab that standard deviation, and compare it to the deviation of the last hour.

    If those two deviations aren’t close, then there could be radical changes made to your filesystem that need to be addressed. Maybe files are being added or deleted, either way, it may warrant an investigation. For large filesystems in the terrabyte/petabyte range, using the percentage value may not be granular enough, so you will need to work with the actual value of free kilobytes or blocks.

    I take it back. This isn’t a recommendation to monitoring developers, this is a challenge. The first major open source monitoring guy that puts this solution together will have my undivided attention.