Friday 4 July 2014

Monitoring Systems Using Event Triggers

Now that you know how to view, filter, and create events, let’s look at a technique using event triggers that you can use to automate the event monitoring process. With event triggers, you can configure system tasks that monitor the event logs and then take a specific action if an event occurs. For example, you can create a trigger that monitors the event logs for low disk space events and if such events occur, you can run a script that removes any temporary or unnecessary files to resolve the low disk space condition. Thus, not only can event triggers help you automate the monitoring process, the actions triggers take can also help you resolve issues as they arise to maintain system performance, ensure system integrity, and more.
Creating event triggers isn’t something you should do casually, without careful forethought. You need to have a clear plan of action—a set of goals that you hope to achieve by using event triggers. Let’s take a look at the reasons you might want to use event triggers and then look at the tools you can use to manage them.

Why Use Event Triggers?

Maintaining application and system performance is a key reason for using event triggers. For example, if an application running on a server has known issues that you usually have to resolve manually, you may be able to configure event triggers that monitor the event logs for related errors and then run scripts that take the appropriate actions to resolve the problem. Here, you would want to track down the known issues for the application by searching the event logs, asking other administrators about issues, or searching for knowledge-base articles that describe the issues. Afterward, match issues to specific events or types of events for which you can configure event triggers to monitor, and then write a script that notifies administrators of the issue or takes appropriate actions to resolve the issue. This script is then used as the task that the event trigger runs.
Another common reason for using event triggers is to help you identify application and service outages quickly, and to possibly restore normal operations. When an application or service stops, users can no longer use the resource and this can cost the organization dearly in time, money, and wasted resources. Here, you would want to search for documentation on the types of errors that can occur if the application or service isn’t responding normally. Then, searching the event logs to see if you find similar or matching events in the logs, you would note sources, event IDs, and descriptions used so that you can create event triggers to watch for the related events. Finally, you could write a script that restarts the application or takes other appropriate actions to resolve the outage.
You may also want to use event triggers to help you maintain system security and integrity. When a system is under attack, events may be written to the log files that indicate the application, component or service that is under attack. With a brute force attack, a hacker may be trying various user name and password combinations in an attempt to gain access. If you are monitoring the system under attack, you would see failed logon attempts in the security logs as the hacker attempts to gain access. A hacker may also try to bring down the system, application, or service using a denial-of-service attack. Typically hackers deny service by sending continuous streams of malformed service requests. These attempts should show up in the related application, system, or service-specific logs as errors. To combat such attacks, you could configure event triggers that watch for related events, such as account lockouts due to a series of failed logon attempts.

Getting Ready to Use Event Triggers

Before you start creating event triggers, you should consider what you hope to achieve through automated monitoring, as well as any impact the monitoring might have on the affected systems and the network as a whole. You should:
  1. Identify the events you want to monitor and define the reasons for monitoring each event. Use the event logs on multiple systems and documentation of known issues and errors, such as knowledge-base articles, to help you pinpoint places to start.
  2. Specify the actions you want to take when an event occurs. Initially, write this as a list. Be sure to consider the impact any corrective actions might have on the system or the network as a whole.
  3. Write scripts or applications to handle the necessary corrective actions or user notifications. Don’t implement them as triggers yet. You should test the scripts first on an isolated network or development system to uncover any flaws in the planning.
  4. Define the event triggers and the tasks to execute, and then implement the triggers. Make sure you monitor the affected systems closely for the next several days or weeks to ensure there are no adverse effects.
  5. Maintain and remove triggers as necessary to ensure continuing operations.
Steps 1, 2, and 3 can be accomplished using the earlier discussions in this and other chapters of this book. Steps 4 and 5, however, involve the processes of defining, maintaining, and removing event triggers. These processes are handled with the following subcommands of the Eventtriggers utility:
  • Eventtriggers /create  Creates a new event trigger and sets the action to take
  • Eventtriggers /query  Displays the event triggers currently configured on a specified system
  • Eventtriggers /delete  Removes an event trigger when it is no longer needed
    Note 
    Unlike most other commands with subcommands, Eventtriggers subcommands use a forward slash (/).
The sections that follow discuss each of these subcommands and their usage.

Creating Event Triggers

Event triggers can be configured to run executable programs with the .exe extension and scripts with the .bat or .cmd extension when an event occurs. You create event triggers using
eventtriggers /create /tr Name /l LogName [Constraints] /d Description /tk Task
where
  • Name  Sets the name of the trigger as a string of characters enclosed in quotation marks, such as “Connection Failure”.
  • LogName  Sets the name of the log to monitor. Use quotation marks if the log name contains spaces, as in “DNS Server.” The default value is asterisk (*), which specifies that all logs should be monitored.
  • Constraints  Sets the constraints that determine whether an event matches the trigger. Constraints limit the trigger’s scope according to event ID, event source, or event type, using the /Eid EventID, /So EventSource, or /T EventType parameters respectively.
    Tip 
    You can use multiple constraints as well. If you do, the event must match each constraint in order to be triggered. Thus, additional constraints narrow the scope of the trigger.
  • Task  Sets the program or script to execute. Be sure to type the full path to the program or scripts you want to run.
    Note 
    Eventtriggers does not verify file paths, so if you enter an invalid file path, you will not see a warning. To pass arguments to an executable or script, enclose the file path and the command arguments in a set of double quotation marks, such as "c:\scripts\trackerror.bat system y".
  • Description  Sets the description for the trigger and can be any string of characters. Be sure to enclose the description in double quotation marks.
Don’t be intimidated by the parade of parameters here. It’s a lot easier than it looks once you get started. Consider the following examples:
Create an event trigger that monitors all the event logs for events with the ID 9220 and then runs Record-prob.bat:
Eventtriggers /create /tr "Monitor 9220 Errors" /eid 9220 /tk \\Mailer1\scripts\record-prob.bat
Create an event trigger that monitors the DNS Server log for events with the source as DNS and the event ID 4004 and then runs Dns-adfix.bat:
Eventtriggers /create /tr "DNS AD Fix" /l "DNS Server" /so "DNS" /eid 4004 /tk c:\admin\scripts\dns-adfix.bat
Create an event trigger that monitors the security log for failure audit events with the source as Security and then runs Auditing.bat:
Eventtriggers /create /tr "Failure Audit Checks" /l "Security" /so "Security" /t Failureaudit /tk c:\admin\scripts\auditing.bat
Event triggers are created and their associated tasks are run by default on the local computer with the permissions of the user who is currently logged on. Because this command is used primarily for administration, you will be prompted for a password before the event trigger is added. If the triggered task needs to run with different or specific user permissions, provide the Run As permissions using /u [Domain\]User [/p Password], where Domain is the optional domain name in which the user account is located, User is the name of the user account whose permissions you want to use, and Password is the optional password for the user account, such as
Eventtriggers /create /u adatum\wrstanek /p R4Runner! /tr "Exchange Monitor" /l "Application" /so "MSExchangeMTA" /t warning /tk c:\admin\scripts\exe-errlog.bat
As necessary, you can also specify the remote computer on which you want to create the event trigger using /S Computer, where Computer is the remote computer name or IP address, such as
Eventtriggers /create /s 192.168.1.150 /tr "Exchange Monitor" /l "Application" /so "MSExchangeMTA" /t warning /tk c:\admin\scripts\exe-errlog.bat
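If you create the same trigger on many servers, it can help to assemble the command from a script rather than typing it each time. The following Python sketch shows one way this might be done. The function name build_create_args and its parameter names are illustrative assumptions, not part of Eventtriggers; only the switches described in this section (/s, /tr, /l, /so, /eid, /t, /d, /tk) are used. The helper only builds the argument list and does not run anything.

```python
import subprocess
from typing import List, Optional


def build_create_args(name: str, task: str,
                      log: Optional[str] = None,
                      source: Optional[str] = None,
                      event_id: Optional[int] = None,
                      event_type: Optional[str] = None,
                      description: Optional[str] = None,
                      computer: Optional[str] = None) -> List[str]:
    """Assemble an 'eventtriggers /create' argument list (hypothetical helper)."""
    if not name or not task:
        raise ValueError("a trigger needs both a name (/tr) and a task (/tk)")
    args = ["eventtriggers", "/create"]
    if computer:
        args += ["/s", computer]        # remote computer name or IP address
    args += ["/tr", name]
    if log:
        args += ["/l", log]             # omitted means all logs are monitored
    if source:
        args += ["/so", source]
    if event_id is not None:
        args += ["/eid", str(event_id)]
    if event_type:
        args += ["/t", event_type]
    if description:
        args += ["/d", description]
    args += ["/tk", task]
    return args


dns_fix = build_create_args("DNS AD Fix", r"c:\admin\scripts\dns-adfix.bat",
                            log="DNS Server", source="DNS", event_id=4004)
# list2cmdline adds double quotation marks around arguments that contain spaces.
print(subprocess.list2cmdline(dns_fix))
```

Keeping the arguments in a list until the last moment means each value is quoted once, in one place, which avoids the most common mistakes with names such as "DNS Server".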

Displaying Currently Configured Event Triggers

You can obtain information about currently configured event triggers using Eventtriggers /query. Simply type the command at the prompt, such as
eventtriggers /query
The basic output of the query shows you the event trigger ID, event trigger name, and the task that is run, as shown in the following example:
Trigger ID Event Trigger Name   Task
========== ==================== ==================================
4          Failure Audit Checks c:\admin\scripts\auditing.bat
2          Monitor 9220 Errors  \\Mailer1\scripts\record-prob.bat
3          DNS AD Fix           c:\admin\scripts\dns-adfix.bat
1          Disk Cleanup         d:\windows\system32\cleanmgr.exe
Note 
You will use the trigger ID to delete the trigger. The output format, by default, is table (/Fo Table). You can use /Fo Csv to format the output as comma-separated values or /Fo List to format the output as a list. You can also use the /Nh parameter to turn off the display of headers, if either the Table or the Csv format option is specified.
To get more detailed information, use the /V (verbose) flag. With verbose output the additional columns of information are
  • Hostname  The computer name or IP address of the computer on which the event trigger is configured.
  • Query  The complete command text used to create the event trigger.
  • Description  The description of the trigger, if provided when the trigger was created.
  • Run As (User name)  The Run As user used to create the task and run the associated task for the event trigger.
As necessary, you can specify the remote computer whose triggers you want to query using /s Computer, where Computer is the remote computer name or IP address, such as
eventtriggers /query /s Mailer1
You can also specify the Run As user permissions using /U [Domain\]User [/P Password], where Domain is the optional domain name in which the user account is located, User is the name of the user account whose permissions you want to use, and Password is the optional password for the user account, such as
eventtriggers /query /s Mailer1 /u adatum\administrator /p dataset5
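When you want to process the trigger list in a script, the comma-separated format produced by /Fo Csv is easier to read reliably than the table. The following Python sketch parses such output into records. It assumes that the CSV header row uses the same column titles as the table output ("Trigger ID", "Event Trigger Name", and "Task"); check this against the output on your own systems. The function name parse_trigger_csv and the sample text are illustrative.

```python
import csv
import io
from typing import Dict, List

# Sample text in the shape assumed for: eventtriggers /query /fo csv
SAMPLE_CSV = r'''
"Trigger ID","Event Trigger Name","Task"
"4","Failure Audit Checks","c:\admin\scripts\auditing.bat"
"2","Monitor 9220 Errors","\\Mailer1\scripts\record-prob.bat"
"3","DNS AD Fix","c:\admin\scripts\dns-adfix.bat"
'''


def parse_trigger_csv(text: str) -> List[Dict[str, object]]:
    """Turn CSV-formatted query output into a list of trigger records."""
    reader = csv.DictReader(io.StringIO(text.strip()))
    return [
        {
            "id": int(row["Trigger ID"]),
            "name": row["Event Trigger Name"],
            "task": row["Task"],
        }
        for row in reader
    ]


triggers = parse_trigger_csv(SAMPLE_CSV)
# Look up the trigger ID by name, for example before deleting a trigger.
ids_by_name = {t["name"]: t["id"] for t in triggers}
print(ids_by_name["DNS AD Fix"])
```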

Deleting Event Triggers

When event triggers are no longer needed, you can delete them using the eventtriggers /delete command. The syntax is
eventtriggers /delete /tid ID
where ID is the trigger ID you want to delete. You can also use asterisk (*) as the trigger ID to delete all event triggers. Consider the following examples:
Delete event trigger 5:
eventtriggers /delete /tid 5
Delete all event triggers:
eventtriggers /delete /tid *
As necessary, you can specify the remote computer whose triggers you want to delete using /S Computer, where Computer is the remote computer name or IP address, and the Run As permission using /U [Domain\]User [/P Password], where Domain is the optional domain name in which the user account is located, User is the name of the user account whose permissions you want to use, and Password is the optional password for the user account, such as
eventtriggers /delete /tid 3 /s Mailer1 /u adatum\wrstanek /p outreef7
Caution 
You can’t restore event triggers once you’ve deleted them. If you think you may use an event trigger again, write the output produced from Eventtriggers /query /v to a file and then save the file for future reference. With the /V parameter, the file will contain the complete command text used to create the trigger.
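The backup step described in the caution can be scripted so that it always runs before a deletion. The following Python sketch is one possible approach. The function names and the file name are assumptions chosen for illustration, and run_query_verbose works only on a Windows system where the Eventtriggers utility is available. The query function is passed in as a parameter so that the file-writing logic can be tried without running the utility.

```python
import subprocess
from pathlib import Path
from typing import Callable


def run_query_verbose() -> str:
    """Return the text printed by 'eventtriggers /query /v' (Windows only)."""
    result = subprocess.run(["eventtriggers", "/query", "/v"],
                            capture_output=True, text=True, check=True)
    return result.stdout


def save_trigger_backup(path: str,
                        query: Callable[[], str] = run_query_verbose) -> int:
    """Write the verbose trigger listing to a file and return the line count."""
    text = query()
    Path(path).write_text(text, encoding="utf-8")
    return len(text.splitlines())
```

On the server itself, a call such as save_trigger_backup("triggers-backup.txt") before running eventtriggers /delete would keep the complete command text for each trigger on file.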

Thursday 3 July 2014

Monitoring Processes and Performance

An important part of every administrator’s job is to monitor network systems and ensure that everything is running smoothly—or as smoothly as can be expected, anyway. As you learned in the previous chapter, watching the event logs closely can help you detect and track problems with applications, security, and essential services. Often when you detect or suspect a problem, you’ll need to dig deeper to search out the cause of the problem and correct it. Hopefully, by pinpointing the cause of a problem, you can prevent it from happening again.

Managing Applications, Processes, and Performance

Whenever the operating system or a user starts a service, runs an application, or executes a command, Microsoft Windows starts one or more processes to handle the related program. Several command-line utilities are available to help you manage and monitor programs. These utilities include
  • Process Resource Monitor (Pmon)  Displays performance statistics, including memory and CPU usage, as well as a list of all processes running on the local system. Used to get a detailed snapshot of resource usage and running processes. Pmon is included in the Windows Resource Kit.
  • Task List (Tasklist)  Lists all running processes by name and process ID. Includes information on the user session and memory usage.
  • Task Kill (Taskkill)  Stops running processes by name or process ID. Using filters, you can also halt processes by process status, session number, CPU time, memory usage, user name, and more.
In the sections that follow, you’ll find detailed discussions on how these command-line tools are used. First, however, let’s look at the ways processes are run and the common problems you may encounter when working with them.

Understanding System and User Processes

Generally, processes that the operating system starts are referred to as system processes; processes that users start are referred to as user processes. Most user processes are run in interactive mode. That is, a user starts the processes interactively with the keyboard or mouse. If the application or program is active and selected, the related interactive process has control over the keyboard and mouse until you switch control by terminating the program or selecting a different one. When a process has control, it’s said to be running “in the foreground.”
Processes can also run in the background, independently of user logon sessions. Background processes do not have control over the keyboard, mouse, or other input devices and are usually run by the operating system. Users can also run processes in the background by using Task Scheduler, and these processes can operate regardless of whether the user is logged on. For example, if Task Scheduler starts a scheduled task while the user is logged on, the process can continue even when the user logs off.
Windows tracks every process running on a system by image name, process ID, priority, and other parameters that record resource usage. The image name is the name of the executable that started the process, such as Msdtc.exe or Svchost.exe. The process ID is a numeric identifier for the process, such as 2588. The process priority is an indicator of how much of the system’s resources the process should get relative to other running processes. With priority processing, a process with a higher priority gets preference over processes with lower priority and may not have to wait to get processing time, access memory, or work with the file system. A process with lower priority, on the other hand, usually must wait for a higher-priority process to complete its current task before gaining access to the CPU, memory, or the file system.
In a perfect world, processes would run perfectly and would never have problems. The reality is, however, that problems occur and they often appear when you’d least want them to. Common problems include the following:
  • Processes become nonresponsive, such as when an application stops processing requests. When this happens, users may tell you that they can’t access a particular application, that their requests aren’t being handled, or that they were kicked out of the application.
  • Processes fail to release the CPU, such as when you have a runaway process that is using up CPU time. When this happens, the system may appear to be slow or nonresponsive because the runaway process is hogging processor time and is not allowing other processes to complete their tasks.
  • Processes use more memory than they should, such as when an application has a memory leak. When this happens, processes aren’t properly releasing the memory they’re using. As a result, the system’s available memory may gradually decrease over time, and as the available memory gets low, the system may be slow to respond to requests or it may become nonresponsive. Memory leaks can also make other programs running on the same system behave erratically.
In most cases, when you detect these or other problems with system processes, you’ll want to stop the process and start it again. You would also want to examine the event logs to see if the cause of the problem can be determined. With memory leaks, you would want to report the memory leak to the developers and see if an update that resolves the problem is available.
Tip 
A periodic restart of an application with a known memory leak is often useful. Restarting the application should allow the operating system to recover any lost memory.

Examining Running Processes

When you want to examine processes that are running on a local or remote system, you can use the Tasklist command-line utility. With Tasklist, you can:
  • Obtain the process ID, status, and other important information about processes running on a system.
  • View the relationship between running processes and services configured on a system.
  • View lists of DLLs used by processes running on a system.
  • Use filters to include or exclude processes from Tasklist queries.
Each of these tasks is discussed in the sections that follow.

Obtaining Detailed Information on Processes

On a local system, you can view a list of running tasks simply by typing tasklist at the command prompt. As with many other command-line utilities, Tasklist runs by default with the permissions of the currently logged-on user. You can also specify the remote computer whose tasks you want to query and the Run As permissions. To do this, use the expanded syntax, which includes the following parameters:
/s Computer /u [Domain\]User [/p Password]
where Computer is the remote computer name or IP address, Domain is the optional domain name in which the user account is located, User is the name of the user account whose permissions you want to use, and Password is the optional password for the user account. If you don’t specify the domain, the current domain is assumed. If you don’t provide the account password, you are prompted for the password.
To see how the computer and user information can be added to the syntax, consider the following examples:
Query Mailer1 for running tasks:
tasklist /s mailer1
Query 192.168.1.5 for running tasks using the account adatum\wrstanek:
tasklist /s 192.168.1.5 /u adatum\wrstanek
Tip 
The basic output of these commands is in table format. You can also format the output as a list or lines of comma-separated values using /Fo List or /Fo Csv, respectively. Remember you can redirect the output to a file using output redirection (> or >>), such as tasklist /s mailer1 >> current-tasks.log.
Regardless of whether you are working with a local or remote computer, the output should be similar to the following:
Image Name               PID Session Name      Session#    Mem Usage
===================== ====== ============== =========== ============
System Idle Process        0 Console                  0         16 K
System                     4 Console                  0        216 K
smss.exe                 420 Console                  0        480 K
csrss.exe                472 Console                  0      4,420 K
sqlgea.exe               496 Console                  0      3,352 K
services.exe             540 Console                  0      3,288 K
sqlmon.exe               552 Console                  0     32,508 K
sdman.exe                728 Console                  0      2,856 K
sdman.exe                788 Console                  0      3,840 K
sdman.exe                988 Console                  0      4,016 K
sdman.exe               1036 Console                  0      2,032 K
sdman.exe               1048 Console                  0     15,624 K
spoolsv.exe             1348 Console                  0      4,728 K
msdtc.exe               1380 Console                  0      3,808 K
The Tasklist fields provide the following information:
  • Image Name  The name of the process or executable running the process.
    Note 
    The first process is named System Idle Process. This special system process is used to track the amount of system resources that aren’t being used. For more information on this process, see the “Monitoring Processes and System Resource Usage” section of this chapter.
  • PID  The process identification number.
  • Session Name  The name of the session from which the process is being run. An entry of console means the process was started locally.
  • Session #  A numerical identifier for the session.
  • Memory Usage  The total amount of memory being used by the process at the specific moment that Tasklist was run.
If you want more detailed information you can specify that verbose mode should be used by including the /V parameter. Verbose mode adds the following columns of data:
  • Status  Current status of the process as Running, Not Responding, or Unknown. A process can be in an Unknown state and still be running and responding normally. A process that is Not Responding, however, more than likely must be stopped or restarted.
  • User Name  User account under which the process is running, listed in domain\user format. For processes started by Windows, you will see the name of the system account used, such as SYSTEM, LOCAL SERVICE, or NETWORK SERVICE, with the domain listed as NT AUTHORITY.
  • CPU Time  The total amount of CPU cycle time used by the process since its start.
  • Window Title  Windows display name of the process if available. Otherwise, the display name is listed as N/A for not available. For example, the Helpctr.exe process is listed with the window title Help And Support Center.
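If you save Tasklist output with /Fo Csv, a short script can turn it into records that are easy to sort and compare. The following Python sketch parses the basic (non-verbose) CSV output and lists the processes using the most memory. It assumes the CSV header row matches the table column titles shown earlier and that memory values look like "4,420 K" with a comma as the thousands separator, which can differ with regional settings. The function name parse_tasklist_csv and the sample text are illustrative.

```python
import csv
import io
from typing import Dict, List

# Sample text in the shape assumed for: tasklist /fo csv
SAMPLE_CSV = '''
"Image Name","PID","Session Name","Session#","Mem Usage"
"System Idle Process","0","Console","0","16 K"
"csrss.exe","472","Console","0","4,420 K"
"sqlmon.exe","552","Console","0","32,508 K"
"sdman.exe","1048","Console","0","15,624 K"
'''


def parse_tasklist_csv(text: str) -> List[Dict[str, object]]:
    """Turn CSV-formatted Tasklist output into a list of process records."""
    records = []
    for row in csv.DictReader(io.StringIO(text.strip())):
        # "32,508 K" -> 32508 (kilobytes)
        mem_kb = int(row["Mem Usage"].replace(",", "").replace("K", "").strip())
        records.append({
            "image": row["Image Name"],
            "pid": int(row["PID"]),
            "session": row["Session Name"],
            "mem_kb": mem_kb,
        })
    return records


processes = parse_tasklist_csv(SAMPLE_CSV)
# Show the two processes using the most memory.
top = sorted(processes, key=lambda p: p["mem_kb"], reverse=True)[:2]
for proc in top:
    print(proc["image"], proc["pid"], proc["mem_kb"])
```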

Viewing the Relationship Between Running Processes and Services

When you use Tasklist with the /Svc parameter, you can examine the relationship between running processes and services configured on the system. In the output, you’ll see the process image name, process ID, and a list of all services that are using the process, similar to that shown in the following example:
Image Name                  PID Services
========================= ===== ===========================================
System Idle Process           0 N/A
System                        4 N/A
smss.exe                    408 N/A
csrss.exe                   456 N/A
winlogon.exe                484 N/A
services.exe                528 Eventlog, PlugPlay
lsass.exe                   540 HTTPFilter, kdc, Netlogon, NtLmSsp,
                                PolicyAgent, ProtectedStorage, SamSs
svchost.exe                 800 RpcSs
svchost.exe                 956 Dnscache
svchost.exe                 984 LmHosts
svchost.exe                 996 AudioSrv, Browser, CryptSvc, dmserver,
                                EventSystem, helpsvc, lanmanserver,
                                lanmanworkstation, Netman, Nla, Schedule,
                                seclogon, SENS, ShellHWDetection, W32Time,
                                winmgmt, wuauserv, WZCSVC
spoolsv.exe                1300 Spooler
msdtc.exe                  1332 MSDTC
dfssvc.exe                 1400 Dfs
dns.exe                    1436 DNS
svchost.exe                1492 ERSvc
inetinfo.exe               1552 IISADMIN, IMAP4Svc, POP3Svc, RESvc, SMTPSVC
ismserv.exe                1568 IsmServ
ntfrs.exe                  1584 NtFrs
svchost.exe                1688 RemoteRegistry
mad.exe                    1724 MSExchangeSA
mssearch.exe               1784 MSSEARCH
exmgmt.exe                 1824 MSExchangeMGMT
svchost.exe                2000 W3SVC
store.exe                  2108 MSExchangeIS
By default, the output is formatted as a table, and you cannot use the list or CSV format. Beyond formatting, the important thing to note here is that services are listed by their abbreviated name, which is the naming style used by Sc, the service controller command-line utility, to manage services.
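Because the service list is presented as a table with wrapped lines, a script that reads it has to rejoin the continuation lines. The following Python sketch shows one way to do this: it takes the column positions from the row of equal signs and treats any line with an empty first column as a continuation of the previous row. It assumes the saved output keeps the fixed-width layout shown above. The function names and the small sample, which is generated in code so that the alignment is exact, are illustrative.

```python
from typing import List

# A small sample in the fixed-width layout of the /Svc table output.
SAMPLE = "\n".join([
    f"{'Image Name':<25} {'PID':>5} Services",
    f"{'=' * 25} {'=' * 5} {'=' * 43}",
    f"{'services.exe':<25} {528:>5} Eventlog, PlugPlay",
    f"{'lsass.exe':<25} {540:>5} HTTPFilter, kdc, Netlogon, NtLmSsp,",
    f"{'':<25} {'':>5} PolicyAgent, ProtectedStorage, SamSs",
    f"{'svchost.exe':<25} {2000:>5} W3SVC",
])


def parse_fixed_width(text: str) -> List[List[str]]:
    """Split table output into rows, merging wrapped continuation lines."""
    lines = [ln.rstrip() for ln in text.splitlines() if ln.strip()]
    sep_at = next(i for i, ln in enumerate(lines)
                  if set(ln.replace(" ", "")) == {"="})
    # Each run of '=' marks where a column starts and ends.
    spans, start = [], None
    for pos, ch in enumerate(lines[sep_at] + " "):
        if ch == "=" and start is None:
            start = pos
        elif ch != "=" and start is not None:
            spans.append((start, pos))
            start = None
    rows: List[List[str]] = []
    for ln in lines[sep_at + 1:]:
        cells = [ln[a:b].strip() for a, b in spans[:-1]]
        cells.append(ln[spans[-1][0]:].strip())  # last column runs to line end
        if not cells[0] and rows:
            rows[-1][-1] += " " + cells[-1]      # continuation of previous row
        else:
            rows.append(cells)
    return rows


def pids_for_service(rows: List[List[str]], service: str) -> List[int]:
    """Return the IDs of processes that host the named service."""
    wanted = service.lower()
    return [int(pid) for _image, pid, services in rows
            if wanted in [s.strip().lower() for s in services.split(",")]]


rows = parse_fixed_width(SAMPLE)
print(pids_for_service(rows, "W3SVC"))
```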
You can use the correlation between processes and services to help you manage systems. For example, if you think you are having problems with the World Wide Web Publishing Service (W3svc), one step in your troubleshooting process is to begin monitoring the service’s related process or processes. You would want to examine
  • Process status
  • Memory usage
  • CPU time
By tracking these statistics over time, you can watch for changes that could indicate the process has stopped responding, is a runaway process hogging CPU time, or that there is a memory leak.
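Tracking statistics over time can be reduced to a few simple checks on a series of samples. The following Python sketch illustrates the idea with memory usage and CPU time values collected at regular intervals. The function names and the thresholds (five samples, 1,024 KB of growth, 90 percent of the interval) are assumptions chosen for illustration, not values recommended by Tasklist or Windows; tune them for the process you are watching.

```python
from typing import Sequence


def cpu_time_to_seconds(cpu_time: str) -> int:
    """Convert a CPU Time value such as '0:38:42' to seconds."""
    hours, minutes, seconds = (int(part) for part in cpu_time.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def looks_like_leak(mem_kb: Sequence[int], min_samples: int = 5,
                    min_growth_kb: int = 1024) -> bool:
    """True when memory usage never drops and grows by min_growth_kb or more."""
    if len(mem_kb) < min_samples:
        return False
    never_drops = all(later >= earlier
                      for earlier, later in zip(mem_kb, mem_kb[1:]))
    return never_drops and mem_kb[-1] - mem_kb[0] >= min_growth_kb


def looks_like_runaway(cpu_seconds: Sequence[int], interval_seconds: int,
                       busy_fraction: float = 0.9) -> bool:
    """True when CPU time advances almost as fast as the clock in every interval."""
    if len(cpu_seconds) < 2:
        return False
    needed = busy_fraction * interval_seconds
    return all(later - earlier >= needed
               for earlier, later in zip(cpu_seconds, cpu_seconds[1:]))


# Memory samples in KB, taken at regular intervals.
print(looks_like_leak([3000, 3500, 4200, 5100, 6400]))
# CPU time samples in seconds, taken 60 seconds apart.
print(looks_like_runaway([0, 58, 117, 176], interval_seconds=60))
```

A steady climb in memory with no drops suggests a leak, while CPU time that grows nearly as fast as elapsed time suggests a runaway process. A process whose status is Not Responding can be found directly with a status filter.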

Viewing Lists of DLLs Being Used by Processes

When you use Tasklist with the /M parameter, you can examine the relationship between running processes and DLLs configured on the system. In the output, you’ll see the process image name, process ID, and a list of all DLLs that the process is using, as shown in the following example:
Image Name                   PID Modules
========================= ====== =============================================
System Idle Process            0 N/A
System                         4 N/A
smss.exe                     408 ntdll.dll
csrss.exe                    456 ntdll.dll, CSRSRV.dll, basesrv.dll,
                                 winsrv.dll, KERNEL32.dll, USER32.dll,
                                 GDI32.dll, sxs.dll, ADVAPI32.dll, RPCRT4.dll,
                                 Apphelp.dll, VERSION.dll
Knowing which DLL modules a process has loaded can further help you pinpoint what may be causing a process to become nonresponsive, to fail to release the CPU, or to use more memory than it should. In some cases, you might want to check DLL versions to ensure they are the correct DLLs that the system should be running. Here, you would need to consult the Microsoft Knowledge Base or manufacturer documentation to verify DLL versions and other information.
If you are looking for processes using a specified DLL, you can also specify the name of the DLL you are looking for. For example, if you suspect that the printer spooler driver Winspool.drv is causing processes to hang up, you can search for processes that use Winspool.drv instead of Winspool32.drv and check their status and resource usage.
The syntax that you use to specify the DLL to find is
tasklist /m DLLName
where DLLName is the name of the DLL to search for. Tasklist matches the DLL name without regard to the letter case, and you can enter the DLL name in any letter case. Consider the following example:
tasklist /m winspool.drv
In this example, you are looking for processes using Winspool.drv. The output of the command would show the processes using the DLL, with their process IDs, as shown in the following example:
Image Name                   PID Modules
========================= ====== =============================================
winlogon.exe                 484 WINSPOOL.DRV
spoolsv.exe                 1300 winspool.drv
explorer.exe                3516 WINSPOOL.DRV
mshta.exe                   3704 WINSPOOL.DRV

Filtering Task List Output

Using the /Fi parameter of the Tasklist utility, task lists can be filtered using any of the information fields available, even if the information field isn’t normally included in the output due to the parameters you’ve specified. This means you can specify that you want to see only processes listed with a status of Not Responding, only information for Svchost.exe processes, or only processes that use a large amount of CPU Time.
You designate how a filter should be applied to a particular Tasklist information field using filter operators. The filter operators available are
  • Eq  Equals. If the field contains the specified value, the process is included in the output.
  • Ne  Not equals. If the field contains the specified value, the process is excluded from the output.
  • Gt  Greater than. If the field contains a numeric value and that value is greater than the value specified, the process is included in the output.
  • Lt  Less than. If the field contains a numeric value and that value is less than the value specified, the process is included in the output.
  • Ge  Greater than or equal to. If the field contains a numeric value and that value is greater than or equal to the value specified, the process is included in the output.
  • Le  Less than or equal to. If the field contains a numeric value and that value is less than or equal to the value specified, the process is included in the output.
As Table 7-1 shows, the values that can be used with filter operators depend on the task list information field you use. Remember that all fields are available even if they aren’t normally displayed with the parameters you’ve specified. For example, you can match the status field without using the /V (verbose) flag.

Table 7-1: Filter Operators and Valid Values for Tasklist
Filter Field Name Valid Operators        Valid Values
================= ====================== =================================================================
CPUTime           eq, ne, gt, lt, ge, le Any valid time in the format hh:mm:ss
Services          eq, ne                 Any valid string of characters
ImageName         eq, ne                 Any valid string of characters
MemUsage          eq, ne, gt, lt, ge, le Any valid integer, expressed in kilobytes (KB)
PID               eq, ne, gt, lt, ge, le Any valid positive integer
Session           eq, ne, gt, lt, ge, le Any valid session number
SessionName       eq, ne                 Any valid string of characters
Status            eq, ne                 Running, Not Responding, Unknown
Username          eq, ne                 Any valid user name, with user name only or in domain\user format
WindowTitle       eq, ne                 Any valid string of characters
Double quotation marks must be used to enclose the filter string. Consider the following examples to see how filters can be used:
Look for processes that are not responding:
tasklist /fi "status eq not responding"
Note 
When working with remote systems, you can’t filter processes by status or window title. A workaround for this in some cases is to pipe the output through the FIND command, such as tasklist /v /s Mailer1 /u adatum\wrstanek | find /i "not responding". Note that, in this case, the field you are filtering must be in the output, which is why the /V parameter was added to the example. Further, you should specify that the FIND command should ignore the letter case of characters by using the /I parameter.
Look for processes on Mailer1 with a CPU time of more than 30 minutes:
tasklist /s Mailer1 /fi "cputime gt 00:30:00"
Look for processes on Mailer1 that use more than 20,000 KB of memory:
tasklist /s Mailer1 /u adatum\wrstanek /fi "memusage gt 20000"
Enter multiple /Fi “Filter” parameters to specify that output must match against multiple filters:
tasklist /s Mailer1 /fi "cputime gt 00:30:00" /fi "memusage gt 20000"
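When filters are generated by a script, it is easy to pair a field with an operator it doesn’t support. The following Python sketch encodes the field and operator combinations from Table 7-1 and rejects invalid pairs before any command is run. The names FILTER_OPERATORS, build_filter, and build_tasklist_args are illustrative assumptions; the sketch only builds argument lists and checks operators, not the values themselves.

```python
from typing import Iterable, List, Optional, Tuple

COMPARISON = {"eq", "ne", "gt", "lt", "ge", "le"}
EQUALITY = {"eq", "ne"}

# Field and operator combinations taken from Table 7-1.
FILTER_OPERATORS = {
    "cputime": COMPARISON,
    "services": EQUALITY,
    "imagename": EQUALITY,
    "memusage": COMPARISON,
    "pid": COMPARISON,
    "session": COMPARISON,
    "sessionname": EQUALITY,
    "status": EQUALITY,
    "username": EQUALITY,
    "windowtitle": EQUALITY,
}


def build_filter(field: str, operator: str, value: str) -> List[str]:
    """Return ['/fi', 'field op value'] after checking the operator is valid."""
    field_key, op_key = field.lower(), operator.lower()
    if field_key not in FILTER_OPERATORS:
        raise ValueError(f"unknown filter field: {field}")
    if op_key not in FILTER_OPERATORS[field_key]:
        raise ValueError(f"operator {operator} is not valid for {field}")
    return ["/fi", f"{field_key} {op_key} {value}"]


def build_tasklist_args(filters: Iterable[Tuple[str, str, str]],
                        computer: Optional[str] = None) -> List[str]:
    """Assemble a Tasklist argument list with any number of filters."""
    args = ["tasklist"]
    if computer:
        args += ["/s", computer]
    for field, operator, value in filters:
        args += build_filter(field, operator, value)
    return args


busy = build_tasklist_args(
    [("cputime", "gt", "00:30:00"), ("memusage", "gt", "20000")],
    computer="Mailer1",
)
print(busy)
```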

Monitoring Processes and System Resource Usage

Process Resource Monitor (Pmon) displays a snapshot of system resource usage and running processes. When you run this utility by typing pmon at the command prompt, the Process Resource Monitor collects information on the current system resource usage and running processes and displays it in the console window. The statistics are collected again every five seconds and redisplayed automatically. Pmon continues to run until you press the Q key to quit, and any other key you press tells Pmon to update the statistics.
Note 
Pmon output cannot be redirected, and you can only run Pmon on a local computer. To examine the resources of a remote computer, remotely access the computer using Remote Desktop. Further, you cannot use Pmon with the REMOTE command. Pmon redirects command output and is incompatible with the REMOTE command.
Pmon output is in table format with columns and rows of data, as follows:
Memory: 523248K Avail: 300516K PageFlts: 905 InRam Kernel: 2444K P:11496K
Commit: 337868K/ 214648K Limit:1280320K Peak: 345720K Pool N: 8372K P:11648K
                Mem   Mem    Page  Flts  Commit   Usage      Pri   Hnd  Thd  Image
CPU  CpuTime  Usage  Diff  Faults  Diff  Charge  NonP  Page        Cnt  Cnt  Name
              39448    64   59570   282                                      File Cache
 96  0:38:42     16     0       0     0       0     0     0    0     0    1  IdleProcess
  0  0:00:03    216     0    4080     0      28     0     0    8  1810   59  System
  0  0:00:00    480     0     197     0     164     0     5   11    17    3  smss.exe
  2  0:00:09   5236    56    2803    24    3216     5    48   13   756   10  csrss.exe
  0  0:00:01   4624     0   12878     0    7620     8    50   13   537   21  sqlgea.exe
  0  0:00:05   4740     0    2181     0    3932    12    52    9   388   19  services.exe
  0  0:00:04  30676     0   19113     0   28856    83    80    9  1040   61  sqlmon.exe
  0  0:00:00   2860     0     780     2    1040    23    21    8   242   11  sdman.exe
  0  0:00:00   3788     0    1076     0    1272     4    28    8   127   14  sdman.exe
  0  0:00:00    944     0     232     0     340     1     7   13     7    1  pmon.exe
  0  0:00:00   1776     0     464     0     536     1    25    8    15    1  notepad.exe
As shown, the first two rows of data provide a summary of memory usage. The values are in kilobytes (KB) and provide the following information:
  • Memory, Avail  Provides information on the total RAM on the system. Memory shows the amount of physical RAM. Avail shows the RAM not currently being used and available for use.
  • InRam Kernel  Provides information on the memory used by the operating system kernel. Critical portions of kernel memory must operate in RAM and can’t be paged to virtual memory. This type of kernel memory is listed as InRam Kernel. The rest of kernel memory can be paged to virtual memory and is listed after the InRam Kernel.
  • Commit, Limit, Peak  Provides information on committed physical and virtual memory. Commit lists physical memory which has space reserved on the disk page file, followed by the current amount of committed virtual memory. Limit lists the amount of virtual memory that can be committed without having to extend the paging file(s). Peak lists the maximum memory used by the system since the system was started. If the difference between the total memory available and the committed memory used is consistently small, you might want to add physical memory to the system to improve performance. If the peak memory usage is within 10 percent of the Limit value, you might want to add physical memory or increase the amount of virtual memory or both.
  • Pool N and P  Pooled memory values provide information on the paged pool, which is physical memory used by the operating system for objects that can be written to disk when they are not being used, and the nonpaged pool, which is physical memory used by the operating system for objects that cannot be written to disk and must remain in memory as long as they are allocated. Pool N is the size of the nonpaged pool and the value that follows it (Pool P) is the size of the paged pool.
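The 10 percent rule of thumb for Peak and Limit can be checked with a few lines of Python. This is only a sketch: the function name and the margin parameter are assumptions, and the figures are copied from the sample Pmon summary rows shown earlier.

```python
def peak_near_limit(peak_kb, limit_kb, margin=0.10):
    """Return True when peak memory use is within `margin` of the limit."""
    return peak_kb >= limit_kb * (1.0 - margin)


# Values from the sample summary rows (Limit:1280320K, Peak: 345720K).
limit_kb = 1280320
peak_kb = 345720

print(peak_near_limit(peak_kb, limit_kb))  # False
```

With the sample values, Peak is well below 90 percent of Limit, so the rule would not suggest adding memory.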
Following the two rows of memory usage statistics, you’ll find columns of information detailing resource usage for individual processes. These data points provide lots of information about running processes and you can use this information to determine which processes are hogging system resources, such as CPU time and memory. The fields displayed are the following:
  • CPU  The percentage of CPU utilization for the process.
  • CpuTime  The total amount of CPU cycle time used by the process since it was started.
  • Mem Usage  The amount of memory the process is using.
  • Mem Diff  Displays the change in memory usage for the process recorded since the last update.
  • Page Faults  A page fault occurs when a process requests a page in memory and the system can’t find it at the requested location. If the requested page is elsewhere in memory, the fault is called a soft page fault. If the requested page must be retrieved from disk, the fault is called a hard page fault. Most processors can handle large numbers of soft faults. Hard faults, however, can cause significant delays, and if there are a lot of hard faults, you may need to increase the amount of memory or reduce the system cache size. To learn how to determine the volume of hard faults, see the section of this chapter titled “Monitoring Memory Paging for Individual Processes.”
  • Flts Diff  Displays the change in the number of page faults for the process recorded since the last update.
  • Commit Charge  Displays the amount of virtual memory allocated to and reserved for the process.
  • Usage NonP/Page  Shows nonpaged pool and paged pool usage. The nonpaged pool is an area of system memory for objects that can’t be written to disk. The paged pool is an area of system memory for objects that can be written to disk when they aren’t used. You should note processes that require a high amount of nonpaged pool memory. If there isn’t enough free memory on the server, these processes might be the reason for a high level of page faults.
  • Pri  Shows the priority of the process. Priority determines how much of the system resources are allocated to a process. Standard priorities are Low (4), Below Normal (6), Normal (8), Above Normal (10), High (13), and Real-Time (24). Most processes have a normal priority by default. The highest priority is given to real-time processes. You may also see other priorities. For example, the Idle Process thread has a priority of 0, as this thread doesn’t use CPU time but rather tracks when the CPU is idle. Some system service processes have priority 9 or 11, which gives an important process a priority slightly higher than Normal or slightly higher than Above Normal.
  • Hnd Cnt  The total number of file handles maintained by the process. Use the handle count to gauge how dependent the process is on the file system. Some processes, such as those used by Microsoft Internet Information Services (IIS), have thousands of open file handles. System memory is required to maintain each file handle.
  • Thd Cnt  The current number of threads that the process is using. Most server applications are multithreaded. Multithreading allows concurrent execution of process requests. Some applications can dynamically control the number of concurrently executing threads to improve application performance. Too many threads, however, can cause the operating system to switch thread contexts too frequently, actually reducing performance.
  • Image Name  The name of the process or executable running the process.
As you examine processes, keep in mind that a single application might start multiple processes. Generally, these processes are dependent on a central process, and from this main process a process tree containing dependent processes is formed. When you terminate processes, you’ll usually want to target the main application process or the application itself rather than dependent processes. This ensures that the application is stopped cleanly.
If you use Pmon to examine running processes, you’ll note three unique processes:
  • File Cache  The file system cache is an area of physical memory that stores recently used pages of data for applications. When you see changes in the file cache, you are seeing I/O activity for applications. Memory usage shows the total physical memory used by the file cache. Page faults shows the number of pages that were sought but not found in the file system cache and had to be retrieved from elsewhere in memory (soft fault) or from disk (hard fault). If you monitor the Flts Diff for the File Cache, you can determine the cache fault rate. A consistently high cache fault rate may indicate the need to increase the amount of physical memory on the system.
  • Idle Process  Unlike other processes that track resource usage, Idle Process tracks the amount of CPU processing time that isn’t being used. Thus, a 99 in the CPU column for the Idle Process means 99 percent of the system resources currently aren’t being used. If you believe that a system is overloaded, you should monitor the idle process. Watch the CPU usage and the total CPU time. If the system consistently has low idle time (meaning high CPU usage), you may want to consider upgrading the processor or even adding processors.
  • System  System shows the resource usage for the local system process.

Stopping Processes

When you want to stop processes that are running on a local or remote system, you can use the Taskkill command-line utility. With Taskkill, you can stop processes by process ID using the /Pid parameter or image name using the /Im parameter. If you want to stop multiple processes by process ID or image name, you can enter multiple /Pid or /Im parameters as well. With image names, however, watch out, because Taskkill will stop all processes that have that image name. Thus if there are three instances of Helpctr.exe running, all three processes would be stopped if you use Taskkill with that image name.
As with Tasklist, Taskkill runs by default with the permissions of the user who is currently logged on. You can also specify the remote computer whose processes you want to stop and the Run As permissions. To do this, you use the expanded syntax, which includes the following parameters:
/s Computer /u [Domain\]User [/p Password]
where Computer is the remote computer name or IP address, Domain is the optional domain name in which the user account is located, User is the name of the user account whose permissions you want to use, and Password is the optional password for the user account. If you don’t specify the domain, the current domain is assumed. If you don’t provide the account password, you are prompted for the password.
Note 
Sometimes it is necessary to force a process to stop running. Typically, this is necessary when a process stops responding while opening a file, reading or writing data, or performing other read/write operations. To force a process to stop, you use the /F parameter. This parameter is only used with processes running on local systems. Processes stopped on remote systems are always forcefully stopped.
Tip 
As you examine processes, keep in mind that a single application might start multiple processes. Generally, these processes depend on a central process, and from this main process a process tree containing dependent processes is formed. Occasionally, you may want to stop the entire process tree, starting with the parent application process and including any dependent processes, and to do this, you can use the /T parameter.
Consider the following examples to see how Taskkill can be used:
Stop process ID 208:
taskkill /pid 208
Stop all processes with the image name Cmd.exe:
taskkill /im cmd.exe
Stop processes 208, 1346, and 2048 on MAILER1:
taskkill /s Mailer1 /pid 208 /pid 1346 /pid 2048
Force local process 1346 to stop:
taskkill /f /pid 1346
Stop a process tree, starting with process ID 1248 and including all child processes:
taskkill /t /pid 1248
To ensure that only processes matching specific criteria are stopped, you can use all the filters listed in Table 7-1 except Sessionname. For example, you can use a filter to specify that only instances of Cmd.exe that are not responding should be stopped rather than all instances of Cmd.exe (which is the default when you use the /Im parameter).
Taskkill adds a Modules filter with operators EQ and NE to allow you to specify DLL modules that should be excluded or included. As you may recall, you use the Tasklist /m parameter to examine the relationship between running processes and DLLs configured on the system. Using the Taskkill Modules filter with the EQ operator, you could stop all processes using a specific DLL. Using the Taskkill Modules filter with the NE operator, you ensure that processes using a specific DLL are not stopped.
Tip 
When you use filters, you don’t have to specify a specific image name or process ID to work with. This means you can stop processes based solely on whether they match filter criteria. For example, you can specify that you want to stop all processes that aren’t responding.
As with Tasklist, multiple filters can be used as well. Again, double quotation marks must be used to enclose the filter string. Consider the following examples to see how filters can be used with Taskkill:
Stop instances of Cmd.exe that are not responding:
taskkill /im cmd.exe /fi "status eq not responding"
Stop all processes with a process ID greater than 4 if they aren’t responding:
taskkill /fi "pid gt 4" /fi "status eq not responding"
Stop all processes using the Winspool.drv DLL:
taskkill /fi "modules eq winspool.drv"
Caution 
Although the /Im and /Pid flags are not used in the second example, the process IDs are filtered so that only certain processes are affected. You don’t want to stop the system or system idle process accidentally. Typically, these processes run with process IDs of 4 and 0 respectively, and if you stop them, the system will stop responding or shut down.
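If you generate Taskkill commands from a script, the caution above can be built into the code that assembles the command line. The following Python sketch is one way to do that; build_taskkill and PROTECTED_PIDS are hypothetical names, the function only builds an argument list and never runs Taskkill, and it uses only the /f, /t, /pid, /im, and /fi parameters described in this section.

```python
PROTECTED_PIDS = {0, 4}  # system idle process and system process, per the caution


def build_taskkill(pids=(), image=None, filters=(), force=False, tree=False):
    """Return a Taskkill argument list, or raise ValueError for a risky request."""
    pids = list(pids)
    if PROTECTED_PIDS.intersection(pids):
        raise ValueError("refusing to target process ID 0 or 4")
    if not pids and image is None and not filters:
        raise ValueError("give a process ID, an image name, or a filter")
    args = ["taskkill"]
    if force:
        args.append("/f")
    if tree:
        args.append("/t")
    for pid in pids:
        args += ["/pid", str(pid)]
    if image is not None:
        args += ["/im", image]
    for expression in filters:
        # Each filter string stays one list element, so it is quoted as a unit
        # if the list is later handed to a process launcher.
        args += ["/fi", expression]
    return args


print(build_taskkill(image="cmd.exe", filters=["status eq not responding"]))
```

On Windows, the returned list could be passed to subprocess.run. Note that the guard covers only explicitly listed process IDs; a filter-only command still needs a filter such as "pid gt 4" to protect the system processes.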

Writing Custom Events to the Event Logs

Whenever you work with automated scripts, scheduled tasks, or custom applications, you might want those scripts, tasks, or applications to write custom events to the event logs. For example, if a script runs normally, you might want to write an informational event in the application log that specifies this so it is easier to determine that the script ran and completed normally. Similarly, if a script doesn’t run normally and generates errors, you might want to log an error or warning event in the application log so that you’ll know to examine the script and determine what happened.
Tip 
You can track errors that occur in scripts using %ErrorLevel%. This environment variable tracks the exit code of the most recently used command. If the command executes normally, the error level is zero (0). If an error occurs while executing the command, the error level is set to a nonzero value. To learn more about working with error levels, see the section of Chapter 3 titled, “Getting Acquainted with Variables.”
To create custom events, you’ll use the Eventcreate utility. Custom events can be logged in any available log except the security log, and can include the event source, ID and description you want to use. The syntax for Eventcreate is
eventcreate /l LogName /so EventSource /t EventType /id EventID /d EventDescr
where
  • LogName  Sets the name of the log to which the event should be written. Use quotation marks if the log name contains spaces, as in “DNS Server.”
    Tip 
    You cannot write custom events to the security logs. You can, however, write custom events to the DNS Server, Directory Service, File Replication Service, or other service-related logs. Start by writing a dummy event using the event source you want to register for use with that log. The initial event for that source will be written to the application log. You can then use the source with the specified log and your custom events.
  • EventSource  Specifies the source to use for the event and can be any string of characters. If the string contains spaces, use quotation marks, as in “Event Tracker.” In most cases, you’ll want the event source to identify the application, task, or script that is generating the error.
    Caution 
    Carefully plan the event source you want to use before you write events to the logs using those sources. Each event source you use must be unique and cannot be the same name as an existing source used by an installed service or application. For example, you cannot use DNS, W32Time or Ntfrs as sources because these sources are already used by installed services or applications. Additionally, once you use an event source with a particular log, the event source is registered for use with that log on the specified system. For example, you cannot use “EventChecker” as a source in the application log and in the system log on MAILER1. If you try to write an event using “EventChecker” to the system log after writing a previous event with that source to the application log, you will see the following error message: “ERROR: Source already exists in ‘Application’ log. Source cannot be duplicated.”
  • EventType  Sets the event type as Information, Warning, or Error. “Success Audit” and “Failure Audit” event types are not valid; these events are used with the security logs and you cannot write custom events to the security logs.
  • EventID  Specifies the numeric ID for the event and can be any value from 1 to 1000. Before you assign event IDs haphazardly, you may want to write a list of the general events that can occur and then break these down into categories. You could then assign a range of event IDs to each category. For example, events in the 100s could be general events, events in the 200s could be status events, events in the 500s could be warning events, and events in the 900s could be error events.
  • EventDescr  Sets the description for the event and can be any string of characters. Be sure to enclose the description in quotation marks.
    Note 
    Eventcreate runs by default on the local computer with the permissions of the user who is currently logged on. As necessary, you can also specify the remote computer on which you want to create the event and the Run As permissions using /s Computer /u [Domain\]User [/p Password], where Computer is the remote computer name or IP address, Domain is the optional domain name in which the user account is located, User is the name of the user account whose permissions you want to use, and Password is the optional password for the user account.
To see how Eventcreate can be used, consider the following examples:
Create an information event in the application log with the source Event Tracker and event ID 209:
eventcreate /l "application" /t information /so "Event Tracker" /id 209 /d "evs.bat script ran without errors."
Create a warning event in the system log with the source CustApp and event ID 511:
eventcreate /l "system" /t warning /so "CustApp" /id 511 /d "sysck.exe didn't complete successfully."
Create an error event in the system log on MAILER1 with the source “SysMon” and event ID 918:
eventcreate /s Mailer1 /l "system" /t error /so "SysMon" /id 918 /d "sysmon.exe was unable to verify write operation."

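When a script or custom application writes events, it can help to check the values against the rules listed above before calling Eventcreate. The Python sketch below is one possible approach; build_eventcreate is a hypothetical helper that only assembles the argument list, and the checks simply restate the rules from this section (no security log, only Information, Warning, or Error types, and event IDs from 1 to 1000).

```python
VALID_TYPES = {"information", "warning", "error"}


def build_eventcreate(log, source, event_type, event_id, description, computer=None):
    """Return an Eventcreate argument list after validating the inputs."""
    if log.strip().lower() == "security":
        raise ValueError("custom events cannot be written to the security log")
    if event_type.lower() not in VALID_TYPES:
        raise ValueError("event type must be Information, Warning, or Error")
    if not 1 <= event_id <= 1000:
        raise ValueError("event ID must be from 1 to 1000")
    args = ["eventcreate"]
    if computer:
        args += ["/s", computer]
    args += ["/l", log, "/t", event_type, "/so", source,
             "/id", str(event_id), "/d", description]
    return args


print(build_eventcreate("application", "Event Tracker", "information", 209,
                        "evs.bat script ran without errors."))
```

Because the result is a list, values that contain spaces, such as the source Event Tracker, remain single arguments if the list is passed to subprocess.run on Windows.
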
Wednesday 2 July 2014

Viewing and Filtering Event Logs Using Command Prompt

Viewing and Filtering Event Logs

You can view events recorded in the Windows event logs using the Eventquery utility. Eventquery flags set the format of the output, control the level of detail, and allow you to use filters to include or exclude events from the result set. When working with Eventquery, don’t overlook the power of automation. You don’t have to run the command manually each time. Instead, you can create a script to query the event logs on multiple systems and then save the results to a file. If you copy that file to a published folder on an intranet server, you can use your Web browser to examine event listings. Not only will that save you time, it will give you a single location for examining event logs and determining if there are issues that require further study.

Viewing Events and Formatting the Output

The basic syntax for Eventquery is
eventquery /l "LogName"
where LogName is the name of the log you want to work with, such as “Application,” “System,” or “Directory Service.” In this example, you examine the Application log:
eventquery /l "Application"
The output of this query would look similar to the following:
-------------------------------------------------------------------------------
Listing the events in 'application' log of host 'MAILER1'
-------------------------------------------------------------------------------
Type          Event   Date Time                Source              ComputerName
------------- ------- ------------------------ ------------------- ------------
Warning       9220    5/19/2004 4:38:01 PM     MSExchangeMTA       MAILER1
Information   1001    5/19/2004 4:28:50 PM     MSExchangeIS        MAILER1
Information   9600    5/19/2004 4:28:50 PM     MSExchangeIS        MAILER1
Information   9523    5/19/2004 4:28:50 PM     MSExchangeIS Publ   MAILER1
Information   9523    5/19/2004 4:28:49 PM     MSExchangeIS Mail   MAILER1
Information   9523    5/19/2004 4:28:48 PM     MSExchangeIS Publ   MAILER1
Information   9523    5/19/2004 4:28:47 PM     MSExchangeIS Mail   MAILER1
Information   9523    5/19/2004 4:28:46 PM     MSExchangeIS Mail   MAILER1
Information   3000    5/19/2004 4:28:45 PM     MSExchangeIS Publ   MAILER1
Information   1133    5/19/2004 4:28:41 PM     MSExchangeIS Publ   MAILER1
As you can see, the output shows the Type, Event, Date Time, Source, and ComputerName properties of events. Using the /V (verbose) option, you can add category, user, and description properties to the output. Thus, if you wanted a verbose view of the application log, you’d use the command:
eventquery /l "Application" /v
Note 
Technically, the quotation marks are necessary only when the log name contains a space, as is the case with the DNS Server, Directory Service, and File Replication Service logs. However, I recommend using the quotation marks all the time; that way, you won’t forget them when they are needed and they won’t cause your scripts or scheduled tasks to fail.
Tip 
Unlike previous command-line utilities that we’ve worked with, Eventquery is configured as a Windows script. If this is your first time working with Windows scripts from the system’s command line or you’ve configured WScript as the primary script host, you will need to set CScript as the default script host. You do this by typing cscript //h:cscript //s at the command prompt. This is necessary because you want to work with the command line rather than with the GUI.
Real World 
The script host is set on a per-user basis, and if you are running a script as a specific user, that user might not have CScript configured as the default script host. An effective workaround for this is to enter cscript //h:cscript //s as a line of the script and then enter your event queries.
Eventquery runs by default on the local computer with the permissions of the user who is currently logged on. As necessary, you can also specify the remote computer whose event logs you want to query and the Run As permissions by using the expanded syntax, which includes the following parameters:
/s Computer /u [Domain\]User [/p Password]
where Computer is the remote computer name or IP address, Domain is the optional domain name in which the user account is located, User is the name of the user account whose permissions you want to use, and Password is the optional password for the user account. For example, if you wanted to examine directory service events on MAILER1 using the Adatum\WRStanek account, you could use the following command:
eventquery /l "Directory Service" /s Mailer1 /u Adatum\WRStanek
Note 
If you don’t specify the domain, the current domain is assumed. If you don’t provide the account password, you are prompted for the password.
The syntax can be extended to include the following format options as well:
  • /Nh  Removes the heading row from the output of Table- or CSV-formatted data.
  • /Fo Format  Changes the output format, which by default is table (/Fo Table). Use /Fo Csv to format the output as comma-separated values. Use /Fo List to format the output as a list.
Where Eventquery gets interesting is in the range and filter facilities. With ranges, you can view
  • The N most recent events  Type /r N where N is the number of recent events to view, such as /r 50 for the 50 most recent events.
  • The N oldest events  Type /r -N where -N is the number of the oldest events to view, such as /r -50 for the 50 oldest events.
  • Events from N1 to N2  Type /r N1-N2 where N1 is the first event and N2 is the last event to view, with 1 being the most recent event recorded, 2 being the next previous event recorded, and so on. For example, to see events 10 to 20 you’d use /r 10-20.
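To make the three range forms concrete, the following Python sketch applies them to an in-memory list ordered from most recent to oldest. It only mimics the selection logic described above: select_range is a hypothetical name, and the sketch makes no claim about the order in which Eventquery itself prints the oldest events.

```python
def select_range(events, spec):
    """Pick events the way /r N, /r -N, and /r N1-N2 are described.

    `events` must be ordered with the most recent event first.
    """
    if spec.startswith("-"):
        count = int(spec[1:])
        return events[-count:] if count > 0 else []
    if "-" in spec:
        first, last = (int(part) for part in spec.split("-", 1))
        return events[first - 1:last]  # positions are 1-based and inclusive
    return events[:int(spec)]


# Thirty placeholder events numbered 1 (most recent) through 30 (oldest).
events = list(range(1, 31))

print(select_range(events, "5"))      # [1, 2, 3, 4, 5]
print(select_range(events, "-3"))     # [28, 29, 30]
print(select_range(events, "10-20"))  # events 10 through 20
```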

Filtering Events

One of the key reasons for using Eventquery is its ability to use filters to include or exclude events from the result set. Typically, you won’t want to see every event generated on a system. More often, you will want to see only warnings or critical errors, and that is precisely what filters are for. Using filters, you can include only events that match the criteria you specify.
Any of the information fields available can be filtered, even if the information field is only listed with the verbose flag (/V) and you haven’t specified the verbose flag for the current command. This means you can filter events by type, date time, source, computer name, event ID, category, and user.
You designate how a filter should be applied to a particular Eventquery information field using filter operators. The filter operators available are
  • Eq  Equals. If the field contains the specified value, the event is included in the output.
  • Ne  Not equals. If the field contains the specified value, the event is excluded from the output.
  • Gt  Greater than. If the field contains a numeric value and that value is greater than the value specified, the event is included in the output.
  • Lt  Less than. If the field contains a numeric value and that value is less than the value specified, the event is included in the output.
  • Ge  Greater than or equal to. If the field contains a numeric value and that value is greater than or equal to the value specified, the event is included in the output.
  • Le  Less than or equal to. If the field contains a numeric value and that value is less than or equal to the value specified, the event is included in the output.
As Table 6-1 shows, the values that can be used with filter operators depend on the event information field you are using. Again remember that all fields are available even if they aren’t normally displayed with the parameters you’ve specified. For example, you can match the user field without using the /V (verbose) flag.
Table 6-1: Filter Operators and Valid Values for Eventquery
Filter Field Name   Valid Operators          Valid Values
-----------------   ----------------------   ------------------------------------------
Category            eq, ne                   Any valid string of characters.
Computer            eq, ne                   Any valid string of characters.
Datetime            eq, ne, gt, lt, ge, le   Any valid time in the format mm/dd/yy, hh:mm:ssAM or mm/dd/yy, hh:mm:ssPM.
ID                  eq, ne, gt, lt, ge, le   Any valid positive integer, up to 65,535.
Source              eq, ne                   Any valid string of characters.
Type                eq, ne                   Information, Warning, Error, SuccessAudit, FailureAudit.
User                eq, ne                   Any valid user name, with user name only or in domain\user format.
Quotation marks must be used to enclose the filter string. Consider the following examples to see how filters can be used:
Look for error events in the application log:
eventquery /l "application" /fi "type eq error"
Look for system log events on MAILER1 that occurred after midnight on 05/06/04:
eventquery /s Mailer1 /l "system" /fi "datetime gt 05/06/04,00:00:00AM"
Look for DNS server log errors on MAILER1 with event ID 4004:
eventquery /s Mailer1 /l "dns server" /fi "id eq 4004"
Enter multiple /Fi parameters to specify that output must match against multiple filters:
eventquery /l "system" /fi "datetime gt 05/06/04,00:00:00AM" /fi "type eq error"
Here, Eventquery would examine the system logs for error events that were created after midnight on 05/06/04. Keep in mind that an event must match every filter you specify, so filters on the same field are mutually exclusive. You can’t specify that you want to see both error and warning events using a single command line. You would need to enter two different commands: one with /Fi "type eq error" and the other with /Fi "type eq warning".
However, if you are working with a log other than security (in which only success audit and failure audit events are logged), you can simply specify that you don’t want to see informational events. That way, you will only see warning and error events as shown in the following example:
eventquery /l "system" /fi "type ne information"
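The reason two eq filters on the same field return nothing, while a single ne filter returns both warnings and errors, is easier to see in a small model. The Python sketch below assumes, as the text describes, that an event must satisfy every filter. The matches function and the sample events are hypothetical, and only the eq and ne operators are modeled.

```python
def matches(event, filters):
    """Return True if the event satisfies every (field, operator, value) filter."""
    for field, operator, value in filters:
        same = str(event[field]).lower() == value.lower()
        if operator == "eq":
            ok = same
        elif operator == "ne":
            ok = not same
        else:
            raise ValueError("this sketch handles only eq and ne")
        if not ok:
            return False
    return True


# Hypothetical events standing in for log entries.
EVENTS = [
    {"type": "Error", "id": 4004},
    {"type": "Warning", "id": 9220},
    {"type": "Information", "id": 1001},
]

both = [e for e in EVENTS
        if matches(e, [("type", "eq", "error"), ("type", "eq", "warning")])]
not_info = [e for e in EVENTS if matches(e, [("type", "ne", "information")])]

print(len(both), len(not_info))  # 0 2
```

No single event can be both an error and a warning, so the first query is empty; excluding informational events leaves the error and the warning.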
You can automate the event querying process by creating a script that obtains the event information you want to see and then writes it to a text file. Consider the following example:
@echo off
eventquery /s Mailer1 /l "system" /r 100 /fi "type ne information" > \\CorpIntranet01\www\currentlog.txt
eventquery /s Mailer1 /l "application" /r 100 /fi "type ne information" >> \\CorpIntranet01\www\currentlog.txt
eventquery /s Mailer1 /l "directory service" /r 100 /fi "type ne information" >> \\CorpIntranet01\www\currentlog.txt
Here, you are examining the system, application and directory service event logs on MAILER1 and writing any resulting output to a network share on CorpIntranet01. If any of the named logs have warning or error events among the 100 most recent events in the logs, the warnings or errors are written to the Currentlog.txt file. Because the first redirection is overwrite (>) and the remaining entries are append (>>), any existing Currentlog.txt file is overwritten each time the script runs. This ensures only current events are listed. To take the automation process a step further, you can create a scheduled task that runs the script each day or at specific intervals during the day.