
Seberry J., Cryptography: An Introduction to Computer Security, 1989 (PDF)


accesses his computer from his office during working hours from 8 a.m. to 6 p.m. and sometimes remotely via modem from 7 p.m. to 10 p.m., then abnormal behavior would be an access to the computer from his office at midnight. On the other hand, misuse intrusion detection demands that the IDS store information about the attacks on the security of the system known so far. Note that a user will be marked as an intruder as soon as the IDS comes to the conclusion that he has tried to compromise security using one of the known attack scenarios. Note also that the IDS cannot detect intrusions if the applied attack scenarios are not recorded in its database. Normally, managers of computer systems should update their attack scenario databases as soon as a new attack becomes known.

14.2 Anomaly Intrusion Detection

An IDS based on anomaly intrusion detection is in fact an identification system that uses some measurable characteristics of user activities. A user activity can be characterized by its

1. Intensity – this is reflected by the sheer volume of audit records produced for a user per unit of time. Better granulation can be achieved if the intensity is measured in the context of a particular type of activity.

2. Mix of different types of activity – this includes not only the collection of different types of activity but also other, more specific information, such as the order in which particular activities take place and the context in which a particular sequence of activities occurs.

The intensity measure is very much related to the type of activity and may be described by many specific parameters. In general, it is possible to use two major intensity characteristics: the number of times a given activity occurs per unit of time, or the average amount of time consumed by a single activity. Typical intensity measures for a user are the amount of CPU time, the number of active processes, the number of I/O operations, the number of opened files, etc.

User identity can be characterized by types of activity (for instance, sending e-mail, calling an editor, compiling a program, creating a window, etc.), the order of activities (for example, after login, a user normally first reads e-mail, sends e-mail, saves e-mail copies, uses the Web browser, and prints out some Web pages), and the context in which the specific order of activities takes place (i.e. differentiation of a user activity profile depending on whether the user accesses the system from his workstation or from a remote terminal).

14.2.1 Statistical IDS

Implementation of an IDS starts from choosing an appropriate collection of user activity measures. The selection depends on many factors, such as the required probabilities of false acceptance and false rejection, the memory required to store users' profiles, the efficiency of the IDS, etc. Assume that the measures chosen are m1, …, mn. Each user is therefore assigned a collection of random variables M1, …, Mn. Each random variable can be stored in the form of its probability distribution (an expensive option) or in a compressed form that includes the name of the probability distribution together with the parameters describing it. The profile of a given user consists of the sequence of random variables (M1, …, Mn) evaluated from the audit trail and stored by the IDS, usually in compressed form. The security policy determines which of the chosen measures are more important and which are less significant. To express the current security policy, the manager provides the IDS with a sequence of weights (w1, …, wn) that is used together with the corresponding measures to determine the IDS decision about intrusion.

IDS based on statistical measures

Setup: The manager selects a collection of measures (m1, …, mn) and a vector of weights (w1, …, wn). For each user, the IDS computes and stores the user profile described by (M1, …, Mn) from the audit trail.

Processing: For a given time interval, the IDS takes the corresponding audit trail and computes the actual profile of the user, defined by $(\tilde{M}_1, \dots, \tilde{M}_n)$. The IDS uses distance functions $d_i = d_i(M_i, \tilde{M}_i)$ to determine the extent of abnormal behavior with respect to the measure mi. The distance functions should be treated as functions that operate on pairs of probability distributions and return an integer that makes sense of the distance.

Decision: If

$$\sum_{i=1}^{n} w_i d_i \le d_t$$

then the behavior in the time interval is considered normal; otherwise the behavior is abnormal (an intrusion is detected). The integer $d_t$ is the threshold value that determines the boundary between normal and abnormal behavior of a user.

Action: If an intrusion is detected, the user activities are suspended and/or the manager is immediately notified. Otherwise, the profile of the user is updated.
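The decision rule above is a weighted threshold test. A minimal sketch in Python, where the measure names, distances, weights, and threshold are invented illustrative values rather than ones from the text:

```python
# Sketch of the statistical decision rule: weighted sum of per-measure
# distances compared against the threshold d_t. All concrete values
# below (distances, weights, threshold) are hypothetical.

def intrusion_decision(distances, weights, threshold):
    """Return True (intrusion) when the weighted distance sum exceeds d_t."""
    score = sum(w * d for w, d in zip(weights, distances))
    return score > threshold

# Distances d_i between the stored profile M_i and the observed profile
# for three hypothetical measures: CPU time, I/O count, open files.
distances = [2, 0, 5]
weights = [0.5, 0.2, 0.3]   # security policy: CPU time weighted highest
threshold = 2.0             # d_t, tuned experimentally

print(intrusion_decision(distances, weights, threshold))  # weighted sum 2.5 > 2.0
```

Raising dt trades false alarms for missed intrusions, which is why the text leaves its choice to experiment and to the security policy.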

While designing a statistical IDS, the following questions need to be considered:

– how to select a collection of measures (m1, …, mn),
– how to define the distances $d_i = d_i(M_i, \tilde{M}_i)$,
– how to determine the threshold $d_t$.

The measure selection, also called feature choice, is crucial to the quality of intrusion detection. Typically, the designer first identifies a collection of all measures accessible in the system. Let the collection be (m1, …, mℓ); the designer then tries different (if ℓ is small, the designer may try all) combinations of features that are most sensitive (good discrimination among users) and stable (the features do not change over time).

Once a collection of good features has been selected, the designer has to define the corresponding collection of distances between two probability distributions (for normal and abnormal behavior). This works well if the accepted measures are statistically independent. In most cases this assumption does not hold. A typical solution is to combine related features into one anomaly measure using covariance matrices [311]. The value of the threshold $d_t$ is selected experimentally, as it directly influences the false rejection and false acceptance probabilities. It is also a matter of the security policy.

Statistical intrusion detection assumes that each user can be assigned a unique profile that can be effectively compared with the current approximation of the profile. In general, a user is modeled by a stochastic process that is stationary, or whose parameters do not vary dramatically, so that the update of the profile can cope with changes of behavior (the process is quasi-stationary). More precise models include nonstationary stochastic processes or generalized Markov chains. Building such models is too expensive to be practical.

14.2.2 Predictive Patterns

Predictive pattern anomaly detection is based on the assumption that it is possible to identify normal and abnormal behavior of users from the ordered sequences of events they generate. A profile of a user is thus a collection of "typical" sequences. The probabilistic nature of the patterns of events generated by users can be reflected by assigning conditional probabilities to transitions to other events, given that a particular typical sequence has occurred. For instance, a typical pattern can be an ordered sequence of events

$$\langle e_1, e_2, e_3 \rangle$$

with $P(e_4 \mid \langle e_1, e_2, e_3\rangle) = 0.1$ and $P(e_4' \mid \langle e_1, e_2, e_3\rangle) = 0.9$. This reads: if a user generates the sequence $\langle e_1, e_2, e_3\rangle$, then only the two events $e_4$ and $e_4'$ may follow it, with probabilities 0.1 and 0.9, respectively. A typical sequence $\langle e_1, \dots, e_n\rangle$ together with the associated conditional probabilities $P(e^{(i)}_{n+1} \mid \langle e_1, \dots, e_n\rangle)$ for some i is called a rule. Note that a rule can be used only if the user produces the matching event prefix $\langle e_1, \dots, e_n\rangle$.

IDS based on predictive patterns

Setup: For each user, the IDS computes and stores the user profile described by a collection of rules {R1, …, Rn} computed from the audit trail.

Processing: For a given time interval, the IDS takes the corresponding audit trail and computes the conditional probabilities associated with the rules stored in the user profile. The IDS uses distance functions $d_i$ (i = 1, …, n) to determine the extent of abnormal behavior with respect to the rule Ri. The distance functions should be treated as functions that operate on pairs of conditional probability distributions and return an integer that makes sense of the distance.

Decision: For weights $w_i$ chosen by the manager, if

$$\sum_{i=1}^{n} w_i d_i \le d_t$$

then the behavior in the time interval is considered normal; otherwise the behavior is abnormal (an intrusion is detected). The integer $d_t$ is the threshold value that determines the boundary between normal and abnormal behavior of a user.

Action: If an intrusion is detected, the user activities are suspended and/or the manager is immediately notified. Otherwise, the profile of the user is updated.

A major problem with this approach is that the rules can only be used if they are triggered by their event prefix. If none or few of the event prefixes were generated by a user, it is impossible to make any reasonable decision and the IDS simply fails.

Advantages of this approach include the ability of the system to be adapted for misuse detection. A nice property of the system is that it works very well for users whose behavior exhibits a strong sequential pattern [505].
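As a concrete illustration, a rule-based profile can be sketched as a table from event prefixes to the conditional probabilities of allowed successors; all event names, probabilities, and the probability cut-off are hypothetical:

```python
# Toy sketch of a predictive-pattern profile: each rule maps an event
# prefix to the conditional probabilities of the events allowed to
# follow it. Event names and probabilities are invented.

profile = {
    ("login", "read_mail", "send_mail"): {"browse_web": 0.9, "compile": 0.1},
}

def check(prefix, next_event, min_prob=0.05):
    """Return 'no rule', 'normal', or 'abnormal' for an observed transition."""
    rule = profile.get(tuple(prefix))
    if rule is None:
        return "no rule"      # prefix triggers nothing: the IDS cannot decide
    if rule.get(next_event, 0.0) >= min_prob:
        return "normal"
    return "abnormal"         # event never (or almost never) follows this prefix

print(check(["login", "read_mail", "send_mail"], "browse_web"))   # normal
print(check(["login", "read_mail", "send_mail"], "delete_logs"))  # abnormal
print(check(["logout"], "browse_web"))                            # no rule
```

The "no rule" branch mirrors the failure mode just described: an untriggered prefix leaves the IDS without a basis for any decision.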

14.2.3 Neural Networks

Neural networks sometimes offer a simple and efficient solution in situations where other approaches fail. To use a neural network for intrusion detection, it is enough first to train the neural net on a sequence of events generated by a user and later to use the net as a predictor of the next event.

IDS based on neural networks

Setup: For each user, the IDS maintains a neural net. The neural net is trained on a sequence of events generated by the user.

Processing: The IDS repeatedly considers sequences of n events generated by the user. Each sequence is fed to the neural net. The network predicts the next event ẽ and compares it with the event e issued by the user.

Decision: If ẽ = e, then the behavior of the user is considered normal; otherwise the behavior is abnormal (an intrusion is detected).

Action: If an intrusion is detected, the user activities are suspended and/or the manager is immediately notified.

The selection of the parameter n is an important issue. If n is too small, the network will not be able to predict the next event (a lot of false alarms). On the other hand, if n is too large, then there are no relations between the events at the beginning and at the end of the sequence. Evidently, the IDS will fail if a user selects the next event nondeterministically. To fix this, the neural net needs to emit a number of typical events rather than a single prediction.
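A toy stand-in for such a predictor: a single softmax layer trained on sliding windows of n = 2 events drawn from an invented, strictly alternating event stream. A deployed IDS would use a larger network and real audit events; this only shows the train-then-predict loop:

```python
import numpy as np

# Minimal next-event predictor: a softmax layer trained on windows of
# n past events. The event stream (a user who strictly alternates
# events 0, 1, 0, 1, ...) and all training settings are invented.

rng = np.random.default_rng(0)
EVENTS, N = 2, 2                      # alphabet size and window length n

def one_hot_window(window):
    x = np.zeros(EVENTS * N)
    for i, e in enumerate(window):
        x[i * EVENTS + e] = 1.0
    return x

stream = [i % 2 for i in range(200)]  # training stream: 0,1,0,1,...
X = np.array([one_hot_window(stream[i:i + N]) for i in range(len(stream) - N)])
y = np.array(stream[N:])

W = rng.normal(scale=0.1, size=(EVENTS * N, EVENTS))
for _ in range(300):                  # plain gradient descent on cross-entropy
    logits = X @ W
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    p[np.arange(len(y)), y] -= 1.0    # gradient of loss w.r.t. logits
    W -= 0.5 * X.T @ p / len(y)

def predict(window):
    return int(np.argmax(one_hot_window(window) @ W))

print(predict([0, 1]))        # the net expects event 0 next
print(predict([1, 0]) == 1)   # matching event: behavior judged normal
```

When the predicted ẽ disagrees with the observed e, the decision step above flags the interval as abnormal.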


14.3 Misuse Intrusion Detection

Note that anomaly intrusion detection always compares the current activity with the expected activity defined for a user, and can be seen as user identification. Misuse intrusion detection does not care whether users can be properly identified as long as they do not try to abuse the computer resources. From the IDS point of view, there are only two classes of users: friends and foes. To define the class of foes, it is necessary to determine precisely the meaning of intrusion. This is done by providing a list of intrusion scenarios or attacks (also called intrusion signatures). An intrusion signature defines

– the order of events (typically, commands),
– the resources involved (files, processes, CPU, memory, etc.),
– the conditions on resources and events,

which compromise the security of the system. Intrusion signatures can be categorized into the following classes:

1. Simple signatures – the existence of a single event in the audit trail and/or the existence of a trace of an intrusion attempt is enough to detect an intrusion.

2. Event-based signatures – the existence of an ordered sequence of events is enough to conclude that the user is an intruder.

3. Structured signatures – the signature can be written as a regular expression.

4. Unstructured signatures – all signatures that do not fall into one of the above classes.
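A class 3 (structured) signature can be sketched as a regular expression matched against an audit trail flattened into a single ';'-separated line; the attack pattern here, copying the password file and later invoking a cracking tool, is an invented example:

```python
import re

# A structured signature as a regular expression over a flattened audit
# trail. The attack pattern (copy /etc/passwd somewhere, then run a
# cracker on it) is hypothetical, not taken from the text.

signature = re.compile(r"cp /etc/passwd \S+;.*crack\b")

trail = "ls;cp /etc/passwd /tmp/pw;mail -s hi bob;crack /tmp/pw"
print(bool(signature.search(trail)))                # the signature matches
print(bool(signature.search("ls;cat notes.txt")))   # benign trail: no match
```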

Having a collection of intrusion signatures, the IDS may apply a variety of different methods to detect that a user attempts to attack the system using some intrusion scenario recorded in the system as the corresponding intrusion signature. Some typical approaches involve the application of

– expert systems and
– finite state machines.

An expert system implementation of the IDS encodes the collection of intrusion signatures into if-then rules. A rule not only reflects a single intrusion signature (the if part) but also specifies what action needs to be undertaken when an intrusion is detected (the then part). The IDS takes an audit trail and investigates it to check whether or not some of the rules are active (i.e. an attack is under way).


In the finite state machine approach, signatures are translated into corresponding state transitions of the underlying machine. The states of the machine are divided into three classes: safe (no intrusion detected), suspicious (the trail has advanced inside one of the signatures), and intrusion (an intrusion is detected and the corresponding signature is active).
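The three state classes can be sketched for a single event-based signature, with the state advancing as the signature's events arrive in order; the event names are invented:

```python
# Finite state machine sketch for one event-based signature
# <su_attempt, read_shadow, ftp_out>. State 0 is safe; partial progress
# through the signature is suspicious; completing it is an intrusion.
# The event names are hypothetical, not from a real signature database.

SIGNATURE = ["su_attempt", "read_shadow", "ftp_out"]

def scan(trail):
    state = 0                        # 0 = safe, 1..len-1 = suspicious
    for event in trail:
        if event == SIGNATURE[state]:
            state += 1               # advance inside the signature
            if state == len(SIGNATURE):
                return "intrusion"
    return "suspicious" if state > 0 else "safe"

print(scan(["login", "su_attempt", "read_shadow", "ftp_out"]))  # intrusion
print(scan(["login", "su_attempt", "ls"]))                      # suspicious
print(scan(["login", "ls"]))                                    # safe
```

A production matcher would also reset or branch on non-matching events; the sketch keeps only the three state classes named in the text.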

14.4 Uncertainty in Intrusion Detection

The most important issue related to effective intrusion detection is the adoption of an appropriate mathematical model that allows user profiles to be generated efficiently and facilitates an effective and accurate decision-making process for intrusion detection. Due to the nondeterministic nature of user behavior, the decision about intrusive or nonintrusive behavior must take into account all evidence for and against the claim. There are several mathematical models to choose from. The two most popular are the probabilistic model and the Dempster-Shafer model [133, 464]. In the probabilistic model, the decision about intrusion is based on a probabilistic assessment of the body of evidence. The Dempster-Shafer theory of evidence can be seen as a generalization of probability theory.

14.4.1 Probabilistic Model

Given an event space Ω with random events $e_1, \dots, e_n$ such that $P(e_1 \cup \dots \cup e_n) = 1$, i.e. $\bigcup_{i=1}^{n} e_i = \Omega$. Bayes' theorem asserts that for any random event $B \in \Omega$ with $P(B) > 0$,

$$P(e_i \mid B) = \frac{P(e_i, B)}{P(B)} = \frac{P(B \mid e_i)\,P(e_i)}{\sum_{e_j \in \Omega} P(B \mid e_j)\,P(e_j)}. \tag{14.1}$$

$P(e_i \mid B)$ is called the a posteriori probability and the $P(e_j)$ are a priori probabilities. From an intrusion detection point of view, the space Ω defines a collection of events that occur with different probabilities for normal and intrusive behavior. Define a hypothesis I: "there is an intrusion." The complement Ī reads "there is NO intrusion." Clearly, $P(I \cup \bar{I}) = 1$. From Equation (14.1), we can obtain

$$P(I \mid e) = \frac{P(I, e)}{P(e)} = \frac{P(e \mid I)\,P(I)}{P(e \mid I)\,P(I) + P(e \mid \bar{I})\,P(\bar{I})}. \tag{14.2}$$


To characterize the evolution of the validity of hypothesis I, we introduce four parameters: a priori and a posteriori odds, and positive and negative likelihoods. The a priori odds for I are the ratio

$$O(I) = \frac{P(I)}{P(\bar{I})}.$$

The a posteriori odds are defined as

$$O(I \mid e) = \frac{P(I \mid e)}{P(\bar{I} \mid e)}.$$

An odds ratio O(I) is a positive rational. For a hypothesis I such that $P(I) = P(\bar{I}) = 0.5$, the a priori odds $O(I) = 1$. If $O(I) > 1$, then $P(I) > P(\bar{I})$. If $O(I) < 1$, then $P(I) < P(\bar{I})$. The a posteriori odds provide a quantitative measurement of the validity of hypothesis I after the observation of a random event e.

The positive likelihood is the ratio

$$S(e \mid I) = \frac{P(e \mid I)}{P(e \mid \bar{I})},$$

and similarly the negative likelihood is the ratio

$$N(e \mid I) = \frac{P(\bar{e} \mid I)}{P(\bar{e} \mid \bar{I})}.$$

The positive likelihood characterizes the event e in terms of its relation to intrusion. If $S(e \mid I) > 1$, then the event e confirms the hypothesis I; otherwise the event is consistent with the anti-hypothesis Ī. If $S(e \mid I) \approx 1$, the event is neutral.

Consider some properties of these parameters.

 

Theorem 47. Given an event space Ω and an event $e \in \Omega$. Then

$$O(I \mid e) = S(e \mid I)\,O(I), \tag{14.3}$$

where I is the hypothesis that there is an intrusion.

Proof. According to the definitions, we have the following sequence of equations:

$$S(e \mid I)\,O(I) = \frac{P(e \mid I)}{P(e \mid \bar{I})} \cdot \frac{P(I)}{P(\bar{I})} = \frac{P(e, I)}{P(e, \bar{I})} = \frac{P(I \mid e)\,P(e)}{P(\bar{I} \mid e)\,P(e)} = \frac{P(I \mid e)}{P(\bar{I} \mid e)} = O(I \mid e),$$

which proves the theorem. ⊓⊔


Theorem 48. Assume that there is a collection of events $e_1, \dots, e_n$ such that $P(e_1, \dots, e_n \mid I) = \prod_{i=1}^{n} P(e_i \mid I)$ and $P(e_1, \dots, e_n \mid \bar{I}) = \prod_{i=1}^{n} P(e_i \mid \bar{I})$; then

$$O(I \mid e_1, \dots, e_n) = O(I) \prod_{i=1}^{n} S(e_i \mid I). \tag{14.4}$$

Proof. Consider the following sequence of transformations:

$$O(I \mid e_1, \dots, e_n) = \frac{P(I \mid e_1, \dots, e_n)}{P(\bar{I} \mid e_1, \dots, e_n)} = \frac{P(I, e_1, \dots, e_n)}{P(\bar{I}, e_1, \dots, e_n)} = \frac{P(e_1, \dots, e_n \mid I)\,P(I)}{P(e_1, \dots, e_n \mid \bar{I})\,P(\bar{I})} = O(I) \prod_{i=1}^{n} \frac{P(e_i \mid I)}{P(e_i \mid \bar{I})} = O(I) \prod_{i=1}^{n} S(e_i \mid I),$$

which proves Equation (14.4). If one observes that

$$\frac{P(e_i \mid I)}{P(e_i \mid \bar{I})} = \frac{P(I \mid e_i)}{P(\bar{I} \mid e_i)} \cdot \frac{P(\bar{I})}{P(I)} = \frac{O(I \mid e_i)}{O(I)},$$

then Equation (14.4) can be rewritten as

$$O(I \mid e_1, \dots, e_n) = O(I)^{-(n-1)} \prod_{i=1}^{n} O(I \mid e_i). \qquad ⊓⊔$$

Consider an example. Let the space Ω = {e0, e1} = {0, 1}. Time is incorporated by defining a sequence of random variables E1, E2, … for the corresponding time instances. We assume that users generate events at every time instance i, so P(Ei = e) is the probability that the user generated event e ∈ Ω at time i. We also assume that P(I) = P(Ī) = 1/2 and P(E1 = 0 | I) = P(E1 = 1 | I) = 1/2, P(E1 = 0 | Ī) = P(E1 = 1 | Ī) = 1/2.

Normal behavior is characterized by the following conditional probabilities:

$$P_{\bar{I}}(E_{i+1} = 0 \mid E_i = 0) = 1/4,$$
$$P_{\bar{I}}(E_{i+1} = 1 \mid E_i = 0) = 3/4,$$
$$P_{\bar{I}}(E_{i+1} = 1 \mid E_i = 1) = 3/4,$$
$$P_{\bar{I}}(E_{i+1} = 0 \mid E_i = 1) = 1/4,$$

for i = 1, 2, …. Intrusive behavior differs from the normal one and is characterized by the following conditional probabilities:

$$P_I(E_{i+1} = 0 \mid E_i = 0) = 1/4 + \varepsilon,$$
$$P_I(E_{i+1} = 1 \mid E_i = 0) = 3/4 - \varepsilon,$$
$$P_I(E_{i+1} = 1 \mid E_i = 1) = 3/4 - \varepsilon,$$
$$P_I(E_{i+1} = 0 \mid E_i = 1) = 1/4 + \varepsilon,$$

for i = 1, 2, ….

The initial odds $O(I) = P(I)/P(\bar{I}) = 1$ can be computed from the assumed probability distribution for I. Similarly O(I | E1 = 0) = O(I | E1 = 1) = 1. In fact, the probability P(I) can be selected arbitrarily, and the IDS uses the initial odds as a benchmark for further evaluation of the validity of the hypothesis I. Compute the following probabilities:

$$P(E_2 = 0 \mid \bar{I}) = P(E_1 = 0 \mid \bar{I})\,P_{\bar{I}}(E_2 = 0 \mid E_1 = 0) + P(E_1 = 1 \mid \bar{I})\,P_{\bar{I}}(E_2 = 0 \mid E_1 = 1) = \frac{1}{4}$$
$$P(E_2 = 1 \mid \bar{I}) = 1 - P(E_2 = 0 \mid \bar{I}) = \frac{3}{4}$$
$$P(E_2 = 0 \mid I) = P(E_1 = 0 \mid I)\,P_I(E_2 = 0 \mid E_1 = 0) + P(E_1 = 1 \mid I)\,P_I(E_2 = 0 \mid E_1 = 1) = \frac{1}{2}\Bigl(\frac{1}{4} + \varepsilon\Bigr) + \frac{1}{2}\Bigl(\frac{1}{4} + \varepsilon\Bigr) = \frac{1}{4} + \varepsilon$$
$$P(E_2 = 1 \mid I) = 1 - P(E_2 = 0 \mid I) = \frac{3}{4} - \varepsilon$$
$$P(E_2 = 0) = P(E_2 = 0 \mid I)\,P(I) + P(E_2 = 0 \mid \bar{I})\,P(\bar{I}) = \frac{1 + 2\varepsilon}{4}$$
$$P(E_2 = 1) = 1 - P(E_2 = 0) = \frac{3 - 2\varepsilon}{4}$$
$$P(\bar{I} \mid E_2 = 0) = \frac{P(E_2 = 0 \mid \bar{I})\,P(\bar{I})}{P(E_2 = 0)} = \frac{1}{2 + 4\varepsilon}$$
$$P(I \mid E_2 = 0) = 1 - \frac{1}{2 + 4\varepsilon} = \frac{1 + 4\varepsilon}{2 + 4\varepsilon}$$
$$P(\bar{I} \mid E_2 = 1) = \frac{P(E_2 = 1 \mid \bar{I})\,P(\bar{I})}{P(E_2 = 1)} = \frac{3}{6 - 4\varepsilon}$$
$$P(I \mid E_2 = 1) = 1 - \frac{3}{6 - 4\varepsilon} = \frac{3 - 4\varepsilon}{6 - 4\varepsilon}$$

A posteriori odds are

$$O(\bar{I} \mid E_2 = 0) = \frac{1}{1 + 4\varepsilon} \quad \text{and} \quad O(\bar{I} \mid E_2 = 1) = \frac{3}{3 - 4\varepsilon}.$$

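The closed-form expressions of the example can be sanity-checked with exact rational arithmetic; the sketch below fixes an arbitrary ε = 1/8 (a choice made here for illustration, not in the text) and recomputes P(E2 = 0), the posterior P(I | E2 = 0), and the a posteriori odds:

```python
from fractions import Fraction

# Numerical check of the worked example: two events {0, 1}, P(I) = 1/2,
# normal transition probability 1/4 for E2 = 0, intrusive one shifted
# by eps. The value eps = 1/8 is an arbitrary concrete choice.

eps = Fraction(1, 8)
p_I = Fraction(1, 2)                      # P(I) = P(not I) = 1/2
p_e2_0_given_I = Fraction(1, 4) + eps     # intrusive chain, both prior states
p_e2_0_given_nI = Fraction(1, 4)          # normal chain

p_e2_0 = p_e2_0_given_I * p_I + p_e2_0_given_nI * (1 - p_I)
post_I = p_e2_0_given_I * p_I / p_e2_0    # P(I | E2 = 0) via Bayes

assert p_e2_0 == (1 + 2 * eps) / 4        # matches (1 + 2eps)/4
assert post_I == (1 + 4 * eps) / (2 + 4 * eps)

odds_nI = (1 - post_I) / post_I           # O(not I | E2 = 0)
print(odds_nI == 1 / (1 + 4 * eps))       # True: matches the closed form
```

With exact fractions there is no rounding, so each closed-form expression is verified identically rather than approximately.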