
2.10.- TCINDEX classifier

 

The tcindex classifier was designed specifically to implement the Differentiated Services architecture on Linux. It is explained in Differentiated Services on Linux [10] and in the Linux Advanced Routing & Traffic Control HOWTO [7], but in both documents, in my modest opinion, the explanation is highly technical and somewhat confusing, leaving the reader with even more questions and doubts once the reading is finished.
The tcindex classifier bases its behavior on the skb->tc_index field of the sk_buff structure that holds each packet. This buffer is created for every packet entering, or being generated on, the Linux box. Because of this, the tcindex classifier must be used only with those queuing disciplines that can recognize and set the skb->tc_index field; these are the GRED, DSMARK and INGRESS queuing disciplines.
 
I think the easiest way to study the tcindex classifier is to analyze which qdisc/class/filter writes the skb->tc_index field and which one reads it. Let's start by copying figure 2.9.3 from the previous section, renumbered here as 2.10.1, to be used as a reference:
 

 
Next we have to go to the C code (sorry, but it's better) to poke around. We will use this procedure:
 
  1. We state an asseveration.
  2. We present the C code that supports it.
  3. We relate it to the figure above.
  4. We present the tc command required to get that behavior.
 
   

 

Asseveration: The skb->tc_index value is set by the dsmark queuing discipline when the set_tc_index parameter is set. The skb->iph->tos, which contains the packet's DS field value, is copied onto the skb->tc_index field.

In the figure, this process is represented by the big red vertical line going from the top (skb->iph->tos field) to the bottom (skb->tc_index field) at the dsmark entrance. As an example, the following tc command produces this behavior:

Asseveration: The skb->tc_index field value is read by the tcindex classifier; the filter then applies a bitwise operation to a copy of the value (that is, the original value is not modified), and the result of this operation is passed down to the filter elements to look for a match. When an element matches, the class identifier of that filter element is returned to the queuing discipline as the resulting class identifier.

Okay, the classifier lookup starts by applying the following bitwise operation to the skb->tc_index field:

( skb->tc_index & p.mask ) >> p.shift

mask and shift are integer values that we have to define in the main filter. Let's look at a set of commands to better understand this complicated part:

The first command sets up the DSMARK queuing discipline. Because set_tc_index is set, the packet's DS field value is copied onto the skb->tc_index field when the packet enters the qdisc. The second command is the main filter. For this example, it has three elements (the next three commands). In the main filter we define mask=0xfc and shift=2. This filter reads the skb->tc_index value (containing the packet's DS field value), applies the bitwise operation, and passes the resulting value down to the elements to look for a match. pass_on means: if a match is not found in this element, continue with the next one.
Let's suppose that a packet with its DS field marked as 0x30 (corresponding to the AF12 class) enters the qdisc. The value 0x30 is copied by dsmark from the packet's DS field onto the skb->tc_index field. Next the classifier reads this field and applies the bitwise operation to the value. What happens?
  ( skb->tc_index & p.mask ) >> p.shift = ( 0x30 & 0xfc ) >> 2 =
  ( 00110000 & 11111100 ) >> 2 = 00110000 >> 2 = 00001100 = 0xc
The final value after the bitwise operation is 0xc, which corresponds to decimal 12. This value is passed down to the filter elements. The first element doesn't match because it matches decimal value 10 (handle 10 tcindex).
The next element does match because it matches decimal value 12 (handle 12 tcindex); the class identifier returned to the queuing discipline is therefore 1:112 (classid 1:112). In the figure, this process is represented by the big blue vertical line going from the bottom (skb->tc_index field) to the green filter elements (to get a class identifier), and then from the green filter elements to the yellow class identifier returned by the classifier to the queuing discipline. Now let's see what the dsmark queuing discipline does with the class identifier it gets back.
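The lookup just described can be modelled in a few lines of Python. This is only a sketch of the arithmetic, not the kernel code: the function name tcindex_lookup and the table of elements are invented for illustration, and the class identifiers for handles 10 and 14 are assumptions (the text only states that handle 12 returns 1:112).

```python
# Illustrative model of the tcindex lookup; all names are hypothetical.
MASK = 0xFC    # "mask" parameter of the main filter
SHIFT = 2      # "shift" parameter of the main filter

# handle -> classid. Only the entry for handle 12 comes from the text;
# the other two are assumed for the sake of the example.
ELEMENTS = {10: "1:111", 12: "1:112", 14: "1:113"}


def tcindex_lookup(tc_index, mask=MASK, shift=SHIFT, elements=ELEMENTS):
    """Return (key, classid); classid is None when no element matches."""
    key = (tc_index & mask) >> shift   # computed on a copy, tc_index is untouched
    return key, elements.get(key)


key, classid = tcindex_lookup(0x30)    # AF12 packet, DS field 0x30
print(hex(key), key, classid)          # 0xc 12 1:112
```

A DS field value with no matching element simply yields no class identifier in this model.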
 
Asseveration: the minor part value of the class identifier returned to the DSMARK queuing discipline by the tcindex classifier is copied back by the queuing discipline onto the skb->tc_index field.

Well, fellows, the class identifier finally travels back (it likes to travel, doesn't it?) to the skb->tc_index field. But be careful: the value copied back is the class identifier's minor value, that is, 112 from the classid 1:112. It is very important to interpret this ubiquitous value correctly. 112 doesn't mean decimal 112; class identifiers are written in hexadecimal, so each of these digits is a nibble (4 bits). Then 112 is really 000100010010, or, separating the nibbles, 0001-0001-0010. This is now the new value contained in the skb->tc_index field.
In the figure, this process is represented by the big green vertical line going from the yellow classid rectangle to the skb->tc_index field at the bottom.
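To make the nibble interpretation concrete, here is a small Python sketch. The helper names classid_minor and nibbles are made up for this illustration; the only fact relied upon is that tc writes class identifiers in hexadecimal.

```python
def classid_minor(classid):
    """Return the minor part of a 'major:minor' class identifier as an int."""
    _major, minor = classid.split(":")
    return int(minor, 16)              # tc class identifiers are hexadecimal


def nibbles(value, count=3):
    """Format value as `count` 4-bit groups separated by dashes."""
    bits = format(value, "0{}b".format(count * 4))
    return "-".join(bits[i:i + 4] for i in range(0, len(bits), 4))


tc_index = classid_minor("1:112")
print(tc_index, hex(tc_index), nibbles(tc_index))   # 274 0x112 0001-0001-0010
```

So the value that ends up in skb->tc_index is 0x112, which is decimal 274 and not decimal 112.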
 
Asseveration: In the DSMARK queuing discipline, the skb->tc_index value is used as an index into the internal table of mask-value pairs to select the pair to be used. The selected pair is then used, when dequeuing, to modify the packet's DS field value with a combined and-or bitwise operation.
We already saw this in the previous section. Let's first show the C code:

The commands above are taken from the afcbq example of the Differentiated Services on Linux distribution (we will look at every example in the DS on Linux distribution in detail, but later on). For now, to explain this part, we will use a different set of commands:
 

 
This example is not as clever as it could be but, for what we are trying to explain, it is good enough. The first command sets up the dsmark queuing discipline 1:0. The next three commands define the classes of the discipline. Then it is the turn of the main filter. As we saw above, this filter reads the skb->tc_index field containing the packet's DS field value and, after doing a bitwise operation on a copy of it, passes the result down to the filter elements.
 
These commands in fact change the AF class of packets marked as AF1x to AF2x. This is what is called re-marking in Differentiated Services terminology. The class is changed while preserving the rest of the bits (drop precedence and ECN bits). When an AF1x packet enters, its DS field is copied onto the skb->tc_index field by the dsmark queuing discipline, because the set_tc_index parameter is set. Next the main filter is invoked. Let's suppose that an AF12 packet is entering. After the copy, the new skb->tc_index value will be 0x30.
 
   

 

The main filter takes a copy of this value (0x30) and applies its bitwise operation with mask=0xfc and shift=2; then we have:
  (0x30 & 0xfc) >> 2 = (00110000 & 11111100) >> 2 = 00110000 >> 2 = 00001100 = 0xc = 12
Great!! The final value is decimal 12. This value is passed down to the filter elements. The second element matches and the class id value 1:2 is returned to the dsmark queuing discipline. As we saw above, dsmark immediately strips the class id major value and copies the minor value back onto the skb->tc_index field. The new value of skb->tc_index is now decimal 2.
Now it is the turn of the dsmark queuing discipline again, when the packet is leaving the discipline. The discipline reads the skb->tc_index field in the packet's buffer. The value is decimal 2. With this value it indexes its own internal table, which was built by us with the three commands following the queuing discipline creation. At index 2, the table contains mask=0x1f and value=0x40. The example is silly because all classes have the same mask-value pair. But, anyway, I'm tired and I don't want to think too much, just enough to explain how this stuff does its work.
Finally the dsmark queuing discipline does the following operation over the AF12 marked packet.
  (0x30 & 0x1f) | 0x40 = (00110000 & 00011111) | 01000000 =
  00010000 | 01000000 = 01010000 = 0x50
Okay, 0x50 is the value that corresponds to the class AF22. The packet enters as class-drop precedence AF12 and departs as class-drop precedence AF22.
The really important thing to understand here is that dsmark reads the skb->tc_index value to select a class, that is, an index into the internal table of mask-value pairs, in order to get the pair that is used later on to update the DS field of the packet being dequeued. This entire process is represented by the big purple lines and arrows and by the internal dsmark table shown to the right of figure 2.10.1 above.
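The whole re-marking path can be traced with a short Python model. It is a sketch under stated assumptions, not the dsmark implementation: the names are invented, only the "handle 12 returns 1:2" element and the mask=0x1f/value=0x40 pair are taken from the text, the other two filter elements and class numbers are assumed, and leaving unmatched packets untouched is a simplification of this sketch.

```python
# Hypothetical model of dsmark + tcindex re-marking AF1x to AF2x.
FILTER_MASK, FILTER_SHIFT = 0xFC, 2

# handle -> classid; handles 10 and 14 are assumed, handle 12 is from the text.
ELEMENTS = {10: "1:1", 12: "1:2", 14: "1:3"}

# dsmark internal table: index -> (mask, value); every class uses the same pair.
DSMARK_TABLE = {1: (0x1F, 0x40), 2: (0x1F, 0x40), 3: (0x1F, 0x40)}


def dsmark_remark(ds_field):
    """Return the DS field a packet leaves with."""
    tc_index = ds_field                                # set_tc_index on enqueue
    key = (tc_index & FILTER_MASK) >> FILTER_SHIFT     # tcindex main filter
    classid = ELEMENTS.get(key)
    if classid is None:
        return ds_field                                # simplification: no re-marking
    tc_index = int(classid.split(":")[1], 16)          # minor part copied back
    mask, value = DSMARK_TABLE[tc_index]               # table lookup on dequeue
    return (ds_field & mask) | value


for name, ds in (("AF11", 0x28), ("AF12", 0x30), ("AF13", 0x38)):
    print(name, hex(ds), "->", hex(dsmark_remark(ds)))
# AF11 0x28 -> 0x48
# AF12 0x30 -> 0x50
# AF13 0x38 -> 0x58
```

Because the mask 0x1f keeps the five rightmost bits, the drop precedence and the ECN bits of the packet survive the re-marking; only the three class bits are replaced.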
 
Asseveration: The 4 rightmost bits of the skb->tc_index field are used by the GRED queuing discipline to select a RED virtual queue (VQ) for the packet entering the discipline. If the value (4 rightmost bits) is outside the range of configured virtual queues, the skb->tc_index field is set (it shouldn't be) to the number of the default virtual queue by the GRED queuing discipline.

In this case, we don't have to explicitly put the packet into a virtual queue using a filter. It's enough to set the skb->tc_index field of the packet's buffer to the number of the virtual queue we want to select. To set the skb->tc_index field we can use a dsmark qdisc and its attached filter, or we can use the ingress queuing discipline, as will be explained later on. Let's see an example of this configuration:

These commands show a GRED configuration using DSMARK to select the virtual queue. The first command creates the dsmark queuing discipline. The packet's DS field will be copied onto the skb->tc_index field on entrance. The next command sets up the main filter. Packets with DS field values corresponding to classes AF11, AF12 and AF13 will generate values 10, 12 and 14 respectively, after the (DS field & 0xfc) >> 2 bitwise operation is applied.
These values are passed down to the filter elements, which are set up by the next three commands. Class ids 1:1, 1:2 and 1:3 are returned for classes AF11, AF12 and AF13 respectively. When the dsmark queuing discipline receives the returned class ids, it sets skb->tc_index to their minor values. This way, skb->tc_index is set to 1, 2 or 3 for packets of class AF11, AF12 or AF13 respectively. It's great!! We have already set the skb->tc_index field for the gred queuing discipline.
The next command sets up the main gred queuing discipline with the dsmark queuing discipline as its parent. The last three commands set up the gred virtual queues number 1, 2 and 3 respectively. But we don't have to worry about how to put packets into the gred virtual queues. GRED does the work by itself, reading the skb->tc_index value and placing the packets into the corresponding virtual queues.
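The virtual queue selection can be sketched in Python as follows. This is a model of the behavior described above, not the GRED source: the function and constant names are chosen for this sketch, and the number of virtual queues (4) and the default virtual queue (0) used in the example calls are assumptions.

```python
GRED_VQ_MASK = 0x0F          # the 4 rightmost bits of skb->tc_index


def ds_to_tc_index(ds_field, mask=0xFC, shift=2):
    """dsmark + tcindex part: DS field -> key -> minor of classid 1:<n>.

    Only AF11, AF12 and AF13 are handled, as in the example.
    """
    key = (ds_field & mask) >> shift
    minors = {10: 1, 12: 2, 14: 3}   # AF11, AF12, AF13 -> 1:1, 1:2, 1:3
    return minors[key]


def gred_select_vq(tc_index, num_vqs, default_vq):
    """Pick the virtual queue, falling back to the default when out of range."""
    vq = tc_index & GRED_VQ_MASK
    if vq >= num_vqs:
        vq = default_vq
    return vq


for name, ds in (("AF11", 0x28), ("AF12", 0x30), ("AF13", 0x38)):
    print(name, gred_select_vq(ds_to_tc_index(ds), num_vqs=4, default_vq=0))
# AF11 1
# AF12 2
# AF13 3
```

Note that only the rightmost nibble counts: a skb->tc_index of 0x112 would also select virtual queue 2 in this model.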
 
Our last asseveration: When using the INGRESS queuing discipline, the skb->tc_index field is set to the minor part of the class identifier returned by the attached filter.

The ingress queuing discipline is not a queue at all. It just invokes the attached classifier and, when the class identifier is returned, extracts the minor part from it and copies the result onto the skb->tc_index field.
The ingress qdisc's classifier can be a u32 classifier or a fw classifier. The tcindex classifier cannot be used because it requires the skb->tc_index field to be already set, and since that setting is done by the ingress queuing discipline itself, the initial skb->tc_index value will be zero. Excluding the tcindex classifier, I suppose any kind of classifier can be attached to the ingress queuing discipline. Whether the classifier is u32 or fw, you can also police the incoming flows at the same time by implementing a policer in the classifier. Because this is especially important for the Differentiated Services architecture, we are going to explain a little more about policing in the next section. For now, we are going to show two examples, one using the fw classifier and one using the u32 classifier.

In this example we use the fw classifier. Traffic enters through the eth1 interface and leaves the router through the eth0 interface. The ingress queuing discipline is configured on interface eth1. On this interface we previously set up iptables to mark every incoming flow with fw mark=2, and then flows from network 10.2.0.0/24 with fw mark=1.
Using two filter elements we set the skb->tc_index field to the value 1 (flowid :1) for packets with fw mark 1 (handle 1 fw), and to the value 2 (flowid :2) for packets with fw mark 2 (handle 2 fw).
Finally we configure a dsmark queuing discipline on the outgoing interface eth0. Packets leaving the router have their DS field marked by applying the bitwise operation ((DS & mask) | value) of the class selected by skb->tc_index. Thus packets from network 10.2.0.0/24 (identified by skb->tc_index=1, classid 1:1) are marked as 0x88 (which corresponds to DS class AF41), and the rest of the traffic (identified by skb->tc_index=2) is marked as 0x90 (which corresponds to DS class AF42).
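The fw path can be followed with this Python sketch. It models the three stages (iptables mark, ingress fw filter, dsmark on eth0) with invented function names; the dsmark mask of 0x3, which would keep the ECN bits, is an assumption, since the text only gives the resulting values 0x88 and 0x90.

```python
import ipaddress

MARKED_NET = ipaddress.ip_network("10.2.0.0/24")

# fw filter elements on the ingress qdisc: fw mark -> minor of flowid
FW_ELEMENTS = {1: 1, 2: 2}

# dsmark table on eth0: index -> (mask, value); mask 0x3 is assumed.
DSMARK_TABLE = {1: (0x3, 0x88), 2: (0x3, 0x90)}


def fw_mark(src_ip):
    """iptables stage: mark everything with 2, then 10.2.0.0/24 with 1."""
    mark = 2
    if ipaddress.ip_address(src_ip) in MARKED_NET:
        mark = 1
    return mark


def mark_packet(src_ip, ds_field):
    """Return the DS field the packet carries when it leaves through eth0."""
    tc_index = FW_ELEMENTS[fw_mark(src_ip)]   # ingress copies the flowid minor
    mask, value = DSMARK_TABLE[tc_index]      # dsmark indexes its table
    return (ds_field & mask) | value


print(hex(mark_packet("10.2.0.7", 0x00)))      # 0x88
print(hex(mark_packet("192.168.1.5", 0x00)))   # 0x90
```

The point of the sketch is that skb->tc_index travels with the packet from the incoming interface to the outgoing one, so a decision taken at ingress can drive the marking at egress.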
The u32 classifier is used in a similar way, but in this case we don't need iptables. For example:
 

 
As you can see, this configuration is even simpler than the one using the fw classifier. We configure the ingress queuing discipline and then, using two filter elements attached to it, we set the skb->tc_index field to the value 1 (flowid :1) for packets with the DS field set to 0x28 (match ip tos 0x28 0xfc), ignoring the ECN bits in the comparison, and to the value 2 (flowid :2) for packets with the DS field set to 0x30 (match ip tos 0x30 0xfc), again ignoring the ECN bits. These packets happen to belong to the Differentiated Services classes AF11 and AF12, respectively.
 
Our setup is a kind of "packet promotion" configuration. The dsmark queuing discipline marks packets leaving the router with their skb->tc_index field set to 1 (classid 1:1), i.e., AF11 packets, as 0xb8 (which corresponds to DS class EF), and packets leaving the router with their skb->tc_index field set to 2 (classid 1:2), i.e., AF12 packets, as 0x28 (which corresponds to DS class AF11). Thus AF11 packets are promoted to EF and AF12 packets are promoted to AF11.
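A Python sketch of this promotion follows. The u32 match is modelled as a masked comparison on the TOS byte; the names are invented, the dsmark mask of 0x3 is an assumption (the text only gives the values 0xb8 and 0x28), and leaving unmatched packets unchanged is a simplification of this sketch.

```python
# u32 elements on the ingress qdisc: (value, mask, flowid minor),
# from "match ip tos 0x28 0xfc flowid :1" and "match ip tos 0x30 0xfc flowid :2".
U32_ELEMENTS = [(0x28, 0xFC, 1), (0x30, 0xFC, 2)]

# dsmark table on the outgoing interface: index -> (mask, value); mask assumed.
DSMARK_TABLE = {1: (0x3, 0xB8), 2: (0x3, 0x28)}


def u32_classify(tos):
    """Return the flowid minor of the first matching element, or 0 if none."""
    for value, mask, minor in U32_ELEMENTS:
        if tos & mask == value:        # ECN bits are masked out of the match
            return minor
    return 0                           # skb->tc_index stays at its initial zero


def promote(tos):
    """Return the DS field after ingress classification and egress marking."""
    tc_index = u32_classify(tos)
    if tc_index not in DSMARK_TABLE:
        return tos                     # simplification: unmatched traffic untouched
    mask, value = DSMARK_TABLE[tc_index]
    return (tos & mask) | value


print(hex(promote(0x28)), hex(promote(0x30)))   # 0xb8 0x28
```

With the assumed mask, an AF11 packet carrying ECN bits (for example 0x29) would leave as 0xb9, that is, EF with the same ECN bits.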
 
   

 

Well, fellows, with this explanation we finish with the TCINDEX classifier. The next section will explore a little of the filter's policing capability.

   

