Let's understand the LACP state machine using a Linux bond


First, let's bring up a Linux bond with LACP (bonding mode 4, i.e. 802.3ad).

Server side config:
nmcli con add type bond con-name bond0 ifname bond0 mode 802.3ad ip4 6.6.6.60/24
nmcli con mod id bond0 bond.options mode=802.3ad,lacp_rate=slow
nmcli con add type bond-slave ifname enp4s0f0 con-name enp4s0f0 master bond0
nmcli con add type bond-slave ifname enp4s0f1 con-name enp4s0f1 master bond0
nmcli con up bond0

Juniper switch side config:
set chassis aggregated-devices ethernet device-count 3
set interfaces ae4 aggregated-ether-options minimum-links 1

set interfaces ae4 unit 0 family ethernet-switching interface-mode trunk
set interfaces ae4 unit 0 family ethernet-switching vlan members vlan415
set interfaces ae4 aggregated-ether-options lacp active
set interfaces ae4 aggregated-ether-options lacp periodic fast

set interfaces et-0/0/1:2 ether-options 802.3ad ae4
set interfaces et-0/0/1:3 ether-options 802.3ad ae4

Check the bond status:

[root@compute-0 network-scripts]# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v4.18.0-305.45.1.el8_4.x86_64

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
Peer Notification Delay (ms): 0

802.3ad info
LACP rate: fast
Min links: 0
Aggregator selection policy (ad_select): stable
System priority: 65535
System MAC address: 04:3f:72:d9:c0:49
Active Aggregator Info:
	Aggregator ID: 1
	Number of ports: 2
	Actor Key: 21
	Partner Key: 5
	Partner Mac Address: c8:fe:6a:f2:44:00

Slave Interface: enp4s0f1
MII Status: up
Speed: 25000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 04:3f:72:d9:c0:49
Slave queue ID: 0
Aggregator ID: 1
Actor Churn State: none
Partner Churn State: none
Actor Churned Count: 0
Partner Churned Count: 0
details actor lacp pdu:
    system priority: 65535
    system mac address: 04:3f:72:d9:c0:49
    port key: 21
    port priority: 255
    port number: 1
    port state: 63 <<<<<<<<<<<<<<<<
details partner lacp pdu:
    system priority: 127
    system mac address: c8:fe:6a:f2:44:00
    oper key: 5
    port priority: 127
    port number: 7
    port state: 63  <<<<<<<<<<<<<<<<

Slave Interface: enp4s0f0
MII Status: up
Speed: 25000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 04:3f:72:d9:c0:48
Slave queue ID: 0
Aggregator ID: 1
Actor Churn State: monitoring
Partner Churn State: monitoring
Actor Churned Count: 0
Partner Churned Count: 0
details actor lacp pdu:
    system priority: 65535
    system mac address: 04:3f:72:d9:c0:49
    port key: 21
    port priority: 255
    port number: 2
    port state: 63  <<<<<<<<<<<<<<<<
details partner lacp pdu:
    system priority: 127
    system mac address: c8:fe:6a:f2:44:00
    oper key: 5
    port priority: 127
    port number: 6
    port state: 63 <<<<<<<<<<<<<<<

Let's look at a packet capture on one of the member interfaces.

Packet capture:
[root@compute-0 network-scripts]# tcpdump -nnvei enp4s0f0
dropped privs to tcpdump
tcpdump: listening on enp4s0f0, link-type EN10MB (Ethernet), capture size 262144 bytes
12:19:56.578296 c8:fe:6a:f2:44:c4 > 01:80:c2:00:00:02, ethertype Slow Protocols (0x8809), length 124: LACPv1, length 110
	Actor Information TLV (0x01), length 20
	  System c8:fe:6a:f2:44:00, System Priority 127, Key 5, Port 6, Port Priority 127
	  State Flags [Activity, Timeout, Aggregation, Synchronization, Collecting, Distributing]
	Partner Information TLV (0x02), length 20
	  System 04:3f:72:d9:c0:49, System Priority 65535, Key 21, Port 2, Port Priority 255
	  State Flags [Activity, Timeout, Aggregation, Synchronization, Collecting, Distributing]
	Collector Information TLV (0x03), length 16
	  Max Delay 0
	Terminator TLV (0x00), length 0
12:19:56.914331 04:3f:72:d9:c0:48 > 01:80:c2:00:00:02, ethertype Slow Protocols (0x8809), length 124: LACPv1, length 110
	Actor Information TLV (0x01), length 20
	  System 04:3f:72:d9:c0:49, System Priority 65535, Key 21, Port 2, Port Priority 255
	  State Flags [Activity, Timeout, Aggregation, Synchronization, Collecting, Distributing]
	Partner Information TLV (0x02), length 20
	  System c8:fe:6a:f2:44:00, System Priority 127, Key 5, Port 6, Port Priority 127
	  State Flags [Activity, Timeout, Aggregation, Synchronization, Collecting, Distributing]
	Collector Information TLV (0x03), length 16
	  Max Delay 0
	Terminator TLV (0x00), length 0

So, what am I trying to explain here? This old stuff? Well, yes. It is old, but it is not discussed much, so let's discuss it.

How can you tell that your LACP session is negotiated and up and running? Which LACP state flags have both sides agreed upon? And if the negotiation failed, in which state did it fail? You may have noticed the “port state: 63” lines in the bond output. That is where all the answers are.
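To pull just those values out of the bond status, here is a minimal Python sketch. It assumes the text layout of /proc/net/bonding/bond0 shown above (the sample is shortened), and the function name port_states is my own, not part of any tool.

```python
# Shortened sample in the layout of /proc/net/bonding/bond0 shown above.
SAMPLE = """\
Slave Interface: enp4s0f1
details actor lacp pdu:
    port state: 63
details partner lacp pdu:
    port state: 63

Slave Interface: enp4s0f0
details actor lacp pdu:
    port state: 63
details partner lacp pdu:
    port state: 63
"""


def port_states(text):
    """Return {slave: {"actor": int, "partner": int}} from bonding status text."""
    result = {}
    slave = None
    role = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Slave Interface:"):
            slave = line.split(":", 1)[1].strip()
            result[slave] = {}
            role = None
        elif line.startswith("details actor lacp pdu"):
            role = "actor"
        elif line.startswith("details partner lacp pdu"):
            role = "partner"
        elif line.startswith("port state:") and slave and role:
            # Take only the first token so trailing annotations are ignored.
            result[slave][role] = int(line.split(":", 1)[1].split()[0])
    return result


states = port_states(SAMPLE)
for name, pair in states.items():
    print(name, pair)
```

On a live host you could pass the contents of /proc/net/bonding/bond0 instead of SAMPLE.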

Look at the code snippet below from https://elixir.bootlin.com/linux/v4.3/source/drivers/net/bonding/bond_3ad.c#L51

/* Port state definitions (43.4.2.2 in the 802.3ad standard) */
#define AD_STATE_LACP_ACTIVITY   0x1
#define AD_STATE_LACP_TIMEOUT    0x2
#define AD_STATE_AGGREGATION     0x4
#define AD_STATE_SYNCHRONIZATION 0x8
#define AD_STATE_COLLECTING      0x10
#define AD_STATE_DISTRIBUTING    0x20
#define AD_STATE_DEFAULTED       0x40
#define AD_STATE_EXPIRED         0x80

Now look at the State Flags lines in the packet capture again. In both PDUs, the partner and the actor carry the flags “[Activity, Timeout, Aggregation, Synchronization, Collecting, Distributing]”.

Going by the values in the code snippet (which are hex) and summing their decimal equivalents, Distributing (32) + Collecting (16) + Synchronization (8) + Aggregation (4) + Timeout (2) + Activity (1) gives 63. These are the flags we see in a successfully negotiated session, and 63 is the port state shown in the bond output.
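The same arithmetic can be run in reverse. Below is a small Python sketch that decodes a port state number into flag names. The bit values are the AD_STATE_* defines quoted above; the function name decode_port_state is only for illustration.

```python
# Bit values copied from the AD_STATE_* defines quoted above.
FLAGS = [
    (0x01, "Activity"),
    (0x02, "Timeout"),
    (0x04, "Aggregation"),
    (0x08, "Synchronization"),
    (0x10, "Collecting"),
    (0x20, "Distributing"),
    (0x40, "Defaulted"),
    (0x80, "Expired"),
]


def decode_port_state(value):
    """Return the names of all flags set in a port state value."""
    return [name for bit, name in FLAGS if value & bit]


# 63 is the value from the bond output above.
print(decode_port_state(63))
# 0xC7 has Activity, Timeout, Aggregation, Defaulted and Expired set.
print(decode_port_state(0xC7))
```

Any value other than 63 on a port that should be fast and active is worth a closer look, and the decoded names tell you which flag is missing or unexpectedly set.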

Hmm, what are these flags for? Here is a bit more about them.

The LACP_Activity flag indicates a participant’s intent to transmit periodically to detect and maintain aggregates.
The LACP_Timeout flag indicates that the participant wishes to receive frequent periodic transmissions and will aggressively time out received information.
The Aggregation flag indicates that the participant will allow the link to be used as part of an aggregate.
The Synchronization flag indicates that the transmitting participant’s mux component is in sync with the system id and key information transmitted.
The Collecting flag indicates that the participant’s collector is on.
The Distributing flag indicates that the participant’s distributor is on.

There are two more states, Expired and Defaulted. However, before going there, we need to understand a few timers and timeout values. Below is the code snippet from https://elixir.bootlin.com/linux/v4.3/source/drivers/net/bonding/bond_3ad.c#L43

/* Timer definitions (43.4.4 in the 802.3ad standard) */
#define AD_FAST_PERIODIC_TIME      1
#define AD_SLOW_PERIODIC_TIME      30
#define AD_SHORT_TIMEOUT_TIME      (3*AD_FAST_PERIODIC_TIME)
#define AD_LONG_TIMEOUT_TIME       (3*AD_SLOW_PERIODIC_TIME)
#define AD_CHURN_DETECTION_TIME    60
#define AD_AGGREGATE_WAIT_TIME     2

All of these timer values are in seconds.
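To make the numbers concrete, here is a small Python sketch of how the Timeout flag maps to these timers. It assumes the behaviour described in the 802.3ad standard: a port picks its receive timeout from its own Timeout flag, and picks its transmit interval from the Timeout flag its partner advertises. The function names are my own.

```python
# Values copied from the timer defines quoted above (seconds).
AD_FAST_PERIODIC_TIME = 1
AD_SLOW_PERIODIC_TIME = 30
AD_SHORT_TIMEOUT_TIME = 3 * AD_FAST_PERIODIC_TIME
AD_LONG_TIMEOUT_TIME = 3 * AD_SLOW_PERIODIC_TIME

TIMEOUT_FLAG = 0x2  # AD_STATE_LACP_TIMEOUT


def receive_timeout(own_state):
    """Seconds a port waits for a PDU, chosen by its own Timeout flag."""
    if own_state & TIMEOUT_FLAG:
        return AD_SHORT_TIMEOUT_TIME
    return AD_LONG_TIMEOUT_TIME


def transmit_interval(partner_state):
    """Seconds between PDUs sent, chosen by the partner's Timeout flag."""
    if partner_state & TIMEOUT_FLAG:
        return AD_FAST_PERIODIC_TIME
    return AD_SLOW_PERIODIC_TIME


# Port state 63 has the Timeout flag set, 60 does not.
print(receive_timeout(63), transmit_interval(63))
print(receive_timeout(60), transmit_interval(60))
```

So a port with lacp rate fast gives up on its partner after 3 seconds without a PDU, while a slow one waits 90 seconds.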

When a LACP PDU is received on a port, the port's receive machine moves to the “Current” state and starts its timeout timer. If no further PDU arrives before this timer expires, the port moves to the “Expired” state and the timer is restarted. If the timer expires again, the port moves to the “Defaulted” state.
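The Current, Expired, Defaulted progression can be sketched as a tiny Python simulation. This is a simplification for illustration, not the kernel's receive machine: the class and function names are made up, one tick is one second, and it assumes (as the standard describes) that the timer is restarted with the short timeout on entering Expired.

```python
SHORT_TIMEOUT = 3  # AD_SHORT_TIMEOUT_TIME, in seconds


class RxMachine:
    """Simplified LACP receive machine: CURRENT -> EXPIRED -> DEFAULTED."""

    def __init__(self, timeout):
        self.timeout = timeout  # 3 for a fast port, 90 for a slow one
        self.state = "DEFAULTED"
        self.timer = 0

    def pdu_received(self):
        # Any valid PDU from the partner refreshes the session.
        self.state = "CURRENT"
        self.timer = self.timeout

    def tick(self, seconds=1):
        if self.state == "DEFAULTED":
            return
        self.timer -= seconds
        if self.timer > 0:
            return
        if self.state == "CURRENT":
            # Assumption: restart with the short timeout while Expired.
            self.state = "EXPIRED"
            self.timer = SHORT_TIMEOUT
        else:
            self.state = "DEFAULTED"


def simulate(timeout, seconds):
    """Receive one PDU, then let the given number of silent seconds pass."""
    rx = RxMachine(timeout)
    rx.pdu_received()
    history = [rx.state]
    for _ in range(seconds):
        rx.tick()
        history.append(rx.state)
    return history


print(simulate(3, 6))
```

With a fast port, three silent seconds are enough to reach Expired, and three more to reach Defaulted.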

The natural question is: when do we actually see ports in these states? To answer it, let's walk through the entire LACP state machine with packet captures.

Below is the packet capture when the local port's session is Expired and Defaulted. The Partner information it sends has fallen back to default values (note the all-zero system MAC address).

12:25:16.538321 04:3f:72:d9:c0:48 > 01:80:c2:00:00:02, ethertype Slow Protocols (0x8809), length 124: LACPv1, length 110
	Actor Information TLV (0x01), length 20
	  System 04:3f:72:d9:c0:49, System Priority 65535, Key 21, Port 2, Port Priority 255
	  State Flags [Activity, Timeout, Aggregation, Default, Expired]
	Partner Information TLV (0x02), length 20
	  System 00:00:00:00:00:00, System Priority 65535, Key 1, Port 1, Port Priority 255
	  State Flags [Activity, Timeout]
	Collector Information TLV (0x03), length 16
	  Max Delay 0
	Terminator TLV (0x00), length 0

The remote peer is also sending its PDU with the Default and Expired flags, and its Partner information likewise holds default values.

12:25:17.174182 c8:fe:6a:f2:44:c4 > 01:80:c2:00:00:02, ethertype Slow Protocols (0x8809), length 124: LACPv1, length 110
	Actor Information TLV (0x01), length 20
	  System c8:fe:6a:f2:44:00, System Priority 127, Key 5, Port 6, Port Priority 127
	  State Flags [Activity, Timeout, Aggregation, Default, Expired]
	Partner Information TLV (0x02), length 20
	  System 00:00:00:00:00:00, System Priority 1, Key 5, Port 6, Port Priority 1
	  State Flags [Timeout, Aggregation, Default]
	Collector Information TLV (0x03), length 16
	  Max Delay 0
	Terminator TLV (0x00), length 0

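The local port has now received the peer's PDU. It clears "Default" and "Expired", records the peer's real information as its partner, and restarts the exchange advertising "Aggregation".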

12:25:17.474322 04:3f:72:d9:c0:48 > 01:80:c2:00:00:02, ethertype Slow Protocols (0x8809), length 124: LACPv1, length 110
	Actor Information TLV (0x01), length 20
	  System 04:3f:72:d9:c0:49, System Priority 65535, Key 21, Port 2, Port Priority 255
	  State Flags [Activity, Timeout, Aggregation]
	Partner Information TLV (0x02), length 20
	  System c8:fe:6a:f2:44:00, System Priority 127, Key 5, Port 6, Port Priority 127
	  State Flags [Activity, Timeout, Aggregation, Default, Expired]
	Collector Information TLV (0x03), length 16
	  Max Delay 0
	Terminator TLV (0x00), length 0

The peer port also clears "Default" and "Expired" and advertises "Aggregation", and it now records the local port as its partner, also in "Aggregation".

12:25:17.484591 c8:fe:6a:f2:44:c4 > 01:80:c2:00:00:02, ethertype Slow Protocols (0x8809), length 124: LACPv1, length 110
	Actor Information TLV (0x01), length 20
	  System c8:fe:6a:f2:44:00, System Priority 127, Key 5, Port 6, Port Priority 127
	  State Flags [Activity, Timeout, Aggregation]
	Partner Information TLV (0x02), length 20
	  System 04:3f:72:d9:c0:49, System Priority 65535, Key 21, Port 2, Port Priority 255
	  State Flags [Activity, Timeout, Aggregation]
	Collector Information TLV (0x03), length 16
	  Max Delay 0
	Terminator TLV (0x00), length 0

12:25:18.486075 c8:fe:6a:f2:44:c4 > 01:80:c2:00:00:02, ethertype Slow Protocols (0x8809), length 124: LACPv1, length 110
	Actor Information TLV (0x01), length 20
	  System c8:fe:6a:f2:44:00, System Priority 127, Key 5, Port 6, Port Priority 127
	  State Flags [Activity, Timeout, Aggregation]
	Partner Information TLV (0x02), length 20
	  System 04:3f:72:d9:c0:49, System Priority 65535, Key 21, Port 2, Port Priority 255
	  State Flags [Activity, Timeout, Aggregation]
	Collector Information TLV (0x03), length 16
	  Max Delay 0
	Terminator TLV (0x00), length 0

The local port has now moved to "Synchronization".

12:25:18.722322 04:3f:72:d9:c0:48 > 01:80:c2:00:00:02, ethertype Slow Protocols (0x8809), length 124: LACPv1, length 110
	Actor Information TLV (0x01), length 20
	  System 04:3f:72:d9:c0:49, System Priority 65535, Key 21, Port 2, Port Priority 255
	  State Flags [Activity, Timeout, Aggregation, Synchronization]
	Partner Information TLV (0x02), length 20
	  System c8:fe:6a:f2:44:00, System Priority 127, Key 5, Port 6, Port Priority 127
	  State Flags [Activity, Timeout, Aggregation]
	Collector Information TLV (0x03), length 16
	  Max Delay 0
	Terminator TLV (0x00), length 0

The peer is still in "Aggregation", but it already reports the local port's "Synchronization" in its partner information.

12:25:18.733618 c8:fe:6a:f2:44:c4 > 01:80:c2:00:00:02, ethertype Slow Protocols (0x8809), length 124: LACPv1, length 110
	Actor Information TLV (0x01), length 20
	  System c8:fe:6a:f2:44:00, System Priority 127, Key 5, Port 6, Port Priority 127
	  State Flags [Activity, Timeout, Aggregation]
	Partner Information TLV (0x02), length 20
	  System 04:3f:72:d9:c0:49, System Priority 65535, Key 21, Port 2, Port Priority 255
	  State Flags [Activity, Timeout, Aggregation, Synchronization]
	Collector Information TLV (0x03), length 16
	  Max Delay 0
	Terminator TLV (0x00), length 0

The peer has now set "Synchronization", "Collecting" and "Distributing".

12:25:19.490298 c8:fe:6a:f2:44:c4 > 01:80:c2:00:00:02, ethertype Slow Protocols (0x8809), length 124: LACPv1, length 110
	Actor Information TLV (0x01), length 20
	  System c8:fe:6a:f2:44:00, System Priority 127, Key 5, Port 6, Port Priority 127
	  State Flags [Activity, Timeout, Aggregation, Synchronization, Collecting, Distributing]
	Partner Information TLV (0x02), length 20
	  System 04:3f:72:d9:c0:49, System Priority 65535, Key 21, Port 2, Port Priority 255
	  State Flags [Activity, Timeout, Aggregation, Synchronization]
	Collector Information TLV (0x03), length 16
	  Max Delay 0
	Terminator TLV (0x00), length 0

The local port has also moved to "Collecting" and "Distributing", and from here on both sides advertise the full set of flags.

12:25:19.658321 04:3f:72:d9:c0:48 > 01:80:c2:00:00:02, ethertype Slow Protocols (0x8809), length 124: LACPv1, length 110
	Actor Information TLV (0x01), length 20
	  System 04:3f:72:d9:c0:49, System Priority 65535, Key 21, Port 2, Port Priority 255
	  State Flags [Activity, Timeout, Aggregation, Synchronization, Collecting, Distributing]
	Partner Information TLV (0x02), length 20
	  System c8:fe:6a:f2:44:00, System Priority 127, Key 5, Port 6, Port Priority 127
	  State Flags [Activity, Timeout, Aggregation, Synchronization, Collecting, Distributing]
	Collector Information TLV (0x03), length 16
	  Max Delay 0
	Terminator TLV (0x00), length 0
12:25:19.668598 c8:fe:6a:f2:44:c4 > 01:80:c2:00:00:02, ethertype Slow Protocols (0x8809), length 124: LACPv1, length 110
	Actor Information TLV (0x01), length 20
	  System c8:fe:6a:f2:44:00, System Priority 127, Key 5, Port 6, Port Priority 127
	  State Flags [Activity, Timeout, Aggregation, Synchronization, Collecting, Distributing]
	Partner Information TLV (0x02), length 20
	  System 04:3f:72:d9:c0:49, System Priority 65535, Key 21, Port 2, Port Priority 255
	  State Flags [Activity, Timeout, Aggregation, Synchronization, Collecting, Distributing]
	Collector Information TLV (0x03), length 16
	  Max Delay 0
	Terminator TLV (0x00), length 0
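If you want to check captures like these without reading every line, here is a hedged Python sketch. It assumes the tcpdump text format shown above, treats a PDU as "up" only when both the actor and partner flag lists contain Synchronization, Collecting and Distributing and neither contains Default or Expired, and uses function names of my own choosing.

```python
import re

REQUIRED = {"Synchronization", "Collecting", "Distributing"}
FAILED = {"Default", "Expired"}


def parse_flags(line):
    """Extract the flag names from a tcpdump 'State Flags [...]' line."""
    match = re.search(r"State Flags \[(.*?)\]", line)
    if match is None:
        return set()
    return {flag.strip() for flag in match.group(1).split(",") if flag.strip()}


def is_up(actor_line, partner_line):
    """True when both sides of one PDU show a fully negotiated port."""
    for flags in (parse_flags(actor_line), parse_flags(partner_line)):
        if not REQUIRED <= flags or flags & FAILED:
            return False
    return True


# Lines copied from the captures above.
full = "State Flags [Activity, Timeout, Aggregation, Synchronization, Collecting, Distributing]"
sync_only = "State Flags [Activity, Timeout, Aggregation, Synchronization]"
expired = "State Flags [Activity, Timeout, Aggregation, Default, Expired]"

print(is_up(full, full))
print(is_up(full, sync_only))
print(is_up(expired, "State Flags [Activity, Timeout]"))
```

Only the last two PDUs in the capture above pass this check. Everything before them is still negotiating.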

There are a few things to know here:

  • Timeout values are transmitted to the peer. For example, the server side can have its LACP rate set to slow (30 secs) while the switch side has its LACP rate set to fast (1 sec). Each side informs the other of its LACP rate.
  • When the server bond interface is configured as passive, it will not initiate the LACP session by sending a PDU first, but it will respond to the peer's PDUs to start session negotiation.
  • If the LACP session does not come up, the bond interface will bring down all its member interfaces administratively.
  • MII monitoring does not have to be configured explicitly. As you can see in my example, I did not configure it, and link failure detection is handled by LACP PDUs.
  • LACP is a control plane mechanism and does not dictate the choice of load-balancing (hash) algorithm. For example, LACP can work with 2-tuple or 5-tuple hashing. The default on a Linux bond is layer2.
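As an illustration of that last point, here is a rough Python sketch of a layer2 style hash. It follows the formula given in the kernel bonding documentation (last byte of source MAC XOR last byte of destination MAC XOR packet type, modulo the number of slaves). The real kernel implementation may differ between versions, and the function name is my own.

```python
def layer2_slave(src_mac, dst_mac, ethertype, slave_count):
    """Pick a member index from the MAC addresses and the packet type.

    Sketch of the documented layer2 formula, not the kernel code itself.
    """
    src_last = int(src_mac.split(":")[5], 16)
    dst_last = int(dst_mac.split(":")[5], 16)
    return (src_last ^ dst_last ^ ethertype) % slave_count


# MAC addresses taken from the outputs above, 0x0800 is IPv4.
print(layer2_slave("04:3f:72:d9:c0:49", "c8:fe:6a:f2:44:00", 0x0800, 2))
# A hypothetical second destination that differs only in the last byte.
print(layer2_slave("04:3f:72:d9:c0:49", "c8:fe:6a:f2:44:01", 0x0800, 2))
```

Nothing in this calculation involves LACP. The protocol only decides which members are usable, and the hash policy decides which usable member carries a given frame.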

Let's look at the output of OVS and OVS-DPDK LACP bonds as well.

[root@compute-0 heat-admin]# ovs-appctl lacp/show bond0
---- bond0 ----
  status: active negotiated
  sys_id: 04:3f:72:d9:c0:48
  sys_priority: 65534
  aggregation key: 3
  lacp_time: fast

member: enp4s0f0: current attached
  port_id: 4
  port_priority: 65535
  may_enable: true

  actor sys_id: 04:3f:72:d9:c0:48
  actor sys_priority: 65534
  actor port_id: 4
  actor port_priority: 65535
  actor key: 3
  actor state: activity timeout aggregation synchronized collecting distributing

  partner sys_id: c8:fe:6a:f2:44:00
  partner sys_priority: 127
  partner port_id: 6
  partner port_priority: 127
  partner key: 5
  partner state: activity timeout aggregation synchronized collecting distributing

member: enp4s0f1: current attached
  port_id: 3
  port_priority: 65535
  may_enable: true

  actor sys_id: 04:3f:72:d9:c0:48
  actor sys_priority: 65534
  actor port_id: 3
  actor port_priority: 65535
  actor key: 3
  actor state: activity timeout aggregation synchronized collecting distributing

  partner sys_id: c8:fe:6a:f2:44:00
  partner sys_priority: 127
  partner port_id: 7
  partner port_priority: 127
  partner key: 5
  partner state: activity timeout aggregation synchronized collecting distributing

As you can see, the OVS bond shows all the negotiated flags of each port, and by looking at them we can tell whether the session is established or not.

[root@computedpdk-0 heat-admin]# ovs-appctl lacp/show
---- dpdkbond0 ----
  status: passive negotiated
  sys_id: 04:3f:72:d9:c0:48
  sys_priority: 65534
  aggregation key: 1
  lacp_time: slow

member: dpdk2: current attached
  port_id: 2
  port_priority: 65535
  may_enable: true

  actor sys_id: 04:3f:72:d9:c0:48
  actor sys_priority: 65534
  actor port_id: 2
  actor port_priority: 65535
  actor key: 1
  actor state: aggregation synchronized collecting distributing

  partner sys_id: c8:fe:6a:f2:44:00
  partner sys_priority: 127
  partner port_id: 6
  partner port_priority: 127
  partner key: 5
  partner state: activity timeout aggregation synchronized collecting distributing

member: dpdk3: current attached
  port_id: 1
  port_priority: 65535
  may_enable: true

  actor sys_id: 04:3f:72:d9:c0:48
  actor sys_priority: 65534
  actor port_id: 1
  actor port_priority: 65535
  actor key: 1
  actor state: aggregation synchronized collecting distributing

  partner sys_id: c8:fe:6a:f2:44:00
  partner sys_priority: 127
  partner port_id: 7
  partner port_priority: 127
  partner key: 5
  partner state: activity timeout aggregation synchronized collecting distributing

You may notice that the dpdk port, which is the local port here, does not show the “Activity” and “Timeout” flags. That does not mean these flags do not exist for a dpdk bond. They do exist, as you can see in this code: https://github.com/DPDK/dpdk/blob/main/drivers/net/bonding/rte_eth_bond_8023ad.h#L17. They are simply not set here, because this bond is passive (status: passive) and uses the slow rate (lacp_time: slow). This dpdk bond is successfully negotiated and up and running.
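To compare the OVS output with the Linux bond "port state" number, the flag names can be summed back into a value. This is a minimal Python sketch: the first six names are the ones shown in the lacp/show output above, while "defaulted" and "expired" are my assumption of how OVS spells the remaining two.

```python
# Flag names as printed by ovs-appctl lacp/show, mapped to the
# AD_STATE_* bit values. "defaulted" and "expired" are assumed spellings.
BITS = {
    "activity": 0x01,
    "timeout": 0x02,
    "aggregation": 0x04,
    "synchronized": 0x08,
    "collecting": 0x10,
    "distributing": 0x20,
    "defaulted": 0x40,
    "expired": 0x80,
}


def encode_state(state_line):
    """Sum the bit values of the flag names in an OVS state line."""
    return sum(BITS[name] for name in state_line.split())


dpdk_actor = "aggregation synchronized collecting distributing"
partner = "activity timeout aggregation synchronized collecting distributing"

print(encode_state(dpdk_actor))
print(encode_state(partner))
```

So a passive, slow port that is fully negotiated corresponds to port state 60 rather than 63.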

With that, I will end this blog here. I hope it cleared up some of your doubts.

Note: I have used RHEL 8.4 with kernel version 4.18.0-305.45.1.
