A few months ago I ran into some strange behaviour: sometimes the L2TP/IPSec connection between my workstation at home and our company VPN silently failed. After clicking on the VPN connection icon in the task tray, it only showed Connecting to… and gave up after 60 seconds. The VPN connection status did not get updated and just showed the connection name, as if nothing had ever happened. Strangely enough, this only happened if both of the following conditions were true:

  • I tried to connect to the VPN by using the taskbar VPN connection helper
  • and I tried to connect less than 4 minutes after I had logged in to the workstation.

If I tried to connect by using the Control Center > Network connections > VPN > [VPN Name] icon, it always worked. If I waited for more than 4 minutes and then used the task tray icon, it also worked most of the time.

Enable VPN logging

The first thing I always do when struggling with something is to check all the logs which could be involved in the process.

First of all I looked at our VPN gateway, a Sophos UTM 9, for any meaningful information. Sadly, it only showed that the connection was canceled by the peer, my workstation.

As far as I know, Windows has two methods for getting debug information about the VPN connection. First of all, you can enable the log file by using

netsh ras set tracing * enabled
# or
netsh ras diagnostics set rastracing * enabled

After you have enabled the logging, you can find the log file at C:\Windows\tracing\RASMAN.log. In my case it only showed that the connection was terminated after 60 seconds.
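Because the * enables tracing for all RAS components, the tracing directory ends up with more than one log file, as far as I can tell. To see which of them were actually written during a failed attempt, something like the following Python sketch can help. The function name recently_written_logs and its parameters are made up for this example, and the directory has to be passed in by the caller.

```python
import time
from pathlib import Path


def recently_written_logs(tracing_dir, max_age_seconds, now=None):
    """Return the names of *.log files modified within max_age_seconds, newest first.

    Hypothetical helper for illustration; `now` can be injected to make
    the result reproducible.
    """
    now = time.time() if now is None else now
    hits = []
    for path in Path(tracing_dir).iterdir():
        # Compare the suffix case-insensitively (.log and .LOG both count).
        if not path.is_file() or path.suffix.lower() != ".log":
            continue
        mtime = path.stat().st_mtime
        if now - mtime <= max_age_seconds:
            hits.append((mtime, path.name))
    # Newest files first.
    hits.sort(reverse=True)
    return [name for _, name in hits]
```

Calling recently_written_logs(r"C:\Windows\tracing", 300) right after a failed connection attempt lists the logs touched in the last five minutes, so you know where to start reading.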

The second method I used is more low-level:

netsh trace start VpnClient per=yes maxsize=0 filemode=single
# reproduce the issue
netsh trace stop

VpnClient can also be replaced by VpnClient_Dbg to trace more information. After I had reproduced my issue, I took the trace as described by Microsoft and loaded it into netmon. This did not show any useful information either.

Hard disk timeouts

After a few days I realized that the VPN connection only failed when I used the task tray icon while my hard disk was under high load. This always happened during and shortly after the logon process. The workaround was easy: I only tried to connect to the VPN when there was no heavy hard disk load. After some googling it turned out that others had more or less the same issue.

I assume there is some timing issue between updating the GUI and the IPSec state itself.

Connection issue appeared without disk load

Until the beginning of October I lived with the issue. I had to wait a few minutes, but it worked.

One day in the first week of October I was no longer able to connect to the VPN, even after waiting a few minutes. In hindsight I assume that KB4524147 somehow had a side effect. Our Sophos UTM now showed the following error:

2019:11:05-09:22:24 fw1 openl2tpd[24780]: PROTO: tunl 62860: HELLO received from peer 1
2019:11:05-09:22:25 fw1 openl2tpd[24780]: FSM: CCE(62860) event XPRT_DOWN in state CLOSING
2019:11:05-09:22:35 fw1 openl2tpd[24780]: FSM: CCE(62860) event XPRT_DOWN in state CLOSING
2019:11:05-09:22:45 fw1 pluto[6304]: "L_for admin"[24] w.x.y.z:12040 #208: NAT-Traversal: received 2 NAT-OA. using first, ignoring others
2019:11:05-09:22:45 fw1 pluto[6304]: "L_for admin"[24] w.x.y.z:12040 #208: responding to Quick Mode
2019:11:05-09:22:45 fw1 pluto[6304]: "L_for admin"[24] w.x.y.z:12040 #208: IPsec SA established {ESP=>0xa0d5d138 <0x15bb6495 NATOA=192.168.43.94}
2019:11:05-09:22:45 fw1 pluto[6304]: "L_for admin"[114] w.x.y.z.:12040 #203: received Delete SA(0xab38c176) payload: deleting IPSEC State #207
2019:11:05-09:22:45 fw1 openl2tpd[24780]: FSM: CCE(62860) event XPRT_DOWN in state CLOSING
2019:11:05-09:22:47 fw1 openl2tpd[24780]: FUNC: tunl 62860 deleted
2019:11:05-09:22:47 fw1 openl2tpd[24780]: FUNC: tunl 62860: deleting context
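The line that matters in this excerpt is the one where pluto reports a received Delete SA payload. In a longer log it is easy to miss, so here is a small Python sketch that filters for it. The regular expression is only modelled on the lines shown above and is an assumption about the format, not an official Sophos log grammar. The names parse_line and find_delete_sa are made up for this example.

```python
import re

# Assumed layout, derived from the excerpt:
# "YYYY:MM:DD-HH:MM:SS host daemon[pid]: message"
LINE_RE = re.compile(
    r"^(?P<timestamp>\d{4}:\d{2}:\d{2}-\d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<daemon>[\w.-]+)\[(?P<pid>\d+)\]: "
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Split one log line into its fields, or return None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def find_delete_sa(lines):
    """Return (timestamp, message) for pluto lines reporting a Delete SA payload."""
    hits = []
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue
        if entry["daemon"] == "pluto" and "received Delete SA" in entry["message"]:
            hits.append((entry["timestamp"], entry["message"]))
    return hits
```

Feeding the excerpt above into find_delete_sa returns a single hit with the timestamp 2019:11:05-09:22:45, which is the moment my workstation asked the gateway to tear down the IPsec SA.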

Googling for received Delete_SA returned https://github.com/hwdsl2/setup-ipsec-vpn/issues/288#issuecomment-349861905 which led me to the error 809 described at https://github.com/hwdsl2/setup-ipsec-vpn/blob/master/docs/clients.md#windows-error-809 .

In the end, I set a registry value by using

REG ADD HKLM\SYSTEM\CurrentControlSet\Services\PolicyAgent /v AssumeUDPEncapsulationContextOnSendRule /t REG_DWORD /d 0x2 /f

and then rebooted my PC. After that, the VPN worked again.
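For reference, AssumeUDPEncapsulationContextOnSendRule controls whether Windows sets up IPsec security associations through NAT. According to Microsoft's documentation, 0 is the default, 1 covers a server behind NAT and 2 covers the case where both the server and the client are behind NAT. The following Python sketch checks what is currently configured. The helper names are made up for this example, the descriptions are my own paraphrase, and read_current_value only works on Windows because it relies on the winreg module.

```python
KEY_PATH = r"SYSTEM\CurrentControlSet\Services\PolicyAgent"
VALUE_NAME = "AssumeUDPEncapsulationContextOnSendRule"

# Paraphrased meanings of the documented values (wording is mine).
MEANINGS = {
    0: "default: no security associations with servers behind NAT",
    1: "security associations with servers behind NAT are allowed",
    2: "security associations allowed when server and client are both behind NAT",
}


def describe(value):
    """Return a human-readable description of a registry value (None = not set)."""
    if value is None:
        return "not set, behaves like 0 (" + MEANINGS[0] + ")"
    return MEANINGS.get(value, "unknown value %r" % (value,))


def read_current_value():
    """Read the value from HKLM (Windows only); returns None if it is not set."""
    import winreg  # imported here so the rest of the module loads on any OS

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
            value, _value_type = winreg.QueryValueEx(key, VALUE_NAME)
            return value
    except FileNotFoundError:
        return None
```

On my workstation, print(describe(read_current_value())) should report the "both behind NAT" case after the REG ADD command above. Keep in mind that the change only takes effect after a reboot.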