Virtual network environment
update 30.09.2022 11:30:03 - notes on network stack changes in Windows 22H2
Advances in personal computers have made it possible to fit an entire distributed computing network into a single device. This article describes how to do it quickly and inexpensively on a business-grade MS Windows notebook.
Surely, a virtualized network environment on a notebook won't replace a customer-facing production environment. It is, however, enough for home and office appliances, software development environments, and many test environments. Conceivable use cases are:
Road warriors. You need a slew of network services on demand, while the headquarters that provides the elaborate network infrastructure is not reliably available. You want to replicate at least some of the quick-response services on premises. Historically you have lugged a bunch of boxes along while traveling on assignments. You can now virtualize most, if not all, of these services. The particular point is the ability to do it while reaching the Internet via a single IP. You also save on convenience charges for additional network devices in hotels.
Business continuity. In an emergency or disaster recovery you can relocate with just a single device in your hands and quickly bring any missing service online. Since the setup enables a VPN connection to the corporate network, a swarm of relocated workers can recover from the loss of a data center or headquarters by spinning up first-response services on their distributed devices. It may just keep the lights on until a backup site or a new corporate data center comes online.
SOA development. Tweaking systems on real hardware has been passé since cloud computing emerged. The lion's share of new software today is written for the online world. And that means latency and semaphoring (latency in social interactions) whenever you need to reboot or change a system config that impacts the work of others. A local, personalized networking test environment greatly helps to speed up the development process. Imagine doing the same, but quicker and cheaper. Costs and setup complexity used to be prohibitive factors. Now they are not.
Software: To pull the trick off you'd want a Pro/Enterprise edition of Windows 10 or Windows 11, as it has Hyper-V virtualization on board free of charge. For the router you may select the FOSS OpenWRT embedded firmware.
Processor: Each compute instance requires one processor core if it is headless and two cores if it runs a graphical UI. You will need to boot at least two service instances (a router and a host UI) plus a number of utility virtual instances. Depending on workload you may want more cores for some utility instances. The calculation covers only the bare minimum for snappy operation. For the setup to make any sense, expect at least a couple of virtual utility instances. This results in 3 mandatory service cores + 4 utility cores = 7 cores. A CPU with at least 8 cores is advisable to start with.
RAM: The demand is not so easy to calculate. Much is determined by the workload software. A zero-load Hyper-V host and the router will be fine with 6 GB and 256 MB of RAM respectively. The service part of the setup will fit into 8 GB of RAM. The overhead depends on the OSes and the applications they run. From experience, a bunch of services would require in excess of 16 GB as of 2022. 32 GB is a reasonable, comfortable and obtainable amount of RAM on a notebook.
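The core arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name is made up, and the assumption that both utility instances run a graphical UI is mine, chosen because it reproduces the 3 + 4 = 7 figure.

```python
# Core budget sketch. Per-instance figures are taken from the text:
# 1 core for a headless instance, 2 cores for an instance with a graphical UI.
CORES_HEADLESS = 1
CORES_GUI = 2


def cores_needed(headless_instances, gui_instances):
    """Return the minimum number of cores for the given mix of instances."""
    return headless_instances * CORES_HEADLESS + gui_instances * CORES_GUI


# Service part: the router (headless) and the host UI (graphical).
service = cores_needed(headless_instances=1, gui_instances=1)
# Utility part: ASSUMPTION, two utility instances, both with a graphical UI.
utility = cores_needed(headless_instances=0, gui_instances=2)

print(service, utility, service + utility)  # prints "3 4 7"
```

Change the instance counts to match your own workload; the result is a floor, not a recommendation.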
The networking and routing capabilities of consumer-grade Windows and Hyper-V are quite limited, partly because of Microsoft's monetization effort: you are expected to buy a server edition if you want to exploit a meaningful networking stack. The server edition of Windows is a no-go too, since it makes the setup too expensive, bloated and outright monstrous to boot and manage. I've read a lot of instructions pursuing a similar goal; they conveyed their authors' frustration and obviously lacked an acceptable end result. I am not aware of any resounding success. The content of this article is the result of tinkering while standing on the shoulders of the giants who tried to get there before me.
The key to success is to get rid of the Windows networking stack as much as possible. Every meaningful network-related task in the proposed setup is done by a dedicated virtual instance (VPC). I am familiar with OpenWRT, so version 19.07.9 is deployed in the VPC. You may well succeed with other router software.
The router integrates packages to:
establish a VPN pipe to the corporate network. In my case it is a layer 2 tinc network spanning distributed sites. It makes services on hosts distributed across Europe available as if they were on a local subnet, e.g. samba, network printing, broadcast discovery, multicast and so on.
operate an efficient umbrella NAT, i.e. a single-IP outbound interface for all hosted instances. Right, the host and the virtual instances are all behind the NAT.
run a full-fledged DHCP service for local instances, while retaining the capability to assign static leases. This allows a Remote Desktop Protocol (RDP) or SSH connection to a VPC via pre-defined subnet addresses and pass-through ports at the public IP.
pass through samba. This allows access to intranet resources from an extranet device, such as a cell phone or tablet.
1. Hyper-V host configuration
1. Install Windows components: Hyper-v, simple TCP/IP services, PowerShell, SMB direct.
2. Enable VPC management (i.e. VM start/stop/change) by a non-privileged user. Launch Computer Management with admin privileges. Go to System Tools > Local Users and Groups > Groups > Hyper-V Administrators > Properties. Add the users to the group. You will need to log out and log back in for the change to take effect.
3. Open Hyper-V Manager on your host either with admin privileges, or as a user with elevated rights.
3.1 Create an internal network switch. Give it a concise name, for example, 'LAN'.
3.2 Create an external network switch. Give it a concise name, for example, 'WAN'. Bind it to an existing multi-NIC bridge, or to a network card of your choice, for example, WiFi.
Launch Windows PowerShell with admin privileges. Tune up windows interfaces:
4. The network stack has changed in the Windows 22H2 update. The changes concern interface aliases, the default topology for virtual cards and their attributes, as well as the VPF filtering module. The old setup will break after the update: the VM's network cards report something like "no cable plugged". The stack might change again soon, judging by the complaints on the MS support site: the tune-up granularity has become worse and there are some stability issues. It is recommended to stay on 22H1 for a while.
A somewhat working configuration is as follows:
a) the forwarding tag is no longer required; it seems to work either way.
b) no changes are required in the 'wan' switch attributes. Under the current settings the 'wan' switch is aliased 'default switch', which cannot be manipulated.
c) 'VPF filter' has become 'Azure VPF filter' and has an adverse effect on connectivity. Fortunately, the new stack works with VPF disabled.
To restore functionality after the update, a reliable procedure could be: a) delete the virtual switches, b) reset the network, c) reboot, d) create new virtual switches, assign them to the VM's NICs, and run the commands:
5. Check that host's firewall does not block operations.
5.1 Enable IP, ICMP and other necessary networking rules. For example, in Windows Defender Firewall enable "Core Networking Diagnostics - ICMP Echo Request (ICMPv4-In)". The actual rules to check depend on your use cases.
5.2 Add the Hyper-V processes to the antivirus exceptions: System32\vmms.exe and System32\vmwp.exe
6. Optional. On the Hyper-V host launch regedit with admin privileges. Go to HKLM\SYSTEM\CurrentControlSet\Services\VMSMP\Parameters and add a DWORD named as follows. Give it the value of 1.
(for 1Gb adapter).
(for 10Gb adapter).
Close the regedit instance. The first option explicitly enables VMQ for network cards that work at speeds below 10 Gbps, i.e. for most cards in the consumer segment. By default only 10 Gbps adapters use the VMQ option if it is checked.
2. OpenWRT installation
The OpenWRT enablement process is quite straightforward and takes approximately 30 minutes. The more demanding task could be the OpenWRT firmware setup. The latter is hard to predict, since it depends greatly on your intranet configuration and service demands. A minimal process to set the admin password, subnet, interfaces and static DHCP leases takes approximately 15 minutes.
The key elements are copying the distribution firmware image onto a virtual hard disk, enabling layer 2 emulation on the layer 3 MS switch by checking the 'enable MAC spoofing' option, attaching the NICs in the proper order, and preventing the host from using the wan interfaces directly.
1. Download a current x86-64 OpenWRT image from Index of /releases/19.07.9/targets/x86/64/ (openwrt.org). A fair choice is the generic, ext4, combined variant. Extract the img file from the gz archive.
2. Download the physdiskwrite.exe program from m0n0wall - physdiskwrite. Put it in the same folder as the img file.
3. Launch Computer Management with admin privileges on the Hyper-V host. Then click on the Disk Management tab. Create a VHDX disk of 268-500 MB and initialize it as GPT.
4. Launch a command line console with admin privileges on the Hyper-V host. Go to the directory where the img file is stored. Flash the img to the VHDX by launching physdiskwrite.exe. When prompted, select the proper disk and input 'yes' to overwrite. Close the CLI window.
5. Detach vhdx in Disk Manager. Close the Computer Management console.
6. Open Hyper-v Manager at your host either with admin privileges, or as a user with elevated rights. Create a virtual machine (VM):
RAM 256 MB,
attach the vhdx disk file to 'IDE controller 0',
add a standard network adapter (not legacy) and bind to LAN switch, set a static MAC, enable MAC spoofing,
add a standard network adapter (not legacy) and bind to WAN switch, set a static MAC, allow VMQ if the adapter is facing a physical card,
set 'Always start this virtual machine automatically' to on,
set 'Automatic Stop Action' to 'Power off',
7. Launch the VM's console. Start the VM from the console. Let it finish the boot sequence. Press Enter to get into the VM's command line. Enter 'ifconfig' and check that lan is at eth0 and wan at eth1. Check that the wan interface (eth1) has obtained an IP from your access point.
8. Launch a browser on your Hyper-V host. Connect to 192.168.1.1. Set up the router via the web UI.
9. In Hyper-V Manager disable management co-use on the WAN switch. This disables host traffic via the external interface and passes all traffic through the MS internal bridge, and hence through the virtual router. In the case of WiFi the external interface will show the actual WiFi name, but no Internet access. The network icon will show that the host is connected via a wired interface.
10. In the Windows PowerShell console check that the lan and wan destinations are reachable from the host. Here are examples in this order: 1. virtual router LAN interface; 2. world; 3. a subnet VM; 4. a VPN subnet. Close the PowerShell console.
11. Reboot host. Check that the virtual router is automatically up and functional.
3. Tweaking connection speed
After introducing the router as a middleman between your PC and the Internet access point, you may experience a significant slowdown in data transfer rates. Upload speeds in particular may worsen by a factor of ten or more. Combine this with the fact that many ISPs provide asymmetric last-mile speeds, i.e. upload is much slower than download anyway. An additional slowdown can then bring the user experience to a crawl. For example, on a 100 Mbps download connection I had 5 Mbps upload. The slowdown then brought it down to 400 Kbps.
The reason for the phenomenon is the strictly limited modem MTU. Fiber, coaxial cable or DSL modems, while able to provide impressive data transfer rates, are vulnerable to IP packet growth. Should the incoming packet size exceed the pipe's MTU, the modems start to cut the packets into pieces and assemble them back on the other end. Commercial modem hardware has neither the spare compute power nor the buffer memory to do it properly. IP packet processing by the modems turns into a bottleneck. Once the upload bottleneck comes into effect, the download speed drops significantly too.
Where does the IP packet growth come from? The NAT you install on your VPC may be one too many. There could be other NATs all the way up to the real IP on the Internet. This is called NAT chaining. Some NAT implementations modify the packet contents, for example through an ALG, and with it the packet size. NAT chaining very quickly exceeds the MTU tolerance threshold, which is often close to 1540 bytes in total.
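To see why a slightly oversized packet hurts so much, here is a minimal sketch of standard IPv4 fragmentation arithmetic. The function name and the sample sizes (a 1500-byte MTU, a 1540-byte packet) are illustrative assumptions, not measurements from my setup, and the sketch ignores IP options.

```python
import math

IPV4_HEADER = 20  # bytes, IPv4 header without options


def ipv4_fragments(total_len, mtu):
    """Number of IPv4 packets needed to carry one packet of total_len bytes
    over a link with the given MTU."""
    if total_len <= mtu:
        return 1
    payload = total_len - IPV4_HEADER
    # Every fragment except the last carries a payload that is a multiple of 8 bytes.
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    return math.ceil(payload / per_fragment)


print(ipv4_fragments(1500, 1500))  # prints 1
print(ipv4_fragments(1540, 1500))  # prints 2
```

A packet that is only 40 bytes over the limit already doubles the number of packets the modem has to process, which is where the bottleneck comes from.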
There are two approaches to resolve the issue. The first is to limit the MTU in your LAN to approximately 1200. The consequences would be:
a slightly slower LAN. Yes, you can live with it, since a virtual LAN at 10 Gbps with minimal latency compensates for the packet inefficiency. Your virtual network will still be much quicker than the real-life local networks out there.
you will not be able to use and test software requiring jumbo frames. This could be a deal breaker.
Striving for a universal solution leads to the second approach: shape the packets on capable hardware. Any modern CPU runs at frequencies of 1 GHz and above, while tapping recent RAM technology. This is more than enough by a factor of 10. In my setup, the virtual router at full load barely managed to produce a 4% load on one core of a 10th-gen U-series Intel CPU.
The SQM service. Test the Internet speed with your favorite tool. Install the package luci-app-sqm. Set in the GUI:
download speed at 85% of the pipe,
upload speed at 85% of the pipe,
the setup script: piece_of_cake.qos,
Parametrize the physical layer adaptation. For an Ethernet fiber channel the following works:
packet overhead: 44,
packet max size: 1544,
Then save and reboot. Test the Internet speed by your favorite tool.
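The 85% figures are quick to work out. A minimal sketch, assuming the GUI expects rates in kbit/s (check the field labels in your version) and using the 100 Mbps / 5 Mbps line from the example earlier as the measured values; the function name is made up:

```python
SHAPING_PERCENT = 85  # share of the measured pipe, as suggested in the text


def shaped_rate_kbit(measured_kbit):
    """Return 85% of a measured rate in kbit/s, using integer arithmetic."""
    return measured_kbit * SHAPING_PERCENT // 100


download = shaped_rate_kbit(100_000)  # 100 Mbps measured download
upload = shaped_rate_kbit(5_000)      # 5 Mbps measured upload

print(download, upload)  # prints "85000 4250"
```

Measure your own line first; the numbers above only illustrate the calculation.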
4. VPN setup
This section will not be expanded, since the VPN protocol selection and setup depend on many variables outside an individual's control. It makes no sense to dive into the setup and settings for one particular choice.
In my case tinc in layer 2 switch mode is used. It mandatorily connects to three other sites and dynamically connects to many other available subnets on the intranet. Every site has its own /24 segment with IP assignments like 192.168.segment.machine, while the entire intranet is a /16, i.e. the network mask is 255.255.0.0. This makes it the hardest use case for the Microsoft networking stack. The test has been passed. All machines and ARP services are available to any VPC and the host.
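The addressing scheme can be checked with Python's standard ipaddress module. This is just a sketch: the helper names are made up, and segment 7 and machine 42 are hypothetical numbers.

```python
import ipaddress

# The entire intranet is a /16, as described in the text.
INTRANET = ipaddress.ip_network("192.168.0.0/16")


def site_network(segment):
    """The /24 segment of one site: 192.168.<segment>.0/24."""
    return ipaddress.ip_network(f"192.168.{segment}.0/24")


def machine_address(segment, machine):
    """The address of one machine: 192.168.<segment>.<machine>."""
    return ipaddress.ip_address(f"192.168.{segment}.{machine}")


site = site_network(7)          # hypothetical site number
host = machine_address(7, 42)   # hypothetical machine number

print(INTRANET.netmask)          # prints 255.255.0.0
print(site.subnet_of(INTRANET))  # prints True
print(host in site)              # prints True
```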
5. Samba setup
Any virtual network is a good thing until it isn't. In a physical network environment one can attach any device and use the network services. In a virtual environment only the virtual instances have access. What if you have mobile compute units and need to synchronize data or access databases via a mobile app? The solution is a pass-through from the physical extranet into the virtual intranet. Here it is detailed for samba services, i.e. the file sharing that is commonly enabled in Windows via the Network tab in Explorer.
Install the samba service and auxiliary utilities:
Login to the virtual router via SSH and modify the setup file /etc/samba/smb.conf.template
The samba server on the virtual router holds no data resources of its own that are of interest. Its only purpose is to enable controlled, selective access to intranet resources. The samba server acts as a kind of smart firewall appliance bridging two subnets. The method is to establish a discoverable samba service on the wan (outbound) interface. The shares of the service are network resources, which are mounted via the lan (inbound) interface. The data is located elsewhere and is typically accessed via VPN.
Modify the setup in the file /etc/config/samba. Add the interface on which to serve:
Add to the file /etc/config/samba one or multiple sections like the following. Each section represents one resource on the intranet. The variable 'path' indicates where the resource is mounted within the router file system, here the folder /mnt/remote_resource_1. The folder has to exist prior to its use. Create it with the command mkdir /mnt/remote_resource_1.
Setup firewall. Add into the file /etc/config/firewall the following content:
The next step is to create the users who can access the shares. It is a standard routine that depends on the actual authentication scheme. I won't expand on that here. The simplest scheme would be local user accounts. It is good enough to test the functionality and to control it locally.
Execute the following commands in the CLI on the virtual router. When creating passwords, do not provide input for the user samba, just hit Enter. This no-password user serves to test that simplified logon is disabled. The other user (supposedly your name) has to have a password of serious complexity. The same passwords have to be entered multiple times, once for the system entry and then for the samba entry.
Samba becomes operational in stealth mode after a router reboot. The user has to type the router's wan IP into the Explorer address bar to access the shares. The WSD service enables resource discovery in the Windows 10 and Windows 11 Network tab, hence via the GUI.
At the time of writing the OpenWRT implementation of the wsdd2 service can operate on one interface at a time. It is usually linked to lan or loopback. To ensure it is linked to wan, please check that the file /etc/init.d/wsdd2 contains the following line:
Last but not least, one actually has to link the resources to the declared shares. You can use the 'mount' command when the VPN link comes up. A tinc profile has a tinc-up file, which is a shell script executed upon that event. Add the following lines to its end. The number of mount lines has to equal the number of declared 'sambashare's. The sleep command gives some leeway to establish VPN connections with the other hosts and to propagate intranet service availability messages.
Set the 'wsize' variable to a multiple of SO_RCVBUF. Do not set either of them too big. The router by default has a cumulative buffer in the range of tens of MB. Protocol developers meant it to speed up local disk operations when writing jumbo blocks. The large buffer actually does a disservice when a remote disk is operated.
After a local client uploads all its data to the samba server, it won't receive an acknowledgement until all data is actually stored on the disk. For a remotely attached disk this means: until all data is sent through the cifs interface. This link is much slower than data transfer on the LAN and is the bottleneck in the setup. The typical situation is that all buffers are full and waiting until the data is pumped through cifs. Enabling asynchronous operation makes the situation even worse. At the end of writing a file the client stops transmission and waits a few seconds for the acknowledgement. The acknowledgement may come minutes later if the buffers are big. If the upload speed to a far-away resource is slow, the delay caused by processing a large buffer can be tremendous. It will trip timeout timers in the client's software and end batch uploads right there.
A smaller buffer is better for slow writes. The size of the buffer is the sum of the sizes set by the SO_RCVBUF and wsize parameters. The two consecutive buffers are also better balanced by a multiplier that reflects the expected speed difference between LAN and VPN, i.e. max [lan speed / vpn speed]. If the VPN speed differs between assignment sites, then the lowest of them will likely dictate the formula outcome. LAN and WiFi speeds are roughly identical.
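A back-of-the-envelope sketch of the effect: the time needed to drain the combined buffer through the slow link. The function name and the buffer sizes are hypothetical; the 400 Kbps figure is the degraded upload speed from the example earlier. Protocol overhead is ignored.

```python
def drain_seconds(so_rcvbuf_bytes, wsize_bytes, vpn_bits_per_second):
    """Time to push the combined buffer (SO_RCVBUF + wsize) through the VPN link."""
    total_bytes = so_rcvbuf_bytes + wsize_bytes
    return total_bytes * 8 / vpn_bits_per_second


VPN_UPLOAD = 400_000  # 400 Kbps, the slow upload from the example above

# Hypothetical small buffers: 64 KiB each.
small = drain_seconds(64 * 1024, 64 * 1024, VPN_UPLOAD)
# Hypothetical large buffers: 10 MiB each, i.e. "tens of MB" in total.
large = drain_seconds(10 * 1024 * 1024, 10 * 1024 * 1024, VPN_UPLOAD)

print(round(small, 1))       # prints 2.6 (seconds)
print(round(large / 60, 1))  # prints 7.0 (minutes)
```

With the small buffers the client waits a couple of seconds for its acknowledgement; with the large ones it waits minutes, which is enough to trip typical client timeouts.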
When the VPN link goes down, unmount the resources. Add the lines to the 'tinc-down' file:
You are all set if you've successfully gotten to this point. Good luck!