Basic Networking Reference

A condensed set of notes on networking basics. Not intended as an educational resource; more of a short walk-through of the basics.

Loopback Interface

The loopback device is a special, virtual network interface that a computer uses to communicate with itself. It is used mainly for diagnostics and troubleshooting, and to connect to servers running on the local machine.

When a network interface is disconnected, e.g. an Ethernet cable is unplugged or wifi is turned off, no communication over that interface is possible, not even between the computer and itself. The loopback interface does not represent any actual hardware; it exists so that applications running on the computer can always connect to servers on the same machine. (E.g. Freeciv, a client-server application, would not be able to function without the loopback.)

The loopback device is sometimes explained as purely a diagnostic tool, and it is indeed important for troubleshooting. But it is just as helpful whenever a server is running on the machine and a client on the same machine wants to access its resources.

This is nice for web development, for example: a web server can listen on the loopback interface only, so the work-in-progress state of the site is never exposed to the outside world.

127.0.0.1

For IPv4, the loopback interface is assigned the whole 127.0.0.0/8 address block. That is, 127.0.0.1 through 127.255.255.254 all represent the local machine. Most importantly, though, 127.0.0.1 is the default loopback address, commonly known as localhost, and is normally the first entry in /etc/hosts. Applications tend to assume this mapping, so pointing localhost at another IP is usually not recommended.

Thus, to log in as root via SSH to the SSH server running on your own machine, you would use:

ssh root@localhost

Like other network adapters, the loopback device shows up in the output of ip addr. Its name is lo:

root@linuxpc:~# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
...
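
The 127.0.0.0/8 rule can be checked with a few lines of shell. This is only a sketch: is_loopback_ipv4 is a made-up helper name, not a standard command, and it assumes plain dotted-decimal input.

```shell
# is_loopback_ipv4 ADDRESS
# Hypothetical helper: succeeds (exit 0) when ADDRESS is a valid dotted-decimal
# IPv4 address inside the loopback block 127.0.0.0/8.
is_loopback_ipv4() {
    case "$1" in
        *[!0-9.]*|*..*|.*|*.) return 1 ;;   # only digits separated by single dots
    esac
    # Split on dots inside a subshell so IFS stays untouched for the caller.
    (
        IFS=.
        set -- $1
        [ "$#" -eq 4 ] || exit 1
        for octet in "$@"; do
            [ "$octet" -le 255 ] || exit 1
        done
        [ "$1" -eq 127 ]                    # the /8 only fixes the first octet
    )
}

for addr in 127.0.0.1 127.0.1.1 192.168.2.37; do
    if is_loopback_ipv4 "$addr"; then
        echo "$addr is a loopback address"
    else
        echo "$addr is not a loopback address"
    fi
done
```

Only the first octet matters, which is why 127.0.1.1 counts as loopback just like 127.0.0.1.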

127.0.1.1

Some software expects the system hostname to resolve to an IP address even when there is no outside network access. For that case, Debian chose to map the hostname to 127.0.1.1 in /etc/hosts.

The host_name in that entry matches the hostname defined in /etc/hostname.

This is a Debian convention; on other systems 127.0.1.1 might not have any special significance.
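
To see which address a name maps to in a hosts-format file, something like the following can be used. It is a rough sketch: hosts_lookup is a hypothetical helper, and it only reads the file, so it says nothing about DNS or other lookup sources.

```shell
# hosts_lookup NAME [FILE]
# Hypothetical helper: prints the first address that FILE (default /etc/hosts)
# maps NAME to, and fails if there is no such entry.
hosts_lookup() {
    awk -v name="$1" '
        { sub(/#.*/, "") }                  # drop comments
        NF >= 2 {
            for (i = 2; i <= NF; i++)
                if ($i == name) { print $1; found = 1; exit }
        }
        END { exit !found }
    ' "${2:-/etc/hosts}"
}

hosts_lookup localhost || echo "localhost is not listed in /etc/hosts"
```

On a Debian machine, hosts_lookup "$(hostname)" would typically print 127.0.1.1, assuming the installer wrote that entry.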

Physical Interfaces

Since a machine can have a large number of interfaces, virtual and physical, it’s good to be able to identify the physical ones. The command to do so is:

lshw -class network

To see ALL devices, whether active or not:

ip addr

Use ip link set to bring devices up or to bring them down:

ip link set dev $INTERFACE_NAME up
ip link set dev $INTERFACE_NAME down
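
If lshw is not installed, sysfs gives a rough answer too. The sketch below assumes the usual Linux layout, where an interface backed by a device has a device entry under /sys/class/net/NAME. classify_ifaces is a hypothetical helper, and the heuristic is not perfect: virtio NICs inside a VM also show up as "physical".

```shell
# classify_ifaces [SYSFS_DIR]
# Hypothetical helper: prints "NAME physical" or "NAME virtual" for every
# interface directory found in SYSFS_DIR (default /sys/class/net).
classify_ifaces() {
    for path in "${1:-/sys/class/net}"/*; do
        [ -d "$path" ] || continue          # skip plain files and unmatched globs
        name=${path##*/}
        if [ -e "$path/device" ]; then
            echo "$name physical"
        else
            echo "$name virtual"
        fi
    done
}

classify_ifaces
```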

Domains and DNS

[ Local PC ]
       |
       x (Asks for "nas.local" OR "google.com")
[ Local DNS Resolver (optional, most home networks do not need this) ] (Matches "nas.local"? Yes!)----> [ Returns Local IP ]
       |
       x (No match for "google.com")
[ Public Internet DNS (1.1.1.1) ] ----------------> [ Returns Public IP ]

An example of a local DNS resolver is PowerDNS: basically an application with a MySQL backend that stores the domain name translations. The InnoDB engine provides the actual long-term storage, and some internal application cache serves frequently requested data.

The Mechanism

Linux relies on a central configuration file called /etc/nsswitch.conf (Name Service Switch) to dictate the lookup priority when trying to resolve a domain name to an IP:

root@linuxpc:/# cat /etc/nsswitch.conf | grep "^hosts"
hosts:          files mdns4_minimal [NOTFOUND=return] dns myhostname

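In this example, the sources are tried from left to right:

files: the static entries in /etc/hosts.
mdns4_minimal [NOTFOUND=return]: multicast DNS for .local names; if mDNS reports the name as not found, the lookup stops here.
dns: the DNS servers listed in /etc/resolv.conf.
myhostname: a fallback that resolves the machine’s own hostname.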

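Use nslookup to check what DNS itself returns (it queries the DNS server directly and ignores the other sources in /etc/nsswitch.conf, whereas getent hosts follows the full lookup order):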

root@linuxpc:/# nslookup google.com
Server:         1.1.1.1
Address:        1.1.1.1#53

Non-authoritative answer:
Name:   google.com
Address: 172.217.23.238
Name:   google.com
Address: 2a00:1450:400e:806::200e

For a simpler check of whether a host is reachable at all, ping -c3 is good.
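
To print the lookup order from /etc/nsswitch.conf one source per line, a small awk filter is enough. This is a sketch with a hypothetical helper name, and it assumes action items such as [NOTFOUND=return] contain no spaces.

```shell
# hosts_lookup_order [FILE]
# Hypothetical helper: prints the sources on the "hosts:" line of FILE
# (default /etc/nsswitch.conf) in the order they are tried, skipping
# bracketed action items such as [NOTFOUND=return].
hosts_lookup_order() {
    awk '$1 == "hosts:" {
        for (i = 2; i <= NF; i++)
            if ($i !~ /^\[/) print $i
    }' "${1:-/etc/nsswitch.conf}"
}

if [ -r /etc/nsswitch.conf ]; then
    hosts_lookup_order
fi
```

For the example line shown earlier this prints files, mdns4_minimal, dns and myhostname, each on its own line.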

Key Terms to Know

Sidenote on Routing

Without routing, the machine knows who it wants to talk to, but not where to send the packet next.

DNS tells a machine what IP address belongs to a domain name. Routing tells a machine where packets should be sent to reach that IP address.

For example, I want to reach my desktop PC from my test machine:

somebody@debian-test:~$ ping -c3 linuxpc
PING linuxpc (192.168.2.37) 56(84) bytes of data.
64 bytes from linuxpc (192.168.2.37): icmp_seq=1 ttl=64 time=0.057 ms
64 bytes from linuxpc (192.168.2.37): icmp_seq=2 ttl=64 time=0.135 ms
64 bytes from linuxpc (192.168.2.37): icmp_seq=3 ttl=64 time=0.139 ms

--- linuxpc ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2028ms
rtt min/avg/max/mdev = 0.057/0.110/0.139/0.037 ms

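Now the operating system has an IP address, but it still needs to decide which interface to send the packet out of, and whether the destination can be reached directly or only through a gateway.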

It can be checked using ip route, which shows the routing table:

somebody@debian-test:~$ ip route
default via 10.100.0.1 dev ens3 onlink
10.100.0.0/24 dev ens3 proto kernel scope link src 10.100.0.10

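Breaking it down:

default via 10.100.0.1 dev ens3: any destination without a more specific route is handed to 10.100.0.1 through the ens3 interface.
10.100.0.0/24 dev ens3 ... src 10.100.0.10: addresses in 10.100.0.0/24 are on the local network and are reached directly through ens3, using 10.100.0.10 as the source address.

10.100.0.1 is the gateway in this case, meaning the packet goes out to an interface of another machine, where the routing table of that machine takes over and decides the next hop. Basically, gateway ~ “The machine I give packets to when I don’t know how to reach the destination myself.”

On Debian, the gateway can be configured in /etc/network/interfaces.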
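
The decision the kernel makes with the two-line routing table above can be imitated in a few lines. This is a simplified sketch, not how the kernel implements it: ip_to_int, in_cidr and route_for are hypothetical helpers, the table is hard-coded from the example output, and addresses are assumed to be canonical dotted-decimal.

```shell
# Convert a dotted-decimal IPv4 address to an integer.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# in_cidr IP NETWORK/PREFIX: succeeds when IP lies inside the network.
in_cidr() (
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    [ $(( ip & mask )) -eq $(( net & mask )) ]
)

# route_for DESTINATION: the connected subnet wins over the default route
# because it is the more specific match.
route_for() {
    if in_cidr "$1" 10.100.0.0/24; then
        echo "$1: direct on ens3"
    else
        echo "$1: via gateway 10.100.0.1 on ens3"
    fi
}

route_for 10.100.0.25
route_for 192.168.2.37
```

On a real machine, ip route get 192.168.2.37 asks the kernel the same question.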

NAT

Most of the internet still runs on IPv4 addresses, but a 32-bit address caps out at about 4.3 billion unique values (2^32), which is a lot, but not enough to cover the needs of the world.

NAT was invented to save the internet from running out of IP addresses, by allowing households and office networks to share one single public IP address.
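
The address math is easy to reproduce with shell arithmetic. A small sketch follows; addresses_in_prefix is a hypothetical helper name, and a shell with 64-bit arithmetic is assumed.

```shell
# An IPv4 address is 32 bits wide, so there are 2^32 of them in total.
ipv4_total=$(( 1 << 32 ))
echo "IPv4 addresses in total: $ipv4_total"

# addresses_in_prefix PREFIX_LENGTH: a /N block contains 2^(32 - N) addresses.
addresses_in_prefix() {
    echo $(( 1 << (32 - $1) ))
}

echo "Addresses in a /8:  $(addresses_in_prefix 8)"
echo "Addresses in a /24: $(addresses_in_prefix 24)"
```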

Public vs Private IPs

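To make NAT work, the internet standards bodies reserved three specific blocks of IP addresses to be used strictly inside private networks. These are called Private IPs:

10.0.0.0/8 (10.0.0.0 to 10.255.255.255)
172.16.0.0/12 (172.16.0.0 to 172.31.255.255)
192.168.0.0/16 (192.168.0.0 to 192.168.255.255)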

The Rule: Private IP addresses are not routable on the public internet. Routers on the internet are set up to drop any packet that comes from or goes to a private address such as 192.168.X.X. This is why millions of people around the world can use the exact same address 192.168.1.15 inside their own homes simultaneously without causing a conflict.
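
Whether an address is private can be decided from the first two octets alone. The sketch below checks the three reserved private ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16). is_private_ipv4 is a hypothetical helper, and it assumes well-formed dotted-decimal input.

```shell
# is_private_ipv4 ADDRESS
# Hypothetical helper: succeeds when ADDRESS is in one of the private ranges.
is_private_ipv4() (
    IFS=.
    set -- $1
    [ "$#" -eq 4 ] || exit 1
    if [ "$1" -eq 10 ]; then exit 0; fi                                       # 10.0.0.0/8
    if [ "$1" -eq 172 ] && [ "$2" -ge 16 ] && [ "$2" -le 31 ]; then exit 0; fi # 172.16.0.0/12
    if [ "$1" -eq 192 ] && [ "$2" -eq 168 ]; then exit 0; fi                   # 192.168.0.0/16
    exit 1
)

for addr in 192.168.1.15 172.20.0.5 99.88.77.66; do
    if is_private_ipv4 "$addr"; then
        echo "$addr is private"
    else
        echo "$addr is public"
    fi
done
```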

So how does a computer access the internet behind a router?

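The router does some post-processing once it gets a request from a device on its network: it replaces the private source IP and port of the packet with its own public IP and a free external port, then forwards the packet to the destination. When the reply arrives on that external port, it reverses the swap and hands the packet to the original device.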

The NAT Table: To remember which device on the local network actually asked for this page, the router notes down the swap in a temporary internal ledger called a NAT Translation Table:

Private Source (Internal)    Assigned Port (External)    Destination Website
192.168.1.15 : 51200         Port 61001                  website.com : 443

The main idea is that now, only one IP needs to be exposed towards the public internet, rather than a unique IP for each device.
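
The bookkeeping behind the NAT table can be mimicked with a plain shell variable. This is a toy model under loud assumptions: nat_outbound, nat_inbound and the starting port 61001 are made up for illustration, and a real router also keys each entry on protocol and destination and expires idle entries.

```shell
NAT_TABLE=""          # one "external_port private_ip:private_port" entry per line
NAT_NEXT_PORT=61001   # next free external port (illustrative starting value)

# nat_outbound PRIVATE_IP PRIVATE_PORT
# Sets NAT_PORT to the external port for this flow, reusing an existing
# entry when there is one and creating a new entry otherwise.
nat_outbound() {
    NAT_PORT=$(printf '%s\n' "$NAT_TABLE" | awk -v k="$1:$2" '$2 == k { print $1; exit }')
    if [ -z "$NAT_PORT" ]; then
        NAT_PORT=$NAT_NEXT_PORT
        NAT_TABLE="${NAT_TABLE}${NAT_PORT} $1:$2
"
        NAT_NEXT_PORT=$(( NAT_NEXT_PORT + 1 ))
    fi
}

# nat_inbound EXTERNAL_PORT
# Prints the private endpoint a reply on EXTERNAL_PORT belongs to.
nat_inbound() {
    printf '%s\n' "$NAT_TABLE" |
        awk -v p="$1" '$1 == p { print $2; found = 1; exit } END { exit !found }'
}

nat_outbound 192.168.1.15 51200
echo "outgoing packet leaves from external port $NAT_PORT"
echo "reply on port $NAT_PORT goes back to $(nat_inbound "$NAT_PORT")"
```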

Virtual Machines

When a VM is configured to use NAT mode, the host machine acts as something like a router for it, typically through a masquerade rule such as:

iptables -t nat -A POSTROUTING -s 192.168.100.0/24 -o wlo1 -j MASQUERADE

[ Virtual Machine ]  (IP: 192.168.100.2 : Internal IP of the VM, only the host OS knows this)
       |
       x (Passes through Virtual NAT engine on Host)
[ Host Computer ]    (IP: 192.168.2.37 : Internal IP of the PC)
       |
       x  (Passes through Physical NAT engine on Router)
[ Physical Router ]  (IP: 99.88.77.66  : Public internet IP)
       |
       x
[ Public Internet ]

The Alternative: Bridged Mode

VLAN

VLANs subdivide a LAN into separate logical networks. This is done by tagging traffic.

This is very different from NAT, which works at Layer 3: NAT setups separate networks by giving them different IP address pools (192.168.100.X vs 192.168.200.X).

VLAN tagging works at Layer 2: it isolates traffic by inserting a 4-byte 802.1Q tag into the Ethernet frame header, completely ignoring what the IP addresses inside the frame are.
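
To make the tag less abstract: an 802.1Q tag is 4 bytes, the fixed identifier 0x8100 followed by 3 bits of priority, 1 drop-eligible bit and the 12-bit VLAN ID. The sketch below just assembles those bytes as hex. dot1q_tag is a hypothetical helper and nothing here touches a real interface.

```shell
# dot1q_tag VLAN_ID [PRIORITY]
# Hypothetical helper: prints the 4-byte 802.1Q tag for VLAN_ID as hex.
dot1q_tag() (
    vid=$1
    pcp=${2:-0}
    if [ "$vid" -lt 1 ] || [ "$vid" -gt 4094 ]; then
        echo "VLAN ID must be between 1 and 4094" >&2
        exit 1
    fi
    # 0x8100, then priority (3 bits), drop-eligible bit (0 here), VLAN ID (12 bits)
    printf '8100%04x\n' $(( (pcp << 13) | vid ))
)

dot1q_tag 100
dot1q_tag 200
```

The 12-bit ID field is the reason VLAN IDs only go up to 4094 (0 and 4095 are reserved).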

Before trying to work on VLANs, it’s good to check whether the kernel’s 802.1Q VLAN module (8021q) is loaded on your device:

somebody@linuxpc:~$ lsmod | grep 8021q
8021q                  45056  0
garp                   20480  1 8021q
mrp                    20480  1 8021q

It is possible to create VLANs directly with the ip link command, e.g.

ip link add link virbr-wifi name virbr-vlan100 type vlan id 100
ip addr add 192.168.101.1/24 dev virbr-vlan100
ip link set dev virbr-vlan100 up

This virtual interface will behave just like any normal interface. All traffic routed to it will go through the master interface (in this example, virbr-wifi) but with a VLAN tag.

Using the same command, we can inspect it:

somebody@linuxpc:~$ ip -d link show virbr-vlan100
8: virbr-vlan100@virbr-wifi:  mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 52:54:00:fc:4b:90 brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 0 maxmtu 65535
    vlan protocol 802.1Q id 100  addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535

It can be destroyed with the usual commands:

ip link set dev virbr-vlan100 down
ip link delete virbr-vlan100

Interfaces created this way exist only in runtime memory and won’t stick around after a reboot. For a persistent configuration, use either /etc/network/interfaces or, in the case of KVM-managed VMs, libvirt XML configs and hooks.

Libvirt

With libvirt, the process of creating a VLAN is fairly straightforward. Here we will go through a simple setup.

For example, it may be desirable to separate groups of VMs into their own networks, to make sure a VM cannot somehow gain access to databases on an adjacent network.

Example desired set up, vlan100 and vlan200:

ip link add link virbr-wifi name virbr-vlan100 type vlan id 100
ip addr add 192.168.101.1/24 dev virbr-vlan100
ip link set dev virbr-vlan100 up

ip link add link virbr-wifi name virbr-vlan200 type vlan id 200
ip addr add 192.168.102.1/24 dev virbr-vlan200
ip link set dev virbr-vlan200 up

Which should look something like:

[ VM 1 (VLAN 100) ] ---\
                        --> [ virbr-wifi (Host Gateway) ] -> [ Routing Table ] -> [ NAT ] -> [ wlo1 ]
[ VM 2 (VLAN 200) ] ---/

To convert this to libvirt, create the following network definitions:

<network>
  <name>virbr-vlan100</name>
  <forward mode='bridge'/>
  <bridge name='virbr-wifi'/>
</network>

and

<network>
  <name>virbr-vlan200</name>
  <forward mode='bridge'/>
  <bridge name='virbr-wifi'/>
</network>

Remember that it will be necessary to assign IPs for the VMs inside these networks using /etc/network/interfaces on the VMs.

Define, start, and autostart the VLANs:

virsh net-define vlan100.xml
virsh net-start virbr-vlan100
virsh net-autostart virbr-vlan100
virsh net-define vlan200.xml
virsh net-start virbr-vlan200
virsh net-autostart virbr-vlan200
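
Since the two definitions differ only in the VLAN number, the XML files could also be generated. A sketch, assuming the same network and bridge names as above: vlan_net_xml is a hypothetical helper, and the virsh calls are left commented out so the snippet runs on a machine without libvirt.

```shell
# vlan_net_xml VLAN_ID
# Hypothetical helper: prints a network definition in the same shape as the
# two shown above.
vlan_net_xml() {
    cat <<EOF
<network>
  <name>virbr-vlan$1</name>
  <forward mode='bridge'/>
  <bridge name='virbr-wifi'/>
</network>
EOF
}

outdir=$(mktemp -d)
for id in 100 200; do
    vlan_net_xml "$id" > "$outdir/vlan$id.xml"
    # On a host with libvirt installed:
    # virsh net-define "$outdir/vlan$id.xml"
    # virsh net-start "virbr-vlan$id"
    # virsh net-autostart "virbr-vlan$id"
done
echo "definitions written to $outdir"
```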

Libvirt should automatically create iptables rules to let these networks access wlo1. However, as a side effect, it also allows them to route traffic to each other.

The virbr-wifi bridge must have VLAN filtering enabled, i.e. cat /sys/class/net/virbr-wifi/bridge/vlan_filtering should say 1.

Basically, the virtual networks used by the VLANs can be created via the libvirt XML directly, but the VLAN tagging itself cannot be added at that stage, so hooks are needed. When a network starts, a network hook does the interface setup, assigns the gateway IP address, and configures NAT routing so virtual machines on the VLAN can access the Internet via the wlo1 wifi interface. When a VM boots up, a second hook checks the name of the VM to determine which VLAN it should belong to.

Add the first one as the network hook, /etc/libvirt/hooks/network:

#!/bin/bash
NET_NAME="$1"
ACTION="$2"
VLAN_ID=$(echo "$NET_NAME" | grep -Eo "[0-9]+")

# the networks defined above are named virbr-vlan<ID>, so match on the vlan<ID> suffix
if [[ "$NET_NAME" =~ vlan[0-9]+$ ]] && [[ "$ACTION" = "started" ]]; then
    V_INTF="virbr-wifi.${VLAN_ID}"
    GATEWAY_IP="10.${VLAN_ID}.0.1/24"

    # enable vlan filtering on virbr-wifi interface, tells the interface to inspect and honor VLAN tags
    ip link set virbr-wifi type bridge vlan_filtering 1

    # creates a tagged virtual interface linked to the bridge for the specific VLAN, and assigns a gateway IP to this new interface
    ip link add link virbr-wifi name "$V_INTF" type vlan id "$VLAN_ID"
    ip addr add "$GATEWAY_IP" dev "$V_INTF"
    ip link set "$V_INTF" up

    sysctl -w net.ipv4.ip_forward=1 > /dev/null
    # hides the private VLAN IPs behind the host public IP
    iptables -t nat -C POSTROUTING -o wlo1 -j MASQUERADE 2>/dev/null || iptables -t nat -A POSTROUTING -o wlo1 -j MASQUERADE
    # allows outgoing traffic originating from the VLAN interface to pass out through wlo1
    iptables -A FORWARD -i "$V_INTF" -o wlo1 -j ACCEPT
    # allows incoming traffic back to vlan, if it belongs to a connection that a machine inside the vlan started first
    iptables -A FORWARD -i wlo1 -o "$V_INTF" -m state --state RELATED,ESTABLISHED -j ACCEPT
    # allows the bridge device itself to accept and process traffic tagged with this VLAN ID
    bridge vlan add dev virbr-wifi vid "${VLAN_ID}" self
    # libvirt traffic rules to allow traffic in/out
    iptables -I LIBVIRT_FWO 1 -i virbr-wifi -s 10."$VLAN_ID".0.0/24 -j ACCEPT
    iptables -I LIBVIRT_FWI 1 -o virbr-wifi -d 10."$VLAN_ID".0.0/24 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
fi

# when the network is stopped, delete the rules again
if [[ "$NET_NAME" =~ vlan[0-9]+$ ]] && [[ "$ACTION" = "stopped" ]]; then
    V_INTF="virbr-wifi.${VLAN_ID}"
    GATEWAY_IP="10.${VLAN_ID}.0.1/24"

    iptables -t nat -D POSTROUTING -o wlo1 -j MASQUERADE 2>/dev/null
    iptables -D FORWARD -i "$V_INTF" -o wlo1 -j ACCEPT 2>/dev/null
    iptables -D FORWARD -i wlo1 -o "$V_INTF" -m state --state RELATED,ESTABLISHED -j ACCEPT 2>/dev/null
    iptables -D LIBVIRT_FWO -i virbr-wifi -s 10."$VLAN_ID".0.0/24 -j ACCEPT 2>/dev/null
    iptables -D LIBVIRT_FWI -o virbr-wifi -d 10."$VLAN_ID".0.0/24 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT 2>/dev/null

    ip link delete "$V_INTF" 2>/dev/null
fi
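
Before installing the hook, it can help to dry-run how it derives its values from a network name. The sketch below copies only that naming logic. vlan_params is a hypothetical helper, and nothing is configured on the system.

```shell
# vlan_params NETWORK_NAME
# Hypothetical helper: prints "VLAN_ID INTERFACE GATEWAY_CIDR" using the same
# scheme as the hook above, and fails if the name contains no number.
vlan_params() (
    id=$(printf '%s\n' "$1" | grep -Eo '[0-9]+' | head -n 1)
    [ -n "$id" ] || exit 1
    echo "$id virbr-wifi.$id 10.$id.0.1/24"
)

vlan_params virbr-vlan100
vlan_params virbr-vlan200
```

Note that the 10.<ID>.0.1 scheme only yields a valid address for VLAN IDs up to 255, because the ID is used as the second octet.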

Now, one more hook is still needed. The goal of the next hook is to establish network isolation by watching for VMs to start, extracting a VLAN ID from the name of the VM, and immediately putting its network adapter into that specific VLAN.

The qemu hook (/etc/libvirt/hooks/qemu):

#!/bin/bash

BRIDGE_NAME="virbr-wifi"

VM_NAME="$1"
ACTION="$2"
PHASE="$3"

# Only execute when any VM moves into the fully started phase
if [ "$ACTION" = "started" ] && [ "$PHASE" = "begin" ]; then

    # Check if the VM name matches the pattern ending in -vlan followed by digits
    if [[ "$VM_NAME" =~ -vlan([0-9]+)$ ]]; then
        # Extract the matched digits (VLAN ID) from the regex capture group
        VLAN_ID="${BASH_REMATCH[1]}"

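        # Find the specific dynamic tap interface (vnetX) assigned to this VM on our bridge
        # (caution: the libvirt hooks documentation warns that calling back into libvirt,
        # e.g. via virsh, from inside a hook can deadlock; test this carefully)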
        for i in {1..5}; do
            TAP_IF=$(virsh domiflist "$VM_NAME" | grep "$BRIDGE_NAME" | awk '{print $1}')
            [ -n "$TAP_IF" ] && break
            sleep 0.5
        done

        # If a valid interface is found, apply the VLAN isolation rules
        if [ -n "$TAP_IF" ]; then
            bridge vlan del dev "$TAP_IF" vid 1 2>/dev/null
            bridge vlan add dev "$TAP_IF" vid "$VLAN_ID" pvid untagged
        fi
    fi
fi
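
The name matching in the qemu hook is worth testing on its own before trusting it with real VMs. The sketch below is a POSIX sh equivalent of the -vlan([0-9]+)$ regex used above. vm_vlan_id is a hypothetical helper and the VM names are made up.

```shell
# vm_vlan_id VM_NAME
# Hypothetical helper: prints the VLAN ID for names ending in -vlan<digits>
# and fails for every other name.
vm_vlan_id() (
    case "$1" in
        *-vlan*) id=${1##*-vlan} ;;     # text after the last "-vlan"
        *) exit 1 ;;
    esac
    case "$id" in
        ''|*[!0-9]*) exit 1 ;;          # must be digits only, and at least one
    esac
    echo "$id"
)

for vm in db-vlan100 web-vlan200 plain-vm; do
    if id=$(vm_vlan_id "$vm"); then
        echo "$vm -> VLAN $id"
    else
        echo "$vm -> no VLAN, left alone"
    fi
done
```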

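To block the VLANs from talking to one another, add a drop rule for traffic between the VLAN interfaces created by the network hook. iptables treats a trailing + in an interface name as a wildcard, so virbr-wifi.+ matches virbr-wifi.100, virbr-wifi.200, and so on. The rule is lost on reboot unless it is saved, e.g. with iptables-persistent:

iptables -A FORWARD -i virbr-wifi.+ -o virbr-wifi.+ -j DROP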

Finally, remember that it will be necessary to assign IPs for the VMs inside these networks using /etc/network/interfaces on the VMs, e.g.

auto ens3
allow-hotplug ens3
iface ens3 inet static
    address 10.100.0.10
    netmask 255.255.255.0
    gateway 10.100.0.1
    dns-nameservers 1.1.1.1 8.8.8.8

That is most of what is needed for a basic VLAN setup.

Address Resolution Protocol

This is the translation layer for intra-LAN communication.

Switches and network cards do not understand IP addresses. At the data link layer (Layer 2), devices can only talk to each other using hardware MAC addresses (like 52:54:00:fc:4b:90).

E.g. if there are two MariaDB servers on the same local network, with replication from 192.168.2.37 to 192.168.2.38, the frames they exchange are addressed by MAC address, not by IP address.

Before the master can send a single packet to the replica, it has to shout out a question to the entire local network using an ARP broadcast: “Who has IP 192.168.2.38? Tell 192.168.2.37!” The replica hears the shout and replies directly: “That’s me! Here is my MAC address.” The master server saves this mapping in its local ARP cache (viewable with arp -an) so it can bypass the broadcast for the next few minutes.
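
On Linux the ARP cache is also readable as a text table in /proc/net/arp, with the IP address in the first column and the hardware address in the fourth. A sketch for looking up a single entry follows. arp_lookup is a hypothetical helper and the addresses in the usage comment are the example values from this section.

```shell
# arp_lookup IP [FILE]
# Hypothetical helper: prints the MAC address cached for IP, reading a table
# in /proc/net/arp format (header line first), and fails if IP is not cached.
arp_lookup() {
    awk -v ip="$1" '
        NR > 1 && $1 == ip { print $4; found = 1; exit }
        END { exit !found }
    ' "${2:-/proc/net/arp}"
}

# Example use on the master from the scenario above:
# arp_lookup 192.168.2.38 || echo "not cached yet, an ARP broadcast is needed first"
```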

Ports

Hitting an IP address with packets on its own does little good, because the receiving computer won’t magically just know what should process those packets. This is where ports enter the picture.

In Linux, when an application starts up and wants to use port xyz, it must ask the OS: “I want to listen on port xyz”. The kernel then records that the port belongs to a socket of that application. From then on, any packet arriving at this host’s IP with destination port xyz is delivered to that application.

Ports are a transport layer concept: TCP and UDP headers each have dedicated fields for the source and destination port numbers. Network layer protocols, like IP, are unaware of what port is in use in the network connection; in a standard IP header, there is no place to indicate which port the data packet should go to. Port numbers are 16 bits wide, so they range from 0 to 65535, with port 0 reserved.
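
Both TCP and UDP headers start with the same two fields: 16 bits of source port followed by 16 bits of destination port. The sketch below decodes them from a hex dump of the first four header bytes. header_ports is a hypothetical helper and the sample bytes are hand-written, not captured traffic.

```shell
# header_ports HEX
# Hypothetical helper: prints "SOURCE_PORT DESTINATION_PORT" decoded from the
# first four bytes (eight hex digits) of a TCP or UDP header.
header_ports() (
    src=$(printf '%s' "$1" | cut -c1-4)
    dst=$(printf '%s' "$1" | cut -c5-8)
    echo "$(( 0x$src )) $(( 0x$dst ))"
)

# 0x8e2e = 36398 (a client-side port), 0x0cea = 3306 (MariaDB)
header_ports 8e2e0cea
```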

Ephemeral Ports

Some higher-range ports are reserved for use as ephemeral ports.

root@linuxpc:/# lsof -i -P -n | grep 3306
mariadbd     1575           mysql   24u  IPv4   14474      0t0  TCP *:3306 (LISTEN)
mariadbd     1575           mysql   25u  IPv6   14475      0t0  TCP *:3306 (LISTEN)
mariadbd     1575           mysql   51u  IPv4 4048038      0t0  TCP 192.168.2.37:3306->192.168.2.37:36398 (ESTABLISHED)

MariaDB is listening on a well-known port, the standard 3306. A client application on 192.168.2.37 is connected to the database, and Linux automatically assigned it port 36398 from its dynamic range (32768 to 60999 by default, configurable in /proc/sys/net/ipv4/ip_local_port_range) to handle this temporary session. Essentially, ephemeral ports are there to handle the client side of a communication session. It is best to avoid manually assigning services to ports in this range.
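
Whether a given port falls into the dynamic range of the current machine can be checked against /proc/sys/net/ipv4/ip_local_port_range, which holds the low and high bound on one line. A sketch with a hypothetical helper name:

```shell
# is_ephemeral_port PORT [RANGE_FILE]
# Hypothetical helper: succeeds when PORT lies inside the range stored in
# RANGE_FILE (default: the kernel setting under /proc).
is_ephemeral_port() (
    read -r low high < "${2:-/proc/sys/net/ipv4/ip_local_port_range}" || exit 1
    [ "$1" -ge "$low" ] && [ "$1" -le "$high" ]
)

if [ -r /proc/sys/net/ipv4/ip_local_port_range ]; then
    for port in 3306 36398; do
        if is_ephemeral_port "$port"; then
            echo "$port is in the ephemeral range"
        else
            echo "$port is outside the ephemeral range"
        fi
    done
fi
```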

Some special port numbers

There are many, many more. You can always use lsof -i -P -n to inspect which ports are in use on a machine.

If you just want to know which ports a particular process is using, you can do:

lsof -i -P -n -a -p "$(pgrep -d, -f "process name")"

Blocking Ports

iptables can be used to block ports, e.g. if you are not using PostgreSQL, you might choose to just block the port entirely:

iptables -A INPUT -p tcp --destination-port 5432 -j DROP

If you only use it for testing and only want access from one test machine, you can get more granular. Rule order matters here, so the ACCEPT has to come before the DROP:

iptables -A INPUT -p tcp -s 192.168.1.100 --dport 5432 -j ACCEPT
iptables -A INPUT -p tcp --dport 5432 -j DROP

More can be done, like also blocking outgoing traffic, but this is not an iptables manual.

TCP/UDP sidenote

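TCP is connection-oriented: it acknowledges received data and retransmits lost packets. UDP is connectionless and simply sends packets without any delivery guarantee.

This means UDP is somewhat faster, because there is no handshake, no waiting for acknowledgements, and no retransmission of missed packets.

TCP does a three-way handshake when establishing a connection:

SYN: the client asks to open a connection.
SYN-ACK: the server acknowledges and agrees.
ACK: the client acknowledges in turn, and data can start to flow.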
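
The sequence-number bookkeeping of the handshake can be written out as a tiny simulation. No packets are sent. three_way_handshake is a hypothetical helper and the initial sequence numbers are arbitrary example values.

```shell
# three_way_handshake CLIENT_ISN SERVER_ISN
# Hypothetical helper: prints the three segments of a TCP handshake. Each
# side acknowledges the other side's initial sequence number plus one.
three_way_handshake() {
    echo "client -> server: SYN seq=$1"
    echo "server -> client: SYN-ACK seq=$2 ack=$(( $1 + 1 ))"
    echo "client -> server: ACK seq=$(( $1 + 1 )) ack=$(( $2 + 1 ))"
}

three_way_handshake 1000 5000
```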

DHCP

When a device connects to wifi, or a VM boots up on a new virtual bridge, it has no idea what the network layout looks like. It does not know its own IP address, it does not know the IP of the router, and it does not know where the DNS servers are. The option we’ve already seen is to assign IPs manually, but that’s not always a good idea.

DHCP solves this using a fast, high-trust packet exchange called D.O.R.A., which runs entirely over UDP ports 67 and 68.

The D.O.R.A. Handshake

Because a newly connected machine does not have an IP address yet, it cannot send a standard network packet. It has to yell out into the darkness using Layer 2 broadcasts (sending packets to the MAC address FF:FF:FF:FF:FF:FF).

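The steps are as follows:

Discover: the client broadcasts a request asking whether any DHCP server is out there.
Offer: a DHCP server answers with an offer, for example:
    Leased IP: 192.168.2.50
    Local Subnet Mask: 255.255.255.0
    Local Default Gateway (Router IP): 192.168.2.1
    Available DNS Servers: 1.1.1.1 and 8.8.8.8
Request: the client asks to take the offered lease.
Acknowledge: the server confirms, and the client configures its interface with the leased values.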

Dealing with the Flight:

DHCP reservation: the router or virtual bridge can be configured with a permanent rule, telling the DHCP server: “Whenever you see the hardware MAC address of host machine X, always hand out the same specific IP address. Never let anyone else have it, and never change it.”

Static IP: bypasses the DHCP server completely. Using the network configuration of the chosen machine (like /etc/network/interfaces), manually hardcode the IP address configuration. Because the machine never sends out that initial “DISCOVER” broadcast packet, the DHCP server is completely skipped.

Note: If you do this, you must make sure the static IP you choose sits completely outside the automated DHCP “pool” range on your router, or your router might accidentally hand the same IP to another machine at the same time, causing an IP address conflict that drops both machines offline.
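
A quick way to sanity-check a static address against the DHCP pool is sketched below. The pool bounds (192.168.2.50 to 192.168.2.150) are made up for illustration, so look up the real range in the router configuration. ip_to_int and in_pool are hypothetical helpers.

```shell
# Convert a dotted-decimal IPv4 address to an integer so ranges can be compared.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# in_pool IP POOL_START POOL_END: succeeds when IP lies inside the pool.
in_pool() (
    ip=$(ip_to_int "$1")
    [ "$ip" -ge "$(ip_to_int "$2")" ] && [ "$ip" -le "$(ip_to_int "$3")" ]
)

for candidate in 192.168.2.37 192.168.2.50; do
    if in_pool "$candidate" 192.168.2.50 192.168.2.150; then
        echo "$candidate is inside the DHCP pool, pick another static IP"
    else
        echo "$candidate is outside the DHCP pool, safe to use as a static IP"
    fi
done
```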