# DHCP problem



## hac3ru (Feb 26, 2014)

Hello,

I have a /20 network divided into 16 aliases. I want to allow unknown MACs on this network, in the 192.168.14.0/24 and 192.168.15.0/24 ranges. I edited dhcpd.conf so it looks like this:

```
default-lease-time 86400;
ddns-update-style interim;
log-facility local3;    # Note: syslog-ng filters by program
#
# Network options
#
shared-network camin
{
    authoritative;
    option domain-name "home.ro";
    option domain-name-servers 8.8.8.8;

    subnet 192.168.15.0 netmask 255.255.255.0
    {
        default-lease-time 600;
        option routers 192.168.15.9;               #Alias15's IP
        option subnet-mask 255.255.255.0;
        option broadcast-address 192.168.15.255;

        pool
        {
            range 192.168.15.10 192.168.15.100;
        }
    }


    subnet 192.168.0.0 netmask 255.255.255.0
    {
        option routers 192.168.0.9;             #Alias0's IP
        option subnet-mask 255.255.255.0;
        option broadcast-address 192.168.0.255;
        server-identifier 192.168.0.0;
    }
include "/usr/local/etc/defined_hosts.list";
}
```
This is just the beginning of the dhcpd.conf file. The subnet declarations repeat with the IPs changed accordingly.
The problem is that PCs connected to the network sometimes get an IP address but no connection to the server (they can't ping it), and sometimes they don't get an IP at all. This affects the PCs whose MAC addresses are not listed in the defined_hosts.list file. In the dhcpd.log file I can see:

```
Feb 26 14:06:45 c3 dhcpd: DHCPDISCOVER from 00:11:00:11:22:00 via bge1
Feb 26 14:06:45 c3 dhcpd: icmp_echorequest 172.23.15.10: Invalid argument
Feb 26 14:06:46 c3 dhcpd: DHCPOFFER on 172.23.15.10 to 00:11:00:11:22:00 (Hac3ru-PC) via bge1
Feb 26 14:06:46 c3 dhcpd: Unable to add forward map from Hac3ru-PC.home.ro to 172.23.15.10: timed out
Feb 26 14:06:46 c3 dhcpd: DHCPREQUEST for 172.23.15.10 (172.23.0.1) from 00:11:00:11:22:00 (Hac3ru-PC) via bge1
Feb 26 14:06:46 c3 dhcpd: DHCPACK on 172.23.15.10 to 00:11:00:11:22:00 (Hac3ru-PC) via bge1
Feb 26 14:06:49 c3 dhcpd: DHCPINFORM from 172.23.15.10 via bge1
Feb 26 14:06:49 c3 dhcpd: DHCPACK to 172.23.15.10 (00:11:00:11:22:00) via bge1
Feb 26 14:06:49 c3 dhcpd: send_packet: Invalid argument
Feb 26 14:06:49 c3 dhcpd: dhcp.c:1305: Failed to send 305 byte long packet over fallback interface.
Feb 26 14:06:52 c3 dhcpd: DHCPINFORM from 172.23.15.10 via bge1
Feb 26 14:06:52 c3 dhcpd: DHCPACK to 172.23.15.10 (00:11:00:11:22:00) via bge1
Feb 26 14:06:52 c3 dhcpd: send_packet: Invalid argument
Feb 26 14:06:52 c3 dhcpd: dhcp.c:1305: Failed to send 305 byte long packet over fallback interface.
```
What is that dhcp.c:1305 error? I opened the dhcp.c file, located at /usr/src/usr.sbin/sysinstall/dhcp.c and line 1305 is the last line, which is empty...


----------



## SirDice (Feb 26, 2014)

hac3ru said:

> What is that dhcp.c:1305 error? I opened the dhcp.c file, located at "/usr/src/usr.sbin/sysinstall/dhcp.c" and line 1305 is the last line, which is empty...


It's the wrong file. The file you're looking at is part of the source for sysinstall(1). It has nothing to do with the sources of net/isc-dhcp43-server.

Did you bind the DHCP daemon to an interface?

```
dhcpd_ifaces="bge1"                          # ethernet interface(s)
```


----------



## hac3ru (Feb 26, 2014)

Ohh. I have installed the net/isc-dhcp41-server. Where is the dhcp.c file then? 
`whereis dhcp.c` only returns the above mentioned location.

Yes, I did bind the interface; the known hosts get the addresses they have been assigned.


----------



## SirDice (Feb 26, 2014)

hac3ru said:

> Ohh. I have installed the net/isc-dhcp41-server. Where is the dhcp.c file then?


It's part of the sources that built the executable. It's not something that's actually installed. Messages like that are usually the result of an assertion in the code.


----------



## hac3ru (Feb 26, 2014)

Okay. So how do I resolve this? I need DHCP to allow unknown hosts to connect to the network, and I can't do that right now.

Got another question that's not related: What's the difference between installing a package from ports and using `pkg_add`?


----------



## SirDice (Feb 26, 2014)

Are you using a relay? And are the two network segments also on different broadcast domains?


----------



## kpa (Feb 26, 2014)

Looking at your configuration, I don't think it's really possible to serve IP addresses this way, because dhcpd(8) has no way of telling which subnet it should use for an unknown client. The initial requests have 0.0.0.0 as the source address and 255.255.255.255 as the destination, so it cannot decide which subnet to select when all the subnets are on the same interface, bge1.


----------



## SirDice (Feb 26, 2014)

A configuration like that can work but only if you have two separate broadcast domains and are using DHCP relays to forward the requests to the DHCP server. If both network segments are in the same broadcast domain the DHCP server indeed won't know which pool to use. If I recall correctly it'll always use the first pool but I've never tested it that way.
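
The selection problem described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not ISC dhcpd's actual code: the function name `candidate_subnets`, the dictionary layout, the relay address 192.168.15.1 and the shared-network names `seg14`/`seg15` are all hypothetical. It only illustrates that the link is identified by the relay address (`giaddr`) when a relay forwarded the request, and otherwise by the interface the broadcast arrived on.

```python
import ipaddress

def candidate_subnets(giaddr, iface_addr, shared_networks):
    """Return the subnets a server could allocate from for one request.

    Simplified model: the link is identified by giaddr when a relay
    forwarded the packet, otherwise by the receiving interface's address.
    """
    relayed = giaddr != "0.0.0.0"
    link_addr = ipaddress.ip_address(giaddr if relayed else iface_addr)
    for subnets in shared_networks.values():
        nets = [ipaddress.ip_network(s) for s in subnets]
        if any(link_addr in n for n in nets):
            # All subnets of one shared-network sit on the same wire,
            # so every one of them is a candidate.
            return [str(n) for n in nets]
    return []

# One shared-network on bge1, as in the posted configuration (abbreviated).
shared = {"camin": ["192.168.0.0/24", "192.168.14.0/24", "192.168.15.0/24"]}
print(candidate_subnets("0.0.0.0", "192.168.0.2", shared))
# ['192.168.0.0/24', '192.168.14.0/24', '192.168.15.0/24']

# Hypothetical layout with separate broadcast domains and a relay.
relayed = {"seg14": ["192.168.14.0/24"], "seg15": ["192.168.15.0/24"]}
print(candidate_subnets("192.168.15.1", "10.0.0.1", relayed))
# ['192.168.15.0/24']
```

In the first call nothing in the request narrows the choice below the whole shared-network; in the second, the relay address pins the request to a single subnet.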


----------



## hac3ru (Feb 26, 2014)

No, I am not using a relay. The /etc/rc.conf looks like:

```
ifconfig_bge1="inet 192.168.0.0 netmask 255.255.240.0"
ifconfig_bge1_alias0="inet 192.168.0.2 netmask 255.255.255.0"
ifconfig_bge1_alias1="inet 192.168.1.2 netmask 255.255.255.0"
... and so on
```
So every alias should have its own broadcast address.

@kpa
I don't understand why that is.
Since I have a pool of IPs only in subnet 192.168.15.0/24, shouldn't it use that to assign IPs to unknown hosts?

Any ideas about what I can do?


----------



## usdmatt (Feb 26, 2014)

I'm intrigued by what problem you're trying to solve with all these subnets. Judging by your interface configuration, it looks like you only have one broadcast domain (your server has all these addresses on one interface, and I assume all your computers are on the same switched network). In standard networking theory it's considered bad practice to have more than one IP subnet defined in the same broadcast domain. In fact, in the Cisco exams some of the correct answers rely on you assuming this.

First of all you are assigning the bge1 interface an address of 192.168.0.0/20. This sits in the 192.168.0.0-192.168.15.255 subnet. 192.168.0.0 is actually the 'network address'. Some devices will let you use this IP, others won't. It's generally considered bad practice to use this address and network admins will pretty much never use it.

You're then assigning more addresses with /24 masks, that overlap the original /20 subnet. This seems like an awkward and badly designed configuration. It would be considered wrong by most people.

Is there a specific reason you have 16 /24 networks? Do you need 256x16 addresses?
Considering this is all on the same broadcast domain anyway, is there any reason you can't just assign the server one address, 192.168.0.1/20, give your computers addresses in the 192.168.0.x-192.168.13.x range, and have a dynamic pool of 192.168.14.0-192.168.15.254? All with a mask of 255.255.240.0 and a gateway of 192.168.0.1.
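
The address arithmetic behind these points can be checked with Python's standard `ipaddress` module. This is only a sketch of the numbers discussed here; the variable names are illustrative.

```python
import ipaddress

lan = ipaddress.ip_network("192.168.0.0/20")

# 192.168.0.0 is the network address of the /20, not an ordinary host address.
print(lan.network_address, lan.broadcast_address, lan.netmask)
# 192.168.0.0 192.168.15.255 255.255.240.0
print(lan.num_addresses)
# 4096

# The sixteen /24 aliases all fall inside (overlap) the /20.
aliases = list(lan.subnets(new_prefix=24))
print(len(aliases), all(a.subnet_of(lan) for a in aliases))
# 16 True

# The suggested single-subnet layout: server address and dynamic pool
# are all members of the same /20.
server = ipaddress.ip_address("192.168.0.1")
pool_first = ipaddress.ip_address("192.168.14.0")
pool_last = ipaddress.ip_address("192.168.15.254")
print(all(addr in lan for addr in (server, pool_first, pool_last)))
# True
```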

I can't help much with the dhcpd.conf problems specifically, as I do very little with isc-dhcpd; I just find this a very strange setup.


----------



## hac3ru (Feb 26, 2014)

In theory, having 16 x /24 networks gives me 16 broadcast addresses, one for each /24 network. I took this design from one of the ISPs here, which says it's better in case of, for example, a virus that's broadcasting trash. They said that if one of the computers has a virus or something else generating traffic to the broadcast address, it would broadcast to 192.168.x.255, therefore affecting only PCs in the 192.168.x.0/24 network. I didn't test this. Still, the network is working: the computers receive IPs from different networks, with that network's broadcast address. The problem is that sometimes the PCs get an IP address from the pool without being able to connect to the server, and sometimes they don't get an IP address at all.
I'm all out of ideas. I thought maybe something went wrong when installing isc-dhcp41-server, but I did a clean reinstall and the problem is still there. To go further, I created a virtual machine with the same configuration, which seems to work, even though I am getting:

```
Unable to add forward map from XXXXXXX.hostname to 192.168.x.x: Request timed out
```
Still, even with that error, the IPs assigned from the pool are able to access the network, the server and the internet. I have no idea what's happening with the main server or what I can do to fix it. A clean FreeBSD reinstall would mean about 4 hours of downtime, which I can't afford except as a last resort, so I'm open to any other ideas.

Thank you for your help.


----------



## usdmatt (Feb 26, 2014)

I see no value in having multiple subnets on the same LAN (broadcast domain), nor how it would help with broadcast storms. All broadcasts will reach every machine on the same LAN regardless of their configured address. If the ISP wants to limit the possibility of problems, they need to segregate the networks onto independent broadcast domains properly and have routers provide connectivity between the subnets.

As I mentioned, running multiple subnets on the same physical network is, by the book, incorrect. I think you're introducing complexity for the sake of it with no real benefit. Most medium-sized companies make do with fewer than 16 networks, and if they do have multiple subnets, they'll probably be correctly used for independent network areas with routers in between.


----------



## hac3ru (Feb 26, 2014)

Okay so I'll just create a big /20 network again. That's easy. The hard part is to get the pool to work.


----------



## SirDice (Feb 27, 2014)

hac3ru said:

> In theory, having 16 x /24 networks will give me 16 broadcast addresses, for each /24 network. I took this design from one of the ISPs here, which say that it's better in case of a virus that's broadcasting trash for example. They said that if one of the computers has a virus or something that's generating traffic on the broadcast address, it would broadcast on 192.168.x.255, therefore affecting only PCs in 192.168.x.0/24 network. I didn't test this.


There are very few viruses that actually do this; it's called a Smurf attack. Most network devices and hosts can be configured not to respond to this kind of traffic. So you're worried about something that may never happen and can be mitigated through other means. Besides that, it's common to use VLANs for this purpose, not to run multiple networks on the same "wire".


----------



## usdmatt (Feb 27, 2014)

The smurf attack relies on sending a directed broadcast, which is the only real use for the 'subnet broadcast' address (Although irrelevant these days). A packet is sent to a router with a destination of, for example, 192.168.0.255. If the router happens to have a network with that range on another interface, it will broadcast the packet out on that interface. The standards have required that routers do not forward these packets by default for the last 15 years (and no-one really has a reason to enable it) so this kind of attack is fairly historic.

For general use you can assume that a broadcast sent by one host will be received by every other host on the same physical network, regardless of their address. Broadcasts are handled by switches and forwarded out on all ports (that are members of the same VLAN) automatically, so everyone receives them even if they don't want them.

I suspect the ISP in this case is actually using independent physical networks or VLANs, just like any other ISP would to split their address space between customers. The OP has replicated the subnet layout but not the actual network segregation, so the separate ranges provide no real benefit.


----------



## SirDice (Feb 27, 2014)

usdmatt said:

> The smurf attack relies on sending a directed broadcast, which is the only real use for the 'subnet broadcast' address (Although irrelevant these days). A packet is sent to a router with a destination of, for example, 192.168.0.255. If the router happens to have a network with that range on another interface, it will broadcast the packet out on that interface. The standards have required that routers do not forward these packets by default for the last 15 years so this kind of attack is fairly historic.


You can still find a lot of network printers responding to a subnet broadcast, unfortunately. Which is one of the reasons why they should be put on their own network or VLAN.


----------



## hac3ru (Feb 27, 2014)

So, back to my problem.

I created the 192.168.0.0/20 network again. Last night, I got an IP from the pool and it worked for around 10 minutes, with server and internet access on that PC. After 10 minutes it dropped and I couldn't get it back anymore.

Ideas? I really don't want to reinstall the OS.


----------



## SirDice (Feb 27, 2014)

Try setting your lease time to about 10 minutes.

If the client loses connection, what IP address does it have at that moment? You may also want to check your network for rogue DHCP servers. I've had it happen where some department decided they wanted wireless and just hooked up a wireless router to the network. Unfortunately it also started serving DHCP for the rest of the network, which conflicted with the settings we used.


----------



## hac3ru (Feb 27, 2014)

Well, the default lease time is 86400 (24 hours). I'll set it to 10 minutes and see if it works.

The client still has the 192.168.15.x IP address and can communicate with other PCs on the network (most probably the switches provide this) but cannot communicate with the server (192.168.0.1) and cannot access the internet (since the server is not doing NAT for that IP).

There is no other DHCP server on the network; if there were, it would affect other computers as well. I also tried connecting directly to the server (internal NIC to laptop) and it gives me an IP address but, again, I cannot ping the server or anything on the internet.


----------



## SirDice (Feb 27, 2014)

hac3ru said:

> The client still has the 192.168.15.x IP address, can communicate with other PCs on the network (the switches offer this functionality most probably) but cannot communicate with the server (192.168.0.1)


Check your subnet masks, it's the most common thing to get wrong in this case.


----------



## hac3ru (Feb 27, 2014)

If I enter the MAC address as a fixed host in the 192.168.15.0/24 network, it works, so the subnet mask is not the problem.

I quadruple checked the config. Still, I don't know what's with that error from dhcp.c. On the virtual machine, I don't get that error and everything works ok.


----------



## usdmatt (Feb 27, 2014)

I wouldn't start messing with manual MAC entries. If the hosts can communicate with each other, which they must if they get a DHCP response, then you must have fairly simple problems.

Can you not just do something simple like the following?

/etc/rc.conf

```
ifconfig_bge1="192.168.0.1/20"
dhcpd_enable="YES"
dhcpd_ifaces="bge1"
```

/usr/local/etc/dhcpd.conf

```
option subnet-mask 255.255.240.0;
option routers 192.168.0.1;
option ..

subnet 192.168.0.0 netmask 255.255.240.0 {
  option ...

  pool {
    range 192.168.12.1 192.168.13.254;
    deny unknown-clients;
  }

  pool {
    range 192.168.14.1 192.168.15.254;
    allow unknown-clients;
  }
}

host static-host-1 {
  hardware ethernet 00:11:22:33:44:55;
  fixed-address 192.168.x.x;
}
```

I'm not 100% on the DHCP config as I spent literally 60 seconds checking it but it shouldn't be far off (I've not bothered to find out how you define 'known clients'). I see no reason to make it any more complicated than this unless you want to start physically splitting networks. I'm still not convinced you need 4000 addresses though.

Being able to get DHCP but not talk to the server makes no sense unless you're mixing up subnet masks and confusing all the hosts.


----------



## hac3ru (Feb 27, 2014)

Did that, same problem. As I said, a virtual machine with the same configuration (same pf.conf, ipfw.rules, dhcpd.conf, and rc.conf) is working. The physical server is not working, though.

One funny thing I have observed: if I restart the server, the clients get IP addresses and they can communicate with the server for about 5 minutes. After that, no new clients get IPs from the pool and the old ones can't communicate with the server anymore. What is this???


----------



## wblock@ (Feb 28, 2014)

Leases expiring?


----------



## hac3ru (Feb 28, 2014)

The leases are set as:

```
default-lease-time 86400;
```
They should not expire. And still, even if they expire, I should get a new one if I disable and re-enable the NIC


----------



## SirDice (Feb 28, 2014)

hac3ru said:

> The leases are set as
>
> ...


It should actually renew the lease at half the lease time. So something isn't working correctly. The lease doesn't appear to expire as the clients still have the correct IP address. It's just that they can't access the gateway any more for some reason. At least that's what I understood of the situation.
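
For reference, the renewal timing mentioned above follows from the RFC 2131 client defaults: T1 (renew) at 0.5 of the lease and T2 (rebind) at 0.875 of the lease, unless the server sends explicit values in options 58 and 59. A minimal sketch of the arithmetic, with the hypothetical helper name `renewal_timers`:

```python
def renewal_timers(lease_seconds):
    """Return (T1, T2) in seconds using the RFC 2131 default ratios."""
    t1 = lease_seconds // 2        # client starts trying to renew
    t2 = lease_seconds * 7 // 8    # client starts trying to rebind
    return t1, t2

print(renewal_timers(86400))  # 24-hour lease
# (43200, 75600)
print(renewal_timers(600))    # 10-minute lease
# (300, 525)
```

With the 86400-second lease from the posted configuration, a client would not attempt a renewal until 12 hours in, so a failure after 5 to 10 minutes does not line up with either timer.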


----------



## usdmatt (Feb 28, 2014)

If nothing can communicate with the server at all after 5-10 minutes then it suggests issues with the server itself. I'm reluctant to start suggesting server/nic problems though as we have very little real troubleshooting information to go on and I suspect you may just be getting things into a mess.

It makes no sense for a machine to get DHCP correctly, work, and then stop 10 minutes later. That does not look like a DHCP problem.

What does the output of `ifconfig bge1` look like on the server?
What does the output of the same command (with the correct interface of course, or `ipconfig` on Windows) look like once a machine has booted and got an IP address?
Does the output of the above command on the client change when they lose connectivity (i.e. do they still appear to have a valid address when it stops working)?
Do all hosts lose connectivity to the server or just some?

I'm starting to come to the same conclusion as posters in this very similar thread (viewtopic.php?f=7&t=32039&start=25), that if this is a real network with users on, you should just get someone in who knows what they're doing, that can actually deploy a properly designed, working network.


----------



## wblock@ (Feb 28, 2014)

I was thinking leases expiring and the default route somehow being lost or wrong when the new lease is obtained.  And agreed that a second, rogue DHCP server somewhere on the network can produce some effects like that.


----------



## kpa (Feb 28, 2014)

Is there any packet filtering done on the interface? If so post the rules please.


----------



## hac3ru (Mar 31, 2014)

Okay I'm back. Sorry for the delay...

I installed a FreeBSD OS on a virtual machine, same pf.conf, same ipfw.rules, same dhcpd.conf. I added some other virtual workstations into the network and it all worked flawlessly....?!

I trashed the aliases.
Now the dhcpd.conf is looking clean, like this:

```
#
# Options
#
default-lease-time 86400;
max-lease-time 86400;
min-lease-time 86400;
ddns-update-style none;
log-facility local3;    # Note: syslog-ng filters by program
#
# Network options
#
shared-network camin
{
    authoritative;
    option domain-name "c3.campus.utcluj.ro";
    option domain-name-servers 193.226.6.229, 193.226.5.151, 193.226.6.233, 217.73.173.3, 193.226.5.33, 8.8.8.8;
    #option domain-name-servers 193.226.6.229;
    subnet 172.23.0.0 netmask 255.255.240.0
    {
#       pool
#       {
#           max-lease-time 300;
#           min-lease-time 150;
#           range 172.23.14.10 172.23.15.240;
#           allow unknown-clients;
#       }
        range 172.23.14.10 172.23.15.240;
        option routers 172.23.0.1;
        option subnet-mask 255.255.240.0;
        option broadcast-address 172.23.15.255;
    }
}
#
# Leases
#
include "/var/db/c3/dhcpd.conf";
```
I swapped to the 172.23.0.0/20 network. I have tried to achieve this using a pool or the `range` directly, but neither worked. Also, I changed the DHCP interface to bge1, so here is the output of `ifconfig bge1`:

```
bge1: flags=88843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,STATICARP> metric 0 mtu 1500
        options=8009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE>
        ether 00:11:0a:e9:9b:5e
        inet 172.23.0.1 netmask 0xfffff000 broadcast 172.23.15.255
        inet6 fe80::211:aff:fee9:9b5e%bge1 prefixlen 64 scopeid 0x5
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (1000baseT <full-duplex,master>)
        status: active
```
`ipconfig` on the Windows machines returns the proper IP, mask, gateway, DNS and all other information on the NIC connected to that network.
I am getting an IP from the 172.23.14.0/24 or 172.23.15.0/24 range, but I am not able to communicate with 172.23.0.1 (the FreeBSD server).
If a rogue DHCP server were handing out IP addresses, I would get some other IP (usually 192.168.x.x) and that router's IP address as the gateway, which I do not.
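
As a quick sanity check of the addressing, the hexadecimal netmask that FreeBSD's `ifconfig` prints can be converted to a prefix length and the client address tested against the server's subnet. This is a sketch using Python's standard `ipaddress` module; `hex_mask_to_prefix` is a hypothetical helper name.

```python
import ipaddress

def hex_mask_to_prefix(hex_mask):
    """Convert an ifconfig-style hex netmask (e.g. 0xfffff000) to a prefix length."""
    mask = ipaddress.ip_address(int(hex_mask, 16))
    return ipaddress.ip_network(f"0.0.0.0/{mask}").prefixlen

prefix = hex_mask_to_prefix("0xfffff000")
print(prefix)
# 20

server = ipaddress.ip_interface(f"172.23.0.1/{prefix}")
client = ipaddress.ip_address("172.23.14.10")
print(server.network, client in server.network)
# 172.23.0.0/20 True
```

On these numbers alone, a client at 172.23.14.10 and the server at 172.23.0.1 are in the same subnet and should be able to reach each other directly, without going through a gateway.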

I am running no packet filtering at this moment. I changed pf.conf to

```
ext_if="em0"
int_if="bge1"

internal_net="172.23.0.0/20"
external_addr="10.134.168.54"

table <campus> { 172.22.0.0/21, 172.23.0.0/20, 172.24.0.0/21, 172.25.0.0/21 }
# NAT
nat on $ext_if from $internal_net to !<campus> -> $ext_if
# Pass all
pass all
```
Any other ideas?

Thank you for the struggle 

Later edit:
STUPID STUPID STUPID STUPID!
I would never have thought of this. It seems that the script that generates the .leases file also generates an ARP file, which is loaded, and then the command `ifconfig bge1 staticarp` is run. The script is old and I was never curious enough (and to be honest I didn't think anyone would create static ARP lists) to look for something like this.
I changed that to `ifconfig bge1 -staticarp` and it all works great now. Sorry for wasting your time, but I think we all learned something today.
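
The clue was visible in the `ifconfig bge1` output posted earlier: the flags list ends in `STATICARP`. ifconfig(8) documents `staticarp` as making the host reply to ARP requests for its own addresses but never send any itself, so the server cannot resolve a client that has no static ARP entry. A small sketch for spotting the flag in saved `ifconfig` output, assuming the usual `flags=<hex><NAME,NAME,...>` layout (the helper name `interface_flags` is hypothetical):

```python
import re

def interface_flags(ifconfig_output):
    """Extract the symbolic interface flags from the first flags=...<...> field."""
    match = re.search(r"flags=[0-9a-fA-F]+<([^>]*)>", ifconfig_output)
    return set(match.group(1).split(",")) if match else set()

sample = (
    "bge1: flags=88843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,STATICARP> "
    "metric 0 mtu 1500"
)
print("STATICARP" in interface_flags(sample))
# True
```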


----------

