Home > Classroom Content > Troubleshooting DHCP server pools and DHCP clients on Cisco routers
Troubleshooting DHCP server pools and DHCP clients on Cisco routers
Instructor: Mark Jacob
Technology: Cisco

Have you ever encountered a situation where you have configured a Cisco router as your DHCP server, you think you have followed all the steps, and yet the downstream devices don’t seem to be cooperating?  I have built a scenario which I hope will demonstrate some of the pitfalls and gotchas that can interfere with your dream of a smooth running network.  First, let’s look at the scenario so we know what is supposed to happen, and then step-by-step we will analyze why it is failing.

Here is a diagram of the network:

001-Troubleshooting-DHCP-server-pools-and-DHCP-clients-on-Cisco-routers

I have configured a couple of DHCP pools on HQ – one to serve up an address to f0/1 on Branch2 and the other to provide an address to f0/1 on 10.ROUTER.  (I know it is a strange name, but I wanted it to be associated with the router that was to receive a ten dot address, so I figured, “What the heck…” and called it 10.ROUTER.)  I have the link between HQ and the switch configured as a dot1q trunk and Branch2 (and everything else downstream from HQ in this scenario) situated on interface f0/0.2 on HQ and VLAN 2 on the switch.  I have a DHCP Relay set up on f0/0 on Branch2 to forward requests from 10.ROUTER to the upstream DHCP Server.  Now, on to troubleshooting…

The first thing I want to know is if Branch2 successfully received an ip address on the f0/1 interface.  Here is the relevant output from debug dhcp detail issued on Branch2 (I omitted the extra lines in Notepad):

002-Troubleshooting-DHCP-server-pools-and-DHCP-clients-on-Cisco-routers.jpg

So far so good.  Let’s look at the pool on HQ which issued the address:

ip dhcp excluded-address 209.165.202.2 209.165.202.31
!
ip dhcp pool Branch2
   network 209.165.202.0 255.255.255.224
!

Note that with a subnet mask value of 224 in the 4th octet, there are 30 host addresses available.  However, with the excluded addresses noted above, the only possible address that can be obtained by Branch2 f0/1 is 209.165.202.1.

Now let’s go downstream and see if 10.ROUTER has obtained an ip address.

10.ROUTER#sh ip int bri
Interface                  IP-Address      OK? Method Status                Protocol
FastEthernet0/0            unassigned      YES NVRAM  administratively down down
FastEthernet0/1            unassigned      YES DHCP   up                    up

I do see that it knows it is supposed to be configured via DHCP.  Now to see if it actually has an address.

10.ROUTER#sh int f0/1
FastEthernet0/1 is up, line protocol is up
  Hardware is Gt96k FE, address is c002.4b34.0001 (bia c002.4b34.0001)
Internet address will be negotiated using DHCP

Hmmm…nothing yet.  I suspect a problem already because if it was functioning correctly, f0/1 would have an ip address by now.  Let’s reset that interface while monitoring the debugs:

10.ROUTER(config-if)#no shut
10.ROUTER(config-if)#
*Mar  1 00:51:24.303: DHCP: DHCP client process started: 10
*Mar  1 00:51:24.323: RAC: Starting DHCP discover on FastEthernet0/1
*Mar  1 00:51:24.327: DHCP: Try 1 to acquire address for FastEthernet0/1
*Mar  1 00:51:24.379: DHCP: SDiscover attempt # 1 for entry:
*Mar  1 00:51:27.823: DHCP: SDiscover attempt # 2 for entry:
*Mar  1 00:51:31.823: DHCP: SDiscover attempt # 3 for entry:
10.ROUTER(config-if)#u all

After omitting unnecessary debug output, we are left with the conclusion that yes, 10.ROUTER is trying to obtain an ip address.  Where is the hangup?  Let’s check server activity on HQ to see if those requests are actually arriving.

HQ#debug ip dhcp server packet
DHCP server packet debugging is on.
HQ#

I left HQ in that state for quite some time, meanwhile I was observing that 10.ROUTER was still doing its level-best to get an ip address.  No output at all on HQ.  What do you suspect could be the issue?  How would you dig deeper?  Since we know HQ successfully issued an ip address to Branch2 earlier, we can presume DHCP is working correctly on the HQ router.  We also see 10.ROUTER attempting to receive an address.    Maybe you are a little suspicious and want to see the DHCP configuration on HQ as it relates to 10.ROUTER.  I don’t blame you – I would want to see it too.  Here is the relevant portion of the running-config:

ip dhcp excluded-address 209.165.202.2 209.165.202.31
ip dhcp excluded-address 10.1.1.2 10.1.1.254
!
ip dhcp pool Branch2
   network 209.165.202.0 255.255.255.224
!
ip dhcp pool 10route
   network 10.1.1.0 255.255.255.0
!

I have excluded some addresses again, but that should not interfere with the successful operation of DHCP.  (Some well-versed readers may already suspect something from looking at the above output, but let’s not just give it away – let’s find it.)  As mentioned above, it seems that 10.ROUTER and HQ are both doing their jobs.  Let’s check the middle.  Branch2 is configured with an ip helper-address to send the DHCP broadcasts from downstream sources to HQ.  We can debug to see if this is happening, but what debug command would give us the output we seek?  You can try debug dhcp detail on Branch2, but that shows output for client activity on that router, so even with 10.ROUTER screaming in his left ear, no debug output appears on Branch2 with that command.  How about debug ip dhcp server packet?

*Mar  1 01:17:15.376: DHCPD: setting giaddr to 10.1.1.2.
*Mar  1 01:17:15.380: DHCPD: BOOTREQUEST from 0063.6973.636f.2d63.3030.322e.3462.3334.2e30.3030.312d.4661.302f.31 forwarded to 172.16.1.100.

This output shows that Branch2 seems to be doing what we asked it to do – it says it forwarded the requests to 172.16.1.100, just as we configured.  Here is that portion of the configuration:

Branch2#sh run int f0/0
Building configuration…

Current configuration : 124 bytes
!

interface FastEthernet0/0
 ip address 10.1.1.2 255.255.255.0
 ip helper-address 172.16.1.100
 speed 100
 full-duplex
end

Yes, the ip helper-address is there, it is on the correct interface, and it is using itself as the source of the referred packet, as it should.  But we have already seen that this request never makes it to HQ.   Let’s dig a little deeper into our debug options on Branch2 to see if we can narrow down the problem.  I have created the following access-list:

Extended IP access list 150
    10 permit udp any any eq bootpc      ! (this is UDP port 68)
    20 permit udp any any eq bootps      ! (this is UDP port 67)
Branch2#

Now I will issue a debug command which uses that access-list as a filter.  Let’s observe the results:

Branch2#debug ip packet detail 150
IP packet debugging is on (detailed) for access list 150
Branch2#
*Mar  1 01:27:43.064: IP: s=0.0.0.0 (FastEthernet0/0), d=255.255.255.255, len 604, rcvd 2
*Mar  1 01:27:43.068:     UDP src=68, dst=67
*Mar  1 01:27:43.076: IP: s=10.1.1.2 (local), d=172.16.1.100, len 604, unroutable
*Mar  1 01:27:43.080:     UDP src=67, dst=67

So Branch2 is trying to send the packet (recall that above the actual output said forwarded to 172.16.1.100.)  Now we can be more granular in our analysis.  Branch2 is trying to send the packet, but the other relevant output shows that this destination is unroutable.  Let’s verify this with a simple ping.

Branch2#ping 172.16.1.100 

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 172.16.1.100, timeout is 2 seconds:
…..

Success rate is 0 percent (0/5)
Branch2#

How about show ip route?

Branch2#show ip route
Codes: C – connected, S – static, R – RIP, M – mobile, B – BGP
       D – EIGRP, EX – EIGRP external, O – OSPF, IA – OSPF inter area
       N1 – OSPF NSSA external type 1, N2 – OSPF NSSA external type 2
       E1 – OSPF external type 1, E2 – OSPF external type 2
       i – IS-IS, su – IS-IS summary, L1 – IS-IS level-1, L2 – IS-IS level-2
       ia – IS-IS inter area, * – candidate default, U – per-user static route
       o – ODR, P – periodic downloaded static route 

Gateway of last resort is not set

     209.165.202.0/27 is subnetted, 1 subnets
C       209.165.202.0 is directly connected, FastEthernet0/1
     10.0.0.0/24 is subnetted, 1 subnets
C       10.1.1.0 is directly connected, FastEthernet0/0
Branch2#

Aha!  I see no gateway of last resort, and this is verified by the absence of a quad-zero route (the default route: 0.0.0.0/0).  We are half-way there.  Now the question is, how would you solve this issue?  Perhaps you feel the temptation to configure a static route or a default route on Branch2 to correct the problem.  Considering the size of this network scenario, it seems that this is a valid solution.  But how does it scale?  If I have multiple downstream routers being served by HQ instead of just one or two, then administratively typing static or default routes on all of them is inefficient.  Remember the hint above when we looked at the DHCP pool config on HQ?  I said that some may see a potential problem (or the solution) in that config.  Before we correct it, let’s reason on this point.  If I am sitting on an ip host that is configured as a DHCP client with all options set to automatic, then I can issue the route print command and see something interesting.  I will use my Windows 7 x64 box as an example.  Here is a portion of the output of that command:

003-Troubleshooting-DHCP-server-pools-and-DHCP-clients-on-Cisco-routers

My PC has a quad-zero route.  I know I did not type it.  How did it get there?  It turns out that if the client is set to receive information from a DHCP server, one of the pieces of information provided (when it is configured) is the default gateway.   Once an end device is provided with a default gateway, it can send any information destined to anywhere other than the network on which it resides to the default gateway using, you guessed it, the default (quad-zero) route.  The same thing is true for Cisco devices that have an interface that is set to be a DHCP client.  So why does Branch2 have no default route?  Take a look again at the HQ config for the DHCP pools:

ip dhcp excluded-address 209.165.202.2 209.165.202.31
ip dhcp excluded-address 10.1.1.2 10.1.1.254
!
ip dhcp pool Branch2
   network 209.165.202.0 255.255.255.224
!
ip dhcp pool 10route
   network 10.1.1.0 255.255.255.0
!

Do you see what is NOT there?  It is missing the default-router portion of the configuration.  Let’s insert that into the config (using default-router 209.165.202.2) and see what happens on Branch2.  (Also note that once I reconfigure the DHCP pool on HQ, I will release and renew the ip adress in Branch2’s interface f0/1 to force it to re-acquire.)
Once done, here is the output of show ip route on Branch2:

Branch2#sh ip route
Codes: C – connected, S – static, R – RIP, M – mobile, B – BGP
       D – EIGRP, EX – EIGRP external, O – OSPF, IA – OSPF inter area
       N1 – OSPF NSSA external type 1, N2 – OSPF NSSA external type 2
       E1 – OSPF external type 1, E2 – OSPF external type 2
       i – IS-IS, su – IS-IS summary, L1 – IS-IS level-1, L2 – IS-IS level-2
       ia – IS-IS inter area, * – candidate default, U – per-user static route
       o – ODR, P – periodic downloaded static route 

Gateway of last resort is 209.165.202.2 to network 0.0.0.0.

      209.165.202.0/27 is subnetted, 1 subnets
C       209.165.202.0 is directly connected, FastEthernet0/1
     10.0.0.0/24 is subnetted, 1 subnets
C       10.1.1.0 is directly connected, FastEthernet0/0

S*   0.0.0.0/0 [254/0] via 209.165.202.2
Branch2#

Woohoo!  Now Branch2 has a default route and a gateway of last resort.  Here is the new pool config on HQ:

ip dhcp pool Branch2
   network 209.165.202.0 255.255.255.224
   default-router 209.165.202.2

So because a default-router was included in the HQ pool, Branch2 now has a route to 172.16.1.100, or so we would presume.  Let’s check:

Branch2#ping 172.16.1.100

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 172.16.1.100, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 28/47/80 ms
Branch2#

Yeah baby!  However, don’t drive home without verifying that the solution implemented truly has resolved the entire issue.  Remember, we started this process trying to allow 10.ROUTER to receive a DHCP-issued address from HQ.  Let’s see if that has occurred.  Since I am still debugging client activity on that router, I see this:

10.ROUTER(config-if)#%Unknown DHCP problem.. No allocation possible
*Mar  1 03:20:41.639: DHCP: Waiting for 60 seconds on interface FastEthernet0/1

I also see repeated attempts to negotiate an ip address, each of which ends in failure.  So much for packing up the laptop and heading out.  We have verified that Branch2 knows how to get to the ip helper-address now, so let’s check on HQ to see if the messages from 10.ROUTER are appearing now.

HQ#deb ip dhcp server pack
DHCP server packet debugging is on.
HQ#
Mar  1 03:24:04.127: DHCPD: DHCPDISCOVER received from client 0063.6973.636f.2d63.3030.322e.3462.3334.2e30.3030.312d.4661.302f.31 through relay 10.1.1.2.
Mar  1 03:24:04.127: DHCPD: Sending DHCPOFFER to client 0063.6973.636f.2d63.3030.322e.3462.3334.2e30.3030.312d.4661.302f.31 (10.1.1.1).
Mar  1 03:24:04.127: DHCPD: unicasting BOOTREPLY for client c002.4b34.0001 to relay 10.1.1.2.

So the messages are being received by HQ and replies are being sent.  Let’s see who owns the MAC address c002.4b34.0001.  It should be 10.ROUTER’s interface f0/1.

10.ROUTER#sh int f0/1
FastEthernet0/1 is up, line protocol is up
  Hardware is Gt96k FE, address is c002.4b34.0001 (bia c002.4b34.0001)

Yes it is.  So the server has correctly identified which client is requesting an ip address, it knows the request has come through a relay agent, and it is unicasting a reply.  What else could be breaking our DHCP process?

To investigate more deeply, I have configured the same extended access-list on HQ that I had previously configured on Branch2.  This one:

Extended IP access list 150
    10 permit udp any any eq bootpc      ! (this is UDP port 68)
20 permit udp any any eq bootps      ! (this is UDP port 67)

Now I will fire it up and wait for the requests from 10.ROUTER to appear.  Here is the output:

HQ#debug ip packet detail 150
IP packet debugging is on (detailed) for access list 150
Mar  1 04:46:38.858: IP: tableid=0, s=10.1.1.2 (Ethernet0/0.2), d=172.16.1.100 (Loopback0), routed via RIB
Mar  1 04:46:38.858: IP: s=10.1.1.2 (Ethernet0/0.2), d=172.16.1.100, len 604, rcvd 4
Mar  1 04:46:38.858:     UDP src=67, dst=67
Mar  1 04:46:38.862: IP: s=172.16.1.100 (local), d=10.1.1.2, len 328, unroutable
Mar  1 04:46:38.866:     UDP src=67, dst=67

So we noted above that HQ said it was unicasting a reply back to the requesting client.  It turns out it was only trying to unicast that reply.  How do we know?  Because it was sending to a destination ip address of 10.1.1.2 (the relay agent, which is expected) but the destination is unroutable.  Let’s verify that with show ip route on HQ:

HQ#sh ip route
Codes: C – connected, S – static, R – RIP, M – mobile, B – BGP
       D – EIGRP, EX – EIGRP external, O – OSPF, IA – OSPF inter area
       N1 – OSPF NSSA external type 1, N2 – OSPF NSSA external type 2
       E1 – OSPF external type 1, E2 – OSPF external type 2
       i – IS-IS, su – IS-IS summary, L1 – IS-IS level-1, L2 – IS-IS level-2
       ia – IS-IS inter area, * – candidate default, U – per-user static route
       o – ODR, P – periodic downloaded static route

Gateway of last resort is not set

     172.16.0.0/24 is subnetted, 1 subnets
C       172.16.1.0 is directly connected, Loopback0
     209.165.202.0/27 is subnetted, 1 subnets
C       209.165.202.0 is directly connected, Ethernet0/0.2
HQ#

HQ has no route back to the 10.1.1.0 network.  How about that?!  The request gets from the requesting interface to the relay agent, the relay agent forwards it to the DHCP server, the server processes the request and answers correctly, but the reply chokes and dies.  How do we fix it this time?  Since the main purpose of this blog post was to demonstrate that the DHCP process will fail if the DHCP server has no route back to the requesting client, we can do one of two things.  We can configure a static route back to the 10.1.1.0 network, or we can configure a default route sending everything out of HQ down the path to Branch2 and on its way.  Once again, if there were multiple branches, then individual static routes to each one would be required – one for each remote network destination.  (Another alternative would be to have a routing protocol in place so HQ could learn paths to these remote networks, but that is outside the scope of this blog.)  In our scenario, since everything we want to reach is down the path through Branch2, let’s just configure a default route that points that way, using this command:  ip route 0.0.0.0 0.0.0.0 f0/0.2.

With that in place, let’s quickly hop on 10.ROUTER and observe the DHCP debug messages (again, deleting the unnecessary lines):

*Mar  1 03:47:50.159: DHCP: SDiscover attempt # 2 for entry:
*Mar  1 03:47:50.159:    Hostname: 10.ROUTER
*Mar  1 03:47:50.159: DHCP: SDiscover: sending 298 byte length DHCP packet
*Mar  1 03:47:50.159:             B’cast on FastEthernet0/1 interface from 0.0.0.0
*Mar  1 03:47:50.303: DHCP: Received a BOOTREP pkt
*Mar  1 03:47:50.307: DHCP: Scan: Message type: DHCP Offer
*Mar  1 03:47:50.315: DHCP Offer Message   Offered Address: 10.1.1.1
*Mar  1 03:47:50.315: DHCP: SRequest– Requested IP addr option: 10.1.1.1
*Mar  1 03:47:50.367: DHCP: Received a BOOTREP pkt
*Mar  1 03:47:50.371: DHCP: Scan: Message type: DHCP Ack
*Mar  1 03:47:50.379: DHCP: rcvd pkt source: 10.1.1.2,  destination:  255.255.255.255
*Mar  1 03:47:50.379: DHCP Ack Message
*Mar  1 03:47:50.379: DHCP: Server ID Option: 209.165.202.2
*Mar  1 03:47:53.399: DHCP: Applying DHCP options:
*Mar  1 03:47:53.399: DHCP Client Pooling: ***Allocated IP address: 10.1.1.1
*Mar  1 03:47:53.499: Allocated IP address = 10.1.1.1  255.255.255.0

*Mar  1 03:47:53.499: %DHCP-6-ADDRESS_ASSIGN: Interface FastEthernet0/1 assigned DHCP address 10.1.1.1, mask 255.255.255.0, hostname 10.ROUTER
10.ROUTER#

Now we have achieved the results we desired!  Is there anything else we forgot?  Well, we are using Cisco routers as our hosts, but what if it was a user on a host PC at ip address 10.1.1.1 and he needed access to a resource connected to HQ?  Or what if that 172.16.1.100 address was a server to which this host needed access?  Before I pack up and go home, what else should I verify?  You got it – let’s make sure that this host can reach remote destinations.

10.ROUTER#ping 172.16.1.100

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 172.16.1.100, timeout is 2 seconds:
…..

Success rate is 0 percent (0/5)
10.ROUTER#

Nooooo!!  What else has gone wrong?  Let’s take advantage of our previous troubleshooting and check the ip route situation on 10.ROUTER.

10.ROUTER#sh ip route
Codes: C – connected, S – static, R – RIP, M – mobile, B – BGP
       D – EIGRP, EX – EIGRP external, O – OSPF, IA – OSPF inter area
       N1 – OSPF NSSA external type 1, N2 – OSPF NSSA external type 2
       E1 – OSPF external type 1, E2 – OSPF external type 2
       i – IS-IS, su – IS-IS summary, L1 – IS-IS level-1, L2 – IS-IS level-2
       ia – IS-IS inter area, * – candidate default, U – per-user static route
       o – ODR, P – periodic downloaded static route

Gateway of last resort is not set 

     10.0.0.0/24 is subnetted, 1 subnets
C       10.1.1.0 is directly connected, FastEthernet0/1

Curses!  Again no default route, as well as no specific route to the 172.16.1.0 network.  I will correct it the same way as before, by modifying the pool on HQ which serves this client.  The next-hop ip address for 10.ROUTER is 10.1.1.2, so that is what I will configure in HQ.  I will use this command inside the DHCP pool:  default-router 10.1.1.2.  Then I will release and renew f0/1 on 10.ROUTER and that should fix it.  Let’s verify.  Within the debug messages (debug dhcp detail is still running on 10.ROUTER) there is this:

*Mar  1 04:22:15.094:   Setting default_gateway to 10.1.1.2
*Mar  1 04:22:15.098:   Adding default route 10.1.1.2
*Mar  1 04:22:16.098:   Adding route to DHCP server 209.165.202.2 via FastEthernet0/1 10.1.1.2

NOW let’s look at show ip route on 10.ROUTER:

10.ROUTER#show ip route
Codes: C – connected, S – static, R – RIP, M – mobile, B – BGP
       D – EIGRP, EX – EIGRP external, O – OSPF, IA – OSPF inter area
       N1 – OSPF NSSA external type 1, N2 – OSPF NSSA external type 2
       E1 – OSPF external type 1, E2 – OSPF external type 2
       i – IS-IS, su – IS-IS summary, L1 – IS-IS level-1, L2 – IS-IS level-2
       ia – IS-IS inter area, * – candidate default, U – per-user static route
       o – ODR, P – periodic downloaded static route 

Gateway of last resort is 10.1.1.2 to network 0.0.0.0.

         209.165.202.0/32 is subnetted, 1 subnets
S       209.165.202.2 [254/0] via 10.1.1.2, FastEthernet0/1
     10.0.0.0/24 is subnetted, 1 subnets
C       10.1.1.0 is directly connected, FastEthernet0/1
S*   0.0.0.0/0 [254/0] via 10.1.1.2
10.ROUTER#

The default route has appeared!  All that remains is to verify reachability to the remote address 172.16.1.100:

10.ROUTER#ping 172.16.1.100

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 172.16.1.100, timeout is 2 seconds:
.!!!!
Success rate is 80 percent (4/5), round-trip min/avg/max = 48/73/96 ms
10.ROUTER#

We know the first ping died while waiting for the ARP process to complete, but now we know all is well!

I hope this blog has been informative and useful.  Troubleshooting is really being emphasized in the latest Cisco classes.  This makes perfect sense, because network people never seem to get any credit when things work well, but their names are dipped in mud when there is the slightest glitch.  For that reason, the ability to perform troubleshooting – that is, the skill of restoring connectivity quickly, is paramount in the arsenal of the gifted network admin.

Until the next thing breaks…

Mark Jacob
Cisco Instructor – Interface Technical Training
Phoenix, AZ