the “Timeout waiting for PADO packets” hassle

Posted in adsl, bridge, debian, ISP, pado
In one of our branches I have configured the server to connect to 
the internet via a bridge. This configuration was set up so that I 
can retain better control over VLANs within the branch, etc.
This server runs Debian GNU/Linux 4.0 and wears many hats 
(router, firewall). It is connected via a (PPPoE) DSL modem which is 
configured in bridge mode.
eth0 is for the employee network, eth1 is for the user network and 
eth2 is used for internet access.

After changing the ISP at one of our branches I asked for the
Thomson router to be configured in bridge mode before
actually switching to the router (physically).

After I made the switch, the internet connection went down.
The first thing I did was check all of the settings and changes
that I had made to ensure that the new ISP connection (account) would work.
Things like chap-secrets, pap-secrets, dsl-provider...

Anyway I got this in syslog 

(tail -f /var/log/syslog)
Sep 2 09:56:21 maia pppd[2805]: Timeout waiting for PADO packets
Sep 2 09:56:21 maia pppd[2805]: Unable to complete PPPoE Discovery
Sep 2 09:57:26 maia pppd[2805]: Timeout waiting for PADO packets
Sep 2 09:57:26 maia pppd[2805]: Unable to complete PPPoE Discovery
Sep 2 09:58:31 maia pppd[2805]: Timeout waiting for PADO packets
Sep 2 09:58:31 maia pppd[2805]: Unable to complete PPPoE Discovery
Sep 2 09:59:36 maia pppd[2805]: Timeout waiting for PADO packets
Sep 2 09:59:36 maia pppd[2805]: Unable to complete PPPoE Discovery
Sep 2 10:00:41 maia pppd[2805]: Timeout waiting for PADO packets
Sep 2 10:00:41 maia pppd[2805]: Unable to complete PPPoE Discovery
Sep 2 10:01:46 maia pppd[2805]: Timeout waiting for PADO packets
Sep 2 10:01:46 maia pppd[2805]: Unable to complete PPPoE Discovery
Sep 2 10:01:46 maia pppd[2805]: Exit.

Which kinda told me that the most probable cause was that the ISP 
didn't actually configure the router in bridge mode.

I tested the connection using an ordinary laptop, where I entered the 
account info given by the ISP, and I saw that the connection came through.

Later on I realised that the bridge connection was enabled only on
 port 2 of the router.

But unfortunately the helpdesk guys didn't tell me this and they kept
on assuring me that the bridge mode was enabled on all 4 ports.

So here's a tip for all of you guys out there: if you get a 
"Sep 2 09:56:21 maia pppd[2805]: Timeout waiting for PADO packets
Sep 2 09:56:21 maia pppd[2805]: Unable to complete PPPoE Discovery"
error, it's probably the ISP's fault (in my case, bridge mode was 
enabled on only one port of their router).
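
For context: during PPPoE discovery pppd broadcasts a PADI frame and waits for a PADO offer from the ISP's access concentrator, so a PADO timeout means no offer ever came back over that link. If you'd rather not stare at tail -f, here is a rough Python sketch that counts those timeouts per pppd process. The function name and the regex are my own assumptions, based only on the log lines shown above. Other pppd versions may word the message differently.

```python
import re

# Pattern assumed from the syslog lines in this post, not from pppd documentation.
PADO_TIMEOUT = re.compile(r"pppd\[(\d+)\]: Timeout waiting for PADO packets")


def count_pado_timeouts(lines):
    """Return {pppd pid: number of PADO timeouts} for the given syslog lines."""
    counts = {}
    for line in lines:
        match = PADO_TIMEOUT.search(line)
        if match:
            pid = int(match.group(1))
            counts[pid] = counts.get(pid, 0) + 1
    return counts


if __name__ == "__main__":
    # Sample lines copied from the syslog excerpt above.
    sample = [
        "Sep 2 09:56:21 maia pppd[2805]: Timeout waiting for PADO packets",
        "Sep 2 09:56:21 maia pppd[2805]: Unable to complete PPPoE Discovery",
        "Sep 2 09:57:26 maia pppd[2805]: Timeout waiting for PADO packets",
    ]
    print(count_pado_timeouts(sample))
```

To run it against a real log, pass an open file instead of the sample list, e.g. count_pado_timeouts(open("/var/log/syslog")).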

One command away from hell :D

Posted in rm*

So there I was, cleaning out the e-mail server (Postfix running on Debian), when I noticed that a lot of customers had trash folders holding up to 6 GB of data. Since the e-mail server was running out of space, I decided to clean them up.
So I casually proceeded to type in the following while (so I thought) in the trash directory:
root@XYXY:/# rm *

rm: cannot remove `bin': Is a directory
rm: cannot remove `boot': Is a directory
rm: cannot remove `dev': Is a directory
rm: cannot remove `emul': Is a directory
rm: cannot remove `etc': Is a directory
rm: cannot remove `home': Is a directory
rm: cannot remove `initrd': Is a directory
rm: cannot remove `lib': Is a directory
rm: cannot remove `lost+found': Is a directory
rm: cannot remove `media': Is a directory

The one thing that went through my mind is…Sh** – what a massive fu** up

After panicking for an hour or two…I started to pull it all back together from bits and pieces.

Fortunately I had a fairly recent backup on another server, and that eased my mind because it wasn’t a total disaster. I just had to find a way to start up the VM that I had successfully deleted.

I took a look at the other VMs I had running and copied their boot. I knew I was going to need this because I had realised (after A LOT of googling :D) that I had removed /lib64 and /lib32 (which are symlinks to /lib).
That left me unable to execute any command, because every dynamically linked binary depends on /lib64/ld-linux-x86-64.so.2.

Here’s the part where I got lucky and saved my sorry a** :)

So there is this one file on the system that is statically linked and doesn’t need any libs (/lib64), and that is the dynamic loader itself: the /lib/ld-linux-x86-64.so.2 file.
If you run a command through this file, you can force the ln command to look in /lib for its libs instead of the missing (deleted by yours truly) /lib64.

So all you need to do is type in:
/lib/ld-linux-x86-64.so.2 --library-path /lib /bin/ln -sf /lib /lib64
Which in fact recreates the /lib64 symlink… HURRAY :D
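
The whole mess started with rm * running in a directory I wasn't actually in. Here is a minimal Python sketch of a guard I could have used instead. It only deletes files when the resolved target path ends with an expected suffix, and it skips symlinks and directories. The function name clear_directory and the ".Trash" folder name are hypothetical, picked for illustration and not taken from my mail server's layout.

```python
import os
import tempfile


def clear_directory(target, required_suffix):
    """Delete regular files directly inside `target`, but only if the
    resolved path ends with `required_suffix`. Returns the removed names."""
    resolved = os.path.realpath(target)
    if not resolved.endswith(required_suffix):
        raise ValueError(
            f"refusing to clean {resolved!r}: path does not end with {required_suffix!r}"
        )
    removed = []
    for entry in os.scandir(resolved):
        # Symlinks (like /lib64) and directories are left alone on purpose.
        if entry.is_file(follow_symlinks=False):
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)


if __name__ == "__main__":
    # Demo in a throwaway directory so nothing real is touched.
    with tempfile.TemporaryDirectory() as base:
        trash = os.path.join(base, ".Trash")
        os.mkdir(trash)
        open(os.path.join(trash, "old-mail"), "w").close()
        print(clear_directory(trash, ".Trash"))
        try:
            clear_directory(base, ".Trash")  # wrong directory: must be refused
        except ValueError as err:
            print("blocked:", err)
```

The point of passing the path explicitly is that the check no longer depends on whatever the current directory happens to be.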

Debian server crashing after a lot of data transfer

Posted in debian weird

I’m running Debian with the 2.6.32-5-xen-amd64 kernel.

Recently I had quite a surprise when all of a sudden my DNS server crashed. To make matters even worse, this happened during a presentation of a newly created web page, making the presentation impossible.

I rushed to connect to the server, only to find that the VM was running and everything seemed to be just fine.
Before examining it any further, I decided to restart the whole machine (which runs a couple of VMs).

After the restart, and once all the services / VMs had booted successfully, everything seemed to be in order. All of our online web services started running and it looked like DNS was working again.

I was completely confused…and started to check the logs (one by one) in order to find out what was going on.

This is what caught my eye in /var/log/syslog

Jun 17 03:00:02 **YourServerName** kernel: [10600465.191348] EXT4-fs (dm-15): ext4_orphan_cleanup: deleting unreferenced inode 131315
Jun 18 03:00:05 **YourServerName** kernel: [10686867.920018] EXT4-fs (dm-15): ext4_orphan_cleanup: deleting unreferenced inode 131315
Jun 19 03:00:03 **YourServerName** kernel: [10773266.248593] EXT4-fs (dm-15): ext4_orphan_cleanup: deleting unreferenced inode 131315
Jun 20 03:00:02 **YourServerName** kernel: [10859664.573971] EXT4-fs (dm-15): ext4_orphan_cleanup: deleting unreferenced inode 131315

After googling it…I found out it is an issue for a lot of IT folk out there

http://ubuntuforums.org/showthread.php?t=1861588

Some are suggesting that it is hardware related (HDD bad sectors)

https://bbs.archlinux.org/viewtopic.php?id=95683

It seems like a lot of data transfer locks the HDD into read only mode!
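
To check whether a filesystem really did flip to read-only, one option is to look at the mount options in /proc/mounts. Below is a small Python sketch of that check. The function name read_only_mounts and the sample data (including the vg0 device names) are made up for illustration and are not taken from my server.

```python
def read_only_mounts(mounts_text, fs_types=("ext3", "ext4")):
    """Return mount points that carry the 'ro' option, given text in
    /proc/mounts format: device, mount point, fs type, options, dump, pass."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, fs_type, options = fields[1], fields[2], fields[3]
        # Compare whole options, so 'errors=remount-ro' does not count as 'ro'.
        if fs_type in fs_types and "ro" in options.split(","):
            result.append(mount_point)
    return result


if __name__ == "__main__":
    # Hypothetical sample. On a real box use open("/proc/mounts").read().
    sample = (
        "/dev/mapper/vg0-root / ext4 rw,relatime,errors=remount-ro 0 0\n"
        "/dev/mapper/vg0-data /srv ext4 ro,relatime 0 0\n"
        "proc /proc proc rw,nosuid 0 0\n"
    )
    print(read_only_mounts(sample))
```

Run from cron, something like this would at least have told me about the problem before the presentation did.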

The only common factor I found is that it happens on Unix machines 12 to 24 hours after a lot of data transfer (heavy hard disk activity).
The night before, I had copied a DB with scp from my computer to the server, and later on to another server that acted as a host.