Viewing disk usage

Keeping tabs on your storage is always important, as no one enjoys getting a call in the middle of the night that a server is having an issue, least of all one that could've been avoided, such as a filesystem growing too close to being full. Managing storage on Linux systems is easy once you master the related tools, the most useful of which I'll go over in this section. In particular, we'll answer the question "what's eating all my free space?" and I'll provide you with some examples of how to find out.

First, the df command. This command is likely always going to be your starting point in situations where you don't already know which volume is becoming full. When executed, it gives you a high-level overview, so it's not necessarily useful when you want to figure out who or what in particular is hogging all your space. However, when you just want to list all your mounted volumes and see how much space is left on each, df fits the bill. By default, it shows you sizes in 1K blocks. However, I find it easier to use the -h option with df, which prints sizes in human-readable units such as megabytes and gigabytes. Go ahead and give it a try:

df -h

Output from the df -h command

The output will look different depending on the types of disks and mount points on your system. I took the information in this example from a DigitalOcean VPS. In my case, you'll see that the root filesystem is located on a disk called DOROOT, and it is currently only utilizing 10% of the space allocated for it. As you can see from the example, df gives you some very useful information for starting your investigation. It shows you the Filesystem, its Size, how much space is Used, how much is available (Avail), the percentage in use (Use%), and the directory the filesystem is Mounted on.
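If you'd rather not eyeball the Use% column yourself, the same information can be filtered with a few lines of shell. The following is only a minimal sketch: the function name flag_full_filesystems and the 80 percent default are my own choices for illustration, and it assumes mount points and device names contain no spaces. It reads df -P output, where -P requests the POSIX format that keeps each filesystem on a single line.

```shell
#!/bin/sh
# Print every mount point whose usage is at or above a threshold.
# Reads "df -P" style output on stdin. The name and the default of 80
# are illustrative assumptions, not part of df itself.
flag_full_filesystems() {
    threshold="${1:-80}"
    awk -v limit="$threshold" '
        NR > 1 {                   # skip the header line
            use = $5               # fifth column is the capacity, e.g. "10%"
            sub(/%$/, "", use)     # strip the trailing percent sign
            if (use + 0 >= limit + 0)
                print $6 " is " use "% full"
        }'
}

# Check the filesystems mounted on this machine against an 80% limit.
df -P | flag_full_filesystems 80
```

Because the function only reads standard input, you could also feed it df output saved from another server.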

While investigating disk utilization, it's also important to understand the concept of inodes. While going deep into inodes is beyond the scope of this book, the basics are simple enough and can save you a lot of trouble, in particular in situations where an application is reporting that your disk is full but the df -h command shows plenty of free space is available. Think of an inode as a type of database object, containing metadata for the actual items you're storing. An inode stores details such as the owner of the file, permissions, last modified date, and type (whether it is a directory or a file).

The problem with inodes is that you can only have a limited number of them on any storage media. This number is usually extremely high and hard to reach. In the case of servers, though, where you're possibly dealing with hundreds of thousands of files, the inode limit can become a real problem. I'll show you some output from one of my servers to help illustrate this:

Output from the df -i command

The command I executed to see the output in the previous screenshot was df -i. As you can see, the -i option of df gives us information regarding inodes instead of the actual space used. In this example, my root filesystem has a total of 1310720 inodes, of which 94735 are used and 1215985 are free. If you have a system that's reporting a full disk (though you see plenty of space is free when running df -h), it may actually be an issue with your volume running out of inodes. In this case, the problem would not be the size of the files on your disk, but rather the sheer number of files you're storing. In my experience, I've seen this happen with mail servers becoming backed up (millions of stuck emails, with each email being a file), as well as with unmaintained log directories. It may seem as though having to contend with an inode limit is unbecoming of a legendary platform such as Linux, though as I mentioned earlier, this limit is very hard to reach unless something is very, very wrong.
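When df -i does show a volume running low on inodes, the next question is which directory holds all those files. One rough way to find out is to count the entries under each subdirectory. This is a sketch under stated assumptions: count_inodes_per_dir is a name I made up for illustration, and the count is an approximation, since hard links share an inode and a filename containing a newline would be counted twice.

```shell
#!/bin/sh
# Approximate how many inodes each immediate subdirectory consumes,
# largest first. The function name is hypothetical.
count_inodes_per_dir() {
    start="${1:-.}"
    for dir in "$start"/*/; do
        # Skip the unexpanded pattern when there are no subdirectories.
        [ -d "$dir" ] || continue
        # Each line printed by find is one entry (the directory itself
        # included). -xdev keeps the scan on a single filesystem.
        count=$(( $(find "$dir" -xdev 2>/dev/null | wc -l) ))
        printf '%s\t%s\n' "$count" "$dir"
    done | sort -rn
}

# Scan the directory given as the first argument, or the current one.
count_inodes_per_dir "${1:-.}"
```

As with du, you would start high in the tree and call the function again on whichever directory tops the list.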

The next step in investigating what's gobbling up your disk space is narrowing down where it's going. At this stage, there are a multitude of tools you can use. The first I'll mention is the du command, which shows you how much space a directory is using. Using du against directories and subdirectories will help you narrow down the problem. Like df, du accepts the -h option to make its output easier to read. By default, du scans your current working directory and lists every subdirectory beneath it along with the space each one uses, ending with a total for the directory itself on the last line.
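Since du descends into every subdirectory, one common way to keep the listing short is to cap the depth that gets printed. The sketch below assumes the GNU version of du that ships with Ubuntu, and du_one_level is simply a wrapper name I chose for illustration. Everything is still measured. Only the display is limited to the starting directory and its immediate subdirectories.

```shell
#!/bin/sh
# Show the size of a directory and of each immediate subdirectory only.
# The wrapper name is hypothetical; the options are GNU du options.
du_one_level() {
    # -x            stay on the filesystem the starting directory lives on
    # -h            human-readable sizes
    # --max-depth=1 print the starting directory and one level below it
    du -xh --max-depth=1 "${1:-.}" 2>/dev/null
}

# Summarize the current working directory.
du_one_level .
```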

Note

The du command is only able to scan directories that its calling user has permission to scan. If you run this as a non-root user, then you may not be getting the full picture. Also, the more files and subdirectories that are within your current working directory, the longer this command will take to execute. If you have an idea where the resource hog might be, try to cd into a directory further in the tree to narrow your search down and reduce the amount of time the command will take.

The output of du -h can often be more verbose than you actually need in order to pinpoint your culprit and can fill several screens. To simplify it, my favorite variation of this command is the following:

du -hsc *

Basically, you would run du -hsc * within a directory that's as close as possible to where you think the problem is. The -h option, as we know, gives us human-readable output (sizes in megabytes, gigabytes, and so on). The -s option summarizes each item to a single line instead of listing everything beneath it, and -c adds a grand total of everything listed as the last line. Keep in mind that the shell expands * to the non-hidden items only, so files and directories whose names begin with a dot are not counted. The following screenshot shows this output from my desktop (yes, I have a lot of games and music):

Example output from du -hsc *

As you can see, the information provided by du -hsc * is a nice, concise summary. From the output, we know which directories within our working directory are the largest. To further narrow down our storage woes, we could cd into any of those large directories and run the command again. After a few runs, we should be able to narrow down the largest files within these directories and make a decision on what we want to do with them. Perhaps we can clean unnecessary files or add another disk. Once we know what is using up our space, we can decide what we're going to do about it.
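When a directory holds many items, it can help to sort the summary so the largest entries come first. Here is a small sketch of that idea, assuming GNU sort (its -h option understands suffixes such as K, M, and G). The function name largest_entries and the default of ten results are my own choices for illustration.

```shell
#!/bin/sh
# List the largest items in a directory, biggest first.
# The function name and the default limit of 10 are illustrative.
largest_entries() {
    dir="${1:-.}"
    limit="${2:-10}"
    # Run in a subshell so the caller's working directory is left alone.
    (
        cd "$dir" || exit 1
        # -s: one line per item; -h: human-readable sizes.
        # sort -rh orders those human-readable sizes from largest to smallest.
        du -sh -- * 2>/dev/null | sort -rh | head -n "$limit"
    )
}

# Show the ten largest items in the current working directory.
largest_entries . 10
```

As with du -hsc *, hidden items are skipped because the shell does not expand * to names beginning with a dot.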

At this point in reading this book, you're probably under the impression that I have some sort of strange fixation on saving the best for last. You'd be right. I'd like to finish off this section by introducing you to one of my favorite applications, the Ncurses Disk Usage Utility (or more simply, the ncdu command). The ncdu command is one of those things that administrators who constantly find themselves dealing with disk space issues learn to love and appreciate. In one go, this command not only gives you a summary of what is eating up all your space, it also lets you browse the results without having to run a command over and over while manually traversing your directory tree. You simply execute it once and then you can navigate the results and drill down as far as you need.

To use ncdu, you will need to install it, as it isn't included with Ubuntu by default:

# apt-get install ncdu

Once installed, simply execute ncdu in your shell from any starting directory of your choosing. When done, simply press q on your keyboard to quit. Like du, ncdu is only able to scan directories that the calling user has access to. You may need to run it as root to get an accurate portrayal of your disk usage.

Note

You may want to consider using the -x option with ncdu. This option will limit it to the current filesystem, meaning it won't scan network mounts or additional storage devices; it'll just focus on the device you started the scan on. This can save you from scanning areas that aren't related to your issue.

When executed, ncdu will scan every directory from its starting point onward. When finished, it will give you a menu-driven layout allowing you to browse through your results:

ncdu in action

To operate ncdu, you move your selection (indicated with a long white highlight) with the up and down arrows on your keyboard. If you press Enter on a directory, ncdu switches to showing you the summary of that directory, and you can continue to drill down as far as you need. In fact, you can actually delete items and entire folders by pressing d. Therefore, ncdu not only allows you to find what is using up your space, it allows you to take action as well!