TLUG Talk: TLE-BU, 2005


Note: This program is no longer being developed as the parent company, TLE, is no longer in business. This is here mainly for historical reasons. This was the second of two talks I gave on TLE-BU. This one showed changes made in the TLE-BU backup program over the intervening year.

Warning: TLE-BU is still on SourceForge for download, but it is not recommended for use. There was a critical bug with non-ASCII file names that caused unreliable backups. TLE folded before I got to fix it, unfortunately.

Future: The AN::Tools module suite directly evolved from the initial failings of TLE-BU. I decided to write it as a portable module suite that would provide all of the core functions TLE-BU would have needed but in a much more robust package. Once AN::Tools is finished, my first personal project will be to write AN!Backup, a completely new "under the hood" version of TLE-BU. I plan for it to have the same look and feel of TLE-BU when the time eventually comes.

TLE-BU
Open Source, Web-Based Backup Software

Madison Kelly,

Lead Technician, TLE

Oct. 11, 2005


Intro

My name is Madison Kelly. I am the Lead Technician for my employer, "The Linux Experience". I want to give this talk tonight to introduce a new, open source program I wrote called TLE-BU. It is written in Perl, Javascript and a little bit of C, and uses a PostgreSQL database as the backend for data storage. Some of you may remember that I presented this program a year ago. Well, a lot has changed and I hope you like how it has evolved!

I decided to write a new backup application from the ground up when I was unable to find an existing backup application that provided certain features a client of ours had requested. Do not get me wrong, there are several very mature and very well written open source backup programs! The main feature that we found lacking, though, was a user-friendly front end. All of the viable existing backup software I found was console-based or very much aimed at an experienced administrative user, which was not helpful for our needs. The existing software for the most part also did not provide the features I wanted, like intelligent spanning of destinations, an archival component and so on.

There is a lot of information I hope to cover in the next hour. I ask you all to refrain from making comments or asking questions until I am finished. If you think I have made a mistake please feel free to correct me if it is significant. If we can get through the presentation without interruptions there will be time for a question and answer (and correction) period.

Finally before we begin; TLE-BU is in beta-stage at this moment. As with any beta software I have done my best to make it stable and bug free but until it reaches 'mature' at version 1.0 please expect the occasional bug or three. I would always love to hear from anyone with ideas on the program or who finds a bug. Even the seemingly smallest contributions can in fact help me a great deal.

Let's Begin

Perhaps one of the most important yet rarely done things that any computer user should do is back up their important data. If you work in an IT role for a company, the need to have a good backup system is that much more critical. Almost all of us in this room have at some point lost data that wasn't backed up, or seen this happen to someone else. The sick, sinking feeling is indescribable... It just sucks.

Regular backups almost never happen for long if a person tries to manually copy over important data to removable media like a CD/DVD or external drive. Human nature almost always gets in the way. People get busy, people put it off "until tomorrow" and then people forget. Eventually, and usually pretty quickly, the regular backups simply stop happening. I know this is true for me and I have seen it be true with others as well.

Genesis

I wanted a set of features and when I couldn't find any existing software that would work for me I, like many geeks, decided to write my own program. This meant I had to lay out some key features of the program that I wanted before I began:

What did I want this backup program to be?

First question; "What media will store the source data?"

First option; CD/DVDs

They are inexpensive but have limited storage space and are a pain to manage when data needs to span disks. They don't support easy modification of existing data and need a special erase call to reuse if they are RW disks.

Second option; Tape drives

They are expensive and only the highest-priced Ultrium drives approach the size of modern systems' hard drives. Tapes are relatively unreliable and, if the drive is damaged, recovering the data requires finding another compatible drive to read it. Tape and drive failures can be silent, tricking the user into thinking their media is okay when in fact it has failed. Periodic testing of backup media is labour intensive.

Third option; USB2/Firewire Hard drives

The price per gigabyte is very low. They are fast and reliable. The drives can be connected to almost any system using standard hardware. Drives connected over USB or Firewire can be "hot-swapped" without additional costly hardware.

That choice was easy

Second question; "Who do I want to aim this program at?"

First option; Technically proficient system administrators.

There are already several mature backup programs out there for technical users whose job it is to back up servers or who have time to dedicate to learning a complex backup program for one or two installs only.

Second option; Small to medium businesses.

Many companies are in a position where they may have many workstations and a few servers but no on-staff IT person. These companies generally rely on contracting an IT support company to handle difficult tasks and do their best to take care of daily computer tasks themselves. This means the program had to have a friendly interface and be easy to use once it is installed. No existing backup program seems to fall into this category.

Another easy choice.

Third question; "What features do I want in this program?"

First feature; Granular directory and file selection.

Most (all?) of the existing Linux backup software provides limited ability to select what directories and files will be selected or excluded from a backup or recovery job. I wanted this program to provide a traditional file browser interface that allowed for a simple point-and-click method of selecting and unselecting what would be backed up.

Second feature; It had to be accessible from Microsoft and other workstations.

As much as I want to promote Linux I am also well aware that many users are tied to Microsoft Windows machines for various reasons. The office user who would be tasked with things like recovering data from a given drive would likely have little or no experience with Linux. This meant I could either write client software to access the program or provide a web interface. Both had pros and cons but I decided on a web interface mainly because it would work on any machine regardless of the underlying OS.

Third feature; There had to be an easy way to find backed up data.

This meant having some mechanism for recording detailed directory and file information that could then be searched even when all destination drives were offline. The search had to provide a way to easily identify a given destination drive and provide details such as when a given file was last modified before it was backed up.

Fourth feature; It had to have a friendly interface and be easy to use by non-techies.

The program had to have reasonable default values so that it could perform full backups with minimal configuration. Whenever a technical term had to be used there needed to be a simple way for the user to read about that term. In short, the program has to be as intuitive as possible and be a teacher when needed.

The last straw, or, what made me nearly bite off more than I could chew:

Building a backup program that was easy to use was critical to me. I have been actively promoting Linux and moving home users and small businesses over to Linux for some time now. The greater Linux community has done amazing work in the last few years making Linux easier to use and thus letting me be successful in these migrations. Most people here will agree that distributions like Fedora Core and Mandrake have made using Linux "easy" and realistic for John and Jane Q. Public and that is what has been scaring Microsoft badly (and "That's a Good Thing").

I ran into a particular client where not having an easy to use backup program became the deal-breaker for a Linux migration. This client ran a small office with two servers (Netware 4.11 and MS Windows 2000 Server) with about 6 employee workstations. The only thing that kept him on MS and Netware instead of Linux/Samba was the easy selection of files to backup and recover that he had available to him using ArcServe. This person was not a technician so he declined to use several of the Linux command-line based backup programs I presented to him resulting in a lost opportunity to migrate another small business away from Microsoft.

This lack of an application that was easy to use was the driving force behind me creating TLE-BU. The goal of TLE-BU is to both lighten the load of admins and to provide one more option to those of us working to migrate less-technically inclined people and businesses.

The Program:

First and foremost, TLE-BU exists in large part thanks to many people, including many people here tonight who took the time to answer my many, many questions on the mailing list. I would like to list everyone but it would take so long I would have little time to actually present to you guys. If you get a chance to look at the main TLE-BU manual I have done my best to remember all key people and list them and their contributions in section 1.6.0 'People TLE-BU needs to thank!'.

TLE-BU was developed on and tested under Debian and Fedora Core 2+. I have not yet been able to test it on other systems, though I suspect any bugs that show up on other Linux 2.6.x-based distributions should be fairly easy to fix. If you are running FC2+ or Debian on a 2.6 kernel then TLE-BU should run nicely for you.

You can download the program from the main TLE-BU website at 'http://tle-bu.org'. I work full time on this program so please check in whenever you want to see what is new and what has changed. There is a TLE-BU Community Forum on the main web page and there are two traditional mailing lists hosted by SourceForge (follow the 'Download' link on the main page). Anyone who wishes to join is more than welcome!

So what IS TLE-BU Exactly?

In one line:

TLE-BU is a web-based, network-aware backup and archival program using USB2/Firewire connected disks to store data.

A little more detail

  • TLE-BU has been written in Perl with small amounts of Javascript and C, using a PostgreSQL database to store its data and detailed information on all files it scans.
  • It can backup data from multiple source partitions or network shares to multiple destination drives or network shares intelligently.
  • All 'devices' (partitions and network mounts) are handled as individual items and can be 'labeled' by the user in any way he or she wishes to aid in searches and data recovery.
  • It has a built-in search engine that allows searching for files and directories on all media, including drives that are currently offline in a way that makes finding the specific device a given file is on easy.
  • The combination of the device labeling and search engine makes TLE-BU a very effective archival program even for very large datasets with many different destination devices.
  • The program uses an internal user account system that allows an administrator to limit other users' access to the program.
  • It uses an internal web server on, by default, TCP port 853 to simplify install and prevent conflicts with other web-based apps on the server.
  • It uses 'rsync' to perform the actual copy of data, which means files backed up by previous backup jobs that haven't changed are not copied again, reducing backup times by a fair amount.
  • No concern about incremental and full backups. Data is always in an easy to restore state.
  • TLE-BU is not needed to restore data from a destination drive (though it certainly can restore the data for you or a user). The data is not compressed or modified in any way to make restore as simple and quick as possible.
  • It has 'tooltips' that explain all key words, items, buttons and options to quickly help any user understand how the program works.

Enough 'What', let's hear some 'How'

Now that I have covered the why and what, I will cover the how.

Installation

The TLE-BU source comes with a simple installer program that should work on most systems. At the most basic, just download the source and extract it the same way you would any other tarball, i.e.:

tar -xvzf tle-bu_0.2_xx.tar.gz

(replace 'xx' with the version you downloaded)

Then change into the new sub directory 'tle-bu_0.2' and run the installer as root:

./install.pl

The default options should be fine for most systems. If you want more details on how to install the program and what switches are available either read the 'README' file that came with the source and/or run:

./install.pl --help

There is an uninstaller as well called 'uninstall.pl'. You should save this script before deleting the source in case later you want to remove TLE-BU. To run the uninstaller run as root:

./uninstall.pl

If you want a complete uninstall run:

./uninstall.pl -c=both

Which will remove both TLE-BU configuration files.

There are several options in the install files that allow you to adjust how TLE-BU runs. To explain them all would require a lot more time than I have, so I will leave specific questions until after the presentation or recommend you read the TLE-BU v0.2 Manual available on the website, which goes into all the options in greater detail.

At this point you should be able to start using TLE-BU by opening up a web browser and browsing to: http://localhost:853. For security reasons TLE-BU only listens for connections on '127.0.0.1' which effectively blocks any remote access to the program. If you want to make TLE-BU available to other networked machines either change the 'Listen' parameter in the '/etc/tb_httpd.conf' file or run the installer with the 'ip=<address>' switch. Enter the IP address of the network interface you want to accept connections on.

It's installed, now what? Go 'Home'

The first screen you will see is the login screen asking you for a user name and password. By default the user name is 'Admin' and the password is 'tle-bu'. The very first thing you should do is change the password! Once you log in the 'Home' menu will appear and the first option will be 'Account'. Click on this and you will see a short list of information you can enter about the 'Admin' user.

The top line is where you can change the password. First enter the current password in the left box and then the new password in both right boxes. Nothing fancy here. The only other really important thing to do is to change the email address to a real one. The email address assigned to the 'Admin' user will get messages from the backup job.

If you wish to give access to another user, for example an office staff member who will maintain the program, you can click on the 'Users' button in the 'Home' menu. You can then add a new user and decide what parts of the program they have access to. For example, an office worker should not need to edit the settings, view the logs, add/edit/delete other users and so on. It is always a good idea to restrict access as much as possible "just in case".

As the administrator you will see the 'Settings' menu where most of the program settings you may wish to change on any kind of a regular basis can be found. All the options have tooltips that explain in some depth what each option does so I will just mention the settings are there and move on for now.

fig. 3.0.a, The 'Home' Menu:


Dealing with partitions, the 'Partitions' menu

Once you get user accounts settled away you will probably want to set up partitions next. The 'Partitions' label in TLE-BU is a little deceptive and will change in the next version, but for now I will use 'partitions' to refer both to actual partitions on a disk and to network shares over SMB or NFS.

Before TLE-BU will use a partition it needs to be assigned as either a 'Source' partition or a 'Destination' partition. As their names imply, the files and directories selected on source partitions will be backed up to partitions assigned as destination devices. By default partitions start off unassigned, meaning that beyond gathering basic partition information like mount point and usage information TLE-BU will not use the partition in any way.

Here is a sample list of partitions taken from my laptop with an SMB share mounted. You can see two currently offline (unmounted) NFS partitions, too. In this case I have not yet added any external disk drives and only my 'test' partition is selected as a destination device.

fig. 3.1.a, The 'Partitions' Menu:


To make the list of devices a little more compact, especially for people with many devices, you can switch to a compact layout by clicking on the "Show Basic Partition Information" button.

fig. 3.1.b, The 'Partitions' Menu, Compact layout:

To start you need to select which partitions you want to save data from and assign them as 'Source' partitions by simply clicking on the 'Source' button. Next you do the same thing for the partitions you want to save data onto by clicking on their 'Destination' button ('Dest.' in the compact layout). If you did nothing else you would already be one step away from being ready to start backing up data. That last step is to create a backup job.

By default, when you select a partition as a 'Source' partition every directory and file in it is automatically selected for backup. You only need to manually select directories and files if you want certain ones to be excluded from a source. I will cover selecting files for backup or recovery a little later.

It is very helpful if you add a comment (or catalog number) of your choice to each source and destination partition. You can do this from here as well by clicking on the 'Edit' button near the comment, and the page will reload with a text box where you can add or edit the partition's comment.

fig. 3.1.c, Editing a Comment:


When you are finished editing just click on the 'Save' button and the changes will be saved. This will help identify a partition when you search for a file or directory, as you will see shortly.

I Want Granularity! The File Browser and how it works

Before I move on I want to cover how you can select what files you want to backup or recover using the built-in file browser. I will take some extra time to explain the inner workings of this section because it is probably the most interesting part of the program from a technical point of view.

When you want to use the file browser the program will first scan the contents of the partition in question if it is online. Depending on the speed of the machine running TLE-BU, the number of files on the device being scanned and the speed of the disks containing the files being scanned this could potentially take some time. The trade off though is that some of the more unique benefits of TLE-BU become possible.

Generally speaking, relatively modern, realistic systems can scan roughly 2,000 to 3,000 files a second. In TLE-BU the overall speed at which the files on a partition are scanned is calculated as the 'U.Rate', which you can use to help you tune your system. Even modest systems, like the Pentium3 1GHz laptop that TLE-BU was primarily developed on (and these screen shots are taken on), can run at a U.Rate of roughly 500 to 800 depending on what's running in the background.

fig. 3.2.a, A partition being scanned:


From a technical point of view there were two steps in developing TLE-BU that were very challenging and which, to this day, I am most actively looking for ways to improve on. The first challenge is loading the data into the database as efficiently as possible while retaining backup, recovery and display states for all files. The second is reading out the directories in the file browser fast enough to make the directory tree usable over a stateless interface.

First, I would like to cover the first challenge. Specifically what, technically, happens when a partition is scanned and updated.

TLE-BU is technically only at version 0.2 but it is actually in its third incarnation. Both re-writes occurred before TLE-BU was even declared 'beta' because until now the performance of the update was just too slow. The first 'functional' version of TLE-BU had a U.Rate of 17! The key to increasing the performance to an acceptable level came from Cameron Bessey when he suggested my biggest mistake was re-using the same PostgreSQL table and suggested instead that I dump the database each time. Bingo!

The biggest problem I had with scanning a partition's files wasn't the actual scan time. That alone could happen quite fast. The trick was that I had to have a way to remember whether a file was selected for backup or recovery or not and, to a lesser degree, whether a directory was set to show its subdirectories or not. The eventual answer was to use a set of four multi-dimensional hashes using the format '$hash{parent_dir}{file_name}{file_type}'. The first one is used to check if a file has been scanned before and the other three store an existing file's 'backup', 'restore' and 'display' states.

Once the existing data on all the files from the partition is loaded into the hashes the existing table is dropped and then re-created. This is done because a complete delete and vacuum is very slow whereas a drop and rebuild is nearly instant. A connection is opened to the database and a 'COPY...' is started to re-write all the files now on the system. Then the actual scan begins using Perl's 'opendir/readdir' commands.

As each file is read it is checked to see if it is a symlink, file or directory. Then it is 'stat'ed to gather information like size, mode and ownership. Then the first hash is checked to see if the file has been scanned before. If not then a new entry is added to the 'COPY' job with the 'backup', 'restore' and 'display' values set to 'i' (inherit). If the file was seen before then the other three hashes are read in for that file and the file information is written to the 'COPY' job with the previous 'backup', 'restore' and 'display' values. Lastly, if the file being read is a directory the program dives into it, scans its contents and then returns to finish scanning the current directory using a re-entrant subroutine.
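The scan described above can be sketched roughly as follows. This is a minimal Python illustration rather than TLE-BU's actual Perl code: the function name 'scan_partition', the row layout, the single lookup dictionary (standing in for the four hashes) and the one-letter type codes 'l', 'd' and 'f' are all assumptions made for the example. Only the 'i' (inherit) default comes from the talk. The rows it returns stand in for the lines that would be fed to the 'COPY' job.

```python
import os
import stat

def scan_partition(root, previous_states):
    """Walk `root` recursively and return one row per entry found.

    `previous_states` maps (parent_dir, name, file_type) to the
    (backup, restore, display) tuple remembered from the last scan.
    It is a hypothetical stand-in for the hashes described in the talk.
    """
    rows = []

    def scan_dir(parent_dir):
        for name in sorted(os.listdir(parent_dir)):
            path = os.path.join(parent_dir, name)
            info = os.lstat(path)  # lstat so symlinks are not followed
            if stat.S_ISLNK(info.st_mode):
                file_type = "l"
            elif stat.S_ISDIR(info.st_mode):
                file_type = "d"
            else:
                file_type = "f"
            key = (parent_dir, name, file_type)
            # Entries never seen before default to 'i' (inherit).
            backup, restore, display = previous_states.get(
                key, ("i", "i", "i"))
            rows.append((parent_dir, name, file_type, info.st_size,
                         info.st_mode, info.st_uid,
                         backup, restore, display))
            if file_type == "d":
                # Dive in, then carry on with the current directory.
                scan_dir(path)

    scan_dir(root)
    return rows
```

Because the previous states are looked up in memory, the database table can be thrown away and rebuilt on every scan without losing what the user had selected.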

If there was ever a 'secret' to TLE-BU, that was it. As simple as it sounds that took me several months to develop (though it could be argued that I have occasion to miss the forest for the trees).

Back to the program then!

Once the scan is finished you can proceed to the actual file browser. This is in the familiar split vertical frames format often used by other file browsers. From here a directory, its sub directories, its files and individual files can be selected for backup or recovery. As mentioned earlier, this ability to granularly select what you want to backup or recover was a feature I could not find in existing backup programs. I did not see how a backup program could be considered "user-friendly" without this style of an interface.

As I just mentioned, the second biggest challenge to TLE-BU's development was finding a way to deliver the file browser. One option was using Java or a similar stateful interface but I decided against it rather quickly because many client machines do not have full-blown Java installed on their machines. It was very important to me to make TLE-BU as easy for end-users to use as possible so I decided on using only Javascript and regular HTML which most modern browsers readily support. The challenge with this decision is that even with Javascript I had a stateless interface and a file browser very much needed a stateful interface.

Here is where credit is due in large part to Emma Jane Hobgin. One late evening she took the time to hear my predicament and even connect to my system remotely to help me get the Javascript working that would prove critical in making the file browser behave like other file browsers. Specifically this meant changing the state of check boxes of files and directories currently displayed when their parent directory's check box was selected. With that critical bit in place I was able to start working on the back end.

fig. 3.2.b, The File Browser:


In short, the file browser works like most others of its type. If you click on the folder icon to the left of a directory name its sub directories (if any) will be shown or hidden. If you click on a directory's name then any files in that directory will appear in the file panel on the right. If you toggle the check box of a directory its backup or restore state will change and all the directories and files under it will also change to match its new state. If you toggle the check box of a single file then its state alone will change.

When a new file or directory is found during a scan it takes the state of its parent. For example, if you had the directory '/foo/bar' and at some time in the past you selected it to be backed up and later you added the directory '/foo/bar/baz' it would inherit its parent's backup state and also be backed up. Pretty straightforward.
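One way to picture this inheritance is a lookup that walks up the tree until it finds a directory with an explicit state. The Python sketch below is only an illustration under assumptions: the function name 'effective_state', the flat path-to-state dictionary and the 'y'/'n' values are hypothetical, and the talk does not say whether TLE-BU resolves the state this way.

```python
import posixpath

def effective_state(path, states, default="n"):
    """Return the first explicit (non-'i') state found walking from
    `path` up to '/'. Paths missing from `states` count as 'i'."""
    current = path
    while True:
        state = states.get(current, "i")
        if state != "i":
            return state
        if current == "/":
            return default  # nothing above us was set explicitly
        current = posixpath.dirname(current)
```

With '/foo/bar' set to 'y', a newly found '/foo/bar/baz' resolves to 'y' as in the example above, while '/foo' falls back to the default.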

Technically a lot has to happen in the background to make this happen smoothly. The time spent tweaking and improving this section is second only to the work put into the initial loading of the data. The tweaking of this section was *greatly* assisted by the folks on the TLUG, TPM and PostgreSQL mailing lists. This is probably the single biggest source of the questions I asked along the way!

This section has two challenges; the first was finding a way to quickly cause child objects to inherit the status of a parent when it was changed and the second was finding a way of displaying the directory tree quickly when a directory was expanded or contracted.

First though, a little on how it works.

All directories and files are tagged with an 'ID' whose value is the file or directory's parent directory. When the check box on a given directory is toggled a Javascript function is run that looks at all the files and directories currently being displayed to see if each one's ID is or starts with the directory that was just toggled. If it does, its check box is also toggled to match.

At the same time a call is made to the server which sends the parent directory, name and type of the file or directory that was just toggled. The script reads these values and makes a call to the database that says "change this file to have backup/recovery state X". If it's a directory a second call is then made that says "change the backup/restore state of any file with a parent directory matching or starting with the 'dir' to 'i' (inherit)". An index makes this update happen quite quickly even when the root directory is toggled and all items in the database table need to be checked.

To display the directory tree a call is made to the database to read in all the directories and their display states on a partition into two multi-dimensional hashes using the format '$hash{parent_dir}{dir_name}'. The first stores the directory's display state and the second stores its backup or restore state, used to decide if its check box is to be checked or not. I also use an anonymous array called '@{$parents{parent_dir}}' which I use to keep a list of what directories are directly beneath a given parent directory.

I use a special, non-existent directory '/.' to store the root directory's display and backup or restore status. I always display this record. The next step is to see if its display state is true and if so I read in the values in the anonymous array '@{$parents{'/'}}' to get the list of directories under it. As I display each directory I check its display state and if it is also set to be displayed I read its anonymous array, draw those sub directories and then return and continue drawing the rest of the directories under the root directory.

The challenges were

First, how to update the backup or recover state of all the directories and files under a directory quickly. Originally I stored a simple 'true' or 'false' for each state in the database. This meant that every record under the directory had to be changed. When the directory being toggled only had a few hundred items under it this wasn't a concern. On the other hand, if there were hundreds of thousands of files under the directory no index could make that a quick process.

The solution to this first problem was to use an 'inherit' value and create an index on the 'backup' and 'restore' fields. This meant that when a directory was toggled I could say 'update the file_info_# table and set any record whose parent directory starts with '/foo' and whose state is not inherit to inherit'. This meant that a mere fraction of the files had to be updated and an index on the parent directory and backup/restore fields could make that update happen very quickly.
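The two-step update can be sketched as below. This is an illustration only: it uses Python's built-in 'sqlite3' module in place of PostgreSQL so that it runs anywhere, and the table layout, the 'file_info' name without its numeric suffix, the index name and the 'y'/'n' values are assumptions. Note also that the sketch matches on a directory boundary ('/foo' and '/foo/...') rather than a bare 'starts with' test, so that a sibling like '/foobar' is left alone.

```python
import sqlite3

def make_table():
    """Create a small in-memory stand-in for a file_info_# table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE file_info (parent_dir TEXT, name TEXT, backup TEXT)")
    # Index on the fields the talk mentions (hypothetical name).
    conn.execute(
        "CREATE INDEX file_info_idx ON file_info (parent_dir, backup)")
    return conn

def toggle_directory(conn, parent_dir, name, new_state):
    """Set one directory's backup state, then reset everything beneath
    it that is not already 'i' back to 'i' (inherit). Returns how many
    rows beneath the directory actually had to be rewritten."""
    dir_path = parent_dir.rstrip("/") + "/" + name
    conn.execute(
        "UPDATE file_info SET backup = ? WHERE parent_dir = ? AND name = ?",
        (new_state, parent_dir, name))
    prefix = dir_path + "/"
    cur = conn.execute(
        "UPDATE file_info SET backup = 'i' "
        "WHERE (parent_dir = ? OR substr(parent_dir, 1, length(?)) = ?) "
        "AND backup <> 'i'",
        (dir_path, prefix, prefix))
    return cur.rowcount
```

The point of the 'backup <> 'i'' condition is that rows already inheriting are never rewritten, so only the handful of entries a user had set by hand are touched.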

The second problem was the drawing of the directories. Originally I would call the database to see if the root directory was expanded. If so, I would call the database again to get a list of directories under it. As I displayed each I would check its display state and if it was displayed I would call the database again to get the directories under it (see a bad pattern here?). This worked fine when only a few directories were being displayed but very quickly it started taking longer and longer to expand and contract directories.

The use of the MD hashes and the anonymous array meant that I could load all the directory information into memory in one swoop and then call the hashes/arrays to build the tree much, much faster even when hundreds of directories were opened at once. On my humble test system the program could quickly draw a directory tree with over 35,000 directories in roughly three to five seconds.
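The in-memory approach can be sketched like this. It is a simplified Python illustration, not the program's Perl: the function name 'build_tree_lines', the use of booleans for the display and backup states, the text check boxes and the omission of the special '/.' root record are all assumptions made to keep the example short.

```python
from collections import defaultdict

def build_tree_lines(rows):
    """Build the visible directory tree from rows read in ONE query.

    Each row is (parent_dir, dir_name, display, backup), where
    `display` says whether the directory is expanded and `backup`
    whether its check box is ticked.
    """
    display = {}
    backup = {}
    parents = defaultdict(list)  # parent_dir -> names directly beneath
    for parent_dir, dir_name, disp, bak in rows:
        display[(parent_dir, dir_name)] = disp
        backup[(parent_dir, dir_name)] = bak
        parents[parent_dir].append(dir_name)

    lines = []

    def draw(parent_dir, depth):
        for dir_name in sorted(parents.get(parent_dir, [])):
            box = "[x]" if backup[(parent_dir, dir_name)] else "[ ]"
            lines.append("  " * depth + box + " " + dir_name)
            if display[(parent_dir, dir_name)]:
                # Expanded: draw the children, then carry on here.
                child = parent_dir.rstrip("/") + "/" + dir_name
                draw(child, depth + 1)

    draw("/", 0)
    return lines
```

Everything after the initial load is a dictionary lookup, so expanding one more directory costs no extra round trips to the database.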

If there is anything innovative in TLE-BU it would be these database tricks. Thanks to them (and the people who helped me create them) a stateless interface was successfully turned into a stateful file browser requiring no client-side plugins!

Once you refine the directories and files you want to backup or recover you are ready to move on to the next step.

Creating and Scheduling Backup Jobs

After you have assigned your source and destination partitions and refined what you want to backup, the next step is to create a 'Backup Job'. You can create multiple backup jobs and each can use some or all of the source and destination partitions. Each job can be run manually or at a set time.

The scheduler is, at its most basic, a front end to the 'cron' scheduler.

Each scheduled job keeps a list of what source and destination partitions to use, how to treat newly assigned partitions found when a job runs and when the job is to be run. Once this information is set the program writes a text file for 'cron' which is loaded with 'crontab'.
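As a rough idea of what such a 'cron' entry could look like, here is a small Python sketch that formats one line in the standard five-field crontab layout. The exact file TLE-BU writes is not shown in this talk, so the function name 'crontab_line' and the choice of fields are assumptions. The command path is the 'backup.pl' script TLE-BU installs under '/usr/share/tle-bu/cgi-bin/', called with the job's ID.

```python
def crontab_line(job_id, minute, hour, days_of_week="*"):
    """Format one crontab entry that runs a backup job daily (or on
    the given days of the week) at hour:minute."""
    if not (0 <= minute <= 59 and 0 <= hour <= 23):
        raise ValueError("minute must be 0-59 and hour 0-23")
    command = f"/usr/share/tle-bu/cgi-bin/backup.pl --id={job_id}"
    # Fields: minute, hour, day of month, month, day of week, command.
    return f"{minute} {hour} * * {days_of_week} {command}"
```

For example, crontab_line(3, 30, 2) describes job 3 running every day at 02:30.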

The scheduler has four main sections; Adding, editing and deleting jobs, editing the job's name, selecting source and destination partitions to use in the job and the time the job will run.

fig. 4.0.a, The main Schedule menu:


The main schedule menu is where you can edit, run or delete existing backup jobs or add a new one.

When you run the backup job you will want to make sure that you can leave the client's browser open until the backup job is finished. For this reason it is recommended that you instead run the job manually from the command line on the TLE-BU server by running the command:

/usr/share/tle-bu/cgi-bin/backup.pl --id=#

(Replace '#' with the backup job number)

When you add or edit a backup job you will see exactly the same menus. The only difference is that when editing a job the existing settings will be selected in each window instead of the defaults. The first menu you will see lets you create or edit the backup job name. This name is strictly for your benefit so you can use more or less anything you want. By default the program suggests a name with the date and time it was created to keep it unique.

fig. 4.0.b, Edit the backup job's name:

The next two menus are essentially the same. They let you select which source and which destination partitions you want to use when this job is run. There are also two options: one to tell TLE-BU what to do with new source or destination partitions, and one to tell TLE-BU whether it should fail the backup job if any of the source or destination partitions selected for the job are not online when the job runs.

fig. 4.0.c, The source partition selection menu:

The destination menu is essentially the same so I will save space and skip a screen capture of it.

You can see here one of the reasons why commenting partitions in the 'Partitions' menu comes in handy. It lets you quickly identify which device is which without trying to remember cryptic things like block paths.

The 'Use Strict' check box is the one where you can tell TLE-BU to not run if any of the selected source or destination partitions are not online. This is useful if you want to make sure that the same destination partitions are available in a given job to make sure that data is assigned the same way each time the backup job is run.

The rest of the options are pretty self-explanatory. Partitions with their check box checked will be used and ones without will not be used.

fig. 4.0.d, Choosing when a job will run:

Anyone familiar with how 'cron' handles scheduling will already be familiar with the TLE-BU scheduler. Basically, use the leftmost panel to set the time of day that you want the backup job to run. The 'Days of the Week' panel lets you set the weekdays the job will run. If, for example, your office is closed on weekends you can uncheck 'Saturday' and 'Sunday'. The 'Days of the Month' panel lets you select certain numbered days of the month to run or not run the backup job on. Lastly, the 'Months of the Year' panel on the right lets you select which months the job will run. If, for example, you work at a school you might want to uncheck the summer months.
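To make the mapping from the four panels to a 'cron' entry concrete, here is a minimal Python sketch. It is only an illustration: the function names are made up, and writing '*' when every box in a panel is checked is an assumption about the output, not a description of what TLE-BU actually writes. The 'backup.pl' path and the '--id' switch are the ones shown earlier.

```python
def cron_field(selected, full_range):
    """Return '*' when every value is selected, else a comma-separated list."""
    selected = sorted(set(selected))
    if not selected:
        raise ValueError("at least one value must be selected")
    if selected == list(full_range):
        return "*"
    return ",".join(str(value) for value in selected)

def crontab_line(job_id, hour, minute, weekdays, month_days, months):
    """Build one crontab line: minute, hour, day of month, month, day of week."""
    fields = [
        str(minute),
        str(hour),
        cron_field(month_days, range(1, 32)),
        cron_field(months, range(1, 13)),
        cron_field(weekdays, range(0, 7)),  # cron counts Sunday as 0
    ]
    command = f"/usr/share/tle-bu/cgi-bin/backup.pl --id={job_id}"
    return " ".join(fields) + " " + command

# Job 3 at 23:30, Monday to Friday, every day of the month, every month.
line = crontab_line(3, hour=23, minute=30,
                    weekdays=[1, 2, 3, 4, 5],
                    month_days=range(1, 32),
                    months=range(1, 13))
```

One thing worth knowing when restricting both the days of the week and the days of the month: standard 'cron' runs the job when either field matches, not only when both do.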

How the Backup Engine Works

There are several main steps that the 'backup.pl' script goes through when it runs a backup job. Some of these steps may be interesting from a technical point of view so I would like to cover them in a little detail.

When the time comes to run a backup job the 'cron' program calls the 'backup.pl' script and passes the job ID it needs to run at that time. The 'backup.pl' script uses that ID to call the database and retrieve the details. With that info it checks to see if any source and destination partitions have been added or removed and updates the backup job with the new list of partitions.

With the up-to-date list of source and destination partitions it can use, the next step is to see which ones are online. It scans the system to get a current list of which partitions are online. It then checks to see which of the source and destination partitions selected for the current job are online. If any of the source or destination partitions are not online then the 'use strict' setting is checked. If it is set then the backup job will end there.
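The online check and the 'use strict' rule can be sketched in a few lines of Python. This is a simplified illustration only; the function name, the return shape and the device names are hypothetical.

```python
def check_partitions(selected_sources, selected_destinations, online, use_strict):
    """Return (can_run, sources, destinations, missing) for one backup job."""
    sources = [p for p in selected_sources if p in online]
    destinations = [p for p in selected_destinations if p in online]
    missing = [p for p in selected_sources + selected_destinations
               if p not in online]
    if missing and use_strict:
        # Strict mode: any offline partition ends the job here.
        return False, sources, destinations, missing
    # Otherwise at least one source and one destination must be online.
    can_run = bool(sources) and bool(destinations)
    return can_run, sources, destinations, missing

# Hypothetical device names, for illustration only.
online = {"/dev/sda1", "/dev/sdb1"}
strict_result = check_partitions(["/dev/sda1"], ["/dev/sdb1", "/dev/sdc1"],
                                 online, use_strict=True)
relaxed_result = check_partitions(["/dev/sda1"], ["/dev/sdb1", "/dev/sdc1"],
                                  online, use_strict=False)
```

With 'use strict' set, the missing '/dev/sdc1' stops the job; without it, the job carries on with the one destination that is online.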

If there is at least one source and one destination partition online and the 'strict' check passed, it moves on to building the list of directories and files to back up from source partitions to destination partitions. Before it does that, though, the program scans each source and destination partition it is going to use to make sure that it has up-to-date information on the files and directories on each partition.

The first step in the assigning is to build a list of directories on all the source partitions with the size of the files in each directory selected for backup. The size is based on the files in the specific directory and does not include subdirectories. Also, as the total of each directory is calculated, a check is made to see if that directory has been previously backed up on one of the available destination partitions. If it has, then the size of the files in the destination folder is compared to the size of the files in the source directory and a delta is calculated. If there is enough space on the destination then that source is immediately assigned to that destination and the size is recorded as just the delta.
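A minimal Python sketch of the delta idea follows. The function name and the file-name-to-size dictionaries are hypothetical, and clamping the result at zero (for the case where the destination copy is larger) is an assumption, since the text does not say how that case is handled.

```python
def directory_delta(source_files, backed_up_files):
    """
    Both arguments map file name -> size in bytes for ONE directory,
    not counting anything in its subdirectories.
    """
    source_total = sum(source_files.values())
    destination_total = sum(backed_up_files.values())
    # Assumption: a destination copy larger than the source counts as zero.
    return max(source_total - destination_total, 0)

delta = directory_delta(
    {"report.txt": 100, "notes.txt": 50},  # files now in the source directory
    {"report.txt": 100},                   # files already on the destination
)
```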

Once the sizes of the files in all the source directories are calculated, the directories are sorted from largest to smallest. Next, the program looks at the 'Backup Strategy' you selected in the settings menu. By default the backup strategy is set to optimize for backup speed. In this case the destination partitions are sorted by size, smallest to largest, and the program will try to spread the source data as evenly as it can over the destination devices.

It does this by dividing the amount of source data by the number of destination partitions online and checking to see if the resulting share will fit on the smallest destination partition. If it does, then the program sets a 'soft' limit on each destination partition equal to 1/Nth of the source data, where N is the number of destinations. If it doesn't fit, the program will check the free space, assign that much to the smallest destination partition, then take the remainder and divide it by the number of remaining destination partitions and continue checking in the same manner.
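The soft limit calculation is easier to follow with numbers. Here is a small Python sketch of the procedure as described; the function name is made up, and all sizes are in the same unit (say, MB).

```python
def soft_limits(total_source, free_space):
    """
    total_source: amount of source data to back up.
    free_space:   free space on each online destination.
    Returns the soft limit for each destination, smallest destination first.
    """
    ordered = sorted(free_space)
    remaining = total_source
    limits = []
    for index, free in enumerate(ordered):
        destinations_left = len(ordered) - index
        share = remaining / destinations_left
        if share <= free:
            # An even share fits here, so it also fits on every larger one.
            limits.extend([share] * destinations_left)
            break
        # Too small for an even share: use all of its free space and
        # spread what is left over the remaining destinations.
        limits.append(free)
        remaining -= free
    return limits

limits = soft_limits(900, [1000, 100, 500])
```

In this example an even share would be 300, which does not fit on the smallest destination (100 free). That destination gets a limit of 100 and the remaining 800 is split evenly over the other two.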

On the other hand, if you set the backup strategy to optimize for disk use efficiency, the program will sort the destination partitions from largest to smallest and try to fill each destination before moving on to the next one. If you only have one destination partition available then the backup strategy has no effect and is ignored.

With the source directories and destination partitions sorted, the program starts to assign source files by directory to the first destination. After each directory is assigned the remaining space on the destination is checked. If the assigned data goes beyond the soft limit, or beyond the actual free space less the 'declare full...' space, but is still within the actual free space, the destination is marked as 'full' and the next destination partition starts being used. This repeats until either all the source data is assigned or the free space on all the destination partitions has been accounted for.
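The assignment loop can be sketched in Python as follows. The names and data shapes are hypothetical. The text does not say what happens when a directory is larger than a destination's actual free space, so moving on to the next destination in that case is an assumption of this sketch.

```python
def assign(directories, destinations, reserve=0):
    """
    directories:  list of (path, size) tuples.
    destinations: list of dicts with 'name', 'free' and 'soft_limit',
                  already sorted according to the backup strategy.
    reserve:      the 'declare full...' space.
    Returns (plan, unassigned), where plan maps destination name -> paths.
    """
    plan = {dest["name"]: [] for dest in destinations}
    used = {dest["name"]: 0 for dest in destinations}
    unassigned = []
    current = 0
    for path, size in sorted(directories, key=lambda d: d[1], reverse=True):
        while current < len(destinations):
            dest = destinations[current]
            name = dest["name"]
            if used[name] + size > dest["free"]:
                current += 1  # assumption: does not fit at all, move on
                continue
            plan[name].append(path)
            used[name] += size
            threshold = min(dest["soft_limit"], dest["free"] - reserve)
            if used[name] > threshold:
                current += 1  # destination is now marked 'full'
            break
        else:
            unassigned.append(path)  # ran out of destinations
    return plan, unassigned

plan, unassigned = assign(
    [("docs", 300), ("video", 500), ("mail", 100)],
    [{"name": "d1", "free": 600, "soft_limit": 450},
     {"name": "d2", "free": 1000, "soft_limit": 450}],
    reserve=50,
)
```

Note that a destination is only marked 'full' after a directory pushes it past the threshold, so 'video' (500) lands on 'd1' even though the soft limit is 450.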

As the data is assigned, plain text files are created and a list of source files is written into each one. One file is created for each combination of source and destination partitions used. Once all the source data is assigned the backup engine starts calling 'rsync' for each text file. The 'rsync' program reads the files listed in each text file and handles the actual copying of files from the source to the destination. TLE-BU waits for each called 'rsync' job to exit before continuing. After 'rsync' finishes moving data to a destination partition that device is rescanned to update the database with the new contents on that device.
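The hand-off to 'rsync' might look something like the Python sketch below. The text does not say which 'rsync' switches TLE-BU uses, so '-a' and '--files-from' (a standard 'rsync' option that reads the list of files to copy from a text file) are assumptions, as are the mount points, the list file names and the function names.

```python
import os
import subprocess
import tempfile

def write_file_lists(assignments, list_dir):
    """
    assignments: {(source_mount, destination_mount): [paths relative to source]}
    Writes one text file per source/destination pair and returns a list of
    (list_path, rsync_command) tuples.
    """
    jobs = []
    for number, ((source, destination), paths) in enumerate(sorted(assignments.items())):
        list_path = os.path.join(list_dir, f"job_{number}.txt")
        with open(list_path, "w") as handle:
            handle.write("\n".join(paths) + "\n")
        command = ["rsync", "-a", f"--files-from={list_path}",
                   source + "/", destination + "/"]
        jobs.append((list_path, command))
    return jobs

def run_one_at_a_time(jobs):
    """Run each rsync in turn, waiting for it to exit before starting the next."""
    for _list_path, command in jobs:
        subprocess.run(command, check=True)

# Build the commands without running them (hypothetical mount points).
assignments = {
    ("/mnt/src_a", "/mnt/dest_1"): ["docs/report.txt", "docs/notes.txt"],
    ("/mnt/src_a", "/mnt/dest_2"): ["photos/cat.jpg"],
}
with tempfile.TemporaryDirectory() as list_dir:
    jobs = write_file_lists(assignments, list_dir)
    with open(jobs[0][0]) as handle:
        first_list = handle.read()
```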

The option 'Run multiple streams...' tells TLE-BU whether to run one 'rsync' stream at a time or to run all the backup streams at once. The default is to run one at a time, and this should be left this way unless you have a very fast machine running TLE-BU or the machine running TLE-BU is dedicated to the program. Running multiple streams at once can place a very heavy load on the machine.

If at the end of the backup run there is source data that wasn't assigned to destination partitions then the backup program will not end. Instead it will wait for a period of time defined by the 'Wait ## sec. between destination checks' setting. After waiting, the backup program will rescan the available destination partitions and, if one or more new ones have become available, the remaining source data will be assigned in a similar fashion to that described above. If no new destinations are online the backup engine will go into another wait period. Likewise, if there is still source data to assign after using the new destination partitions the backup will wait again and try yet again.

How many times the backup engine will wait and try again depends on the value assigned to 'Check for destination partitions ## times'. By default the wait period is ten minutes (600 seconds) and it will try 72 times for a total of roughly twelve hours. The amount of time that passes from the start of the job until it finally gives up is twelve hours plus the program's run time.
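The wait-and-retry behaviour can be sketched in Python like this. The function and parameter names are made up; 'rescan' and 'assign' stand in for the rescanning and assigning steps described above and are passed in so the sketch stays self-contained. The defaults of 600 seconds and 72 checks are the ones given in the text.

```python
import time

def wait_for_destinations(unassigned, rescan, assign,
                          wait_seconds=600, max_checks=72, sleep=time.sleep):
    """
    unassigned: source data that has not been given a destination yet.
    rescan():   returns the list of newly available destinations (may be empty).
    assign(unassigned, destinations): returns whatever is still unassigned.
    Returns (still_unassigned, number_of_checks_made).
    """
    checks = 0
    while unassigned and checks < max_checks:
        sleep(wait_seconds)
        checks += 1
        new_destinations = rescan()
        if new_destinations:
            unassigned = assign(unassigned, new_destinations)
    return unassigned, checks

# Demonstration with stand-ins: a destination shows up on the third check.
waits = []
answers = iter([[], [], ["new_destination"]])
left_over, checks_made = wait_for_destinations(
    ["docs", "mail"],
    rescan=lambda: next(answers),
    assign=lambda unassigned, destinations: [],
    sleep=waits.append,  # record the waits instead of really sleeping
)
```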

Searching for Files

A backup program is ultimately only as good as its ability to recover data. As mentioned, the actual recovery of files can be handled within TLE-BU or manually using other tools. That recovery, though, assumes that you know where to find the files you need to restore. When you want to restore only a certain file or subset of files it can be a little trickier to find specifically what you want to restore.

That is where the search engine comes in handy.

A good example of when the search engine would come in particularly handy is this common scenario. At some point or another most of us have modified a file like a text document, a few days go by, and then we realize that we need part of that document back. You can search for that file by its name and sort the results by the last modified date. Find the search result with a modified date earlier than when you made the change and see which device it is on by reading the comment.

Just like that you will know what destination drive to bring online even if you have dozens or hundreds of different destination drives!

The search engine in TLE-BU uses 'Search Profiles' to handle sorting results. By default the search engine will only look for files on destination partitions, sort the results by file name in ascending order and show 25 results per page.

You can edit the default search profile by clicking on the 'Set Preferences' button on the main search engine page. Under the 'Set Preferences' menu there is a 'Manage Profiles' button where you can create one or more additional search profiles that each perform differently. If you have two or more search profiles you can set which is going to be the default profile. You can select a non-default profile to use with a given search by simply choosing it from the 'switch profile' select list before actually running the search.

When the results are returned each result is numbered and the information is displayed in a block.

fig. 5.0.a, A sample search result block:

The first line starts with a key indicating if the result is a file ('File'), a symlink ('Sym.') or a directory ('Dir.'), followed by the relative path to the file and the file name. The tilde '~' at the start of the directory is a reminder that the directory is relative to the mount point of the partition containing the file. The second and third lines show information about the file. The last two lines show information about the partition that the file is on.

Pretty straightforward, eh?

The actual search itself is nothing fancy from a technical point of view. In fact, it is something I am planning to put some time into improving in the upcoming version of TLE-BU. For now though it works well enough.

Restoring Data Within TLE-BU

TLE-BU was built on a 'never delete' principle. This has arguably been taken to an extreme in that it will not even overwrite previously recovered data. This means that how TLE-BU handles data recovery is a little different from most other backup programs.

There is a setting called the 'Restore Directory' that defines where restored data will be placed. The default is '/usr/share/tle-bu/restore' but can easily be changed to any other directory. Ideally this would be a dedicated partition but that is up to the user to do.

A user can use the file browser to select a subset of directories and files to recover the same way that he or she would select which files to back up. The difference is that the restoration of the data must be run manually. Once you or the user has finished selecting what data to restore from a destination media, click on the 'Run Now' button.

fig. 6.0.a, The top menu in the file browser when viewing a destination partition:

This will start the restore module which will build a list of files to restore and the amount of space that data will need. First the program will show where the data will be restored to on the server and ask the user to confirm before the restore starts.

fig. 6.0.b, The restore job confirmation menu:

If there is not enough space left on the partition with the restore directory the recovery will not be allowed to proceed. There is a setting called 'Declare the restore partition full within ## MB' that is set to '1,024' by default. This space is subtracted from the actual free space when the check is made to see if there is enough space on the partition with the restore directory.
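As a quick illustration of that check, here is a minimal Python sketch. The function names are hypothetical; the 1,024 MB default is the one given above, and 'shutil.disk_usage' is simply one way to read the free space on the partition holding a directory.

```python
import shutil

MB = 1024 * 1024

def enough_space(needed_mb, free_mb, reserve_mb=1024):
    """True if the restore fits once the reserved space is subtracted."""
    return needed_mb <= free_mb - reserve_mb

def restore_allowed(needed_mb, restore_dir, reserve_mb=1024):
    """Check the partition that holds the restore directory."""
    free_mb = shutil.disk_usage(restore_dir).free // MB
    return enough_space(needed_mb, free_mb, reserve_mb)

# With 2,000 MB actually free, only 976 MB is usable for a restore.
fits = enough_space(900, 2000)
too_big = enough_space(1000, 2000)
```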

Once the restore job is finished the user will see this screen indicating that they can now access their recovered data:

fig. 6.0.c, The post-restore screen:

If you, as the administrator, want to let your users immediately access their recovered data you will also need some mechanism for giving them access to the data under the restore directory (e.g. an 'NFS' or 'SMB' share), or else you will need to manually move the data into a location they can access (which is potentially safer anyway).

The restored data is placed under a directory using the format '/restore_dir/user_name/date_num'. This is done so that if two different users recover data or even if the same user recovers data twice in the same day they will not overwrite previously restored data. The trade off to this 'never delete' policy is that it is relatively easy to use up a lot of space fairly quickly. As the administrator you will want to check the restore directory to make sure a user isn't carelessly restoring data they don't need and leaving it behind. You can decide who can restore data in the first place in the user permissions menu. You may wish to hand out that privilege carefully.

About "The Linux Experience"

Ed.: TLE is no longer in business.

TLE was a migration, support and open-source software development company in the greater Toronto area. They specialized in helping clients migrate away from proprietary solutions in favor of Open Source Software. TLE contributed to the collection of Open Source Software through the development of its "TLE-BU" backup software. TLE offered end-to-end support, training and services during all stages of migration including extended support post-migration. TLE served as a single point of contact for all of their clients' technology needs.

Helpful Links

  • Note: TLE-BU is no longer under active development and it is not recommended that you use it.

Madison Kelly,
Lead Technician

-= The Linux Experience =-
http://thelinuxexperience.com

 
