manuals:vps:datasets

Differences

This shows you the differences between two versions of the page.

manuals:vps:datasets [2016/10/30 14:47] tomsmanuals:vps:datasets [2023/08/02 18:22] (current) – [Mounts] do not mention vpsadminos Aither
Line 1: Line 1:
 +<page>manuals:vps:datasets</page>
 ====== Datasets ====== ====== Datasets ======
 +
 +Dataset is a term from the ZFS filesystem that we're using everywhere. You can imagine it as
 +a formatted partition on disk containing directories and files. For example, btrfs has
 +a similar concept called //subvolumes//.
  
 The dataset in vpsAdmin directly represents the ZFS dataset on the hard drive. Datasets The dataset in vpsAdmin directly represents the ZFS dataset on the hard drive. Datasets
-are used for VPS and NAS data. The concept of a dataset replaces NAS +are used for VPS (each VPS has its own dataset) and NAS data. A VPS dataset can be used 
-exports. A VPS dataset can be used the same way as an NAS. +the same way as an NAS dataset, but the two are found in different places (VPS details and the NAS menu). 
 +The operations you can carry out with them are the same, such as creating snapshots, restoring 
 +to snapshots or mounting datasets to VPS.
  
 {{:navody:vps:dataset_vps.png?300|}} {{:navody:vps:dataset_vps.png?300|}}
Line 10: Line 17:
 quotas and ZFS properties for various data/apps. quotas and ZFS properties for various data/apps.
  
-VPS datasets are located in the VPS details and NAS datasets are in the NAS menu. +vpsAdmin allows users to create subdatasets and configure ZFS properties.
-The operations you can carry out with them are the same. vpsAdmin enables +
-creating subdatasets and configuring ZFS properties.+
  
 {{:navody:vps:dataset.png?300|}} {{:navody:vps:dataset.png?300|}}
Line 51: Line 56:
 is applied and the subdataset quotas can have any settings. is applied and the subdataset quotas can have any settings.
  
-===== Snapshots ===== +===== Mount NAS Dataset to a VPS ===== 
- +For VPS using [[manuals:vps:vpsadminos|vpsAdminOS]], create an [[manuals:vps:exports|export]] 
-Backups are made using ZFS snapshots, which can be seen in the Backups +and mount it inside the VPS using NFS.
-menu. They can be created in the same menu. The created VPS snapshots cannot be +
-deleted; you have to wait until they are automatically overwritten by further daily backups.  +
- +
-{{:navody:vps:backups.png?300|}} +
- +
-VPS backups are made every day at 1:00 AM, when one node +
-creates a snapshot of all the datasets at once. Then the snapshots are moved to +
-backuper.prg. +
- +
-Attention! NAS **is not backed up** to backuper.prg. Snapshots are +
-local only and their only purpose is protection against the damage or unwanted deletion of data.+
  
 ===== Mounts ===== ===== Mounts =====
 +vpsAdmin mounts are used only to mount VPS subdatasets.
 +NAS datasets and snapshots are mounted using [[manuals:vps:exports|exports]].
  
-Mounts can be seen in the VPS details. Both datasets +Mounts can be seen in the VPS details:
-and snapshots can be mounted. Any dataset or snapshot can be mounted to any +
-VPS. Mounts of individual snapshots replace a permanent backup mount to +
-/vpsadmin_backuper.+
  
 {{:navody:vps:mounts.png?300|}} {{:navody:vps:mounts.png?300|}}
  
-Each snapshot can only have one mount at any given moment, datasets +We do not recommend nesting mount points in the incorrect order. The situation when 
-have no such limitation. +''one/two'' dataset is mounted above the ''one'' dataset has not been solved.
- +
-We do not recommend nesting mount points in the incorrect order. The situation when +
-“one/two” dataset is mounted above the “one” dataset has not been solved.
  
 {{:navody:vps:mounts_detail.png?300|}} {{:navody:vps:mounts_detail.png?300|}}
Line 86: Line 76:
 persistent between VPS restarts. persistent between VPS restarts.
  
-===== Restoring Backups ===== +===== More Information ===== 
- +  * [[manuals:vps:backups|Backups]] 
-Restoring a VPS from a backup (snapshot) works the same way as it has until now. Restoring +  * [[manuals:vps:exports|Exports of datasets and snapshots]]
-always works on the dataset level. If a VPS has subdatasets, only rootfs is restored +
-from the backup; subdatasets are not restored. In other words, it is possible to restore +
-any dataset without affecting the other datasets. During the restore process, +
-all snapshots are preserved because the backups on the backuper are branched. +
- +
-You can only make snapshots of an NAS **manually**. Since it is not backed up to the +
-backuper, the restore process behaves the same way as ''zfs rollback -r'', i.e. restoring to +
-an older snapshot **deletes** all newer snapshots. It is an **irreversible** operation. +
- +
-In order to restore data from a backup on an NAS without deleting snapshots, mount the selected  +
-snapshot to a VPS and copy the data. +
- +
-===== Downloading Backups ===== +
- +
-Backups can be either downloaded through an online interface or [[navody:vps:api#cli|a CLI]]. +
-The CLI has the advantage of not having to wait for an e-mail with a link to the backup download location – +
-we can start the download immediately or automate the whole process. We can either download +
-a tar.gz archive or the ZFS data stream directly (even incrementally). +
- +
-==== Incremental Backups ==== +
- +
-An incremental backup only contains the data that has changed since the previous snapshot. +
-In order to help the client identify which snapshots can be downloaded incrementally, +
-each snapshot carries a //history identifier// (a number). Snapshots with the same +
-identifier can be transferred incrementally. The history is interrupted by a VPS reinstallation +
-or by restoring from a backup. Afterwards, the history identifier is increased by 1 and  +
-the full backup needs to be downloaded again. +
- +
-The history identifier is shown in the table with a list of snapshots in the Backups menu. +
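The rule above can be sketched in a few lines of Python. This is only an illustration, not the client's actual code: the ''Snapshot'' class and the ''can_send_incrementally'' function are hypothetical names, and the snapshot names and history identifiers are example values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    name: str        # ISO-like timestamp, e.g. "2016-03-07T18:12:58"
    history_id: int  # history identifier as shown in the Backups menu


def can_send_incrementally(base: Snapshot, target: Snapshot) -> bool:
    """Hypothetical check: both snapshots must share one history
    and the base must be older than the target."""
    # Timestamps in this format sort chronologically as plain strings.
    return base.history_id == target.history_id and base.name < target.name


older = Snapshot("2016-03-07T18:12:58", history_id=1)
newer = Snapshot("2016-03-10T14:33:12", history_id=1)
after_reinstall = Snapshot("2016-03-11T01:00:00", history_id=2)

print(can_send_incrementally(older, newer))            # True
print(can_send_incrementally(newer, after_reinstall))  # False
```

The second call fails because a reinstallation or restore starts a new history, so a full backup is needed again.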
- +
-==== Downloading the Backup as a File ==== +
- +
-<code> +
-$ vpsfreectl snapshot download --help +
-snapshot download [SNAPSHOT_ID]    Download a snapshot as an archive or a stream +
- +
-Command options: +
-    -f, --format FORMAT              archive, stream or incremental_stream +
-    -I, --from-snapshot SNAPSHOT_ID  Download snapshot incrementally from SNAPSHOT_ID +
-    -d, --[no-]delete-after          Delete the file from the server after successful download +
-    -F, --force                      Overwrite existing files if necessary +
-    -o, --output FILE                Save the download to FILE +
-    -q, --quiet                      Print only errors +
-    -r, --resume                     Resume cancelled download +
-    -s, --[no-]send-mail             Send mail after the file for download is completed +
-    -x, --max-rate N                 Maximum download speed in kB/s +
-        --[no-]checksum              Verify checksum of the downloaded data (enabled) +
-</code> +
- +
-If the snapshot ID isn’t passed on to the program as an argument, it displays an interactive +
-prompt: +
- +
-<code> +
-$ vpsfreectl snapshot download +
-Dataset 14 +
-  (1) @2015-12-04T00:00:02Z +
-VPS #198 +
-  (2) @2015-12-01T09:08:28Z +
-  (3) @2015-12-01T09:10:10Z +
-  (4) @2015-12-01T11:25:55Z +
-  (5) @2015-12-01T11:36:03Z +
-  (6) @2015-12-01T11:54:51Z +
-  (7) @2015-12-01T11:55:19Z +
-  (8) @2015-12-01T12:02:27Z +
-  (9) @2015-12-01T12:27:50Z +
-  (10) @2015-12-01T12:37:50Z +
-  (11) @2015-12-01T12:55:46Z +
-  (12) @2016-02-29T09:56:03Z +
-  (13) @2016-02-29T10:08:31Z +
-  (14) @2016-02-29T10:08:35Z +
-Pick a snapshot for download: +
- +
-</code> +
- +
-We will be downloading the 4th snapshot (''@2015-12-01T11:25:55Z''): +
- +
-<code> +
-Pick a snapshot for download: 4 +
-The download is being prepared... +
-Downloading to 198__2015-12-01T12-25-56.tar.gz +
-Time: 00:01:37 Downloading 0.3 GB: [=====================================================================================] 100% 992 kB/s +
-</code> +
- +
-Using the ''%%--format%%'' option we choose whether we want to download a tar.gz archive, a data +
-stream or an incremental data stream. Under default settings, the tar.gz archive +
-is downloaded. +
- +
-We can either let vpsAdmin name the file +
-([[https://api.vpsfree.cz/v2.0/#root-snapshot_download-show|SnapshotDownload#Show.file_name]]), +
-or choose our own location using the ''%%--output%%'' option. If ''%%--output=-%%'' is used, +
-the output is redirected to stdout. +
- +
-The program enables pausing the download (you need to use ''CTRL+C'') and then resuming it +
-again. If the ''%%--resume%%'' or ''%%--force%%'' options are not used, +
-the program asks the user whether it should resume the download or start +
-over. +
- +
-<note> +
-The download can only start if the prepared file has not been deleted from the server +
-in the meantime (see the ''%%--[no-]delete-after%%'' option). The file is kept for up to a week after the first  +
-download attempt. +
-</note> +
- +
-==== ZFS Data Stream ==== +
- +
-<code> +
-$ vpsfreectl snapshot send --help +
-snapshot send SNAPSHOT_ID            Download a snapshot stream and write it on stdout +
- +
-Command options: +
-    -I, --from-snapshot SNAPSHOT_ID  Download snapshot incrementally from SNAPSHOT_ID +
-    -d, --[no-]delete-after          Delete the file from the server after successful download +
-    -q, --quiet                      Print only errors +
-    -s, --[no-]send-mail             Send mail after the file for download is completed +
-    -x, --max-rate N                 Maximum download speed in kB/s +
-        --[no-]checksum              Verify checksum of the downloaded data (enabled) +
-</code> +
- +
-The difference from ''snapshot download'' is that a data stream is written directly to stdout +
-in an uncompressed form so that we can pipe it directly into +
-''zfs recv'': +
- +
-<code> +
-$ vpsfreectl snapshot send <id> | zfs recv -F <dataset> +
-</code> +
- +
-An incremental data stream can be requested using the ''%%-I, --from-snapshot%%'' option: +
- +
-<code> +
-$ vpsfreectl snapshot send <id2> -- --from-snapshot <id1> | zfs recv -F <dataset> +
-</code> +
- +
-==== Automated Backup Downloads ==== +
- +
-Automated backup downloads are performed using the ''backup vps'' and +
-''backup dataset'' commands. They are used the same way, the only difference being that the former uses the VPS ID as its argument +
-while the latter uses the dataset ID. +
- +
-These commands require ZFS to be installed, zpool to be created and root permissions. +
-The program can be run directly under root, otherwise it will use sudo when running. +
- +
-Upon startup, snapshots with the current history identifier are downloaded, as long as they +
-do not exist locally yet. If possible, they are downloaded incrementally. In order for +
-incremental transfer to work, the program must find the snapshot which is present locally and +
-on the server at the same time. This means that backups have to be downloaded at least +
-once every 14 days since the newest local snapshot gets deleted from the server after that time period +
-and the program won’t be able to resume downloading backups – there won’t be +
-any common snapshot. +
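To illustrate why a common snapshot is needed, here is a minimal Python sketch. It is an assumption about the general idea, not the client's real implementation; ''newest_common_snapshot'' is a hypothetical helper and the snapshot names are made up.

```python
def newest_common_snapshot(local, remote):
    """Return the newest snapshot name present both locally and on the
    server, or None when an incremental transfer is impossible."""
    common = set(local) & set(remote)
    # Names are ISO-like timestamps, so max() picks the most recent one.
    return max(common) if common else None


local = ["2016-03-07T18:12:58", "2016-03-10T14:33:12"]

# The server still holds our newest local snapshot: incremental works.
remote_recent = ["2016-03-10T14:33:12", "2016-03-11T01:00:00"]
print(newest_common_snapshot(local, remote_recent))

# The server has already rotated it away: a full download is required.
remote_rotated = ["2016-03-25T01:00:00", "2016-03-26T01:00:00"]
print(newest_common_snapshot(local, remote_rotated))
```

When the function returns ''None'', there is nothing to base an incremental stream on.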
- +
-<code> +
-$ vpsfreectl backup vps --help +
-backup vps [VPS_ID] FILESYSTEM       Backup VPS locally +
- +
-Command options: +
-    -p, --pretend                    Print what would the program do +
-    -r, --[no-]rotate                Delete old snapshots (enabled) +
-    -m, --min-snapshots N            Keep at least N snapshots (30) +
-    -M, --max-snapshots N            Keep at most N snapshots (45) +
-    -a, --max-age N                  Delete snapshots older then N days (30) +
-    -x, --max-rate N                 Maximum download speed in kB/s +
-    -q, --quiet                      Print only errors +
-    -s, --safe-download              Download to a temp file (needs 2x disk space) +
-        --retry-attemps N            Retry N times to recover from download error (10) +
-    -i, --init-snapshots N           Download max N snapshots initially +
-        --[no-]checksum              Verify checksum of the downloaded data (enabled) +
-    -d, --[no-]delete-after          Delete the file from the server after successful download (enabled) +
-        --no-snapshots-as-error      Consider no snapshots to download as an error +
-        --[no-]sudo                  Use sudo to run zfs if not run as root (enabled) +
-</code> +
- +
-If the program does not receive the VPS/Dataset ID as an argument, it either asks the user for it +
-or it tries to identify the ID itself. The ''FILESYSTEM'' argument always needs to be +
-provided. It should contain the name of the dataset where we want to store the backups. +
- +
-<note warning> +
-The first argument of the program is the **VPS/dataset ID**, which can be confusing if a dataset +
-is used, since the ID is not identical to the dataset name, but both are usually +
-numbers. +
-</note> +
- +
-Before we actually run the program, the ''%%--pretend%%'' option might come in handy – it  +
-shows us what the program would do, i.e. which snapshots it would download and potentially +
-delete. +
- +
-The ''%%--[no-]rotate%%'' option can be used to (de)activate the deletion of older snapshots in order to +
-make room for the new ones. Unless we change other settings, we will have at least +
-30 snapshots (which currently means 30 daily histories) and a maximum of 45 snapshots +
-(if we create some ourselves) and snapshots older than 30 days will be +
-deleted. +
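One way to read the interaction of the three limits is sketched below in Python. This is an assumption made for illustration: the real client may combine ''%%--min-snapshots%%'', ''%%--max-snapshots%%'' and ''%%--max-age%%'' differently, and ''snapshots_to_delete'' is a hypothetical function.

```python
from datetime import datetime, timedelta


def snapshots_to_delete(snapshots, now, min_snapshots=30,
                        max_snapshots=45, max_age_days=30):
    """Assumed policy: walk from the oldest snapshot and delete it while
    there are too many snapshots or it is too old, but never drop
    below min_snapshots."""
    cutoff = now - timedelta(days=max_age_days)
    remaining = len(snapshots)
    doomed = []
    for taken_at in sorted(snapshots):  # oldest first
        if remaining <= min_snapshots:
            break
        if remaining > max_snapshots or taken_at < cutoff:
            doomed.append(taken_at)
            remaining -= 1
        else:
            break  # every later snapshot is newer, nothing more to delete
    return doomed


now = datetime(2016, 3, 10, 7, 0)
daily = [now - timedelta(days=i) for i in range(50)]  # 50 daily snapshots
print(len(snapshots_to_delete(daily, now)))
```

Under this reading, fewer than 30 local snapshots are never rotated, regardless of their age.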
- +
-The content of the ''FILESYSTEM'' dataset is managed by the program itself and the user +
-should not create more subdatasets/snapshots in it. The downloaded snapshots are placed in +
-subdatasets, which are named according to the history identifier. +
- +
-=== Usage Example === +
- +
-<code> +
-$ vpsfreectl backup vps storage/backup/199 +
-(1) VPS #198 +
-(2) VPS #199 +
-(3) VPS #202 +
-Pick a dataset to backup: 2 +
-Will download 8 snapshots: +
-  @2016-03-07T18:12:58 +
-  @2016-03-07T18:13:21 +
-  @2016-03-07T18:18:35 +
-  @2016-03-10T10:18:03 +
-  @2016-03-10T10:18:30 +
-  @2016-03-10T11:49:00 +
-  @2016-03-10T14:28:00 +
-  @2016-03-10T14:33:12 +
- +
-Performing a full receive of @2016-03-07T18:12:58 to storage/backup/199/+
-The download is being prepared... +
-Time: 00:00:56 Downloading 0.3 GB: [====================================================================================] 100% 1755 kB/s +
-Performing an incremental receive of @2016-03-07T18:12:58 - @2016-03-10T14:33:12 to storage/backup/199/+
-The download is being prepared... +
-Time: 00:00:00 Downloading 0.0 GB: [=======================================================================================] 100% 0 kB/s +
-</code> +
- +
-We can notice that the program downloads the first snapshot in full size and +
-all the following ones incrementally. +
- +
-A list of snapshots can be displayed using ''zfs list'': +
-<code> +
-$ sudo zfs list -r -t snapshot storage/backup/199 +
-NAME                                       USED  AVAIL  REFER  MOUNTPOINT +
-storage/backup/199/1@2016-03-07T18:12:58     8K      -   284M +
-storage/backup/199/1@2016-03-07T18:13:21     8K      -   284M +
-storage/backup/199/1@2016-03-07T18:18:35     8K      -   284M +
-storage/backup/199/1@2016-03-10T10:18:03     8K      -   285M +
-storage/backup/199/1@2016-03-10T10:18:30     8K      -   285M +
-storage/backup/199/1@2016-03-10T11:49:00   160K      -   285M +
-storage/backup/199/1@2016-03-10T14:28:00   160K      -   285M +
-storage/backup/199/1@2016-03-10T14:33:12      0      -   285M +
-</code> +
- +
-We can access our own data using the special ''.zfs'' folder: +
-<code> +
-$ ls -1 /storage/backup/199/1/.zfs/snapshot +
-2016-03-07T18:12:58 +
-2016-03-07T18:13:21 +
-2016-03-07T18:18:35 +
-2016-03-10T10:18:03 +
-2016-03-10T10:18:30 +
-2016-03-10T11:49:00 +
-2016-03-10T14:28:00 +
-2016-03-10T14:33:12 +
-</code> +
- +
-Cron can be used to download backups regularly. The crontab record can look like +
-this: +
- +
-<code> +
-MAILTO=your@email +
- +
-# Example of job definition: +
-# .---------------- minute (0 - 59) +
-# |  .------------- hour (0 - 23) +
-# |  |  .---------- day of month (1 - 31) +
-# |  |  |  .------- month (1 - 12) OR jan,feb,mar,apr ... +
-# |  |  |  |  .---- day of week (0 - 6) (Sunday=0 or 7) OR sun,mon,tue,wed,thu,fri,sat +
-# |  |  |  |  | +
-# *  *  *  *  * user-name command to be executed +
-  0  7  *  *  * root      vpsfreectl backup vps 199 storage/backup/199 -- --max-rate 1000 +
-</code> +
- +
-This means that the program runs every day at 7:00 AM (at this point, the backups +
-in vpsFree should already have been moved to backuper.prg) and the backups are downloaded at a maximum speed of +
-1 MB/s. Cron will send us the output of the command to the email address set in the +
-''MAILTO'' variable. However, once we know that it works, there is no need for an email to be sent every day. +
-This is why the program has the ''%%--quiet%%'' option, which ensures that only potential +
-errors are printed. +
- +
-<note> +
-If we download the backups using the root user, but ''vpsfreectl'' was installed +
-by a standard user, the program has to be +
-[[navody:vps:api#cli|installed]] again with all dependencies. +
-</note> +
- +
-=== Downloading a Full Backup Using a Slow/Unreliable Connection === +
- +
-With default settings, the ''backup'' command does not enable us to pause and resume the download +
-since it doesn’t download the data to a file, but it directly sends it to ZFS. If we only have +
-a slow and unreliable connection at our disposal, it can happen that the download +
-fails and it is necessary to start over. However, we can use the +
-''%%--safe-download%%'' option to help ourselves. The option first downloads the data as a file and only then +
-does it send it to ZFS. Because of this, the download can be paused and later +
-resumed at any point. The disadvantage of this procedure is that it requires twice as much space on the hard drive since the data +
-is simultaneously stored in a temporary file and the ZFS dataset. The temporary file is created +
-in the folder from which the program is running. +
- +
-Another problem can be encountered after a long period of downloading. This is because when the program is first started, +
-it downloads all snapshots from the oldest to the newest one. If, however, +
-downloading the oldest snapshot takes too long, it can be deleted from the +
-server, which causes us to be unable to use it for incremental downloads later on and +
-we have to download the full backup again. To solve cases like this one, there is the +
-''%%--init-snapshots N%%'' option, which tells the program that we only want to download ''N'' +
-most recent snapshots. The safest method is using ''%%--init-snapshots 1%%'', then +
-we have 14 days to finish the download (the last pause can occur after 7 +
-days). However, this is no cure-all since if the program is closed and run again on a different day, +
-the last snapshot will be different and the download process will start over unless +
-''%%--init-snapshots%%'' has the proper value. +
- +
-=== Detecting Missing Backups === +
- +
-Sometimes it can happen that the daily backup doesn’t occur and so the program doesn’t have +
-anything to download. This situation typically isn’t considered an error – all the +
-snapshots have simply been downloaded and the program doesn’t have anything to do. However, if +
-backups are downloaded automatically using Cron, we have no way of finding out that +
-no backups are being downloaded. This is why the program has the +
-''%%--no-snapshots-as-error%%'' option, which ensures that if the program doesn’t have anything to  +
-download, it returns an error. Errors are not hidden by the ''%%--quiet%%'' option, +
-so Cron will send it to us via email and we will find out about the outage. +
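If the backup is started from a script instead of directly from the crontab, the command line can be assembled as in the following Python sketch. The helper names ''backup_command'' and ''run_backup'' are hypothetical; only the ''vpsfreectl'' options already mentioned in this manual are used, and treating a non-zero exit status as the error signal is an assumption.

```python
import subprocess


def backup_command(filesystem, vps_id=None, quiet=True, fail_when_empty=True):
    """Build the argument list for `vpsfreectl backup vps`."""
    argv = ["vpsfreectl", "backup", "vps"]
    if vps_id is not None:
        argv.append(str(vps_id))
    argv.append(filesystem)
    options = []
    if quiet:
        options.append("--quiet")
    if fail_when_empty:
        options.append("--no-snapshots-as-error")
    if options:
        # Command options follow a "--" separator, as in the crontab example.
        argv += ["--"] + options
    return argv


def run_backup(argv):
    """Run the client and return its exit status (assumed non-zero on error)."""
    return subprocess.run(argv).returncode


print(" ".join(backup_command("storage/backup/199", vps_id=199)))
```

The script only builds and prints the command here; calling ''run_backup'' requires ''vpsfreectl'' to be installed.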
- +
-=== Downloading backups with a standard user account using sudo === +
- +
-If we don’t want to install or run ''vpsfree-client'' using the root user, the program can +
-run under an unprivileged user as well. In this case, sudo is used in order to work with ZFS. +
- +
-In the following example, we will install and use the program under the user  +
-''vpsfree''. First we create the user and install ''vpsfree-client'': +
- +
-<code> +
-# useradd -m -d /home/vpsfree -s /bin/bash vpsfree +
-# su vpsfree +
-$ gem install --user-install vpsfree-client +
-</code> +
- +
-Add the following lines to ''/etc/sudoers.d/zfs'': +
- +
-<code> +
-Defaults:vpsfree !requiretty +
-vpsfree ALL=(root) NOPASSWD: /sbin/zfs +
-</code> +
- +
-The user ''vpsfree'' will be able to run ''zfs'' as a root user even without the password, +
-which is necessary if we want to run it using Cron. +
- +
-Now we’ll try to run the program manually and then move it to the crontab. Let’s try +
-requesting and saving an authentication token: +
- +
-<code> +
-# su vpsfree +
-$ vpsfreectl --auth token --new-token --token-lifetime permanent --save user current +
-</code> +
- +
-If you get an error stating that the program doesn’t exist, you will need to specify the whole +
-path or add the correct directory to ''$PATH''. Gems are installed to +
-''~/.gem/ruby/<Ruby version>/'', on my system the path to the executables is specifically +
-''/home/vpsfree/.gem/ruby/2.0.0/bin''. +
- +
-When we have a working client, we can download the first backup to the dataset that we +
-have created. In this example, VPS #123 will be backed up to the ''storage/backup/vps/123'' +
-dataset. +
- +
-<code> +
-# su vpsfree +
-$ sudo zfs create -p storage/backup/vps/123 +
-$ vpsfreectl backup vps 123 storage/backup/vps/123 +
-</code> +
- +
-We will use Cron for regular downloads of further backups. +
-Open the ''/etc/cron.d/vpsfree'' file and add: +
- +
-<code> +
-PATH=/bin:/usr/bin:/home/vpsfree/.gem/ruby/2.0.0/bin +
-MAILTO=your@email +
-HOME=/home/vpsfree +
- +
-0 7 * * * vpsfree vpsfreectl backup vps storage/backup/vps/123 -- --quiet +
-</code> +
- +
-''PATH'' states the directory containing ''vpsfreectl''. Note that +
-we no longer need to provide the VPS ID for the program – the program stores it when it runs the first time. +
- +
-=== Downloading Backups Under a Standard User by Delegating Permissions === +
- +
-Solaris/OpenIndiana and FreeBSD enable delegating the permissions to control datasets +
-to various users. In this case, the program does not need root permissions at all, and neither does it need +
-sudo. +
- +
-We will assign the required permissions to the user ''vpsfree'': +
- +
-<code> +
-# zfs create storage/backup/123 +
-# zfs allow vpsfree create,mount,destroy,receive storage/backup/123 +
-</code> +
- +
-In order for the user to be able to create subdatasets and mount them, the user also needs permissions on +
-the directory and file level: +
- +
-<code> +
-# chown vpsfree:vpsfree /storage/backup/123 +
-</code> +
- +
-<note> +
-It is necessary to modify the kernel settings in FreeBSD so that it allows users to mount: +
- +
-<code> +
-# sysctl vfs.usermount=1 +
-</code> +
-</note> +
- +
-Now we can start downloading the backups. We use the ''%%--no-sudo%%'' option to ensure that the +
-program doesn’t try to use sudo. +
- +
-<code> +
-# su vpsfree +
-$ vpsfreectl backup vps 123 storage/backup/123 -- --no-sudo +
-</code> +
- +
- +
-==== General Options ==== +
-  * ''%%--[no-]delete-after%%'' decides whether the downloaded file should be deleted from the server after a successful download attempt +
-  * ''%%--[no-]send-mail%%'' indicates whether we want to receive emails informing us that the backup on the server is ready for download +
-  * ''%%--max-rate N%%'' sets the maximum download speed in kB/s +
-  * ''%%--quiet%%'' disables all outputs, only errors are displayed +
-  * ''%%--no-checksum%%'' turns off checksum computation and verification (sha256, which can cause delays) +
- +
-===== Restoring Downloaded Backups ===== +
- +
-Restoring a VPS from a downloaded backup is not automated in any way so far. One +
-option is to mount the dataset of the VPS that we want to restore to a different VPS +
-(Playground) and copy the data. This method is described in the +
-[[navody:vps:oprava#pripojeni_rootfs|manual for repairing VPS]].+
  
manuals/vps/datasets.1477838841.txt.gz · Last modified: 2016/10/30 14:47 by toms