A way to exclude public_html from web domains, but not everything else?

Ok, so this may sound a bit weird! Currently I have my backups set with the following exclusions:

This works fine to the extent that it excludes the web/* domains. The problem is that I still want the domains re-created with their SSL certificates, configs, etc. when doing a restore. Is there a way to do this? I tried a rule such as:

*/public_html/*

or

public_html/*

But they just come up with an error about not having enough disk space.

Is there an alternative method that would back up the user + configs without all the actual files in public_html (100 GB+ worth, which we already back up via another method)?

Thanks!

Andy

*:public_html

Does that work?

Thanks for the quick reply. Unfortunately not:

v-backup-user chambres
Error: not enough diskspace available to perform the backup.

Current space on the drive:

df -h ./
Filesystem Size Used Avail Use% Mounted on
/dev/sda 315G 138G 161G 47% /

The actual "chambres" user account is:

/home/chambres/web# du -sh ./
109G    ./

The DBs are quite large too (maybe a couple of GB)

Cheers

Andy


Looking at the code, I'm not sure there is a way to do it currently?

        # Backup files
        if [ "$BACKUP_MODE" = 'zstd' ]; then
            tar "${fargs[@]}" -cpf- * | pzstd -"$BACKUP_GZIP" - > $tmpdir/web/$domain/domain_data.tar.zst
        else
            tar "${fargs[@]}"  -cpf- * | gzip -"$BACKUP_GZIP" - > $tmpdir/web/$domain/domain_data.tar.gz
        fi

I hacked mine by adding an exclude:

        # Backup files
        if [ "$BACKUP_MODE" = 'zstd' ]; then
            tar "${fargs[@]}" --exclude=public_html  -cpf- * | pzstd -"$BACKUP_GZIP" - > $tmpdir/web/$domain/domain_data.tar.zst
        else
            tar "${fargs[@]}" --exclude=public_html -cpf- * | gzip -"$BACKUP_GZIP" - > $tmpdir/web/$domain/domain_data.tar.gz
        fi

That seems to do the trick, but it will get lost on any update. I'm just trying to decide if there is a better way to do this that is future-proof.
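For anyone wanting to verify what the flag does, here's a quick throwaway test (the demo paths are just illustrative):

mkdir -p demo/public_html demo/conf
touch demo/public_html/index.html demo/conf/web.conf

# public_html (and everything under it) drops out of the archive listing:
tar --exclude=public_html -C demo -cpf - . | tar -tf -
# ./
# ./conf/
# ./conf/web.conf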

UPDATE: Oh actually, you do have fargs… I wonder why that's not working.

Cheers

Andy

It is working with:
domain.com:public_html

I do the same for my own server…

Ok, so I'm doing some more testing. If I exclude:

domain.org:public_html

That works fine. Printing out fargs I get:

--exclude=./logs/* --exclude=public_html/*

So the issue seems to be around the * syntax (for all domains). I'll do some digging to see if I can find a workaround.

Looks like it sorts that part out

Yeah. The problem seems to be with:

exlusion=$(echo -e "$WEB" |tr ',' '\n' |grep "^$domain:")

It's looking for $domain, which is fine if you put the domain, but it misses the * wildcard for "all". I've tried playing with:

exlusion=$(echo -e "$WEB" |tr ',' '\n' |grep "^$domain|\*:")

But that doesn't work :confused:

We need to include in the check that if * is used, we should always select the domain…

Yeah. It works if we do:
echo -e "*:public_html" |tr ',' '\n' |grep "^\*:"

But not with the |:

echo -e "*:public_html" |tr ',' '\n' |grep "^foo.com|\*:"

Do I need to escape something? I've tried wrapping it:

echo -e "*:public_html" |tr ',' '\n' |grep "^(foo.com|\*):"

But still no joy

UPDATE: I got it :slight_smile:

exlusion=$(echo -e "$WEB" | tr ',' '\n' | grep "^$domain\|\*:")

Looks like I needed to escape the | as well. I'll see if I can create a PR for it for review (just trying to remember how I did the fork last time ;))
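For reference, the behaviour is easy to reproduce on the command line. grep defaults to basic regular expressions (BRE), where a bare | is a literal character and \| is alternation (the foo.com test value is just illustrative):

# bare | is literal in BRE, so this never matches:
echo "*:public_html" | grep "^foo.com|\*:"

# escaped \| is alternation, so the wildcard line is selected:
echo "*:public_html" | grep "^foo.com\|\*:"
# *:public_html

# equivalently, extended regex (-E) treats | and () as operators:
echo "*:public_html" | grep -E "^(foo\.com|\*):"
# *:public_html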

Cheers

Andy


Please go ahead


PR submitted :slight_smile:

The problem still exists around the "not enough disk space" error, so I have to comment that out. I'm not sure what a better solution would be.


Thanks

Looking at it, there are more places that could do with this tweak. It also fixed the disk space calculation (as it now correctly ignores public_html when working out the required space). I just made a tweak to the file, but my editor changed all of the tabs to spaces! I've fixed that now, but I'm trying to work out how to edit my file before submitting it.
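For anyone curious what an exclusion-aware space check looks like conceptually, here's a minimal sketch (the paths and variable names are hypothetical, not the actual v-backup-user code):

# Hypothetical sketch: size the source while skipping excluded dirs,
# so public_html doesn't inflate the required-space estimate.
required_kb=$(du -sk --exclude='public_html' /home/chambres/web | cut -f1)
available_kb=$(df -k --output=avail /backup | tail -n1)
if [ "$required_kb" -gt "$available_kb" ]; then
    echo "Error: not enough disk space available to perform the backup."
    exit 1
fi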

Hmm, I can't figure it out. There are weird things going on when I compare:

You can see my modified file here:

I'm not sure where it's getting the |\ stuff in the file comparison.

Ok, it got late last night, so I'm picking up again today. It looks like something didn't quite happen right:

v-restore-user chambres chambres.2023-01-26_05-16-01.tar

-- WEB --
**ALL MISSING HERE**

-- MAIL --
2023-01-26 06:39:22 cdn-org.e-reserve.org
2023-01-26 06:39:23 chambresdhotes.org
2023-01-26 06:39:55 resa.chambresdhotes.org

-- DB --
2023-01-26 06:39:57 chambres_community
/backup/tmp.zVr79uU6OC/db/chambres_community/chambres_community.mysql.sql.zst: 452828034 bytes
2023-01-26 06:40:14 chambres_links
/backup/tmp.zVr79uU6OC/db/chambres_links/chambres_links.mysql.sql.zst: 5884334498 bytes

-- CRON --
2023-01-26 06:51:09 0 cron jobs

-- USER FILES --
2023-01-26 06:51:09 fix_property_types.log
2023-01-26 06:51:09 .npm
2023-01-26 06:51:09 move-mobi-to-https
2023-01-26 06:51:09 .vscode-server
2023-01-26 06:51:09 .wget-hsts
2023-01-26 06:51:09 make-new-villages-live.cgi
2023-01-26 06:51:09 .bash_history
2023-01-26 06:51:09 category_structure.log
2023-01-26 06:51:09 .ssh
2023-01-26 06:51:10 remote-backup.sh
2023-01-26 06:51:10 .bashrc
2023-01-26 06:51:10 closest_towns.log
2023-01-26 06:51:10 update_distances.log
2023-01-26 06:51:10 .profile
2023-01-26 06:51:10 modules.cgi
2023-01-26 06:51:10 build-changed.sh
2023-01-26 06:51:10 .composer
2023-01-26 06:51:10 .local
2023-01-26 06:51:10 description_urls.log
2023-01-26 06:51:10 remote-backup.sh.bak
2023-01-26 06:51:10 export-local-csv-files.sh
2023-01-26 06:51:10 test.cgi
2023-01-26 06:51:10 copy-site.sh
2023-01-26 06:51:10 symlinks.txt
2023-01-26 06:51:11 build-all.sh
2023-01-26 06:51:11 counters.log
2023-01-26 06:51:11 .cache
2023-01-26 06:51:11 nohup.out
2023-01-26 06:51:11 learn-spam.sh
2023-01-26 06:51:11 update_search_sorts.log
2023-01-26 06:51:11 .config
2023-01-26 06:51:11 dbs
2023-01-26 06:51:32 backup
2023-01-26 06:51:32 .bash_logout

I'm just having a look to see if I can work it out. Maybe something needs tweaking in v-restore-user, to the same effect as what I changed in v-backup-user (i.e. around the *)?

Interestingly, it actually looks like it DID work… but the missing bit is in the /conf folder?

I found the problem, but I can't figure out how to get around it. Basically, we have:

		# Define exclude arguments
		exlusion=$(echo -e "$WEB" | tr ',' '\n' | grep "^$domain\|\*:")
		set -f
		fargs=()
		fargs+=(--exclude='./logs/*')
		if [ -n "$exlusion" ]; then
			xdirs="$(echo -e "$exlusion" | tr ':' '\n' | grep -v $domain)"
			for xpath in $xdirs; do
				if [ -d "$xpath" ]; then
					fargs+=(--exclude=$xpath/*)
					echo "$(date "+%F %T") excluding directory $xpath"
					msg="$msg\n$(date "+%F %T") excluding directory $xpath"
				else
					echo "$(date "+%F %T") excluding file $xpath"
					msg="$msg\n$(date "+%F %T") excluding file $xpath"
					fargs+=(--exclude=$xpath)
				fi
			done
		fi
		set +f
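(As an aside, the set -f / set +f pair matters in that loop: the unquoted $xdirs undergoes pathname expansion as well as word splitting, so an exclusion entry containing glob characters would otherwise be expanded against the current directory before tar ever saw it. A quick illustration, with throwaway files:)

mkdir -p public_html && touch public_html/a public_html/b
xdirs='public_html/*'
for xpath in $xdirs; do echo "$xpath"; done   # globbing on: public_html/a, public_html/b
set -f
for xpath in $xdirs; do echo "$xpath"; done   # globbing off: public_html/* stays literal
set +f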

In my case, $exclusion is: *:public_html

What I guess it needs to do is replace * with $domain if it exists in the string. Doing a test, this works:

sed "s/\\*/foo/g" <<< *:public_html
foo:public_html

But in the bash script it doesn't:

sed "s/\\*/$domain/g" <<< $exlusion

I've tried \*, just *, etc. I've even tried it as:

exclusion="${exlusion//\\*/$domain}"

Yet it remains as *:public_html.

Any pointers? As mentioned before, bash isn't my programming language :slight_smile:

Ok I got it!

			if [[ "$exclusion" =~ '*' ]]; then
				exclusion="${exclusion/\*/$domain}"
			fi

I'm still playing with it to see if this actually works this time (it's a 5 GB backup, so it takes a while to download from the live server, put it up on the new one, and try a restore :)). There is also a typo in that script - exlusion should be exclusion - which is where part of my problem came from, as I was doing the replacement on $exclusion.
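For anyone hitting the same wall, the other half of my problem was the escaping inside the parameter expansion itself. In ${var//pattern/replacement} the pattern is a glob, so a literal asterisk is \*, while \\* matches a literal backslash followed by anything:

exlusion='*:public_html'
domain='foo.com'

echo "${exlusion//\\*/$domain}"   # *:public_html  (looks for a backslash; no change)
echo "${exlusion//\*/$domain}"    # foo.com:public_html  (\* matches the literal *)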

I've cancelled the old branch (to get rid of all the weirdness that happened yesterday) and created a new one with the correct changes. It all looks in order now :slight_smile: Hopefully someone can give it a go and push it through. I've tested it on my end and it works perfectly now for the web folder exclusions :sunglasses:

Although, I'm not sure if this is normal for a submission?

No, it doesn't look like it:

But this should work: