Proxmox-VE
Latest revision as of 10:49, 28 April 2023
Terminology
'Node' is a computer running Proxmox.
'Cluster' is a group of Proxmox nodes that can be used together for easy migration or high availability.
'Container' is a guest operating system that shares its kernel with the Proxmox host, which can give a performance boost compared to a full virtual machine.
VM/LXC Specifics
Installation
Download the ISO and install on bare metal. Alternatively, you can install on top of an existing Debian installation.
Extra Notes
If ZFS is the storage system of choice, it helps to install the OS on an SSD and set up a ZFS pool manually later.
Advice on Container/VM Numbering
Create a VLAN for your containers and let each container's number reflect the IP address assigned to it. This will save you a great deal of headache.
For example, if Plex, sonarr, radarr, lidarr, sabnzbd, etc. are all on VLAN 12 and Plex's IP is 10.0.12.90, you can number the Plex container 12090 to reflect VLANID+IP.
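The numbering scheme above can be expressed as a tiny helper. This is a sketch; the function name ctid is made up for illustration:

```shell
# Hypothetical helper: build a container ID from the VLAN ID and the last
# octet of the container's IP, using the VLANID + zero-padded host scheme.
ctid() {
    vlan=$1
    host_octet=$2
    printf '%d%03d\n' "$vlan" "$host_octet"
}

ctid 12 90   # Plex on VLAN 12 at 10.0.12.90 -> 12090
```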
Enable Non-Subscription Updates
Make the following changes to your source lists to get updates.
Enable No Subscription Repo
nano /etc/apt/sources.list.d/pve-nosub.list
add:
deb http://download.proxmox.com/debian bullseye pve-no-subscription
nano /etc/apt/sources.list.d/pve-enterprise.list
Comment out Enterprise Repo
add a # symbol in front of this line:
#deb https://enterprise.proxmox.com/debian/pve bullseye pve-enterprise
Install any and all available updates
apt-get update && apt-get dist-upgrade -y
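The repo edits above can be scripted. This sketch writes into a temporary directory standing in for /etc/apt/sources.list.d, so it is safe to dry-run; point REPO_DIR at the real path on a PVE host:

```shell
# REPO_DIR stands in for /etc/apt/sources.list.d; the enterprise file is
# pre-seeded here only so the sketch is self-contained.
REPO_DIR=$(mktemp -d)
printf 'deb https://enterprise.proxmox.com/debian/pve bullseye pve-enterprise\n' \
    > "$REPO_DIR/pve-enterprise.list"

# Enable the no-subscription repo
echo "deb http://download.proxmox.com/debian bullseye pve-no-subscription" \
    > "$REPO_DIR/pve-nosub.list"

# Comment out the enterprise repo
sed -i 's/^deb /#deb /' "$REPO_DIR/pve-enterprise.list"
```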
Non-production Bleeding Edge
PVE Testing Repo
nano /etc/apt/sources.list.d/pve-testing.list
add:
deb http://download.proxmox.com/debian/pve bullseye pvetest
Debian Sid
nano /etc/apt/sources.list
add:
deb http://deb.debian.org/debian sid main non-free contrib
deb-src http://deb.debian.org/debian/ sid main contrib non-free
Install any and all available updates
apt-get update && apt-get dist-upgrade -y
Fix Proxmox Web Interface
If you see:
unable to parse codename from '/etc/os-release' (500)
Then modify the following file:
nano /etc/os-release
add:
VERSION_CODENAME=bullseye
Disable subscription nag after login
wget https://git.deathbybandaid.net/attachments/3d08b1b4-b896-497b-893b-c9c0dbb830ca
dpkg -i pve-fake-subscription_*.deb
echo "127.0.0.1 shop.maurer-it.com" | tee -a /etc/hosts
Update Template Listing
pveam update
Download Virtio iso
Headless Laptop
If you want a laptop node, you may want to disable the lid closing action, or your node will go into standby.
nano /etc/systemd/logind.conf
Change the lines below:
HandleLidSwitch=ignore
HandleLidSwitchDocked=ignore
restart the logind service.
systemctl restart systemd-logind.service
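The two logind settings can also be applied non-interactively. A sketch working on a temp copy; point CONF at /etc/systemd/logind.conf on the real node:

```shell
CONF=$(mktemp)   # stand-in for /etc/systemd/logind.conf
printf '#HandleLidSwitch=suspend\n#HandleLidSwitchDocked=suspend\n' > "$CONF"

# Uncomment the two lid-switch options and force them to 'ignore'
sed -i -E 's/^#?(HandleLidSwitch(Docked)?)=.*/\1=ignore/' "$CONF"
cat "$CONF"
```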
Change Network Settings without reboot
apt install ifupdown2
Add an L2 virtual switch
apt-get install openvswitch-switch
Add libraries for virgl
apt install libgl1 libegl1 -y
Create sudo user
apt install sudo
adduser sysop
usermod -aG sudo sysop
Install a desktop environment (optional)
This takes a few minutes... Make sure you have a non-root user!
apt-get install tasksel
tasksel install desktop kde-desktop
systemctl set-default graphical.target
reboot
Dark Theme (Optional)
Two Node Cluster
nano /etc/pve/corosync.conf
look for the `quorum` section and add
two_node: 1
wait_for_all: 0
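After the edit, the quorum section of corosync.conf should look roughly like this (the provider line is what Proxmox ships; only the two added keys are new):

```
quorum {
  provider: corosync_votequorum
  two_node: 1
  wait_for_all: 0
}
```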
pvesm add cifs DrivePoolold --server 192.168.1.101 --share DrivePool --username Administrator --password ********* --smbversion 2.1
Clustering
Create a cluster
pvecm create YOUR-CLUSTER-NAME
Add a node to a cluster
Run from the node you want to add (new nodes must be empty):
pvecm add IP-ADDRESS-CLUSTER
Remove a node from a cluster
Remove all containers from the node, and login to a node that you are keeping.
It does not hurt to run these commands on all remaining nodes.
Delete the node
pvecm delnode NODE_NAME_TO_DELETE
Delete node configuration files
cd /etc/pve/nodes
rm -r NODE_NAME_TO_DELETE
This step only matters if you want to re-add a new node with the same IP address as one you've removed.
nano /etc/ssh/ssh_known_hosts
and remove the lines for the removed node's IP and name.
pvecm updatecerts
ssh-keygen -R 192.168.1.103
ssh-keygen -f "/etc/ssh/ssh_known_hosts" -R "192.168.1.103"
ssh-keygen -f "/etc/ssh/ssh_known_hosts" -R "NODE_NAME_TO_DELETE"
Run this step once the replacement node is running:
ssh -o 'HostKeyAlias=pve5' root@10.0.4.5
Set Node to not wait for cluster votes
pvecm expected 1
ZFS Tricks and Tips
Proxmox Default ZFS setup
zpool create -f -o ashift=12 rpool /dev/sda /dev/sdb
zfs create rpool/data
Directory mounted on ZFS
zfs create rpool/downloads -o mountpoint=/downloads
backups directory in ZFS
Create Directory
zfs create tank/bkup -o mountpoint=/bkup
Add to storage.cfg
nano /etc/pve/storage.cfg
dir: bkup
path /bkup
content vztmpl,iso,backup
maxfiles 4
shared 0
Add ZFS swap space
This will add 8 GB of swap space backed by a ZFS zvol.
Verify Current
root@Proxmox1:~# swapon -s
Filename          Type       Size     Used     Priority
/dev/zd0          partition  8388604  4036884  -2
Add and activate
root@Proxmox1:~# zfs create -V 8G rpool/swap2
mkswap /dev/zvol/rpool/swap2
swapon /dev/zvol/rpool/swap2
Verify New
root@Proxmox1:~# swapon -s
Filename          Type       Size     Used     Priority
/dev/zd0          partition  8388604  4036884  -2
/dev/zd96         partition  8388604  0        -3
Make it load at boot
nano /etc/fstab
/dev/zvol/rpool/swap2 none swap sw 0 0
Rename a pool
zpool export [poolname]
As an example, for a pool named tank which we wish to rename notankshere:
zpool export tank
Then run:
zpool import [poolname] [newpoolname]
e.g.:
zpool import tank notankshere
The pool will be imported as “notankshere” instead.
Find and remove unused container disks
Find:
pct rescan
Look for lines that say "add unreferenced volume"
To remove, use the webui
Troubleshooting and Error Handling Tips
Too many open files, OOM Killer, Out Of Memory
If you start receiving errors regarding "too many open files", try applying these fixes.
These errors increase when you start running large numbers of containers.
nano /etc/security/limits.conf
Add to the bottom of the file:
* soft nofile 1048576
* hard nofile 1048576
root soft nofile 1048576
root hard nofile 1048576
* soft memlock unlimited
* hard memlock unlimited
Write the new value directly (changes under /proc take effect immediately but do not survive a reboot):
echo 1048576 > /proc/sys/fs/inotify/max_user_watches
nano /usr/lib/sysctl.d/override.conf
Add to the bottom of the file:
fs.inotify.max_queued_events=1048576
fs.inotify.max_user_instances=1048576
fs.inotify.max_user_watches=1048576
vm.max_map_count=262144
kernel.dmesg_restrict=1
net.core.rmem_max=16777216
fs.file-max=1048576
Apply the New Settings:
sysctl -p
sysctl --system
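To confirm the new limits are active, read procfs directly and check the shell's open-file limit:

```shell
# Current inotify watch limit and this shell's open-file limit
cat /proc/sys/fs/inotify/max_user_watches
ulimit -n
```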
increase ulimit or decrease threads
lxc.prlimit.nofile = 20000
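The prlimit line belongs in the affected container's config file, /etc/pve/lxc/&lt;CTID&gt;.conf. A sketch, using CTID 190 as an example; note that Proxmox container configs write raw LXC keys with colon syntax:

```
# appended to /etc/pve/lxc/190.conf (190 is an example CTID)
lxc.prlimit.nofile: 20000
```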
Container Names missing in left pane
service pvedaemon restart
service pvestatd restart
service pveproxy restart
Backup Fails because ZFS Snapshot is already there
I got an error on backup that said:
INFO: starting new backup job: vzdump 190 --remove 0 --compress lzo --node Proxmox1 --storage DBB-Proxmox --mode snapshot
INFO: filesystem type on dumpdir is 'cifs' -using /var/tmp/vzdumptmp29423 for temporary files
INFO: Starting Backup of VM 190 (lxc)
INFO: status = running
INFO: CT Name: sonarr.dbb.local
INFO: excluding bind mount point mp0 ('/Drivepool') from backup
INFO: excluding bind mount point mp1 ('/Downloads') from backup
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: create storage snapshot 'vzdump'
snapshot create failed: starting cleanup
no lock found trying to remove 'backup' lock
ERROR: Backup of VM 190 failed - zfs error: cannot create snapshot 'rpool/data/subvol-190-disk-1@vzdump': dataset already exists
INFO: Backup job finished with errors
TASK ERROR: job errors
The solution was to run:
root@Proxmox1:~# zfs list -t snapshot
NAME                                  USED  AVAIL  REFER  MOUNTPOINT
rpool/data/subvol-190-disk-1@vzdump  86.5M      -  2.13G          -
root@Proxmox1:~# zfs destroy rpool/data/subvol-190-disk-1@vzdump
root@Proxmox1:~# zfs list -t snapshot
no datasets available
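With many containers, stale vzdump snapshots can pile up. A sketch of how to find them all at once; the sample list below stands in for real `zfs list` output:

```shell
# In production, generate the list with:  zfs list -H -t snapshot -o name
sample='rpool/data/subvol-190-disk-1@vzdump
rpool/data/subvol-190-disk-1@daily
rpool/data/subvol-200-disk-0@vzdump'

# Keep only snapshots named 'vzdump'
stale=$(printf '%s\n' "$sample" | grep '@vzdump$')
printf '%s\n' "$stale"
# Each reported snapshot can then be removed with: zfs destroy <name>
```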
Found stale copy during migration
I got this message:
ERROR: found stale volume copy 'local-zfs:subvol-2032-disk-0' on node 'Proxmox-HP'
Solution (on host you are migrating TO):
For a VM:
qm rescan --vmid 2032
For an LXC container:
pct rescan --vmid 2032
zfs list -rt all rpool
zfs destroy rpool/data/subvol-2032-disk-0