This post was submitted on 10 Oct 2023
12 points (83.3% liked)

Selfhosted


A place to share alternatives to popular online services that can be self-hosted without giving up privacy or locking you into a service you don't control.

 

I've noticed that sometimes when a particular VM/service is having issues, they all seem to hang. For example, I have a VM hosting my DNS (Pi-hole) and another hosting my media server (Jellyfin). If Jellyfin crashes for some reason, internet for the entire house also goes down, because DNS seems to be unreachable for a minute or so while the Jellyfin VM recovers.

Is this expected, and is there a way to prevent it?

top 19 comments
[–] NeoNachtwaechter@lemmy.world 5 points 10 months ago (1 children)

Your services may run in separate VMs, but there are still some dependencies between them. You need to know, and think about, all the dependencies between your VMs.

For example, they share a common network interface (the host machine's). That is a dependency. If one VM can clog the network interface (and maybe your crashing one is doing exactly that), it is clogged for all the other VMs too.

To resolve that dependency, you can either put another network interface card in your host machine and let only the pihole VM use it, or run the pihole on a real physical Pi.
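
For the Proxmox case, a hedged sketch of what dedicating a second NIC to the Pi-hole VM might look like (eth1, vmbr1, and VM ID 100 are placeholder names, not from this thread):

```
# /etc/network/interfaces on the Proxmox host: a second bridge bound to
# the spare physical NIC (eth1 and vmbr1 are placeholder names).
auto vmbr1
iface vmbr1 inet manual
    bridge-ports eth1
    bridge-stp off
    bridge-fd 0
```

Then attach the Pi-hole VM's virtual NIC to the new bridge:

```
# 100 is a placeholder VM ID for the Pi-hole VM.
qm set 100 --net0 virtio,bridge=vmbr1
```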

You could also fix Jellyfin's own problem, but resolving the dependency might give you a more reliable system.

[–] root@lemmy.world 2 points 10 months ago

That's a good point. My virtualization server is a (fairly beefy) Intel NUC with two Ethernet ports: one is for management, and the other is my VLAN trunk, which is where all the traffic goes through. I will limit the connection speed of the client that is pulling large video files in the hope that the line doesn't saturate, and long term I'll try to get a different box where I can separate the VLANs onto their own ports instead of glomming them all onto one port.
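
If you'd rather cap it on the Proxmox side than at the client, a minimal sketch (VM ID 101 is a placeholder; the `rate` option is in MB/s):

```
# Cap the Jellyfin VM's virtual NIC at ~50 MB/s.
# 101 and vmbr0 are placeholders -- adjust to your setup.
qm set 101 --net0 virtio,bridge=vmbr0,rate=50
```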

[–] apigban@lemmy.dbzer0.com 4 points 10 months ago* (last edited 10 months ago) (1 children)

I'd check for high I/O wait, especially if all of your VMs are on HDDs.

One of the solutions I had for this issue was to run multiple DNS servers. I solved it by buying a Raspberry Pi Zero W and running a second small instance of Pi-hole there. I made sure the Pi Zero W is plugged into a separate circuit in my home.
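
A quick hedged check that both resolvers answer independently (the addresses below are placeholders for the two Pi-hole instances):

```
# Query each Pi-hole directly; the second should still answer if the first VM hangs.
# 192.168.1.10 and 192.168.1.11 are placeholder addresses.
dig @192.168.1.10 example.com +short
dig @192.168.1.11 example.com +short
```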

[–] root@lemmy.world 2 points 10 months ago (2 children)

Good point. I just checked, and streaming something to my TV causes I/O delay to spike to around 70%. I'm also wondering if routing my Jellyfin (and some other things) through NGINX (also hosted on Proxmox) has something to do with it. Maybe I need to allocate more resources to NGINX?

The system running Proxmox has a couple Samsung Evo 980s in it, so I don't think they would be the issue.

[–] apigban@lemmy.dbzer0.com 1 points 10 months ago (1 children)

Let me know if you need some remote troubleshooting; if schedules permit, we can do screenshares.

[–] root@lemmy.world 2 points 10 months ago (1 children)

Very nice of you to offer. I made a few changes: routing my problem Jellyfin client directly to the Jellyfin server and cutting out the NGINX hop, as well as limiting the bandwidth of that client in case the line is getting saturated.

I'll try to report back if there are any updates.

[–] apigban@lemmy.dbzer0.com 1 points 10 months ago

hey yeah, no stress!

just lemme know if you'd want someone to brainstorm with.

[–] apigban@lemmy.dbzer0.com 0 points 10 months ago

I had this issue when I used Kubernetes; SATA SSDs can't keep up. I'm not sure what the Evo 980 is or what it's rated for, but I would suggest shutting down all container I/O and doing a benchmark using fio.

My current setup is Proxmox, with spinning rust (HDDs) configured in RAID 5 on a NAS, and Jellyfin in a container.

All Jellyfin container transcoding and cache is dumped on a WD750 NVMe, while all media is stored on the NAS (max bandwidth is 150 MB/s).

You can monitor the I/O using iostat once you've done a benchmark.
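
A minimal sketch of that benchmark-then-monitor flow (the test file path, size, and block size are assumptions to tune for your storage):

```
# Random-read benchmark with fio; creates a 1 GiB test file at a placeholder path.
fio --name=iotest --filename=/tmp/fio.test --size=1G \
    --rw=randread --bs=4k --ioengine=libaio --iodepth=32 \
    --runtime=60 --time_based --group_reporting

# In another terminal: extended per-device stats every 2 s; watch await and %util.
iostat -x 2
```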

[–] MangoPenguin@lemmy.blahaj.zone 3 points 10 months ago

If the VM crashing is because of high CPU usage on all cores, high I/O delay on the storage, or an out-of-memory situation on the host, that would cause all of the other VMs to struggle as well.
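
A hedged sketch of quick checks for those three conditions, run on the host itself:

```
vmstat 2                            # 'wa' column shows CPU time stuck in I/O wait
free -h                             # memory and swap pressure on the host
dmesg -T | grep -i 'out of memory'  # any OOM-killer events in the kernel log
```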

[–] krolden@lemmy.ml 2 points 10 months ago (1 children)

I'd start by figuring out why your Jellyfin VM is crashing.

[–] root@lemmy.world 2 points 10 months ago (1 children)

Yeah, I've been looking into it for some time. It normally seems to be an issue on the client side (Nvidia Shield): playback will stop randomly and then restart, and this may happen a couple of times (no one really knows why, it seems). I recently reinstalled that server on a new VM with a new OS (Debian) and nothing else running on it, and the only client that seems to be able to cause the crash is the TV running the Shield. It's hard to find a good Jellyfin client for the TV, it seems :(

[–] krolden@lemmy.ml 1 points 10 months ago* (last edited 10 months ago)

So is the VM crashing, or just Jellyfin? It sounds like you may be having other network issues.

[–] lemming741@lemmy.world 2 points 10 months ago (1 children)

How many cores do you have configured for jellyfin?

[–] root@lemmy.world 1 points 10 months ago

4 currently, with 8 GB RAM and no passthrough for transcoding (only direct play).
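
If CPU turns out to be the bottleneck, a hedged example of bumping that allocation from the Proxmox host (the VM ID and new values are illustrative only):

```
# Illustrative only: raise the Jellyfin VM to 6 cores / 12 GiB RAM.
# 101 is a placeholder VM ID; adjust values to what your host can spare.
qm set 101 --cores 6 --memory 12288
```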

[–] Decronym@lemmy.decronym.xyz 1 points 10 months ago* (last edited 10 months ago) (1 children)

Acronyms, initialisms, abbreviations, contractions, and other phrases which expand to something larger, that I've seen in this thread:

| Fewer Letters | More Letters |
|---|---|
| DNS | Domain Name Service/System |
| LXC | Linux Containers |
| NAS | Network-Attached Storage |
| NUC | Next Unit of Computing brand of Intel small computers |
| SSD | Solid State Drive mass storage |

[Thread #205 for this sub, first seen 10th Oct 2023, 06:25] [FAQ] [Full list] [Contact] [Source code]

[–] root@lemmy.world 1 points 10 months ago
[–] kowcop@aussie.zone 0 points 10 months ago

This happens to me when there is an app keeping a file open on an NFS storage mapping.

[–] manwichmakesameal@lemmy.world -3 points 10 months ago (1 children)

Why are you running full VMs for something that can be put in a container? Sounds to me (without having any evidence or proof) like you're running out of memory and swapping, and that's taking forever. That's what causes the VMs to slow/stop.

[–] root@lemmy.world 3 points 10 months ago

I typically prefer VMs just because I can change the kernel as I please (containers such as LXC use the host kernel). I know it's overkill, but I have the storage/memory to spare. Typically I'm at about 80% (memory) utilization under full load.
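
A quick way to see that kernel difference on a Proxmox host (a sketch; 101 is a placeholder LXC container ID):

```
uname -r                   # kernel running on the Proxmox host
pct exec 101 -- uname -r   # prints the same kernel from inside LXC 101 (placeholder ID)
# A full VM boots its own kernel, so the equivalent check there can differ.
```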