[PR #490] Fix HA VM Migration Race Condition #505

Open
opened 2026-02-28 00:42:26 +03:00 by kerem · 0 comments

📋 Pull Request Information

Original PR: https://github.com/Telmate/proxmox-api-go/pull/490
Author: @pavel-z1
Created: 2025-10-20
Status: 🔄 Open

Base: master ← Head: fix/ha-migration-race-condition


📝 Commits (1)

  • 0e03c15 fix: wait for migration lock release on HA VMs

📊 Changes

2 files changed (+77 additions, -0 deletions)


📝 proxmox/client.go (+30 -0)
📝 proxmox/config__qemu.go (+47 -0)

📄 Description

Fix HA VM Migration Race Condition

This pull request resolves a race condition that occurs when migrating High Availability (HA) virtual machines.

The Problem

When a Terraform plan modifies the target_node of a proxmox_vm_qemu resource with HA enabled, the provider initiates a migration and then immediately attempts to apply further configuration updates to the VM on the new node.

Due to cluster synchronization delays, the VM's configuration file might not be immediately available on the destination node, or the VM might still be locked by the migration process. This resulted in intermittent errors, such as:

  • 500 Configuration file 'nodes/...' does not exist
  • 500 VM is locked (migrate)

This pull request addresses issue #1343.

The Solution

To ensure the provider waits until the migration is fully complete, this change introduces a robust polling mechanism. After initiating a migration, the provider will now:

  1. Poll the cluster status until the VM is reported as being on the correct destination node.
  2. Once the VM is on the target node, continue polling until the migration lock (lock: migrate) is released from the VM's status.

This ensures that the provider only proceeds with subsequent configuration updates after the Proxmox cluster has fully finalized the migration and the VM is ready for new commands. A generous 10-minute timeout has been implemented to accommodate large or slow migrations.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.
