
Hardware Support for Virtualization

What Is Virtualization?

Virtualization is the partitioning of physical hardware resources among multiple virtual environments.

graph TB
Hardware[Physical Hardware] --> Hypervisor[Hypervisor<br/>VMM]

Hypervisor --> VM1[VM 1<br/>Guest OS 1]
Hypervisor --> VM2[VM 2<br/>Guest OS 2]
Hypervisor --> VM3[VM 3<br/>Guest OS 3]

VM1 --> App1[Applications]
VM2 --> App2[Applications]
VM3 --> App3[Applications]

Benefits:

  • Resource utilization - use the hardware to its full capacity
  • Isolation - VMs are isolated from one another
  • Flexibility - create and delete VMs easily
  • Cost savings - more workloads on fewer physical servers
  • Disaster recovery - snapshots, backups, migration

Virtualization Types

graph TB
Virt[Virtualization Types] --> Full[Full Virtualization<br/>Hardware-assisted]
Virt --> Para[Paravirtualization<br/>Guest OS modified]
Virt --> OS[OS-level<br/>Containers]
Virt --> Hardware[Hardware Partitioning<br/>Physical isolation]

Full --> VTx[Intel VT-x<br/>AMD-V]
Para --> Xen[Xen<br/>Faster but needs modification]
OS --> Docker[Docker<br/>LXC]
Hardware --> LPAR[IBM LPAR<br/>Oracle LDOM]

CPU Virtualization

Problem: Privilege Levels

x86 ring model:

graph TB
Ring0[Ring 0<br/>Kernel Mode<br/>Privileged] --> Ring1[Ring 1<br/>Unused]
Ring1 --> Ring2[Ring 2<br/>Unused]
Ring2 --> Ring3[Ring 3<br/>User Mode<br/>Unprivileged]

style Ring0 fill:#f66
style Ring3 fill:#6f6

Problem:

The guest OS expects to run in Ring 0 (to execute privileged instructions),
but Ring 0 is already occupied by the hypervisor!

Solution: Hardware virtualization

Intel VT-x (VMX)

Intel Virtualization Technology for x86

graph TB
VMX[VMX Operation] --> Root[VMX Root Mode<br/>Hypervisor]
VMX --> NonRoot[VMX Non-Root Mode<br/>Guest OS]

Root --> Ring0_Host[Ring 0: Hypervisor]
NonRoot --> Ring0_Guest[Ring 0: Guest OS]
NonRoot --> Ring3_Guest[Ring 3: Guest Apps]

NonRoot -.VM Exit.-> Root
Root -.VM Entry.-> NonRoot

VMCS (Virtual Machine Control Structure):

// Conceptual layout - real VMCS fields are accessed via VMREAD/VMWRITE
struct vmcs {
    // Guest state
    uint64_t guest_rip;
    uint64_t guest_rsp;
    uint64_t guest_cr3;
    // ... all registers

    // Host state (hypervisor)
    uint64_t host_rip;
    uint64_t host_rsp;
    uint64_t host_cr3;

    // VM execution controls
    uint32_t pin_based_controls;
    uint32_t proc_based_controls;
    uint32_t exit_controls;
    uint32_t entry_controls;

    // Exit information
    uint32_t exit_reason;
    uint64_t exit_qualification;
};

VM Exit Reasons:

- Privileged instruction (e.g., CPUID, HLT)
- I/O instruction (IN, OUT)
- Access to control registers (MOV to CR3)
- Interrupt/Exception
- EPT violation (memory access)
- VMCALL (hypercall)
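
After each exit, the hypervisor reads the exit reason from the VMCS and dispatches on it. A minimal sketch of that loop in C, where vmx_run and all handler functions are hypothetical (the exit-reason values themselves are the architectural ones from the Intel SDM):

// VM-exit dispatch loop (sketch; vmx_run and all handlers are hypothetical,
// and each handler is assumed to advance the guest RIP itself when needed).
#define EXIT_REASON_CPUID          10   // values per the Intel SDM
#define EXIT_REASON_HLT            12
#define EXIT_REASON_VMCALL         18
#define EXIT_REASON_CR_ACCESS      28
#define EXIT_REASON_IO_INSTRUCTION 30
#define EXIT_REASON_EPT_VIOLATION  48

void vm_exit_loop(struct vcpu *vcpu) {
    for (;;) {
        vmx_run(vcpu);                            // VMLAUNCH/VMRESUME until next exit
        uint32_t reason = vmcs_read32(VM_EXIT_REASON) & 0xffff; // basic exit reason
        switch (reason) {
        case EXIT_REASON_CPUID:          emulate_cpuid(vcpu);      break;
        case EXIT_REASON_HLT:            wait_for_interrupt(vcpu); break;
        case EXIT_REASON_VMCALL:         handle_hypercall(vcpu);   break;
        case EXIT_REASON_CR_ACCESS:      emulate_cr_access(vcpu);  break;
        case EXIT_REASON_IO_INSTRUCTION: emulate_io(vcpu);         break;
        case EXIT_REASON_EPT_VIOLATION:  handle_ept_fault(vcpu);   break;
        default:                         vcpu_dump_and_stop(vcpu); return;
        }
    }
}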

VT-x Instructions:

; Enter VMX operation
vmxon [vmxon_region]

; Load VMCS
vmptrld [vmcs_address]

; Launch VM
vmlaunch

; Resume VM (after VM exit)
vmresume

; Exit VMX operation
vmxoff

; Hypercall from guest
vmcall
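
From the guest's side, a hypercall is just VMCALL with an agreed register convention. A sketch of a guest wrapper, assuming an illustrative ABI (call number in RAX, arguments in RDI/RSI); each hypervisor defines its own, and AMD guests would use VMMCALL:

// Guest-side hypercall wrapper (sketch). The register ABI here is an
// assumption for illustration, not any particular hypervisor's.
static inline long hypercall2(long nr, long arg0, long arg1) {
    long ret;
    asm volatile("vmcall"
                 : "=a"(ret)
                 : "a"(nr), "D"(arg0), "S"(arg1)
                 : "memory");
    return ret;
}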

AMD-V (SVM)

AMD Secure Virtual Machine

graph TB
SVM[SVM Operation] --> Host[Host Mode<br/>Hypervisor]
SVM --> Guest[Guest Mode<br/>Guest OS]

Host --> VMRUN[VMRUN instruction]
VMRUN --> Guest

Guest -.VMEXIT.-> Host
Host -.VMRUN.-> Guest

VMCB (Virtual Machine Control Block):

struct vmcb {
    // Control area
    struct {
        uint32_t intercept_cr_reads;
        uint32_t intercept_cr_writes;
        uint32_t intercept_exceptions;
        uint64_t intercept_instruction0;
        uint64_t intercept_instruction1;
        // ...
        uint64_t exitcode;
        uint64_t exitinfo1;
        uint64_t exitinfo2;
    } control;

    // Save state area
    struct {
        uint64_t rip;
        uint64_t rsp;
        uint64_t rflags;
        uint64_t cr0, cr2, cr3, cr4;
        // ... all registers
    } save_state;
};

AMD-V Instructions:

; Load a subset of guest state from the VMCB (FS/GS base, TR, ...)
vmload [vmcb_address]

; Run guest (returns here on #VMEXIT)
vmrun [vmcb_address]

; Save that guest state subset back to the VMCB after exit
vmsave [vmcb_address]

; Hypercall
vmmcall
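
Put together, one guest-execution round looks roughly like the sketch below (heavily simplified: host-state handling, GIF, and register clobbers are omitted, and handle_svm_exit is hypothetical):

// One SVM world switch (sketch). VMLOAD/VMRUN/VMSAVE take the VMCB's
// physical address implicitly in RAX.
void svm_run_once(uint64_t vmcb_pa, struct vmcb *vmcb) {
    asm volatile("vmload\n\t"  // load extra guest state (FS/GS base, TR, ...)
                 "vmrun\n\t"   // enter guest; execution returns on #VMEXIT
                 "vmsave"      // save that guest state back to the VMCB
                 :: "a"(vmcb_pa) : "memory", "cc");
    handle_svm_exit(vmcb->control.exitcode);  // dispatch like the VT-x loop
}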

Intel VT-x vs AMD-V

| Feature           | Intel VT-x          | AMD-V              |
|-------------------|---------------------|--------------------|
| Control structure | VMCS (in memory)    | VMCB (in memory)   |
| Enter guest       | VMLAUNCH/VMRESUME   | VMRUN              |
| Exit guest        | Automatic (VM Exit) | Automatic (VMEXIT) |
| Hypercall         | VMCALL              | VMMCALL            |
| Tagged TLB        | VPID                | ASID               |
| Nested paging     | EPT                 | NPT (RVI)          |
| Performance       | Similar             | Similar            |

Memory Virtualization

Problem: Address Translation

Guest Virtual Address (GVA)
↓ (Guest page table)
Guest Physical Address (GPA)
↓ (Need another translation!)
Host Physical Address (HPA)

Shadow Page Tables (Software)

sequenceDiagram
participant Guest
participant Hypervisor
participant HW as Hardware

Guest->>Guest: Access GVA
Note over Guest: Uses shadow page table
Guest->>HW: GVA → HPA (direct)

Note over Hypervisor: Maintain shadow PT<br/>GVA → HPA

Guest->>Guest: Update guest PT (GVA → GPA)
Note over Guest: Page table write (trapped)
Guest->>Hypervisor: VM Exit
Hypervisor->>Hypervisor: Update shadow PT

Drawbacks:

  • High overhead (VM exits)
  • Memory overhead (shadow page tables)
  • Complex
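
The trapped-update path is where the overhead comes from: every guest page-table write costs a VM exit plus shadow maintenance. A hedged sketch with hypothetical helper names:

// Handler for a trapped guest page-table write (sketch; every helper
// and macro here is hypothetical). Keeps the shadow table (GVA -> HPA)
// consistent with the guest's own table (GVA -> GPA).
void on_guest_pt_write(struct vm *vm, uint64_t gva, uint64_t guest_pte) {
    uint64_t gpa = guest_pte & PTE_FRAME_MASK;  // where the guest thinks it maps
    uint64_t hpa = gpa_to_hpa(vm, gpa);         // hypervisor's GPA -> HPA mapping
    shadow_pt_set(vm, gva, hpa, guest_pte & PTE_FLAGS_MASK);
    invalidate_tlb_entry(vm, gva);              // drop the stale translation
}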

EPT / NPT (Hardware)

EPT - Extended Page Tables (Intel)
NPT - Nested Page Tables (AMD, also called RVI - Rapid Virtualization Indexing)

graph TB
GVA[Guest Virtual Address] -->|Guest PT| GPA[Guest Physical Address]
GPA -->|EPT/NPT| HPA[Host Physical Address]

style GVA fill:#9cf
style GPA fill:#fc9
style HPA fill:#9f9

2D Page Walk:

1. Guest page walk (GVA → GPA)
- Each step may trigger EPT walk (GPA → HPA)

2. EPT page walk (GPA → HPA)

Example:
GVA: 0x00007fff12345678

├─ Guest PT level 4: GVA[47:39] → GPA_1
│ └─ EPT walk: GPA_1 → HPA_1 (read entry)

├─ Guest PT level 3: GPA_1[entry] + GVA[38:30] → GPA_2
│ └─ EPT walk: GPA_2 → HPA_2

├─ Guest PT level 2: GPA_2[entry] + GVA[29:21] → GPA_3
│ └─ EPT walk: GPA_3 → HPA_3

└─ Guest PT level 1: GPA_3[entry] + GVA[20:12] → GPA (page)
└─ EPT walk: GPA → HPA (final)

HPA: 0x00000001abcde678

Worst case: 4 guest levels × (4 EPT accesses + 1 guest entry read) + 4 for the final EPT walk = 24 memory accesses!
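
In code form (a tiny helper; x86-64 has 4 levels on both the guest and EPT side):

// Worst-case memory accesses for a 2D (nested) page walk (sketch).
// Each of g guest levels costs e EPT accesses to translate the table's
// GPA, plus 1 read of the guest entry; the final GPA costs e more.
static int nested_walk_accesses(int g, int e) {
    return g * (e + 1) + e;   // 4 * (4 + 1) + 4 = 24 on x86-64
}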

TLB optimization: the TLB caches completed GVA → HPA translations directly; VPID/ASID tags keep them valid across VM switches

VPID / ASID

VPID (Virtual Processor ID) - Intel
ASID (Address Space ID) - AMD

graph LR
Without[Without VPID/ASID] --> Flush1[TLB flush on every<br/>VM switch]

With[With VPID/ASID] --> Tag[TLB entries tagged<br/>with VM ID]
Tag --> NoFlush[No flush needed]

TLB entry with VPID:

[VPID: 1] GVA 0x1000 → HPA 0xabcd1000  (VM 1)
[VPID: 2] GVA 0x1000 → HPA 0xef012000 (VM 2)
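
Conceptually, the tag is just one extra field per entry (a sketch, not an actual hardware format):

// Tagged TLB entry (conceptual sketch; real TLB layouts are microarchitectural).
struct tlb_entry {
    uint16_t vpid;     // VM tag (VPID on Intel, ASID on AMD)
    uint64_t gva_pfn;  // guest virtual page number
    uint64_t hpa_pfn;  // host physical frame number
    uint8_t  perms;    // access permissions
};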

I/O Virtualization

Emulation (Software)

sequenceDiagram
participant Guest
participant Hypervisor
participant Driver as Hypervisor Driver
participant Device as Physical Device

Guest->>Guest: I/O instruction (OUT)
Guest->>Hypervisor: VM Exit (I/O)
Hypervisor->>Driver: Emulate device
Driver->>Device: Real I/O
Device-->>Driver: Response
Driver->>Hypervisor: Update guest state
Hypervisor->>Guest: VM Entry

Drawbacks:

  • Slow (VM exits)
  • High CPU overhead
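
Per trapped instruction, the device-model path looks roughly like this sketch (the field names mimic a KVM-style exit record, but every helper here is hypothetical):

// Emulate a guest port-I/O instruction after an I/O VM exit (sketch).
void handle_io_exit(struct vcpu *vcpu) {
    if (vcpu->io.port == 0x3f8 && vcpu->io.out)  // 0x3f8 = COM1 serial port
        uart_model_write(vcpu->io.data);         // hypothetical virtual UART
    else
        device_model_dispatch(vcpu);             // hand off to other device models
    skip_emulated_instruction(vcpu);             // advance guest RIP past IN/OUT
}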

Paravirtualization

The guest OS knows it is running virtualized → it uses special drivers.

graph TB
Guest[Guest OS] --> Para[Paravirtual Driver<br/>virtio, vmxnet3]
Para --> Hypervisor[Hypervisor]
Hypervisor --> Device[Physical Device]

virtio (Linux):

// Guest driver (simplified)
struct virtio_device {
    struct virtqueue *vq;
    // ...
};

// Add buffer to queue
virtqueue_add_buf(vq, sg, out, in, data);

// Kick (notify) hypervisor
virtqueue_kick(vq);

// Hypervisor processes queue

Advantages:

  • Faster than emulation
  • Lower overhead

Drawbacks:

  • Guest OS must be modified
  • Paravirtual drivers needed

SR-IOV (Single Root I/O Virtualization)

Hardware-level I/O virtualization.

graph TB
Physical[Physical Function<br/>PF]

Physical --> VF1[Virtual Function 1<br/>VF]
Physical --> VF2[Virtual Function 2<br/>VF]
Physical --> VF3[Virtual Function N<br/>VF]

VF1 --> VM1[VM 1<br/>Direct access]
VF2 --> VM2[VM 2<br/>Direct access]
VF3 --> VM3[VM N<br/>Direct access]

Physical --> Config[Hypervisor<br/>Configuration only]

Features:

  • Direct device access from VM
  • Near-native performance
  • Hardware isolation
  • No hypervisor overhead (after setup)

Example: Network card (NIC)

# Enable SR-IOV on physical device
echo 4 > /sys/class/net/eth0/device/sriov_numvfs

# Assign VF to VM
virsh attach-interface vm1 hostdev 0000:03:10.0

Comparison:

| Method             | Performance | Overhead | Guest Support | Hardware       |
|--------------------|-------------|----------|---------------|----------------|
| Emulation          | Slow        | High     | Any OS        | Any            |
| Paravirtualization | Medium      | Medium   | Modified OS   | Any            |
| SR-IOV             | Fast        | Low      | Native driver | SR-IOV capable |

Hypervisor Types

Type 1: Bare-Metal Hypervisor

Runs directly on hardware.

graph TB
Hardware[Hardware] --> Hypervisor[Type 1 Hypervisor<br/>VMware ESXi, Xen, Hyper-V]

Hypervisor --> VM1[VM 1]
Hypervisor --> VM2[VM 2]
Hypervisor --> VM3[VM 3]

Features:

  • Direct hardware access
  • Better performance
  • Enterprise/datacenter

Examples:

  • VMware ESXi
  • Microsoft Hyper-V
  • Xen
  • KVM (with Linux as host)

Type 2: Hosted Hypervisor

Runs on top of an OS.

graph TB
Hardware[Hardware] --> OS[Host OS<br/>Windows, Linux, macOS]
OS --> Hypervisor[Type 2 Hypervisor<br/>VirtualBox, VMware Workstation]

Hypervisor --> VM1[VM 1]
Hypervisor --> VM2[VM 2]

OS --> Apps[Host Applications]

Features:

  • Easier to use
  • Desktop/development
  • Lower performance

Examples:

  • Oracle VirtualBox
  • VMware Workstation / Fusion
  • Parallels Desktop

KVM (Kernel-based Virtual Machine)

A hypervisor built into the Linux kernel.

graph TB
Hardware[Hardware] --> Linux[Linux Kernel + KVM]

Linux --> QEMU1[QEMU Process = VM 1]
Linux --> QEMU2[QEMU Process = VM 2]
Linux --> Apps[Regular Apps]

QEMU1 --> Guest1[Guest OS]
QEMU2 --> Guest2[Guest OS]

Features:

  • Type 1 performance
  • Linux kernel integration
  • Open source
  • Wide adoption (OpenStack, etc.)
# Check KVM support
lsmod | grep kvm

# Create VM with KVM
qemu-system-x86_64 -enable-kvm -m 2048 -hda disk.img
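
KVM exposes all of this to userspace as ioctls on /dev/kvm; QEMU is just one client of that API. A minimal sketch of the control flow (error handling, guest memory setup via KVM_SET_USER_MEMORY_REGION, and register initialization are omitted):

// Minimal KVM control flow (sketch; a real VM also needs memory regions
// and register setup before KVM_RUN does anything useful).
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/kvm.h>

int main(void) {
    int kvm  = open("/dev/kvm", O_RDWR);
    int vm   = ioctl(kvm, KVM_CREATE_VM, 0);        // create a VM
    int vcpu = ioctl(vm, KVM_CREATE_VCPU, 0);       // add one virtual CPU
    int size = ioctl(kvm, KVM_GET_VCPU_MMAP_SIZE, 0);
    struct kvm_run *run = mmap(NULL, size, PROT_READ | PROT_WRITE,
                               MAP_SHARED, vcpu, 0); // shared exit-info area

    for (;;) {
        ioctl(vcpu, KVM_RUN, 0);                    // enter the guest
        switch (run->exit_reason) {                 // back here on VM exit
        case KVM_EXIT_IO:  /* emulate port I/O */   break;
        case KVM_EXIT_HLT: return 0;                // guest executed HLT
        default:           return 1;
        }
    }
}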

Containers vs VMs

graph TB
subgraph VMs
HW1[Hardware] --> HV[Hypervisor]
HV --> VM1[VM 1<br/>Guest OS]
HV --> VM2[VM 2<br/>Guest OS]
VM1 --> App1[App]
VM2 --> App2[App]
end

subgraph Containers
HW2[Hardware] --> OS[Host OS]
OS --> Engine[Container Engine<br/>Docker, containerd]
Engine --> C1[Container 1<br/>App + Libs]
Engine --> C2[Container 2<br/>App + Libs]
end

Comparison

| Aspect         | Virtual Machines          | Containers             |
|----------------|---------------------------|------------------------|
| Isolation      | Strong (separate OS)      | Weaker (shared kernel) |
| Startup        | Minutes                   | Seconds                |
| Size           | GBs (full OS)             | MBs (app + libs)       |
| Performance    | Near-native               | Native                 |
| Resource usage | High                      | Low                    |
| Portability    | Good                      | Excellent              |
| Use case       | Different OSes, isolation | Microservices, scale   |

Container Technologies

Linux Namespaces:

// Namespace flags (OR-ed together and passed to clone(2) or unshare(2))
unshare(CLONE_NEWPID);  // PID namespace (separate process tree)
unshare(CLONE_NEWNET);  // Network namespace
unshare(CLONE_NEWNS);   // Mount namespace (filesystem)
unshare(CLONE_NEWUTS);  // Hostname
unshare(CLONE_NEWIPC);  // IPC
unshare(CLONE_NEWUSER); // User/Group IDs
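
A minimal runnable version of the same idea, putting a child in fresh UTS and PID namespaces (requires root):

// Child in new UTS + PID namespaces (sketch; run as root).
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

static char stack[1024 * 1024];              // child stack (grows downward)

static int child(void *arg) {
    sethostname("container1", 10);           // visible only in this namespace
    printf("child sees pid %d\n", getpid()); // prints 1: fresh process tree
    return 0;
}

int main(void) {
    pid_t pid = clone(child, stack + sizeof(stack),
                      CLONE_NEWUTS | CLONE_NEWPID | SIGCHLD, NULL);
    waitpid(pid, NULL, 0);
    return 0;
}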

cgroups (Control Groups):

# Limit CPU
echo 50000 > /sys/fs/cgroup/cpu/container1/cpu.cfs_quota_us

# Limit memory
echo 512M > /sys/fs/cgroup/memory/container1/memory.limit_in_bytes

Union filesystems (OverlayFS):

Lower layer: Base image (read-only)
Upper layer: Container changes (read-write)
Merged view: Combined filesystem
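
The same stacking can be assembled by hand with mount(2); the directory paths below are purely illustrative:

// Mount an OverlayFS view (sketch; the four directories must already exist).
#include <stdio.h>
#include <sys/mount.h>

int main(void) {
    if (mount("overlay", "/merged", "overlay", 0,
              "lowerdir=/lower,upperdir=/upper,workdir=/work") != 0)
        perror("mount");
    return 0;
}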

When to use VMs vs Containers

Use VMs when:

  • Need different OS kernels (e.g., Linux + Windows)
  • Strong isolation required (security, multi-tenancy)
  • Legacy applications
  • Long-running services

Use Containers when:

  • Same OS kernel
  • Fast startup needed
  • Microservices architecture
  • CI/CD pipelines
  • Scaling (Kubernetes)

Hybrid: VMs with containers inside (common in cloud)

Nested Virtualization

Running a VM inside a VM.

graph TB
Hardware[Physical Hardware] --> L0[L0: Hypervisor]
L0 --> L1_VM[L1: VM<br/>Guest Hypervisor]
L1_VM --> L2_VM1[L2: Nested VM 1]
L1_VM --> L2_VM2[L2: Nested VM 2]

Use cases:

  • Development/testing of hypervisors
  • Cloud providers (customer runs VMs inside rented VM)
  • Training

Performance: Worse than regular VMs (double overhead)

Support:

  • Intel: VT-x supports nested (VMCS shadowing)
  • AMD: AMD-V supports nested
# Enable nested virtualization (Intel)
modprobe -r kvm_intel
modprobe kvm_intel nested=1

# Check
cat /sys/module/kvm_intel/parameters/nested

Live Migration

Moving a VM from one host to another (with almost no downtime).

sequenceDiagram
participant Source as Source Host
participant Dest as Destination Host
participant VM

Source->>Dest: Pre-copy (most memory)
Note over Source: VM continues running

Source->>Dest: Iterative copy (dirty pages)
Note over Source: Track modified pages

Source->>Source: Pause VM
Source->>Dest: Final sync (remaining dirty pages)

Dest->>Dest: Resume VM
Note over Dest: VM running on new host

Phases:

  1. Pre-copy: Copy memory while VM runs
  2. Iterative: Copy dirty pages (modified during copy)
  3. Stop-and-copy: Pause VM, copy final state
  4. Resume: Start VM on destination

Downtime: ~100ms - 1s (depends on memory size, network)
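
A hedged sketch of those four phases (every helper name and constant is hypothetical):

// Pre-copy live migration, phase by phase (sketch; all helpers hypothetical).
void live_migrate(struct vm *vm, struct host *dst) {
    enable_dirty_tracking(vm);
    copy_all_memory(vm, dst);                  // 1. pre-copy while the VM runs
    for (int round = 0; round < MAX_ROUNDS; round++) {
        if (dirty_page_count(vm) < STOP_THRESHOLD)
            break;                             // 2. iterate on re-dirtied pages
        copy_dirty_pages(vm, dst);
    }
    pause_vm(vm);                              // 3. stop-and-copy: the brief downtime
    copy_dirty_pages(vm, dst);
    copy_cpu_and_device_state(vm, dst);
    resume_on(dst);                            // 4. resume on the destination host
}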

Requirements:

  • Shared storage (or storage migration)
  • Same CPU architecture
  • Network connectivity
# KVM/QEMU live migration
virsh migrate --live vm1 qemu+ssh://dest-host/system

Performance Considerations

1. CPU Overhead

VM exit/entry: ~1000-2000 cycles
Frequent exits → performance degradation

Optimization:

  • Use paravirtual drivers
  • Enable VT-x/AMD-V features (VPID, EPT)
  • Pin vCPUs to physical cores (CPU affinity)

2. Memory Overhead

Shadow page tables: 2-10% memory overhead
EPT/NPT: Minimal overhead, but 2D page walk

Optimization:

  • Use EPT/NPT (always)
  • Large pages (2MB, 1GB)
  • Memory ballooning (reclaim unused memory)

3. I/O Performance

Emulation: 10-50% of native
Paravirtual: 80-90% of native
SR-IOV: 95-99% of native

Optimization:

  • Use virtio drivers
  • Use SR-IOV if available
  • NVMe for storage

4. Network Performance

# Enable virtio
<interface type='network'>
<model type='virtio'/>
</interface>

# Use SR-IOV for high performance
<interface type='hostdev'>
<source>
<address type='pci' domain='0x0000' bus='0x03' slot='0x10'/>
</source>
</interface>

Security Considerations

VM Escape

Breaking out of a VM onto the host (the worst-case scenario).

Attack vectors:

  • Hypervisor bugs
  • Shared resources (cache, speculative execution)
  • Device emulation vulnerabilities

Mitigations:

  • Keep hypervisor updated
  • Minimize attack surface
  • Use hardware virtualization features
  • Security patches (Spectre, Meltdown)

Side-Channel Attacks

VM 1 (attacker) → Shared cache → VM 2 (victim)
Measure cache timing → leak information

Examples:

  • Spectre, Meltdown
  • Cache timing attacks

Mitigations:

  • Core scheduling (same VM on both hyperthreads)
  • Flush caches on context switch
  • Disable hyperthreading

Isolation Best Practices

  1. Separate sensitive workloads
  2. Use different physical hosts
  3. Security updates
  4. Monitoring and auditing
  5. Network segmentation

Practical Tools

# Check virtualization support
lscpu | grep Virtualization

# Intel
grep vmx /proc/cpuinfo

# AMD
grep svm /proc/cpuinfo

# KVM
lsmod | grep kvm
virsh list --all

# Docker
docker ps
docker run -it ubuntu bash

# Performance monitoring
virsh domstats vm1
perf kvm stat record -a
perf kvm stat report

Best Practices

  1. Enable hardware virtualization

    • VT-x/AMD-V in BIOS
    • EPT/NPT support
  2. Right-size VMs

    • Don't over-provision resources
    • Monitor actual usage
  3. Use paravirtual drivers

    • virtio for Linux
    • VMware Tools / Hyper-V Integration Services
  4. CPU pinning (for latency-sensitive workloads)

    <vcpu placement='static' cpuset='0-3'>4</vcpu>
  5. NUMA awareness

    # Pin VM to NUMA node
    numactl --cpunodebind=0 --membind=0 qemu-system-x86_64 ...
  6. Monitoring

    • CPU usage (steal time)
    • Memory (ballooning, swapping)
    • I/O performance

Related Topics

  • CPU Architecture: Privilege levels, rings
  • Memory Hierarchy: Page tables, TLB
  • I/O Systems: Device access
  • Security: Isolation, side-channels
  • Performance: Overhead, optimization