I/O Systems

What Are I/O Systems?

I/O (input/output) systems move data between the CPU and external devices such as disks, keyboards, and network cards.

graph TB
CPU[CPU] <--> Memory[Memory]
CPU <--> IOController[I/O Controller]
IOController <--> Disk[Disk]
IOController <--> Network[Network Card]
IOController <--> USB[USB Devices]

Memory <-.DMA.-> IOController

I/O Device Types

1. Block Devices

Transfer data in fixed-size blocks (for example, 512 bytes or 4 KB).

Characteristics:

  • Random access
  • Bufferable
  • Addressable

Examples:

  • Hard disk (HDD)
  • Solid State Drive (SSD)
  • USB flash drive
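
Because a block device is addressed in whole blocks, a byte range has to be mapped onto block numbers before it can be read or written. The sketch below shows that arithmetic; `block_span` and `block_span_of` are hypothetical names chosen for illustration, not part of any driver API.

```c
#include <stdint.h>

/* Hypothetical helper type: which blocks cover a byte range. */
struct block_span {
    uint64_t first_block;   /* index of the first block touched */
    uint64_t block_count;   /* number of blocks touched */
    uint32_t first_offset;  /* byte offset inside the first block */
};

/* Map the byte range [offset, offset + length) onto blocks of
 * block_size bytes. block_size must be non-zero. */
struct block_span block_span_of(uint64_t offset, uint64_t length,
                                uint32_t block_size) {
    struct block_span s;
    s.first_block = offset / block_size;
    s.first_offset = (uint32_t)(offset % block_size);
    if (length == 0) {
        s.block_count = 0;
    } else {
        uint64_t last_block = (offset + length - 1) / block_size;
        s.block_count = last_block - s.first_block + 1;
    }
    return s;
}
```

For example, reading 200 bytes at offset 4000 with 4 KB blocks touches two blocks, so the device has to transfer 8 KB even though only 200 bytes were requested.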

2. Character Devices

Transfer data as a stream of bytes.

Characteristics:

  • Sequential access
  • No random access
  • Not addressable

Examples:

  • Keyboard
  • Mouse
  • Serial port

3. Network Devices

Transfer data as packets.

Examples:

  • Ethernet card
  • Wi-Fi adapter

I/O Methods

1. Programmed I/O (Polling)

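The CPU actively checks the device's status register until the device is ready.

// Polling example
// io_status_register and io_data_register are assumed to be volatile device registers
uint32_t read_data(void) {
    while (!(io_status_register & READY_BIT)) {
        // Busy wait (wastes CPU cycles!)
    }
    return io_data_register;
}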
sequenceDiagram
participant CPU
participant Device

loop Polling
CPU->>Device: Check status
Device-->>CPU: Not ready
CPU->>Device: Check status
Device-->>CPU: Not ready
end

CPU->>Device: Check status
Device-->>CPU: Ready!
CPU->>Device: Read data
Device-->>CPU: Data

Advantages:

  • Simple to implement
  • Low latency (when the device responds quickly)

Disadvantages:

  • Wastes CPU cycles
  • Inefficient, especially for slow devices
  • The CPU cannot do other work while it waits
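
One common way to limit the cost of busy-waiting is to bound the number of polls and report a timeout. The sketch below assumes a hypothetical `poll_read` helper that receives the status and data registers as pointers, so it can be exercised against ordinary variables. The `READY_BIT` value is also an assumption.

```c
#include <stdint.h>

#define READY_BIT 0x01u  /* assumed position of the ready flag */

/* Poll *status up to max_spins times. On success, store the data word
 * in *out and return 0. Return -1 if the device never became ready. */
int poll_read(const volatile uint32_t *status,
              const volatile uint32_t *data,
              uint32_t max_spins, uint32_t *out) {
    for (uint32_t spin = 0; spin < max_spins; spin++) {
        if (*status & READY_BIT) {
            *out = *data;
            return 0;
        }
    }
    return -1;  /* timed out: the caller can retry or fall back to interrupts */
}
```

A driver could use the timeout to switch to interrupt-driven I/O for devices that turn out to be slow.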

2. Interrupt-Driven I/O

The device sends the CPU an interrupt when it is ready.

sequenceDiagram
participant CPU
participant Device
participant ISR as Interrupt Handler

CPU->>Device: Start I/O operation
Note over CPU: Continue other work

Device->>Device: Perform I/O
Device->>CPU: Send interrupt

CPU->>ISR: Jump to handler
ISR->>Device: Read data
Device-->>ISR: Data
ISR->>CPU: Return from interrupt

Note over CPU: Resume work

Interrupt Handling Steps:

  1. Save context - Registers, program counter
  2. Identify interrupt - Which device?
  3. Run ISR (Interrupt Service Routine)
  4. Restore context - Continue execution
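
The four steps above can be simulated in user-space C with a vector table of function pointers. This is only a sketch: on real hardware the context save and restore happen in the CPU and in assembly stubs, and every name here (`cpu_context`, `register_isr`, `dispatch_interrupt`, `keyboard_isr`) is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_VECTORS 16

/* Simulated CPU state that an interrupt must preserve. */
struct cpu_context {
    uint64_t pc;
    uint64_t regs[4];
};

typedef void (*isr_fn)(struct cpu_context *cpu);

static isr_fn vector_table[NUM_VECTORS];
static int keyboard_events;

/* Example handler: clobbers a register, as a real ISR might. */
void keyboard_isr(struct cpu_context *cpu) {
    cpu->regs[0] = 0x60;
    keyboard_events++;
}

/* Install a handler. Returns 0 on success, -1 for a bad vector. */
int register_isr(int vector, isr_fn handler) {
    if (vector < 0 || vector >= NUM_VECTORS) return -1;
    vector_table[vector] = handler;
    return 0;
}

/* Returns 0 if a handler ran, -1 if no handler is installed. */
int dispatch_interrupt(struct cpu_context *cpu, int vector) {
    struct cpu_context saved = *cpu;             /* 1. save context */
    if (vector < 0 || vector >= NUM_VECTORS ||   /* 2. identify the source */
        vector_table[vector] == NULL) {
        return -1;
    }
    vector_table[vector](cpu);                   /* 3. run the ISR */
    *cpu = saved;                                /* 4. restore context */
    return 0;
}
```

After `dispatch_interrupt` returns, the interrupted code sees the same register values it had before, even though the handler overwrote one of them.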

x86 Interrupt Example:

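; Interrupt Descriptor Table (IDT) entry, 64-bit mode (16 bytes)
idt_entry:
    dw isr_address_low    ; handler offset bits 0-15
    dw code_segment       ; code segment selector
    db 0                  ; IST index (0 = none)
    db flags              ; type and attributes
    dw isr_address_mid    ; handler offset bits 16-31
    dd isr_address_high   ; handler offset bits 32-63
    dd 0                  ; reserved

; Interrupt Service Routine
isr_keyboard:
    push rax
    push rbx
    ; ... save registers

    in al, 0x60           ; Read from keyboard port
    ; Process keystroke

    mov al, 0x20          ; EOI (End Of Interrupt)
    out 0x20, al          ; Send to PIC

    ; ... restore registers
    pop rbx
    pop rax
    iretq                 ; Return from interrupt (64-bit form of iret)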

Advantages:

  • The CPU can do other work while the device is busy
  • Efficient
  • Low CPU overhead

Disadvantages:

  • Context switch overhead
  • Interrupt storms (too many interrupts in a short time)
  • Latency (interrupt handling time)

3. Direct Memory Access (DMA)

The device transfers data directly to or from memory, without the CPU copying each word.

sequenceDiagram
participant CPU
participant DMA as DMA Controller
participant Device
participant Memory

CPU->>DMA: Configure transfer<br/>(source, dest, size)
Note over CPU: Continue other work

loop Transfer
Device->>DMA: Data
DMA->>Memory: Write data
end

DMA->>CPU: Send interrupt (transfer complete)

DMA Configuration:

struct dma_descriptor {
uint64_t source_address;
uint64_t dest_address;
uint32_t byte_count;
uint32_t control;
};

void setup_dma_transfer(void) {
    struct dma_descriptor desc;
    desc.source_address = disk_buffer_address;
    desc.dest_address = memory_address;
    desc.byte_count = 4096; // 4KB
    desc.control = DMA_READ | DMA_INTERRUPT_ON_COMPLETE;

    // Start DMA (start is a function pointer member of the controller struct)
    dma_controller->start(&desc);
}

DMA Transfer Modes:

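  1. Burst mode - the controller holds the bus until the whole block has been transferred
  2. Cycle stealing - the controller takes the bus for one transfer unit at a time, then hands it back to the CPU
  3. Transparent mode - the controller transfers only while the CPU is not using the bus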
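
The trade-off between the first two modes can be made concrete with a little arithmetic. The sketch below assumes one transfer unit moves per bus cycle and uses hypothetical function names. Transparent mode is left out because, by definition, it never stalls the CPU.

```c
#include <stdint.h>

enum dma_mode { DMA_BURST, DMA_CYCLE_STEALING };

/* Transfer units needed to move byte_count bytes, rounding up. */
static uint64_t dma_units(uint32_t byte_count, uint32_t unit_bytes) {
    return ((uint64_t)byte_count + unit_bytes - 1) / unit_bytes;
}

/* How many times the controller must win bus arbitration. */
uint64_t dma_bus_requests(enum dma_mode mode, uint32_t byte_count,
                          uint32_t unit_bytes) {
    if (byte_count == 0 || unit_bytes == 0) return 0;
    return mode == DMA_BURST ? 1 : dma_units(byte_count, unit_bytes);
}

/* Longest stretch, in bus cycles, that the CPU is kept off the bus. */
uint64_t dma_max_cpu_stall(enum dma_mode mode, uint32_t byte_count,
                           uint32_t unit_bytes) {
    if (byte_count == 0 || unit_bytes == 0) return 0;
    return mode == DMA_BURST ? dma_units(byte_count, unit_bytes) : 1;
}
```

For a 4 KB transfer over an 8-byte-wide bus, burst mode needs one arbitration but can stall the CPU for 512 cycles, while cycle stealing needs 512 arbitrations and stalls the CPU for one cycle at a time.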

Advantages:

  • Frees the CPU for other work
  • High throughput
  • Low CPU overhead

Disadvantages:

  • Bus contention (the CPU and the DMA controller compete for the bus)
  • Cache coherency issues
  • More complex hardware

Memory-Mapped I/O

I/O registers appear in the same address space as memory.

graph TB
AddressSpace[Address Space] --> RAM[RAM<br/>0x80000000-0xFFFFFFFF]
AddressSpace --> MMIO[Memory-Mapped I/O<br/>0x00000000-0x7FFFFFFF]

MMIO --> UART[UART<br/>0x10000000]
MMIO --> GPIO[GPIO<br/>0x20000000]
MMIO --> Timer[Timer<br/>0x30000000]
MMIO --> Disk[Disk Controller<br/>0x40000000]

Example: UART Communication

#include <stdint.h>

// Memory-mapped UART registers
#define UART_BASE 0x10000000
#define UART_DATA (*(volatile uint32_t*)(UART_BASE + 0x00))
#define UART_STATUS (*(volatile uint32_t*)(UART_BASE + 0x04))
#define UART_CONTROL (*(volatile uint32_t*)(UART_BASE + 0x08))

#define UART_TX_READY (1 << 0)
#define UART_RX_READY (1 << 1)

void uart_send_char(char c) {
// Wait until transmitter ready
while (!(UART_STATUS & UART_TX_READY));

// Write character
UART_DATA = c;
}

char uart_recv_char() {
// Wait until data available
while (!(UART_STATUS & UART_RX_READY));

// Read character
return UART_DATA;
}

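Advantages:

  • Unified address space
  • Standard load/store instructions
  • Cacheable where appropriate (for example, frame buffers; control registers must stay uncached)

Disadvantages:

  • Address space consumption
  • Caching and compiler-optimization hazards (register regions must be mapped uncached, and accesses must go through volatile)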

Port-Mapped I/O (x86)

I/O ports live in a separate address space that is reached with dedicated instructions.

; x86 IN/OUT instructions
in al, 0x60 ; Read from port 0x60 (keyboard)
out 0x64, al ; Write to port 0x64 (keyboard controller)

in eax, dx ; Read from port in DX register
out dx, eax ; Write to port in DX register

Comparison:

| Feature | Memory-Mapped | Port-Mapped |
|---|---|---|
| Address space | Shared with RAM | Separate |
| Instructions | Load/Store | IN/OUT (x86) |
| Cache | Possible issue | Not cached |
| Examples | ARM, RISC-V | x86 (legacy) |

Interrupt Handling

Interrupt Types

graph TD
Interrupts[Interrupts] --> Hardware[Hardware Interrupts]
Interrupts --> Software[Software Interrupts]
Interrupts --> Exceptions[Exceptions]

Hardware --> External[External<br/>IRQ from devices]
Hardware --> Maskable[Maskable<br/>Can be disabled]
Hardware --> NonMaskable[Non-Maskable<br/>NMI]

Software --> SyscallInt[System Call<br/>int 0x80]

Exceptions --> Fault[Fault<br/>Page fault]
Exceptions --> Trap[Trap<br/>Breakpoint]
Exceptions --> Abort[Abort<br/>Hardware error]

Interrupt Controller

8259 PIC (Programmable Interrupt Controller) - Legacy

#define PIC1_COMMAND 0x20
#define PIC1_DATA 0x21
#define PIC2_COMMAND 0xA0
#define PIC2_DATA 0xA1

void pic_end_of_interrupt(uint8_t irq) {
if (irq >= 8) {
// Send EOI to slave
outb(PIC2_COMMAND, 0x20);
}
// Send EOI to master
outb(PIC1_COMMAND, 0x20);
}

APIC (Advanced Programmable Interrupt Controller) - Modern

#define APIC_BASE 0xFEE00000
#define APIC_EOI (APIC_BASE + 0xB0)

void apic_eoi() {
*(volatile uint32_t*)APIC_EOI = 0;
}

Interrupt Priority

graph TD
Priority[Interrupt Priority] --> P1[1. NMI<br/>Non-maskable]
Priority --> P2[2. Exceptions<br/>Faults, Traps]
Priority --> P3[3. Hardware IRQ<br/>By priority level]
Priority --> P4[4. Software<br/>int instruction]

P3 --> High[High: Timer, Keyboard]
P3 --> Low[Low: Disk, Network]
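
For hardware IRQs, picking the next interrupt to service comes down to a scan over the pending and mask bits. The sketch below assumes a flat 16-line controller where a lower IRQ number means higher priority. A cascaded 8259 pair orders its lines differently, and the function names are hypothetical.

```c
#include <stdint.h>

/* Return the highest-priority IRQ that is pending and not masked,
 * or -1 if there is none. Lower number = higher priority (assumption). */
int next_irq(uint16_t pending, uint16_t masked) {
    uint16_t ready = pending & (uint16_t)~masked;
    for (int irq = 0; irq < 16; irq++) {
        if (ready & (1u << irq)) return irq;
    }
    return -1;
}

/* Nesting rule: a new IRQ may preempt the one in service only if it has
 * strictly higher priority. in_service_irq < 0 means no ISR is running. */
int may_preempt(int new_irq, int in_service_irq) {
    return in_service_irq < 0 || new_irq < in_service_irq;
}
```

With IRQ 1 and IRQ 4 both pending, `next_irq` returns 1. If IRQ 1 is masked, it returns 4 instead.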

Nested Interrupts

sequenceDiagram
participant CPU
participant ISR1 as Low Priority ISR
participant ISR2 as High Priority ISR

Note over CPU: Running normal code

activate ISR1
Note over ISR1: Low priority interrupt

activate ISR2
Note over ISR2: High priority interrupt<br/>(preempts ISR1)
deactivate ISR2

Note over ISR1: Resume ISR1
deactivate ISR1

Note over CPU: Resume normal code

Enabling nested interrupts:

void isr_handler() {
// Save context
save_registers();

// Re-enable interrupts (allow nesting)
enable_interrupts();

// Handle interrupt
handle_device();

// Disable interrupts
disable_interrupts();

// Send EOI
apic_eoi();

// Restore context
restore_registers();
}

Bus Architecture

Bus Types

graph TB
Buses[Buses] --> System[System Bus]
Buses --> IO[I/O Bus]
Buses --> Expansion[Expansion Bus]

System --> FSB[Front-Side Bus<br/>CPU-Memory]
System --> Memory[Memory Bus]

IO --> PCI[PCI/PCIe]
IO --> SATA[SATA]
IO --> USB[USB]

Expansion --> PCIeSlots[PCIe Slots]
Expansion --> M2[M.2 Slots]

Bus Signals

Three kinds of signal lines:

  1. Data lines - carry the data being transferred
  2. Address lines - select the memory location or device register
  3. Control lines - Read/Write, Clock, etc.
graph LR
CPU[CPU] -->|Address| Bus[Bus]
CPU -->|Data| Bus
CPU -->|Control| Bus

Bus -->|Address| Memory[Memory]
Bus -->|Data| Memory
Bus -->|Control| Memory

Bus -->|Address| Device[I/O Device]
Bus -->|Data| Device
Bus -->|Control| Device

Bus Arbitration

When several devices want to use the bus at the same time, an arbitration scheme decides which one goes first:

1. Daisy Chain

graph LR
CPU[CPU/Arbiter] -->|Grant| D1[Device 1]
D1 -->|Grant| D2[Device 2]
D2 -->|Grant| D3[Device 3]

D1 -.Request.-> CPU
D2 -.Request.-> CPU
D3 -.Request.-> CPU

2. Centralized Arbitration

graph TB
Arbiter[Bus Arbiter]

D1[Device 1] -->|Request| Arbiter
D2[Device 2] -->|Request| Arbiter
D3[Device 3] -->|Request| Arbiter

Arbiter -->|Grant| D1
Arbiter -->|Grant| D2
Arbiter -->|Grant| D3

3. Distributed Arbitration

Each device takes part in the arbitration itself, with no central arbiter (for example, Ethernet CSMA/CD).
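
The difference between the first two schemes shows up in who wins when several devices request the bus at once. The sketch below models requests as a bitmask. Using round-robin as the central arbiter's policy is an assumption made for illustration (a central arbiter can implement other policies), and the function names are hypothetical.

```c
#include <stdint.h>

#define NUM_DEVICES 8

/* Daisy chain: the grant enters at device 0 and stops at the first
 * requester, so position in the chain fixes priority. Returns -1 if
 * nobody is requesting. */
int daisy_chain_grant(uint8_t requests) {
    for (int dev = 0; dev < NUM_DEVICES; dev++) {
        if (requests & (1u << dev)) return dev;
    }
    return -1;
}

/* Centralized round-robin: the search starts just after the previous
 * winner, so every requester is served in turn. Pass -1 as
 * last_granted when no device has been granted yet. */
int round_robin_grant(uint8_t requests, int last_granted) {
    for (int step = 1; step <= NUM_DEVICES; step++) {
        int dev = (last_granted + step) % NUM_DEVICES;
        if (requests & (1u << dev)) return dev;
    }
    return -1;
}
```

With devices 1 and 2 both requesting continuously, the daisy chain always grants device 1, so device 2 can starve. The round-robin arbiter alternates between them.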

PCIe (PCI Express)

A modern high-speed serial interconnect built from point-to-point links, each made up of one or more lanes.

PCIe Topology

graph TB
CPU[CPU] <--> Root[PCIe Root Complex]

Root <--> GPU[GPU<br/>x16]
Root <--> NVMe[NVMe SSD<br/>x4]
Root <--> Switch[PCIe Switch]

Switch <--> NIC[Network Card<br/>x1]
Switch <--> Sound[Sound Card<br/>x1]

PCIe Lanes

| Configuration | Lanes | Bandwidth (PCIe 3.0) | Bandwidth (PCIe 4.0) |
|---|---|---|---|
| x1 | 1 | ~1 GB/s | ~2 GB/s |
| x4 | 4 | ~4 GB/s | ~8 GB/s |
| x8 | 8 | ~8 GB/s | ~16 GB/s |
| x16 | 16 | ~16 GB/s | ~32 GB/s |

PCIe Generations:

graph LR
PCIe1[PCIe 1.0<br/>250 MB/s per lane] --> PCIe2[PCIe 2.0<br/>500 MB/s]
PCIe2 --> PCIe3[PCIe 3.0<br/>~1 GB/s]
PCIe3 --> PCIe4[PCIe 4.0<br/>~2 GB/s]
PCIe4 --> PCIe5[PCIe 5.0<br/>~4 GB/s]
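
The per-lane figures above follow from the raw line rate and the encoding overhead of each generation: 8b/10b for PCIe 1.0 and 2.0, 128b/130b from 3.0 onward. The sketch below reproduces them. The function names are hypothetical, and packet (TLP/DLLP) overhead is ignored, so real throughput is somewhat lower.

```c
/* Usable MB/s per lane, per direction, for PCIe 1.0 through 5.0.
 * Returns 0.0 for a generation outside that range. */
double pcie_lane_mb_per_s(int generation) {
    /* Raw line rate per lane in GT/s (one bit per transfer). */
    static const double gt_per_s[] = {2.5, 5.0, 8.0, 16.0, 32.0};
    if (generation < 1 || generation > 5) return 0.0;
    double rate = gt_per_s[generation - 1];
    /* Fraction of transmitted bits that carry payload. */
    double efficiency = (generation <= 2) ? 8.0 / 10.0 : 128.0 / 130.0;
    /* GT/s -> Mbit/s of payload -> MB/s. */
    return rate * 1000.0 * efficiency / 8.0;
}

/* Usable MB/s for a whole link with the given lane count. */
double pcie_link_mb_per_s(int generation, int lanes) {
    return pcie_lane_mb_per_s(generation) * lanes;
}
```

PCIe 3.0 works out to about 985 MB/s per lane, which is where the rounded "~1 GB/s" comes from. An x16 link of that generation carries roughly 15.75 GB/s in each direction.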

PCIe Configuration Space

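// PCI configuration space access through the legacy I/O ports 0xCF8/0xCFC.
// This reaches only the first 256 bytes of each function; the PCIe extended
// configuration space is accessed through memory-mapped ECAM instead.
uint32_t pcie_read_config(uint8_t bus, uint8_t device,
                          uint8_t function, uint8_t offset) {
    uint32_t address = (1u << 31) | ((uint32_t)bus << 16) |
                       ((uint32_t)(device & 0x1F) << 11) |
                       ((uint32_t)(function & 0x07) << 8) |
                       (offset & 0xFCu);
    outl(0xCF8, address);   // CONFIG_ADDRESS
    return inl(0xCFC);      // CONFIG_DATA
}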

Configuration registers:

  • Vendor ID, Device ID
  • Command, Status
  • Base Address Registers (BAR) - Memory-mapped I/O addresses
  • Interrupt Line

I/O Performance

Latency vs Throughput

graph TB
Metric[I/O Metrics] --> Latency[Latency<br/>Time for one operation]
Metric --> Throughput[Throughput<br/>Operations per second]
Metric --> IOPS[IOPS<br/>I/O Operations Per Second]
Metric --> Bandwidth[Bandwidth<br/>MB/s or GB/s]

Latency --> L1[Seek time<br/>Rotational latency<br/>Transfer time]

Throughput --> T1[Queue depth<br/>Parallelism<br/>Caching]
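
Latency, queue depth, IOPS, and bandwidth are tied together by Little's Law: throughput equals the number of requests in flight divided by the latency of each. The sketch below applies it under the assumption that latency stays constant as queue depth grows, which real devices only approximate. The function names are hypothetical.

```c
/* Little's Law: IOPS = requests in flight / latency per request.
 * latency_us is in microseconds. */
double iops_from_latency(double latency_us, unsigned queue_depth) {
    if (latency_us <= 0.0) return 0.0;
    return queue_depth * 1e6 / latency_us;
}

/* Bandwidth in MB/s for a given IOPS and request size in bytes. */
double bandwidth_mb_per_s(double iops, unsigned block_bytes) {
    return iops * block_bytes / 1e6;
}
```

A device with 100 µs latency delivers 10,000 IOPS at queue depth 1 and 320,000 IOPS at queue depth 32. With 4 KB requests, that second figure corresponds to about 1.3 GB/s.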

I/O Bottlenecks

1. CPU overhead

// High CPU usage
while (data_available()) {
process(read_data()); // Polling
}

2. Bus saturation

PCIe x1 bandwidth: ~1 GB/s
Multiple devices competing → bottleneck

3. Device speed

HDD: ~100 MB/s, ~100 IOPS
SSD (NVMe, PCIe 3.0): ~3000 MB/s, ~500k IOPS
SSD (NVMe, PCIe 4.0): ~7000 MB/s, ~1M IOPS

Optimizations

1. Batching

// Instead of:
for (int i = 0; i < 1000; i++) {
write_one_byte(data[i]); // 1000 I/O operations
}

// Do:
write_buffer(data, 1000); // 1 I/O operation
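
When the caller cannot hand over the whole buffer at once, a small staging buffer gives the same effect. The sketch below is one way to do it. `device_write` is a hypothetical stand-in for the real device write and only counts calls, and the 256-byte capacity is an arbitrary choice.

```c
#include <stddef.h>
#include <string.h>

#define BATCH_CAPACITY 256

/* Hypothetical sink: stands in for one real I/O operation. */
static size_t device_write_calls;
static size_t device_bytes_written;

static void device_write(const unsigned char *data, size_t len) {
    (void)data;
    device_write_calls++;
    device_bytes_written += len;
}

struct write_batcher {
    unsigned char buf[BATCH_CAPACITY];
    size_t used;  /* bytes currently staged */
};

/* Push any staged bytes to the device in a single operation. */
void batcher_flush(struct write_batcher *b) {
    if (b->used > 0) {
        device_write(b->buf, b->used);
        b->used = 0;
    }
}

/* Stage bytes, flushing automatically whenever the buffer fills. */
void batcher_write(struct write_batcher *b,
                   const unsigned char *data, size_t len) {
    while (len > 0) {
        size_t room = BATCH_CAPACITY - b->used;
        size_t n = len < room ? len : room;
        memcpy(b->buf + b->used, data, n);
        b->used += n;
        data += n;
        len -= n;
        if (b->used == BATCH_CAPACITY) batcher_flush(b);
    }
}
```

Writing 1000 single bytes through the batcher and flushing at the end produces 4 device operations instead of 1000. The final `batcher_flush` call is required, otherwise the last partial batch stays in memory.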

2. Async I/O

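// Linux io_uring (liburing); fd, buffers, size, and offset are set up elsewhere
#include <liburing.h>

struct io_uring ring;
io_uring_queue_init(128, &ring, 0);

// Queue multiple read requests, then submit them with one system call
for (int i = 0; i < 10; i++) {
    struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
    // Each request reads the next consecutive chunk of the file
    io_uring_prep_read(sqe, fd, buffers[i], size, offset + i * size);
}
io_uring_submit(&ring);

// Wait for all ten completions
for (int i = 0; i < 10; i++) {
    struct io_uring_cqe *cqe;
    if (io_uring_wait_cqe(&ring, &cqe) < 0) break;
    // cqe->res holds the byte count, or a negative errno on failure
    io_uring_cqe_seen(&ring, cqe);  // Mark the completion as consumed
}
io_uring_queue_exit(&ring);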

3. Zero-Copy

// sendfile() - kernel-to-kernel transfer
ssize_t sendfile(int out_fd, int in_fd, off_t *offset, size_t count);

// splice() - pipe-based zero-copy
ssize_t splice(int fd_in, loff_t *off_in, int fd_out,
loff_t *off_out, size_t len, unsigned int flags);

Real-World Examples

1. Network Card (NIC)

sequenceDiagram
participant App
participant Kernel
participant NIC
participant Network

App->>Kernel: send(socket, data)
Kernel->>Kernel: Copy to kernel buffer
Kernel->>NIC: DMA transfer
NIC->>Network: Transmit packet

Network->>NIC: Receive packet
NIC->>Kernel: DMA to ring buffer
NIC->>CPU: Interrupt (packet received)
Kernel->>App: recv() returns data

Ring buffer:

struct rx_descriptor {
uint64_t buffer_address;
uint16_t length;
uint16_t checksum;
uint8_t status;
};

struct rx_ring {
struct rx_descriptor descriptors[256];
uint16_t head;
uint16_t tail;
};
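
The head and tail indices are what turn the descriptor array into a ring. The sketch below is a single-threaded simulation under assumed conventions: the producer (the NIC on real hardware) advances `head`, the consumer (the driver) advances `tail`, and one slot is kept empty so a full ring can be told apart from an empty one. Real NICs differ in which side owns which index. The ring is shrunk to 8 entries to keep the example small.

```c
#include <stdint.h>

#define RING_SIZE 8  /* small for illustration */

struct rx_descriptor {
    uint64_t buffer_address;
    uint16_t length;
    uint8_t status;  /* 1 = descriptor holds a received packet */
};

struct rx_ring {
    struct rx_descriptor descriptors[RING_SIZE];
    uint16_t head;  /* next slot the producer fills */
    uint16_t tail;  /* next slot the consumer reads */
};

int ring_is_empty(const struct rx_ring *r) {
    return r->head == r->tail;
}

int ring_is_full(const struct rx_ring *r) {
    return (r->head + 1) % RING_SIZE == r->tail;
}

/* Producer side. Returns 0, or -1 if the ring is full (packet dropped). */
int ring_produce(struct rx_ring *r, uint64_t addr, uint16_t len) {
    if (ring_is_full(r)) return -1;
    r->descriptors[r->head] = (struct rx_descriptor){ addr, len, 1 };
    r->head = (uint16_t)((r->head + 1) % RING_SIZE);
    return 0;
}

/* Consumer side. Copies out the oldest descriptor. Returns 0, or -1 if empty. */
int ring_consume(struct rx_ring *r, struct rx_descriptor *out) {
    if (ring_is_empty(r)) return -1;
    *out = r->descriptors[r->tail];
    r->tail = (uint16_t)((r->tail + 1) % RING_SIZE);
    return 0;
}
```

If the driver falls behind, `ring_produce` starts failing, which corresponds to a NIC dropping packets when its receive ring is exhausted.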

2. Disk I/O (NVMe)

graph TB
App[Application] --> VFS[VFS Layer]
VFS --> FS[File System<br/>ext4, xfs]
FS --> Block[Block Layer]
Block --> NVMe[NVMe Driver]
NVMe --> Device[NVMe Device]

Device -.DMA.-> Memory[Memory]
Device -.Interrupt.-> CPU[CPU]

NVMe submission queue:

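struct nvme_command {
    uint8_t  opcode;
    uint8_t  flags;
    uint16_t command_id;
    uint32_t nsid;       // Namespace ID
    uint64_t reserved;   // Command dwords 2-3
    uint64_t metadata;   // Metadata pointer
    uint64_t prp1;       // Physical Region Page entry 1
    uint64_t prp2;       // Physical Region Page entry 2
    uint32_t cdw10[6];   // Command-specific fields (dwords 10-15)
};  // 64 bytes: the size of one submission queue entry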

void submit_nvme_read(uint64_t lba, uint32_t block_count) {
struct nvme_command cmd = {0};
cmd.opcode = NVME_CMD_READ;
cmd.nsid = 1;
cmd.prp1 = buffer_physical_address;
// ... set LBA and block count

// Write to submission queue
submission_queue[tail] = cmd;
tail = (tail + 1) % queue_size;

// Ring doorbell
writel(tail, nvme_doorbell_register);
}

Best Practices

  1. Use DMA for large transfers

    if (size > 4096) {
    dma_transfer(src, dst, size);
    } else {
    memcpy(dst, src, size);
    }
  2. Minimize interrupts

    • Interrupt coalescing
    • Polling mode for high-speed devices
  3. Async I/O

    • Overlap computation with I/O
    • Use io_uring for asynchronous I/O, or epoll/select for readiness notification
  4. Cache I/O requests

    • Page cache (Linux)
    • Write-back caching
  5. NUMA awareness

    // Allocate memory close to device
    numa_alloc_onnode(size, device_numa_node);
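
Interrupt coalescing, from item 2 above, can be sketched as a counter plus a deadline: raise one interrupt after a batch of packets, or after a maximum delay, whichever comes first. The structure and function names below are hypothetical, and time is passed in explicitly so the logic can be run without hardware.

```c
#include <stdint.h>

struct coalescer {
    uint32_t max_frames;         /* raise after this many pending packets */
    uint64_t max_delay_us;       /* or once the oldest pending packet is this old */
    uint32_t pending;            /* packets received since the last interrupt */
    uint64_t first_pending_us;   /* arrival time of the oldest pending packet */
    uint32_t interrupts_raised;  /* total interrupts delivered */
};

/* Deliver one interrupt covering every pending packet. */
static int coalescer_raise(struct coalescer *c) {
    c->pending = 0;
    c->interrupts_raised++;
    return 1;
}

/* Call for each received packet. Returns 1 if an interrupt fires now. */
int coalescer_on_packet(struct coalescer *c, uint64_t now_us) {
    if (c->pending == 0) c->first_pending_us = now_us;
    c->pending++;
    return c->pending >= c->max_frames ? coalescer_raise(c) : 0;
}

/* Call periodically. Returns 1 if the delay limit forces an interrupt. */
int coalescer_on_tick(struct coalescer *c, uint64_t now_us) {
    if (c->pending > 0 && now_us - c->first_pending_us >= c->max_delay_us) {
        return coalescer_raise(c);
    }
    return 0;
}
```

With `max_frames` set to 4, four back-to-back packets cost one interrupt instead of four. The delay limit bounds the extra latency a lone packet can suffer.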

Related Topics

  • Memory Hierarchy: DMA and cache coherency
  • Cache Memory: I/O buffer caching
  • Storage Systems: Disk controllers
  • Virtualization: I/O virtualization (SR-IOV)
  • Performance: I/O bottlenecks