Video Encoder

Bir video kodlayıcı (encoder), ham, sıkıştırılmamış video karelerini sıkıştırılmış dijital video formatına dönüştüren özel bir cihaz veya yazılım programıdır. Video kodlamanın temel amacı, kabul edilebilir bir görsel kalite seviyesini korurken video verilerinin dosya boyutunu küçültmek, böylece depolama, ağlar üzerinden iletim ve oynatma için uygun hale getirmektir.

Neden İhtiyaç Duyulur?
Sıkıştırılmamış video verileri aşırı derecede büyüktür. Örneğin, 30 kare/saniye (FPS) ve 24-bit renk derinliğine sahip bir 1080p video, saniyede yüzlerce megabayt tüketebilir. Sıkıştırma olmasaydı, yüksek çözünürlüklü video akışı veya kısa klipleri bile depolamak, bant genişliği sınırlamaları ve depolama maliyetleri nedeniyle pratik olmazdı. Video kodlayıcılar, gereksiz bilgileri ortadan kaldırmak için çeşitli algoritmalar kullanarak veri boyutunu önemli ölçüde küçültür.

Nasıl Çalışır (Temel Kavramlar):
Video kodlama, hem *uzaysal* (spatial) hem de *zamansal* (temporal) sıkıştırma tekniklerine dayanır ve genellikle bunları *kayıplı* (lossy) bir şekilde birleştirir (yani bazı bilgiler kalıcı olarak atılır).

1. Uzaysal Sıkıştırma (Kare içi - Intra-frame): Resim sıkıştırmaya (JPEG gibi) benzer şekilde, bu teknik tek bir video karesi içindeki fazlalığı azaltır.
* Renk Alanı Dönüşümü: Genellikle RGB'yi YUV'ye (Luma-Krominans) dönüştürür, burada insan gözü luma'ya (parlaklık) krominans'tan (renk) daha duyarlıdır. Krominans bileşenleri daha sonra alt örneklemeye tabi tutulabilir (örn. 4:2:0).
* Dönüşüm (Transform): Uzaysal piksel verilerini frekans alanı katsayılarına dönüştürmek için matematiksel dönüşümler (Ayrık Kosinüs Dönüşümü - DCT veya Dalgacık Dönüşümü) uygular. Bu, enerjiyi daha az katsayıda gruplandırır.
* Kuantizasyon (Quantization): Kayıplı sıkıştırmanın birincil *kayıp* kaynağıdır. Dönüşüm katsayılarının hassasiyetini, onları bir kuantizasyon adım boyutuna bölerek ve yuvarlayarak azaltır. Daha yüksek kuantizasyon, daha fazla sıkıştırma ancak daha düşük kalite anlamına gelir.
* Entropi Kodlama: Kuantize edilmiş katsayıların boyutunu daha da küçültmek için kayıpsız sıkıştırma teknikleri (Huffman kodlaması veya aritmetik kodlama gibi) uygular.

2. Zamansal Sıkıştırma (Kareler arası - Inter-frame): Bu teknik, art arda gelen video kareleri arasındaki fazlalığı kullanır. Çoğu video karesi, hemen önceki veya sonraki karelere çok benzerdir.
* Hareket Tahmini/Telafisi: Tüm yeni kareyi kodlamak yerine, kodlayıcı önceki veya sonraki karelerden hareket etmiş piksel bloklarını (makrobloklar veya daha küçük bölümler) bulmaya çalışır. Daha sonra sadece *hareket vektörlerini* (hareketin yönünü ve mesafesini gösteren) ve *kalan farkı* (hareketle açıklanamayan küçük değişiklikler) kodlar.
* Kare Tipleri (GOP - Resim Grupları):
* I-frame (Intra-coded frame): Tamamen bağımsız bir kare olup, yalnızca uzaysal sıkıştırma kullanılarak, tek başına bir JPEG görüntüsü gibi kodlanır. Video akışında rastgele erişim noktaları olarak görev yaparlar.
* P-frame (Predicted frame): Önceki I veya P karelerine başvurarak kodlanır. Geçmiş bir referans kareye göre hareket vektörlerini ve kalıntıları depolar.
* B-frame (Bi-directionally predicted frame): Hem önceki hem de sonraki I veya P karelerine başvurarak kodlanır, genellikle en yüksek sıkıştırma oranını sağlar ancak daha karmaşık kod çözme gerektirir.
* Bir GOP tipik olarak bir I-kare ile başlayan, ardından P ve B karelerin geldiği bir dizidir.

Temel Kavramlar:
* Kodek (Coder-Decoder): Dijital verileri kodlamak ve kod çözmek için kullanılan bir algoritma veya cihaz. Yaygın video kodekleri arasında H.264 (AVC), H.265 (HEVC), VP9 ve AV1 bulunur.
* Bit Hızı (Bit Rate): Birim zamanda kodlanan veri miktarı, genellikle saniyede bit (bps) olarak ölçülür. Daha yüksek bit hızları genellikle daha yüksek kalite ancak daha büyük dosya boyutları anlamına gelir.
* Çözünürlük (Resolution): Video karesinin boyutları (örn. Full HD için 1920x1080).
* Kare Hızı (Frame Rate): Saniyede görüntülenen kare sayısı (örn. 24, 30, 60 FPS).

Yazılım vs. Donanım Kodlayıcılar:
* Yazılım Kodlayıcılar: Genel amaçlı bir CPU üzerinde çalışır. Yüksek esneklik, kalite ayarları ve kodek desteği sunarlar (örn. x264, x265 kütüphaneleri). Çok CPU yoğun olabilirler.
* Donanım Kodlayıcılar: Özel çiplerdir (örn. Nvidia GPU'lar için NVENC, AMD için VCE/AMF, Intel için Quick Sync Video, mobil cihazlarda ASIC'ler). Hız ve güç verimliliği için oldukça optimize edilmişlerdir ancak daha az ayarlama seçeneği sunabilir ve bazen aynı bit hızında en iyi yazılım kodlayıcılardan biraz daha düşük kalite sağlayabilirler.

Yaygın Kütüphaneler/Çerçeveler:
* FFmpeg: Yüksek düzeyde optimize edilmiş yazılım kodlayıcılar (örn. `libx264`, `libx265`, `libvpx`, `libaom`) içeren ve donanım kodlayıcılara arayüz sağlayan güçlü, açık kaynaklı bir multimedya çerçevesidir.
* GStreamer: Kodlama görevleri için kullanılabilen, genellikle FFmpeg veya donanım kodlayıcıları eklenti olarak kullanan, ardışık düzen tabanlı bir multimedya çerçevesidir.
* VAAPI (Video Hızlandırma API'si) / NVENC (NVIDIA Kodlayıcı) / AMF (AMD Medya Çerçevesi): Yazılımların altta yatan donanım video kodlayıcılarını kullanmasına izin veren API'ler.

Bir video kodlayıcı, dijital video iş akışında kritik bir bileşendir ve çeşitli platformlarda ve cihazlarda video içeriğinin verimli bir şekilde oluşturulmasını, dağıtılmasını ve tüketilmesini sağlar.

Example Code

```rust
use ffmpeg_next as ffmpeg;
use std::{path::Path, error::Error, fs};

// Function to generate a simple gradient frame (YUV 4:2:0)
// This simulates receiving raw video frames.
fn generate_gradient_frame(
    width: i32,
    height: i32,
    frame_number: usize,
) -> Result<ffmpeg::frame::Video, Box<dyn Error>> {
    // Create a new video frame with YUV420P pixel format
    let mut frame = ffmpeg::frame::Video::new(
        ffmpeg::format::Pixel::YUV420P,
        width,
        height,
    );
    // Set the presentation timestamp (PTS) for the frame
    frame.set_pts(Some(frame_number as i64));

    // Get mutable references to the data planes and their strides
    // YUV420P has three planes: Y (Luma), U (Chroma Blue), V (Chroma Red)
    // Y plane is full resolution, U and V planes are half width and half height.

    // Fill Y (Luma) plane: a simple vertical gradient from black to white
    let y_stride = frame.stride(0); // Stride for Y plane
    let y_data = frame.data_mut(0); // Data for Y plane
    for y in 0..height {
        for x in 0..width {
            // Calculate a gradient value based on x-coordinate
            let value = ((x as f32 / width as f32) * 255.0) as u8;
            y_data[(y * y_stride + x) as usize] = value;
        }
    }

    // Fill U and V (Chroma) planes: simple oscillating constant values for demonstration
    // U and V planes are typically quarter size for YUV420P (half width, half height)
    let uv_width = width / 2;
    let uv_height = height / 2;
    let u_stride = frame.stride(1); // Stride for U plane
    let u_data = frame.data_mut(1); // Data for U plane
    let v_stride = frame.stride(2); // Stride for V plane
    let v_data = frame.data_mut(2); // Data for V plane

    // Calculate oscillating U and V values to make the color change over time
    let u_val = (128.0 + (frame_number as f32 * 0.5).sin() * 60.0) as u8;
    let v_val = (128.0 + (frame_number as f32 * 0.5).cos() * 60.0) as u8;

    for y in 0..uv_height {
        for x in 0..uv_width {
            u_data[(y * u_stride + x) as usize] = u_val;
            v_data[(y * v_stride + x) as usize] = v_val;
        }
    }

    Ok(frame)
}

/
 * Encodes a series of generated video frames into an MP4 file using the H.264 codec.
 * 
 * This function demonstrates the core steps of video encoding using the `ffmpeg-next` crate:
 * 1. Initialize FFmpeg.
 * 2. Create an output format context for the MP4 file.
 * 3. Find and configure a video encoder (H.264 in this case).
 * 4. Add a video stream to the output context.
 * 5. Write the file header.
 * 6. Generate raw video frames and send them to the encoder.
 * 7. Receive compressed packets from the encoder and write them to the output file.
 * 8. Flush the encoder to ensure all buffered frames are processed.
 * 9. Write the file trailer.
 */
pub fn encode_video(output_path: &Path, width: i32, height: i32, frames: usize, fps: i32) -> Result<(), Box<dyn Error>> {
    // Initialize FFmpeg library
    ffmpeg::init()?;
    // Set FFmpeg logging level for verbose output
    ffmpeg::log::set_level(ffmpeg::log::Level::Info);

    // Create an output format context for the specified output path
    // FFmpeg determines the format (e.g., MP4) from the file extension.
    let mut octx = ffmpeg::format::context::Output::new(output_path)?; 

    // Find the H.264 video encoder by its ID
    let encoder_id = ffmpeg::codec::Id::H264;
    let encoder_codec = ffmpeg::encoder::find(encoder_id)
        .ok_or("H264 encoder not found. Ensure FFmpeg is built with libx264.")?;

    // Add a new video stream to the output context using the found encoder
    let mut stream = octx.add_stream(encoder_codec)?;

    // Create a new codec context to configure the encoder
    let mut codec_context = ffmpeg::codec::context::Context::new();
    codec_context.set_width(width); // Set video width
    codec_context.set_height(height); // Set video height
    codec_context.set_pixel_format(ffmpeg::format::Pixel::YUV420P); // Set pixel format (YUV 4:2:0 is common)
    codec_context.set_time_base(ffmpeg::media::TimeBase::new(1, fps)); // Set time base (e.g., 1/30 for 30 FPS)
    codec_context.set_frame_rate(Some(fps)); // Set desired frame rate
    codec_context.set_bit_rate(400_000); // Set target bitrate in bits per second (400 kbps)
    codec_context.set_gop_size(10); // Group of Pictures (GOP) size: Keyframe interval
    codec_context.set_max_b_frames(1); // Maximum number of B-frames between I/P frames
    // Set GLOBAL_HEADER flag, often required for formats like MP4 for proper stream headers
    codec_context.set_flags(ffmpeg::codec::Flags::GLOBAL_HEADER);

    // Create and open the video encoder using the configured context
    // .open_as() allows passing codec-specific options. Here we use automatic threading.
    let mut encoder = codec_context.encoder().video()?;
    encoder.open_as(encoder_codec, ffmpeg::codec::threading::Type::Auto)?; 

    // Apply the configured encoder parameters to the output stream
    stream.set_parameters(encoder.parameters());

    // Write the file header (metadata, stream info) to the output file
    octx.write_header()?;

    // Loop through and encode each frame
    for i in 0..frames {
        // Generate a dummy video frame
        let frame = generate_gradient_frame(width, height, i)?;

        // Send the raw frame to the encoder
        encoder.send_frame(&frame)?; 

        // Receive compressed packets from the encoder until it runs out
        let mut packet = ffmpeg::packet::Packet::empty();
        while encoder.receive_packet(&mut packet).is_ok() {
            // Set the stream index for the packet (important for muxing multiple streams)
            packet.set_stream_index(stream.index());
            // Write the compressed packet to the output file
            packet.write_interleaved(&mut octx)?;
        }
    }

    // Flush the encoder: Send a null frame (EOF) to ensure all buffered frames are processed
    encoder.send_eof()?; 
    let mut packet = ffmpeg::packet::Packet::empty();
    while encoder.receive_packet(&mut packet).is_ok() {
        packet.set_stream_index(stream.index());
        packet.write_interleaved(&mut octx)?;
    }

    // Write the file trailer (e.g., index table) to finalize the output file
    octx.write_trailer()?; 

    println!("Video successfully encoded to: {:?}", output_path);
    Ok(())
}

fn main() {
    let output_file = Path::new("output_video.mp4");
    let width = 640;
    let height = 480;
    let num_frames = 150; // Total frames (e.g., 5 seconds at 30 fps)
    let frame_rate = 30; // Frames per second

    // Clean up any previous output file to ensure a fresh run
    if output_file.exists() {
        fs::remove_file(output_file).unwrap();
    }

    // Call the video encoding function and handle potential errors
    match encode_video(output_file, width, height, num_frames, frame_rate) {
        Ok(_) => println!("Encoding process completed successfully."),
        Err(e) => eprintln!("Error during video encoding: {}", e),
    }
}
```

Example Code

Related Topics