Large Language Models don’t generate text deterministically - they use probabilistic sampling to select the next token based on prediction probabilities. How these probabilities are filtered and adjusted before sampling significantly impacts output quality.
This guide explains the key sampling parameters, how they interact, and provides recommended settings for different use cases.
Framework Reference Last updated: November 2025
Parameter Comparison Parameter llama.cpp Default Ollama Default MLX Temperature --temp 0.8 temperature 0.8 --temp Top P --top-p 0.9 top_p 0.9 --top-p Min P --min-p 0.1 min_p 0.0 --min-p Top K --top-k 40 top_k 40 --top-k Repeat Penalty --repeat-penalty 1.0 repeat_penalty 1.1 Unsupported Repeat Last N --repeat-last-n 64 repeat_last_n 64 Unsupported Presence Penalty --presence-penalty 0.0 presence_penalty - Unsupported Frequency Penalty --frequency-penalty 0.0 frequency_penalty - Unsupported Mirostat --mirostat 0 mirostat 0 Unsupported Mirostat Tau --mirostat-ent 5.0 mirostat_tau 5.0 Unsupported Mirostat Eta --mirostat-lr 0.1 mirostat_eta 0.1 Unsupported Top N Sigma --top-nsigma -1.0 Unsupported - Unsupported Typical P --typical 1.0 typical_p 1.0 Unsupported XTC Probability --xtc-probability 0.0 Unsupported - --xtc-probability XTC Threshold --xtc-threshold 0.1 Unsupported - --xtc-threshold DRY Multiplier --dry-multiplier 0.0 Unsupported - Unsupported DRY Base --dry-base 1.75 Unsupported - Unsupported Dynamic Temp --dynatemp-range 0.0 Unsupported - Unsupported Seed --seed -1 seed 0 - Context Size --ctx-size 2048 num_ctx 2048 - Max Tokens --predict -1 num_predict -1 - Notable Default Differences Parameter llama.cpp Ollama Note min_p 0.1 0.0 Ollama disables Min P by default repeat_penalty 1.0 1.1 Ollama applies penalty by default seed -1 (random) 0 Different random behaviour Feature Support Feature llama.cpp Ollama MLX Core (temp, top_p, top_k, min_p) ✓ ✓ ✓ Repetition penalties ✓ ✓ ✗ Presence/frequency penalties ✓ ✓ ✗ Mirostat ✓ ✓ ✗ Advanced (DRY, XTC, typical, dynatemp) ✓ ✗ Partial Custom sampler ordering ✓ ✗ ✗ Core Sampling Parameters Temperature Controls the randomness of token selection by modifying the probability distribution before sampling.
...