Understanding Results
This guide explains the structure of Magika.jl results and how to interpret them effectively.
Result Structure Overview
A MagikaResult contains three main components:
- Path information: The file path or identifier that was analyzed
- Status: Whether the identification was successful
- Prediction details (if successful): Comprehensive information about the detected content type
Checking Result Status
Always check the status before accessing prediction details:
# `m` is the object passed as the first argument, e.g. `m = MagikaConfig()`
result = identify_path(m, "file.txt")
if is_ok(result)
    # Safe to access prediction details
    println("Detection successful!")
else
    println("Detection failed with status: $(result.status)")
end
Status Types
Magika.jl defines several status codes:
- OK: Detection was successful
- FILE_NOT_FOUND_ERROR: The file path doesn't exist
- PERMISSION_ERROR: Unable to access the file due to permissions
- UNKNOWN_ERROR: An unexpected error occurred
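One way to act on these codes is to map each one to a follow-up action. The sketch below is only an illustration: `status_action` is a hypothetical helper, not part of Magika.jl, and it matches on the printed name of the status (via `string`) so that it runs without the package loaded. With the package available, you would likely compare against the status constants directly.

```julia
# Hypothetical helper (not part of Magika.jl): map a status code to a
# suggested follow-up. It matches on the status name, so it accepts an enum
# value, a Symbol, or a String.
function status_action(status)
    name = string(status)
    if name == "OK"
        return "use result.prediction"
    elseif name == "FILE_NOT_FOUND_ERROR"
        return "check the path and retry"
    elseif name == "PERMISSION_ERROR"
        return "fix file permissions or skip the file"
    else
        # Covers UNKNOWN_ERROR and any status not listed above
        return "log the failure and skip the file"
    end
end

println(status_action(:PERMISSION_ERROR))
```

Keeping a catch-all branch means a status added in a later release still produces a sensible action instead of an error.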
Prediction Details
When status is OK, the result.prediction field contains:
1. Deep Learning Output (dl)
The raw prediction from the neural network model:
- label: The content type label (e.g., python, jpeg)
- description: Human-readable description (e.g., "Python source")
- mime_type: Standard MIME type (e.g., "text/x-python")
- group: Content category (e.g., "code", "document", "image")
- extensions: Common file extensions for this type
- is_text: Whether this is a text-based format
2. Final Output (output)
The final content type after applying confidence thresholds and overwrite rules:
- Same fields as dl, but potentially modified by the confidence thresholds or overwrite rules
3. Confidence Score
- score: A float between 0.0 and 1.0 indicating prediction confidence
- Higher scores indicate more reliable predictions
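If your application needs a coarse level rather than a raw number, the score can be bucketed. The following is a minimal sketch: `confidence_level` is a hypothetical helper, and its cut-offs (0.95, 0.85, 0.7) are illustrative choices, not thresholds defined by Magika.jl.

```julia
# Hypothetical helper: bucket a confidence score into a coarse level.
# The cut-offs are illustrative, not values defined by Magika.jl.
function confidence_level(score::Real)
    0.0 <= score <= 1.0 ||
        throw(ArgumentError("score must be in [0.0, 1.0], got $(score)"))
    score > 0.95 && return :very_high
    score > 0.85 && return :high
    score > 0.7 && return :medium
    return :low
end

for s in (0.99, 0.9, 0.75, 0.4)
    println(s, " => ", confidence_level(s))
end
```

Returning a Symbol keeps the policy in one place, so callers can branch on the level without repeating numeric thresholds.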
4. Overwrite Reason
Explains why the final output might differ from the raw model prediction:
- NONE: No overwrite occurred
- LOW_CONFIDENCE: Downgraded due to low confidence score
- OVERWRITE_MAP: Changed according to predefined mapping rules
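The relationship between the raw prediction, the final output, and the overwrite reason can be summarized in one line per file. This is a sketch under assumptions: `overwrite_summary` is a hypothetical helper, it matches the reason by name so it runs without the package, and the sample `pred` is a stand-in NamedTuple shaped like the fields described above rather than a real Magika.jl value.

```julia
# Hypothetical helper: describe how the final output relates to the raw model
# prediction. `pred` is any object with `dl`, `output`, and `overwrite_reason`
# fields, following the layout described above.
function overwrite_summary(pred)
    reason = string(pred.overwrite_reason)
    raw = pred.dl.label
    final = pred.output.label
    if reason == "NONE"
        return "kept raw prediction '$(raw)'"
    elseif reason == "LOW_CONFIDENCE"
        return "downgraded '$(raw)' to '$(final)' (low confidence)"
    elseif reason == "OVERWRITE_MAP"
        return "remapped '$(raw)' to '$(final)' (mapping rule)"
    else
        return "changed '$(raw)' to '$(final)' (reason: $(reason))"
    end
end

# Stand-in data shaped like a prediction; a NamedTuple keeps the sketch
# runnable without the package.
pred = (dl = (label = "python",), output = (label = "txt",),
        overwrite_reason = :LOW_CONFIDENCE, score = 0.65)
println(overwrite_summary(pred))
```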
Example: Complete Result Analysis
function analyze_result(result::MagikaResult)
    println("Path: $(result.path)")
    # Prediction details are only available when identification succeeded
    if !is_ok(result)
        println("Status: $(result.status)")
        return
    end
    pred = result.prediction
    println("\nRaw model prediction (dl):")
    println(" Label: $(pred.dl.label)")
    println(" Description: $(pred.dl.description)")
    println(" MIME type: $(pred.dl.mime_type)")
    println(" Group: $(pred.dl.group)")
    println(" Extensions: $(join(pred.dl.extensions, ", "))")
    println(" Is text: $(pred.dl.is_text)")
    println("\nFinal output:")
    println(" Label: $(pred.output.label)")
    println(" Description: $(pred.output.description)")
    println("\nConfidence score: $(pred.score)")
    println("\nOverwrite reason: $(pred.overwrite_reason)")
    if pred.overwrite_reason != NONE
        println(" Note: The raw prediction was modified because of low confidence")
        println(" or predefined mapping rules. Consider the confidence score")
        println(" when making decisions based on this result.")
    end
    # Practical interpretation (illustrative thresholds)
    if pred.score > 0.95
        println("\nInterpretation: Very high confidence prediction")
    elseif pred.score > 0.85
        println("\nInterpretation: High confidence prediction")
    elseif pred.score > 0.7
        println("\nInterpretation: Medium confidence - generally reliable")
    else
        println("\nInterpretation: Low confidence - consider verification")
    end
    # Security considerations
    if pred.overwrite_reason == LOW_CONFIDENCE && pred.output.is_text
        println("\nSecurity note: The model's specific prediction was downgraded")
        println("to a generic text type due to low confidence, so the true format is uncertain.")
        println("Exercise caution when processing this file.")
    end
end
# Usage example
result = identify_path(MagikaConfig(), "unknown_file")
analyze_result(result)
Common Result Patterns
1. High-Confidence Detection
# Raw prediction and final output are the same
# High confidence score (>0.9)
# Overwrite reason is NONE
2. Low-Confidence Text File
# Raw prediction might be "python" but final output is "txt"
# Confidence score is medium (0.6-0.8)
# Overwrite reason is LOW_CONFIDENCE
3. Low-Confidence Binary File
# Raw prediction might be "pebin" but final output is "unknown"
# Confidence score is low (<0.6)
# Overwrite reason is LOW_CONFIDENCE
4. Mapping Override
# Raw prediction might be "windows_dll" but final output is "dll"
# Overwrite reason is OVERWRITE_MAP
# This happens due to predefined content type mappings
Understanding these patterns helps you make informed decisions about how to handle different file types in your applications while maintaining appropriate security postures.
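The four patterns above can be told apart programmatically from the overwrite reason and the final output. The sketch below is an assumption-based illustration: `result_pattern` is a hypothetical helper, the pattern names are invented for this example, and the sample predictions are stand-in NamedTuples rather than real Magika.jl values. The score ranges quoted in the patterns are indicative, so the helper does not check them.

```julia
# Hypothetical helper: name which of the four patterns a prediction matches.
# `pred` needs `dl.label`, `output.label`, `output.is_text`, and
# `overwrite_reason`; the reason is matched by name.
function result_pattern(pred)
    reason = string(pred.overwrite_reason)
    if reason == "OVERWRITE_MAP"
        return :mapping_override
    elseif reason == "LOW_CONFIDENCE"
        # Text predictions fall back to a generic text label,
        # binary predictions to an unknown label.
        return pred.output.is_text ? :low_confidence_text : :low_confidence_binary
    elseif reason == "NONE" && pred.dl.label == pred.output.label
        return :high_confidence
    end
    return :other
end

# Stand-in predictions mirroring the four patterns
samples = (
    (dl = (label = "jpeg",), output = (label = "jpeg", is_text = false), overwrite_reason = :NONE),
    (dl = (label = "python",), output = (label = "txt", is_text = true), overwrite_reason = :LOW_CONFIDENCE),
    (dl = (label = "pebin",), output = (label = "unknown", is_text = false), overwrite_reason = :LOW_CONFIDENCE),
    (dl = (label = "windows_dll",), output = (label = "dll", is_text = false), overwrite_reason = :OVERWRITE_MAP),
)

for s in samples
    println(s.dl.label, " -> ", s.output.label, ": ", result_pattern(s))
end
```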