EK9 Input Sanitization
EK9 implements a comprehensive "Rejection at the Source" security model that automatically detects and blocks common web attacks before malicious input can propagate through your application. This includes SQL injection, XSS, command injection, path traversal, XXE, and SSTI attacks.
The security framework is secure by default and requires no configuration, while remaining extensible for organizations that need custom security policies or specialized logging formats.
Contents
- The sanitized Parameter Modifier
- InputSanitizer Class
- Log Format Configuration
- SanitizationContext
- Integration Examples
- Best Practices
- Related Compiler Errors
The sanitized Parameter Modifier
The simplest way to add input sanitization is the sanitized modifier on incoming
String parameters. When a function or method is called with a sanitized parameter,
EK9 automatically creates a defensive copy of the String at the call site and sanitizes it
before passing it to the function.
#!ek9
defines module introduction
  defines function
    executeQuery()
      -> sql as sanitized String
      <- result as String?
      //The 'sql' parameter is automatically sanitized at the call site
      //before this function receives it - preventing SQL injection
      ...
    processUserInput()
      ->
        name as sanitized String
        comment as sanitized String
      <- success as Boolean?
      //Both parameters are sanitized before use
      ...
//EOF
How it works: The sanitized modifier triggers automatic sanitization:
- The original caller's String is never modified
- A defensive copy is created at the call site
- The copy is passed through the InputSanitizer threat detection
- If a threat is detected, the parameter becomes unset
- The function receives either the clean String or an unset value
Important Restrictions
- The sanitized modifier can only be used on incoming String parameters
- It cannot be used on fields/properties in classes, records, or other constructs
- It cannot be used on return parameters or local variables
- Direct assignment from a sanitized parameter to a local variable is blocked (see E07930)
- When overriding methods, the sanitized modifier must match between the super method and the override (Liskov Substitution Principle)
Target Sites vs Call Sites: The "Rejection at the Source" Philosophy
The sanitized modifier is a target site annotation — it belongs on
function/method parameter definitions where untrusted data first enters the system.
It cannot be used at:
- Call sites: myFunc(sanitized arg) is wrong. The callee defines sanitization, not the caller.
- Captures: (sanitized capturedVar) extends Handler is wrong. Captured variables are already in the trusted boundary.
- Variable declarations: name <- sanitized getValue() is wrong. The function returning the value should sanitize its parameters.
Why this design? Security reasoning becomes simpler when sanitization happens in one place:
- Entry points define where untrusted data enters
- Everything inside the trust boundary is already safe
- The compiler enforces sanitization automatically at call sites
This eliminates ambiguity about who is responsible for sanitization. The function/method author specifies requirements; the compiler enforces them at all call sites.
#!ek9
//WRONG: Trying to sanitize at call site
processInput(sanitized userInput) //E07942: Caller can't specify sanitization
//WRONG: Trying to sanitize at capture
handler <- (sanitized var) extends BaseHandler //E07941: Capture already trusted
//WRONG: Trying to sanitize at variable declaration
name <- sanitized getValue() //E07943: Wrong location for sanitization
//CORRECT: Sanitize at the definition (entry point)
processInput()
  -> data as sanitized String //OK: Target site annotation
  ...
Why Direct Assignment is Blocked
When you write local <- sanitizedParam, both variables point to the same
sanitized copy in memory. This creates "hidden aliasing" where mutations to one variable affect
the other — defeating the purpose of defensive copying.
#!ek9
//BAD: Creates hidden alias
processInput()
  -> input as sanitized String
  local <- input // ERROR: 'local' is alias to 'input'
  local += " modified" // Also modifies 'input'!
//GOOD: Explicit copy
processInput()
  -> input as sanitized String
  local <- String(input) // OK: Explicit copy constructor
  local += " modified" // Only affects 'local'
InputSanitizer Class
The InputSanitizer class is the core threat detection engine in EK9. It provides
programmatic access to the same threat detection used by the sanitized modifier,
allowing explicit control over when and how sanitization occurs.
This class is particularly useful when you need to:
- Log threats to a specific output destination
- Check safety without modifying the input
- Get detailed threat type information
- Handle environment variables (skip path traversal checks)
Constructors
#!ek9
//Default - always logs threats to stderr
sanitizer <- InputSanitizer()
//With custom reporter (alternative destination)
sanitizer <- InputSanitizer(Stderr()) //explicit stderr
logFile <- TextFile("/var/log/security.log").output()
sanitizer <- InputSanitizer(logFile) //log to file
//EOF
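The InputSanitizer always logs detected threats; logging cannot be disabled from code.
The only way to suppress output is the SILENT log format, which is intended for testing only.
By default, threats are logged to stderr. You can optionally provide a custom
StringOutput destination (file, stdout, etc.).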
The log format is controlled by the EK9_SANITIZER_LOG_FORMAT environment variable
(see Log Format Configuration).
Methods
#!ek9
defines module introduction
  defines program
    SanitizerExample()
      sanitizer <- InputSanitizer(Stderr())
      //Sanitize input - returns unset if threat detected
      userInput <- "some user input"
      result <- sanitizer.sanitize(userInput)
      if result?
        //Safe to use
        processData(result)
      else
        //Threat detected, handle appropriately
        handleThreatDetection()
      //Check safety without modification
      if sanitizer.isSafe(userInput)
        processData(userInput)
      //Get threat type(s) as comma-separated string
      threat <- sanitizer.detectThreat(userInput)
      if threat?
        logThreat(threat) //e.g., "SQL_INJECTION,XSS"
      //For environment variables (skip path traversal checks)
      envValue <- EnvVars().get("CONFIG_PATH")
      if envValue?
        cleanEnvValue <- sanitizer.sanitizeWithoutPathChecks(envValue.get())
//EOF
Threat Types Detected
The InputSanitizer detects the following threat categories, aligned with the
OWASP Top 10:
| Threat Type | Description | Example Pattern |
|---|---|---|
| SQL_INJECTION | SQL injection patterns | ' OR '1'='1, ; DROP TABLE |
| COMMAND_INJECTION | Shell/OS command injection | ; rm -rf /, \| cat /etc/passwd |
| PATH_TRAVERSAL | Directory traversal attacks | ../../../etc/passwd |
| XSS | Cross-Site Scripting | <script>alert('xss')</script> |
| XXE | XML External Entity attacks | <!ENTITY xxe SYSTEM "file://"> |
| SSTI | Server-Side Template Injection | {{7*7}}, ${T(java.lang.Runtime)} |
When multiple threats are detected in a single input, the threat types are returned as a
comma-separated string (e.g., "SQL_INJECTION,XSS").
Log Format Configuration
The InputSanitizer log format is controlled by the EK9_SANITIZER_LOG_FORMAT
environment variable. This allows you to integrate with your existing SIEM infrastructure without
any code changes.
# Set log format (default: JSON)
export EK9_SANITIZER_LOG_FORMAT=JSON # Simple JSON (universal, default)
export EK9_SANITIZER_LOG_FORMAT=ECS # Elastic Common Schema
export EK9_SANITIZER_LOG_FORMAT=CEF # Common Event Format
export EK9_SANITIZER_LOG_FORMAT=SIMPLE # Simple bracket format [THREAT_TYPE]
export EK9_SANITIZER_LOG_FORMAT=SILENT # Suppress all output (testing only)
JSON Format (Default)
Simple JSON format that works with any log shipper or monitoring tool:
{
  "timestamp": "2026-01-21T10:30:45.123Z",
  "level": "warn",
  "threat": "SQL_INJECTION",
  "message": "Dangerous input detected: ' OR '1'='1"
}
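On the consuming side, a log shipper or small script can pick these events out of a mixed stderr stream. The following Python sketch is illustrative only: it assumes each event is written as a single JSON object on one line (the example above is pretty-printed for readability), and the function name `parse_sanitizer_events` is hypothetical, not part of EK9.

```python
import json


def parse_sanitizer_events(lines):
    """Return the JSON sanitizer events found in an iterable of log lines.

    Assumption: one JSON object per line, carrying the "threat" field
    shown in the example above.
    """
    events = []
    for line in lines:
        text = line.strip()
        if not text.startswith("{"):
            continue  # ordinary application output, not a JSON record
        try:
            record = json.loads(text)
        except json.JSONDecodeError:
            continue  # partial or malformed line, skip it
        if isinstance(record, dict) and "threat" in record:
            events.append(record)
    return events


# Sample stream: one ordinary line plus one event built from the fields above.
sample_event = json.dumps({
    "timestamp": "2026-01-21T10:30:45.123Z",
    "level": "warn",
    "threat": "SQL_INJECTION",
    "message": "Dangerous input detected: ' OR '1'='1",
})
events = parse_sanitizer_events(["application started", sample_event])
```

Non-JSON lines and malformed records are skipped rather than raising, so ordinary application output on the same stream does not interrupt collection.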
ECS Format (Elastic Common Schema)
ECS-aligned JSON for Elasticsearch, Splunk, Datadog, CloudWatch, and Google Cloud Logging:
{
  "@timestamp": "2026-01-21T10:30:45.123Z",
  "log.level": "warn",
  "event.category": "intrusion_detection",
  "event.type": "denied",
  "event.action": "input_rejected",
  "threat.indicator.type": "SQL_INJECTION",
  "message": "Dangerous input detected",
  "source.ip": "192.168.1.100",
  "service.name": "UserService",
  "ek9.field.name": "userId",
  "ek9.input.value": "' OR '1'='1",
  "ecs.version": "8.11"
}
CEF Format (Common Event Format)
CEF for ArcSight, Azure Sentinel, QRadar, and LogRhythm:
CEF:0|EK9|InputSanitizer|1.0|SQL_INJECTION|Dangerous input detected|8|src=192.168.1.100 svc=UserService cs1=userId cs1Label=FieldName msg=' OR '1'='1
CEF Severity Mapping:
- SQL_INJECTION: Severity 8 (High)
- COMMAND_INJECTION: Severity 9 (Very High)
- PATH_TRAVERSAL: Severity 6 (Medium)
- XSS: Severity 7 (High)
- XXE: Severity 8 (High)
- SSTI: Severity 7 (High)
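For a downstream script that needs the severity without a full SIEM, the CEF header can be split on its pipe delimiters. This Python sketch is an assumption-laden illustration: `parse_cef_header` and the two lookup tables are hypothetical names, the tables simply restate the mapping listed above, and pipe characters escaped inside header fields (which the CEF format permits) are not handled.

```python
# Severity per threat type, restating the mapping listed above.
CEF_SEVERITY = {
    "SQL_INJECTION": 8,
    "COMMAND_INJECTION": 9,
    "PATH_TRAVERSAL": 6,
    "XSS": 7,
    "XXE": 8,
    "SSTI": 7,
}

# Labels exactly as given in the mapping above.
SEVERITY_LABEL = {6: "Medium", 7: "High", 8: "High", 9: "Very High"}


def parse_cef_header(line):
    """Split a CEF line into its seven header fields plus the raw extension."""
    # maxsplit=7 keeps any '|' inside the extension (e.g. in the rejected input).
    parts = line.split("|", 7)
    if len(parts) < 8 or not parts[0].startswith("CEF:"):
        raise ValueError("not a CEF line")
    return {
        "cef_version": parts[0][len("CEF:"):],
        "vendor": parts[1],
        "product": parts[2],
        "product_version": parts[3],
        "threat": parts[4],
        "name": parts[5],
        "severity": int(parts[6]),
        "extension": parts[7],
    }


SAMPLE = (
    "CEF:0|EK9|InputSanitizer|1.0|SQL_INJECTION|Dangerous input detected|8|"
    "src=192.168.1.100 svc=UserService cs1=userId cs1Label=FieldName "
    "msg=' OR '1'='1"
)
event = parse_cef_header(SAMPLE)
label = SEVERITY_LABEL[event["severity"]]
```

The extension is left as a raw string because the rejected input in `msg` may itself contain spaces and `=` characters, which makes naive key=value splitting unreliable.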
SIMPLE Format
Minimal bracket format for simple scripts, debugging, or when JSON parsing is overkill:
[SQL_INJECTION] Dangerous input detected: ' OR '1'='1
No timestamp, no context fields — just the threat type and input value.
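Because the line has only two parts, a single regular expression is enough to read it back in a script. The Python sketch below is a rough illustration: `parse_simple_line` is a hypothetical helper, and the handling of several comma-separated threat types inside the brackets is an assumption, since only the single-threat form is shown here.

```python
import re

# "[THREAT_TYPE] message"; comma-separated threat types are an assumption.
SIMPLE_LINE = re.compile(
    r"^\[(?P<threats>[A-Z_]+(?:,[A-Z_]+)*)\] (?P<message>.*)$"
)


def parse_simple_line(line):
    """Return (threat_types, message), or None if the line is not in SIMPLE format."""
    match = SIMPLE_LINE.match(line.rstrip("\n"))
    if match is None:
        return None
    return match.group("threats").split(","), match.group("message")


parsed = parse_simple_line("[SQL_INJECTION] Dangerous input detected: ' OR '1'='1")
```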
SILENT Format
Suppresses all sanitizer output. This is useful for testing where the sanitizer behavior
is not the focus of the test. For example, when testing bytecode generation for code that
uses the sanitized keyword, you may not want sanitizer logs polluting test output.
Warning: Do not use SILENT in production — you will lose visibility into attack attempts. This format is intended only for testing and development scenarios.
SanitizationContext
A record that captures metadata for security event logging. When you call
InputSanitizer methods with a SanitizationContext, the context
fields are included in the log output (for ECS and CEF formats).
#!ek9
defines module org.ek9.lang
  defines record
    SanitizationContext
      timestamp as DateTime // When sanitization occurred
      service as String // "UserService"
      operation as String // "getUser"
      fieldName as String // "userId"
      fieldSource as String // "PATH", "QUERY", "HEADER", "CONTENT"
      sourceIp as String // Client IP address
      traceId as String // Request correlation ID
//EOF
Using SanitizationContext with InputSanitizer:
#!ek9
defines module introduction
  defines program
    ContextExample()
      sanitizer <- InputSanitizer()
      //Create context with rich metadata
      context <- SanitizationContext(
        DateTime(),
        "UserService",
        "getUser",
        "userId",
        "PATH",
        "192.168.1.100",
        "trace-abc-123"
      )
      userInput <- "some user input"
      //Sanitize with context - logs include all context fields
      result <- sanitizer.sanitize(userInput, context)
      if result?
        processData(result)
      //Check safety with context
      if sanitizer.isSafe(userInput, context)
        processData(userInput)
//EOF
When context is provided, the log output includes all set fields. Fields that are not set are simply omitted from the log output. This provides rich metadata for security teams to investigate incidents, correlate attacks, and identify patterns.
Integration Examples
TextFile Automatic Sanitization
When reading files specified by user input, use sanitized paths to prevent path traversal:
#!ek9
defines module introduction
  defines function
    readUserFile()
      -> filename as sanitized String
      <- content as String?
      if filename?
        file <- TextFile(filename)
        if file.exists() and file.isReadable()
          content: file.readAll()
//EOF
Web Service Input Handling
In web services, use the sanitized modifier on path parameters, query parameters,
and request body fields:
#!ek9
defines module introduction
  defines service
    UserService :/users
      getUser() :/{userId} as GET
        -> userId as sanitized String
        <- response as HTTPResponse?
        //userId is automatically sanitized before reaching this method
        ...
      searchUsers() :/search as GET
        -> query as sanitized String
        <- response as HTTPResponse?
        //query parameter is sanitized
        ...
//EOF
Command-Line Argument Processing
When processing command-line arguments that will be used in file operations or external commands:
#!ek9
defines module introduction
  defines program
    ProcessFiles()
      -> argv as List of String
      sanitizer <- InputSanitizer(Stderr())
      for arg in argv
        cleanArg <- sanitizer.sanitize(arg)
        if cleanArg?
          processFile(cleanArg)
        else
          Stderr().println("Rejected potentially dangerous argument")
//EOF
Best Practices
Use the sanitized Modifier by Default
For any function or method that accepts user-provided String input, add the sanitized
modifier. This is the simplest and most effective defense:
#!ek9
//GOOD: Default to sanitized for user input
processUserData()
  -> input as sanitized String
  ...
//Only omit sanitized for trusted internal data
processInternalData()
  -> trustedInput as String
  ...
Handle Unset Results Gracefully
When sanitization detects a threat, the parameter becomes unset. Design your business logic to handle this case with appropriate error messages:
#!ek9
createUser()
  -> username as sanitized String
  <- result as Boolean?
  if username?
    //Safe to process
    result: doCreateUser(username)
  else
    //Threat detected - apply normal business validation message
    //This provides defense in depth without information leakage
    result: false
    logValidationError("Invalid username format")
Use InputSanitizer for Environment Variables
Environment variables may legitimately contain paths. Use sanitizeWithoutPathChecks()
for these cases:
#!ek9
sanitizer <- InputSanitizer(Stderr())
envVars <- EnvVars()
//Path traversal patterns are legitimate in env vars
configPath <- envVars.get("CONFIG_PATH")
if configPath?
  cleanPath <- sanitizer.sanitizeWithoutPathChecks(configPath.get())
  ...
Enterprise Logging Integration
For production systems, configure the log format via the EK9_SANITIZER_LOG_FORMAT
environment variable to integrate with your SIEM:
#Production deployment - route stderr to SIEM
export EK9_SANITIZER_LOG_FORMAT=ECS #For Elasticsearch/Splunk/Datadog
#or
export EK9_SANITIZER_LOG_FORMAT=CEF #For ArcSight/Azure Sentinel/QRadar
In production, route stderr to your log aggregator (Fluentd, Filebeat, etc.)
and the security events will automatically flow to your SIEM.
Related Compiler Errors
The following compiler errors relate to the sanitized modifier:
- E07910: Sanitized parameter must be String type. The sanitized modifier can only be applied to String parameters.
- E07920: Sanitized only valid on incoming parameters. Cannot use sanitized on return parameters, local variables, or fields.
- E07930: Cannot assign from sanitized parameter. Direct assignment creates hidden aliasing. Use explicit copy constructor instead.
- E07940: Sanitized modifier must match in overrides. When overriding a method, the sanitized modifier must match the super method (Liskov Substitution Principle).
- E07941: Sanitized not allowed in capture. Captured variables are already in the trusted boundary. Sanitize at the original entry point instead.
- E07942: Sanitized not allowed at call site. The function/method definition specifies sanitization, not the caller.
- E07943: Sanitized not allowed in variable declaration. Sanitize at the function/method entry point where data first enters the system.
Next Steps
For more details on related topics:
- Security Types — Overview of security components in the standard library
- Code Quality — How EK9 enforces quality at compile time
- Web Services — Building secure web services with EK9
- Components — Understanding the component model for enterprise extension