EK9 Security: Input Sanitization and Sensitive Data
EK9 provides a comprehensive security framework with two complementary layers:
- Input Sanitization: runtime detection and blocking of common web attacks (SQL injection, XSS, command injection, path traversal, XXE, SSTI) using the sanitized modifier and InputSanitizer class
- Sensitive Data Protection: compile-time detection of hardcoded secrets (100+ credential patterns) plus the Sensitive built-in type for safe runtime credential handling with automatic redaction
Both layers follow a "Rejection at the Source" philosophy and are secure by default with zero configuration required, while remaining enterprise extensible for organizations that need custom security policies or specialized logging formats.
Contents
- The sanitized Parameter Modifier
- InputSanitizer Class
- Log Format Configuration
- SanitizationContext
- Integration Examples
- Best Practices
- Sensitive Type and Secret Detection
- Related Compiler Errors
The sanitized Parameter Modifier
The simplest way to add input sanitization is the sanitized modifier on incoming
String parameters. When a function or method is called with a sanitized parameter,
EK9 automatically creates a defensive copy of the String at the call site and sanitizes it
before passing it to the function.
#!ek9
defines module introduction

  defines function

    executeQuery()
      -> sql as sanitized String
      <- result as String?
      //The 'sql' parameter is automatically sanitized at the call site
      //before this function receives it - preventing SQL injection
      ...

    processUserInput()
      ->
        name as sanitized String
        comment as sanitized String
      <- success as Boolean?
      //Both parameters are sanitized before use
      ...

//EOF
How it works: The sanitized modifier triggers automatic sanitization:
- The original caller's String is never modified
- A defensive copy is created at the call site
- The copy is passed through the InputSanitizer threat detection
- If a threat is detected, the parameter becomes unset
- The function receives either the clean String or an unset value
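The steps above can be modelled outside EK9. The following is a minimal Python sketch of the call-site behaviour only: `sanitized_copy`, `call_with_sanitized`, and the two toy patterns are hypothetical names and rules chosen for illustration, and `None` stands in for an unset value. It does not reproduce the checks the real InputSanitizer performs.

```python
import re
from typing import Callable, Optional

# Toy stand-in for threat detection. These two patterns are illustrative
# only and are NOT the rules applied by EK9's InputSanitizer.
_TOY_PATTERNS = [
    re.compile(r"'\s*OR\s*'1'\s*=\s*'1", re.IGNORECASE),  # SQL injection sample
    re.compile(r"<script\b", re.IGNORECASE),              # XSS sample
]


def sanitized_copy(value: str) -> Optional[str]:
    """Return the checked value, or None (standing in for 'unset')."""
    # Python strings are immutable, so the caller's variable cannot be
    # changed through this reference; no physical copy is needed here.
    candidate = value
    if any(p.search(candidate) for p in _TOY_PATTERNS):
        return None
    return candidate


def call_with_sanitized(func: Callable[[Optional[str]], str], arg: str) -> str:
    """Model of the call site: check first, then invoke the target."""
    return func(sanitized_copy(arg))


def execute_query(sql: Optional[str]) -> str:
    # The callee sees either a clean string or None, never the raw threat.
    return "rejected" if sql is None else f"running: {sql}"
```

The point of the sketch is the ordering: the check runs before the callee's body, so the callee only has to handle the two outcomes "clean value" and "unset".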
Important Restrictions
- The sanitized modifier can only be used on incoming String parameters
- It cannot be used on fields/properties in classes, records, or other constructs
- It cannot be used on return parameters or local variables
- Direct assignment from a sanitized parameter to a local variable is blocked (see E07930)
- When overriding methods, the sanitized modifier must match between the super method and the override (Liskov Substitution Principle)
Target Sites vs Call Sites: The "Rejection at the Source" Philosophy
The sanitized modifier is a target site annotation — it belongs on
function/method parameter definitions where untrusted data first enters the system.
It cannot be used at:
- Call sites: myFunc(sanitized arg). Wrong! The callee defines sanitization, not the caller.
- Captures: (sanitized capturedVar) extends Handler. Wrong! Captured variables are already in the trusted boundary.
- Variable declarations: name <- sanitized getValue(). Wrong! The function returning the value should sanitize its parameters.
Why this design? Security reasoning becomes simpler when sanitization happens in one place:
- Entry points define where untrusted data enters
- Everything inside the trust boundary is already safe
- The compiler enforces sanitization automatically at call sites
This eliminates ambiguity about who is responsible for sanitization. The function/method author specifies requirements; the compiler enforces them at all call sites.
#!ek9
//WRONG: Trying to sanitize at call site
processInput(sanitized userInput) //E07942: Caller can't specify sanitization

//WRONG: Trying to sanitize at capture
handler <- (sanitized var) extends BaseHandler //E07941: Capture already trusted

//WRONG: Trying to sanitize at variable declaration
name <- sanitized getValue() //E07943: Wrong location for sanitization

//CORRECT: Sanitize at the definition (entry point)
processInput()
  -> data as sanitized String //OK: Target site annotation
  ...
Why Direct Assignment is Blocked
When you write local <- sanitizedParam, both variables point to the same
sanitized copy in memory. This creates "hidden aliasing" where mutations to one variable affect
the other — defeating the purpose of defensive copying.
#!ek9
//BAD: Creates hidden alias
processInput()
  -> input as sanitized String
  local <- input // ERROR: 'local' is alias to 'input'
  local += " modified" // Also modifies 'input'!

//GOOD: Explicit copy
processInput()
  -> input as sanitized String
  local <- String(input) // OK: Explicit copy constructor
  local += " modified" // Only affects 'local'
InputSanitizer Class
The InputSanitizer class is the core threat detection engine in EK9. It provides
programmatic access to the same threat detection used by the sanitized modifier,
allowing explicit control over when and how sanitization occurs.
This class is particularly useful when you need to:
- Log threats to a specific output destination
- Check safety without modifying the input
- Get detailed threat type information
- Handle environment variables (skip path traversal checks)
Constructors
#!ek9
//Default - always logs threats to stderr
sanitizer <- InputSanitizer()
//With custom reporter (alternative destination)
sanitizer <- InputSanitizer(Stderr()) //explicit stderr
logFile <- TextFile("/var/log/security.log").output()
sanitizer <- InputSanitizer(logFile) //log to file
//EOF
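The InputSanitizer always logs detected threats; logging cannot be disabled from code. The only way to suppress output is the SILENT log format, which is intended for testing (see Log Format Configuration).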
By default, threats are logged to stderr. You can optionally provide a custom
StringOutput destination (file, stdout, etc.).
The log format is controlled by the EK9_SANITIZER_LOG_FORMAT environment variable
(see Log Format Configuration).
Methods
#!ek9
defines module introduction

  defines program

    SanitizerExample()
      sanitizer <- InputSanitizer(Stderr())

      //Sanitize input - returns unset if threat detected
      userInput <- "some user input"
      result <- sanitizer.sanitize(userInput)
      if result?
        //Safe to use
        processData(result)
      else
        //Threat detected, handle appropriately
        handleThreatDetection()

      //Check safety without modification
      if sanitizer.isSafe(userInput)
        processData(userInput)

      //Get threat type(s) as comma-separated string
      threat <- sanitizer.detectThreat(userInput)
      if threat?
        logThreat(threat) //e.g., "SQL_INJECTION,XSS"

      //For environment variables (skip path traversal checks)
      envValue <- EnvVars().get("CONFIG_PATH")
      if envValue?
        cleanEnvValue <- sanitizer.sanitizeWithoutPathChecks(envValue.get())

//EOF
Threat Types Detected
The InputSanitizer detects the following threat categories, aligned with the
OWASP Top 10:
| Threat Type | Description | Example Pattern |
|---|---|---|
| SQL_INJECTION | SQL injection patterns | ' OR '1'='1, ; DROP TABLE |
| COMMAND_INJECTION | Shell/OS command injection | ; rm -rf /, \| cat /etc/passwd |
| PATH_TRAVERSAL | Directory traversal attacks | ../../../etc/passwd |
| XSS | Cross-Site Scripting | <script>alert('xss')</script> |
| XXE | XML External Entity attacks | <!ENTITY xxe SYSTEM "file://"> |
| SSTI | Server-Side Template Injection | {{7*7}}, ${T(java.lang.Runtime)} |
When multiple threats are detected in a single input, the threat types are returned as a
comma-separated string (e.g., "SQL_INJECTION,XSS").
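Code that consumes this value downstream (for example a log processor) will usually want the individual threat types rather than the joined string. A minimal Python sketch, assuming the value is exactly the comma-separated form shown above; `parse_threats` and `involves` are hypothetical helper names, not part of EK9:

```python
from typing import List


def parse_threats(threat: str) -> List[str]:
    """Split a comma-separated threat string into individual threat types."""
    return [part.strip() for part in threat.split(",") if part.strip()]


def involves(threat: str, threat_type: str) -> bool:
    """True if threat_type is one of the types named in the threat string."""
    return threat_type in parse_threats(threat)
```

Matching on the parsed list rather than with a substring test avoids accidental matches if one threat name ever appears inside another.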
Log Format Configuration
The InputSanitizer log format is controlled by the EK9_SANITIZER_LOG_FORMAT
environment variable. This allows you to integrate with your existing SIEM infrastructure without
any code changes.
# Set log format (default: JSON)
export EK9_SANITIZER_LOG_FORMAT=JSON   # Simple JSON (universal, default)
export EK9_SANITIZER_LOG_FORMAT=ECS    # Elastic Common Schema
export EK9_SANITIZER_LOG_FORMAT=CEF    # Common Event Format
export EK9_SANITIZER_LOG_FORMAT=SIMPLE # Simple bracket format [THREAT_TYPE]
export EK9_SANITIZER_LOG_FORMAT=SILENT # Suppress all output (testing only)
JSON Format (Default)
Simple JSON format that works with any log shipper or monitoring tool:
{
  "timestamp": "2026-01-21T10:30:45.123Z",
  "level": "warn",
  "threat": "SQL_INJECTION",
  "message": "Dangerous input detected: ' OR '1'='1"
}
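On the consuming side, events in this format can be filtered with any JSON parser. The following Python sketch assumes the sanitizer output has been captured with one JSON object per line (an assumption; the sample above is pretty-printed) and uses only the field names from that sample. `threats_of_type` is a hypothetical helper name.

```python
import json
from typing import Any, Dict, Iterable, List


def threats_of_type(lines: Iterable[str], threat_type: str) -> List[Dict[str, Any]]:
    """Return parsed events whose "threat" field equals threat_type.

    Lines that are blank or not valid JSON objects are skipped, since
    stderr may carry unrelated output alongside sanitizer events.
    """
    events = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(event, dict) and event.get("threat") == threat_type:
            events.append(event)
    return events
```

The comparison is an exact match on the `threat` field; if a single event can carry several threat types, split the value first before comparing.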
ECS Format (Elastic Common Schema)
ECS-aligned JSON for Elasticsearch, Splunk, Datadog, CloudWatch, and Google Cloud Logging:
{
  "@timestamp": "2026-01-21T10:30:45.123Z",
  "log.level": "warn",
  "event.category": "intrusion_detection",
  "event.type": "denied",
  "event.action": "input_rejected",
  "threat.indicator.type": "SQL_INJECTION",
  "message": "Dangerous input detected",
  "source.ip": "192.168.1.100",
  "service.name": "UserService",
  "ek9.field.name": "userId",
  "ek9.input.value": "' OR '1'='1",
  "ecs.version": "8.11"
}
CEF Format (Common Event Format)
CEF for ArcSight, Azure Sentinel, QRadar, and LogRhythm:
CEF:0|EK9|InputSanitizer|1.0|SQL_INJECTION|Dangerous input detected|8|src=192.168.1.100 svc=UserService cs1=userId cs1Label=FieldName msg=' OR '1'='1
CEF Severity Mapping:
- SQL_INJECTION: Severity 8 (High)
- COMMAND_INJECTION: Severity 9 (Very High)
- PATH_TRAVERSAL: Severity 6 (Medium)
- XSS: Severity 7 (High)
- XXE: Severity 8 (High)
- SSTI: Severity 7 (High)
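For quick checks of CEF output, the pipe-delimited header can be split into its seven fields plus the extension. This Python sketch is a simplification: it assumes the header fields contain no escaped pipe characters (true for the sample line above, though CEF does allow them), and the dictionary key names are hypothetical labels chosen here. The severity table is copied from the mapping listed above.

```python
from typing import Dict

# Severity per threat type, as listed in the mapping above.
CEF_SEVERITY: Dict[str, int] = {
    "SQL_INJECTION": 8,
    "COMMAND_INJECTION": 9,
    "PATH_TRAVERSAL": 6,
    "XSS": 7,
    "XXE": 8,
    "SSTI": 7,
}

_HEADER_KEYS = [
    "version", "vendor", "product", "product_version",
    "signature", "name", "severity", "extension",
]


def parse_cef_header(line: str) -> Dict[str, str]:
    """Split one CEF line into header fields and the raw extension string."""
    parts = line.split("|", 7)  # 7 header fields, then everything else
    if len(parts) != 8 or not parts[0].startswith("CEF:"):
        raise ValueError("not a CEF line")
    header = dict(zip(_HEADER_KEYS, parts))
    header["version"] = parts[0][len("CEF:"):]  # drop the "CEF:" prefix
    return header
```

Splitting with a maximum of seven separators keeps the extension intact even when the rejected input itself contains a pipe.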
SIMPLE Format
Minimal bracket format for simple scripts, debugging, or when JSON parsing is overkill:
[SQL_INJECTION] Dangerous input detected: ' OR '1'='1
No timestamp, no context fields — just the threat type and input value.
SILENT Format
Suppresses all sanitizer output. This is useful for testing where the sanitizer behavior
is not the focus of the test. For example, when testing bytecode generation for code that
uses the sanitized keyword, you may not want sanitizer logs polluting test output.
Warning: Do not use SILENT in production — you will lose visibility into attack attempts. This format is intended only for testing and development scenarios.
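A deployment script can guard against shipping SILENT to production by checking the variable before the service starts. The following Python sketch is a hypothetical pre-flight check, not part of EK9: `check_log_format` is an invented name, the value is assumed to be matched exactly in upper case, and how EK9 itself treats an unrecognised value is not stated in this document.

```python
from typing import Mapping

# The five values documented above.
DOCUMENTED_FORMATS = {"JSON", "ECS", "CEF", "SIMPLE", "SILENT"}


def check_log_format(env: Mapping[str, str], production: bool) -> str:
    """Return the effective log format, or raise if it is unsuitable."""
    value = env.get("EK9_SANITIZER_LOG_FORMAT", "JSON")  # JSON is the default
    if value not in DOCUMENTED_FORMATS:
        raise ValueError(f"unknown EK9_SANITIZER_LOG_FORMAT: {value!r}")
    if production and value == "SILENT":
        raise ValueError("SILENT must not be used in production")
    return value
```

In a real script the mapping passed in would typically be `os.environ`.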
SanitizationContext
SanitizationContext is a record that captures metadata for security event logging. When you call
InputSanitizer methods with a SanitizationContext, the context
fields are included in the log output (for ECS and CEF formats).
#!ek9
defines module org.ek9.lang

  defines record

    SanitizationContext
      timestamp as DateTime // When sanitization occurred
      service as String // "UserService"
      operation as String // "getUser"
      fieldName as String // "userId"
      fieldSource as String // "PATH", "QUERY", "HEADER", "CONTENT"
      sourceIp as String // Client IP address
      traceId as String // Request correlation ID

//EOF
Using SanitizationContext with InputSanitizer:
#!ek9
defines module introduction

  defines program

    ContextExample()
      sanitizer <- InputSanitizer()

      //Create context with rich metadata
      context <- SanitizationContext(
        DateTime(),
        "UserService",
        "getUser",
        "userId",
        "PATH",
        "192.168.1.100",
        "trace-abc-123"
      )

      userInput <- "some user input"

      //Sanitize with context - logs include all context fields
      result <- sanitizer.sanitize(userInput, context)
      if result?
        processData(result)

      //Check safety with context
      if sanitizer.isSafe(userInput, context)
        processData(userInput)

//EOF
When context is provided, the log output includes all set fields. Fields that are not set are simply omitted from the log output. This provides rich metadata for security teams to investigate incidents, correlate attacks, and identify patterns.
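The "omit unset fields" rule is easy to picture with a small model. In this Python sketch, `Context` is a hypothetical stand-in that mirrors some of the SanitizationContext fields, with `None` playing the role of an unset value; the output keys are simply the field names, whereas the real key names depend on the selected log format.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Optional


@dataclass
class Context:
    """Hypothetical stand-in for SanitizationContext (None means unset)."""
    service: Optional[str] = None
    operation: Optional[str] = None
    field_name: Optional[str] = None
    field_source: Optional[str] = None
    source_ip: Optional[str] = None
    trace_id: Optional[str] = None


def context_fields(ctx: Context) -> Dict[str, str]:
    """Return only the fields that are set, ready to merge into a log event."""
    return {key: value for key, value in asdict(ctx).items() if value is not None}
```

A context carrying only a service name and client IP therefore contributes exactly two entries to the event, with no empty placeholders for the rest.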
Integration Examples
TextFile Automatic Sanitization
When reading files specified by user input, use sanitized paths to prevent path traversal:
#!ek9
defines module introduction

  defines function

    readUserFile()
      -> filename as sanitized String
      <- content as String?
      if filename?
        file <- TextFile(filename)
        if file.exists() and file.isReadable()
          content: file.readAll()

//EOF
Web Service Input Handling
In web services, use the sanitized modifier on path parameters, query parameters,
and request body fields:
#!ek9
defines module introduction

  defines service

    UserService :/users

      getUser() :/{userId} as GET
        -> userId as sanitized String
        <- response as HTTPResponse?
        //userId is automatically sanitized before reaching this method
        ...

      searchUsers() :/search as GET
        -> query as sanitized String
        <- response as HTTPResponse?
        //query parameter is sanitized
        ...

//EOF
Command-Line Argument Processing
When processing command-line arguments that will be used in file operations or external commands:
#!ek9
defines module introduction

  defines program

    ProcessFiles()
      -> argv as List of String
      sanitizer <- InputSanitizer(Stderr())
      for arg in argv
        cleanArg <- sanitizer.sanitize(arg)
        if cleanArg?
          processFile(cleanArg)
        else
          Stderr().println("Rejected potentially dangerous argument")

//EOF
Best Practices
Use the sanitized Modifier by Default
For any function or method that accepts user-provided String input, add the sanitized
modifier. This is the simplest and most effective defense:
#!ek9
//GOOD: Default to sanitized for user input
processUserData()
  -> input as sanitized String
  ...

//Only omit sanitized for trusted internal data
processInternalData()
  -> trustedInput as String
  ...
Handle Unset Results Gracefully
When sanitization detects a threat, the parameter becomes unset. Design your business logic to handle this case with appropriate error messages:
#!ek9
createUser()
  -> username as sanitized String
  <- result as Boolean?
  if username?
    //Safe to process
    result: doCreateUser(username)
  else
    //Threat detected - apply normal business validation message
    //This provides defense in depth without information leakage
    result: false
    logValidationError("Invalid username format")
Use InputSanitizer for Environment Variables
Environment variables may legitimately contain paths. Use sanitizeWithoutPathChecks()
for these cases:
#!ek9
sanitizer <- InputSanitizer(Stderr())
envVars <- EnvVars()
//Path traversal patterns are legitimate in env vars
configPath <- envVars.get("CONFIG_PATH")
if configPath?
  cleanPath <- sanitizer.sanitizeWithoutPathChecks(configPath.get())
  ...
Enterprise Logging Integration
For production systems, configure the log format via the EK9_SANITIZER_LOG_FORMAT
environment variable to integrate with your SIEM:
#Production deployment - route stderr to SIEM
export EK9_SANITIZER_LOG_FORMAT=ECS #For Elasticsearch/Splunk/Datadog
#or
export EK9_SANITIZER_LOG_FORMAT=CEF #For ArcSight/Azure Sentinel/QRadar
In production, route stderr to your log aggregator (Fluentd, Filebeat, etc.)
and the security events will automatically flow to your SIEM.
Sensitive Type and Secret Detection
Input sanitization protects against malicious input at runtime. EK9 also provides a complementary layer that protects against leaked credentials — both at compile time and at runtime. Together, these two systems form a comprehensive security framework.
Compile-Time Secret Detection
The EK9 compiler automatically scans all string literals (including text segments within interpolated strings) for patterns matching known credential formats. If a hardcoded secret is detected, compilation fails immediately — the secret never reaches version control, build artifacts, or production.
Detected credential categories include:
| Error Code | Category | Example Patterns |
|---|---|---|
| E11080 | Cloud Provider | AWS keys (AKIA...), GCP API keys (AIza...), Azure connection strings |
| E11081 | Platform Token | GitHub (ghp_), GitLab (glpat-), Slack (xoxb-), npm, Shopify, Heroku |
| E11082 | Private Key | PEM headers: RSA, EC, DSA, PKCS8, OPENSSH private keys |
| E11083 | Database URL | postgres://user:pass@host, MySQL, MongoDB, Redis, JDBC |
| E11084 | JWT Token | eyJhbGci... three-part header.payload.signature structure |
| E11086 | API Key | Stripe (sk_test_), Anthropic (sk-ant-), SendGrid (SG.), 50+ services |
This detection covers 100+ distinct patterns across cloud providers, platform tokens, private keys, database URLs, JWT tokens, and API keys. That coverage is comparable to dedicated secret scanners such as GitGuardian and TruffleHog, but it is enforced at compile time rather than after the fact.
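To give a feel for how pattern-based detection maps a literal to an error code, here is a deliberately simplified Python sketch. It uses only prefixes that appear in the table above and matches on the prefix alone; the compiler's actual patterns are not given in this document and presumably also check length and character set. `classify_literal` is a hypothetical name.

```python
from typing import Optional

# (prefix, error code) pairs taken from the table above.
# Prefix matching alone is a toy rule and would over-report in practice.
PREFIX_TO_CODE = [
    ("AKIA", "E11080"),      # AWS key prefix
    ("AIza", "E11080"),      # GCP API key prefix
    ("ghp_", "E11081"),      # GitHub token prefix
    ("glpat-", "E11081"),    # GitLab token prefix
    ("xoxb-", "E11081"),     # Slack token prefix
    ("sk_test_", "E11086"),  # Stripe test key prefix
    ("sk-ant-", "E11086"),   # Anthropic key prefix
    ("SG.", "E11086"),       # SendGrid key prefix
]


def classify_literal(literal: str) -> Optional[str]:
    """Return the error code for the first matching prefix, else None."""
    for prefix, code in PREFIX_TO_CODE:
        if literal.startswith(prefix):
            return code
    return None
```

A build step that treats any non-None result as fatal mirrors the behaviour described above, where compilation fails as soon as a hardcoded secret is found.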
The Sensitive Type
Once secrets are removed from source code, they need to be loaded securely at runtime.
The Sensitive built-in type wraps secret values with automatic protection:
- Auto-redaction: $ and $$ operators always return "***REDACTED***", so secrets cannot leak through logging, error messages, or string interpolation
- Constant-time equality: comparison uses MessageDigest.isEqual() to prevent timing attacks
- Controlled construction: the only way to create a set Sensitive value is through EnvVars.sensitiveGet(); there is no String constructor visible to EK9 code
- Gated access: reveal() returns the raw secret but requires the Privileged marker trait (E11090)
#!ek9
defines module introduction

  defines class

    HttpClient with trait of Privileged
      apiKey as Sensitive?

      HttpClient()
        -> key as Sensitive
        apiKey: key

      sendRequest()
        <- response as String?
        //reveal() only works because HttpClient has the Privileged trait
        if apiKey?
          header <- apiKey.reveal()
          response: doHttpCall(header)

  defines function

    demo()
      env <- EnvVars()
      //sensitiveGet() is the ONLY way to create a set Sensitive value
      key <- env.sensitiveGet("API_KEY")
      if key?
        client <- HttpClient(key)
        result <- client.sendRequest()
        stdout <- Stdout()
        //Safe: printing key shows "***REDACTED***", not the actual secret
        stdout.println(`Key: ${key}`)
        if result?
          stdout.println(result)

//EOF
The Privileged trait creates an auditable access boundary — searching
for with trait of Privileged in any codebase gives a complete list of every class
that can access raw secret values.
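The auto-redaction and constant-time equality properties described above are not specific to EK9. As a rough analogue, the Python sketch below shows the same two ideas using the standard library's `hmac.compare_digest`; the `Redacted` class is hypothetical, it is not how Sensitive is implemented, and it has no equivalent of the Privileged gate.

```python
import hmac


class Redacted:
    """Python analogue of a redacting secret wrapper (not EK9's Sensitive)."""

    def __init__(self, secret: str) -> None:
        self._secret = secret.encode("utf-8")

    def __str__(self) -> str:
        # Every string conversion yields the placeholder, never the secret.
        return "***REDACTED***"

    __repr__ = __str__

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, Redacted):
            return NotImplemented
        # compare_digest is designed to avoid leaking, through timing,
        # where the first differing byte is.
        return hmac.compare_digest(self._secret, other._secret)
```

Because `__str__` and `__repr__` both return the placeholder, the secret stays hidden in f-strings, log calls, and interactive sessions alike.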
Two Layers Working Together
The sanitization and sensitive data systems complement each other:
| | Input Sanitization | Sensitive Type |
|---|---|---|
| Protects against | Malicious input (SQL injection, XSS, etc.) | Credential leakage (API keys, passwords, tokens) |
| When | Runtime (at function entry points) | Compile time (literals) + Runtime (redaction) |
| Mechanism | sanitized modifier, InputSanitizer | Sensitive type, Privileged trait |
| On failure | Parameter becomes unset + threat logged | Compilation fails (literals) or redacted output (runtime) |
Related Compiler Errors
The following compiler errors relate to the sanitized modifier:
- E07910: Sanitized parameter must be String type. The sanitized modifier can only be applied to String parameters.
- E07920: Sanitized only valid on incoming parameters. Cannot use sanitized on return parameters, local variables, or fields.
- E07930: Cannot assign from sanitized parameter. Direct assignment creates hidden aliasing. Use explicit copy constructor instead.
- E07940: Sanitized modifier must match in overrides. When overriding a method, the sanitized modifier must match the super method (Liskov Substitution Principle).
- E07941: Sanitized not allowed in capture. Captured variables are already in the trusted boundary. Sanitize at the original entry point instead.
- E07942: Sanitized not allowed at call site. The function/method definition specifies sanitization, not the caller.
- E07943: Sanitized not allowed in variable declaration. Sanitize at the function/method entry point where data first enters the system.
The following compiler errors relate to secret detection and the Sensitive type:
- E11080: Hardcoded cloud provider credential. AWS, GCP, or Azure credential pattern detected in a string literal.
- E11081: Hardcoded platform token. GitHub, GitLab, Slack, npm, Shopify, Heroku, or other platform token detected.
- E11082: Hardcoded private key material. PEM private key header (RSA, EC, DSA, PKCS8, OPENSSH) detected.
- E11083: Hardcoded database credential. Database URL with embedded password (PostgreSQL, MySQL, MongoDB, Redis, JDBC) detected.
- E11084: Hardcoded JWT token. JWT three-part structure (eyJ...) detected.
- E11086: Hardcoded API key. Known API key pattern (Stripe, Anthropic, OpenAI, SendGrid, etc.) detected.
- E11090: Privileged access required. reveal() called on Sensitive without Privileged trait.
Next Steps
For more details on related topics:
- Built-in Types: Sensitive, EnvVars, and Privileged type documentation
- Security Types: Overview of security components in the standard library
- Code Quality: How EK9 enforces quality at compile time
- Web Services: Building secure web services with EK9
- Components: Understanding the component model for enterprise extension