📁 File Management Vulnerabilities in AI-Generated Code
File management is a common area where insecure AI-generated code appears. While AI can speed up development, it often reproduces unsafe patterns from training data. Attackers can exploit these flaws to access sensitive files, upload malicious content, or expose credentials.
Below we outline major vulnerabilities, provide insecure vs. secure code examples, show how to detect them, and highlight which of our services can help mitigate the risk.
1. Path Traversal (CWE-22)
Concatenating user input directly into file paths lets attackers reach files outside the intended directory. AI-generated code often builds paths by naive string concatenation because that pattern dominates its training data, so traversal sequences like "../" pass straight through to the filesystem. It also frequently omits sanitization and validation, making exploitation trivial: a request such as ?file=../../etc/passwd can read arbitrary files.
AI Insecure Example (Python Flask):
@app.route("/view")
def view_file():
filename = request.args.get("file")
return send_file("/var/www/uploads/" + filename)
Safe Solution (Python Flask):
from flask import request, send_from_directory
from werkzeug.utils import secure_filename

@app.route("/view")
def view_file():
    # secure_filename strips path separators and ".." sequences
    filename = secure_filename(request.args.get("file", ""))
    return send_from_directory("/var/www/uploads", filename)
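secure_filename flattens the name, but for defense in depth the resolved path can also be checked against the allowed base directory. A minimal sketch of such a containment check (rejecting with 403 is one reasonable choice, not the only one):
import os
from flask import request, send_file, abort

UPLOAD_DIR = "/var/www/uploads"

@app.route("/view")
def view_file():
    requested = os.path.join(UPLOAD_DIR, request.args.get("file", ""))
    resolved = os.path.realpath(requested)
    # Reject anything that resolves outside the upload directory
    if not resolved.startswith(os.path.realpath(UPLOAD_DIR) + os.sep):
        abort(403)
    return send_file(resolved)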
Detection: Static analysis, code review, fuzzing for traversal payloads.
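Fuzzing can be as simple as requesting a handful of traversal payloads and flagging responses that return file contents. A minimal sketch using the requests library (the URL and payloads are illustrative):
import requests

# Hypothetical endpoint under test
base = "http://localhost:5000/view"
payloads = ["../../etc/passwd", "..%2f..%2fetc%2fpasswd", "....//....//etc/passwd"]

for p in payloads:
    r = requests.get(f"{base}?file={p}")
    if r.status_code == 200 and "root:" in r.text:
        print(f"possible traversal with payload: {p}")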
2. Improper File Permissions
Overly permissive file permissions expose files to unauthorized access. AI-generated code frequently mirrors permissive examples without weighing the security impact, so sensitive configuration or data files end up world-readable or world-writable, inviting data leakage and tampering. Lacking security context, a model may even suggest 777 permissions as a "quick fix."
AI Insecure Example:
with open("config.json", "w") as f:
f.write(data)
os.chmod("config.json", 0o777)
Safe Solution:
with open("config.json", "w") as f:
f.write(data)
os.chmod("config.json", 0o600)
Detection: Permission audits, SAST, manual review.
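A permission audit can be scripted: walk the tree and flag any file whose mode grants group or world access. A minimal sketch (the root path is illustrative):
import os
import stat

def find_permissive_files(root):
    # Flag files readable, writable, or executable by group or others
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            mode = os.stat(path).st_mode
            if mode & (stat.S_IRWXG | stat.S_IRWXO):
                print(f"{path}: {oct(stat.S_IMODE(mode))}")

find_permissive_files("/var/www")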
3. Unrestricted File Uploads
Failing to validate file type or size lets attackers upload malicious files. AI-generated code often implements upload features by copying example patterns verbatim, without restrictions, so attackers can upload scripts, executables, or malware that the server then stores or even executes. Checks on extension, MIME type, and size, let alone malware scanning, are rarely generated by default.
AI Insecure Example (PHP):
move_uploaded_file($_FILES["file"]["tmp_name"], "uploads/" . $_FILES["file"]["name"]);
Safe Solution (PHP):
$filename = basename($_FILES["file"]["name"]);
$ext = strtolower(pathinfo($filename, PATHINFO_EXTENSION));
if (in_array($ext, ["jpg", "png", "txt"])) {
    move_uploaded_file($_FILES["file"]["tmp_name"], "uploads/" . $filename);
}
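An extension allowlist is a start, but sanitized names and server-side size limits add further layers. A minimal Flask sketch of the same idea in Python (the route name and size limit are illustrative):
import os
from flask import request, abort
from werkzeug.utils import secure_filename

ALLOWED_EXTENSIONS = {"jpg", "png", "txt"}
MAX_BYTES = 5 * 1024 * 1024  # 5 MB, an arbitrary example limit

@app.route("/upload", methods=["POST"])
def upload():
    f = request.files.get("file")
    if f is None or f.filename == "":
        abort(400)
    filename = secure_filename(f.filename)
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    if ext not in ALLOWED_EXTENSIONS:
        abort(400)
    # Enforce a size limit by measuring the stream before saving
    f.stream.seek(0, os.SEEK_END)
    if f.stream.tell() > MAX_BYTES:
        abort(413)
    f.stream.seek(0)
    f.save("/var/www/uploads/" + filename)
    return "ok"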
Detection: SAST tools, penetration testing of upload endpoints.
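A quick penetration test of an upload endpoint is to post a disallowed file type and confirm it is rejected. A minimal sketch (the URL is illustrative):
import requests

# Hypothetical upload endpoint under test
url = "http://localhost:5000/upload"
files = {"file": ("shell.php", b"<?php system($_GET['cmd']); ?>", "image/png")}

r = requests.post(url, files=files)
if r.status_code == 200:
    print("endpoint accepted a .php upload: likely vulnerable")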
4. Insecure Temporary Files
Predictable temp file names or unsafe directories can expose sensitive data. AI-generated code often hardcodes simple paths such as /tmp/data.tmp, reflecting patterns in its training examples. Because the location is predictable and the directory world-writable, an attacker can pre-create the file, read secrets, or win a race between check and use. Secure creation via the tempfile module is rarely generated unprompted.
AI Insecure Example (Python):
tmp_path = "/tmp/data.tmp"
with open(tmp_path, "w") as f:
f.write(secret)
Safe Solution (Python):
import tempfile

# Random name, created with 0600 permissions
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(secret.encode())
    tmp_path = f.name
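With delete=False the caller is responsible for removal; a sketch of the same pattern with explicit cleanup (process is a hypothetical consumer of the file):
import os
import tempfile

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(secret.encode())
    tmp_path = f.name
try:
    process(tmp_path)  # hypothetical consumer of the temp file
finally:
    os.remove(tmp_path)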
Detection: Code review, SAST, temp file analysis.
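Hardcoded temp paths are easy to find mechanically; a minimal sketch that flags /tmp string literals in a Python source tree (the root directory is illustrative):
import os
import re

PATTERN = re.compile(r"""["']/tmp/[^"']*["']""")

for dirpath, _, filenames in os.walk("src"):
    for name in filenames:
        if not name.endswith(".py"):
            continue
        path = os.path.join(dirpath, name)
        with open(path, encoding="utf-8", errors="ignore") as f:
            for lineno, line in enumerate(f, 1):
                if PATTERN.search(line):
                    print(f"{path}:{lineno}: hardcoded /tmp path")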
5. Logging Sensitive Data
Logging passwords, tokens, or API keys leaks secrets to anyone with log access. AI-generated code often reproduces debug statements from its training examples or logs variables wholesale to "help" developers, printing credentials in plain text. Because a model does not distinguish safe data from sensitive data, such statements slip through unless reviewed.
AI Insecure Example (Node.js):
console.log("User password: " + password);
Safe Solution (Node.js):
console.log("User logged in: " + username);
// Do not log passwords or sensitive tokens
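Beyond discipline at each call site, a logging filter can mask anything that looks like a credential before it is written. A minimal Python sketch (the field names in the pattern are illustrative):
import logging
import re

SENSITIVE = re.compile(r"(password|token|api[_-]?key)\s*[:=]\s*\S+", re.IGNORECASE)

class RedactFilter(logging.Filter):
    def filter(self, record):
        # Mask anything resembling a credential assignment
        record.msg = SENSITIVE.sub(r"\1=[REDACTED]", str(record.msg))
        return True

logger = logging.getLogger("app")
logger.addFilter(RedactFilter())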
Detection: Secret scanning, manual review, logging audits.
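Secret scanning over existing logs can start with a few well-known key formats. A minimal sketch that reads from stdin (the two patterns shown are examples, not an exhaustive list):
import re
import sys

PATTERNS = {
    "AWS access key ID": re.compile(r"AKIA[0-9A-Z]{16}"),
    "bearer token": re.compile(r"Bearer\s+[A-Za-z0-9_\-.=]+"),
}

for lineno, line in enumerate(sys.stdin, 1):
    for label, pattern in PATTERNS.items():
        if pattern.search(line):
            print(f"line {lineno}: possible {label}")
Run it over a log file, for example: python scan_logs.py < app.log.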
🔧 How Our Services Help
- SonarQube Setup Assistance: Detects insecure file handling, unsafe permissions, risky uploads, temp file issues, and sensitive logging.
- Source Code Review: Expert review of AI-generated code for all file management vulnerabilities.
- Software Composition Analysis: Detects vulnerable dependencies affecting file handling and uploads.
- Software Licence Analysis: Ensures compliance for third-party components in AI-generated projects.