🔑 Sensitive Data Exposure in AI-Generated Code
Sensitive Data Exposure is a critical vulnerability that occurs when secrets, credentials, or proprietary information are improperly handled in AI-generated code. AI tools, while speeding up development, often replicate insecure patterns from training data or provide naive code that exposes sensitive information.
Below we outline major vulnerabilities, provide insecure vs. secure AI-generated code examples, and show detection methods.
1. Hard-Coded Secrets / Credentials (CWE-798)
AI-generated code may hard-code API keys, passwords, or tokens directly in the source code. This exposes sensitive data and increases the risk of compromise if the code is shared, stored in repositories, or leaked.
AI Insecure Example:
const API_KEY = "12345-abcdef-67890";
fetch("https://api.example.com/data?key=" + API_KEY);
Safe Solution:
const API_KEY = process.env.API_KEY;
fetch(`https://api.example.com/data?key=${API_KEY}`);
Detection: Secret scanning, SAST, manual code review.
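To illustrate what "secret scanning" means in practice, here is a minimal Python sketch of a regex-based scanner. The patterns are illustrative assumptions only; real scanners such as gitleaks or truffleHog ship far larger rulesets plus entropy checks.

```python
import re

# Illustrative patterns; not an exhaustive ruleset
SECRET_PATTERNS = [
    re.compile(r"""(?i)(api[_-]?key|secret|token|passw(?:or)?d)\s*[:=]\s*["'][^"']{8,}["']"""),
    re.compile(r"AKIA[0-9A-Z]{16}"),  # AWS access key ID format
]

def scan_source(text):
    """Return (line number, line) pairs that look like hard-coded secrets."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), 1):
        if any(p.search(line) for p in SECRET_PATTERNS):
            hits.append((lineno, line.strip()))
    return hits
```

Running this over the insecure example above flags the hard-coded key, while the environment-variable version passes clean.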
2. Insecure Storage or Transmission of Sensitive Data (CWE-200)
AI-generated code may save sensitive information in plaintext or transmit it over unencrypted channels. This can lead to exposure of passwords, personal data, or proprietary information.
AI Insecure Example (Python):
with open("passwords.txt", "w") as f:
    f.write(user_password)
requests.post("http://example.com/login", data={"password": user_password})
Safe Solution:
import bcrypt, requests

# Hash before storing; never write the plaintext password to disk
hashed = bcrypt.hashpw(user_password.encode(), bcrypt.gensalt())
with open("passwords.txt", "wb") as f:
    f.write(hashed)

# Transmit only over HTTPS and hash server-side on receipt; sending the
# stored hash would make the hash itself the credential
requests.post("https://example.com/login", data={"password": user_password})
Detection: Encryption review, network traffic monitoring, SAST.
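The bcrypt pattern above requires a third-party package; Python's standard library offers an equivalent via `hashlib.pbkdf2_hmac`, sketched below. The 600,000-iteration default is an assumed work factor, not a fixed requirement; tune it to current guidance.

```python
import hashlib
import hmac
import os

def hash_password(plain, iterations=600_000):
    """Derive a salted PBKDF2-SHA256 hash; store salt and digest together."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", plain.encode(), salt, iterations)
    return salt + digest

def verify_password(plain, stored, iterations=600_000):
    """Constant-time comparison against the stored salt + digest."""
    salt, digest = stored[:16], stored[16:]
    candidate = hashlib.pbkdf2_hmac("sha256", plain.encode(), salt, iterations)
    return hmac.compare_digest(candidate, digest)
```

`hmac.compare_digest` is used for the comparison so verification time does not leak how many bytes matched.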
3. LLM Data Leakage (OWASP LLM06)
AI tools may inadvertently include sensitive project data in generated code or prompts. LLMs trained on internal repositories could generate code containing confidential snippets or credentials from training data.
AI Insecure Example:
# Generated function includes a real password from training data
def get_secret():
    return "SuperSecret123!"
Safe Solution:
# Do not embed sensitive data
import os

def get_secret():
    return os.environ.get("SECRET_KEY")
Detection: Manual review, code scanning, secret detection tools.
4. Logging Sensitive Information
AI may generate debug statements that log passwords, API keys, or tokens, increasing exposure risk.
AI Insecure Example (Python):
print("User password:", user_password)
Safe Solution:
# Avoid logging sensitive data
print("User logged in:", username)
Detection: Secret scanning, log audits, SAST.
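Beyond avoiding sensitive values in individual log calls, a defensive redaction filter can mask anything credential-shaped before it reaches a handler. A minimal sketch using the standard `logging` module follows; the regex is an illustrative assumption, not a complete ruleset.

```python
import logging
import re

# Illustrative pattern; extend for the credential formats your code handles
SENSITIVE = re.compile(r"(?i)(password|secret|token|api[_-]?key)\s*[:=]\s*\S+")

class RedactFilter(logging.Filter):
    """Mask credential-shaped values before the record reaches any handler."""
    def filter(self, record):
        record.msg = SENSITIVE.sub(r"\1=***", str(record.msg))
        return True
```

Attach it once with `logger.addFilter(RedactFilter())` so every record on that logger is scrubbed.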
5. Credentials in Source Repositories
AI-generated code may reference files containing credentials or secrets that are stored in repositories, risking exposure if the repository is public or improperly secured.
AI Insecure Example (Node.js):
const secrets = require('./secrets.json'); // contains API keys
Safe Solution:
const apiKey = process.env.API_KEY; // read individual values from environment variables
Detection: Repository scanning, SAST, manual review.
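To make the environment-variable approach robust, a fail-fast helper can surface missing configuration at startup rather than at first use. A small Python sketch (the variable names in the usage note are hypothetical):

```python
import os

def require_env(name):
    """Return the named environment variable, failing fast if it is unset."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value
```

Calling `require_env("API_KEY")` during application startup turns a silent `None` later on into an immediate, descriptive error.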
6. Hard-Coded Database Passwords
AI may generate code with database connection strings including plaintext passwords, which exposes critical infrastructure credentials.
AI Insecure Example (Java):
Connection conn = DriverManager.getConnection(
    "jdbc:mysql://localhost:3306/db", "root", "password123");
Safe Solution:
Connection conn = DriverManager.getConnection(
    System.getenv("DB_URL"), System.getenv("DB_USER"), System.getenv("DB_PASS"));
Detection: Secret scanning, static analysis.
7. Insecure Data Transmission (CWE-319)
AI-generated code may send sensitive data over HTTP or unencrypted channels, exposing credentials and personal information.
AI Insecure Example (Python):
requests.post("http://example.com/login", data={"user": username, "pass": password})
Safe Solution (Python):
requests.post("https://example.com/login", data={"user": username, "pass": password})
Detection: Network monitoring, code review, SAST.
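The HTTPS-only rule can also be enforced programmatically, rejecting cleartext URLs before any request is made. A lightweight guard using only the standard library:

```python
from urllib.parse import urlparse

def require_https(url):
    """Raise if the URL would transmit data without TLS."""
    if urlparse(url).scheme != "https":
        raise ValueError(f"Refusing to send sensitive data over non-TLS URL: {url}")
    return url
```

Wrapping outbound URLs, e.g. `requests.post(require_https(url), ...)`, makes an accidental `http://` endpoint fail loudly in testing instead of leaking credentials in production.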
🔧 How Our Services Help
- SonarQube Setup Assistance: Detects hard-coded secrets, insecure storage/transmission, logging of sensitive data, repository leaks, database password exposure, and insecure transmission.
- Source Code Review: Expert review of AI-generated code for all sensitive data vulnerabilities.
- Software Composition Analysis: Detects vulnerable dependencies or misconfigured packages affecting sensitive data handling.
- Software Licence Analysis: Ensures compliance for third-party components in AI-generated projects.