
Ace Intelligence
A full-stack defensive cybersecurity chatbot built with a fine-tuned open-source LLM (Qwen2.5 1.5B, QLoRA), FastAPI backend, PostgreSQL chat persistence, and a React Gemini-clone frontend.
PralayAI is a cybersecurity-focused AI assistant designed to help students, developers, and security learners understand defensive cybersecurity workflows. The system uses a Qwen2.5 1.5B Instruct model fine-tuned with QLoRA on a curated cybersecurity instruction dataset. The model is deployed via two inference paths (a local CUDA API for fast development and a public Hugging Face Space for demos) and served through a FastAPI backend with PostgreSQL persistence and a React frontend.
Clone the repo, install dependencies, configure your environment, and launch all three services (inference API, backend, and frontend) with a single startup script.
git clone https://github.com/OMCHOKSI108/pralayAI
cd pralayAI
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
./start.sh
# Starts: Inference API (:5000) | Backend (:8000) | Frontend (:5173)
PralayAI is built on Qwen2.5 1.5B Instruct, fine-tuned with QLoRA using the Unsloth framework on a cybersecurity conversational instruction dataset. The LoRA adapter is merged with the base model for deployment. The model repository and adapter are published on Hugging Face for reproducibility.
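For readers unfamiliar with adapter merging, the standard LoRA formulation folds the low-rank update into each adapted weight matrix, so the deployed model needs no separate adapter pass. The rank r and scaling factor alpha below are generic symbols; the values used for PralayAI are not stated here.

```latex
W_{\text{merged}} = W_0 + \frac{\alpha}{r}\, B A,
\qquad B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k},\; r \ll \min(d, k)
```

Here W_0 is a frozen base-model weight matrix and B, A are the trained adapter factors.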
Base Model: Qwen/Qwen2.5-1.5B-Instruct
Fine-tuning: Unsloth + QLoRA
Adapter: OMCHOKSI108/Paralay1.1
Merged Model: OMCHOKSI108/Paralay1.1-Merged
Dataset: OMCHOKSI108/cybersecdata
Inference API: omchoksi108-pralayai-inference-api.hf.space/generate
The system follows a four-component architecture. A React Gemini-clone frontend sends user messages to the FastAPI backend, which persists conversations in PostgreSQL and routes inference requests to either the local CUDA inference API (port 5000, ~4.5s latency) or the Hugging Face Space CPU API (~54s latency). The model generates a defensive cybersecurity response, which is saved and returned through the backend to the frontend.
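The routing step could look roughly like the sketch below. It is an illustration under assumptions, not the project's actual code: the "try one path, fall back to the other" behavior, the `/generate` path on the local API, the `hf-space` source label, and the names `route_inference` and `InferenceResult` are all hypothetical. The HTTP call is injected as a function so the logic can be read and tested without a running server.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

# The local port and the HF Space host come from the text above; the
# "/generate" path on the local API is an assumption.
LOCAL_URL = "http://localhost:5000/generate"
HF_SPACE_URL = "https://omchoksi108-pralayai-inference-api.hf.space/generate"


@dataclass
class InferenceResult:
    text: str
    source: str  # "local-cuda" or "hf-space" (the second label is assumed)


# A transport takes (url, payload) and returns the generated text.
Transport = Callable[[str, Dict], str]


def route_inference(payload: Dict, post: Transport, prefer_local: bool = True) -> InferenceResult:
    """Try the preferred inference path first, then fall back to the other."""
    targets = [(LOCAL_URL, "local-cuda"), (HF_SPACE_URL, "hf-space")]
    if not prefer_local:
        targets.reverse()
    last_error: Optional[Exception] = None
    for url, source in targets:
        try:
            return InferenceResult(text=post(url, payload), source=source)
        except OSError as exc:  # connection refused, timeout, DNS failure
            last_error = exc
    raise RuntimeError("no inference backend reachable") from last_error
```

In a real backend, `post` would wrap an HTTP client call with a timeout long enough for the slower CPU path.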
React Frontend (:5173)
        │
        ▼
FastAPI Backend (:8000) ──► PostgreSQL
        │
        ├── Local CUDA API (:5000) ──► Merged Model
        │       (~4.5s on GPU)
        └── HF Space API (cloud) ──► Merged Model
                (~54s on CPU)
PralayAI includes a strict defensive-only safety policy. The model is trained to refuse requests involving phishing, credential theft, malware creation, ransomware, reverse shells, and evasion techniques. An automated evaluation notebook runs 8 defensive queries and 5 adversarial safety prompts, scoring responses on keyword coverage, structure, depth, and refusal quality.
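To make the scoring dimensions concrete, here is a minimal sketch of how keyword coverage, structure, and refusal detection might be computed. The evaluation notebook's real criteria are not shown in this document, so the refusal markers, the three-line structure threshold, and all function names are assumptions; depth scoring is left out.

```python
import re
from typing import Iterable

# Hypothetical refusal markers; the notebook's actual criteria may differ.
REFUSAL_MARKERS = ("i can't help", "i cannot help", "i won't", "cannot assist")


def keyword_coverage(response: str, expected_keywords: Iterable[str]) -> float:
    """Fraction of expected keywords found in the response (case-insensitive)."""
    keywords = [k.lower() for k in expected_keywords]
    if not keywords:
        return 0.0
    text = response.lower()
    hits = sum(1 for k in keywords if k in text)
    return hits / len(keywords)


def is_refusal(response: str) -> bool:
    """True if the response contains any of the assumed refusal markers."""
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)


def structure_score(response: str) -> float:
    """Score in [0, 1]: reaches 1.0 at three or more numbered or bulleted lines."""
    listed = [ln for ln in response.splitlines()
              if re.match(r"\s*(\d+[.)]|[-*])\s+", ln)]
    return min(len(listed) / 3, 1.0)
```

A defensive query would be scored with `keyword_coverage` and `structure_score`, while an adversarial prompt would pass only if `is_refusal` returns True.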
Defensive Use Cases:
Incident Response | Log Analysis | Threat Detection
MITRE ATT&CK Mapping | Cloud Security | Malware Defense
Security Awareness | Hardening Guidance
Blocked Topics:
Phishing | Credential Theft | Malware | Ransomware
Reverse Shells | Evasion | Exploitation
The backend exposes a single POST /api/chat endpoint that accepts a message, optional conversation_id, and generation parameters. It applies safety filtering, routes to the inference engine, and returns a structured response with the assistant message, latency, and source. The inference API is also directly callable for testing.
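The request flow just described can be sketched as a framework-agnostic function. This is a simplified illustration rather than the project's FastAPI handler: the blocklist terms, the refusal text, the `safety-filter` source label, and the name `handle_chat` are hypothetical, and PostgreSQL persistence is omitted. The response keys match the example payload on this page.

```python
import time
import uuid
from typing import Callable, Optional

# Hypothetical blocklist loosely based on the "Blocked Topics" list; the real
# filter's rules are not shown in this document.
BLOCKED_TERMS = ("phishing kit", "steal credentials", "write ransomware", "reverse shell")
REFUSAL_TEXT = "I can only help with defensive cybersecurity topics."


def handle_chat(
    message: str,
    generate: Callable[[str, int, float], str],
    conversation_id: Optional[str] = None,
    max_new_tokens: int = 300,
    temperature: float = 0.7,
    source: str = "local-cuda",
) -> dict:
    """Filter the message, call the model, and build the response payload."""
    conversation_id = conversation_id or str(uuid.uuid4())
    started = time.perf_counter()
    if any(term in message.lower() for term in BLOCKED_TERMS):
        # Blocked requests never reach the model.
        reply, used_source = REFUSAL_TEXT, "safety-filter"
    else:
        reply, used_source = generate(message, max_new_tokens, temperature), source
    return {
        "assistant_message": reply,
        "conversation_id": conversation_id,
        "latency_seconds": round(time.perf_counter() - started, 3),
        "source": used_source,
    }
```

Passing `generate` as a parameter keeps the sketch independent of which inference path is in use.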
POST /api/chat
{
  "message": "Explain incident response in 5 steps.",
  "conversation_id": null,
  "max_new_tokens": 300,
  "temperature": 0.7
}
Response: {
  "assistant_message": "...",
  "conversation_id": "uuid",
  "latency_seconds": 4.5,
  "source": "local-cuda"
}
The model was fine-tuned on a curated cybersecurity conversational dataset covering incident response, log analysis, malware defense, cloud security, and MITRE ATT&CK explanations. Training used QLoRA for memory efficiency, with loss convergence tracked across fine-tuning steps. The model training summary and safety evaluation scores are documented in the repo.
Python powers the fine-tuning pipeline with Unsloth and QLoRA. FastAPI serves the backend with SQLAlchemy + PostgreSQL for persistence. React with Vite provides the Gemini-clone frontend. Hugging Face handles model hosting and public inference. Local CUDA inference runs via a Flask wrapper.
Python | FastAPI | React | Vite | PostgreSQL | Qwen2.5 | QLoRA | Unsloth | Hugging Face | Docker
The repository contains the full source code, including the fine-tuning notebook, model merge script, HF Space deployment configuration, FastAPI backend, React frontend, and comprehensive evaluation notebook.
https://github.com/OMCHOKSI108/pralayAI
Built by the Ace Intelligence founding team.
Om Choksi (CTO) — https://github.com/OMCHOKSI108