Karl Jürgen Mollan Neyra

Data Engineer & AI Researcher — building intelligent data systems, end-to-end ML pipelines, and generative AI solutions that turn complex data into strategic decisions.

5+ Years Experience
3 Publications
85% ML Precision
75% Latency Reduced
About

Data Engineer and AI Engineer specializing in RAG systems, autonomous agents, and data pipelines. I build end-to-end solutions, from legacy data ingestion to conversational AI interfaces, and formally evaluate each technique with precision and recall metrics.

Currently pursuing a Master's in Data Science at Universidad Nacional de Ingeniería (UNI), with a background in Mechatronics Engineering, Environmental Engineering, and Project Management. I lead a private AI research laboratory focused on search architectures, evaluation frameworks, and graph-based anomaly detection.

Lima, Peru · 882 followers on LinkedIn · 500+ connections

Publications & Research
Data Quality Over Algorithmic Complexity: Empirical Evidence from a Production Hybrid Search System
Karl J. Mollan Neyra · 2026 · Zenodo

Demonstrates that data quality improvements (ground truth correction, embedding deduplication) yielded a 27% improvement in Precision@5 — with zero code changes and zero cost — outperforming neural reranking approaches. Based on 28 formal evaluations with 52 reference questions.

Information Retrieval · Data Quality · Hybrid Search · HyDE · Evaluation
Conceptual Mechatronics Design and Prototyping of Autonomous Inverted Pendulum-System Applied on Two-Wheeled Mobile Robot
Karl J. Mollan Neyra · USIL

Design and prototype of an autonomous two-wheeled mobile robot using inverted pendulum control systems, with stress simulations and material validation.

Mechatronics · Control Systems · Robotics · Prototyping
GNN-Based Anomaly Detection for Fraud Identification in Payment Systems
Karl J. Mollan Neyra · 2026 (in progress)

Applies Graph Neural Networks with temporal pattern analysis to detect fraud rings in instant payment networks. The study uses 6.3M synthetic transactions modeled after regional patterns.

GNN · Fraud Detection · Graph Anomaly · Fintech · PaySim
AI Lab — Live Demos

Semantic Similarity Explorer

Type two sentences to see how an embedding model scores the similarity of their meaning. Powered by open-source models via the HuggingFace Inference API.
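
As a rough illustration of the idea behind the demo, the sketch below scores two embedding vectors with cosine similarity. It assumes the demo uses cosine similarity (the metric named under Hybrid Search Engine); the function name and the tiny 3-dimensional vectors are hypothetical stand-ins for real model output, not the demo's actual code.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as unrelated.
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy "embeddings"; real sentence embeddings have hundreds of dimensions.
sentence_a = [0.2, 0.7, 0.1]
sentence_b = [0.25, 0.65, 0.05]
score = cosine_similarity(sentence_a, sentence_b)
```

Scores close to 1 indicate vectors pointing in nearly the same direction, which is how the two sentences would be judged similar in meaning.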

System Architecture

Hybrid Search Engine

Dual embedding (384d + 768d) with vector cosine similarity, full-text search, and Reciprocal Rank Fusion. Intelligent query routing based on complexity.

Vector DB · FTS · RRF · HyDE
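
To show how Reciprocal Rank Fusion can merge a vector ranking with a full-text ranking, here is a minimal sketch. The function name, the document IDs, and the constant k = 60 (a commonly used default) are assumptions for illustration, not the production configuration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def reciprocal_rank_fusion(rankings: Iterable[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document receives 1 / (k + rank) from every list it appears in,
    where rank starts at 1; the contributions are summed across lists.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from the two retrievers.
vector_hits = ["a", "b", "c"]
full_text_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([vector_hits, full_text_hits])
# fused == ['b', 'a', 'd', 'c']
```

Because RRF only looks at ranks, it needs no score normalization between cosine similarities and full-text relevance scores.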

Knowledge Graph

1,345 entities and 1,716 relationships extracted from domain documents. Enables relationship-aware queries beyond keyword matching.

GraphRAG · Entity Extraction · Relationship Mining
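
A relationship-aware query can be pictured as a bounded traversal over extracted (subject, relation, object) triples. The sketch below is an illustration under assumptions: the triples, entity names, and function names are hypothetical, and a real GraphRAG setup would typically sit on a graph or relational store rather than an in-memory dictionary.

```python
from collections import defaultdict, deque
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def build_graph(triples: List[Triple]) -> Dict[str, List[Tuple[str, str]]]:
    """Index triples as an undirected adjacency list keyed by entity."""
    graph: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
        graph[obj].append((relation, subject))
    return graph


def related_entities(
    graph: Dict[str, List[Tuple[str, str]]], start: str, max_hops: int = 2
) -> Dict[str, int]:
    """Breadth-first search returning each reachable entity and its hop distance."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for _relation, neighbor in graph.get(node, []):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    distances.pop(start)
    return distances


# Hypothetical triples, not taken from the real knowledge graph.
triples = [
    ("invoice", "issued_by", "vendor"),
    ("vendor", "located_in", "Lima"),
    ("Lima", "part_of", "Peru"),
]
graph = build_graph(triples)
nearby = related_entities(graph, "invoice", max_hops=2)
# nearby == {'vendor': 1, 'Lima': 2}
```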

Evaluation Framework

52 ground truth questions across 10+ domains, 62 evaluation runs. Metrics: Precision@5, MRR, Hit Rate, Recall with stratified difficulty levels.

P@5: 0.704 · MRR: 0.853 · Automated Eval
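
For readers unfamiliar with the metrics listed above, this sketch computes Precision@5, MRR, and Hit Rate over a set of evaluation runs. It uses the standard textbook definitions; the function names and the two sample queries are hypothetical and are not the framework's actual code or data.

```python
from typing import Dict, List, Set, Tuple


def precision_at_k(retrieved: List[str], relevant: Set[str], k: int = 5) -> float:
    """Fraction of the top-k results that are relevant (divides by k)."""
    return sum(1 for doc in retrieved[:k] if doc in relevant) / k


def reciprocal_rank(retrieved: List[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    for rank, doc in enumerate(retrieved, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def hit_at_k(retrieved: List[str], relevant: Set[str], k: int = 5) -> bool:
    """True if at least one relevant document appears in the top k."""
    return any(doc in relevant for doc in retrieved[:k])


def evaluate(runs: List[Tuple[List[str], Set[str]]], k: int = 5) -> Dict[str, float]:
    """Average each metric over all (retrieved, relevant) pairs."""
    if not runs:
        raise ValueError("at least one evaluation run is required")
    n = len(runs)
    return {
        "precision_at_k": sum(precision_at_k(r, rel, k) for r, rel in runs) / n,
        "mrr": sum(reciprocal_rank(r, rel) for r, rel in runs) / n,
        "hit_rate": sum(hit_at_k(r, rel, k) for r, rel in runs) / n,
    }


# Two hypothetical queries with their retrieved lists and ground truth sets.
sample_runs = [
    (["a", "x", "b", "y", "z"], {"a", "b"}),
    (["x", "c", "y", "z", "w"], {"c"}),
]
report = evaluate(sample_runs)
```

In this toy example the first relevant documents appear at ranks 1 and 2, so MRR is the mean of 1 and 0.5.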

ML Pipeline — Medical AI

End-to-end pipeline: image capture → preprocessing → ML classification → PDF report → email delivery. Anemia detection from nail images.

Computer Vision · 85% Precision · Automated Pipeline

Security by Design

Every system follows secure development principles: credential rotation, environment isolation (dev/staging/prod), encrypted secrets management, and automated QA checks before deployment.

Secret Vaulting · Env Isolation · Automated QA · OWASP

Abuse Prevention & Monitoring

Rate limiting per IP, CAPTCHA verification, request fingerprinting, anomaly detection on access patterns, and automatic circuit breakers for suspicious behavior.

Rate Limiting · IP Tracking · Circuit Breakers · Logging
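
Per-IP rate limiting can be sketched with a sliding window over recent request timestamps. This is one common approach, shown under assumptions: the class name, limits, and IP addresses are hypothetical, and a deployed system would usually keep this state in a shared store rather than in process memory.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowRateLimiter:
    """Allow at most max_requests per client within the last window_seconds."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_requests = max_requests
        self._window = window_seconds
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, ip: str) -> bool:
        now = self._clock()
        hits = self._hits[ip]
        # Discard timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._max_requests:
            return False  # over the limit: reject without recording the request
        hits.append(now)
        return True


# Usage with a fake clock (hypothetical limits: 2 requests per 10 seconds).
fake_now = [0.0]
limiter = SlidingWindowRateLimiter(2, 10.0, clock=lambda: fake_now[0])
first = limiter.allow("203.0.113.7")
second = limiter.allow("203.0.113.7")
third = limiter.allow("203.0.113.7")   # rejected: limit reached
other = limiter.allow("198.51.100.4")  # a different IP has its own budget
fake_now[0] = 10.0
after_window = limiter.allow("203.0.113.7")  # allowed again once the window passes
```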
Experience
Data Engineering Specialist · Izipay
Nov 2025 — Present · Data & Analytics
  • Design stored procedures for complex churn analytics with rolling-period logic and weighted averages.
  • Build end-to-end pipelines with existence validation, incremental copy, and daily SLA compliance.
  • Execute cross-platform validations between cloud environments to ensure data consistency in critical migrations.
  • Develop and deploy generative AI APIs and conversational chatbots following internal code and deployment standards.
Analyst I — Planning & Commercial Intelligence · Izipay
Apr 2025 — Nov 2025
  • Built retention KPI pipelines processing multi-million-row datasets in 20 min (vs. 2h prior) — 85% latency reduction.
  • Refactored 8 legacy SQL/Python scripts, restoring 100% reliability and cutting validation to 5 min.
  • Automated segmentation workflows (hours → 15 min) with >98% precision.
  • Mapped more than 38 sensitive data flows for compliance with Ley N° 29733 (Peru's Personal Data Protection Law).
Innovation & AI Coordinator · LOLIMSA
Jan 2025 — Mar 2025
  • Designed 2 WebSocket APIs for legacy-AI interoperability, centralizing monitoring for 5+ clients.
  • Built "LOLIMSA Detect" — anemia detection from nail images, 85% precision (image → ML → PDF/email).
  • Implemented advanced prompt engineering for pharmaceutical data extraction (70% success rate).
R&D Coordinator · LOLIMSA
Jan 2023 — Feb 2025 · 2 years
  • Led cross-functional team delivering innovation projects with data science and generative AI.
  • Delivered Power BI dashboards that improved executive decision speed by 30%.
  • Delivered a desktop-to-web migration that increased operational efficiency by 25%.
  • Delivered predictive analytics models that reduced response times by 20%.
IT Coordinator · LOLIMSA
Jul 2022 — Dec 2022
  • Managed 200+ monthly tickets, reducing downtime 30% with proactive KPI monitoring.
  • Built PMBOK/BPM hybrid framework: 73% faster incident resolution, 7% recurrence, $15K savings.
Junior Engineer · LOLIMSA
Jun 2021 — Jul 2022
  • Developed S-KODA smart tennis ball prototype — Autodesk Inventor, 23% cost reduction.
  • Migrated 5 legacy servers (Huawei → OVH) for hospital systems, 40% less downtime.
Featured Project
Distributed Semantic Inference Engine with WebSocket Orchestration for BI Democratization
Mar 2025 — Oct 2025 · Private Project

Distributed architecture for processing natural language queries (NLQ) to democratize BI in enterprise environments. Asynchronous WebSocket orchestration with adaptive timeouts; semantic engine using open-source LLMs for NLQ-SQL translation enriched with ontological schemas (50+ business terms) and hierarchical fallbacks. Conversational interface with persistent context and distributed tracking.

NLQ-SQL · WebSockets · Distributed Systems · Generative AI
Education

MSc Data Science · Universidad Nacional de Ingeniería (UNI)

2024 — 2026 · Big Data, Python, SQL, R, Data Analysis

MSc Project Management · Escuela de Postgrado UTP

2021 — 2023

BSc Mechatronics Engineering · Universidad San Ignacio de Loyola (USIL)

2020 — 2023

BSc Environmental Engineering · Universidad San Ignacio de Loyola (USIL)

2015 — 2019

Diploma in Agile Project Management · CENTRUM PUCP

2022 · Agile Product Management, Coaching, Agile Teams

Diploma in Biomedical Equipment Management · TECH SENATI

2021 — 2022 · Clinical & Hospital Engineering

Certifications
  • Autodesk Inventor Advanced — UNI
  • AutoCAD 2D & 3D — UNI
  • Professional Graphic Engineering — UNI
  • English Competency — USIL
Technical Skills

Languages & Data

  • Python
  • SQL (Advanced)
  • R
  • Power BI
  • ETL Pipelines

AI & ML

  • RAG Systems
  • Generative AI / LLMs
  • Prompt Engineering
  • Machine Learning
  • Computer Vision

Cloud & Infra

  • GCP (Cloud Run, BigQuery)
  • Azure (Synapse, ADF)
  • PostgreSQL / Docker
  • RESTful APIs / WebSockets

Management

  • PMBOK / Agile / Scrum
  • MS Project
  • BPM / MASP
  • Team Leadership
Leadership & Volunteering
Founder & President — Mechatronics Engineers Society (M.E.S.) · USIL · Oct 2020 — Oct 2021 · 4 divisions: Events, Marketing, R&D, Academic
Teaching Assistant — Digital Circuits · USIL · Mar 2021 — Aug 2021
Environmental Coach — Ecolegio Contest · USIL · May 2021 — Oct 2021
Volunteer — Crea+ Peru · Social Services · Oct 2020 — Nov 2020
Scientific Research Director — Environmental Engineering Club · USIL · Aug 2016 — May 2017
Physics Tutor — General Physics · USIL · Aug 2016 — Dec 2016
Get in Touch

Let's connect

Whether you're a recruiter, researcher, or fellow engineer — I'd love to hear from you. Reach out through any of these channels:

Email Me · LinkedIn

Lima, Peru · Open to collaborations, research partnerships, and new opportunities.