
👋 Hello, I'm Krishna Vamsi Dhulipalla

I'm a Machine Learning Engineer with over 3 years of experience designing and deploying intelligent AI systems, integrating backend infrastructure, and building real-time data workflows. I specialize in LLM-powered agents, semantic search, bioinformatics AI models, and cloud-native ML infrastructure.

I earned my M.S. in Computer Science from Virginia Tech in December 2024 with a 3.95/4.0 GPA, focusing on large language models, intelligent agents, and scalable data systems. My work spans the full ML lifecycle, from research and fine-tuning transformer architectures to deploying production-ready applications on AWS and GCP.

I'm passionate about LLM-driven systems, multi-agent orchestration, and domain-adaptive ML, particularly in genomic data analysis and real-time analytics.


🎯 Career Summary

  • 👨‍💻 3+ years of experience in ML systems design, LLM-powered applications, and data engineering
  • 🧬 Proven expertise in transformer fine-tuning (LoRA, soft prompting) for genomic classification
  • 🤖 Skilled in LangChain, LangGraph, AutoGen, and CrewAI for intelligent agent workflows
  • ☁️ Deep knowledge of AWS (S3, Glue, Lambda, SageMaker, ECS, CloudWatch) and GCP (BigQuery, Dataflow, Composer)
  • ⚡ Experienced in real-time data pipelines using Apache Kafka, Spark, Airflow, and dbt
  • 📊 Strong foundation in synthetic data generation, domain adaptation, and cross-domain NER

🔭 Areas of Current Focus

  • Developing LLM-powered mobile automation agents for UI task execution
  • Architecting retrieval-augmented generation (RAG) systems with hybrid retrieval and cross-encoder reranking
  • Fine-tuning DNA foundation models like DNABERT & HyenaDNA for plant genomics
  • Building real-time analytics pipelines integrating Kafka, Spark, Airflow, and cloud services

🎓 Education

Virginia Tech – M.S. in Computer Science

📍 Blacksburg, VA | Jan 2023 – Dec 2024
GPA: 3.95 / 4.0
Relevant Coursework: Distributed Systems, Machine Learning Optimization, Genomics, LLMs & Transformer Architectures

Anna University – B.Tech in Computer Science and Engineering

📍 Chennai, India | Jun 2018 – May 2022
GPA: 8.24 / 10
Specialization: Real-Time Analytics, Cloud Systems, Software Engineering Principles


🛠️ Technical Skills

Programming: Python, R, SQL, JavaScript, TypeScript, Node.js, FastAPI, MongoDB
ML Frameworks: PyTorch, TensorFlow, scikit-learn, Hugging Face Transformers
LLM & Agents: LangChain, LangGraph, AutoGen, CrewAI, Prompt Engineering, RAG, LoRA, GANs
ML Techniques: Self-Supervised Learning, Cross-Domain Adaptation, Hyperparameter Optimization, A/B Testing
Data Engineering: Apache Spark, Kafka, dbt, Airflow, ETL Pipelines, Delta Lake, Snowflake
Cloud & Infra: AWS (S3, Glue, Lambda, Redshift, ECS, SageMaker, CloudWatch), GCP (GCS, BigQuery, Dataflow, Composer)
DevOps/MLOps: Docker, Kubernetes, MLflow, CI/CD, Weights & Biases
Visualization: Tableau, Shiny (R), Plotly, Matplotlib
Other Tools: Pandas, NumPy, Git, LangSmith, LangFlow, Linux


💼 Professional Experience

Cloud Systems LLC – ML Research Engineer (Current role)

📍 Remote | Jul 2024 – Present

  • Designed and optimized SQL-based data retrieval and batch + real-time pipelines
  • Built automated ETL workflows integrating multiple data sources

Virginia Tech – ML Research Engineer

📍 Blacksburg, VA | Sep 2024 – Jul 2024

  • Developed DNA sequence classification pipelines using DNABERT & HyenaDNA with LoRA & soft prompting (94%+ accuracy)
  • Automated preprocessing of 1M+ genomic sequences with Biopython & Airflow, reducing runtime by 40%
  • Built LangChain-based semantic search for genomics literature
  • Deployed fine-tuned LLMs using Docker and MLflow, with SageMaker as an optional deployment target

Virginia Tech – Research Assistant

📍 Blacksburg, VA | Jun 2023 – May 2024

  • Built genomic ETL pipelines (Airflow + AWS Glue) improving research data availability by 50%
  • Automated retraining workflows via CI/CD, reducing manual workload by 40%
  • Benchmarked compute cluster performance to cut runtime costs by 15%

UJR Technologies Pvt Ltd – Data Engineer

📍 Hyderabad, India | Jul 2021 – Dec 2022

  • Migrated batch ETL to real-time streaming with Kafka & Spark (↓ latency 30%)
  • Deployed Dockerized microservices to AWS ECS, improving deployment speed by 25%
  • Optimized Snowflake schemas to improve query performance by 40%

📊 Highlight Projects

  • LLM-Based Android Agent – Multi-step UI automation with memory, self-reflection, and context recovery (80%+ accuracy)

Real-Time IoT-Based Temperature Forecasting

  • Kafka-based pipeline for 10K+ sensor readings with LLaMA 2-based time series model (91% accuracy)
  • Airflow + Looker dashboards (↓ manual reporting by 30%)
  • S3 lifecycle policies saved 40% storage cost with versioned backups
    🔗 GitHub

Proxy TuNER: Cross-Domain NER

  • Developed a proxy tuning method for domain-agnostic BERT
  • 15% generalization gain using gradient reversal + feature alignment
  • 70% cost reduction via logit-level ensembling
    🔗 GitHub

IntelliMeet: AI-Powered Conferencing

  • End-to-end encrypted conferencing platform with federated learning
  • Live attention detection using RetinaFace (<200ms latency)
  • Summarization with Transformer-based speech-to-text
    🔗 GitHub

Automated Drone Image Analysis

  • Real-time crop disease detection using drone imagery
  • Used OpenCV, RAG, and GANs for synthetic data generation
  • Improved detection accuracy by 15% and reduced processing latency by 70%

📜 Certifications

  • 🏆 NVIDIA – Building RAG Agents with LLMs
  • 🏆 Google Cloud – Data Engineering Foundations
  • 🏆 AWS – Machine Learning Specialty
  • 🏆 Microsoft – MERN Stack Development
  • 🏆 Snowflake – End-to-End Data Engineering
  • 🏆 Coursera – Machine Learning Specialization
    🔗 View All Credentials

📚 Research Publications

  • IEEE BIBM 2024 – “Leveraging ML for Predicting Circadian Transcription in mRNAs and lncRNAs”
    DOI: 10.1109/BIBM62325.2024.10822684

  • MLCB – “Harnessing DNA Foundation Models for TF Binding Prediction in Plants”


🔗 External Links / Contact details