Hello, I'm Krishna Vamsi Dhulipalla
I'm a Machine Learning Engineer with over 3 years of experience designing and deploying intelligent AI systems, integrating backend infrastructure, and building real-time data workflows. I specialize in LLM-powered agents, semantic search, bioinformatics AI models, and cloud-native ML infrastructure.
I earned my M.S. in Computer Science from Virginia Tech in December 2024 with a 3.95/4.0 GPA, focusing on large language models, intelligent agents, and scalable data systems. My work spans the full ML lifecycle, from research and fine-tuning transformer architectures to deploying production-ready applications on AWS and GCP.
I'm passionate about LLM-driven systems, multi-agent orchestration, and domain-adaptive ML, particularly in genomic data analysis and real-time analytics.
Career Summary
- 3+ years of experience in ML systems design, LLM-powered applications, and data engineering
- Proven expertise in transformer fine-tuning (LoRA, soft prompting) for genomic classification
- Skilled in LangChain, LangGraph, AutoGen, and CrewAI for intelligent agent workflows
- Deep knowledge of AWS (S3, Glue, Lambda, SageMaker, ECS, CloudWatch) and GCP (BigQuery, Dataflow, Composer)
- Experienced in real-time data pipelines using Apache Kafka, Spark, Airflow, and dbt
- Strong foundation in synthetic data generation, domain adaptation, and cross-domain NER
Areas of Current Focus
- Developing LLM-powered mobile automation agents for UI task execution
- Architecting retrieval-augmented generation (RAG) systems with hybrid retrieval and cross-encoder reranking
- Fine-tuning DNA foundation models like DNABERT & HyenaDNA for plant genomics
- Building real-time analytics pipelines integrating Kafka, Spark, Airflow, and cloud services
Education
Virginia Tech – M.S. in Computer Science
Blacksburg, VA | Jan 2023 – Dec 2024
GPA: 3.95 / 4.0
Relevant Coursework: Distributed Systems, Machine Learning Optimization, Genomics, LLMs & Transformer Architectures
Anna University – B.Tech in Computer Science and Engineering
Chennai, India | Jun 2018 – May 2022
GPA: 8.24 / 10
Specialization: Real-Time Analytics, Cloud Systems, Software Engineering Principles
Technical Skills
Programming: Python, R, SQL, JavaScript, TypeScript, Node.js, FastAPI, MongoDB
ML Frameworks: PyTorch, TensorFlow, scikit-learn, Hugging Face Transformers
LLM & Agents: LangChain, LangGraph, AutoGen, CrewAI, Prompt Engineering, RAG, LoRA, GANs
ML Techniques: Self-Supervised Learning, Cross-Domain Adaptation, Hyperparameter Optimization, A/B Testing
Data Engineering: Apache Spark, Kafka, dbt, Airflow, ETL Pipelines, Delta Lake, Snowflake
Cloud & Infra: AWS (S3, Glue, Lambda, Redshift, ECS, SageMaker, CloudWatch), GCP (GCS, BigQuery, Dataflow, Composer)
DevOps/MLOps: Docker, Kubernetes, MLflow, CI/CD, Weights & Biases
Visualization: Tableau, Shiny (R), Plotly, Matplotlib
Other Tools: Pandas, NumPy, Git, LangSmith, LangFlow, Linux
Professional Experience
Cloud Systems LLC – ML Research Engineer (Current role)
Remote | Jul 2024 – Present
- Designed and optimized SQL-based data retrieval and batch + real-time pipelines
- Built automated ETL workflows integrating multiple data sources
Virginia Tech – ML Research Engineer
Blacksburg, VA | Sep 2024 – Jul 2024
- Developed DNA sequence classification pipelines using DNABERT & HyenaDNA with LoRA & soft prompting (94%+ accuracy)
- Automated preprocessing of 1M+ genomic sequences with Biopython & Airflow, reducing runtime by 40%
- Built LangChain-based semantic search for genomics literature
- Deployed fine-tuned LLMs using Docker and MLflow, with SageMaker as an optional deployment target
Virginia Tech – Research Assistant
Blacksburg, VA | Jun 2023 – May 2024
- Built genomic ETL pipelines (Airflow + AWS Glue) improving research data availability by 50%
- Automated retraining workflows via CI/CD, reducing manual workload by 40%
- Benchmarked compute cluster performance to cut runtime costs by 15%
UJR Technologies Pvt Ltd – Data Engineer
Hyderabad, India | Jul 2021 – Dec 2022
- Migrated batch ETL to real-time streaming with Kafka & Spark, reducing latency by 30%
- Deployed Dockerized microservices to AWS ECS, improving deployment speed by 25%
- Optimized Snowflake schemas to improve query performance by 40%
Highlight Projects
- LLM-Based Android Agent – Multi-step UI automation with memory, self-reflection, and context recovery (80%+ accuracy)
Real-Time IoT-Based Temperature Forecasting
- Kafka-based pipeline for 10K+ sensor readings with LLaMA 2-based time series model (91% accuracy)
- Airflow + Looker dashboards, reducing manual reporting by 30%
- S3 lifecycle policies saved 40% storage cost with versioned backups
GitHub
Proxy TuNER: Cross-Domain NER
- Developed a proxy tuning method for domain-agnostic BERT
- 15% generalization gain using gradient reversal + feature alignment
- 70% cost reduction via logit-level ensembling
GitHub
IntelliMeet: AI-Powered Conferencing
- Federated learning, end-to-end encrypted platform
- Live attention detection using RetinaFace (<200ms latency)
- Summarization with Transformer-based speech-to-text
GitHub
Automated Drone Image Analysis
- Real-time crop disease detection using drone imagery
- Used OpenCV, RAG, and GANs for synthetic data generation
- Improved detection accuracy by 15% and reduced processing latency by 70%
Certifications
- NVIDIA – Building RAG Agents with LLMs
- Google Cloud – Data Engineering Foundations
- AWS – Machine Learning Specialty
- Microsoft – MERN Stack Development
- Snowflake – End-to-End Data Engineering
- Coursera – Machine Learning Specialization
View All Credentials
Research Publications
IEEE BIBM 2024 – "Leveraging ML for Predicting Circadian Transcription in mRNAs and lncRNAs"
DOI: 10.1109/BIBM62325.2024.10822684
MLCB – "Harnessing DNA Foundation Models for TF Binding Prediction in Plants"
External Links / Contact Details
- Personal portfolio / website
- GitHub
- LinkedIn
- dhulipallakrishnavamsi@gmail.com
- Personal Chatbot