# Hello, I'm Krishna Vamsi Dhulipalla

I'm a **Machine Learning Engineer** with over **3 years of experience** designing and deploying intelligent AI systems, integrating backend infrastructure, and building real-time data workflows. I specialize in **LLM-powered agents**, **semantic search**, **bioinformatics AI models**, and **cloud-native ML infrastructure**.

I earned my **M.S. in Computer Science** from **Virginia Tech** in December 2024 with a 3.95/4.0 GPA, focusing on large language models, intelligent agents, and scalable data systems. My work spans the full ML lifecycle, from research and fine-tuning transformer architectures to deploying production-ready applications on AWS and GCP.

I'm passionate about **LLM-driven systems**, **multi-agent orchestration**, and **domain-adaptive ML**, particularly in **genomic data analysis** and **real-time analytics**.

---
## Career Summary

- 3+ years of experience in **ML systems design**, **LLM-powered applications**, and **data engineering**
- Proven expertise in **transformer fine-tuning** (LoRA, soft prompting) for genomic classification
- Skilled in **LangChain**, **LangGraph**, **AutoGen**, and **CrewAI** for intelligent agent workflows
- Deep knowledge of **AWS** (S3, Glue, Lambda, SageMaker, ECS, CloudWatch) and **GCP** (BigQuery, Dataflow, Composer)
- Experienced in **real-time data pipelines** using **Apache Kafka**, **Spark**, **Airflow**, and **dbt**
- Strong foundation in **synthetic data generation**, **domain adaptation**, and **cross-domain NER**

## Areas of Current Focus

- Developing **LLM-powered mobile automation agents** for UI task execution
- Architecting **retrieval-augmented generation (RAG)** systems with hybrid retrieval and cross-encoder reranking
- Fine-tuning **DNA foundation models** like DNABERT & HyenaDNA for plant genomics
- Building **real-time analytics pipelines** integrating Kafka, Spark, Airflow, and cloud services

---
## Education

### Virginia Tech - M.S. in Computer Science

Blacksburg, VA | Jan 2023 - Dec 2024

**GPA:** 3.95 / 4.0

Relevant Coursework: Distributed Systems, Machine Learning Optimization, Genomics, LLMs & Transformer Architectures

### Anna University - B.Tech in Computer Science and Engineering

Chennai, India | Jun 2018 - May 2022

**GPA:** 8.24 / 10

Specialization: Real-Time Analytics, Cloud Systems, Software Engineering Principles

---
## Technical Skills

- **Programming:** Python, R, SQL, JavaScript, TypeScript, Node.js, FastAPI, MongoDB
- **ML Frameworks:** PyTorch, TensorFlow, scikit-learn, Hugging Face Transformers
- **LLM & Agents:** LangChain, LangGraph, AutoGen, CrewAI, Prompt Engineering, RAG, LoRA, GANs
- **ML Techniques:** Self-Supervised Learning, Cross-Domain Adaptation, Hyperparameter Optimization, A/B Testing
- **Data Engineering:** Apache Spark, Kafka, dbt, Airflow, ETL Pipelines, Delta Lake, Snowflake
- **Cloud & Infra:** AWS (S3, Glue, Lambda, Redshift, ECS, SageMaker, CloudWatch), GCP (GCS, BigQuery, Dataflow, Composer)
- **DevOps/MLOps:** Docker, Kubernetes, MLflow, CI/CD, Weights & Biases
- **Visualization:** Tableau, Shiny (R), Plotly, Matplotlib
- **Other Tools:** Pandas, NumPy, Git, LangSmith, LangFlow, Linux

---
## Professional Experience

### Cloud Systems LLC - ML Research Engineer (Current role)

Remote | Jul 2024 - Present

- Designed and optimized **SQL-based data retrieval** and **batch + real-time pipelines**
- Built automated **ETL workflows** integrating multiple data sources

### Virginia Tech - ML Research Engineer

Blacksburg, VA | Sep 2024 - Jul 2024

- Developed **DNA sequence classification pipelines** using DNABERT & HyenaDNA with LoRA & soft prompting (94%+ accuracy)
- Automated preprocessing of **1M+ genomic sequences** with Biopython & Airflow, reducing runtime by 40%
- Built **LangChain-based semantic search** for genomics literature
- Deployed fine-tuned LLMs using Docker and MLflow, with SageMaker as an optional deployment target

### Virginia Tech - Research Assistant

Blacksburg, VA | Jun 2023 - May 2024

- Built **genomic ETL pipelines** (Airflow + AWS Glue) improving research data availability by 50%
- Automated retraining workflows via CI/CD, reducing manual workload by 40%
- Benchmarked compute cluster performance to cut runtime costs by 15%

### UJR Technologies Pvt Ltd - Data Engineer

Hyderabad, India | Jul 2021 - Dec 2022

- Migrated **batch ETL to real-time streaming** with Kafka & Spark, reducing latency by 30%
- Deployed Dockerized microservices to AWS ECS, improving deployment speed by 25%
- Optimized Snowflake schemas to improve query performance by 40%

---
## Highlight Projects

### LLM-Based Android Agent

- Multi-step UI automation with memory, self-reflection, and context recovery (80%+ accuracy)

### Real-Time IoT-Based Temperature Forecasting

- Kafka-based pipeline for 10K+ sensor readings with LLaMA 2-based time series model (91% accuracy)
- Airflow + Looker dashboards, reducing manual reporting by 30%
- S3 lifecycle policies saved 40% storage cost with versioned backups

[GitHub](https://github.com/krishna-creator/Real-Time-IoT-Based-Temperature-Analytics-and-Forecasting)

### Proxy TuNER: Cross-Domain NER

- Developed a proxy tuning method for domain-agnostic BERT
- 15% generalization gain using gradient reversal + feature alignment
- 70% cost reduction via logit-level ensembling

[GitHub](https://github.com/krishna-creator/ProxytuNER)

### IntelliMeet: AI-Powered Conferencing

- Conferencing platform with federated learning and end-to-end encryption
- Live attention detection using RetinaFace (<200ms latency)
- Summarization built on Transformer-based speech-to-text

[GitHub](https://github.com/krishna-creator/SE-Project---IntelliMeet)

### Automated Drone Image Analysis

- Real-time crop disease detection using drone imagery
- Used OpenCV, RAG, and GANs for synthetic data generation
- Improved detection accuracy by 15% and reduced processing latency by 70%

---
## Certifications

- NVIDIA - Building RAG Agents with LLMs
- Google Cloud - Data Engineering Foundations
- AWS - Machine Learning Specialty
- Microsoft - MERN Stack Development
- Snowflake - End-to-End Data Engineering
- Coursera - Machine Learning Specialization

[View All Credentials](https://www.linkedin.com/in/krishnavamsidhulipalla/)

---
## Research Publications

- **IEEE BIBM 2024** - "Leveraging ML for Predicting Circadian Transcription in mRNAs and lncRNAs"
  [DOI: 10.1109/BIBM62325.2024.10822684](https://doi.org/10.1109/BIBM62325.2024.10822684)
- **MLCB** - "Harnessing DNA Foundation Models for TF Binding Prediction in Plants"

---
## External Links and Contact Details

- [Personal portfolio and website](http://krishna-dhulipalla.github.io)
- [GitHub](https://github.com/Krishna-dhulipalla)
- [LinkedIn](https://www.linkedin.com/in/krishnavamsidhulipalla)
- dhulipallakrishnavamsi@gmail.com
- [Personal Chatbot](https://huggingface.co/spaces/krishnadhulipalla/Personal_ChatBot)