Zeyad12 committed
Commit 45edf4a · verified · 1 Parent(s): dc1ebea

Update README.md

Files changed (1): README.md (+0 -723)
README.md CHANGED
@@ -361,438 +361,6 @@ Epoch 4: Fine-tuned generation quality

---

## API Documentation

### Overview

The PEGASUS Summarization API provides a RESTful interface for document summarization, with comprehensive error handling, performance optimization, and flexible configuration options.

### Base URL

```
http://localhost:5000
```

### Authentication

Currently, no authentication is required. For production deployment, consider implementing API key authentication, as sketched below.
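A minimal sketch of such a gate, for illustration only: the `X-API-Key` header name and `SUMMARIZER_API_KEY` environment variable are assumptions, and in practice the check would attach to the existing `app` object in `app.py`.

```python
# Hypothetical API-key gate (illustrative; header and env var are assumptions)
import os

from flask import Flask, request, jsonify

app = Flask(__name__)
API_KEY = os.environ.get("SUMMARIZER_API_KEY", "change-me")

@app.before_request
def require_api_key():
    if request.path == "/health":  # leave health checks open
        return None
    if request.headers.get("X-API-Key") != API_KEY:
        return jsonify({"error": "Invalid or missing API key", "success": False}), 401
```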

### Endpoints

#### 1. Health Check

```http
GET /health
```

**Response:**

```json
{
  "status": "healthy",
  "model_loaded": true,
  "device": "cuda:0",
  "timestamp": 1638360000.0
}
```
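A quick readiness probe before sending real traffic might look like this (assumes the server from Installation & Setup is running locally):

```python
# Poll the health endpoint and report the device the model loaded on.
import requests

resp = requests.get("http://localhost:5000/health", timeout=5)
resp.raise_for_status()
info = resp.json()
print(f"Model loaded: {info['model_loaded']} on {info['device']}")
```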

#### 2. Model Information

```http
GET /model-info
```

**Response:**

```json
{
  "model_name": "Fine-tuned PEGASUS",
  "base_model": "google/pegasus-large",
  "fine_tuned_on": "Scientific Papers Dataset (500 documents)",
  "max_input_length": 1024,
  "max_output_length": 512,
  "device": "cuda:0",
  "capabilities": {
    "chunking": true,
    "length_control": true,
    "custom_parameters": true,
    "batch_processing": false,
    "streaming": false
  }
}
```

#### 3. Summarization

```http
POST /summarize
```

**Request Body:**

```json
{
  "text": "Your document text here...",
  "max_length": 200,
  "config": {
    "num_beams": 4,
    "length_penalty": 2.0,
    "temperature": 1.0,
    "top_k": 50,
    "top_p": 0.95
  }
}
```

**Response:**

```json
{
  "summary": "Generated summary text...",
  "input_length": 1250,
  "input_tokens": 312,
  "output_length": 180,
  "output_tokens": 45,
  "processing_time": 2.34,
  "chunks_processed": 1,
  "model_used": "fine-tuned-pegasus",
  "success": true
}
```

### Configuration Parameters

| Parameter              | Type  | Range   | Default | Description                        |
| ---------------------- | ----- | ------- | ------- | ---------------------------------- |
| `max_length`           | int   | 50-500  | 512     | Maximum summary length in tokens   |
| `num_beams`            | int   | 1-8     | 4       | Beam search width for generation   |
| `length_penalty`       | float | 0.5-3.0 | 2.0     | Penalty for sequence length        |
| `temperature`          | float | 0.1-2.0 | 1.0     | Sampling temperature               |
| `top_k`                | int   | 10-100  | 50      | Top-k sampling parameter           |
| `top_p`                | float | 0.1-1.0 | 0.95    | Nucleus sampling parameter         |
| `diversity_penalty`    | float | 0.0-2.0 | 0.5     | Diversity penalty for beam groups  |
| `no_repeat_ngram_size` | int   | 1-5     | 3       | Prevents n-gram repetition         |
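A small client-side helper can reject out-of-range values before a request is sent. The sketch below simply mirrors the ranges in the table; the server may enforce its own limits:

```python
# Client-side sanity check; ranges mirror the parameter table above.
CONFIG_RANGES = {
    "num_beams": (1, 8),
    "length_penalty": (0.5, 3.0),
    "temperature": (0.1, 2.0),
    "top_k": (10, 100),
    "top_p": (0.1, 1.0),
    "diversity_penalty": (0.0, 2.0),
    "no_repeat_ngram_size": (1, 5),
}

def validate_config(config):
    for key, value in config.items():
        if key not in CONFIG_RANGES:
            raise ValueError(f"Unknown parameter: {key}")
        low, high = CONFIG_RANGES[key]
        if not low <= value <= high:
            raise ValueError(f"{key}={value} outside allowed range [{low}, {high}]")
    return config
```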

### Error Handling

**HTTP Status Codes:**

- `200`: Success
- `400`: Bad Request (invalid parameters)
- `404`: Endpoint not found
- `405`: Method not allowed
- `500`: Internal server error

**Error Response Format:**

```json
{
  "error": "Error description",
  "success": false,
  "details": "Additional error information"
}
```

### Rate Limiting & Performance

**Current Limitations:**

- No rate limiting implemented (add for production)
- Single-request processing (no batch support)
- Memory usage scales with document length

**Performance Characteristics:**

- Short documents (< 500 tokens): ~1-2 seconds
- Medium documents (500-1000 tokens): ~2-4 seconds
- Long documents (> 1000 tokens): ~4-8 seconds

---

## Installation & Setup

### Prerequisites

**System Requirements:**

- Python 3.8 or higher
- 8 GB RAM minimum (16 GB recommended)
- GPU with 6 GB VRAM (optional but recommended)
- 10 GB free disk space

**Dependencies:**

- PyTorch 2.0+
- Transformers 4.30+
- Flask 2.3+
- Other packages listed in `requirements.txt`

### Installation Steps

#### 1. Clone/Download the Project

```powershell
# Navigate to your desired directory
cd "f:\University\GP Final\Summarization_Model"
```

#### 2. Create Virtual Environment

```powershell
# Create the virtual environment
python -m venv venv

# Activate it
.\venv\Scripts\Activate.ps1
```

#### 3. Install Dependencies

```powershell
# Install required packages
pip install -r requirements.txt
```

#### 4. Verify Model Files

Ensure the fine-tuned model is available:

```
Pegasus-Fine-Tuned/
└── checkpoint-200/
    ├── config.json
    ├── model.safetensors
    ├── tokenizer_config.json
    └── ...
```
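Optionally, confirm the checkpoint loads cleanly before starting the server. This short check assumes the directory layout shown above:

```python
# Load the tokenizer and model once to verify the checkpoint is intact.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_path = "Pegasus-Fine-Tuned/checkpoint-200"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSeq2SeqLM.from_pretrained(model_path)
print(f"Loaded {model.config.model_type} model with {model.num_parameters():,} parameters")
```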

#### 5. Start the API Server

```powershell
# Start the Flask application
python app.py
```

The server starts on `http://localhost:5000`.

### Docker Deployment (Optional)

Create a `Dockerfile`:

```dockerfile
FROM python:3.9-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt

COPY . .
EXPOSE 5000

CMD ["python", "app.py"]
```

Build and run:

```powershell
docker build -t pegasus-api .
docker run -p 5000:5000 pegasus-api
```

---

## Usage Examples

### 1. Basic Python Client

```python
import requests
import json

def summarize_text(text, max_length=200):
    url = "http://localhost:5000/summarize"
    payload = {
        "text": text,
        "max_length": max_length
    }

    response = requests.post(url, json=payload)

    if response.status_code == 200:
        result = response.json()
        if result["success"]:
            return result["summary"]
        else:
            print(f"Error: {result['error']}")
    else:
        print(f"HTTP Error: {response.status_code}")

    return None

# Example usage
document = """
Artificial intelligence and machine learning have transformed numerous industries
in recent years. From healthcare to finance, these technologies are enabling
automation and insights that were previously impossible. Deep learning, in
particular, has shown remarkable success in computer vision, natural language
processing, and speech recognition tasks.
"""

summary = summarize_text(document)
print(f"Summary: {summary}")
```

### 2. Advanced Configuration

```python
def advanced_summarize(text):
    url = "http://localhost:5000/summarize"
    payload = {
        "text": text,
        "max_length": 150,
        "config": {
            "num_beams": 6,
            "length_penalty": 1.5,
            "temperature": 0.8,
            "top_p": 0.9,
            "diversity_penalty": 0.7
        }
    }

    response = requests.post(url, json=payload)
    return response.json()

# More creative and diverse summaries
result = advanced_summarize(document)
print(f"Advanced Summary: {result['summary']}")
print(f"Processing Time: {result['processing_time']:.2f}s")
```

### 3. Batch Processing

```python
def batch_summarize(documents, max_length=200):
    """Process multiple documents sequentially"""
    results = []

    for i, doc in enumerate(documents):
        print(f"Processing document {i+1}/{len(documents)}")
        summary = summarize_text(doc, max_length)
        results.append({
            "document_id": i,
            "original_length": len(doc),
            "summary": summary
        })

    return results

# Example with multiple documents
documents = [
    "Document 1 content...",
    "Document 2 content...",
    "Document 3 content..."
]

batch_results = batch_summarize(documents)
```

### 4. Error Handling

```python
import time

import requests

def robust_summarize(text, max_retries=3):
    """Summarize with retry logic and error handling"""
    for attempt in range(max_retries):
        try:
            response = requests.post(
                "http://localhost:5000/summarize",
                json={"text": text},
                timeout=30
            )

            if response.status_code == 200:
                result = response.json()
                if result["success"]:
                    return result
                else:
                    print(f"API Error: {result['error']}")
            else:
                print(f"HTTP Error: {response.status_code}")

        except requests.exceptions.Timeout:
            print(f"Timeout on attempt {attempt + 1}")
        except requests.exceptions.ConnectionError:
            print(f"Connection error on attempt {attempt + 1}")

        if attempt < max_retries - 1:
            time.sleep(2 ** attempt)  # Exponential backoff

    return None
```

### 5. Performance Monitoring

```python
def monitor_performance(text):
    """Monitor and log performance metrics"""
    import time

    start_time = time.time()
    # Use robust_summarize (above), which returns the full JSON response;
    # summarize_text returns only the summary string.
    result = robust_summarize(text)
    client_time = time.time() - start_time

    if result:
        server_time = result.get("processing_time", 0)
        network_time = client_time - server_time

        print("Performance Metrics:")
        print(f"  Total Time: {client_time:.2f}s")
        print(f"  Server Time: {server_time:.2f}s")
        print(f"  Network Time: {network_time:.2f}s")
        print(f"  Input Tokens: {result.get('input_tokens', 'N/A')}")
        print(f"  Output Tokens: {result.get('output_tokens', 'N/A')}")
        print(f"  Chunks Processed: {result.get('chunks_processed', 1)}")

    return result
```
757
-
758
- ### 6. Integration with File Processing
759
-
760
- ```python
761
- import os
762
- from pathlib import Path
763
-
764
- def process_text_files(directory_path, output_file="summaries.json"):
765
- """Process all text files in a directory"""
766
- results = []
767
- directory = Path(directory_path)
768
-
769
- for file_path in directory.glob("*.txt"):
770
- with open(file_path, 'r', encoding='utf-8') as file:
771
- content = file.read()
772
-
773
- print(f"Processing: {file_path.name}")
774
- summary = summarize_text(content)
775
-
776
- results.append({
777
- "filename": file_path.name,
778
- "original_length": len(content),
779
- "summary": summary,
780
- "summary_length": len(summary) if summary else 0
781
- })
782
-
783
- # Save results
784
- with open(output_file, 'w', encoding='utf-8') as f:
785
- json.dump(results, f, indent=2, ensure_ascii=False)
786
-
787
- print(f"Results saved to {output_file}")
788
- return results
789
-
790
- # Process all text files in a directory
791
- results = process_text_files("./documents/")
792
- ```
793
-
794
- ---
795
-
796
  ## Technical Specifications
797
 
798
  ### Model Architecture Details
@@ -847,41 +415,6 @@ Max Position Embeddings: 1024
- Concurrent requests: Limited by memory (recommend 1-2 concurrent on 16GB RAM)
- Daily capacity: ~1000-5000 documents (depends on length and hardware)

### Scalability Considerations

**Current Limitations:**

1. Single-threaded processing
2. No request queuing
3. Memory usage scales with document length
4. No horizontal scaling support

**Recommended Improvements for Production:**

1. Implement request queuing with Redis/RabbitMQ
2. Add horizontal scaling with a load balancer
3. Implement caching for repeated requests (see the sketch after this list)
4. Add batch processing capabilities
5. Optimize memory usage with model quantization

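For item 3, a minimal in-process sketch is shown below; `summarize_fn` stands in for whatever function actually runs the model, and a production setup would more likely use Redis so the cache is shared across workers and survives restarts.

```python
# Illustrative request cache: hash the text + config, reuse prior summaries.
import hashlib
import json

_cache = {}

def cached_summary(text, config, summarize_fn):
    key_material = json.dumps({"text": text, "config": config}, sort_keys=True)
    key = hashlib.sha256(key_material.encode("utf-8")).hexdigest()
    if key not in _cache:
        _cache[key] = summarize_fn(text, config)
    return _cache[key]
```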
### Security Considerations

**Current Security Features:**

- Input validation and sanitization
- Error message filtering
- Request size limits

**Production Security Recommendations:**

1. **API Authentication**: Implement JWT or API key authentication (see the sketch in the Authentication section above)
2. **Rate Limiting**: Prevent abuse with request rate limits (see the sketch after this list)
3. **Input Validation**: Comprehensive input sanitization
4. **HTTPS**: Use SSL/TLS encryption
5. **Monitoring**: Log all requests and monitor for anomalies
6. **Network Security**: Use firewalls and VPNs for internal access

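For item 2, one option is the Flask-Limiter package (an assumed extra dependency, not currently in `requirements.txt`); the limit shown is a placeholder:

```python
# Hypothetical rate limit via Flask-Limiter (an assumed extra dependency).
from flask import Flask
from flask_limiter import Limiter
from flask_limiter.util import get_remote_address

app = Flask(__name__)
limiter = Limiter(
    key_func=get_remote_address,   # limit per client IP
    app=app,
    default_limits=["60 per minute"],
)
```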
---

## Comparison: Before vs After Fine-tuning

@@ -1074,262 +607,6 @@ Epoch 4: 2.489 (best model selected)

---

## Troubleshooting

### Common Issues and Solutions

#### 1. Model Loading Issues

**Problem**: Model fails to load or takes too long

```
Error: "Failed to load any PEGASUS model"
```

**Solutions:**

```powershell
# Check that the model files exist
ls "Pegasus-Fine-Tuned\checkpoint-200\"

# Verify file integrity: re-extract the checkpoint if corrupted
Expand-Archive -Path "Pegasus-Fine-Tuned\checkpoint-200.zip" -DestinationPath "." -Force

# Check available memory
Get-WmiObject -Class Win32_ComputerSystem | Select-Object TotalPhysicalMemory

# Free up .NET memory in the current session if needed
[System.GC]::Collect()
```

#### 2. CUDA/GPU Issues

**Problem**: GPU not detected or CUDA errors

```
Error: "CUDA out of memory" or "CUDA device not available"
```

**Solutions:**

```python
# Check CUDA availability
import torch
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"CUDA devices: {torch.cuda.device_count()}")

# Clear the GPU cache
torch.cuda.empty_cache()

# Force CPU usage if needed
device = torch.device("cpu")
```

#### 3. Memory Issues

**Problem**: Out-of-memory errors during processing

```
Error: "RuntimeError: CUDA out of memory"
```

**Solutions:**

1. **Reduce batch size**: Set `batch_size = 1` in config
2. **Enable gradient checkpointing**: Add to training args
3. **Use CPU fallback**: Force CPU processing for large documents
4. **Implement chunking**: Process documents in smaller pieces, as sketched below

```python
# Memory-efficient processing. chunk_text, summarize_chunk, and
# combine_summaries are the application's own helpers.
import torch

def process_large_document(text):
    # Split into smaller chunks
    chunks = chunk_text(text, max_chunk_length=500)
    summaries = []

    for chunk in chunks:
        summary = summarize_chunk(chunk)
        summaries.append(summary)

        # Clear the GPU cache after each chunk
        torch.cuda.empty_cache()

    return combine_summaries(summaries)
```

#### 4. API Connection Issues

**Problem**: Cannot connect to the API, or requests time out

**Solutions:**

```powershell
# Check whether the server is listening
netstat -an | findstr :5000

# Test basic connectivity
curl http://localhost:5000/health

# Check firewall settings
netsh advfirewall firewall show rule name="Python"

# Restart the server with verbose logging
python app.py --debug
```

#### 5. Performance Issues

**Problem**: Slow response times or high resource usage

**Optimization Strategies:**

```python
# 1. Optimize generation parameters
config = {
    "num_beams": 2,        # Reduce from 4
    "max_length": 256,     # Reduce if appropriate
    "early_stopping": True
}

# 2. Implement caching (cache on the text itself, which is hashable)
from functools import lru_cache

@lru_cache(maxsize=100)
def cached_summarize(text):
    return summarize(text)

# 3. Load the model in half precision to cut memory use
import torch
from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained(
    model_path,                # as configured for your deployment
    torch_dtype=torch.float16  # Use half precision
)
```

#### 6. Text Processing Issues

**Problem**: Poor-quality summaries or encoding errors

**Solutions:**

```python
# Text preprocessing improvements
import re

def robust_preprocess(text):
    # Handle encoding issues
    if isinstance(text, bytes):
        text = text.decode('utf-8', errors='ignore')

    # Remove problematic (non-ASCII) characters
    text = re.sub(r'[^\x00-\x7F]+', ' ', text)

    # Normalize whitespace
    text = re.sub(r'\s+', ' ', text)

    # Validate minimum length
    if len(text.split()) < 10:
        raise ValueError("Text too short for summarization")

    return text.strip()
```

### Performance Debugging

#### Monitoring Tools

**1. GPU Monitoring:**

```powershell
# One-off GPU status (nvidia-smi ships with the NVIDIA driver)
nvidia-smi

# Continuous monitoring, refreshed every second
nvidia-smi -l 1
```

**2. Memory Profiling:**

```python
import psutil
import GPUtil

def monitor_resources():
    # CPU and RAM
    cpu_percent = psutil.cpu_percent()
    memory = psutil.virtual_memory()
    print(f"CPU: {cpu_percent}%")
    print(f"RAM: {memory.percent}%")

    # GPU (skipped when no GPU is detected)
    gpus = GPUtil.getGPUs()
    if gpus:
        gpu = gpus[0]
        gpu_memory = f"{gpu.memoryUsed}/{gpu.memoryTotal} MB"
        gpu_util = f"{gpu.load * 100:.1f}%"
        print(f"GPU Memory: {gpu_memory}")
        print(f"GPU Utilization: {gpu_util}")
```

**3. Request Timing:**

```python
import time
from functools import wraps

def timing_decorator(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        end = time.time()
        print(f"{func.__name__} took {end - start:.2f} seconds")
        return result
    return wrapper

@timing_decorator
def summarize_with_timing(text):
    return summarize(text)
```

### Deployment Issues

#### Production Deployment Checklist

**1. Environment Setup:**

- [ ] Python version compatibility (3.8+)
- [ ] All dependencies installed
- [ ] Model files accessible
- [ ] Sufficient memory available
- [ ] GPU drivers updated (if using GPU)

**2. Security Configuration:**

- [ ] API authentication implemented
- [ ] Input validation enabled
- [ ] Rate limiting configured
- [ ] HTTPS enabled
- [ ] Firewall rules set

**3. Performance Optimization:**

- [ ] Model quantization applied
- [ ] Caching implemented
- [ ] Request queuing configured
- [ ] Load balancing set up
- [ ] Monitoring tools deployed

**4. Error Handling:**

- [ ] Comprehensive logging enabled
- [ ] Error tracking configured
- [ ] Graceful degradation implemented
- [ ] Health checks operational
- [ ] Backup systems ready

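Once the checklist passes, a short end-to-end smoke test against the documented endpoints can confirm the deployment. The script below is illustrative; adjust the base URL for your environment:

```python
# Minimal post-deployment smoke test against the documented endpoints.
import requests

BASE_URL = "http://localhost:5000"

def smoke_test():
    # Health endpoint should report a loaded model
    health = requests.get(f"{BASE_URL}/health", timeout=5).json()
    assert health["status"] == "healthy" and health["model_loaded"]

    # A small summarization request should succeed end to end
    result = requests.post(
        f"{BASE_URL}/summarize",
        json={"text": "Machine learning enables computers to learn from data. " * 10,
              "max_length": 60},
        timeout=60,
    ).json()
    assert result["success"] and result["summary"]
    print("Smoke test passed.")

if __name__ == "__main__":
    smoke_test()
```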
---

## Conclusion

This PEGASUS Fine-tuned Document Summarization System represents a significant advancement in domain-specific text summarization. Through careful fine-tuning on scientific papers, the model demonstrates substantial improvements in accuracy, coherence, and domain-appropriate language usage.
 