Pujan-Dev committed
Commit 87a735b · 1 Parent(s): 1140ed3

feat: added the proper readme.md
features/nepali_text_classifier/controller.py CHANGED
@@ -3,7 +3,6 @@ from io import BytesIO
 from fastapi import HTTPException, UploadFile, status, Depends
 from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
 import os
-
 from features.nepali_text_classifier.inferencer import classify_text
 from features.nepali_text_classifier.preprocess import *
 import re
features/nepali_text_classifier/preprocess.py CHANGED
@@ -31,8 +31,9 @@ def parse_txt(file: BytesIO):
     return file.read().decode("utf-8")


-def end_symbol_for_NP_text(text):
-    if not text.endswith("।"):
-        text += "।"
+def end_symbol_for_NP_text(text: str) -> str:
+    if not text.endswith("।"):
+        text += "।"
+    return text

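Beyond the type annotations, this hunk fixes a latent bug: the old helper mutated a local variable and implicitly returned `None`. A minimal standalone sketch of the fixed behavior:

```python
def end_symbol_for_NP_text(text: str) -> str:
    # Append the Devanagari danda ("।") only if the text doesn't already end with it.
    if not text.endswith("।"):
        text += "।"
    return text

# The helper is idempotent: applying it twice adds at most one danda.
print(end_symbol_for_NP_text("यो वाक्य हो"))   # → यो वाक्य हो।
print(end_symbol_for_NP_text("यो वाक्य हो।"))  # → यो वाक्य हो।
```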
readme.md CHANGED
@@ -1,339 +1,241 @@
- ### **FastAPI AI**

- This FastAPI app loads a GPT-2 model, tokenizes input text, classifies it, and returns whether the text is AI-generated or human-written.
-
- ### **install Dependencies**
-
- ```bash
- pip install -r requirements.txt
-
- ```
-
- This command installs all the dependencies listed in the `requirements.txt` file. It ensures that your environment has the required packages to run the project smoothly.
-
- **NOTE: IF YOU HAVE DONE ANY CHANGES DON'NT FORGOT TO PUT IT IN THE REQUIREMENTS.TXT USING `bash pip freeze > requirements.txt `**

  ---
- ### Files STructure

  ```
- ├── app.py
- ├── features
- │   └── text_classifier
-        ├── controller.py
-        ├── inferencer.py
-        ├── __init__.py
-        ├── model_loader.py
-        ├── preprocess.py
-        └── routes.py
- ├── __init__.py
- ├── Procfile
- ├── readme.md
- └── requirements.txt
  ```
- **`app.py`**: Entry point initializing FastAPI app and routes
- **`Procfile`**: Tells Railway how to run the program
- **`requirements.txt`**:Have all the packages that we use in our project
- **`__init__.py`** : Package initializer for the root module
- **FOLDER :features/text_classifier**
- **`controller.py`** :Handles logic between routes and model
- **`inferencer.py`** : Runs inference and returns predictions as well as files system
- **`__init__.py`** :Initializes the module as a package
- **`model_loader.py`** : Loads the ML model and tokenizer
- **`preprocess.py`** :Prepares input text for the model
- **`routes.py`** :Defines API routes for text classification
-
- ### **Functions**
-
- 1. **`load_model()`**
- Loads the GPT-2 model and tokenizer from the specified directory paths.
-
- 2. **`lifespan()`**
- Manages the application lifecycle. It initializes the model at startup and performs cleanup during shutdown.

- 3. **`classify_text_sync()`**
- Synchronously tokenizes the input text and performs classification using the GPT-2 model. Returns both the classification result and perplexity score.

- 4. **`classify_text()`**
- Asynchronously runs `classify_text_sync()` in a thread pool for non-blocking text classification.

- 5. **`analyze_text()`**
- **POST** endpoint: Accepts text input, classifies it using `classify_text()`, and returns the result along with perplexity.

- 6. **`health()`**
- **GET** endpoint: Performs a simple health check to confirm the API is operational.

- 7. **`parse_docx()`, `parse_pdf()`, `parse_txt()`**
- Utility functions to extract and convert the contents of `.docx`, `.pdf`, and `.txt` files into plain text for classification.

- 8. **`warmup()`**
- Downloads the model repository and initializes the model and tokenizer using the `load_model()` function.

- 9. **`download_model_repo()`**
- Handles downloading the model files from the designated `MODEL` folder.

- 10. **`get_model_tokenizer()`**
- Similar to `warmup()`, but includes a check to see if the model already exists. If not, it downloads the model; otherwise, it uses the previously downloaded one.

- 11. **`handle_file_upload()`**
- Manages file uploads from the `/upload` route. Extracts text from the uploaded file, classifies it, and returns the results.

- 12. **`extract_file_contents()`**
- Extracts and returns plain text content from uploaded files (e.g., PDF, DOCX, TXT).

- 13. **`handle_file_sentence()`**
- Processes uploaded files by analyzing each sentence. Ensures the total file text is under 10,000 characters before classification.

- 14. **`handle_sentence_level_analysis()`**
- Strips and checks each sentence’s length, then evaluates the likelihood of AI vs. human generation for each sentence.

- 15. **`analyze_sentences()`**
- Divides long paragraphs into individual sentences, classifies each one, and returns a list of their classification results.

- 16. **`analyze_sentence_file()`**
- A route function that analyzes sentences in uploaded files, similar to `handle_file_sentence()`.

  ---

- ### **Code Overview**

- ### **Running and Load Balancing:**

- To run the app in production with load balancing:

- ```bash
- uvicorn app:app --host 0.0.0.0 --port 8000
- ```
-
- This command launches the FastAPI app.

- ### **Endpoints**

- #### 1. **`/text/analyze`**

- - **Method:** `POST`
- - **Description:** Classifies whether the text is AI-generated or human-written.
- - **Request:**
- ```json
- { "text": "sample text" }
- ```
- - **Response:**
- ```json
- { "result": "AI-generated", "perplexity": 55.67,"ai_likelihood":66.6%}
- ```

- #### 2. **`/health`**

- - **Method:** `GET`
- - **Description:** Returns the status of the API.
- - **Response:**
- ```json
- { "status": "ok" }
- ```
- #### 3. **`/text/upload`**
- - **Method:** `POST`
- - **Description:** Takes the files and check the contains inside and returns the results
- - **Request:** Files
-
- - **Response:**
- ```json
- { "result": "AI-generated", "perplexity": 55.67,"ai_likelihood":66.6%}
  ```
- #### 4. **`/text/analyze_sentence_file`**
- - **Method:** `POST`
- - **Description:** Takes the files and check the contains inside and returns the results
- - **Request:** Files

- - **Response:**
- ```json
- {
- "content": "Artificial Intelligence (AI) and Machine Learning (ML) are rapidly transforming the way we \ninteract with technology. AI refers to the broader concept of machines being able to carry out \ntasks in a way that we would consider \"smart,\" while ML is a subset of AI that focuses on the \ndevelopment of algorithms that allow computers to learn from and make decisions based on \ndata. These technologies are behind innovations such as voice assistants, recommendation \nsystems, self-driving cars, and medical diagnosis tools. By analyzing large amounts of data, \nAI and ML can identify patterns, make predictions, and continuously improve their \nperformance over time, making them essential tools in modern industries ranging from \nhealthcare and finance to education and entertainment. \n \n",
- "analysis": [
- {
- "sentence": "Artificial Intelligence (AI) and Machine Learning (ML) are rapidly transforming the way we interact with technology.",
- "label": "AI-generated",
- "perplexity": 8.17,
- "ai_likelihood": 100
- },
- {
- "sentence": "AI refers to the broader concept of machines being able to carry out tasks in a way that we would consider \"smart,\" while ML is a subset of AI that focuses on the development of algorithms that allow computers to learn from and make decisions based on data.",
- "label": "AI-generated",
- "perplexity": 19.34,
- "ai_likelihood": 89.62
- },
- {
- "sentence": "These technologies are behind innovations such as voice assistants, recommendation systems, self-driving cars, and medical diagnosis tools.",
- "label": "AI-generated",
- "perplexity": 40.31,
- "ai_likelihood": 66.32
- },
- {
- "sentence": "By analyzing large amounts of data, AI and ML can identify patterns, make predictions, and continuously improve their performance over time, making them essential tools in modern industries ranging from healthcare and finance to education and entertainment.",
- "label": "AI-generated",
- "perplexity": 26.15,
- "ai_likelihood": 82.05
- }
- ]
- }```
-
- #### 5. **`/text/analyze_sentences`**
- - **Method:** `POST`
- - **Description:** Takes the text and check the contains inside and returns the results
- - **Request:**
  ```json
  {
- "text": "This is an test text. This is an another Text "
  }
  ```

- - **Response:**
- ```json
- {
- "analysis": [
- {
- "sentence": "This is an test text.",
- "label": "Human-written",
- "perplexity": 510.28,
- "ai_likelihood": 0
- },
- {
- "sentence": "This is an another Text",
- "label": "Human-written",
- "perplexity": 3926.05,
- "ai_likelihood": 0
- }
- ]
- }```
-
- ---
-
- ### **Running the API**
-
- Start the server with:

  ```bash
- uvicorn app:app --host 0.0.0.0 --port 8000
  ```

  ---

- ### **🧪 Testing the API**

- You can test the FastAPI endpoint using `curl` like this:

  ```bash
- curl -X POST https://can-org-canspace.hf.space/analyze \
- -H "Authorization: Bearer SECRET_CODE" \
  -H "Content-Type: application/json" \
- -d '{"text": "This is a sample sentence for analysis."}'
  ```

- - The `-H "Authorization: Bearer SECRET_CODE"` part is used to simulate the **handshake**.
- - FastAPI checks this token against the one loaded from the `.env` file.
- - If the token matches, the request is accepted and processed.
- - Otherwise, it responds with a `403 Unauthorized` error.
-
- ---
-
- ### **API Documentation**
-
- - **Swagger UI:** `https://can-org-canspace.hf.space/docs` -> `/docs`
- - **ReDoc:** `https://can-org-canspace.hf.space/redoc` -> `/redoc`
-
- ### **🔐 Handshake Mechanism**
-
- In this part, we're implementing a simple handshake to verify that the request is coming from a trusted source (e.g., our NestJS server). Here's how it works:
-
- - We load a secret token from the `.env` file.
- - When a request is made to the FastAPI server, we extract the `Authorization` header and compare it with our expected secret token.
- - If the token does **not** match, we immediately return a **403 Forbidden** response with the message `"Unauthorized"`.
- - If the token **does** match, we allow the request to proceed to the next step.
-
- The verification function looks like this:
-
- ```python
- def verify_token(auth: str):
-     if auth != f"Bearer {EXPECTED_TOKEN}":
-         raise HTTPException(status_code=403, detail="Unauthorized")
  ```

- This provides a basic but effective layer of security to prevent unauthorized access to the API.

- ### **Implement it with NEST.js**
-
- NOTE: Make an micro service in NEST.JS and implement it there and call it from app.controller.ts

- in fastapi.service.ts file what we have done is

- ### Project Structure

- ```files
- nestjs-fastapi-bridge/
- ├── src/
- │   ├── app.controller.ts
- │   ├── app.module.ts
- │   └── fastapi.service.ts
- ├── .env
-
- ```

  ---

- ### Step-by-Step Setup
-
- #### 1. `.env`

- Create a `.env` file at the root with the following:

- ```environment
- FASTAPI_BASE_URL=https://can-org-canspace.hf.space/
- SECRET_TOKEN="SECRET_CODE_TOKEN"
  ```

- #### 2. `fastapi.service.ts`
-
- ```javascript
- // src/fastapi.service.ts
- import { Injectable } from "@nestjs/common";
- import { HttpService } from "@nestjs/axios";
- import { ConfigService } from "@nestjs/config";
- import { firstValueFrom } from "rxjs";
-
- @Injectable()
- export class FastAPIService {
-   constructor(
-     private http: HttpService,
-     private config: ConfigService,
-   ) {}
-
-   async analyzeText(text: string) {
-     const url = `${this.config.get("FASTAPI_BASE_URL")}/analyze`;
-     const token = this.config.get("SECRET_TOKEN");
-
-     const response = await firstValueFrom(
-       this.http.post(
-         url,
-         { text },
-         {
-           headers: {
-             Authorization: `Bearer ${token}`,
-           },
          },
-       ),
-     );

-     return response.data;
-   }
  }
  ```

- #### 3. `app.module.ts`
-
- ```javascript
- // src/app.module.ts
  import { Module } from "@nestjs/common";
  import { ConfigModule } from "@nestjs/config";
  import { HttpModule } from "@nestjs/axios";
@@ -348,54 +250,95 @@ import { FastAPIService } from "./fastapi.service";
  export class AppModule {}
  ```

  ---

- #### 4. `app.controller.ts`

- ```javascript
- // src/app.controller.ts
- import { Body, Controller, Post, Get, Query } from '@nestjs/common';
- import { FastAPIService } from './fastapi.service';

- @Controller()
- export class AppController {
-   constructor(private readonly fastapiService: FastAPIService) {}

-   @Post('analyze-text')
-   async callFastAPI(@Body('text') text: string) {
-     return this.fastapiService.analyzeText(text);
-   }

-   @Get()
-   getHello(): string {
-     return 'NestJS is connected to FastAPI ';
-   }
- }
- ```

- ### 🚀 How to Run

- Run the server of flask and nest.js:

- - for nest.js
- ```bash
- npm run start
- ```
- - for Fastapi

- ```bash
- uvicorn app:app --reload
- ```

- Make sure your FastAPI service is running at `http://localhost:8000`.

- ### Test with CURL
- http://localhost:3000/-> Server of nest.js
- ```bash
- curl -X POST http://localhost:3000/analyze-text \
- -H 'Content-Type: application/json' \
- -d '{"text": "This is a test input"}'
- ```
 
+ # 🚀 FastAPI AI Text Detector

+ A production-ready FastAPI application for **AI-generated vs. human-written text detection** in both **English** and **Nepali**. Models are auto-managed and endpoints are secured via Bearer token authentication.

  ---
+
+ ## 🏗️ Project Structure

  ```
+ ├── app.py                      # Main FastAPI app entrypoint
+ ├── config.py                   # Configuration loader (.env, settings)
+ ├── features/
+ │   ├── text_classifier/        # English (GPT-2) classifier
+ │   │   ├── controller.py
+ │   │   ├── inferencer.py
+ │   │   ├── model_loader.py
+ │   │   ├── preprocess.py
+ │   │   └── routes.py
+ │   └── nepali_text_classifier/ # Nepali (sentencepiece) classifier
+ │       ├── controller.py
+ │       ├── inferencer.py
+ │       ├── model_loader.py
+ │       ├── preprocess.py
+ │       └── routes.py
+ ├── np_text_model/              # Nepali model artifacts (auto-downloaded)
+ │   ├── classifier/
+ │   │   └── sentencepiece.bpe.model
+ │   └── model_95_acc.pth
+ ├── models/                     # English GPT-2 model/tokenizer (auto-downloaded)
+ │   ├── merges.txt
+ │   ├── tokenizer.json
+ │   └── model_weights.pth
+ ├── Dockerfile                  # Container build config
+ ├── Procfile                    # Deployment entrypoint (for PaaS)
+ ├── requirements.txt            # Python dependencies
+ ├── README.md                   # This file
+ └── .env                        # Secret token(s), environment config
  ```

+ ---

+ ### 🌟 Key Files and Their Roles
+
+ - **`app.py`**: Entry point initializing FastAPI app and routes.
+ - **`Procfile`**: Tells Railway (or similar platforms) how to run the program.
+ - **`requirements.txt`**: Tracks all Python dependencies for the project.
+ - **`__init__.py`**: Package initializer for the root module and submodules.
+ - **`features/text_classifier/`**
+   - **`controller.py`**: Handles logic between routes and the model.
+   - **`inferencer.py`**: Runs inference and returns predictions as well as file system utilities.
+ - **`features/nepali_text_classifier/`**
+   - **`controller.py`**: Handles logic between routes and the model.
+   - **`inferencer.py`**: Runs inference and returns predictions as well as file system utilities.
+   - **`model_loader.py`**: Loads the ML model and tokenizer.
+   - **`preprocess.py`**: Prepares input text for the model.
+   - **`routes.py`**: Defines API routes for text classification.

+ ---

+ ## ⚙️ Setup & Installation

+ 1. **Clone the repository**

+ ```bash
+ git clone https://github.com/cyberalertnepal/aiapi
+ cd aiapi
+ ```

+ 2. **Install dependencies**

+ ```bash
+ pip install -r requirements.txt
+ ```

+ 3. **Configure secrets**

+ - Create a `.env` file at the project root:

+ ```env
+ SECRET_TOKEN=your_secret_token_here
+ ```

+ - **All endpoints require `Authorization: Bearer <SECRET_TOKEN>`**

+ ---

+ ## 🚦 Running the API Server

+ ```bash
+ uvicorn app:app --host 0.0.0.0 --port 8000
+ ```

  ---

+ ## 🔒 Security: Bearer Token Auth

+ All endpoints require authentication via Bearer token:

+ - Set `SECRET_TOKEN` in `.env`
+ - Add header: `Authorization: Bearer <SECRET_TOKEN>`

+ Unauthorized requests receive `403 Forbidden`.
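The bearer check the new README describes can be sketched in plain Python. The constant-time `hmac.compare_digest` comparison and the `EXPECTED_TOKEN` constant are our assumptions for illustration; the app's actual `verify_token` may compare strings directly:

```python
import hmac

# Hypothetical stand-in for the token loaded from .env via SECRET_TOKEN.
EXPECTED_TOKEN = "your_secret_token_here"

def is_authorized(auth_header: str) -> bool:
    # Compare the full "Bearer <token>" header value in constant time
    # to avoid leaking the token length/prefix via timing.
    return hmac.compare_digest(auth_header, f"Bearer {EXPECTED_TOKEN}")

print(is_authorized("Bearer your_secret_token_here"))  # True
print(is_authorized("Bearer wrong-token"))             # False
```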

+ ---

+ ## 🧩 API Endpoints

+ ### English (GPT-2) - `/text/`

+ | Endpoint                      | Method | Description                            |
+ | ----------------------------- | ------ | -------------------------------------- |
+ | `/text/analyse`               | POST   | Classify raw English text              |
+ | `/text/analyse-sentences`     | POST   | Sentence-by-sentence breakdown         |
+ | `/text/analyse-sentance-file` | POST   | Upload file, per-sentence breakdown    |
+ | `/text/upload`                | POST   | Upload file for overall classification |
+ | `/text/health`                | GET    | Health check                           |

+ #### Example: Classify English text

+ ```bash
+ curl -X POST http://localhost:8000/text/analyse \
+   -H "Authorization: Bearer <SECRET_TOKEN>" \
+   -H "Content-Type: application/json" \
+   -d '{"text": "This is a sample text for analysis."}'
  ```

+ **Response:**

  ```json
  {
+   "result": "AI-generated",
+   "perplexity": 55.67,
+   "ai_likelihood": 66.6
  }
  ```
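The same call can be made from Python with only the standard library; the path and payload mirror the curl example above, while the helper name is ours:

```python
import json
import urllib.request

def build_analyse_request(base_url: str, token: str, text: str) -> urllib.request.Request:
    # Prepare the authenticated POST for /text/analyse;
    # send it with urllib.request.urlopen(req) against a running server.
    return urllib.request.Request(
        url=f"{base_url}/text/analyse",
        data=json.dumps({"text": text}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_analyse_request("http://localhost:8000", "<SECRET_TOKEN>",
                            "This is a sample text for analysis.")
# urllib.request.urlopen(req) returns the JSON body shown above.
```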

+ #### Example: File upload

+ ```bash
+ curl -X POST http://localhost:8000/text/upload \
+   -H "Authorization: Bearer <SECRET_TOKEN>" \
+   -F 'file=@yourfile.txt;type=text/plain'
  ```

  ---

+ ### Nepali (SentencePiece) - `/NP/`

+ | Endpoint                     | Method | Description                          |
+ | ---------------------------- | ------ | ------------------------------------ |
+ | `/NP/analyse`                | POST   | Classify Nepali text                 |
+ | `/NP/analyse-sentences`      | POST   | Sentence-by-sentence breakdown       |
+ | `/NP/upload`                 | POST   | Upload Nepali PDF for classification |
+ | `/NP/file-sentences-analyse` | POST   | PDF upload, per-sentence breakdown   |
+ | `/NP/health`                 | GET    | Health check                         |
+
+ #### Example: Nepali text classification

  ```bash
+ curl -X POST http://localhost:8000/NP/analyse \
+   -H "Authorization: Bearer <SECRET_TOKEN>" \
    -H "Content-Type: application/json" \
+   -d '{"text": "यो उदाहरण वाक्य हो।"}'
  ```

+ **Response:**
+ ```json
+ {
+   "label": "Human",
+   "confidence": 98.6
+ }
  ```

+ #### Example: Nepali PDF upload

+ ```bash
+ curl -X POST http://localhost:8000/NP/upload \
+   -H "Authorization: Bearer <SECRET_TOKEN>" \
+   -F 'file=@NepaliText.pdf;type=application/pdf'
+ ```

+ ---

+ ## 📝 API Docs

+ - **Swagger UI:** [http://localhost:8000/docs](http://localhost:8000/docs)
+ - **ReDoc:** [http://localhost:8000/redoc](http://localhost:8000/redoc)

  ---

+ ## 🧪 Example: Integration with NestJS

+ You can easily call this API from a NestJS microservice.

+ **.env**
+ ```env
+ FASTAPI_BASE_URL=http://localhost:8000
+ SECRET_TOKEN=your_secret_token_here
  ```

+ **fastapi.service.ts**
+ ```typescript
+ import { Injectable } from "@nestjs/common";
+ import { HttpService } from "@nestjs/axios";
+ import { ConfigService } from "@nestjs/config";
+ import { firstValueFrom } from "rxjs";
+
+ @Injectable()
+ export class FastAPIService {
+   constructor(
+     private http: HttpService,
+     private config: ConfigService,
+   ) {}
+
+   async analyzeText(text: string) {
+     const url = `${this.config.get("FASTAPI_BASE_URL")}/text/analyse`;
+     const token = this.config.get("SECRET_TOKEN");
+
+     const response = await firstValueFrom(
+       this.http.post(
+         url,
+         { text },
+         {
+           headers: {
+             Authorization: `Bearer ${token}`,
+           },
          },
+       ),
+     );

+     return response.data;
  }
+ }
  ```

+ **app.module.ts**
+ ```typescript
  import { Module } from "@nestjs/common";
  import { ConfigModule } from "@nestjs/config";
  import { HttpModule } from "@nestjs/axios";
  export class AppModule {}
  ```

+ **app.controller.ts**
+ ```typescript
+ import { Body, Controller, Post, Get } from '@nestjs/common';
+ import { FastAPIService } from './fastapi.service';
+
+ @Controller()
+ export class AppController {
+   constructor(private readonly fastapiService: FastAPIService) {}
+
+   @Post('analyze-text')
+   async callFastAPI(@Body('text') text: string) {
+     return this.fastapiService.analyzeText(text);
+   }
+
+   @Get()
+   getHello(): string {
+     return 'NestJS is connected to FastAPI';
+   }
+ }
+ ```
+
  ---

+ ## 🧠 Main Functions in the Classifiers (`features/text_classifier/` and `features/nepali_text_classifier/`)

+ - **`load_model()`**
+   Loads the GPT-2 model and tokenizer from the specified directory paths.
+ - **`lifespan()`**
+   Manages the application lifecycle. Initializes the model at startup and handles cleanup on shutdown.
+ - **`classify_text_sync()`**
+   Synchronously tokenizes input text and predicts using the GPT-2 model. Returns classification and perplexity.
+ - **`classify_text()`**
+   Asynchronously runs `classify_text_sync()` in a thread pool for non-blocking text classification.
+ - **`analyze_text()`**
+   **POST** endpoint: Accepts text input, classifies it using `classify_text()`, and returns the result with perplexity.
+ - **`health()`**
+   **GET** endpoint: Simple health check for API liveness.
+ - **`parse_docx()`, `parse_pdf()`, `parse_txt()`**
+   Utilities to extract and convert `.docx`, `.pdf`, and `.txt` file contents to plain text.
+ - **`warmup()`**
+   Downloads the model repository and initializes the model/tokenizer using `load_model()`.
+ - **`download_model_repo()`**
+   Downloads the model files from the designated `MODEL` folder.
+ - **`get_model_tokenizer()`**
+   Checks if the model already exists; if not, downloads it; otherwise, loads the cached model.
+ - **`handle_file_upload()`**
+   Handles file uploads from the `/upload` route. Extracts text, classifies, and returns results.
+ - **`extract_file_contents()`**
+   Extracts and returns plain text from uploaded files (PDF, DOCX, TXT).
+ - **`handle_file_sentence()`**
+   Processes file uploads by analyzing each sentence (total text kept under 10,000 characters) before classification.
+ - **`handle_sentence_level_analysis()`**
+   Strips and checks each sentence, then computes the AI/human likelihood for each.
+ - **`analyze_sentences()`**
+   Splits paragraphs into sentences, classifies each, and returns all results.
+ - **`analyze_sentence_file()`**
+   Like `handle_file_sentence()`: analyzes sentences in uploaded files.
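The sentence-level flow described above (`analyze_sentences()` feeding each sentence to the classifier) can be sketched as follows; the regex split and the stub classifier are illustrative assumptions, not the app's actual implementation:

```python
import re

def analyze_sentences(text: str, classify) -> list[dict]:
    # Split on sentence-ending punctuation (including the Devanagari danda),
    # then run the supplied classifier on each non-empty sentence.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?।])\s+", text) if s.strip()]
    return [{"sentence": s, **classify(s)} for s in sentences]

# Stub standing in for the GPT-2 perplexity-based classifier.
def fake_classifier(sentence: str) -> dict:
    return {"label": "Human-written", "ai_likelihood": 0}

results = analyze_sentences("This is a test text. This is another text.", fake_classifier)
print(len(results))  # 2
```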
+
+ ---
+
+ ## 🚀 Deployment
+
+ - **Local**: Use `uvicorn` as above.
+ - **Railway/Heroku**: Use the provided `Procfile`.
+ - **Hugging Face Spaces**: Use the `Dockerfile` for container deployment.
+
+ ---
+
+ ## 💡 Tips
+
+ - **Model files auto-download at first start** if not found.
+ - **Keep `requirements.txt` up-to-date** after adding dependencies.
+ - **All endpoints require the correct `Authorization` header**.
+ - **For security**: Avoid committing `.env` to public repos.
+
+ ---