Commit 48bf205 by Enrique Cardoza · 1 Parent(s): 6c8c34d

docs: update README with comprehensive project documentation

- Add detailed project description and features overview
- Document available MCP tools with parameters and return values
- List supported audio/video file formats and size limits
- Provide MCP integration examples for different client types (SSE, stdio)
- Include YAML configuration for ContinueDev extension
- Add authentication methods and usage example
- Organize with clear sections and emoji icons for readability

Files changed (1)
  1. README.md +119 -3
README.md CHANGED
@@ -1,8 +1,7 @@
  ---
  title: Transcript Generator
+ author: Enrique Cardoza
  emoji: 💻
- colorFrom: green
- colorTo: indigo
  sdk: gradio
  sdk_version: 5.33.0
  app_file: app.py
@@ -15,4 +14,121 @@ tags:
  - api
  ---
  
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # Transcript Generator
+
+ An MCP (Model Context Protocol) server that transcribes audio and video files into text using the Whisper model hosted by Groq. It lets AI assistants process audio content, making multimedia data accessible for analysis.
+
+ ## 🔍 Project Description
+
+ Transcript Generator is an AI-powered transcription service built for the Gradio Agents & MCP Hackathon 2025. It uses Groq's implementation of the Whisper Large V3 Turbo model to convert spoken content from audio and video files into written text.
+
+ The service supports:
+ - File uploads (up to 25 MB)
+ - Direct URL transcription
+ - Various audio/video formats
+ - Integration with MCP clients
+
+ ## 🛠️ Available MCP Tools
+
+ ### 1. `transcript_generator_transcribe_audio`
+
+ Transcribes audio/video files uploaded directly to the service (runs locally).
+
+ **Parameters:**
+ - `audio_file` (string): Path to an audio or video file to transcribe
+ - `api_key` (string): Your Groq API key, required for authentication
+
+ **Returns:** A text transcript of the spoken content in the audio file
+
+ ### 2. `transcript_generator_transcribe_audio_from_url`
+
+ Transcribes audio/video files from a URL.
+
+ **Parameters:**
+ - `audio_url` (string): URL to an audio or video file to transcribe (http or https)
+ - `api_key` (string): Your Groq API key, required for authentication
+
+ **Returns:** A text transcript of the spoken content in the audio file
+
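To illustrate how a client might address the two tools above, here is a minimal sketch that builds the JSON-RPC 2.0 `tools/call` request an MCP client sends. The tool and parameter names come from this README; the helper name `build_tool_call`, the example URL, and the placeholder key are assumptions for illustration, and transport details (SSE session handling) are left out.

```python
import json


def build_tool_call(tool_name: str, arguments: dict, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Placeholder values: substitute a real media URL and your own Groq API key.
request = build_tool_call(
    "transcript_generator_transcribe_audio_from_url",
    {
        "audio_url": "https://example.com/sample.mp3",
        "api_key": "YOUR_GROQ_API_KEY",
    },
)
print(json.dumps(request, indent=2))
```

In practice an MCP client library would build and send this message for you; the sketch only shows what the payload looks like.
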
+ ## 📋 Supported File Formats
+
+ - **Audio formats:** MP3, MPGA, M4A, WAV, FLAC, OGG, AAC
+ - **Video formats:** MP4, MPEG, WebM
+ - **Maximum file size:** 25 MB
+
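A client could check a file against the limits listed above before uploading it. The sketch below is one way to do that, not the service's own validation: the helper name `check_media_file` is hypothetical, formats are judged by file extension only, and 25 MB is assumed to mean 25 × 1024 × 1024 bytes.

```python
from pathlib import Path

# Formats listed in this README, compared case-insensitively by extension.
AUDIO_EXTENSIONS = {"mp3", "mpga", "m4a", "wav", "flac", "ogg", "aac"}
VIDEO_EXTENSIONS = {"mp4", "mpeg", "webm"}
MAX_FILE_BYTES = 25 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes


def check_media_file(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = Path(filename).suffix.lower().lstrip(".")
    if extension not in AUDIO_EXTENSIONS | VIDEO_EXTENSIONS:
        problems.append(f"unsupported format: {extension or 'none'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"file too large: {size_bytes} bytes")
    return problems
```

For example, `check_media_file("talk.MP3", 1024)` returns an empty list, while a `.txt` file or a 26 MB video is reported as a problem.
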
+ ## 🔌 MCP Integration
+
+ ### SSE Configuration (Cursor, Windsurf, Cline)
+
+ To add this MCP server to clients that support SSE, add the following to your MCP config:
+
+ ```json
+ {
+   "mcpServers": {
+     "gradio": {
+       "url": "https://agents-mcp-hackathon-transcript-generator.hf.space/gradio_api/mcp/sse"
+     }
+   }
+ }
+ ```
+
+ ### Stdio Configuration (Node.js required)
+
+ For clients that only support stdio:
+
+ ```json
+ {
+   "mcpServers": {
+     "gradio": {
+       "command": "npx",
+       "args": [
+         "mcp-remote",
+         "https://agents-mcp-hackathon-transcript-generator.hf.space/gradio_api/mcp/sse",
+         "--transport",
+         "sse-only"
+       ]
+     }
+   }
+ }
+ ```
+
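The stdio configuration wraps the same SSE URL in an `npx mcp-remote` command. As a small sketch, the mapping can be generated from the URL; the helper name `stdio_config` is hypothetical, and the command and arguments are copied from the configuration above.

```python
import json


def stdio_config(server_name: str, sse_url: str) -> dict:
    """Build a stdio MCP config entry that proxies an SSE endpoint via mcp-remote."""
    return {
        "mcpServers": {
            server_name: {
                "command": "npx",
                "args": ["mcp-remote", sse_url, "--transport", "sse-only"],
            }
        }
    }


config = stdio_config(
    "gradio",
    "https://agents-mcp-hackathon-transcript-generator.hf.space/gradio_api/mcp/sse",
)
print(json.dumps(config, indent=2))
```
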
+ ### YAML Configuration (ContinueDev extension)
+
+ ```yaml
+ name: Transcript MCP Server
+ description: A new MCP server for handling transcripts.
+ version: 0.0.1
+ schema: v1
+ mcpServers:
+   - name: Transcript MCP server
+     command: npx
+     args:
+       - mcp-remote
+       - https://agents-mcp-hackathon-transcript-generator.hf.space/gradio_api/mcp/sse
+       - --transport
+       - sse-only
+ ```
+
+ ## 🔑 Authentication
+
+ You'll need a Groq API key to use this service. You can obtain one from the [Groq Console](https://console.groq.com/).
+
+ The API key can be provided in several ways:
+ 1. As a parameter in the tool call
+ 2. As an environment variable (`GROQ_API_KEY`)
+ 3. In the request headers (for certain clients)
+
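The three sources listed above can be combined into a single lookup. The sketch below is only an illustration: the helper name `resolve_api_key` is hypothetical, and the precedence (tool parameter, then header value, then `GROQ_API_KEY`) is an assumption, since this README does not state which source wins or which header carries the key.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    param_key: Optional[str] = None,
    header_key: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> str:
    """Return the first non-empty key, checking parameter, header value, then environment."""
    # Assumed precedence: an explicit tool parameter overrides everything else.
    for candidate in (param_key, header_key, env.get("GROQ_API_KEY")):
        if candidate and candidate.strip():
            return candidate.strip()
    raise ValueError("No Groq API key found: pass api_key or set GROQ_API_KEY")
```

Passing `env` explicitly keeps the function easy to test without touching the real process environment.
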
+ ## 💡 Usage Example
+
+ When using this server with an AI assistant that supports MCP, you can request transcriptions with prompts like:
+
+ > "Please generate the transcript for this audio file: https://huggingface.co/spaces/anewryzm/transcript-generator-client/resolve/main/test_files/this%20people%203.m4a"
+
+ The assistant will use the appropriate MCP tool to fetch and return the transcript.
+
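How an assistant chooses the "appropriate" tool is up to the client. A plausible rule, sketched below, is to send http/https sources to the URL tool and everything else to the file tool. The helper name `pick_tool` is hypothetical; the tool and parameter names are the ones documented in this README.

```python
from urllib.parse import urlparse

FILE_TOOL = "transcript_generator_transcribe_audio"
URL_TOOL = "transcript_generator_transcribe_audio_from_url"


def pick_tool(source: str) -> tuple[str, dict]:
    """Choose a tool name and its source argument for a file path or URL."""
    scheme = urlparse(source).scheme.lower()
    if scheme in ("http", "https"):
        return URL_TOOL, {"audio_url": source}
    # Anything else is treated as a local path (an assumption of this sketch).
    return FILE_TOOL, {"audio_file": source}


# The caller still needs to add `api_key` to the arguments before sending.
tool, arguments = pick_tool("https://example.com/interview.m4a")
print(tool, arguments)
```
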
+ ## 🔗 Useful Links
+
+ - [Get your Groq API key](https://console.groq.com/)
+ - [Groq Documentation](https://console.groq.com/docs)
+ - [Supported audio formats](https://platform.openai.com/docs/guides/speech-to-text)
+ - [Hugging Face Spaces Configuration](https://huggingface.co/docs/hub/spaces-config-reference)