luojunyu committed
Commit f8d35c2 · 1 Parent(s): 94852fa
Files changed (6)
  1. README.md +71 -0
  2. all_papers_0328.csv +0 -0
  3. app.py +363 -199
  4. requirements.txt +4 -16
  5. src/about.py +47 -29
  6. style.css +187 -0
README.md CHANGED
@@ -11,6 +11,77 @@ short_description: LLM Agent Research Collection
11
  sdk_version: 5.19.0
12
  ---
13
 
14
+ # Large Language Model Agent Papers Explorer
15
+
16
+ This is a companion application for the paper "Large Language Model Agent: A Survey on Methodology, Applications and Challenges" ([arXiv:2503.21460](https://arxiv.org/abs/2503.21460)).
17
+
18
+ ## About
19
+
20
+ The application provides an interactive interface to explore papers from our comprehensive survey on Large Language Model (LLM) agents. It allows you to search and filter papers across key categories including agent construction, collaboration mechanisms, evolution, tools, security, benchmarks, and applications.
21
+
22
+ ![Screenshot of the application](/screenshots/app-screenshot.png)
23
+
24
+ ## Key Features
25
+
26
+ - **Paper Search**: Find papers by keywords, titles, summaries, or publication venues
27
+ - **Category Filtering**: Browse papers by sections/categories
28
+ - **Year Filtering**: Filter papers by publication year
29
+ - **Sorting Options**: Sort papers by year, title, or section
30
+ - **Paper Statistics**: View distributions of papers across categories and years
31
+ - **Direct Links**: Access original papers through direct links to their sources
32
+
33
+ ## Collection Overview
34
+
35
+ Our paper collection spans multiple categories:
36
+
37
+ - **Introduction**: Survey papers and foundational works introducing LLM agents
38
+ - **Construction**: Papers on building and designing agents
39
+ - **Collaboration**: Multi-agent systems and communication methods
40
+ - **Evolution**: Learning and improvement of agents over time
41
+ - **Tools**: Integration of external tools with LLM agents
42
+ - **Security**: Safety, alignment, and ethical considerations
43
+ - **Datasets & Benchmarks**: Evaluation frameworks and resources
44
+ - **Applications**: Domain-specific uses in science, medicine, etc.
45
+
46
+ ## Related Resources
47
+
48
+ - [Full Survey Paper on arXiv](https://arxiv.org/abs/2503.21460)
49
+ - [Awesome-Agent-Papers GitHub Repository](https://github.com/luo-junyu/Awesome-Agent-Papers)
50
+
51
+ ## How to Contribute
52
+
53
+ If you have a paper that you believe should be included in our collection:
54
+
55
+ 1. Check if the paper is already in our database
56
+ 2. Submit your paper at [this form](https://forms.office.com/r/sW0Zzymi5b) or email us at luo.junyu@outlook.com
57
+ 3. Include the paper's title, authors, abstract, URL, publication venue, and year
58
+ 4. Suggest a section/category for the paper
59
+
60
+ ## Citation
61
+
62
+ If you find our survey helpful, please consider citing our work:
63
+
64
+ ```
65
+ @article{agentsurvey2025,
66
+ title={Large Language Model Agent: A Survey on Methodology, Applications and Challenges},
67
+ author={Junyu Luo and Weizhi Zhang and Ye Yuan and Yusheng Zhao and Junwei Yang and Yiyang Gu and Bohan Wu and Binqi Chen and Ziyue Qiao and Qingqing Long and Rongcheng Tu and Xiao Luo and Wei Ju and Zhiping Xiao and Yifan Wang and Meng Xiao and Chenwu Liu and Jingyang Yuan and Shichang Zhang and Yiqiao Jin and Fan Zhang and Xian Wu and Hanqing Zhao and Dacheng Tao and Philip S. Yu and Ming Zhang},
68
+ journal={arXiv preprint arXiv:2503.21460},
69
+ year={2025}
70
+ }
71
+ ```
72
+
73
+ ## Local Development
74
+
75
+ To run this application locally:
76
+
77
+ 1. Clone this repository
78
+ 2. Install the required dependencies with `pip install -r requirements.txt`
79
+ 3. Run the application with `python app.py`
80
+
81
+ ## License
82
+
83
+ This project is licensed under the MIT License - see the LICENSE file for details.
84
+
85
  # Start the configuration
86
 
87
  Most of the variables to change for a default leaderboard are in `src/env.py` (replace the path for your leaderboard) and `src/about.py` (for tasks).
all_papers_0328.csv ADDED
The diff for this file is too large to render. See raw diff
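Since the CSV diff is not rendered, a quick local inspection with pandas is the easiest way to review the new data file. A minimal sketch, assuming the column names that the updated `app.py` expects (`Title`, `TLDR-EN`, `Section`, `url`, `Year`, `Publish Venue`); verify them against the raw file.

```python
# Minimal sketch for inspecting all_papers_0328.csv locally.
# Column names are inferred from load_papers() in app.py, not from the raw CSV itself.
import pandas as pd

df = pd.read_csv("all_papers_0328.csv")
print(df.shape)
print(df.columns.tolist())  # expected: Title, TLDR-EN, Section, url, Year, Publish Venue
print(df["Section"].value_counts())
print(df["Year"].value_counts().sort_index())
```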
 
app.py CHANGED
@@ -1,204 +1,368 @@
1
  import gradio as gr
2
- from gradio_leaderboard import Leaderboard, ColumnFilter, SelectColumns
3
  import pandas as pd
4
- from apscheduler.schedulers.background import BackgroundScheduler
5
- from huggingface_hub import snapshot_download
6
-
7
- from src.about import (
8
- CITATION_BUTTON_LABEL,
9
- CITATION_BUTTON_TEXT,
10
- EVALUATION_QUEUE_TEXT,
11
- INTRODUCTION_TEXT,
12
- LLM_BENCHMARKS_TEXT,
13
- TITLE,
14
- )
15
- from src.display.css_html_js import custom_css
16
- from src.display.utils import (
17
- BENCHMARK_COLS,
18
- COLS,
19
- EVAL_COLS,
20
- EVAL_TYPES,
21
- AutoEvalColumn,
22
- ModelType,
23
- fields,
24
- WeightType,
25
- Precision
26
- )
27
- from src.envs import API, EVAL_REQUESTS_PATH, EVAL_RESULTS_PATH, QUEUE_REPO, REPO_ID, RESULTS_REPO, TOKEN
28
- from src.populate import get_evaluation_queue_df, get_leaderboard_df
29
- from src.submission.submit import add_new_eval
30
-
31
-
32
- def restart_space():
33
- API.restart_space(repo_id=REPO_ID)
34
-
35
- ### Space initialisation
36
- try:
37
- print(EVAL_REQUESTS_PATH)
38
- snapshot_download(
39
- repo_id=QUEUE_REPO, local_dir=EVAL_REQUESTS_PATH, repo_type="dataset", tqdm_class=None, etag_timeout=30, token=TOKEN
40
- )
41
- except Exception:
42
- restart_space()
43
- try:
44
- print(EVAL_RESULTS_PATH)
45
- snapshot_download(
46
- repo_id=RESULTS_REPO, local_dir=EVAL_RESULTS_PATH, repo_type="dataset", tqdm_class=None, etag_timeout=30, token=TOKEN
47
- )
48
- except Exception:
49
- restart_space()
50
-
51
-
52
- LEADERBOARD_DF = get_leaderboard_df(EVAL_RESULTS_PATH, EVAL_REQUESTS_PATH, COLS, BENCHMARK_COLS)
53
-
54
- (
55
- finished_eval_queue_df,
56
- running_eval_queue_df,
57
- pending_eval_queue_df,
58
- ) = get_evaluation_queue_df(EVAL_REQUESTS_PATH, EVAL_COLS)
59
-
60
- def init_leaderboard(dataframe):
61
- if dataframe is None or dataframe.empty:
62
- raise ValueError("Leaderboard DataFrame is empty or None.")
63
- return Leaderboard(
64
- value=dataframe,
65
- datatype=[c.type for c in fields(AutoEvalColumn)],
66
- select_columns=SelectColumns(
67
- default_selection=[c.name for c in fields(AutoEvalColumn) if c.displayed_by_default],
68
- cant_deselect=[c.name for c in fields(AutoEvalColumn) if c.never_hidden],
69
- label="Select Columns to Display:",
70
- ),
71
- search_columns=[AutoEvalColumn.model.name, AutoEvalColumn.license.name],
72
- hide_columns=[c.name for c in fields(AutoEvalColumn) if c.hidden],
73
- filter_columns=[
74
- ColumnFilter(AutoEvalColumn.model_type.name, type="checkboxgroup", label="Model types"),
75
- ColumnFilter(AutoEvalColumn.precision.name, type="checkboxgroup", label="Precision"),
76
- ColumnFilter(
77
- AutoEvalColumn.params.name,
78
- type="slider",
79
- min=0.01,
80
- max=150,
81
- label="Select the number of parameters (B)",
82
- ),
83
- ColumnFilter(
84
- AutoEvalColumn.still_on_hub.name, type="boolean", label="Deleted/incomplete", default=True
85
- ),
86
- ],
87
- bool_checkboxgroup_label="Hide models",
88
- interactive=False,
89
- )
90
-
91
-
92
- demo = gr.Blocks(css=custom_css)
93
- with demo:
94
- gr.HTML(TITLE)
95
- gr.Markdown(INTRODUCTION_TEXT, elem_classes="markdown-text")
96
-
97
- with gr.Tabs(elem_classes="tab-buttons") as tabs:
98
- with gr.TabItem("🏅 LLM Benchmark", elem_id="llm-benchmark-tab-table", id=0):
99
- leaderboard = init_leaderboard(LEADERBOARD_DF)
100
-
101
- with gr.TabItem("📝 About", elem_id="llm-benchmark-tab-table", id=2):
102
- gr.Markdown(LLM_BENCHMARKS_TEXT, elem_classes="markdown-text")
103
-
104
- with gr.TabItem("🚀 Submit here! ", elem_id="llm-benchmark-tab-table", id=3):
105
- with gr.Column():
106
- with gr.Row():
107
- gr.Markdown(EVALUATION_QUEUE_TEXT, elem_classes="markdown-text")
108
-
109
- with gr.Column():
110
- with gr.Accordion(
111
- f"✅ Finished Evaluations ({len(finished_eval_queue_df)})",
112
- open=False,
113
- ):
114
- with gr.Row():
115
- finished_eval_table = gr.components.Dataframe(
116
- value=finished_eval_queue_df,
117
- headers=EVAL_COLS,
118
- datatype=EVAL_TYPES,
119
- row_count=5,
120
- )
121
- with gr.Accordion(
122
- f"🔄 Running Evaluation Queue ({len(running_eval_queue_df)})",
123
- open=False,
124
- ):
125
- with gr.Row():
126
- running_eval_table = gr.components.Dataframe(
127
- value=running_eval_queue_df,
128
- headers=EVAL_COLS,
129
- datatype=EVAL_TYPES,
130
- row_count=5,
131
- )
132
-
133
- with gr.Accordion(
134
- f"⏳ Pending Evaluation Queue ({len(pending_eval_queue_df)})",
135
- open=False,
136
- ):
137
- with gr.Row():
138
- pending_eval_table = gr.components.Dataframe(
139
- value=pending_eval_queue_df,
140
- headers=EVAL_COLS,
141
- datatype=EVAL_TYPES,
142
- row_count=5,
143
- )
144
- with gr.Row():
145
- gr.Markdown("# ✉️✨ Submit your model here!", elem_classes="markdown-text")
146
-
147
- with gr.Row():
148
- with gr.Column():
149
- model_name_textbox = gr.Textbox(label="Model name")
150
- revision_name_textbox = gr.Textbox(label="Revision commit", placeholder="main")
151
- model_type = gr.Dropdown(
152
- choices=[t.to_str(" : ") for t in ModelType if t != ModelType.Unknown],
153
- label="Model type",
154
- multiselect=False,
155
- value=None,
156
- interactive=True,
157
- )
158
-
159
- with gr.Column():
160
- precision = gr.Dropdown(
161
- choices=[i.value.name for i in Precision if i != Precision.Unknown],
162
- label="Precision",
163
- multiselect=False,
164
- value="float16",
165
- interactive=True,
166
- )
167
- weight_type = gr.Dropdown(
168
- choices=[i.value.name for i in WeightType],
169
- label="Weights type",
170
- multiselect=False,
171
- value="Original",
172
- interactive=True,
173
- )
174
- base_model_name_textbox = gr.Textbox(label="Base model (for delta or adapter weights)")
175
-
176
- submit_button = gr.Button("Submit Eval")
177
- submission_result = gr.Markdown()
178
- submit_button.click(
179
- add_new_eval,
180
- [
181
- model_name_textbox,
182
- base_model_name_textbox,
183
- revision_name_textbox,
184
- precision,
185
- weight_type,
186
- model_type,
187
- ],
188
- submission_result,
189
- )
190
 
191
- with gr.Row():
192
- with gr.Accordion("📙 Citation", open=False):
193
- citation_button = gr.Textbox(
194
- value=CITATION_BUTTON_TEXT,
195
- label=CITATION_BUTTON_LABEL,
196
- lines=20,
197
- elem_id="citation-button",
198
- show_copy_button=True,
199
  )
200
 
201
- scheduler = BackgroundScheduler()
202
- scheduler.add_job(restart_space, "interval", seconds=1800)
203
- scheduler.start()
204
- demo.queue(default_concurrency_limit=40).launch()
 
1
  import gradio as gr
 
2
  import pandas as pd
3
+ import numpy as np
4
+ import os
5
+ from datetime import datetime
6
+
7
+ # Load the papers data
8
+ def load_papers():
9
+ try:
10
+ papers_df = pd.read_csv('all_papers_0328.csv')
11
+ # Clean up columns if needed and handle missing values
12
+ papers_df = papers_df.fillna('')
13
+
14
+ # Filter out papers with empty titles
15
+ papers_df = papers_df[papers_df['Title'].str.strip() != '']
16
+
17
+ # Ensure Year is integer
18
+ papers_df['Year'] = pd.to_numeric(papers_df['Year'], errors='coerce').fillna(0).astype(int)
19
+
20
+ return papers_df
21
+ except Exception as e:
22
+ print(f"Error loading papers: {e}")
23
+ # Return empty dataframe with expected columns
24
+ return pd.DataFrame(columns=['Title', 'TLDR-EN', 'Section', 'url', 'Year', 'Publish Venue'])
25
+
26
+ # Search function
27
+ def search_papers(search_term, section_filter, year_filter, sort_by):
28
+ papers_df = load_papers()
29
+
30
+ if search_term:
31
+ # Case-insensitive literal search (regex=False avoids crashes on input like "(")
32
+ search_mask = (
33
+ papers_df['Title'].str.contains(search_term, case=False, na=False, regex=False) |
34
+ papers_df['TLDR-EN'].str.contains(search_term, case=False, na=False, regex=False) |
35
+ papers_df['Section'].str.contains(search_term, case=False, na=False, regex=False) |
36
+ papers_df['Publish Venue'].str.contains(search_term, case=False, na=False, regex=False)
37
+ )
38
+ papers_df = papers_df[search_mask]
39
+
40
+ # Apply section filter if selected
41
+ if section_filter != "All Sections":
42
+ papers_df = papers_df[papers_df['Section'] == section_filter]
43
+
44
+ # Apply year filter if selected
45
+ if year_filter != "All Years":
46
+ papers_df = papers_df[papers_df['Year'] == int(year_filter)]
47
+
48
+ # Sort based on selection
49
+ if sort_by == "Year (newest first)":
50
+ papers_df = papers_df.sort_values(by=['Year', 'Title'], ascending=[False, True])
51
+ elif sort_by == "Year (oldest first)":
52
+ papers_df = papers_df.sort_values(by=['Year', 'Title'], ascending=[True, True])
53
+ elif sort_by == "Title (A-Z)":
54
+ papers_df = papers_df.sort_values(by='Title')
55
+ elif sort_by == "Section":
56
+ papers_df = papers_df.sort_values(by=['Section', 'Year', 'Title'], ascending=[True, False, True])
57
+
58
+ # Format for display
59
+ html_output = "<div class='papers-container'>"
60
+
61
+ if len(papers_df) == 0:
62
+ html_output += "<p>No papers found matching your criteria.</p>"
63
+ else:
64
+ for i, row in papers_df.iterrows():
65
+ html_output += f"""
66
+ <div class='paper-card'>
67
+ <div class='paper-title'>
68
+ <a href='{row['url']}' target='_blank'>{row['Title']}</a>
69
+ </div>
70
+ <div class='paper-tldr'>{row['TLDR-EN']}</div>
71
+ <div class='paper-meta'>
72
+ <span class='meta-item section'>{row['Section']}</span>
73
+ <span class='meta-item year'>{row['Year']}</span>
74
+ <span class='meta-item venue'>{row['Publish Venue']}</span>
75
+ </div>
76
+ </div>
77
+ """
78
+
79
+ html_output += "</div>"
80
+
81
+ # Add paper count
82
+ paper_count = len(papers_df)
83
+ count_text = f"<p><strong>{paper_count} papers</strong> found</p>"
84
+
85
+ return count_text + html_output
86
+
87
+ # Get unique sections and years for filtering
88
+ def get_filter_options():
89
+ papers_df = load_papers()
90
+ sections = ["All Sections"] + sorted(papers_df['Section'].unique().tolist())
91
+ years = ["All Years"] + [str(year) for year in sorted(papers_df['Year'].unique().tolist(), reverse=True) if year > 0]
92
+ return sections, years
93
+
94
+ # Custom CSS
95
+ custom_css = """
96
+ /* Main container */
97
+ body {
98
+ font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif;
99
+ }
100
+
101
+ .papers-container {
102
+ display: flex;
103
+ flex-direction: column;
104
+ gap: 18px;
105
+ margin-top: 20px;
106
+ }
107
+
108
+ /* Paper card styling */
109
+ .paper-card {
110
+ border: 1px solid #e0e0e0;
111
+ border-radius: 12px;
112
+ padding: 20px;
113
+ background-color: #ffffff;
114
+ box-shadow: 0 2px 8px rgba(0, 0, 0, 0.05);
115
+ transition: all 0.2s ease;
116
+ display: flex;
117
+ flex-direction: column;
118
+ gap: 10px;
119
+ }
120
+
121
+ .paper-card:hover {
122
+ box-shadow: 0 4px 12px rgba(0, 0, 0, 0.1);
123
+ transform: translateY(-2px);
124
+ border-color: #d0d0d0;
125
+ }
126
+
127
+ .paper-title {
128
+ font-size: 18px;
129
+ font-weight: 600;
130
+ line-height: 1.4;
131
+ margin-bottom: 4px;
132
+ }
133
+
134
+ .paper-title a {
135
+ color: #2563EB;
136
+ text-decoration: none;
137
+ }
138
+
139
+ .paper-title a:hover {
140
+ text-decoration: underline;
141
+ }
142
 
143
+ .paper-tldr {
144
+ font-size: 14px;
145
+ color: #4B5563;
146
+ line-height: 1.5;
147
+ margin: 8px 0;
148
+ }
149
+
150
+ .paper-meta {
151
+ display: flex;
152
+ flex-wrap: wrap;
153
+ gap: 8px;
154
+ margin-top: 4px;
155
+ }
156
+
157
+ .meta-item {
158
+ background-color: #F3F4F6;
159
+ border-radius: 16px;
160
+ padding: 4px 12px;
161
+ font-size: 12px;
162
+ color: #4B5563;
163
+ font-weight: 500;
164
+ }
165
+
166
+ /* Section colors */
167
+ .meta-item.section {
168
+ background-color: #DBEAFE;
169
+ color: #1E40AF;
170
+ }
171
+
172
+ .meta-item.year {
173
+ background-color: #FEE2E2;
174
+ color: #991B1B;
175
+ }
176
+
177
+ .meta-item.venue {
178
+ background-color: #E0E7FF;
179
+ color: #3730A3;
180
+ }
181
+
182
+ /* Responsive design */
183
+ @media (max-width: 768px) {
184
+ .paper-card {
185
+ padding: 16px;
186
+ }
187
+
188
+ .paper-title {
189
+ font-size: 16px;
190
+ }
191
+
192
+ .paper-tldr {
193
+ font-size: 13px;
194
+ }
195
+
196
+ .meta-item {
197
+ font-size: 11px;
198
+ padding: 3px 10px;
199
+ }
200
+ }
201
+
202
+ /* Results count styling */
203
+ p strong {
204
+ color: #2563EB;
205
+ }
206
+ """
207
+
208
+ # Create the Gradio interface
209
+ def create_interface():
210
+ sections, years = get_filter_options()
211
+
212
+ # Get paper statistics
213
+ papers_df = load_papers()
214
+ total_papers = len(papers_df)
215
+ paper_counts_by_section = papers_df['Section'].value_counts().to_dict()
216
+ paper_counts_by_year = papers_df['Year'].value_counts().to_dict()
217
+
218
+ # Filter out year 0 if it exists
219
+ min_year = min([year for year in paper_counts_by_year.keys() if year > 0]) if paper_counts_by_year else 'N/A'
220
+ max_year = max(paper_counts_by_year.keys()) if paper_counts_by_year else 'N/A'
221
+
222
+ # Project description with linked paper
223
+ project_description = f"""
224
+ # Large Language Model Agent: A Survey on Methodology, Applications and Challenges
225
+
226
+ This application showcases papers from our comprehensive survey on Large Language Model (LLM) agents. We organize papers across key categories including agent construction, collaboration mechanisms, evolution, tools, security, benchmarks, and applications.
227
+
228
+ ## About the Survey
229
+
230
+ The era of intelligent agents is upon us, driven by revolutionary advancements in large language models. Large Language Model (LLM) agents, with goal-driven behaviors and dynamic adaptation capabilities, potentially represent a critical pathway toward artificial general intelligence.
231
+
232
+ This survey systematically deconstructs LLM agent systems through a methodology-centered taxonomy, linking architectural foundations, collaboration mechanisms, and evolutionary pathways. We unify fragmented research threads by revealing fundamental connections between agent design principles and their emergent behaviors in complex environments.
233
+
234
+ [View the full paper on arXiv](https://arxiv.org/abs/2503.21460)
235
+ [Explore our GitHub repository](https://github.com/luo-junyu/Awesome-Agent-Papers)
236
+
237
+ ## Submit Your Paper
238
+
239
+ We welcome contributions to expand our collection. To submit your paper:
240
+ - Email us at luo.junyu@outlook.com with your paper details
241
+ - Create a pull request on our [GitHub repository](https://github.com/luo-junyu/Awesome-Agent-Papers)
242
+
243
+ ## Collection Overview
244
+
245
+ - **Total Papers**: {total_papers}
246
+ - **Categories**: {len(paper_counts_by_section)}
247
+ - **Year Range**: {min_year} - {max_year}
248
+ """
249
+
250
+ with gr.Blocks(css=custom_css, theme=gr.themes.Soft()) as demo:
251
+ gr.Markdown(project_description)
252
+
253
+ with gr.Row():
254
+ with gr.Column(scale=3):
255
+ search_input = gr.Textbox(
256
+ label="Search Papers",
257
+ placeholder="Enter keywords to search titles, summaries, sections, or venues",
258
+ show_label=True
259
+ )
260
+
261
+ with gr.Column(scale=1):
262
+ section_dropdown = gr.Dropdown(
263
+ choices=sections,
264
+ value="All Sections",
265
+ label="Filter by Section"
266
+ )
267
+
268
+ with gr.Row():
269
+ with gr.Column(scale=1):
270
+ year_dropdown = gr.Dropdown(
271
+ choices=years,
272
+ value="All Years",
273
+ label="Filter by Year"
274
+ )
275
+
276
+ with gr.Column(scale=1):
277
+ sort_dropdown = gr.Dropdown(
278
+ choices=[
279
+ "Year (newest first)",
280
+ "Year (oldest first)",
281
+ "Title (A-Z)",
282
+ "Section"
283
+ ],
284
+ value="Year (newest first)",
285
+ label="Sort by"
286
+ )
287
+
288
+ search_button = gr.Button("Search", variant="primary")
289
+
290
+ # Results display
291
+ results_html = gr.HTML(label="Search Results")
292
+
293
+ # Section distribution chart
294
+ section_data = [[section, count] for section, count in paper_counts_by_section.items()]
295
+ section_data.sort(key=lambda x: x[1], reverse=True)
296
+
297
+ with gr.Accordion("Paper Distribution by Section", open=False):
298
+ gr.Dataframe(
299
+ headers=["Section", "Count"],
300
+ datatype=["str", "number"],
301
+ value=section_data
302
  )
303
+
304
+ # Year distribution chart
305
+ year_data = [[str(year), count] for year, count in paper_counts_by_year.items() if year > 0]
306
+ year_data.sort(key=lambda x: int(x[0]), reverse=True)
307
+
308
+ with gr.Accordion("Paper Distribution by Year", open=False):
309
+ gr.Dataframe(
310
+ headers=["Year", "Count"],
311
+ datatype=["str", "number"],
312
+ value=year_data
313
+ )
314
+
315
+ # Add example searches
316
+ gr.Examples(
317
+ examples=[
318
+ ["agent collaboration", "All Sections", "All Years", "Year (newest first)"],
319
+ ["security", "Security", "All Years", "Year (newest first)"],
320
+ ["benchmark", "Datasets & Benchmarks", "2024", "Year (newest first)"],
321
+ ["tools", "Tools", "All Years", "Year (newest first)"],
322
+ ],
323
+ inputs=[search_input, section_dropdown, year_dropdown, sort_dropdown],
324
+ outputs=results_html,
325
+ fn=search_papers,
326
+ cache_examples=True,
327
+ )
328
+
329
+ # Set up search on button click and input changes
330
+ search_button.click(
331
+ fn=search_papers,
332
+ inputs=[search_input, section_dropdown, year_dropdown, sort_dropdown],
333
+ outputs=results_html
334
+ )
335
+
336
+ # Also search when dropdown values change
337
+ section_dropdown.change(
338
+ fn=search_papers,
339
+ inputs=[search_input, section_dropdown, year_dropdown, sort_dropdown],
340
+ outputs=results_html
341
+ )
342
+
343
+ year_dropdown.change(
344
+ fn=search_papers,
345
+ inputs=[search_input, section_dropdown, year_dropdown, sort_dropdown],
346
+ outputs=results_html
347
+ )
348
+
349
+ sort_dropdown.change(
350
+ fn=search_papers,
351
+ inputs=[search_input, section_dropdown, year_dropdown, sort_dropdown],
352
+ outputs=results_html
353
+ )
354
+
355
+ # Load initial results on page load
356
+ demo.load(
357
+ fn=lambda: search_papers("", "All Sections", "All Years", "Year (newest first)"),
358
+ inputs=None,
359
+ outputs=results_html
360
+ )
361
+
362
+ return demo
363
+
364
+ # Create and launch the interface
365
+ demo = create_interface()
366
 
367
+ if __name__ == "__main__":
368
+ demo.launch()
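For review purposes, the new search path can be smoke-tested without starting the UI. A rough sketch, assuming `app.py` and `all_papers_0328.csv` sit in the working directory; note that importing `app` builds the Gradio interface (and caches the examples) at module level, even though `launch()` only runs under `__main__`.

```python
# Rough smoke test of the refactored search path, bypassing the Gradio UI.
# Importing app runs create_interface() at import time, so gradio must be installed.
from app import load_papers, search_papers

papers = load_papers()
print(f"Loaded {len(papers)} papers")

html = search_papers("benchmark", "All Sections", "All Years", "Year (newest first)")
print(html[:300])  # an HTML fragment that begins with the "... papers found" count
```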
 
 
requirements.txt CHANGED
@@ -1,16 +1,4 @@
1
- APScheduler
2
- black
3
- datasets
4
- gradio
5
- gradio[oauth]
6
- gradio_leaderboard==0.0.13
7
- gradio_client
8
- huggingface-hub>=0.18.0
9
- matplotlib
10
- numpy
11
- pandas
12
- python-dateutil
13
- tqdm
14
- transformers
15
- tokenizers>=0.15.0
16
- sentencepiece
 
1
+ gradio>=3.50.2
2
+ pandas>=1.3.5
3
+ numpy>=1.21.6
4
+ matplotlib
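As a side note for local setups, the trimmed dependency list can be checked against the installed environment with the standard library; the package names below simply mirror the new requirements.txt.

```python
# Print installed versions of the trimmed dependency set (stdlib only).
from importlib.metadata import PackageNotFoundError, version

for pkg in ("gradio", "pandas", "numpy", "matplotlib"):
    try:
        print(f"{pkg}=={version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg} is not installed")
```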
 
src/about.py CHANGED
@@ -21,52 +21,70 @@ NUM_FEWSHOT = 0 # Change with your few shot
21
 
22
 
23
  # Your leaderboard name
24
- TITLE = """<h1 align="center" id="space-title">Demo leaderboard</h1>"""
25
 
26
  # What does your leaderboard evaluate?
27
  INTRODUCTION_TEXT = """
28
- Intro text
29
  """
30
 
31
  # Which evaluations are you running? how can people reproduce what you have?
32
  LLM_BENCHMARKS_TEXT = f"""
33
- ## How it works
34
 
35
- ## Reproducibility
36
- To reproduce our results, here is the commands you can run:
 
 
37
 
38
- """
 
 
39
 
40
- EVALUATION_QUEUE_TEXT = """
41
- ## Some good practices before submitting a model
42
 
43
- ### 1) Make sure you can load your model and tokenizer using AutoClasses:
44
- ```python
45
- from transformers import AutoConfig, AutoModel, AutoTokenizer
46
- config = AutoConfig.from_pretrained("your model name", revision=revision)
47
- model = AutoModel.from_pretrained("your model name", revision=revision)
48
- tokenizer = AutoTokenizer.from_pretrained("your model name", revision=revision)
49
- ```
50
- If this step fails, follow the error messages to debug your model before submitting it. It's likely your model has been improperly uploaded.
51
 
52
- Note: make sure your model is public!
53
- Note: if your model needs `use_remote_code=True`, we do not support this option yet but we are working on adding it, stay posted!
54
 
55
- ### 2) Convert your model weights to [safetensors](https://huggingface.co/docs/safetensors/index)
56
- It's a new format for storing weights which is safer and faster to load and use. It will also allow us to add the number of parameters of your model to the `Extended Viewer`!
57
 
58
- ### 3) Make sure your model has an open license!
59
- This is a leaderboard for Open LLMs, and we'd love for as many people as possible to know they can use your model 🤗
60
 
61
- ### 4) Fill up your model card
62
- When we add extra information about models to the leaderboard, it will be automatically taken from the model card
 
 
63
 
64
- ## In case of model failure
65
- If your model is displayed in the `FAILED` category, its execution stopped.
66
- Make sure you have followed the above steps first.
67
- If everything is done, check you can launch the EleutherAIHarness on your model locally, using the above command without modifications (you can add `--limit` to limit the number of examples per task).
68
  """
69
 
70
- CITATION_BUTTON_LABEL = "Copy the following snippet to cite these results"
71
  CITATION_BUTTON_TEXT = r"""
72
  """
 
21
 
22
 
23
  # Your leaderboard name
24
+ TITLE = """<h1 align="center" id="space-title">LLM Agent Papers</h1>"""
25
 
26
  # What does your leaderboard evaluate?
27
  INTRODUCTION_TEXT = """
28
+ # Large Language Model Agent: A Survey on Methodology, Applications and Challenges
29
+
30
+ The era of intelligent agents is upon us, driven by revolutionary advancements in large language models.
31
+ Large Language Model (LLM) agents, with goal-driven behaviors and dynamic adaptation capabilities, potentially
32
+ represent a critical pathway toward artificial general intelligence.
33
+
34
+ This application showcases papers from our comprehensive survey on Large Language Model (LLM) agents.
35
+ We organize papers across key categories including agent construction, collaboration mechanisms, evolution,
36
+ tools, security, benchmarks, and applications.
37
  """
38
 
39
  # Which evaluations are you running? how can people reproduce what you have?
40
  LLM_BENCHMARKS_TEXT = f"""
41
+ ## Survey Overview
42
 
43
+ This survey systematically deconstructs LLM agent systems through a methodology-centered taxonomy,
44
+ linking architectural foundations, collaboration mechanisms, and evolutionary pathways.
45
+ We unify fragmented research threads by revealing fundamental connections between agent design
46
+ principles and their emergent behaviors in complex environments.
47
 
48
+ Our work provides a unified architectural perspective, examining how agents are constructed,
49
+ how they collaborate, and how they evolve over time, while also addressing evaluation methodologies,
50
+ tool applications, practical challenges, and diverse application domains.
51
 
52
+ ### Paper Categories
 
53
 
54
+ Our collection organizes papers into several key categories:
55
 
56
+ - **Introduction**: Survey papers and foundational works introducing LLM agents
57
+ - **Construction**: Papers on building and designing agents
58
+ - **Collaboration**: Multi-agent systems and communication methods
59
+ - **Evolution**: Learning and improvement of agents over time
60
+ - **Tools**: Integration of external tools with LLM agents
61
+ - **Security**: Safety, alignment, and ethical considerations
62
+ - **Datasets & Benchmarks**: Evaluation frameworks and resources
63
+ - **Applications**: Domain-specific uses in science, medicine, etc.
64
 
65
+ View the full paper on [arXiv](https://arxiv.org/abs/2503.21460) and explore our GitHub repository at
66
+ [https://github.com/luo-junyu/Awesome-Agent-Papers](https://github.com/luo-junyu/Awesome-Agent-Papers)
67
+ """
68
+
69
+ EVALUATION_QUEUE_TEXT = """
70
+ ## How to Contribute
71
 
72
+ If you have a paper that you believe should be included in our collection:
 
73
 
74
+ 1. Check if the paper is already in our database
75
+ 2. Submit your paper at [https://forms.office.com/r/sW0Zzymi5b](https://forms.office.com/r/sW0Zzymi5b) or email us at luo.junyu@outlook.com
76
+ 3. Include the paper's title, authors, abstract, URL, publication venue, and year
77
+ 4. Suggest a section/category for the paper
78
 
79
+ We regularly update the repository and this application with new submissions.
 
 
 
80
  """
81
 
82
+ CITATION_BUTTON_LABEL = "Cite our survey paper"
83
  CITATION_BUTTON_TEXT = r"""
84
+ @article{agentsurvey2025,
85
+ title={Large Language Model Agent: A Survey on Methodology, Applications and Challenges},
86
+ author={Junyu Luo and Weizhi Zhang and Ye Yuan and Yusheng Zhao and Junwei Yang and Yiyang Gu and Bohan Wu and Binqi Chen and Ziyue Qiao and Qingqing Long and Rongcheng Tu and Xiao Luo and Wei Ju and Zhiping Xiao and Yifan Wang and Meng Xiao and Chenwu Liu and Jingyang Yuan and Shichang Zhang and Yiqiao Jin and Fan Zhang and Xian Wu and Hanqing Zhao and Dacheng Tao and Philip S. Yu and Ming Zhang},
87
+ journal={arXiv preprint arXiv:2503.21460},
88
+ year={2025}
89
+ }
90
  """
style.css ADDED
@@ -0,0 +1,187 @@
1
+ /* Main container */
2
+ body {
3
+ font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif;
4
+ }
5
+
6
+ .papers-container {
7
+ display: flex;
8
+ flex-direction: column;
9
+ gap: 18px;
10
+ margin-top: 20px;
11
+ }
12
+
13
+ /* Paper card styling */
14
+ .paper-card {
15
+ border: 1px solid #e0e0e0;
16
+ border-radius: 12px;
17
+ padding: 20px;
18
+ background-color: #ffffff;
19
+ box-shadow: 0 2px 8px rgba(0, 0, 0, 0.05);
20
+ transition: all 0.2s ease;
21
+ display: flex;
22
+ flex-direction: column;
23
+ gap: 10px;
24
+ }
25
+
26
+ .paper-card:hover {
27
+ box-shadow: 0 4px 12px rgba(0, 0, 0, 0.1);
28
+ transform: translateY(-2px);
29
+ border-color: #d0d0d0;
30
+ }
31
+
32
+ .paper-title {
33
+ font-size: 18px;
34
+ font-weight: 600;
35
+ line-height: 1.4;
36
+ margin-bottom: 4px;
37
+ }
38
+
39
+ .paper-title a {
40
+ color: #2563EB;
41
+ text-decoration: none;
42
+ }
43
+
44
+ .paper-title a:hover {
45
+ text-decoration: underline;
46
+ }
47
+
48
+ .paper-tldr {
49
+ font-size: 14px;
50
+ color: #4B5563;
51
+ line-height: 1.5;
52
+ margin: 8px 0;
53
+ }
54
+
55
+ .paper-meta {
56
+ display: flex;
57
+ flex-wrap: wrap;
58
+ gap: 8px;
59
+ margin-top: 4px;
60
+ }
61
+
62
+ .meta-item {
63
+ background-color: #F3F4F6;
64
+ border-radius: 16px;
65
+ padding: 4px 12px;
66
+ font-size: 12px;
67
+ color: #4B5563;
68
+ font-weight: 500;
69
+ }
70
+
71
+ /* Section colors */
72
+ .meta-item:nth-child(1) {
73
+ background-color: #DBEAFE;
74
+ color: #1E40AF;
75
+ }
76
+
77
+ .meta-item:nth-child(2) {
78
+ background-color: #FEE2E2;
79
+ color: #991B1B;
80
+ }
81
+
82
+ .meta-item:nth-child(3) {
83
+ background-color: #E0E7FF;
84
+ color: #3730A3;
85
+ }
86
+
87
+ /* Search interface */
88
+ .search-container {
89
+ margin-bottom: 24px;
90
+ padding: 16px;
91
+ background-color: #F9FAFB;
92
+ border-radius: 12px;
93
+ border: 1px solid #E5E7EB;
94
+ }
95
+
96
+ /* Button styling */
97
+ .primary-button {
98
+ background-color: #2563EB;
99
+ color: white;
100
+ border: none;
101
+ border-radius: 8px;
102
+ padding: 8px 16px;
103
+ font-weight: 500;
104
+ cursor: pointer;
105
+ transition: background-color 0.2s;
106
+ }
107
+
108
+ .primary-button:hover {
109
+ background-color: #1D4ED8;
110
+ }
111
+
112
+ /* Section headers */
113
+ .section-header {
114
+ border-bottom: 2px solid #E5E7EB;
115
+ padding-bottom: 8px;
116
+ margin: 32px 0 16px 0;
117
+ font-weight: 600;
118
+ color: #1F2937;
119
+ }
120
+
121
+ /* Responsive design */
122
+ @media (max-width: 768px) {
123
+ .paper-card {
124
+ padding: 16px;
125
+ }
126
+
127
+ .paper-title {
128
+ font-size: 16px;
129
+ }
130
+
131
+ .paper-tldr {
132
+ font-size: 13px;
133
+ }
134
+
135
+ .meta-item {
136
+ font-size: 11px;
137
+ padding: 3px 10px;
138
+ }
139
+ }
140
+
141
+ /* Gradio container customization */
142
+ .gradio-container {
143
+ max-width: 1200px !important;
144
+ margin: 0 auto !important;
145
+ }
146
+
147
+ /* Results count styling */
148
+ p strong {
149
+ color: #2563EB;
150
+ }
151
+
152
+ /* Accordion styling */
153
+ .accordion .label {
154
+ font-weight: 600;
155
+ color: #1F2937;
156
+ }
157
+
158
+ /* Table styling */
159
+ table {
160
+ width: 100%;
161
+ border-collapse: collapse;
162
+ }
163
+
164
+ th {
165
+ background-color: #F3F4F6;
166
+ text-align: left;
167
+ padding: 12px;
168
+ font-weight: 600;
169
+ }
170
+
171
+ td {
172
+ padding: 12px;
173
+ border-bottom: 1px solid #E5E7EB;
174
+ }
175
+
176
+ /* Examples styling */
177
+ .examples-panel {
178
+ margin-top: 24px;
179
+ padding: 16px;
180
+ background-color: #F9FAFB;
181
+ border-radius: 12px;
182
+ }
183
+
184
+ .examples-header {
185
+ font-weight: 600;
186
+ margin-bottom: 12px;
187
+ }