luojunyu committed
Commit f8d35c2 · 1 Parent(s): 94852fa
Files changed (6)
  1. README.md +71 -0
  2. all_papers_0328.csv +0 -0
  3. app.py +363 -199
  4. requirements.txt +4 -16
  5. src/about.py +47 -29
  6. style.css +187 -0
README.md CHANGED
@@ -11,6 +11,77 @@ short_description: LLM Agent Research Collection
11
  sdk_version: 5.19.0
12
  ---
13
 
14
+ # Large Language Model Agent Papers Explorer
15
+
16
+ This is a companion application for the paper "Large Language Model Agent: A Survey on Methodology, Applications and Challenges" ([arXiv:2503.21460](https://arxiv.org/abs/2503.21460)).
17
+
18
+ ## About
19
+
20
+ The application provides an interactive interface to explore papers from our comprehensive survey on Large Language Model (LLM) agents. It allows you to search and filter papers across key categories including agent construction, collaboration mechanisms, evolution, tools, security, benchmarks, and applications.
21
+
22
+ ![Screenshot of the application](/screenshots/app-screenshot.png)
23
+
24
+ ## Key Features
25
+
26
+ - **Paper Search**: Find papers by keywords, titles, summaries, or publication venues
27
+ - **Category Filtering**: Browse papers by sections/categories
28
+ - **Year Filtering**: Filter papers by publication year
29
+ - **Sorting Options**: Sort papers by year, title, or section
30
+ - **Paper Statistics**: View distributions of papers across categories and years
31
+ - **Direct Links**: Access original papers through direct links to their sources
32
+
33
+ ## Collection Overview
34
+
35
+ Our paper collection spans multiple categories:
36
+
37
+ - **Introduction**: Survey papers and foundational works introducing LLM agents
38
+ - **Construction**: Papers on building and designing agents
39
+ - **Collaboration**: Multi-agent systems and communication methods
40
+ - **Evolution**: Learning and improvement of agents over time
41
+ - **Tools**: Integration of external tools with LLM agents
42
+ - **Security**: Safety, alignment, and ethical considerations
43
+ - **Datasets & Benchmarks**: Evaluation frameworks and resources
44
+ - **Applications**: Domain-specific uses in science, medicine, etc.
45
+
46
+ ## Related Resources
47
+
48
+ - [Full Survey Paper on arXiv](https://arxiv.org/abs/2503.21460)
49
+ - [Awesome-Agent-Papers GitHub Repository](https://github.com/luo-junyu/Awesome-Agent-Papers)
50
+
51
+ ## How to Contribute
52
+
53
+ If you have a paper that you believe should be included in our collection:
54
+
55
+ 1. Check if the paper is already in our database
56
+ 2. Submit your paper at [this form](https://forms.office.com/r/sW0Zzymi5b) or email us at luo.junyu@outlook.com
57
+ 3. Include the paper's title, authors, abstract, URL, publication venue, and year
58
+ 4. Suggest a section/category for the paper
59
+
60
+ ## Citation
61
+
62
+ If you find our survey helpful, please consider citing our work:
63
+
64
+ ```
65
+ @article{agentsurvey2025,
66
+ title={Large Language Model Agent: A Survey on Methodology, Applications and Challenges},
67
+ author={Junyu Luo and Weizhi Zhang and Ye Yuan and Yusheng Zhao and Junwei Yang and Yiyang Gu and Bohan Wu and Binqi Chen and Ziyue Qiao and Qingqing Long and Rongcheng Tu and Xiao Luo and Wei Ju and Zhiping Xiao and Yifan Wang and Meng Xiao and Chenwu Liu and Jingyang Yuan and Shichang Zhang and Yiqiao Jin and Fan Zhang and Xian Wu and Hanqing Zhao and Dacheng Tao and Philip S. Yu and Ming Zhang},
68
+ journal={arXiv preprint arXiv:2503.21460},
69
+ year={2025}
70
+ }
71
+ ```
72
+
73
+ ## Local Development
74
+
75
+ To run this application locally:
76
+
77
+ 1. Clone this repository
78
+ 2. Install the required dependencies with `pip install -r requirements.txt`
79
+ 3. Run the application with `python app.py`
80
+
81
+ ## License
82
+
83
+ This project is licensed under the MIT License - see the LICENSE file for details.
84
+
85
  # Start the configuration
86
 
87
  Most of the variables to change for a default leaderboard are in `src/env.py` (replace the path for your leaderboard) and `src/about.py` (for tasks).
all_papers_0328.csv ADDED
The diff for this file is too large to render. See raw diff
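Since the CSV diff is not rendered, a quick local inspection with pandas is the easiest way to review the new data file. A minimal sketch, assuming the column names that the updated `app.py` expects (`Title`, `TLDR-EN`, `Section`, `url`, `Year`, `Publish Venue`); verify them against the raw file.

```python
# Minimal sketch for inspecting all_papers_0328.csv locally.
# Column names are inferred from load_papers() in app.py, not from the raw CSV itself.
import pandas as pd

df = pd.read_csv("all_papers_0328.csv")
print(df.shape)
print(df.columns.tolist())  # expected: Title, TLDR-EN, Section, url, Year, Publish Venue
print(df["Section"].value_counts())
print(df["Year"].value_counts().sort_index())
```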
 
app.py CHANGED
@@ -1,204 +1,368 @@
1
  import gradio as gr
2
- from gradio_leaderboard import Leaderboard, ColumnFilter, SelectColumns
3
  import pandas as pd
4
- from apscheduler.schedulers.background import BackgroundScheduler
5
- from huggingface_hub import snapshot_download
6
-
7
- from src.about import (
8
- CITATION_BUTTON_LABEL,
9
- CITATION_BUTTON_TEXT,
10
- EVALUATION_QUEUE_TEXT,
11
- INTRODUCTION_TEXT,
12
- LLM_BENCHMARKS_TEXT,
13
- TITLE,
14
- )
15
- from src.display.css_html_js import custom_css
16
- from src.display.utils import (
17
- BENCHMARK_COLS,
18
- COLS,
19
- EVAL_COLS,
20
- EVAL_TYPES,
21
- AutoEvalColumn,
22
- ModelType,
23
- fields,
24
- WeightType,
25
- Precision
26
- )
27
- from src.envs import API, EVAL_REQUESTS_PATH, EVAL_RESULTS_PATH, QUEUE_REPO, REPO_ID, RESULTS_REPO, TOKEN
28
- from src.populate import get_evaluation_queue_df, get_leaderboard_df
29
- from src.submission.submit import add_new_eval
30
-
31
-
32
- def restart_space():
33
- API.restart_space(repo_id=REPO_ID)
34
-
35
- ### Space initialisation
36
- try:
37
- print(EVAL_REQUESTS_PATH)
38
- snapshot_download(
39
- repo_id=QUEUE_REPO, local_dir=EVAL_REQUESTS_PATH, repo_type="dataset", tqdm_class=None, etag_timeout=30, token=TOKEN
40
- )
41
- except Exception:
42
- restart_space()
43
- try:
44
- print(EVAL_RESULTS_PATH)
45
- snapshot_download(
46
- repo_id=RESULTS_REPO, local_dir=EVAL_RESULTS_PATH, repo_type="dataset", tqdm_class=None, etag_timeout=30, token=TOKEN
47
- )
48
- except Exception:
49
- restart_space()
50
-
51
-
52
- LEADERBOARD_DF = get_leaderboard_df(EVAL_RESULTS_PATH, EVAL_REQUESTS_PATH, COLS, BENCHMARK_COLS)
53
-
54
- (
55
- finished_eval_queue_df,
56
- running_eval_queue_df,
57
- pending_eval_queue_df,
58
- ) = get_evaluation_queue_df(EVAL_REQUESTS_PATH, EVAL_COLS)
59
-
60
- def init_leaderboard(dataframe):
61
- if dataframe is None or dataframe.empty:
62
- raise ValueError("Leaderboard DataFrame is empty or None.")
63
- return Leaderboard(
64
- value=dataframe,
65
- datatype=[c.type for c in fields(AutoEvalColumn)],
66
- select_columns=SelectColumns(
67
- default_selection=[c.name for c in fields(AutoEvalColumn) if c.displayed_by_default],
68
- cant_deselect=[c.name for c in fields(AutoEvalColumn) if c.never_hidden],
69
- label="Select Columns to Display:",
70
- ),
71
- search_columns=[AutoEvalColumn.model.name, AutoEvalColumn.license.name],
72
- hide_columns=[c.name for c in fields(AutoEvalColumn) if c.hidden],
73
- filter_columns=[
74
- ColumnFilter(AutoEvalColumn.model_type.name, type="checkboxgroup", label="Model types"),
75
- ColumnFilter(AutoEvalColumn.precision.name, type="checkboxgroup", label="Precision"),
76
- ColumnFilter(
77
- AutoEvalColumn.params.name,
78
- type="slider",
79
- min=0.01,
80
- max=150,
81
- label="Select the number of parameters (B)",
82
- ),
83
- ColumnFilter(
84
- AutoEvalColumn.still_on_hub.name, type="boolean", label="Deleted/incomplete", default=True
85
- ),
86
- ],
87
- bool_checkboxgroup_label="Hide models",
88
- interactive=False,
89
- )
90
-
91
-
92
- demo = gr.Blocks(css=custom_css)
93
- with demo:
94
- gr.HTML(TITLE)
95
- gr.Markdown(INTRODUCTION_TEXT, elem_classes="markdown-text")
96
-
97
- with gr.Tabs(elem_classes="tab-buttons") as tabs:
98
- with gr.TabItem("🏅 LLM Benchmark", elem_id="llm-benchmark-tab-table", id=0):
99
- leaderboard = init_leaderboard(LEADERBOARD_DF)
100
-
101
- with gr.TabItem("📝 About", elem_id="llm-benchmark-tab-table", id=2):
102
- gr.Markdown(LLM_BENCHMARKS_TEXT, elem_classes="markdown-text")
103
-
104
- with gr.TabItem("🚀 Submit here! ", elem_id="llm-benchmark-tab-table", id=3):
105
- with gr.Column():
106
- with gr.Row():
107
- gr.Markdown(EVALUATION_QUEUE_TEXT, elem_classes="markdown-text")
108
-
109
- with gr.Column():
110
- with gr.Accordion(
111
- f"✅ Finished Evaluations ({len(finished_eval_queue_df)})",
112
- open=False,
113
- ):
114
- with gr.Row():
115
- finished_eval_table = gr.components.Dataframe(
116
- value=finished_eval_queue_df,
117
- headers=EVAL_COLS,
118
- datatype=EVAL_TYPES,
119
- row_count=5,
120
- )
121
- with gr.Accordion(
122
- f"🔄 Running Evaluation Queue ({len(running_eval_queue_df)})",
123
- open=False,
124
- ):
125
- with gr.Row():
126
- running_eval_table = gr.components.Dataframe(
127
- value=running_eval_queue_df,
128
- headers=EVAL_COLS,
129
- datatype=EVAL_TYPES,
130
- row_count=5,
131
- )
132
-
133
- with gr.Accordion(
134
- f"⏳ Pending Evaluation Queue ({len(pending_eval_queue_df)})",
135
- open=False,
136
- ):
137
- with gr.Row():
138
- pending_eval_table = gr.components.Dataframe(
139
- value=pending_eval_queue_df,
140
- headers=EVAL_COLS,
141
- datatype=EVAL_TYPES,
142
- row_count=5,
143
- )
144
- with gr.Row():
145
- gr.Markdown("# ✉️✨ Submit your model here!", elem_classes="markdown-text")
146
-
147
- with gr.Row():
148
- with gr.Column():
149
- model_name_textbox = gr.Textbox(label="Model name")
150
- revision_name_textbox = gr.Textbox(label="Revision commit", placeholder="main")
151
- model_type = gr.Dropdown(
152
- choices=[t.to_str(" : ") for t in ModelType if t != ModelType.Unknown],
153
- label="Model type",
154
- multiselect=False,
155
- value=None,
156
- interactive=True,
157
- )
158
-
159
- with gr.Column():
160
- precision = gr.Dropdown(
161
- choices=[i.value.name for i in Precision if i != Precision.Unknown],
162
- label="Precision",
163
- multiselect=False,
164
- value="float16",
165
- interactive=True,
166
- )
167
- weight_type = gr.Dropdown(
168
- choices=[i.value.name for i in WeightType],
169
- label="Weights type",
170
- multiselect=False,
171
- value="Original",
172
- interactive=True,
173
- )
174
- base_model_name_textbox = gr.Textbox(label="Base model (for delta or adapter weights)")
175
-
176
- submit_button = gr.Button("Submit Eval")
177
- submission_result = gr.Markdown()
178
- submit_button.click(
179
- add_new_eval,
180
- [
181
- model_name_textbox,
182
- base_model_name_textbox,
183
- revision_name_textbox,
184
- precision,
185
- weight_type,
186
- model_type,
187
- ],
188
- submission_result,
189
- )
190
 
191
- with gr.Row():
192
- with gr.Accordion("📙 Citation", open=False):
193
- citation_button = gr.Textbox(
194
- value=CITATION_BUTTON_TEXT,
195
- label=CITATION_BUTTON_LABEL,
196
- lines=20,
197
- elem_id="citation-button",
198
- show_copy_button=True,
199
  )
200
 
201
- scheduler = BackgroundScheduler()
202
- scheduler.add_job(restart_space, "interval", seconds=1800)
203
- scheduler.start()
204
- demo.queue(default_concurrency_limit=40).launch()
 
1
  import gradio as gr
 
2
  import pandas as pd
3
+ import numpy as np
4
+ import os
5
+ from datetime import datetime
6
+
7
+ # Load the papers data
8
+ def load_papers():
9
+ try:
10
+ papers_df = pd.read_csv('all_papers_0328.csv')
11
+ # Clean up columns if needed and handle missing values
12
+ papers_df = papers_df.fillna('')
13
+
14
+ # Filter out papers with empty titles
15
+ papers_df = papers_df[papers_df['Title'].str.strip() != '']
16
+
17
+ # Ensure Year is integer
18
+ papers_df['Year'] = pd.to_numeric(papers_df['Year'], errors='coerce').fillna(0).astype(int)
19
+
20
+ return papers_df
21
+ except Exception as e:
22
+ print(f"Error loading papers: {e}")
23
+ # Return empty dataframe with expected columns
24
+ return pd.DataFrame(columns=['Title', 'TLDR-EN', 'Section', 'url', 'Year', 'Publish Venue'])
25
+
26
+ # Search function
27
+ def search_papers(search_term, section_filter, year_filter, sort_by):
28
+ papers_df = load_papers()
29
+
30
+ if search_term:
31
+ # Case-insensitive literal search (regex=False avoids crashes on input like "(")
32
+ search_mask = (
33
+ papers_df['Title'].str.contains(search_term, case=False, na=False, regex=False) |
34
+ papers_df['TLDR-EN'].str.contains(search_term, case=False, na=False, regex=False) |
35
+ papers_df['Section'].str.contains(search_term, case=False, na=False, regex=False) |
36
+ papers_df['Publish Venue'].str.contains(search_term, case=False, na=False, regex=False)
37
+ )
38
+ papers_df = papers_df[search_mask]
39
+
40
+ # Apply section filter if selected
41
+ if section_filter != "All Sections":
42
+ papers_df = papers_df[papers_df['Section'] == section_filter]
43
+
44
+ # Apply year filter if selected
45
+ if year_filter != "All Years":
46
+ papers_df = papers_df[papers_df['Year'] == int(year_filter)]
47
+
48
+ # Sort based on selection
49
+ if sort_by == "Year (newest first)":
50
+ papers_df = papers_df.sort_values(by=['Year', 'Title'], ascending=[False, True])
51
+ elif sort_by == "Year (oldest first)":
52
+ papers_df = papers_df.sort_values(by=['Year', 'Title'], ascending=[True, True])
53
+ elif sort_by == "Title (A-Z)":
54
+ papers_df = papers_df.sort_values(by='Title')
55
+ elif sort_by == "Section":
56
+ papers_df = papers_df.sort_values(by=['Section', 'Year', 'Title'], ascending=[True, False, True])
57
+
58
+ # Format for display
59
+ html_output = "<div class='papers-container'>"
60
+
61
+ if len(papers_df) == 0:
62
+ html_output += "<p>No papers found matching your criteria.</p>"
63
+ else:
64
+ for i, row in papers_df.iterrows():
65
+ html_output += f"""
66
+ <div class='paper-card'>
67
+ <div class='paper-title'>
68
+ <a href='{row['url']}' target='_blank'>{row['Title']}</a>
69
+ </div>
70
+ <div class='paper-tldr'>{row['TLDR-EN']}</div>
71
+ <div class='paper-meta'>
72
+ <span class='meta-item section'>{row['Section']}</span>
73
+ <span class='meta-item year'>{row['Year']}</span>
74
+ <span class='meta-item venue'>{row['Publish Venue']}</span>
75
+ </div>
76
+ </div>
77
+ """
78
+
79
+ html_output += "</div>"
80
+
81
+ # Add paper count
82
+ paper_count = len(papers_df)
83
+ count_text = f"<p><strong>{paper_count} papers</strong> found</p>"
84
+
85
+ return count_text + html_output
86
+
87
+ # Get unique sections and years for filtering
88
+ def get_filter_options():
89
+ papers_df = load_papers()
90
+ sections = ["All Sections"] + sorted(papers_df['Section'].unique().tolist())
91
+ years = ["All Years"] + [str(year) for year in sorted(papers_df['Year'].unique().tolist(), reverse=True) if year > 0]
92
+ return sections, years
93
+
94
+ # Custom CSS
95
+ custom_css = """
96
+ /* Main container */
97
+ body {
98
+ font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif;
99
+ }
100
+
101
+ .papers-container {
102
+ display: flex;
103
+ flex-direction: column;
104
+ gap: 18px;
105
+ margin-top: 20px;
106
+ }
107
+
108
+ /* Paper card styling */
109
+ .paper-card {
110
+ border: 1px solid #e0e0e0;
111
+ border-radius: 12px;
112
+ padding: 20px;
113
+ background-color: #ffffff;
114
+ box-shadow: 0 2px 8px rgba(0, 0, 0, 0.05);
115
+ transition: all 0.2s ease;
116
+ display: flex;
117
+ flex-direction: column;
118
+ gap: 10px;
119
+ }
120
+
121
+ .paper-card:hover {
122
+ box-shadow: 0 4px 12px rgba(0, 0, 0, 0.1);
123
+ transform: translateY(-2px);
124
+ border-color: #d0d0d0;
125
+ }
126
+
127
+ .paper-title {
128
+ font-size: 18px;
129
+ font-weight: 600;
130
+ line-height: 1.4;
131
+ margin-bottom: 4px;
132
+ }
133
+
134
+ .paper-title a {
135
+ color: #2563EB;
136
+ text-decoration: none;
137
+ }
138
+
139
+ .paper-title a:hover {
140
+ text-decoration: underline;
141
+ }
142
 
143
+ .paper-tldr {
144
+ font-size: 14px;
145
+ color: #4B5563;
146
+ line-height: 1.5;
147
+ margin: 8px 0;
148
+ }
149
+
150
+ .paper-meta {
151
+ display: flex;
152
+ flex-wrap: wrap;
153
+ gap: 8px;
154
+ margin-top: 4px;
155
+ }
156
+
157
+ .meta-item {
158
+ background-color: #F3F4F6;
159
+ border-radius: 16px;
160
+ padding: 4px 12px;
161
+ font-size: 12px;
162
+ color: #4B5563;
163
+ font-weight: 500;
164
+ }
165
+
166
+ /* Section colors */
167
+ .meta-item.section {
168
+ background-color: #DBEAFE;
169
+ color: #1E40AF;
170
+ }
171
+
172
+ .meta-item.year {
173
+ background-color: #FEE2E2;
174
+ color: #991B1B;
175
+ }
176
+
177
+ .meta-item.venue {
178
+ background-color: #E0E7FF;
179
+ color: #3730A3;
180
+ }
181
+
182
+ /* Responsive design */
183
+ @media (max-width: 768px) {
184
+ .paper-card {
185
+ padding: 16px;
186
+ }
187
+
188
+ .paper-title {
189
+ font-size: 16px;
190
+ }
191
+
192
+ .paper-tldr {
193
+ font-size: 13px;
194
+ }
195
+
196
+ .meta-item {
197
+ font-size: 11px;
198
+ padding: 3px 10px;
199
+ }
200
+ }
201
+
202
+ /* Results count styling */
203
+ p strong {
204
+ color: #2563EB;
205
+ }
206
+ """
207
+
208
+ # Create the Gradio interface
209
+ def create_interface():
210
+ sections, years = get_filter_options()
211
+
212
+ # Get paper statistics
213
+ papers_df = load_papers()
214
+ total_papers = len(papers_df)
215
+ paper_counts_by_section = papers_df['Section'].value_counts().to_dict()
216
+ paper_counts_by_year = papers_df['Year'].value_counts().to_dict()
217
+
218
+ # Filter out year 0 if it exists
219
+ min_year = min([year for year in paper_counts_by_year.keys() if year > 0]) if paper_counts_by_year else 'N/A'
220
+ max_year = max(paper_counts_by_year.keys()) if paper_counts_by_year else 'N/A'
221
+
222
+ # Project description with linked paper
223
+ project_description = f"""
224
+ # Large Language Model Agent: A Survey on Methodology, Applications and Challenges
225
+
226
+ This application showcases papers from our comprehensive survey on Large Language Model (LLM) agents. We organize papers across key categories including agent construction, collaboration mechanisms, evolution, tools, security, benchmarks, and applications.
227
+
228
+ ## About the Survey
229
+
230
+ The era of intelligent agents is upon us, driven by revolutionary advancements in large language models. Large Language Model (LLM) agents, with goal-driven behaviors and dynamic adaptation capabilities, potentially represent a critical pathway toward artificial general intelligence.
231
+
232
+ This survey systematically deconstructs LLM agent systems through a methodology-centered taxonomy, linking architectural foundations, collaboration mechanisms, and evolutionary pathways. We unify fragmented research threads by revealing fundamental connections between agent design principles and their emergent behaviors in complex environments.
233
+
234
+ [View the full paper on arXiv](https://arxiv.org/abs/2503.21460)
235
+ [Explore our GitHub repository](https://github.com/luo-junyu/Awesome-Agent-Papers)
236
+
237
+ ## Submit Your Paper
238
+
239
+ We welcome contributions to expand our collection. To submit your paper:
240
+ - Email us at luo.junyu@outlook.com with your paper details
241
+ - Create a pull request on our [GitHub repository](https://github.com/luo-junyu/Awesome-Agent-Papers)
242
+
243
+ ## Collection Overview
244
+
245
+ - **Total Papers**: {total_papers}
246
+ - **Categories**: {len(paper_counts_by_section)}
247
+ - **Year Range**: {min_year} - {max_year}
248
+ """
249
+
250
+ with gr.Blocks(css=custom_css, theme=gr.themes.Soft()) as demo:
251
+ gr.Markdown(project_description)
252
+
253
+ with gr.Row():
254
+ with gr.Column(scale=3):
255
+ search_input = gr.Textbox(
256
+ label="Search Papers",
257
+ placeholder="Enter keywords to search titles, summaries, sections, or venues",
258
+ show_label=True
259
+ )
260
+
261
+ with gr.Column(scale=1):
262
+ section_dropdown = gr.Dropdown(
263
+ choices=sections,
264
+ value="All Sections",
265
+ label="Filter by Section"
266
+ )
267
+
268
+ with gr.Row():
269
+ with gr.Column(scale=1):
270
+ year_dropdown = gr.Dropdown(
271
+ choices=years,
272
+ value="All Years",
273
+ label="Filter by Year"
274
+ )
275
+
276
+ with gr.Column(scale=1):
277
+ sort_dropdown = gr.Dropdown(
278
+ choices=[
279
+ "Year (newest first)",
280
+ "Year (oldest first)",
281
+ "Title (A-Z)",
282
+ "Section"
283
+ ],
284
+ value="Year (newest first)",
285
+ label="Sort by"
286
+ )
287
+
288
+ search_button = gr.Button("Search", variant="primary")
289
+
290
+ # Results display
291
+ results_html = gr.HTML(label="Search Results")
292
+
293
+ # Section distribution chart
294
+ section_data = [[section, count] for section, count in paper_counts_by_section.items()]
295
+ section_data.sort(key=lambda x: x[1], reverse=True)
296
+
297
+ with gr.Accordion("Paper Distribution by Section", open=False):
298
+ gr.Dataframe(
299
+ headers=["Section", "Count"],
300
+ datatype=["str", "number"],
301
+ value=section_data
302
  )
303
+
304
+ # Year distribution chart
305
+ year_data = [[str(year), count] for year, count in paper_counts_by_year.items() if year > 0]
306
+ year_data.sort(key=lambda x: int(x[0]), reverse=True)
307
+
308
+ with gr.Accordion("Paper Distribution by Year", open=False):
309
+ gr.Dataframe(
310
+ headers=["Year", "Count"],
311
+ datatype=["str", "number"],
312
+ value=year_data
313
+ )
314
+
315
+ # Add example searches
316
+ gr.Examples(
317
+ examples=[
318
+ ["agent collaboration", "All Sections", "All Years", "Year (newest first)"],
319
+ ["security", "Security", "All Years", "Year (newest first)"],
320
+ ["benchmark", "Datasets & Benchmarks", "2024", "Year (newest first)"],
321
+ ["tools", "Tools", "All Years", "Year (newest first)"],
322
+ ],
323
+ inputs=[search_input, section_dropdown, year_dropdown, sort_dropdown],
324
+ outputs=results_html,
325
+ fn=search_papers,
326
+ cache_examples=True,
327
+ )
328
+
329
+ # Set up search on button click and input changes
330
+ search_button.click(
331
+ fn=search_papers,
332
+ inputs=[search_input, section_dropdown, year_dropdown, sort_dropdown],
333
+ outputs=results_html
334
+ )
335
+
336
+ # Also search when dropdown values change
337
+ section_dropdown.change(
338
+ fn=search_papers,
339
+ inputs=[search_input, section_dropdown, year_dropdown, sort_dropdown],
340
+ outputs=results_html
341
+ )
342
+
343
+ year_dropdown.change(
344
+ fn=search_papers,
345
+ inputs=[search_input, section_dropdown, year_dropdown, sort_dropdown],
346
+ outputs=results_html
347
+ )
348
+
349
+ sort_dropdown.change(
350
+ fn=search_papers,
351
+ inputs=[search_input, section_dropdown, year_dropdown, sort_dropdown],
352
+ outputs=results_html
353
+ )
354
+
355
+ # Load initial results on page load
356
+ demo.load(
357
+ fn=lambda: search_papers("", "All Sections", "All Years", "Year (newest first)"),
358
+ inputs=None,
359
+ outputs=results_html
360
+ )
361
+
362
+ return demo
363
+
364
+ # Create and launch the interface
365
+ demo = create_interface()
366
 
367
+ if __name__ == "__main__":
368
+ demo.launch()
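For review purposes, the new search path can be smoke-tested without starting the UI. A rough sketch, assuming `app.py` and `all_papers_0328.csv` sit in the working directory; note that importing `app` builds the Gradio interface (and caches the examples) at module level, even though `launch()` only runs under `__main__`.

```python
# Rough smoke test of the refactored search path, bypassing the Gradio UI.
# Importing app runs create_interface() at import time, so gradio must be installed.
from app import load_papers, search_papers

papers = load_papers()
print(f"Loaded {len(papers)} papers")

html = search_papers("benchmark", "All Sections", "All Years", "Year (newest first)")
print(html[:300])  # an HTML fragment that begins with the "... papers found" count
```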
 
 
requirements.txt CHANGED
@@ -1,16 +1,4 @@
1
- APScheduler
2
- black
3
- datasets
4
- gradio
5
- gradio[oauth]
6
- gradio_leaderboard==0.0.13
7
- gradio_client
8
- huggingface-hub>=0.18.0
9
- matplotlib
10
- numpy
11
- pandas
12
- python-dateutil
13
- tqdm
14
- transformers
15
- tokenizers>=0.15.0
16
- sentencepiece
 
1
+ gradio>=3.50.2
2
+ pandas>=1.3.5
3
+ numpy>=1.21.6
4
+ matplotlib
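As a side note for local setups, the trimmed dependency list can be checked against the installed environment with the standard library; the package names below simply mirror the new requirements.txt.

```python
# Print installed versions of the trimmed dependency set (stdlib only).
from importlib.metadata import PackageNotFoundError, version

for pkg in ("gradio", "pandas", "numpy", "matplotlib"):
    try:
        print(f"{pkg}=={version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg} is not installed")
```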
 
src/about.py CHANGED
@@ -21,52 +21,70 @@ NUM_FEWSHOT = 0 # Change with your few shot
21
 
22
 
23
  # Your leaderboard name
24
- TITLE = """<h1 align="center" id="space-title">Demo leaderboard</h1>"""
25
 
26
  # What does your leaderboard evaluate?
27
  INTRODUCTION_TEXT = """
28
- Intro text
29
  """
30
 
31
  # Which evaluations are you running? how can people reproduce what you have?
32
  LLM_BENCHMARKS_TEXT = f"""
33
- ## How it works
34
 
35
- ## Reproducibility
36
- To reproduce our results, here is the commands you can run:
 
 
37
 
38
- """
 
 
39
 
40
- EVALUATION_QUEUE_TEXT = """
41
- ## Some good practices before submitting a model
42
 
43
- ### 1) Make sure you can load your model and tokenizer using AutoClasses:
44
- ```python
45
- from transformers import AutoConfig, AutoModel, AutoTokenizer
46
- config = AutoConfig.from_pretrained("your model name", revision=revision)
47
- model = AutoModel.from_pretrained("your model name", revision=revision)
48
- tokenizer = AutoTokenizer.from_pretrained("your model name", revision=revision)
49
- ```
50
- If this step fails, follow the error messages to debug your model before submitting it. It's likely your model has been improperly uploaded.
51
 
52
- Note: make sure your model is public!
53
- Note: if your model needs `use_remote_code=True`, we do not support this option yet but we are working on adding it, stay posted!
54
 
55
- ### 2) Convert your model weights to [safetensors](https://huggingface.co/docs/safetensors/index)
56
- It's a new format for storing weights which is safer and faster to load and use. It will also allow us to add the number of parameters of your model to the `Extended Viewer`!
57
 
58
- ### 3) Make sure your model has an open license!
59
- This is a leaderboard for Open LLMs, and we'd love for as many people as possible to know they can use your model 🤗
60
 
61
- ### 4) Fill up your model card
62
- When we add extra information about models to the leaderboard, it will be automatically taken from the model card
 
 
63
 
64
- ## In case of model failure
65
- If your model is displayed in the `FAILED` category, its execution stopped.
66
- Make sure you have followed the above steps first.
67
- If everything is done, check you can launch the EleutherAIHarness on your model locally, using the above command without modifications (you can add `--limit` to limit the number of examples per task).
68
  """
69
 
70
- CITATION_BUTTON_LABEL = "Copy the following snippet to cite these results"
71
  CITATION_BUTTON_TEXT = r"""
72
  """
 
21
 
22
 
23
  # Your leaderboard name
24
+ TITLE = """<h1 align="center" id="space-title">LLM Agent Papers</h1>"""
25
 
26
  # What does your leaderboard evaluate?
27
  INTRODUCTION_TEXT = """
28
+ # Large Language Model Agent: A Survey on Methodology, Applications and Challenges
29
+
30
+ The era of intelligent agents is upon us, driven by revolutionary advancements in large language models.
31
+ Large Language Model (LLM) agents, with goal-driven behaviors and dynamic adaptation capabilities, potentially
32
+ represent a critical pathway toward artificial general intelligence.
33
+
34
+ This application showcases papers from our comprehensive survey on Large Language Model (LLM) agents.
35
+ We organize papers across key categories including agent construction, collaboration mechanisms, evolution,
36
+ tools, security, benchmarks, and applications.
37
  """
38
 
39
  # Which evaluations are you running? how can people reproduce what you have?
40
  LLM_BENCHMARKS_TEXT = f"""
41
+ ## Survey Overview
42
 
43
+ This survey systematically deconstructs LLM agent systems through a methodology-centered taxonomy,
44
+ linking architectural foundations, collaboration mechanisms, and evolutionary pathways.
45
+ We unify fragmented research threads by revealing fundamental connections between agent design
46
+ principles and their emergent behaviors in complex environments.
47
 
48
+ Our work provides a unified architectural perspective, examining how agents are constructed,
49
+ how they collaborate, and how they evolve over time, while also addressing evaluation methodologies,
50
+ tool applications, practical challenges, and diverse application domains.
51
 
52
+ ### Paper Categories
 
53
 
54
+ Our collection organizes papers into several key categories:
55
 
56
+ - **Introduction**: Survey papers and foundational works introducing LLM agents
57
+ - **Construction**: Papers on building and designing agents
58
+ - **Collaboration**: Multi-agent systems and communication methods
59
+ - **Evolution**: Learning and improvement of agents over time
60
+ - **Tools**: Integration of external tools with LLM agents
61
+ - **Security**: Safety, alignment, and ethical considerations
62
+ - **Datasets & Benchmarks**: Evaluation frameworks and resources
63
+ - **Applications**: Domain-specific uses in science, medicine, etc.
64
 
65
+ View the full paper on [arXiv](https://arxiv.org/abs/2503.21460) and explore our GitHub repository at
66
+ [https://github.com/luo-junyu/Awesome-Agent-Papers](https://github.com/luo-junyu/Awesome-Agent-Papers)
67
+ """
68
+
69
+ EVALUATION_QUEUE_TEXT = """
70
+ ## How to Contribute
71
 
72
+ If you have a paper that you believe should be included in our collection:
 
73
 
74
+ 1. Check if the paper is already in our database
75
+ 2. Submit your paper at [https://forms.office.com/r/sW0Zzymi5b](https://forms.office.com/r/sW0Zzymi5b) or email us at luo.junyu@outlook.com
76
+ 3. Include the paper's title, authors, abstract, URL, publication venue, and year
77
+ 4. Suggest a section/category for the paper
78
 
79
+ We regularly update the repository and this application with new submissions.
 
 
 
80
  """
81
 
82
+ CITATION_BUTTON_LABEL = "Cite our survey paper"
83
  CITATION_BUTTON_TEXT = r"""
84
+ @article{agentsurvey2025,
85
+ title={Large Language Model Agent: A Survey on Methodology, Applications and Challenges},
86
+ author={Junyu Luo and Weizhi Zhang and Ye Yuan and Yusheng Zhao and Junwei Yang and Yiyang Gu and Bohan Wu and Binqi Chen and Ziyue Qiao and Qingqing Long and Rongcheng Tu and Xiao Luo and Wei Ju and Zhiping Xiao and Yifan Wang and Meng Xiao and Chenwu Liu and Jingyang Yuan and Shichang Zhang and Yiqiao Jin and Fan Zhang and Xian Wu and Hanqing Zhao and Dacheng Tao and Philip S. Yu and Ming Zhang},
87
+ journal={arXiv preprint arXiv:2503.21460},
88
+ year={2025}
89
+ }
90
  """
style.css ADDED
@@ -0,0 +1,187 @@
1
+ /* Main container */
2
+ body {
3
+ font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif;
4
+ }
5
+
6
+ .papers-container {
7
+ display: flex;
8
+ flex-direction: column;
9
+ gap: 18px;
10
+ margin-top: 20px;
11
+ }
12
+
13
+ /* Paper card styling */
14
+ .paper-card {
15
+ border: 1px solid #e0e0e0;
16
+ border-radius: 12px;
17
+ padding: 20px;
18
+ background-color: #ffffff;
19
+ box-shadow: 0 2px 8px rgba(0, 0, 0, 0.05);
20
+ transition: all 0.2s ease;
21
+ display: flex;
22
+ flex-direction: column;
23
+ gap: 10px;
24
+ }
25
+
26
+ .paper-card:hover {
27
+ box-shadow: 0 4px 12px rgba(0, 0, 0, 0.1);
28
+ transform: translateY(-2px);
29
+ border-color: #d0d0d0;
30
+ }
31
+
32
+ .paper-title {
33
+ font-size: 18px;
34
+ font-weight: 600;
35
+ line-height: 1.4;
36
+ margin-bottom: 4px;
37
+ }
38
+
39
+ .paper-title a {
40
+ color: #2563EB;
41
+ text-decoration: none;
42
+ }
43
+
44
+ .paper-title a:hover {
45
+ text-decoration: underline;
46
+ }
47
+
48
+ .paper-tldr {
49
+ font-size: 14px;
50
+ color: #4B5563;
51
+ line-height: 1.5;
52
+ margin: 8px 0;
53
+ }
54
+
55
+ .paper-meta {
56
+ display: flex;
57
+ flex-wrap: wrap;
58
+ gap: 8px;
59
+ margin-top: 4px;
60
+ }
61
+
62
+ .meta-item {
63
+ background-color: #F3F4F6;
64
+ border-radius: 16px;
65
+ padding: 4px 12px;
66
+ font-size: 12px;
67
+ color: #4B5563;
68
+ font-weight: 500;
69
+ }
70
+
71
+ /* Section colors */
72
+ .meta-item:nth-child(1) {
73
+ background-color: #DBEAFE;
74
+ color: #1E40AF;
75
+ }
76
+
77
+ .meta-item:nth-child(2) {
78
+ background-color: #FEE2E2;
79
+ color: #991B1B;
80
+ }
81
+
82
+ .meta-item:nth-child(3) {
83
+ background-color: #E0E7FF;
84
+ color: #3730A3;
85
+ }
86
+
87
+ /* Search interface */
88
+ .search-container {
89
+ margin-bottom: 24px;
90
+ padding: 16px;
91
+ background-color: #F9FAFB;
92
+ border-radius: 12px;
93
+ border: 1px solid #E5E7EB;
94
+ }
95
+
96
+ /* Button styling */
97
+ .primary-button {
98
+ background-color: #2563EB;
99
+ color: white;
100
+ border: none;
101
+ border-radius: 8px;
102
+ padding: 8px 16px;
103
+ font-weight: 500;
104
+ cursor: pointer;
105
+ transition: background-color 0.2s;
106
+ }
107
+
108
+ .primary-button:hover {
109
+ background-color: #1D4ED8;
110
+ }
111
+
112
+ /* Section headers */
113
+ .section-header {
114
+ border-bottom: 2px solid #E5E7EB;
115
+ padding-bottom: 8px;
116
+ margin: 32px 0 16px 0;
117
+ font-weight: 600;
118
+ color: #1F2937;
119
+ }
120
+
121
+ /* Responsive design */
122
+ @media (max-width: 768px) {
123
+ .paper-card {
124
+ padding: 16px;
125
+ }
126
+
127
+ .paper-title {
128
+ font-size: 16px;
129
+ }
130
+
131
+ .paper-tldr {
132
+ font-size: 13px;
133
+ }
134
+
135
+ .meta-item {
136
+ font-size: 11px;
137
+ padding: 3px 10px;
138
+ }
139
+ }
140
+
141
+ /* Gradio container customization */
142
+ .gradio-container {
143
+ max-width: 1200px !important;
144
+ margin: 0 auto !important;
145
+ }
146
+
147
+ /* Results count styling */
148
+ p strong {
149
+ color: #2563EB;
150
+ }
151
+
152
+ /* Accordion styling */
153
+ .accordion .label {
154
+ font-weight: 600;
155
+ color: #1F2937;
156
+ }
157
+
158
+ /* Table styling */
159
+ table {
160
+ width: 100%;
161
+ border-collapse: collapse;
162
+ }
163
+
164
+ th {
165
+ background-color: #F3F4F6;
166
+ text-align: left;
167
+ padding: 12px;
168
+ font-weight: 600;
169
+ }
170
+
171
+ td {
172
+ padding: 12px;
173
+ border-bottom: 1px solid #E5E7EB;
174
+ }
175
+
176
+ /* Examples styling */
177
+ .examples-panel {
178
+ margin-top: 24px;
179
+ padding: 16px;
180
+ background-color: #F9FAFB;
181
+ border-radius: 12px;
182
+ }
183
+
184
+ .examples-header {
185
+ font-weight: 600;
186
+ margin-bottom: 12px;
187
+ }