import datasets
import streamlit as st
from st_aggrid import AgGrid, GridOptionsBuilder


st.markdown("""
# CryptoCEN top 50 co-expressed partners

**CryptoCEN** is a co-expression network for *Cryptococcus neoformans* built on 1,524 RNA-seq runs across 34 studies.
Two genes are said to be co-expressed when their expression is correlated across different conditions;
co-expression often indicates that the genes are involved in similar processes.

To Cite:
MJ O'Meara, JR Rapala, CB Nichols, C Alexandre, B Billmyre, JL Steenwyk, A Alspaugh,
TR O'Meara. CryptoCEN: A Co-Expression Network for Cryptococcus neoformans reveals
novel proteins involved in DNA damage repair.
* Code available at https://github.com/maomlab/CalCEN/tree/master/vignettes/CryptoCEN
* Full network and dataset: https://huggingface.co/datasets/maomlab/CryptoCEN

## Look up top co-expressed partners
Enter the ``CNAG_#####`` gene ID of a gene and expand the table to see its top 50 co-expressed genes.
``coexp_score`` ranges over ``[0, 1]``, where ``1`` is the strongest co-expression; scores greater than ``0.85`` can be considered significant.
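
The cutoff can also be applied offline to a table saved with the download button. The following is a minimal sketch, assuming the saved TSV keeps the ``gene_id_2`` and ``coexp_score`` columns shown in the table; the function name ``significant_partners`` is hypothetical and not part of CryptoCEN.

```python
import csv


def significant_partners(lines, threshold=0.85):
    # 'excel-tab' is the csv module's built-in dialect for tab-separated text.
    reader = csv.DictReader(lines, dialect='excel-tab')
    hits = []
    for row in reader:
        score = float(row['coexp_score'])
        # Keep partners strictly above the cutoff (0.85 by default).
        if score > threshold:
            hits.append((row['gene_id_2'], score))
    # Strongest partners first.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Usage, assuming the file was saved from the download button:
# with open('top_coexp_hits_CNAG_04365.tsv', newline='') as handle:
#     print(significant_partners(handle))
```

The function accepts any iterable of text lines, so an open file handle works directly.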
""")

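# Load the precomputed top-hits table from the Hugging Face Hub
# and convert it to a pandas DataFrame for filtering.
top_coexp_hits = datasets.load_dataset(
    path = "maomlab/CryptoCEN",
    data_files = {"top_coexp_hits": "top_coexp_hits.tsv"})
top_coexp_hits = top_coexp_hits["top_coexp_hits"].to_pandas()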

col1, col2 = st.columns(spec = [0.7, 0.3])
with col1:
    gene_id = st.text_input(
        label = "Gene ID",
        value = "CNAG_04365",
        max_chars = 10,
        help = "CNAG Gene ID e.g. CNAG_04365")

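# Keep only the rows for the query gene.
top_coexp_hits = top_coexp_hits[
    top_coexp_hits.gene_id_1 == gene_id]
# Select and order the columns shown in the table and the download.
top_coexp_hits = top_coexp_hits[[
    'gene_id_1', 'gene_symbol_1', 'description_1',
    'gene_id_2', 'gene_symbol_2', 'description_2',
    'coexp_score', 'blastp_EValue']]
# reset_index returns a new DataFrame, so the result must be assigned;
# drop=True discards the row numbers of the unfiltered table.
top_coexp_hits = top_coexp_hits.reset_index(drop=True)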

with col2:
    st.download_button(
        label = "Download data as TSV",
        data = top_coexp_hits.to_csv(sep = '\t').encode('utf-8'),
        file_name = f"top_coexp_hits_{gene_id}.tsv",
        # The payload is tab-separated, so use the TSV media type.
        mime = "text/tab-separated-values")


grid_option_builder = GridOptionsBuilder()
grid_option_builder.configure_default_column(
    filterable=False,
    groupable=False,
    editable=False,
    wrapText=True,
    autoHeight=True)
grid_option_builder.configure_column("gene_id_1", header_name="GeneID 1", pinned="left", width=250)
grid_option_builder.configure_column("gene_symbol_1", header_name="Gene 1", pinned="left", width=250)
grid_option_builder.configure_column("description_1", header_name="Description 1", width=500)
grid_option_builder.configure_column("gene_id_2", header_name="GeneID 2", pinned="left", width=250)
grid_option_builder.configure_column("gene_symbol_2", header_name="Gene 2", pinned="left", width=250)
grid_option_builder.configure_column("description_2", header_name="Description 2", width=500)
grid_option_builder.configure_column(
    "coexp_score",
    header_name="Coexp Score",
    type=["numericColumn", "customNumericFormat"],
    precision=3,
    width=200)
grid_option_builder.configure_column(
    "blastp_EValue",  # must match the DataFrame column selected above
    header_name="Blast E-value",
    type=["numericColumn", "customNumericFormat"],
    precision=3,
    width=200)
grid_option_builder.configure_selection(selection_mode=False, use_checkbox=False)

AgGrid(
    data = top_coexp_hits,
    gridOptions = grid_option_builder.build(),
    fit_columns_on_grid_load=True,
    theme="streamlit")