Spaces: seanpedrickcase/document_redaction
Branch: main
Path: document_redaction/tools
3 contributors
History: 137 commits
Latest commit: 52e26c1 by seanpedrickcase, 28 days ago: Updated save options for ocr_outputs_with_words
| File | Scan | Size | Last commit message | Last updated |
| --- | --- | --- | --- | --- |
| __init__.py | Safe | 0 Bytes | Initial commit | over 1 year ago |
| auth.py | Safe | 2.46 kB | Added compatibility with gradio_image_annotation for passing through id and text properties to annotator. Corrected csv location for Textract api calls. Other minor changes | 3 months ago |
| aws_functions.py | Safe | 9.51 kB | Updated duplicate pages functionality. Improve redaction efficiency a little with concat method. Minor modification to documentation and interface | about 2 months ago |
| aws_textract.py | Safe | 27.3 kB | Now local OCR outputs can be saved to file and reloaded to save preparation time. Bug fixing in logs and tabular data redaction. Update to documentation | 3 months ago |
| cli_redact.py | Safe | 4.74 kB | More config options. Fixed some bugs with removing elements from review page and Adobe export. Some UI rearrangements | 4 months ago |
| config.py | Safe | 19.8 kB | Further updates to line level duplicate identification | 28 days ago |
| custom_csvlogger.py | Safe | 12.9 kB | Updated packages. Corrected CSV logger headings, can now submit custom log csv names to S3. Started work on identifying and deduplicating at the line level | 29 days ago |
| custom_image_analyser_engine.py | Safe | 53.9 kB | Now local OCR outputs can be saved to file and reloaded to save preparation time. Bug fixing in logs and tabular data redaction. Update to documentation | 3 months ago |
| data_anonymise.py | Safe | 36.2 kB | Added possibility of changing model and entity types in config file | 2 months ago |
| file_conversion.py | Safe | 104 kB | Updated save options for ocr_outputs_with_words | 28 days ago |
| file_redaction.py | Safe | 132 kB | Updated save options for ocr_outputs_with_words | 28 days ago |
| find_duplicate_pages.py | Safe | 40.8 kB | Further updates to line level duplicate identification | 28 days ago |
| helper_functions.py | Safe | 27.6 kB | Updated save options for ocr_outputs_with_words | 28 days ago |
| load_spacy_model_custom_recognisers.py | Safe | 13.7 kB | Major update. General code revision. Improved config variables. Dataframe based review frame now includes text, items can be searched and excluded. Costs now estimated. Option for adding cost codes added. Option to extract text only. | 4 months ago |
| presidio_analyzer_custom.py | Safe | 4.92 kB | More config options. Fixed some bugs with removing elements from review page and Adobe export. Some UI rearrangements | 4 months ago |
| redaction_review.py | Safe | 79.9 kB | Further updates to line level duplicate identification | 28 days ago |
| textract_batch_call.py | Safe | 28 kB | Expanded checks for out of range page cropboxes | 2 months ago |