项目作者: Nelson-Gon

项目描述 :
Check and Fix Outdated URLs
高级语言: Python
项目地址: git://github.com/Nelson-Gon/urlfix.git
创建时间: 2021-02-07T07:58:46Z
项目社区:https://github.com/Nelson-Gon/urlfix

开源协议:MIT License

下载


urlfix: Check and Fix Outdated URLs

PyPI version fury.io
DOI
Project Status
Codecov
Test-Package
PyPI license
Documentation Status
Total Downloads
Monthly Downloads
Weekly Downloads
Maintenance
GitHub last commit
GitHub issues
GitHub issues-closed

urlfix aims to find all outdated URLs in a given file and fix them.

Features List

  • [x] Commandline and programmer-friendly modes.

  • [x] Replace outdated URLs/links in a single file

  • [x] Replace outdated URLs/links in a directory

  • [x] Replace outdated URLs/links in the same file or in the same files in a directory i.e. inplace.

  • [x] Replace outdated links in files in nested directories

  • Replace outdated links in files in sub-nested directories

Supported file formats

urlfix fixes URLs given a file of the following types:

  • MarkDown (.md)
  • [x] Plain Text files (.txt)

  • [x] RMarkdown (.rmd)

  • [x] ReStructured Text (.rst)

  • [ ] PDF (.pdf)

  • [ ] Word (.docx)

  • [ ] ODF (.odf)

Installation

The simplest way to install the latest release is as follows:

  1. pip install urlfix

To install the development version:

Open the Terminal/CMD/Git bash/shell and enter

  1. pip install git+https://github.com/Nelson-Gon/urlfix.git
  2. # or for the less stable dev version
  3. pip install git+https://github.com/Nelson-Gon/urlfix.git@dev

Otherwise:

  1. # clone the repo
  2. git clone git@github.com:Nelson-Gon/urlfix.git
  3. cd urlfix
  4. pip install -e .

Sample usage

Script Mode

To use at the commandline, please use:

  1. python -m urlfix --mode "f" --verbose 1 --inplace 1 --inpath myfile.md

If not replacing within the same file, then:

  1. python -m urlfix --mode "f" --verbose 1 --inplace 0 --inpath myfile.md --output-file myoutputfile.md

To get help:

  1. python -m urlfix -h
  2. #usage: main.py [-h] -m MODE -in INPUT_FILE [-o OUTPUT_FILE] -v {False,false,0,True,true,1} -i {False,false,0,True,true,1}
  3. #
  4. #optional arguments:
  5. # -h, --help show this help message and exit
  6. # -m MODE, --mode MODE Mode to use. One of f for file or d for directory
  7. # -in INPUT_FILE, --input-file INPUT_FILE
  8. # Input file for which link updates are required.
  9. # -o OUTPUT_FILE, --output-file OUTPUT_FILE
  10. # Output file to write to. Optional, only necessary if not replacing inplace
  11. # -v {False,false,0,True,true,1}, --verbose {False,false,0,True,true,1}
  12. # String to control verbosity. Defaults to True.
  13. # -i {False,false,0,True,true,1}, --inplace {False,false,0,True,true,1}
  14. # Should links be replaced inplace? This should be safe but to be sure, test with an output file first.

Programmer-Friendly Mode

  1. from urlfix.urlfix import URLFix
  2. from urlfix.dirurlfix import DirURLFix

Create an object of class URLFix

  1. urlfix_object = URLFix("testfiles/testurls.txt", output_file="replacement.txt")

Replacing URLs

After creating our object, we can replace outdated URLs as follows:

  1. urlfix_object.replace_urls(verbose=1)

The above uses default arguments and will not replace a file inplace. This is a safety mechanism to ensure one does not
damage their files.

Since we set verbose to True, we get the following output:

  1. urlfix_object.replace_urls()

To replace silently, simply set verbose to False (which is the default).

  1. urlfix_object.replace_urls()

If there are URLs known to be valid, pass these to the correct_urls argument to save some time.

  1. urlfix_object.replace_urls(correct_urls=[urls_here]) # Use a Sequence eg tuple, list, etc

Replacing several files in a directory

To replace several files in a directory, we can use DirURLFix as follows.

  • Instantiate an object of class DirURLFix
  1. replace_in_dir = DirURLFix("path_to_dir")
  • Call replace_urls
  1. replace_in_dir.replace_urls()

Recursively replacing links in nested directories

To replace outdated links in several files located in several directories, we set recursive to True.
Currently, replacing links in directories nested within nested directories is not (yet) supported.

  1. recursive_object = DirURLFix("path_to_root_directory", recursive=True)

We can then proceed as above

  1. recursive_object.replace_urls() # provide other arguments as you may wish.

To report any issues, suggestions or improvement, please do so at issues.

If you would like to cite this work, please use:

Nelson Gonzabato (2021) urlfix: Check and Fix Outdated URLs https://github.com/Nelson-Gon/urlfix

Thank you very much.

“Before software can be reusable it first has to be usable.” – Ralph Johnson