Extract URLs from a File (TubeBuddy Backup) Using the Command Line

https://www.tubebuddy.com/rickmakes (Affiliate Link)

Installing Windows Subsystem for Linux: https://youtu.be/KpBVUmMvue0

TubeBuddy Playlist: https://www.youtube.com/playlist?list=PLErU2HjQZ_ZNdQHnpuDW0memweWCc2eS4

View Backup File
less tubebuddy_backup.csv
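
If the backup is large, it can also help to glance at just the first few rows before going further (an optional quick check, not one of the original steps):
head -n 5 tubebuddy_backup.csv
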
Extract URLs
grep -Eo "(http|https)://[a-zA-Z0-9./?=_-]*" tubebuddy_backup.csv

Source of regular expression: https://unix.stackexchange.com/questions/181254/how-to-use-grep-and-cut-in-script-to-obtain-website-urls-from-an-html-file
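
The alternation (http|https) can also be written as https?, which matches the same thing. Note that the character class only allows letters, digits and ./?=_-, so URLs containing characters such as & or % will be cut short at that point. A slightly shorter, equivalent variant:
grep -Eo "https?://[a-zA-Z0-9./?=_-]*" tubebuddy_backup.csv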

Sort URLs
grep -Eo "(http|https)://[a-zA-Z0-9./?=_-]*" tubebuddy_backup.csv | sort
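
If the sorted list runs past one screen, piping it into less makes it easier to scroll (optional):
grep -Eo "(http|https)://[a-zA-Z0-9./?=_-]*" tubebuddy_backup.csv | sort | less
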
Get Line Count
grep -Eo "(http|https)://[a-zA-Z0-9./?=_-]*" tubebuddy_backup.csv | sort | wc -l
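
That figure counts every match, duplicates included; once duplicates are collapsed with uniq (covered next), the same wc -l gives the number of distinct URLs:
grep -Eo "(http|https)://[a-zA-Z0-9./?=_-]*" tubebuddy_backup.csv | sort | uniq | wc -l
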
Find Unique URLs
grep -Eo "(http|https)://[a-zA-Z0-9./?=_-]*" tubebuddy_backup.csv | sort | uniq
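
As a shorthand, sort -u collapses duplicates in one step and can stand in for sort | uniq:
grep -Eo "(http|https)://[a-zA-Z0-9./?=_-]*" tubebuddy_backup.csv | sort -u
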
Find Unique URLs with Count per URL
grep -Eo "(http|https)://[a-zA-Z0-9./?=_-]*" tubebuddy_backup.csv | sort | uniq -c
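
If you only want the URLs that appear more than once, the leading count from uniq -c can be filtered with awk (a small optional sketch; $1 > 1 keeps lines whose count is at least 2):
grep -Eo "(http|https)://[a-zA-Z0-9./?=_-]*" tubebuddy_backup.csv | sort | uniq -c | awk '$1 > 1'
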
Find Unique URLs with Count per URL (Sorted)
grep -Eo "(http|https)://[a-zA-Z0-9./?=_-]*" tubebuddy_backup.csv | sort | uniq -c | sort
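
The final sort above orders the lines as text; for a strictly numeric order with the most frequent URLs first, sort -nr should do the trick:
grep -Eo "(http|https)://[a-zA-Z0-9./?=_-]*" tubebuddy_backup.csv | sort | uniq -c | sort -nr
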
Filter Out https
grep -Eo "(http|https)://[a-zA-Z0-9./?=_-]*" tubebuddy_backup.csv | sort | uniq | grep -vi 'https'
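
The opposite filter, keeping only the https URLs, just drops the -v (anchoring the pattern with ^ avoids matching "https" in the middle of a URL):
grep -Eo "(http|https)://[a-zA-Z0-9./?=_-]*" tubebuddy_backup.csv | sort | uniq | grep -i '^https'
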
Save Results to File
grep -Eo "(http|https)://[a-zA-Z0-9./?=_-]*" tubebuddy_backup.csv | sort | uniq > file.txt
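
To watch the results scroll by and save them at the same time, tee can replace the redirect (optional):
grep -Eo "(http|https)://[a-zA-Z0-9./?=_-]*" tubebuddy_backup.csv | sort | uniq | tee file.txt
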
Use awk to Make Links for Web Page
grep -Eo "(http|https)://[a-zA-Z0-9./?=_-]*" tubebuddy_backup.csv | sort | uniq | awk 'BEGIN{print "<html>"}{printf "<a href=\"%s\" target=\"_blank\">%s</a>", $0, $0}{gsub("http:","https:",$0)}{printf " | <a href=\"%s\" target=\"_blank\">https</a><br>\n", $0}' > file.html
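
The same awk program, split across lines with comments, and with an END block added so the generated page is closed with </html> (a small optional refinement, not part of the original one-liner):
grep -Eo "(http|https)://[a-zA-Z0-9./?=_-]*" tubebuddy_backup.csv | sort | uniq | awk '
BEGIN { print "<html>" }                                             # open the page
{
  printf "<a href=\"%s\" target=\"_blank\">%s</a>", $0, $0           # link using the URL as found
  gsub("http:", "https:", $0)                                        # rewrite the scheme to https
  printf " | <a href=\"%s\" target=\"_blank\">https</a><br>\n", $0   # second link forced to https
}
END { print "</html>" }                                              # close the page (added here)
' > file.html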
