Very Silly Method to Strip Off HTML to Get a URL
I know this is a very silly method of stripping the HTML tags off the HTML source code in order to retrieve the JPG URL, but it works, so I am recording it here for my own use.
1. Prepare a file with all the HTML code and save it as txt.txt.
<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_YF1tNfQVN8w/TN66Xjc_9VI/AAAAAAABjUw/y_ImZpICYEE/s1600/53.jpg"><img style="float: left; margin: 0pt 10px 10px 0pt; cursor: pointer; width: 266px; height: 400px;" src="http://2.bp.blogspot.com/_YF1tNfQVN8w/TN66Xjc_9VI/AAAAAAABjUw/y_ImZpICYEE/s400/53.jpg" alt="" id="BLOGGER_PHOTO_ID_5539069505528919378" border="0"></a>
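2. awk 'BEGIN { RS="href=\"" } { print $1 }' txt.txt > txt2.txt
(POSIX awk does not guarantee a multi-character RS, so run this with gawk or another awk that supports it.)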
http://1.bp.blogspot.com/_YF1tNfQVN8w/TN64I5jNF3I/AAAAAAABjOQ/Ykc2T_qJ3k4/s1600/1.jpg"><img style="float: left; margin: 0pt 10px 10px 0pt; cursor: pointer; width: 400px; height: 266px;" src="http://1.bp.blogspot.com/_YF1tNfQVN8w/TN64I5jNF3I/AAAAAAABjOQ/Ykc2T_qJ3k4/s400/1.jpg" alt="" id="BLOGGER_PHOTO_ID_5539067054739232626" border="0">
3. awk 'BEGIN { FS="\"" } { print $1 }' txt2.txt > txt3.txt
(FS is set in BEGIN so that it already applies to the first line.)
http://1.bp.blogspot.com/_YF1tNfQVN8w/TN64I5jNF3I/AAAAAAABjOQ/Ykc2T_qJ3k4/s1600/1.jpg
4. Then use this script to generate the picture links that I want.
#!/bin/bash
# Read one URL per line from txt3.txt and append a lightbox link for each to txt4.txt.
while IFS= read -r inputline
do
    # Skip blank lines so no empty links are written.
    [ -z "$inputline" ] && continue
    printf '<p><a title="title" rel="lightbox" href="%s">' "$inputline"
    printf '<img src="%s" alt="alt text" title="title" width="600" /></a></p>\n' "$inputline"
    # echo "$inputline"    # uncomment to debug
done < txt3.txt >> txt4.txt
exit 0
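Steps 2 and 3 could probably be folded into a single awk pass. The following is only a minimal sketch: extract_hrefs is a made-up helper name, the sample line uses example.com rather than the real blog URLs, and it assumes every link is written as href="..." with double quotes. It uses only match(), RSTART, RLENGTH and substr(), so it should not depend on a multi-character RS.

```shell
#!/bin/bash
# extract_hrefs: hypothetical helper, not part of the original post.
# Reads HTML on standard input and prints every href="..." value, one per line.
extract_hrefs() {
    awk '{
        line = $0
        # Keep looking for the next href="..." attribute on this line.
        while (match(line, /href="[^"]*"/)) {
            # Skip the 6 characters of href=" and drop the closing quote.
            print substr(line, RSTART + 6, RLENGTH - 7)
            # Continue searching after the attribute just handled.
            line = substr(line, RSTART + RLENGTH)
        }
    }'
}

# Example with an inline sample instead of txt.txt:
printf '%s\n' '<a href="http://example.com/s1600/1.jpg"><img src="http://example.com/s400/1.jpg"></a>' | extract_hrefs
# prints http://example.com/s1600/1.jpg
```

With the real file this would be run as extract_hrefs < txt.txt > txt3.txt, which skips txt2.txt entirely. The src URLs are ignored on purpose, because only the full-size href link is wanted.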
That's it. I know it can be shorter, but my skill level only goes this far. Experts are welcome to teach me a better way of doing it. Thank you.
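One possible shorter version, offered as a sketch rather than the definitive answer: a single pipeline that goes from the HTML file straight to the lightbox links. html_to_lightbox is a hypothetical name, and the sketch assumes a grep that supports the -o option (GNU and BSD grep do) and links written as href="..." with double quotes.

```shell
#!/bin/bash
# html_to_lightbox FILE: hypothetical helper, not part of the original post.
# Prints one lightbox link per href="..." found in FILE.
html_to_lightbox() {
    # grep -o prints each href="..." attribute on its own line.
    # The first two sed expressions strip the href=" prefix and the closing quote.
    # The third wraps the URL in the markup; & stands for the whole matched line.
    grep -o 'href="[^"]*"' "$1" |
        sed -e 's/^href="//' -e 's/"$//' \
            -e 's|.*|<p><a title="title" rel="lightbox" href="&"><img src="&" alt="alt text" title="title" width="600" /></a></p>|'
}
```

It would be used as html_to_lightbox txt.txt > txt4.txt, with no intermediate files. The sed delimiter is | instead of / so that the slashes in the closing tags do not need escaping.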