Just to be clear, I'm very much pro-Python, but in some situations bash turns out to be simpler. Slower, sure, but hey, we're not working for NASA 🙂
The same algorithm could be used in Python, which would already gain a fair amount of speed, but the imports, urllib & co. are heavier than a wget and a simple bash loop (IMHO).
Otherwise, to grab the images of the series with xpath:
xpath -e 'feed/entry[1]/content' ./sacbee.xml | sed 's/&lt;/</g' | xpath -e 'content/div/img/@src' | xargs -I% echo ____%
But here, given the number of queries that would have to be sent to xpath, and the number of times it would have to load the file, it's not worth it.
So, a simple bash script:
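#!/bin/bash

# Print "<label>: <text>" for a line of the form <tag>text</tag>.
Extract() {
    local label=$1 var=$2
    var=${var#*>}   # drop everything up to the end of the opening tag
    var=${var%<*}   # drop the closing tag
    echo "$label: $var"
}

# Print the value of the first src="..." attribute on the line.
Extract_img() {
    local img=$1
    img=${img#*src=\"}
    img=${img%%\"*}
    echo "img: $img"
}

# Stays false until the first <entry>, so anything before it is skipped.
flag_entry=false
while read -r line
do
    if $flag_entry; then
        [[ $line =~ '<img ' ]] && Extract_img "$line" && continue
        [[ $line =~ '<title>' ]] && echo && Extract 'title' "$line" && continue
        [[ $line =~ '<summary>' ]] && Extract 'summary' "$line" && continue
    fi
    [[ $line =~ '<entry>' ]] && flag_entry=true
done < <(wget 2>/dev/null http://www.sacbee.com/static/weblogs/photos/atom.xml -O-)
exit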
Output:
title: Millions mourn genius Steve Jobs
summary: BANGKOK (AP) -- From the titans of high technology to teenagers armed with iPads, millions of people around the world mourned digital-gadget genius Steve Jobs as a man whose wizardry transformed their lives in big ways and small. Google,...
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/steve_jobs_react_sm/jobs_reaction_01.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/steve_jobs_react_sm/jobs_reaction_02.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/steve_jobs_react_sm/jobs_reaction_03.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/steve_jobs_react_sm/jobs_reaction_04.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/steve_jobs_react_sm/jobs_reaction_05.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/steve_jobs_react_sm/jobs_reaction_06.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/steve_jobs_react_sm/jobs_reaction_07.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/steve_jobs_react_sm/jobs_reaction_08.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/steve_jobs_react_sm/jobs_reaction_09.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/steve_jobs_react_sm/jobs_reaction_10.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/steve_jobs_react_sm/jobs_reaction_11.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/steve_jobs_react_sm/jobs_reaction_12.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/steve_jobs_react_sm/jobs_reaction_13.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/steve_jobs_react_sm/jobs_reaction_14.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/steve_jobs_react_sm/jobs_reaction_15.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/steve_jobs_react_sm/jobs_reaction_16.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/steve_jobs_react_sm/jobs_reaction_17.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/steve_jobs_react_sm/jobs_reaction_18.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/steve_jobs_react_sm/jobs_reaction_19.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/steve_jobs_react_sm/jobs_reaction_20.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/steve_jobs_react_sm/jobs_reaction_21.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/steve_jobs_react_sm/jobs_reaction_22.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/steve_jobs_react_sm/jobs_reaction_23.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/steve_jobs_react_sm/jobs_reaction_24.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/steve_jobs_react_sm/jobs_reaction_25.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/steve_jobs_react_sm/jobs_reaction_26.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/steve_jobs_react_sm/jobs_reaction_27.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/steve_jobs_react_sm/jobs_reaction_28.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/steve_jobs_react_sm/jobs_reaction_29.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/steve_jobs_react_sm/jobs_reaction_30.jpg
title: Hindus honor the Mother Goddess
summary: Hindus are in the midst of fall festivals during September and October. According to About.com Guide, Subhamoy Das, "Every year during the lunar month of Ashwin or Kartik (September-October), Hindus observe ten days of ceremonies, rituals, fasts and feasts...
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/hindu_festival_sm/hindu_festival_01.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/hindu_festival_sm/hindu_festival_02.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/hindu_festival_sm/hindu_festival_03.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/hindu_festival_sm/hindu_festival_04.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/hindu_festival_sm/hindu_festival_05.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/hindu_festival_sm/hindu_festival_06.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/hindu_festival_sm/hindu_festival_07.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/hindu_festival_sm/hindu_festival_08.jpg
img: http://media.sacbee.com/static/weblogs/photos/images/2011/oct11/hindu_festival_sm/hindu_festival_09.jpg
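If you want to try the flag logic without hitting the network, here is a minimal sketch under a few assumptions: parse_feed is a made-up name, the sample feed is invented, and only titles are handled to keep it short. It shows what the flag is for: an Atom feed carries its own <title> before the first <entry>, and that one should not be printed.

```bash
#!/bin/bash
# Same flag technique as the script above, wrapped in a function
# (parse_feed is a hypothetical name) that reads the feed from stdin
# instead of calling wget.
parse_feed() {
    local in_entry=false line text
    while read -r line; do
        if $in_entry && [[ $line == *'<title>'* ]]; then
            text=${line#*>}   # drop the opening tag
            text=${text%<*}   # drop the closing tag
            echo "title: $text"
        fi
        # Flip the flag once the first <entry> has been seen.
        [[ $line == *'<entry>'* ]] && in_entry=true
    done
    return 0
}

# Invented sample: the feed-level title comes before the first <entry>.
parse_feed <<'EOF'
<feed>
<title>Feed name</title>
<entry>
<title>First post</title>
</entry>
</feed>
EOF
```

On this sample it prints only "title: First post"; the feed-level title is ignored because the flag is still false when that line is read.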
EDIT: Removed sed; it's all bash now, except for wget, of course...
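For anyone wondering how the extraction works without sed: it relies only on bash parameter expansion (#, % and %%). A minimal sketch, where both sample lines are made up for illustration and not taken from the real feed:

```bash
#!/bin/bash
# Hypothetical sample lines, shaped like the ones the script handles.
line='<title>Hindus honor the Mother Goddess</title>'
tag='<img alt="" src="http://example.com/a.jpg" width="100" />'

text=${line#*>}      # remove the shortest prefix ending in '>' (the opening tag)
text=${text%<*}      # remove the shortest suffix starting at '<' (the closing tag)
echo "title: $text"

src=${tag#*src=\"}   # remove everything up to and including src="
src=${src%%\"*}      # remove the longest suffix starting at a double quote
echo "img: $src"
```

A single # or % removes the shortest match and a doubled one removes the longest, which is why %% is needed to cut at the first double quote after the URL rather than the last one on the line.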