Re: Using hpricot to get tables
by Lrlebron@Gmail.Com
Jul 1 2008 2:44PM
On Jul 1, 4:03 pm, Dan Diebolt <dandieb...@[...].com> wrote:
> [Note: parts of this message were removed to make it a legal post.]
>
> >I would like to access each table individually
>
> doc.search returns an array even if there is only one match. The construct you are using iterates through this array:
>
> doc.search(strPath) do |div|
>
> end
>
> If you capture the search results into a variable named "divs", you can index it like an array (because it is one):
>
> divs=doc.search(strPath)
>
> If you want to immediately start iterating you can do this:
>
> doc.search(strPath).each_with_index do |div,idiv|
>   puts idiv if idiv==2
> end
>
> I work with hpricot a lot, and I find it more productive not to use all the fancy Ruby idioms to shorten the code, because the pages you are dealing with are very fragile to parse when someone changes the page content.
>
> See code below
> ==============
> require 'hpricot'
> require 'open-uri'
>
> strLink ="http://www.sportsline.com/mlb/gamecenter/boxscore/MLB_20080331_ARI@CIN"
> strPath ="//div[@class='SLTables1']/div"
>
> doc = Hpricot(open(strLink))
> divs=doc.search(strPath)
>
> puts "#{divs[0].inner_text.slice(0..70)}\n\n"
> puts "#{divs[1].inner_text.slice(0..70)}\n\n"
> puts "#{divs[2].inner_text.slice(0..70)}\n\n"
> puts "#{divs[3].inner_text.slice(0..70)}\n\n"
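The four puts lines above assume the page always yields at least four matching divs; if the markup changes and one is missing, divs[3] would be nil and the inner_text call would raise. Below is a minimal sketch of a guarded version. It is not the code from the thread: it uses REXML from Ruby's standard distribution as a stand-in for Hpricot, and a small made-up document instead of the live box-score page, so the markup and the names html, str_path, and previews are assumptions for illustration only.

```ruby
require 'rexml/document'

# Hypothetical stand-in markup; the real box-score page is not reproduced here.
html = <<~XML
  <html><body>
    <div class="SLTables1">
      <div>Arizona batting</div>
      <div>Cincinnati batting</div>
      <div>Arizona pitching</div>
    </div>
  </body></html>
XML

str_path = "//div[@class='SLTables1']/div"
doc  = REXML::Document.new(html)

# XPath.match returns a plain Array of elements, so it can be indexed
# in the same way as the captured search result in the post.
divs = REXML::XPath.match(doc, str_path)

# Index directly, but check for a missing table instead of assuming it exists.
previews = (0..3).map do |i|
  div = divs[i]
  div ? div.text.strip.slice(0..70) : "(no table at index #{i})"
end

puts previews
```

With Hpricot itself the same guard would apply to divs[i] before calling inner_text on it.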
This works. Will be very useful for future projects.
I ended up using the xpath for each table which also worked.
Thanks,
Luis
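
The post does not say what "the xpath for each table" looked like. One plausible reading is a positional predicate appended to the shared path, so that each table gets its own expression. The sketch below shows that idea under the same assumptions as before: REXML stands in for Hpricot, and the markup and the names html, base_path, and second are made up for illustration.

```ruby
require 'rexml/document'

# Hypothetical stand-in markup; the real box-score page is not reproduced here.
html = <<~XML
  <html><body>
    <div class="SLTables1">
      <div>Arizona batting</div>
      <div>Cincinnati batting</div>
      <div>Arizona pitching</div>
    </div>
  </body></html>
XML

doc = REXML::Document.new(html)
base_path = "//div[@class='SLTables1']/div"

# XPath positions start at 1, so [2] selects the second div
# (the element that sits at index 1 in the array form).
second = REXML::XPath.first(doc, "#{base_path}[2]")

# XPath.first returns nil when nothing matches, so guard before using it.
puts second.text if second
```

Note the off-by-one difference between the two approaches: array indexes start at 0, XPath positions at 1.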