I'm trying to crawl all the ship data from https://greatlakeships.org/results?q=&st=kw, including the detail pages such as https://greatlakeships.org/3721293/data?n=1, but I get empty results every time I run my code.
import requests
from bs4 import BeautifulSoup
import csv

baseurl = 'https://greatlakeships.org/'
headers = {'User-Agent': 'Mozilla/5.0'}
productlinks = []  # put all item in this array

for x in range(1, 2):  # set page range
    response = requests.get(f'https://greatlakeships.org/results?bl=and&st=kw&q2=text:(*:*)&rows=20&sort=titleSort asc&p={x}')  # url of next page
    soup = BeautifulSoup(response.content, 'html.parser')
    productlist = soup.find_all('ul', class_='single')
    # loop to get all href from ul
    for item in productlist:
        for link in item.find_all('a', href=True):
            productlinks.append(baseurl + link['href'])
print(len(productlinks))

# product details pages
# testlink = 'https://greatlakeships.org/3721293/data?n=1'
tabledata = []
for data in productlinks:
    response = requests.get(data, headers=headers)
    soup = BeautifulSoup(response.content, 'html.parser')
    trs = soup.find_all('div', class_='DetailsSub')
    for tr in trs:
        tds = tr.find_all('dd')
        # column 1 data
        also_known_as = tds[0].text
        # column 2 data
        Year_of_Build = tds[1].text
        # column 3 data
        Official_Number = tds[2].text
        # column 4 data
        Built_at = tds[3].text
        # column 5 data
        Vessel_Type = tds[4].text
        # column 6 data
        Additional_vessel_types = tds[5].text
        # column 7 data
        Hull_Materials = tds[6].text
        # column 8 data
        Builder_Name = tds[7].text
        # column 9 data
        Original_Owner_and_Location = tds[8].text
        # column 10 data
        Length = tds[9].text
        # column 11 data
        Beam = tds[10].text
        # column 12 data
        Depth = tds[11].text
        # column 13 data
        Tonnage_gross = tds[12].text
        # column 14 data
        Tonnage_net = tds[13].text
        # column 15 data
        Contact = tds[14].text
        # save data
        tr_data = {'Also known as': also_known_as,
                   'Year of Build': Year_of_Build,
                   'Official Number': Official_Number,
                   'Built at': Built_at,
                   'Vessel Type': Vessel_Type,
                   'Additional vessel types': Additional_vessel_types,
                   'Hull Materials': Hull_Materials,
                   'Builder Name': Builder_Name,
                   'Original Owner and Location': Original_Owner_and_Location,
                   'Length': Length,
                   'Beam': Beam,
                   'Depth': Depth,
                   'Tonnage (gross)': Tonnage_gross,
                   'Tonnage (net)': Tonnage_net,
                   'Contact': Contact
                   }
        tabledata.append(tr_data)
print(tabledata)
I get the links from the results page (print(len(productlinks)) reports them), but I can't get any data from those links. Can anyone help me with that? Also, how can I save this data to a CSV file? Thanks in advance!
CodePudding user response:
The issue is that the links you are trying to request aren't going to return anything. If you add a print statement to debug (for example print(data)), you'd see it's trying to request URLs like https://greatlakeships.org/https://greatlakeships.org/2899767/data?n=5. The href values on the results page are already absolute, so prepending baseurl doubles the domain.
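One way to avoid this kind of doubled URL, whether the hrefs are absolute or relative, is to resolve them with urljoin from the standard library instead of concatenating strings. The sketch below is only an illustration: the helper name absolute_links and the sample hrefs are made up for the example (the relative one is an assumption, since this site appears to return absolute links). It also drops repeated links while keeping their original order.

```python
from urllib.parse import urljoin

BASE_URL = 'https://greatlakeships.org/'


def absolute_links(hrefs, base=BASE_URL):
    """Resolve each href against base and drop duplicates, keeping first-seen order."""
    resolved = (urljoin(base, href) for href in hrefs)
    # dict keys keep insertion order, so this de-duplicates without shuffling
    return list(dict.fromkeys(resolved))


# Illustrative hrefs only; the relative one is hypothetical.
links = absolute_links([
    'https://greatlakeships.org/2899767/data?n=5',  # already absolute: left unchanged
    '/3721293/data?n=1',                            # relative: resolved against base
    'https://greatlakeships.org/2899767/data?n=5',  # duplicate: dropped
])
print(links)
```

Compared with list(set(...)), this keeps the links in the order they appeared on the results page, which makes the scraped rows easier to compare against the site.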
Also, there are duplicate links, so you may want to remove those. You can also make the parsing of the data a little more robust and efficient. Give this a try:
import requests
from bs4 import BeautifulSoup
import pandas as pd

baseurl = 'https://greatlakeships.org/'
headers = {'User-Agent': 'Mozilla/5.0'}
productlinks = []  # put all item in this array

for x in range(1, 2):  # set page range
    response = requests.get(f'https://greatlakeships.org/results?bl=and&st=kw&q2=text:(*:*)&rows=20&sort=titleSort asc&p={x}')  # url of next page
    soup = BeautifulSoup(response.content, 'html.parser')
    productlist = soup.find_all('ul', class_='single')
    # loop to get all href from ul
    for item in productlist:
        for link in item.find_all('a', href=True):
            productlinks.append(link['href'])

productlinks = list(set(productlinks))
print(len(productlinks))

# product details pages
# testlink = 'https://greatlakeships.org/3721293/data?n=1'
tabledata = []
for link in productlinks:
    print(link)
    response = requests.get(link, headers=headers)
    soup = BeautifulSoup(response.content, 'html.parser')
    shipName = soup.find('div', {'class': 'RecordTitle'}).text.strip()
    fieldsets = soup.find_all('fieldset')
    row = {'Ship Name': shipName}
    for fieldset in fieldsets:
        dts = fieldset.find_all('dt')
        for dt in dts:
            row.update({dt.text.strip(): dt.find_next('dd').text.strip()})
    tabledata.append(row)

df = pd.DataFrame(tabledata)
Output:
print(df.to_string())
Ship Name Also known as: Year of Build: Official Number: Built at: Vessel Type: Hull Materials: Number of Decks: Hull Number: Builder Name: Original Owner and Location: Length: Beam: Depth: Tonnage (gross): Tonnage (net): From the Collection of: Contact Final Location: Date: How: Final Cargo: Notes: Capacity: Note on Dimensions: Final Depth: Original Owner: Additional vessel types:
0 132 (1893, Barge) PORTSMOUTH 1893 53279 Superior, WI Barge Steel 1 132 American Steel Barge Company American Steel Barge Company, Buffalo, NY 292' 36' 22' 1,310 1,265 C. Patrick Labadie Alpena County George N. Fletcher Public Library\n\t\tEmail:[email protected]\n\t\tWebsite: http://www.alpenalibrary.org\n\t\tAgency street/mail address: 211 N. First Ave. \r\nAlpena, Michigan 49707 \r\nUSA\r\n(989)356-6188 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 105 (1890, Barge) BARONESS 1890 53258 Duluth, MN Barge Steel 1 105 American Steel Barge Company American Steel Barge Company, Buffalo, NY 276.5' 36.1' 18.9' 1,295.44 1,230.69 C. Patrick Labadie Alpena County George N. Fletcher Public Library\n\t\tEmail:[email protected]\n\t\tWebsite: http://www.alpenalibrary.org\n\t\tAgency street/mail address: 211 N. First Ave. \r\nAlpena, Michigan 49707 \r\nUSA\r\n(989)356-6188 10-15 miles Southwest by West of Fire Island, NY Lightship\nAtlantic Ocean 19 Nov 1910 collision with french barque ELISABETH coal being towed by BAYPORT NaN NaN NaN NaN NaN
2 102 (1889, Barge) WHITWORTH, SIR JOSEPH;BATH 1889 53255 Duluth, MN Barge Steel 1 102 American Steel Barge Co. American Steel Barge Co, Buffalo, NY 253' 36.1' 18.8' 1,192.20 1,132.56 C. Patrick Labadie Alpena County George N. Fletcher Public Library\n\t\tEmail:[email protected]\n\t\tWebsite: http://www.alpenalibrary.org\n\t\tAgency street/mail address: 211 N. First Ave. \r\nAlpena, Michigan 49707 \r\nUSA\r\n(989)356-6188 off Cape Charles, VA\nAtlantic Ocean 15 Dec 1905 foundered NaN all 5 lives lost 100,000 bushels NaN NaN NaN NaN
3 116 (1891, Barge) BRITANNIA;PURE TIOLENE 1891 53269 Superior, WI Barge Steel 1 116 American Steel Barge Company American Steel Barge Company, Buffalo, NY 262' 36' 22' 1,169.11 1,110.66 C. Patrick Labadie Alpena County George N. Fletcher Public Library\n\t\tEmail:[email protected]\n\t\tWebsite: http://www.alpenalibrary.org\n\t\tAgency street/mail address: 211 N. First Ave. \r\nAlpena, Michigan 49707 \r\nUSA\r\n(989)356-6188 Huston, TX 1946 scrapped NaN NaN NaN NaN NaN NaN NaN
4 131 (1893, Barge) SALEM;FREEPORT SULPHUR NO. 4;PURE OIL NO. 10;PURE NULUBE 1893 53278 Superior, WI Barge Steel 1 131 American Steel Barge Company American Steel Barge Company, Buffalo, NY 292' 36' 22' 1,310 1,265 C. Patrick Labadie Alpena County George N. Fletcher Public Library\n\t\tEmail:[email protected]\n\t\tWebsite: http://www.alpenalibrary.org\n\t\tAgency street/mail address: 211 N. First Ave. \r\nAlpena, Michigan 49707 \r\nUSA\r\n(989)356-6188 Huston, TX 1946 scrapped NaN NaN NaN NaN NaN NaN NaN
5 107 (1890, Barge) BOMBAY 1890 53260 Duluth, MN Barge Steel 1 107 American Steel Barge Company American Steel Barge Company, Buffalo, NY 276.5' 36.1' 18.9' 1,295.44 1,230.69 C. Patrick Labadie Alpena County George N. Fletcher Public Library\n\t\tEmail:[email protected]\n\t\tWebsite: http://www.alpenalibrary.org\n\t\tAgency street/mail address: 211 N. First Ave. \r\nAlpena, Michigan 49707 \r\nUSA\r\n(989)356-6188 off Nantucket Shoals\nAtlantic Ocean 3 Jan 1913 foundered in storm NaN NaN NaN NaN NaN NaN NaN
6 104 (1890, Barge) NaN 1890 53257 Duluth, MN Barge Steel 1 104 American Steel Barge Company American Steel Barge Company, Buffalo, NY 276.5' 36.1' 18.9' 1,295.44 1,230.69 C. Patrick Labadie Alpena County George N. Fletcher Public Library\n\t\tEmail:[email protected]\n\t\tWebsite: http://www.alpenalibrary.org\n\t\tAgency street/mail address: 211 N. First Ave. \r\nAlpena, Michigan 49707 \r\nUSA\r\n(989)356-6188 Cleveland, OH; west breakwater\nLake Erie 10 Nov 1898 ashore on breakwater in gale NaN crew rescued by lifesaving service 3,300 tons NaN NaN NaN NaN
7 118 (1891, Barge) BOSTON;FREEPORT SULLPHUR NO. 3;PURE OIL STEAM SHIP CO. BARGE NO. 9;PURE DETONOX 1891 53272 Superior, WI Barge Steel 1 118 American Steel Barge Company American Steel Barge Company, Buffalo, NY 292' 36' 22' 1,310.82 1,265.92 C. Patrick Labadie Alpena County George N. Fletcher Public Library\n\t\tEmail:[email protected]\n\t\tWebsite: http://www.alpenalibrary.org\n\t\tAgency street/mail address: 211 N. First Ave. \r\nAlpena, Michigan 49707 \r\nUSA\r\n(989)356-6188 Huston, TX 1946 scrapped NaN NaN 4,000 tons alternate length: 320? NaN NaN NaN
8 129 (1893, Barge) NaN 1893 53276 Superior, WI Barge Steel 1 129 American Steel Barge Company American Steel Barge Company, Buffalo, NY 292' 36' 22' 1,310 1,265 C. Patrick Labadie Alpena County George N. Fletcher Public Library\n\t\tEmail:[email protected]\n\t\tWebsite: http://www.alpenalibrary.org\n\t\tAgency street/mail address: 211 N. First Ave. \r\nAlpena, Michigan 49707 \r\nUSA\r\n(989)356-6188 off Vermillion Point, MI\nLake Superior 13 Oct 1902 rammed by towing steamer MAUNALOA iron ore NaN NaN NaN 125 feet NaN NaN
9 117 (1891, Barge) PROVIDENCE 1891 53271 Superior, WI Barge Steel 1 117 American Steel Barge Company American Steel Barge Company, Buffalo, NY 292' 36' 22' 1,310.82 1,265.92 C. Patrick Labadie Alpena County George N. Fletcher Public Library\n\t\tEmail:[email protected]\n\t\tWebsite: http://www.alpenalibrary.org\n\t\tAgency street/mail address: 211 N. First Ave. \r\nAlpena, Michigan 49707 \r\nUSA\r\n(989)356-6188 NaN NaN NaN NaN NaN 3,000 tons NaN NaN NaN NaN
10 111 (1891, Barge) IVIE 1891 53267 Superior, WI Barge Steel 1 NaN American Steel Barge Company American Steel Barge Company, Buffalo, NY 265' 36' NaN 1,227 1,167 C. Patrick Labadie Alpena County George N. Fletcher Public Library\n\t\tEmail:[email protected]\n\t\tWebsite: http://www.alpenalibrary.org\n\t\tAgency street/mail address: 211 N. First Ave. \r\nAlpena, Michigan 49707 \r\nUSA\r\n(989)356-6188 Hampton Roads, VA\nAtlantic Ocean 10 May 1916 collision with steamer BERKSHIRE NaN NaN 85,000 bushels NaN NaN NaN NaN
11 130 (1893, Barge) LYNN 1893 53277 Superior, WI Barge Steel 1 130 American Steel Barge Company American Steel Barge Company, Buffalo, NY 292' 36' 22' 1,310 1,265 C. Patrick Labadie Alpena County George N. Fletcher Public Library\n\t\tEmail:[email protected]\n\t\tWebsite: http://www.alpenalibrary.org\n\t\tAgency street/mail address: 211 N. First Ave. \r\nAlpena, Michigan 49707 \r\nUSA\r\n(989)356-6188 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
12 115 (1891, Barge) NaN 1891 53268 Superior, WI Barge Steel 1 115 American Steel Barge Company American Steel Barge Company, Buffalo, NY 256' 36.1' 18.9' 1,169.11 1,110.66 C. Patrick Labadie Alpena County George N. Fletcher Public Library\n\t\tEmail:[email protected]\n\t\tWebsite: http://www.alpenalibrary.org\n\t\tAgency street/mail address: 211 N. First Ave. \r\nAlpena, Michigan 49707 \r\nUSA\r\n(989)356-6188 Pic Island near Marathon, ONT\nLake Superior 18 Dec 1899 ashore after breaking towline iron ore last shipwreck of the 19th century NaN NaN NaN NaN NaN
13 101 (1888, Barge) NaN 1888 53249 Duluth, MN Barge Steel 1 101 Alexander McDougall Alexander McDougall, et al, Duluth 178' 25.1' 12.7' 428.30 412.32 C. Patrick Labadie Alpena County George N. Fletcher Public Library\n\t\tEmail:[email protected]\n\t\tWebsite: http://www.alpenalibrary.org\n\t\tAgency street/mail address: 211 N. First Ave. \r\nAlpena, Michigan 49707 \r\nUSA\r\n(989)356-6188 30 miles off Seal Island, ME\nAtlantic Ocean 3 Dec 1908 NaN NaN NaN NaN NaN NaN NaN NaN
14 109 (1890, Barge) BARAVIA 1890 53265 Superior, WI Barge Steel 1 109 American Steel Barge Company NaN 265' 36' 22' 1,227 1,167 C. Patrick Labadie Alpena County George N. Fletcher Public Library\n\t\tEmail:[email protected]\n\t\tWebsite: http://www.alpenalibrary.org\n\t\tAgency street/mail address: 211 N. First Ave. \r\nAlpena, Michigan 49707 \r\nUSA\r\n(989)356-6188 off Montauk Point, NY, Long Island Sound\nAtlantic Ocean 23 Jan 1914 foundered NaN NaN NaN NaN NaN American Steel Barge Company NaN
15 127 (1892, Barge) JEANIE, DALLAS 1892 53274 Superior, WI Barge Steel 1 127 American Steel Barge Company American Steel Barge Company, Buffalo, NY 264' 36' 22' 1,128 1,083 C. Patrick Labadie Alpena County George N. Fletcher Public Library\n\t\tEmail:[email protected]\n\t\tWebsite: http://www.alpenalibrary.org\n\t\tAgency street/mail address: 211 N. First Ave. \r\nAlpena, Michigan 49707 \r\nUSA\r\n(989)356-6188 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
16 126 (1892, Barge) BADEN 1892 53273 Superior, WI Barge Steel 1 126 American Steel Barge Company American Steel Barge Company, Buffalo, NY 264' 36' 22' 1,128 1,083 C. Patrick Labadie Alpena County George N. Fletcher Public Library\n\t\tEmail:[email protected]\n\t\tWebsite: http://www.alpenalibrary.org\n\t\tAgency street/mail address: 211 N. First Ave. \r\nAlpena, Michigan 49707 \r\nUSA\r\n(989)356-6188 Buzzard's Bay, MA\nAtlantic Ocean 31 Dec 1905 stranded NaN all hands lost NaN NaN NaN NaN NaN
17 110 (1891, Barge) BADGER;PURE LUBWELL 1891 53266 Superior, WI Barge Steel 1 110 American Steel Barge Company American Steel Barge Company, Buffalo, NY 265' 36' 22' 1,227 1,167 C. Patrick Labadie Alpena County George N. Fletcher Public Library\n\t\tEmail:[email protected]\n\t\tWebsite: http://www.alpenalibrary.org\n\t\tAgency street/mail address: 211 N. First Ave. \r\nAlpena, Michigan 49707 \r\nUSA\r\n(989)356-6188 New Orleans, LA 13 Mar 1932 fire NaN 1 life lost, 7 rescued 85,000 bushels NaN NaN NaN NaN
18 0452 (1957, Scow) C & O 452; KODIAK 1957 275111 Port Deposit, Maryland, United States Scow Steel NaN NaN Wiley Manufacturing Co Baltimore & Ohio Railroad Co, Maryland 90' 30' 9' 222 222 NaN Alpena County George N. Fletcher Public Library\n\t\tEmail:[email protected]\n\t\tWebsite: http://www.alpenalibrary.org\n\t\tAgency street/mail address: 211 N. First Ave. \r\nAlpena, Michigan 49707 \r\nUSA\r\n(989)356-6188 NaN NaN NaN NaN NaN NaN NaN NaN NaN Tug (Towboat)
19 103 (1889, Barge) RUSSELL, JOHN SCOTT;BERKSHIRE 1889 53256 Duluth, MN Barge Steel 1 103 American Steel Barge Co. American Steel Barge Company, Buffalo, NY 253' 36.1' 18.8' 1,192.20 1,132.56 C. Patrick Labadie Alpena County George N. Fletcher Public Library\n\t\tEmail:[email protected]\n\t\tWebsite: http://www.alpenalibrary.org\n\t\tAgency street/mail address: 211 N. First Ave. \r\nAlpena, Michigan 49707 \r\nUSA\r\n(989)356-6188 Off Sandy Hook, NJ\nAtlantic Ocean 23 May 1909 Foundered NaN NaN 3,000 tons NaN NaN NaN NaN
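As for saving to CSV: with the DataFrame above, df.to_csv('ships.csv', index=False) should be enough (the file name is just an example). If you would rather not depend on pandas, here is a rough sketch using the standard csv module. It assumes rows shaped like the tabledata dicts above, where each ship can have a different set of fields. The helper name write_rows_csv is made up for the example, and the sample rows reuse a few values from the output above.

```python
import csv
import io


def write_rows_csv(rows, fileobj):
    """Write dicts with differing keys as CSV; fields a row lacks are left empty."""
    # The union of all keys, in first-seen order, becomes the header row.
    fieldnames = list(dict.fromkeys(key for row in rows for key in row))
    writer = csv.DictWriter(fileobj, fieldnames=fieldnames, restval='')
    writer.writeheader()
    for row in rows:
        # Collapse tabs and newlines inside values (e.g. the Contact block).
        writer.writerow({k: ' '.join(str(v).split()) for k, v in row.items()})
    return fieldnames


# Small sample shaped like the scraped records; an in-memory buffer stands in for a file.
sample = [
    {'Ship Name': '132 (1893, Barge)', 'Year of Build:': '1893'},
    {'Ship Name': '105 (1890, Barge)', 'Date:': '19 Nov 1910'},
]
buffer = io.StringIO()
header = write_rows_csv(sample, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write a real file, pass open('ships.csv', 'w', newline='', encoding='utf-8') instead of the buffer. The newline='' argument is what the csv module documentation recommends, so that rows are not separated by blank lines on Windows.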