Failing to write to DB #27

Open
7109node opened this issue Dec 26, 2014 · 2 comments
Comments

@7109node

So I have this script working again after applying the fix for the indexing error from a few posts down. The script now runs start to finish, but it doesn't seem to parse the saved HTML files with BeautifulSoup properly, and I get a "list index out of range" error when it tries to write to the DB:

```
Writing 0 records to table: photosLiked
[*] Writing 0 record(s) to database table: photosLiked
list index out of range
```

This is consistent across all categories. My assumption is that the parsed pages no longer contain the elements the code expects, so the lists the parser builds come back shorter (or empty) than the lengths the code assumes, but I'm not sure how to validate that assumption.
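
For context, "list index out of range" is exactly what Python raises when you index into an empty list, which is what would happen here if the parser came back with no records. A minimal repro of what I suspect is going on:

```python
# If parsePhotosOf() matches nothing, write2Database() receives an empty
# list, and indexing it raises IndexError("list index out of range").
dataList = []                        # what an unmatched parse would produce
try:
    numOfColumns = len(dataList[0])  # same first-row lookup as write2Database
except IndexError as e:
    print(e)                         # -> list index out of range
```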

Here is the code section defining the columns:

"""def write2Database(dbName,dataList):
try:
cprint("[] Writing "+str(len(dataList))+" record(s) to database table: "+dbName,"white")
#print "[
] Writing "+str(len(dataList))+" record(s) to database table: "+dbName
numOfColumns = len(dataList[0])
c = conn.cursor()
if numOfColumns==3:
for i in dataList:
try:
c.execute('INSERT INTO '+dbName+' VALUES (?,?,?)', i)
conn.commit()
except sqlite3.IntegrityError:
continue
if numOfColumns==4:
for i in dataList:
try:
c.execute('INSERT INTO '+dbName+' VALUES (?,?,?,?)', i)
conn.commit()
except sqlite3.IntegrityError:
continue
if numOfColumns==5:
for i in dataList:
try:
c.execute('INSERT INTO '+dbName+' VALUES (?,?,?,?,?)', i)
conn.commit()
except sqlite3.IntegrityError:
continue
if numOfColumns==9:
for i in dataList:
try:
c.execute('INSERT INTO '+dbName+' VALUES (?,?,?,?,?,?,?,?,?)', i)
conn.commit()
except sqlite3.IntegrityError:
continue
except TypeError as e:
print e
pass
except IndexError as e:
print e
pass"""

Example of the parsing functions:

"""def parsePhotosOf(html):
soup = BeautifulSoup(html)
photoPageLink = soup.findAll("a", {"class" : "23q"})
tempList = []
for i in photoPageLink:
html = str(i)
soup1 = BeautifulSoup(html)
pageName = soup1.findAll("img", {"class" : "img"})
pageName1 = soup1.findAll("img", {"class" : "scaledImageFitWidth img"})
pageName2 = soup1.findAll("img", {"class" : "46-i img"})
for z in pageName2:
if z['src'].endswith('.jpg'):
url1 = i['href']
r = re.compile('fbid=(.*?)&set=bc')
m = r.search(url1)
if m:
filename = 'fbid
'+ m.group(1)+'.html'
filename = filename.replace("profile.php?id=","")
if not os.path.lexists(filename):
#html1 = downloadPage(url1)
html1 = downloadFile(url1)
print "[
] Caching Photo Page: "+m.group(1)
text_file = open(filename, "w")
text_file.write(normalize(html1))
text_file.close()
else:
html1 = open(filename, 'r').read()
soup2 = BeautifulSoup(html1)
username2 = soup2.find("div", {"class" : "fbPhotoContributorName"})
r = re.compile('a href="(.?)"')
m = r.search(str(username2))
if m:
username3 = m.group(1)
username3 = username3.replace("https://www.facebook.com/","")
username3 = username3.replace("profile.php?id=","")
print "[
] Extracting Data from Photo Page: "+username3
tempList.append([str(uid),z['alt'],z['src'],i['href'],username3])
for y in pageName1:
if y['src'].endswith('.jpg'):
url1 = i['href']
r = re.compile('fbid=(.?)&set=bc')
m = r.search(url1)
if m:
filename = 'fbid
'+ m.group(1)+'.html'
filename = filename.replace("profile.php?id=","")
if not os.path.lexists(filename):
#html1 = downloadPage(url1)
html1 = downloadFile(url1)
print "[] Caching Photo Page: "+m.group(1)
text_file = open(filename, "w")
text_file.write(normalize(html1))
text_file.close()
else:
html1 = open(filename, 'r').read()
soup2 = BeautifulSoup(html1)
username2 = soup2.find("div", {"class" : "fbPhotoContributorName"})
r = re.compile('a href="(.
?)"')
m = r.search(str(username2))
if m:
username3 = m.group(1)
username3 = username3.replace("https://www.facebook.com/","")
username3 = username3.replace("profile.php?id=","")
print "[] Extracting Data from Photo Page: "+username3
tempList.append([str(uid),y['alt'],y['src'],i['href'],username3])
for x in pageName:
if x['src'].endswith('.jpg'):
url1 = i['href']
r = re.compile('fbid=(.
?)&set=bc')
m = r.search(url1)
if m:
filename = 'fbid_'+ m.group(1)+'.html'
filename = filename.replace("profile.php?id=","")
if not os.path.lexists(filename):
#html1 = downloadPage(url1)
html1 = downloadFile(url1)
print "[] Caching Photo Page: "+m.group(1)
text_file = open(filename, "w")
text_file.write(normalize(html1))
text_file.close()
else:
html1 = open(filename, 'r').read()
soup2 = BeautifulSoup(html1)
username2 = soup2.find("div", {"class" : "fbPhotoContributorName"})
r = re.compile('a href="(.
?)"')
m = r.search(str(username2))
if m:
username3 = m.group(1)
username3 = username3.replace("https://www.facebook.com/","")
username3 = username3.replace("profile.php?id=","")
print "[*] Extracting Data from Photo Page: "+username3
tempList.append([str(uid),x['alt'],x['src'],i['href'],username3])"""
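
To test whether the selectors are what broke, a quick check like this (not part of the script; it reuses the class names from the snippet above, and photosOf.html is a stand-in for whichever cached page you want to inspect) should show whether the findAll() calls still match anything:

```python
from bs4 import BeautifulSoup  # or whichever BeautifulSoup import the script uses

def countMatches(html):
    # Print how many elements each selector from parsePhotosOf() still finds.
    # All-zero counts would mean Facebook changed its markup, the parser
    # builds an empty tempList, and write2Database() then fails on
    # dataList[0] with "list index out of range".
    soup = BeautifulSoup(html)
    print("a._23q:                  " + str(len(soup.findAll("a", {"class" : "_23q"}))))
    print("img.img:                 " + str(len(soup.findAll("img", {"class" : "img"}))))
    print("img.scaledImageFitWidth: " + str(len(soup.findAll("img", {"class" : "scaledImageFitWidth img"}))))
    print("img._46-i img:           " + str(len(soup.findAll("img", {"class" : "_46-i img"}))))

countMatches(open('photosOf.html', 'r').read())
```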

@davebar commented Feb 25, 2015

Don't get it! I've been trying for weeks to get it working with no success.
Any progress on the "writing to DB" problem?

@7109node (Author)

I honestly got caught up writing a new script that I uploaded to my GitHub. Shameless plug, but it detects and exploits any jailbroken iOS device on a LAN. It's an old exploit, but I thought it would be fun to add a little automation.

I'll re-engage with this DB problem soon.
