so close to run, maybe... #28
Comments
for me, I commented out the following in fbstalker: |
Added: Commented out: |
Changed "from pygraphml.GraphMLParser import *" to "from pygraphml.graphml_parser import *". Now new errors: |
So I have been working at this for a little while now. I have it running to the point where it scrapes the pages and completes the run with pictures etc., but when it goes to write to the SQL.db it records nothing. I also get the index out of range error. I believe this is caused by a change in Facebook's backend, with the information being parsed out into larger fields than the original script allotted for. I'm going to give it another look when I get some time. |
Yeah, same clue on my side: an sqlite3 problem. Still working on it, no progress so far... |
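One way to narrow down the "records nothing" and "index out of range" symptoms is to check the shape of each row before it reaches sqlite3 and to commit explicitly. The sketch below is only an illustration under assumptions: the table name `records`, its five text columns, and the helper `insert_rows` are hypothetical and are not taken from fbstalker.

```python
import sqlite3

EXPECTED_FIELDS = 5  # assumption: a hypothetical table with five columns


def insert_rows(conn, rows):
    """Insert rows that have the expected number of fields.

    Returns (number_inserted, skipped_rows) so malformed rows can be inspected
    instead of failing later with an index error.
    """
    good = [tuple(r) for r in rows if len(r) == EXPECTED_FIELDS]
    skipped = [r for r in rows if len(r) != EXPECTED_FIELDS]
    # Using the connection as a context manager commits on success and
    # rolls back if an exception is raised; without a commit nothing is saved.
    with conn:
        conn.executemany("INSERT INTO records VALUES (?, ?, ?, ?, ?)", good)
    return len(good), skipped


# Small demonstration against an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (a TEXT, b TEXT, c TEXT, d TEXT, e TEXT)")
inserted, skipped = insert_rows(conn, [["1", "2", "3", "4", "5"], ["too", "short"]])
count = conn.execute("SELECT COUNT(*) FROM records").fetchone()[0]
```

If parsing yields rows that are too short or too long, they end up in `skipped` and can be printed for inspection, which helps separate a parsing problem from a database problem.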
@davebar all the parse definitions are obsolete and no longer match how Facebook works, so to make it work we need to rewrite:
def parsePhotosOf(html):
    soup = BeautifulSoup(html)
    photoPageLink = soup.findAll("a", {"class" : "_23q"})
    tempList = []
    for i in photoPageLink:
        html = str(i)
        soup1 = BeautifulSoup(html)
        # Three image variants are looked up by CSS class
        pageName = soup1.findAll("img", {"class" : "img"})
        pageName1 = soup1.findAll("img", {"class" : "scaledImageFitWidth img"})
        pageName2 = soup1.findAll("img", {"class" : "_46-i img"})
        for z in pageName2:
            if z['src'].endswith('.jpg'):
                url1 = i['href']
                r = re.compile('fbid=(.*?)&set=bc')
                m = r.search(url1)
                if m:
                    filename = 'fbid_'+ m.group(1)+'.html'
                    filename = filename.replace("profile.php?id=","")
                    # Download the photo page only if it is not cached on disk
                    if not os.path.lexists(filename):
                        #html1 = downloadPage(url1)
                        html1 = downloadFile(url1)
                        print "[*] Caching Photo Page: "+m.group(1)
                        text_file = open(filename, "w")
                        text_file.write(normalize(html1))
                        text_file.close()
                    else:
                        html1 = open(filename, 'r').read()
                    soup2 = BeautifulSoup(html1)
                    username2 = soup2.find("div", {"class" : "fbPhotoContributorName"})
                    r = re.compile('a href="(.*?)"')
                    m = r.search(str(username2))
                    if m:
                        username3 = m.group(1)
                        username3 = username3.replace("https://www.facebook.com/","")
                        username3 = username3.replace("profile.php?id=","")
                        print "[*] Extracting Data from Photo Page: "+username3
                        # uid is not defined in this function; it must come from module scope
                        tempList.append([str(uid),z['alt'],z['src'],i['href'],username3])
        for y in pageName1:
            if y['src'].endswith('.jpg'):
                url1 = i['href']
                r = re.compile('fbid=(.*?)&set=bc')
                m = r.search(url1)
                if m:
                    filename = 'fbid_'+ m.group(1)+'.html'
                    filename = filename.replace("profile.php?id=","")
                    if not os.path.lexists(filename):
                        #html1 = downloadPage(url1)
                        html1 = downloadFile(url1)
                        print "[*] Caching Photo Page: "+m.group(1)
                        text_file = open(filename, "w")
                        text_file.write(normalize(html1))
                        text_file.close()
                    else:
                        html1 = open(filename, 'r').read()
                    soup2 = BeautifulSoup(html1)
                    username2 = soup2.find("div", {"class" : "fbPhotoContributorName"})
                    r = re.compile('a href="(.*?)"')
                    m = r.search(str(username2))
                    if m:
                        username3 = m.group(1)
                        username3 = username3.replace("https://www.facebook.com/","")
                        username3 = username3.replace("profile.php?id=","")
                        print "[*] Extracting Data from Photo Page: "+username3
                        tempList.append([str(uid),y['alt'],y['src'],i['href'],username3])
        for x in pageName:
            if x['src'].endswith('.jpg'):
                url1 = i['href']
                r = re.compile('fbid=(.*?)&set=bc')
                m = r.search(url1)
                if m:
                    filename = 'fbid_'+ m.group(1)+'.html'
                    filename = filename.replace("profile.php?id=","")
                    if not os.path.lexists(filename):
                        #html1 = downloadPage(url1)
                        html1 = downloadFile(url1)
                        print "[*] Caching Photo Page: "+m.group(1)
                        text_file = open(filename, "w")
                        text_file.write(normalize(html1))
                        text_file.close()
                    else:
                        html1 = open(filename, 'r').read()
                    soup2 = BeautifulSoup(html1)
                    username2 = soup2.find("div", {"class" : "fbPhotoContributorName"})
                    r = re.compile('a href="(.*?)"')
                    m = r.search(str(username2))
                    if m:
                        username3 = m.group(1)
                        username3 = username3.replace("https://www.facebook.com/","")
                        username3 = username3.replace("profile.php?id=","")
                        print "[*] Extracting Data from Photo Page: "+username3
                        tempList.append([str(uid),x['alt'],x['src'],i['href'],username3])
    return tempList
|
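The "read from the cache file if it exists, otherwise download and save" step is repeated three times in the function above. As a rough sketch of how it might be factored out, the helper below takes the download step as a callable. The name `cached_fetch` and the `fake_fetch` stand-in are hypothetical and exist only for illustration; they are not part of fbstalker.

```python
import os
import tempfile


def cached_fetch(filename, fetch):
    """Return the contents of `filename`, calling `fetch()` only on a cache miss."""
    if os.path.lexists(filename):
        with open(filename, "r") as fh:
            return fh.read()
    text = fetch()
    with open(filename, "w") as fh:
        fh.write(text)
    return text


# Demonstration with a stand-in fetch function that counts its calls.
calls = []


def fake_fetch():
    calls.append(1)
    return "<html>demo</html>"


path = os.path.join(tempfile.mkdtemp(), "fbid_demo.html")
first = cached_fetch(path, fake_fetch)   # cache miss: calls fake_fetch
second = cached_fetch(path, fake_fetch)  # cache hit: reads the file
```

Passing the download step in as a callable keeps the helper independent of how pages are fetched, so it can be tested without any network access.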
Any more progress on this? |
Still no progress, working on it. |
Anything new, 2 years later? I can launch fbstalker without any problem. It launches Chrome, connects to FB, and then nothing else happens. I always get the same message: "Problem converting username to uid". I tried with and without a Facebook token, and I uninstalled and reinstalled Chrome and Chromedriver, but the problem is still there. Does anyone have an idea? |
Changed links for Chrome, installation seems OK, but after running: python fbstalker1.py -user blabla
this is what I get:
Traceback (most recent call last):
  File "fbstalker.py", line 18, in <module>
    from pygraphml.GraphMLParser import *
ImportError: No module named GraphMLParser
When I dropped that line, the same error occurred on the next ones (Graph, Node, Edge).
Any idea?
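An ImportError like the one above usually means the installed package lays out its modules differently from what the script expects. A generic way to cope, sketched below, is to try several candidate module paths and use the first one that imports. The helper `import_first` is hypothetical, and the pygraphml module paths in the commented usage are assumptions based on this thread; check them against the installed version.

```python
import importlib


def import_first(candidates):
    """Return the first module in `candidates` that imports without error."""
    errors = []
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError as exc:
            errors.append("%s: %s" % (name, exc))
    raise ImportError("no candidate could be imported:\n" + "\n".join(errors))


# Hypothetical usage for the failing line (module paths are assumptions):
# parser_mod = import_first(["pygraphml.GraphMLParser",
#                            "pygraphml.graphml_parser",
#                            "pygraphml"])
# GraphMLParser = parser_mod.GraphMLParser
```

The same approach would apply to the Graph, Node, and Edge imports that fail next.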