Wednesday, February 6, 2013

INSTALLING SCRAPY ON WINDOWS 8 (Web Crawling)

Scrapy is a framework for crawling websites and extracting structured data from their pages. To install Scrapy on Windows 8, there are some dependencies that need to be installed first. This post lists them in order. Keep in mind that I have Python 2.7 installed on Windows 8, so if you have a different version of Python, select packages accordingly.

After installing Python, follow these steps before installing Scrapy:
Add the C:\python27\Scripts and C:\python27 folders to the system path by adding those directories to the PATH environment variable. To do this, go to Control Panel > System and Security > System > Advanced system settings, open Environment Variables, and scroll down to the PATH variable. Click Edit and append each folder to the end of the value, separated by a semicolon ( ; ).
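
To confirm the PATH change took effect, open a new command prompt and run a small check like the one below. This is only a sketch: the function name missing_from_path is my own, and the folder names assume the default C:\python27 install location used in this post.

```python
from __future__ import print_function
import os


def missing_from_path(required_dirs, path_value=None):
    """Return the entries of required_dirs that are not listed in PATH."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    # Windows separates PATH entries with ';' and ignores case,
    # so normalise case and trailing slashes before comparing.
    entries = set()
    for entry in path_value.split(";"):
        entry = entry.strip().rstrip("\\/").lower()
        if entry:
            entries.add(entry)
    return [d for d in required_dirs
            if d.rstrip("\\/").lower() not in entries]


if __name__ == "__main__":
    # Assumed default install folders; adjust if Python lives elsewhere.
    for folder in missing_from_path([r"C:\python27", r"C:\python27\Scripts"]):
        print("Not on PATH:", folder)
```

If the script prints nothing, both folders are on PATH. Remember that command prompts opened before the change still see the old value.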

Install OpenSSL by following these steps:

  1. Go to the Win32 OpenSSL page
  2. Download the Visual C++ 2008 redistributables for your Windows version and architecture
  3. Download OpenSSL for your Windows version and architecture (the regular version, not the light one)
  4. Add the c:\openssl-win32\bin (or similar) directory to your PATH, the same way you added python27 in the first step

Next, install setuptools, which provides easy_install for Python on Windows: http://pypi.python.org/pypi/setuptools
Use setuptools to install lxml and zope.interface. Open a command prompt, change into the directory where the downloaded packages are, and type easy_install "package_name.egg". This installs the package into your Python installation.
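
If you have several .egg files to install, you can build the easy_install commands in one go instead of typing each one. This is a rough sketch and not part of setuptools itself: the helper name easy_install_commands is made up for this post, and it assumes all the eggs sit in a single folder.

```python
from __future__ import print_function
import glob
import os


def easy_install_commands(package_dir):
    """Build one easy_install command per .egg file found in package_dir."""
    eggs = sorted(glob.glob(os.path.join(package_dir, "*.egg")))
    return [["easy_install", egg] for egg in eggs]


if __name__ == "__main__":
    # Only prints the commands; nothing is installed by this script.
    for command in easy_install_commands("."):
        print(" ".join(command))
```

Each printed line can be pasted into the command prompt, or each command list can be passed to subprocess.check_call if you would rather have Python run them.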

Some binary packages that Scrapy depends on (like Twisted, lxml and pyOpenSSL) need a compiler to build, and their installation fails if you don't have Visual Studio installed. Prebuilt Windows installers exist for these packages, so use those instead of building from source.
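
Once the dependencies are installed, a quick import test shows whether anything is still missing. The sketch below is my own helper, not something Scrapy ships. The module names are the import names of the packages mentioned in this post (pyOpenSSL is imported as OpenSSL).

```python
from __future__ import print_function
import importlib


def missing_modules(module_names):
    """Return the names from module_names that cannot be imported."""
    missing = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Import names for Twisted, lxml, pyOpenSSL and zope.interface.
    needed = ["twisted", "lxml", "OpenSSL", "zope.interface"]
    for name in missing_modules(needed):
        print("Still missing:", name)
```

No output means all four packages import cleanly and you can go ahead and install Scrapy.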




And it's done. Now follow the Scrapy tutorial and write spiders to crawl websites.
Happy crawling!
