urllib - network library (stdlib).
requests - network library.
grab - network library (based on pycurl).
pycurl - network library (binding to libcurl).
urllib3 - Python HTTP library with thread-safe connection pooling, file post support, and high availability.
httplib2 - network library.
RoboBrowser - a simple, Pythonic library for browsing the web without a standalone browser.
MechanicalSoup - a Python library for automating interaction with websites.
mechanize - stateful, programmable web browsing library.
socket - low-level networking interface (stdlib).
Unirest for Python - Unirest is a set of lightweight HTTP libraries available in multiple languages.
hyper - HTTP/2 client for Python.
PySocks - an updated and actively maintained version of SocksiPy, including bug fixes and extra features; a drop-in replacement for the socket module.
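As a small illustration of the stdlib entry above, the following sketch builds (but does not send) a GET request with urllib. The endpoint, the query parameters, the user agent string, and the helper name build_get_request are all assumptions made for the example, not part of any library.

```python
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request


def build_get_request(base_url, params, user_agent="example-crawler/0.1"):
    """Return an unsent GET request for base_url with params as the query string."""
    query = urlencode(params)  # percent-encodes values and joins them with "&"
    return Request(f"{base_url}?{query}", headers={"User-Agent": user_agent})


# Hypothetical endpoint and parameters, used only for illustration.
request = build_get_request(
    "https://example.com/search", {"q": "python crawler", "page": 2}
)

print(request.full_url)
print(request.get_method())
print(parse_qs(urlsplit(request.full_url).query))
```

Passing the request to urllib.request.urlopen would actually send it; the sketch stops short of that so it runs without network access.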
Web crawler frameworks
grab - web crawler framework (based on pycurl/multicurl).
scrapy - web crawler framework (based on Twisted). Early releases did not support Python 3; support was added in Scrapy 1.1.
pyspider - a powerful crawler system.
cola - a distributed crawler framework.
Other
portia - visual crawler based on Scrapy.
restkit - HTTP resource kit for Python. It lets you easily access HTTP resources and build objects around them.
demiurge - crawler framework based on PyQuery.
HTML/XML parsers
lxml - efficient HTML/XML processing library written in C, with XPath support.
cssselect - works with the DOM tree using CSS selectors.
pyquery - works with the DOM tree using jQuery-style selectors.
BeautifulSoup - slower HTML/XML processing library, implemented in pure Python.
html5lib - builds the DOM of an HTML/XML document according to the WHATWG specification, which all current browsers follow.
feedparser - parses RSS/Atom feeds.
MarkupSafe - provides safely escaped strings for XML/HTML/XHTML.
xmltodict - a Python module that makes working with XML feel like working with JSON.
xhtml2pdf - converts HTML/CSS to PDF.
untangle - easily converts XML files into Python objects.
Cleanup
Bleach - cleans up HTML (requires html5lib).
sanitize - brings sanity to the world of messy data.
Text processing
Libraries for parsing and manipulating simple text.
difflib - (Python standard library) helpers for computing differences.
Levenshtein - fast computation of Levenshtein distance and string similarity.
fuzzywuzzy - fuzzy string matching.
esmre - regular expression accelerator.
ftfy - automatically cleans up Unicode text and reduces mangled characters.
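To make the difflib entry concrete, here is a minimal stdlib sketch of string similarity and fuzzy lookup. The helper name similarity and the sample strings are assumptions chosen for the example; the third-party libraries listed above expose their own, different APIs.

```python
import difflib


def similarity(a, b):
    """Return a ratio in [0, 1]; 1.0 means the two strings are identical."""
    return difflib.SequenceMatcher(None, a, b).ratio()


# Sample data for illustration only.
known_names = ["requests", "urllib3", "httplib2", "pycurl"]

# Find the known name closest to a misspelled one.
closest = difflib.get_close_matches("reqeusts", known_names, n=1)
print(closest)  # ['requests']

# Compare two similar strings directly.
print(round(similarity("urllib", "urllib3"), 3))
```

get_close_matches drops candidates whose ratio falls below its cutoff (0.6 by default), so an unrelated query returns an empty list rather than a poor match.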
Natural language processing
Libraries for working with human language.
NLTK - a leading platform for writing Python programs that work with human language data.
Pattern - a web mining module for Python, with tools for natural language processing, machine learning, and more.
TextBlob - provides a consistent API for diving into natural language processing tasks; it stands on the shoulders of NLTK and Pattern.
jieba - Chinese word segmentation tool.
SnowNLP - Chinese text processing library.
loso - another Chinese word segmentation library.
Browser automation and simulation
selenium - automates real browsers (Chrome, Firefox, Opera, Internet Explorer).
Ghost.py - wrapper around PyQt's WebKit (requires PyQt).
Spynner - wrapper around PyQt's WebKit (requires PyQt).
Splinter - generic API for browser emulators (Selenium WebDriver, Django client, Zope).
Multiprocessing
threading - thread-based execution from the Python standard library. Effective for I/O-bound tasks, but of little use for CPU-bound tasks because of the Python GIL.
multiprocessing - Python standard library for running multiple processes.
celery - asynchronous task queue/job queue based on distributed message passing.
concurrent.futures - the concurrent.futures module provides a high-level interface for asynchronously executing callables.
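The point about threads and I/O-bound work can be sketched with concurrent.futures. In this example time.sleep stands in for a blocking network call (an assumption made so the sketch runs offline); fake_fetch and the URLs are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_fetch(url, delay=0.05):
    """Stand-in for a network call: sleeping releases the GIL, as blocking I/O does."""
    time.sleep(delay)
    return url, len(url)


# Hypothetical URLs, never actually requested.
urls = [f"https://example.com/page/{i}" for i in range(8)]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as pool:
    # map() runs the calls concurrently and yields results in input order.
    results = list(pool.map(fake_fetch, urls))
elapsed = time.perf_counter() - start

print(f"fetched {len(results)} pages in {elapsed:.2f}s")
```

Because the waits overlap, the total time should stay close to a single delay rather than eight of them. For CPU-bound work, ProcessPoolExecutor offers the same interface while sidestepping the GIL by using separate processes.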
Asynchronous network programming libraries
asyncio - (Python standard library in Python 3.4+) asynchronous I/O, event loop, coroutines, and tasks.
Twisted - event-driven networking engine framework.
Tornado - a web framework and asynchronous networking library.
pulsar - event-driven concurrent framework for Python.
diesel - greenlet-based event I/O framework for Python.
gevent - a coroutine-based Python networking library that uses greenlet.
eventlet - asynchronous framework with WSGI support.
Tomorrow - magic decorator syntax for asynchronous code.
Queue
celery - asynchronous task queue/job queue based on distributed message passing.
huey - little multi-threaded task queue.
mrq - Mr. Queue, a distributed worker task queue in Python using Redis and gevent.
RQ - lightweight task queue manager based on Redis.
simpleq - a simple, infinitely scalable queue based on Amazon SQS.
python-gearman - Python API for Gearman.
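The libraries in this section distribute jobs through an external broker such as Redis or Amazon SQS. The underlying producer/worker pattern can be sketched in-process with the standard library alone. This is not how any of the listed libraries is implemented; worker and run_jobs are hypothetical names used for illustration.

```python
import queue
import threading


def worker(tasks, results):
    """Pull (func, arg) jobs from tasks until a None sentinel arrives."""
    while True:
        item = tasks.get()
        if item is None:  # sentinel: no more work for this worker
            tasks.task_done()
            break
        func, arg = item
        results.put((arg, func(arg)))
        tasks.task_done()


def run_jobs(func, args, num_workers=3):
    """Apply func to every item of the list args using a pool of worker threads."""
    tasks, results = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(num_workers)
    ]
    for thread in threads:
        thread.start()
    for arg in args:
        tasks.put((func, arg))
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    tasks.join()  # wait until every job and sentinel is processed
    for thread in threads:
        thread.join()
    return dict(results.get() for _ in args)


print(run_jobs(len, ["a", "bb", "ccc"]))
```

A real task queue adds what this sketch leaves out: persistence, retries, scheduling, and workers running on other machines.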
Cloud computing
picloud - executes Python code in the cloud.
dominoup.com - executes R, Python, and MATLAB code in the cloud.
Web content extraction
Libraries for extracting web content.
Text and metadata of HTML pages
newspaper - news extraction, article extraction, and content curation in Python.
html2text - converts HTML to Markdown-formatted text.
python-goose - HTML content/article extractor.
lassie - web content retrieval tool for humans.
WebSocket
Libraries for working with WebSocket.
Crossbar - open source application messaging router (a Python implementation of WebSocket and WAMP for Autobahn).
AutobahnPython - open source Python implementation of the WebSocket and WAMP protocols.
WebSocket-for-Python - WebSocket client and server library for Python 2 and 3 as well as PyPy.
DNS
dnsyo - checks your DNS against more than 1,500 DNS servers worldwide.
pycares - interface to c-ares, a C library for DNS requests and asynchronous name resolution.
Computer vision
OpenCV - open source computer vision library.
SimpleCV - concise, readable interface for cameras, image manipulation, feature extraction, and format conversion (based on OpenCV).
mahotas - fast computer vision and image processing algorithms (implemented entirely in C++), operating on numpy arrays as its data type.
Some web development frameworks
1. Django
Django is an open source web application framework written in Python. It supports many database engines, makes web development fast and scalable, and is updated continually to keep pace with new Python versions. Programmers who are new to web development can start with this framework.
2. Flask
Flask is a lightweight web application framework written in Python, based on the Werkzeug WSGI toolkit and the Jinja2 template engine, and released under the BSD license.
Flask is also called a "microframework" because it keeps a simple core and adds other functionality through extensions. Flask has no default database or form validation tool. It stays flexible, however, and Flask extensions can add these features: ORM, form validation, file upload, and various open authentication technologies.
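The idea of a small core built on WSGI can be sketched with the standard library alone. The code below is not Flask and does not use Flask's API; route, ROUTES, app, and call_app are hypothetical names that only illustrate how a microframework might map URL paths to handler functions on top of a WSGI callable.

```python
from wsgiref.util import setup_testing_defaults

ROUTES = {}  # maps a URL path to the function that handles it


def route(path):
    """Decorator that registers a handler function for a URL path."""
    def decorator(func):
        ROUTES[path] = func
        return func
    return decorator


@route("/")
def index():
    return "Hello from a micro-framework sketch"


def app(environ, start_response):
    """WSGI application: look up the handler for the requested path."""
    headers = [("Content-Type", "text/plain; charset=utf-8")]
    handler = ROUTES.get(environ.get("PATH_INFO", "/"))
    if handler is None:
        start_response("404 Not Found", headers)
        return [b"Not found"]
    start_response("200 OK", headers)
    return [handler().encode("utf-8")]


def call_app(path):
    """Invoke the app in-process, without a server, and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, response_headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body.decode("utf-8")


print(call_app("/"))
print(call_app("/missing"))
```

Everything beyond routing (templates, databases, form validation) is absent here, which mirrors the microframework approach of leaving such features to extensions.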
3. Web2py
Web2py is a free, open source web framework written in Python, aimed at agile, rapid development of fast, scalable, secure, and portable database-driven web applications. It is released under the LGPLv3 open source license.
Web2py offers a one-stop solution: the entire development process can take place in the browser. It provides a web-based online development environment, HTML template writing, static file upload, database design, logging, and an automated admin interface.
4. Tornado
Tornado is a web server (not covered in detail here), but it is also a web.py-like micro-framework. As a framework, Tornado draws its main ideas from web.py. The web.py homepage carries the following remark from Bret Taylor, the lead behind Tornado (the framework he says FriendFeed uses can be regarded as the same thing as Tornado):
"[web.py inspired the] web framework we use at FriendFeed [and] the webapp framework that ships with App Engine..."
Because of this relationship, Tornado is not discussed separately below.
5. CherryPy