A Deeper Understanding of Python WSGI 2024/11/17 饿虎岗资源网

Preface

This article introduces Python WSGI. The material comes mainly from the following sources:

What is WSGI
WSGI Tutorial (http://wsgi.tutorial.codepoint.net/)
An Introduction to the Python Web Server Gateway Interface (WSGI)

It can be read as a rough, fairly literal translation of them.

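What is WSGI

WSGI stands for Web Server Gateway Interface. It is a specification that describes how a web server communicates with a web application and how a web application handles requests. The specification itself is PEP 3333. Note that WSGI has two sides, and both must be implemented: the web server side and the web application side.

Modules and libraries that implement WSGI include wsgiref (part of Python's standard library), werkzeug.serving, and twisted.web; see Servers which support WSGI for details.

Web frameworks that currently run on WSGI include Bottle, Flask, and Django; see Frameworks that run on WSGI for details.

All a WSGI server does is pass the request it received from the client to the WSGI application, then send the application's return value back to the client as the response. WSGI applications can be stacked. The layers in the middle of the stack are called middleware; the two ends, the application and the server, must always be implemented.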

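WSGI tutorial

This part is based mainly on the WSGI Tutorial.

The WSGI application interface

The WSGI application interface is implemented as a callable object: a function, a method, a class, or an instance with a __call__ method. The callable receives two arguments:

A dictionary holding information about the client request along with other data. It can be thought of as the request context and is usually called the environment (in code, commonly abbreviated to environ or env);
A callback function used to send the HTTP status and the HTTP headers.

The callable returns the response body, which must be an iterable that yields strings (under PEP 3333 on Python 3, byte strings).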
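Any of the callable forms listed above satisfies the interface. As a minimal sketch (the class name HelloApp, the greeting text, and the recording callback are made up for illustration, and the environ dictionary is filled with test values by wsgiref.util.setup_testing_defaults), an instance with a __call__ method can be invoked just like a function-based application:

```python
from wsgiref.util import setup_testing_defaults


class HelloApp:
    """Hypothetical WSGI application implemented as an instance with __call__."""

    def __init__(self, greeting):
        self.greeting = greeting

    def __call__(self, environ, start_response):
        text = '%s via %s' % (self.greeting, environ['REQUEST_METHOD'])
        body = text.encode('utf-8')
        start_response('200 OK', [
            ('Content-Type', 'text/plain'),
            ('Content-Length', str(len(body))),
        ])
        return [body]


def fake_start_response(status, headers, exc_info=None):
    """Stand-in for the server's callback; it only records what it was given."""
    fake_start_response.calls.append((status, headers))


fake_start_response.calls = []

environ = {}
setup_testing_defaults(environ)  # fills in REQUEST_METHOD='GET' and other required keys

app = HelloApp('hello')
result = app(environ, fake_start_response)
```

Here result is the iterable response body, and fake_start_response.calls holds the status and headers the application handed over.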

A WSGI application has the following structure:

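def application(environ, start_response):

    # Under PEP 3333 the response body must be bytes, so encode the text
    response_body = ('Request method: %s' % environ['REQUEST_METHOD']).encode('utf-8')

    # HTTP response status
    status = '200 OK'

    # HTTP response headers; note the format: a list of (name, value) tuples
    response_headers = [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(response_body)))
    ]

    # Hand the status and headers to the WSGI server
    start_response(status, response_headers)

    # Return the response body
    return [response_body]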
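One detail in the structure above is easy to get wrong: Content-Length counts bytes, not characters, so it should be computed from the encoded body. The following sketch illustrates the difference with a non-ASCII string; the helper name make_body_and_headers and the sample text are assumptions made for this example.

```python
def make_body_and_headers(text):
    """Encode text and build headers whose Content-Length matches the byte count."""
    body = text.encode('utf-8')
    headers = [
        ('Content-Type', 'text/plain; charset=utf-8'),
        ('Content-Length', str(len(body))),  # length of the bytes, not of the str
    ]
    return body, headers


text = '你好, WSGI'
body, headers = make_body_and_headers(text)

char_count = len(text)   # number of characters in the str
byte_count = len(body)   # number of bytes actually sent to the client
```

For pure ASCII text the two counts are equal, which is why the mistake often goes unnoticed until the first non-ASCII response.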

Environment

The following program returns the contents of the environment dictionary to the client (environment.py):

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

# Import Python's built-in WSGI server
from wsgiref.simple_server import make_server

def application(environ, start_response):

    lines = [
        '%s: %s' % (key, value) for key, value in sorted(environ.items())
    ]
    # Content-Type is text/plain below, so '\n' shows up as a line break in the browser
    response_body = '\n'.join(lines).encode('utf-8')  # WSGI bodies are bytes

    status = '200 OK'
    response_headers = [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(response_body)))
    ]
    start_response(status, response_headers)

    return [response_body]

# Instantiate the WSGI server
httpd = make_server(
    '127.0.0.1',
    8051,  # port
    application  # the WSGI application, here just a function
)

# handle_request() serves exactly one request; after that the script prints 'end'
httpd.handle_request()

print('end')

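Open http://127.0.0.1:8051/ in a browser (or with curl, wget, and so on) to see the contents of the environment.

After the browser has made one request, environment.py exits. Its terminal output looks like this: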

127.0.0.1 - - [09/Sep/2015 23:39:09] "GET / HTTP/1.1" 200 5540
end
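Most applications only need a handful of the keys in that dump. As a rough sketch, the function below (summarize_request is a hypothetical helper, and the environ values are test data filled in by wsgiref.util.setup_testing_defaults rather than by a real server) pulls out a few of the commonly used ones:

```python
from wsgiref.util import setup_testing_defaults


def summarize_request(environ):
    """Return a one-line summary built from a few well-known environ keys."""
    method = environ['REQUEST_METHOD']
    path = environ.get('PATH_INFO', '')
    query = environ.get('QUERY_STRING', '')  # may be absent, so use .get()
    scheme = environ['wsgi.url_scheme']
    return '%s %s?%s (%s)' % (method, path, query, scheme)


# Start from the values we care about, then let wsgiref fill in the rest
environ = {'PATH_INFO': '/hello', 'QUERY_STRING': 'age=10'}
setup_testing_defaults(environ)

summary = summarize_request(environ)
```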

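Iterable responses

Suppose the return value of the callable application above:

return [response_body]

is changed to:

return response_body

This makes the WSGI program respond slowly. The reason is that the string response_body is itself iterable, and each iteration yields only a single byte, so the server sends the response to the client one byte at a time until it is done. On Python 3 the situation is worse: iterating over a bytes object yields integers, which are not valid WSGI output at all. So always use return [response_body].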
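The difference is easy to see by iterating over both return values by hand. This is a small illustration under Python 3 semantics, not server code; the variable names are chosen for this example.

```python
body = b'hello'

# Wrapped in a list: the server receives the whole body as a single chunk
chunks_from_list = list(iter([body]))

# Returned bare: iteration walks the body one element at a time,
# and on Python 3 each element is an int, not a byte string
chunks_from_bare = list(iter(body))

list_chunk_count = len(chunks_from_list)
bare_chunk_count = len(chunks_from_bare)
```

With the list, the server sees one chunk; with the bare value, it sees as many items as there are bytes in the body.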

If the iterable response contains several strings, Content-Length must be the sum of their lengths:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

from wsgiref.simple_server import make_server

def application(environ, start_response):

    lines = [
        '%s: %s' % (key, value) for key, value in sorted(environ.items())
    ]
    environ_dump = '\n'.join(lines).encode('utf-8')

    # The response is now a list of several byte strings
    response_body = [
        b'The Beginning\n',
        b'*' * 30 + b'\n',
        environ_dump,
        b'\n' + b'*' * 30,
        b'\nThe End'
    ]

    # Compute Content-Length as the total length of all chunks
    content_length = sum(len(s) for s in response_body)

    status = '200 OK'
    response_headers = [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(content_length))
    ]

    start_response(status, response_headers)
    return response_body

httpd = make_server('localhost', 8051, application)
httpd.handle_request()

print('end')

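Parsing GET requests

Run environment.py and open http://localhost:8051/ in a browser with a query string appended to the URL. For the query string shown below, the response contains: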

QUERY_STRING: age=10&hobbies=software&hobbies=tunning
REQUEST_METHOD: GET

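The parse_qs() function makes QUERY_STRING easy to handle, and escape() should be applied to special characters to prevent script injection. The original tutorial imported both from the cgi module; current Python 3 versions provide them as urllib.parse.parse_qs() and html.escape() instead. Here is an example: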

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
from urllib.parse import parse_qs  # formerly cgi.parse_qs
from html import escape            # formerly cgi.escape

QUERY_STRING = 'age=10&hobbies=software&hobbies=tunning'
d = parse_qs(QUERY_STRING)
print(d.get('age', [''])[0])  # [''] is the default, used when age is missing from QUERY_STRING
print(d.get('hobbies', []))
print(d.get('name', ['unknown']))

print(10 * '*')
print(escape('<script>alert(123);</script>'))

The output is:

10
['software', 'tunning']
['unknown']
**********
&lt;script&gt;alert(123);&lt;/script&gt;
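Two behaviours of parse_qs() are worth keeping in mind: values arrive percent-decoded, and fields with blank values are dropped unless keep_blank_values=True is passed. The sketch below shows both; first_value is a hypothetical convenience helper and the note parameter is invented for this example.

```python
from urllib.parse import parse_qs
from html import escape


def first_value(params, name, default=''):
    """Return the first value for name, or default when the field is absent."""
    return params.get(name, [default])[0]


# %3Cb%3Ehi%3C%2Fb%3E is the percent-encoded form of <b>hi</b>
params = parse_qs('age=10&hobbies=software&hobbies=tunning&note=%3Cb%3Ehi%3C%2Fb%3E')

age = first_value(params, 'age')
name = first_value(params, 'name', 'unknown')
raw_note = first_value(params, 'note')   # already percent-decoded by parse_qs
safe_note = escape(raw_note)             # escaped before being put into HTML

blank_dropped = parse_qs('age=')                        # blank values are skipped
blank_kept = parse_qs('age=', keep_blank_values=True)   # blank values are kept
```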

With that, we can write a basic dynamic page that handles GET requests:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

from wsgiref.simple_server import make_server
from urllib.parse import parse_qs
from html import escape

# The form's method is get and its action is the current page
html = """
<html>
<body>
 <form method="get" action="">
  <p>
   Age: <input type="text" name="age" value="%(age)s">
  </p>
  <p>
   Hobbies:
   <input
    name="hobbies" type="checkbox" value="software"
    %(checked-software)s
   > Software
   <input
    name="hobbies" type="checkbox" value="tunning"
    %(checked-tunning)s
   > Auto Tunning
  </p>
  <p>
   <input type="submit" value="Submit">
  </p>
 </form>
 <p>
  Age: %(age)s<br>
  Hobbies: %(hobbies)s
 </p>
</body>
</html>
"""

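def application(environ, start_response):

    # Parse QUERY_STRING (it may be empty or absent)
    d = parse_qs(environ.get('QUERY_STRING', ''))

    age = d.get('age', [''])[0]  # the value for age
    hobbies = d.get('hobbies', [])  # all hobbies, as a list

    # Escape to prevent script injection
    age = escape(age)
    hobbies = [escape(hobby) for hobby in hobbies]

    response_body = html % {
        'checked-software': ('', 'checked')['software' in hobbies],
        'checked-tunning': ('', 'checked')['tunning' in hobbies],
        'age': age or 'Empty',
        'hobbies': ', '.join(hobbies or ['No Hobbies'])
    }
    response_body = response_body.encode('utf-8')

    # The listing is cut off here in this copy; the rest is assumed to
    # follow the same pattern as the earlier examples
    status = '200 OK'
    response_headers = [
        ('Content-Type', 'text/html'),
        ('Content-Length', str(len(response_body)))
    ]
    start_response(status, response_headers)
    return [response_body]

httpd = make_server('localhost', 8051, application)
httpd.serve_forever()

The template is filled in with dictionary-based % formatting, and each checkbox state is picked by indexing a two-element tuple with a boolean, as this interactive session shows:

>>> "Age: %(age)s" % {'age': 12}
'Age: 12'
>>>
>>> hobbies = ['software']
>>> ('', 'checked')['software' in hobbies]
'checked'
>>> ('', 'checked')['tunning' in hobbies]
''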


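Parsing POST requests

For a POST request, the query string travels in the HTTP request body rather than in the URL. The request body is available in the environment dictionary under the key wsgi.input, whose value is a file-like object. PEP 3333 states that the CONTENT_LENGTH variable gives the size of the body, but it may be empty or missing, so guard the read with try/except.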
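The reading step described above can be tried without a server by putting an in-memory stream in place of wsgi.input. This is a sketch under stated assumptions: read_form is a hypothetical helper, the environ dictionaries are hand-built test data, and the body is assumed to be URL-encoded UTF-8.

```python
import io
from urllib.parse import parse_qs


def read_form(environ):
    """Read and parse a URL-encoded request body from wsgi.input."""
    try:
        # CONTENT_LENGTH may be missing, empty, or not a number
        size = int(environ.get('CONTENT_LENGTH') or 0)
    except ValueError:
        size = 0
    raw = environ['wsgi.input'].read(size)  # bytes
    return parse_qs(raw.decode('utf-8'))


payload = b'age=10&hobbies=software&hobbies=tunning'
environ = {
    'REQUEST_METHOD': 'POST',
    'CONTENT_LENGTH': str(len(payload)),
    'wsgi.input': io.BytesIO(payload),  # stands in for the server's input stream
}
form = read_form(environ)

# With an empty CONTENT_LENGTH nothing is read, even if the stream holds data
empty_form = read_form({'CONTENT_LENGTH': '', 'wsgi.input': io.BytesIO(b'age=99')})
```

Reading exactly CONTENT_LENGTH bytes matters with real servers, because the input stream is not guaranteed to signal end-of-file on its own.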

Below is a dynamic page that handles POST requests:


#!/usr/bin/env python3
# -*- coding: utf-8 -*-

from wsgiref.simple_server import make_server
from urllib.parse import parse_qs
from html import escape

# The form's method is post
html = """
<html>
<body>
 <form method="post" action="">
  <p>
   Age: <input type="text" name="age" value="%(age)s">
  </p>
  <p>
   Hobbies:
   <input
    name="hobbies" type="checkbox" value="software"
    %(checked-software)s
   > Software
   <input
    name="hobbies" type="checkbox" value="tunning"
    %(checked-tunning)s
   > Auto Tunning
  </p>
  <p>
   <input type="submit" value="Submit">
  </p>
 </form>
 <p>
  Age: %(age)s<br>
  Hobbies: %(hobbies)s
 </p>
</body>
</html>
"""

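def application(environ, start_response):

    # CONTENT_LENGTH may be empty or missing
    try:
        request_body_size = int(environ.get('CONTENT_LENGTH', 0))
    except ValueError:
        request_body_size = 0

    # wsgi.input yields bytes; decode before parsing the form fields
    request_body = environ['wsgi.input'].read(request_body_size)
    d = parse_qs(request_body.decode('utf-8'))

    # Extract the data
    age = d.get('age', [''])[0]
    hobbies = d.get('hobbies', [])

    # Escape to prevent script injection
    age = escape(age)
    hobbies = [escape(hobby) for hobby in hobbies]

    response_body = html % {
        'checked-software': ('', 'checked')['software' in hobbies],
        'checked-tunning': ('', 'checked')['tunning' in hobbies],
        'age': age or 'Empty',
        'hobbies': ', '.join(hobbies or ['No Hobbies'])
    }
    response_body = response_body.encode('utf-8')

    # The listing is cut off here in this copy; the rest is assumed to
    # follow the same pattern as the earlier examples
    status = '200 OK'
    response_headers = [
        ('Content-Type', 'text/html'),
        ('Content-Length', str(len(response_body)))
    ]
    start_response(status, response_headers)
    return [response_body]

httpd = make_server('localhost', 8051, application)
httpd.serve_forever()

Getting started with Python WSGI
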
This section is based on An Introduction to the Python Web Server Gateway Interface (WSGI).

Web server

A WSGI server is simply a web server. Its logic for handling one HTTP request looks like this:


iterable = app(environ, start_response)
for data in iterable:
    ...  # send data to the client


Here app is the WSGI application and environ is the environment described above. The callable app returns an iterable, and the WSGI server sends each item it yields to the client.
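To make the server side of that loop concrete, here is a simplified sketch that drives an application and collects what a server would send. run_once and demo_app are hypothetical names, the environ comes from wsgiref.util.setup_testing_defaults, and real servers do considerably more (header formatting, error handling, socket I/O).

```python
from wsgiref.util import setup_testing_defaults


def run_once(app, environ):
    """Call a WSGI application the way a server would and collect its output."""
    captured = {}
    written = []

    def start_response(status, response_headers, exc_info=None):
        captured['status'] = status
        captured['headers'] = response_headers
        return written.append  # the legacy write() callable required by PEP 3333

    iterable = app(environ, start_response)
    try:
        # A real server would send each chunk to the client as it arrives
        chunks = [data for data in iterable]
    finally:
        close = getattr(iterable, 'close', None)
        if close is not None:
            close()  # PEP 3333: call close() on the iterable if it has one

    return captured['status'], captured['headers'], b''.join(written + chunks)


def demo_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'hello ', b'world']


environ = {}
setup_testing_defaults(environ)
status, headers, body = run_once(demo_app, environ)
```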
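
Web framework/app

That is, the WSGI application.

Middleware

Middleware sits between the WSGI server and the WSGI application, so it looks like an application to the server and like a server to the application.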
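Besides transforming the response body, middleware can also intercept the start_response callback. The following is a minimal sketch of that idea; add_header_middleware, inner_app, and the X-Demo header are names invented for this example, and the recording callback stands in for a real server.

```python
def add_header_middleware(app, name, value):
    """Hypothetical middleware factory: appends one header to every response."""

    def middleware(environ, start_response):
        def patched_start_response(status, response_headers, exc_info=None):
            # Toward the wrapped application, the middleware plays the server
            headers = list(response_headers) + [(name, value)]
            return start_response(status, headers, exc_info)

        # Toward the server, the middleware is just another application
        return app(environ, patched_start_response)

    return middleware


def inner_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'ok']


seen = {}


def recording_start_response(status, headers, exc_info=None):
    """Stand-in for the server's callback; records what finally reaches it."""
    seen['status'] = status
    seen['headers'] = headers


wrapped = add_header_middleware(inner_app, 'X-Demo', 'middleware')
body = wrapped({}, recording_start_response)
```

Because the wrapper has the same signature as any WSGI application, several such layers can be stacked in any order.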

An example

This example uses middleware.


#!/usr/bin/env python3
# -*- coding: utf-8 -*-

from wsgiref.simple_server import make_server

def application(environ, start_response):

    response_body = b'hello world!'

    status = '200 OK'

    response_headers = [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(response_body)))
    ]

    start_response(status, response_headers)
    return [response_body]

# Middleware: upper-cases every chunk produced by the wrapped application
class Upperware:
    def __init__(self, app):
        self.wrapped_app = app

    def __call__(self, environ, start_response):
        for data in self.wrapped_app(environ, start_response):
            yield data.upper()

wrapped_app = Upperware(application)

httpd = make_server('localhost', 8051, wrapped_app)

# serve_forever() blocks until the process is interrupted
httpd.serve_forever()

print('end')


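What next

With these basics in place, you can build a web framework of your own. If you are interested, read the source code of Bottle, Flask, and similar projects.

Learn about WSGI has more material on WSGI.

Summary

That is all for this article. I hope it is a useful reference for your study or work. If you have questions, feel free to leave a comment.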
